Message263569
| Author |
serhiy.storchaka |
| Recipients |
arbyter, ezio.melotti, mrabarnett, pitrou, serhiy.storchaka |
| Date |
2016-04-16.17:41:33 |
| Message-id |
<1460828493.99.0.935073707321.issue26784@psf.upfronthosting.co.za> |
| In-reply-to |
| Content |
First, in the context of Python, a crash means a core dump or its equivalent on Windows. In this case the code simply does not behave as you expected.
The short answer: s should be a unicode string.
In your code "ä" is encoded as the 8-bit string '\xc3\xa4'. When matched, each byte is independently expanded to the Unicode range. The first byte becomes u'\xc3' = u'Ã'; the second byte becomes u'\xa4' = u'¤', which is not alphanumeric. So '[\s\w]*' does not match all of u'Ã¤' (it stops after u'Ã').
"ü" is encoded as 8-bit string '\xc3\xbc'. The second byte becomes u'¼', numeric. '[\s\w]*' matches u'Ã¼'. |
|
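The byte-by-byte behaviour described above can be reproduced with a small sketch. It is written for Python 3 and rests on one assumption: decoding the UTF-8 bytes as Latin-1 stands in for the per-byte expansion that happens when an 8-bit string is matched with a Unicode-aware pattern. The helper names `expand_bytes` and `fully_matches` are hypothetical and exist only for illustration.

```python
import re

# The character class from the report, with Unicode-aware \w and \s.
PATTERN = re.compile(r'[\s\w]*', re.UNICODE)


def expand_bytes(text):
    """Encode text as UTF-8, then map every byte to the code point
    with the same value (assumption: Latin-1 decoding models the
    per-byte expansion described in the message)."""
    return text.encode('utf-8').decode('latin-1')


def fully_matches(s):
    """Return True only if the pattern consumes the whole string."""
    m = PATTERN.match(s)
    return m is not None and m.end() == len(s)


a_expanded = expand_bytes('\u00e4')  # "ä" -> two separate characters
u_expanded = expand_bytes('\u00fc')  # "ü" -> two separate characters

print(repr(a_expanded), fully_matches(a_expanded))
print(repr(u_expanded), fully_matches(u_expanded))
print(fully_matches('\u00e4'), fully_matches('\u00fc'))
```

With real unicode strings both characters are single letters, so the pattern consumes them entirely. After the per-byte expansion, only the "ü" case happens to survive, because its second byte lands on a numeric character.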
History
|
|---|
| Date |
User |
Action |
Args |
| 2016-04-16 17:41:34 | serhiy.storchaka | set | recipients:
+ serhiy.storchaka, pitrou, ezio.melotti, mrabarnett, arbyter |
| 2016-04-16 17:41:33 | serhiy.storchaka | set | messageid: <1460828493.99.0.935073707321.issue26784@psf.upfronthosting.co.za> |
| 2016-04-16 17:41:33 | serhiy.storchaka | link | issue26784 messages |
| 2016-04-16 17:41:33 | serhiy.storchaka | create |
|