Message106774
| Author |
ezio.melotti |
| Recipients |
PeterL, ezio.melotti |
| Date |
2010-05-30.19:12:33 |
| SpamBayes Score |
1.5570284e-07 |
| Marked as misclassified |
No |
| Message-id |
<1275246756.15.0.893458000522.issue8859@psf.upfronthosting.co.za> |
| In-reply-to |
| Content |
On both Linux and Windows I get:
>>> '\xa0'.isspace()
False
>>> u'\xa0'.isspace()
True
The Unicode character u'\xa0' is U+00A0 NO-BREAK SPACE, so unicode.split correctly treats it as whitespace.
However, the byte string '\xa0' is not whitespace, so str.split ignores it.
The correct solution is to convert your string to Unicode and then split.
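As a minimal sketch of that approach (the label value and the Latin-1 encoding are assumptions chosen for illustration, not taken from the report), decoding before splitting could look like this:

```python
# Hypothetical label containing the byte 0xA0; the real data and its
# encoding are not given in the report, so Latin-1 is assumed here.
raw_label = b'foo\xa0bar baz'

# Splitting the byte string: 0xA0 is not treated as whitespace,
# so only the ordinary space separates the parts.
byte_parts = raw_label.split()

# Decode to Unicode first; U+00A0 NO-BREAK SPACE then counts as whitespace.
text_label = raw_label.decode('latin-1')
text_parts = text_label.split()

print(byte_parts)
print(text_parts)
```

With the byte string the label stays in two pieces, while the decoded text splits into three. If the data is in another encoding, pass that encoding to decode instead.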
I'd close this as invalid, but before doing so I'd like you to confirm that the example I posted and 'split' return the same result on both Linux and Windows (the fact that it works on Linux is probably caused by something else, e.g. the label is already Unicode). |
|
History
|
|---|
| Date |
User |
Action |
Args |
| 2010-05-30 19:12:36 | ezio.melotti | set | recipients:
+ ezio.melotti, PeterL |
| 2010-05-30 19:12:36 | ezio.melotti | set | messageid: <1275246756.15.0.893458000522.issue8859@psf.upfronthosting.co.za> |
| 2010-05-30 19:12:33 | ezio.melotti | link | issue8859 messages |
| 2010-05-30 19:12:33 | ezio.melotti | create |
|