Message191709
| Author |
belopolsky |
| Recipients |
belopolsky, cvrebert, eric.araujo, eric.smith, ezio.melotti, lemburg, mark.dickinson, ncoghlan, skrah, vstinner |
| Date |
2013-06-23.16:59:57 |
| Message-id |
<1372006798.06.0.82188600287.issue10581@psf.upfronthosting.co.za> |
| In-reply-to |
| Content |
Martin v. Löwis wrote at #18236 (msg191687):
> int conversion ultimately uses Py_ISSPACE, which conceptually could
> deviate from the Unicode properties (as it is byte-based). This is not
> really an issue, since they indeed match.
Py_ISSPACE matches the Unicode White_Space property in the ASCII range (the first 128 code points), but it differs for byte (code point) values from 128 through 255. This leads to the following discrepancy:
>>> int('123\xa0')
123
but
>>> int(b'123\xa0')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa0 in position 3: invalid start byte
>>> int('123\xa0'.encode())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: '123\xa0' |
|
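The discrepancy described in the message can be checked with a short script. This is only a sketch: it uses str.isspace() and bytes.isspace() as stand-ins for the Unicode-aware and byte-based checks (str.isspace() is not defined exactly as the White_Space property, and bytes.isspace() is assumed here to behave like Py_ISSPACE), and the helper name parse is made up for illustration. The exact exception raised for bytes input may vary between Python versions, so the script only records its type.

```python
# Code points 128-255 that the Unicode-aware check treats as whitespace
# while the byte-based check on the same value does not.
# Assumption: str.isspace() / bytes.isspace() approximate the two checks.
mismatches = [
    cp for cp in range(128, 256)
    if chr(cp).isspace() and not bytes([cp]).isspace()
]


def parse(value):
    """Hypothetical helper: return int(value), or the exception type name."""
    try:
        return int(value)
    except ValueError as exc:  # UnicodeDecodeError is a ValueError subclass
        return type(exc).__name__


# str input: the trailing NO-BREAK SPACE (U+00A0) is stripped as whitespace.
str_result = parse('123\xa0')

# bytes input: the raw byte 0xa0 is not whitespace, so conversion fails.
bytes_result = parse(b'123\xa0')

print([hex(cp) for cp in mismatches])
print(str_result, bytes_result)
```

Catching ValueError covers both tracebacks shown in the message, since UnicodeDecodeError derives from ValueError.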