Message175408
| Author | vstinner |
| Recipients | ezio.melotti, vstinner |
| Date | 2012-11-11.23:34:24 |
| SpamBayes Score | -1.0 |
| Marked as misclassified | Yes |
| Message-id | <1352676864.76.0.63024120757.issue16455@psf.upfronthosting.co.za> |
| In-reply-to | |
| Content |
The attached patch works around the CODESET issue on OpenIndiana and FreeBSD. If the LC_CTYPE locale is "C" and nl_langinfo(CODESET) returns ASCII (or an alias of this encoding), b"\xE9" is decoded from the locale encoding: if the result is U+00E9, the patched Python uses ISO-8859-1. (If decoding fails, the locale encoding really is ASCII and the workaround is not used.)
If the result is different (b"\xE9" does not decode from the locale encoding to U+00E9), a ValueError is raised. I wrote this check to detect bugs, and I hope that our buildbots will validate the code. We may choose a different behaviour (e.g. keep ASCII).
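The decision described above can be sketched in pure Python. This is only an illustration of the logic, not the code in the patch: the function name pick_locale_encoding and the decode_locale callable are hypothetical, and decode_locale stands in for however the C runtime decodes bytes from the locale encoding.

```python
import codecs


def pick_locale_encoding(lc_ctype, codeset, decode_locale):
    """Hypothetical sketch of the workaround, not the actual patch.

    lc_ctype      -- name of the LC_CTYPE locale, e.g. "C"
    codeset       -- value reported by nl_langinfo(CODESET)
    decode_locale -- assumed callable decoding bytes with the locale encoding
    """
    # Normalise the reported codeset so that aliases of ASCII are recognised.
    try:
        is_ascii = codecs.lookup(codeset).name == "ascii"
    except LookupError:
        is_ascii = False

    # The workaround only applies to the "C" locale announcing ASCII.
    if lc_ctype != "C" or not is_ascii:
        return codeset

    try:
        decoded = decode_locale(b"\xe9")
    except UnicodeDecodeError:
        # Decoding fails: the locale encoding really is ASCII.
        return "ascii"

    if decoded == "\xe9":
        # The byte 0xE9 maps to U+00E9: the locale behaves like ISO-8859-1.
        return "iso8859-1"

    # Any other result is unexpected and is reported as a bug.
    raise ValueError("unexpected decoding of b'\\xe9': %a" % decoded)
```

For example, passing a decoder that behaves like Latin-1 (lambda data: data.decode("latin-1")) together with "C" and "US-ASCII" selects 'iso8859-1', while a strict ASCII decoder keeps 'ascii'.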
Example on FreeBSD 8.2, original Python 3.4:
$ ./python
>>> import sys, locale
>>> sys.getfilesystemencoding()
'ascii'
>>> locale.getpreferredencoding()
'US-ASCII'
Example on FreeBSD 8.2, patched Python 3.4:
$ ./python
>>> import sys, locale
>>> sys.getfilesystemencoding()
'iso8859-1'
>>> locale.getpreferredencoding()
'iso8859-1' |
|
History
| Date | User | Action | Args |
|---|---|---|---|
| 2012-11-11 23:34:24 | vstinner | set | recipients: + vstinner, ezio.melotti |
| 2012-11-11 23:34:24 | vstinner | set | messageid: <1352676864.76.0.63024120757.issue16455@psf.upfronthosting.co.za> |
| 2012-11-11 23:34:24 | vstinner | link | issue16455 messages |
| 2012-11-11 23:34:24 | vstinner | create | |