Message284647
| Author |
vstinner |
| Recipients |
Jan Niklas Hasse, abarry, akira, barry, deleted250130, ezio.melotti, lemburg, methane, ncoghlan, r.david.murray, vstinner, yan12125 |
| Date |
2017-01-04.16:06:08 |
| Message-id |
<CAMpsgwYA3Cs8ofigMC+NoUEJmbi9Gia45CFeD-WdY2mEd501JQ@mail.gmail.com> |
| In-reply-to |
<1483541173.33.0.931483633089.issue28180@psf.upfronthosting.co.za> |
| Content |
> The default encoding in the C/POSIX locale is ASCII (which is the entire source of the problem).
The reality is more complex than that :-) It depends on the OS.
Some operating systems use Latin1 for the POSIX locale. Others announce
ASCII for the POSIX locale but use Latin1 in practice :-) On these
lying systems, Python decodes the bytes 0x80..0xff using mbstowcs() to check
whether it gets ASCII or Latin1: see the check_force_ascii() function.
/* Workaround FreeBSD and OpenIndiana locale encoding issue with the C locale.
On these operating systems, nl_langinfo(CODESET) announces an alias of the
ASCII encoding, whereas mbstowcs() and wcstombs() functions use the
ISO-8859-1 encoding. The problem is that os.fsencode() and os.fsdecode() use
locale.getpreferredencoding() codec. For example, if command line arguments
are decoded by mbstowcs() and encoded back by os.fsencode(), we get a
UnicodeEncodeError instead of retrieving the original byte string.
The workaround is enabled if setlocale(LC_CTYPE, NULL) returns "C",
nl_langinfo(CODESET) announces "ascii" (or an alias to ASCII), and at least
one byte in range 0x80-0xff can be decoded from the locale encoding. The
workaround is also enabled on error, for example if getting the locale
failed.
(...) */ |
|
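The mismatch described in the quoted comment can be reproduced in pure Python by swapping codecs by hand. This is only a rough sketch, not CPython's check_force_ascii(): the helper names `is_ascii_alias` and `roundtrip` are hypothetical, decoding with "latin-1" stands in for what mbstowcs() does on the affected systems, and encoding with "ascii" stands in for os.fsencode() using the announced codec. The surrogateescape error handler is assumed because it is the one Python applies to OS data on POSIX.

```python
import codecs


def is_ascii_alias(codeset: str) -> bool:
    """Return True if `codeset` names the ASCII codec or one of its aliases.

    Hypothetical helper: it mimics the "announces ASCII (or an alias)" test
    by asking Python's codec registry, not the C library.
    """
    try:
        return codecs.lookup(codeset).name == "ascii"
    except LookupError:
        return False


def roundtrip(raw: bytes, decode_with: str, encode_with: str) -> bytes:
    """Decode `raw` with one codec and encode it back with another.

    Hypothetical helper: `decode_with` plays the role of mbstowcs() and
    `encode_with` the role of the codec used by os.fsencode().
    """
    text = raw.decode(decode_with, "surrogateescape")
    return text.encode(encode_with, "surrogateescape")


if __name__ == "__main__":
    raw = b"caf\xe9"  # one byte in the 0x80-0xff range

    # Typical name reported by nl_langinfo(CODESET) in the C locale on Linux.
    print(is_ascii_alias("ANSI_X3.4-1968"))

    # Mismatched codecs: the Latin1-decoded character cannot be encoded
    # back to ASCII, so the original byte string is lost.
    try:
        roundtrip(raw, "latin-1", "ascii")
    except UnicodeEncodeError as exc:
        print("mismatch:", exc.reason)

    # Same codec on both sides (what forcing ASCII achieves): the
    # undecodable byte is escaped and then restored.
    print(roundtrip(raw, "ascii", "ascii") == raw)
```

With the same codec on both sides, the undecodable byte survives as a lone surrogate and is restored on encoding, which is why forcing ASCII avoids the UnicodeEncodeError mentioned in the comment.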
History
|
|---|
| Date |
User |
Action |
Args |
| 2017-01-04 16:06:09 | vstinner | set | recipients:
+ vstinner, lemburg, barry, ncoghlan, ezio.melotti, r.david.murray, methane, akira, deleted250130, yan12125, abarry, Jan Niklas Hasse |
| 2017-01-04 16:06:09 | vstinner | link | issue28180 messages |
| 2017-01-04 16:06:08 | vstinner | create |
|