Message289340
| Author | benjamin.peterson |
| Recipients | Arfrever, benjamin.peterson, lemburg, loewis, serhiy.storchaka |
| Date | 2017-03-10 07:37:03 |
| Message-id | <1489131424.74.0.828824726275.issue20087@psf.upfronthosting.co.za> |
| In-reply-to | |
| Content |
Do you believe this program should work?
import locale, os
for l in open("/usr/share/i18n/SUPPORTED"):
    alias, encoding = l.strip().split()
    locale.setlocale(locale.LC_ALL, alias)
    try:
        enc = locale.getlocale()[1]
    except ValueError:
        continue # not in table
    normalized = enc.replace("ISO", "ISO-"). \
                 replace("_", "-"). \
                 replace("euc", "EUC-"). \
                 replace("big5", "big5-").upper()
    assert normalized == locale.nl_langinfo(locale.CODESET)
After my change it does: the encoding returned from getlocale() is the one actually being used by glibc. It fails dramatically on earlier versions of Python (for example, on the en_IN case from #29571). I don't understand why Python needs to editorialize whatever choices libc or the system administrator has made.
Is getlocale() expected to return something different from the underlying C locale?
In fact, why have this table at all instead of using nl_langinfo to return the encoding for the current locale? |
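To make the nl_langinfo suggestion concrete, here is a minimal sketch of asking libc directly for the codeset of the current locale. The helper names current_codeset and same_codec are hypothetical and not part of the locale module. The fallback to getpreferredencoding() on platforms without nl_langinfo is an assumption, not something proposed in this issue.

```python
import codecs
import locale


def current_codeset():
    """Return the codeset libc reports for the current LC_CTYPE locale.

    Hypothetical helper: it consults no alias table at all.
    """
    # nl_langinfo() and CODESET are not available on every platform
    # (for example, not on Windows), so guard the call.
    if hasattr(locale, "nl_langinfo"):
        return locale.nl_langinfo(locale.CODESET)
    # Assumed fallback for platforms without nl_langinfo().
    return locale.getpreferredencoding(False)


def same_codec(a, b):
    """Compare two encoding names by the Python codec they resolve to."""
    try:
        return codecs.lookup(a).name == codecs.lookup(b).name
    except LookupError:
        # Python has no codec for at least one name; compare the spelling.
        return a.upper() == b.upper()
```

Comparing through codecs.lookup() avoids the chain of string replacements in the program above, but it only works for encodings Python ships a codec for; for anything else the sketch falls back to a case-insensitive string comparison.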
|
History
| Date | User | Action | Args |
|---|---|---|---|
| 2017-03-10 07:37:04 | benjamin.peterson | set | recipients: + benjamin.peterson, lemburg, loewis, Arfrever, serhiy.storchaka |
| 2017-03-10 07:37:04 | benjamin.peterson | set | messageid: <1489131424.74.0.828824726275.issue20087@psf.upfronthosting.co.za> |
| 2017-03-10 07:37:04 | benjamin.peterson | link | issue20087 messages |
| 2017-03-10 07:37:03 | benjamin.peterson | create | |