This issue tracker has been migrated to GitHub and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.
Created on 2010-10-20 15:31 by ixokai, last changed 2022-04-11 14:57 by admin. This issue is now closed.
| Files | | |
|---|---|---|
| File name | Uploaded | Description |
| issue10154.patch | ronaldoussoren, 2011-05-07 07:19 | |
| Messages (15) | |||
|---|---|---|---|
| msg119213 - (view) | Author: Stephen Hansen (ixokai) (Python triager) | Date: 2010-10-20 15:31 | |
In the course of investigating issue10092, Georg discovered that the behavior of locale.normalize() on Mac is bad. Basically, "en_US.UTF-8" is how the "correct" locale string should be spelled on the Mac. If you drop the dash, it fails, and dropping the dash is exactly what locale.normalize does, so you can't pass the return value of the function to setlocale, even though that's what it's documented to be for.
If that isn't clear, this should demonstrate (from /branches/py3k):
Top-2:build pythonbuildbot$ ./python.exe
Python 3.2a3+ (py3k:85631, Oct 17 2010, 06:45:22)
[GCC 4.2.1 (Apple Inc. build 5664)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale
[51767 refs]
>>> locale.normalize("en_US.UTF-8")
'en_US.UTF8'
[51770 refs]
>>> locale.setlocale(locale.LC_TIME, 'en_US.UTF8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/pythonbuildbot/test/build/Lib/locale.py", line 538, in setlocale
    return _setlocale(category, locale)
locale.Error: unsupported locale setting
[51816 refs]
>>> locale.setlocale(locale.LC_TIME, 'en_US.UTF-8')
'en_US.UTF-8'
[51816 refs]
The precise same behavior exists on my stock/system Python 2.6, too, fwiw. (Not that it can be fixed on 2.6, but maybe 2.7?) |
|||
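The round trip described in the report can be checked on any platform with a short script. This is only a sketch: the helper name `roundtrip_ok` is made up for illustration, and the result depends on the Python version and on the locales installed on the machine.

```python
import locale


def roundtrip_ok(name, category=locale.LC_TIME):
    """Return (normalized, accepted) for one locale name.

    `accepted` says whether setlocale() takes the string that
    locale.normalize() produced for `name`.
    """
    normalized = locale.normalize(name)
    saved = locale.setlocale(category)  # query the current setting
    try:
        locale.setlocale(category, normalized)
        accepted = True
    except locale.Error:
        accepted = False
    finally:
        locale.setlocale(category, saved)  # restore the previous setting
    return normalized, accepted


if __name__ == "__main__":
    normalized, accepted = roundtrip_ok("en_US.UTF-8")
    print(normalized, "accepted" if accepted else "rejected")
```

A rejection can also simply mean that the locale is not installed, so comparing the result with a direct `setlocale` call on the unnormalized name, as in the session above, is what separates the two cases.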
| msg119216 - (view) | Author: Ronald Oussoren (ronaldoussoren) * (Python committer) | Date: 2010-10-20 15:46 | |
This patch solves the immediate failure:
Index: Lib/locale.py
===================================================================
--- Lib/locale.py (revision 85743)
+++ Lib/locale.py (working copy)
@@ -396,6 +396,9 @@
             else:
                 encoding = defenc
             #print 'found encoding %r' % encoding
+            if sys.platform == 'darwin' and encoding == 'UTF8':
+                encoding = 'UTF-8'
+
             if encoding:
                 return langname + '.' + encoding
             else:
I'm not happy about hardcoding this specific exception though, there should be a better solution than this.
Ronald |
|||
| msg119236 - (view) | Author: Marc-Andre Lemburg (lemburg) * (Python committer) | Date: 2010-10-20 21:47 | |
Ronald Oussoren wrote:
>
> Ronald Oussoren <ronaldoussoren@mac.com> added the comment:
>
> This patch solves the immediate failure:
>
> Index: Lib/locale.py
> ===================================================================
> --- Lib/locale.py (revision 85743)
> +++ Lib/locale.py (working copy)
> @@ -396,6 +396,9 @@
>              else:
>                  encoding = defenc
>              #print 'found encoding %r' % encoding
> +            if sys.platform == 'darwin' and encoding == 'UTF8':
> +                encoding = 'UTF-8'
> +
>              if encoding:
>                  return langname + '.' + encoding
>              else:
>
> I'm not happy about hardcoding this specific exception though, there should be a better solution than this.
Could you tell me the values of localename, code, langname and encoding at that step in the process?
We may need to add a locale_encoding_alias entry from 'UTF8' to 'UTF-8', since the version with the hyphen is what the C lib uses. |
|||
| msg119298 - (view) | Author: Stephen Hansen (ixokai) (Python triager) | Date: 2010-10-21 13:53 | |
Mark, the locals() right before "if encoding:" (line 399) are:
>>> locale.normalize("en_US.UTF-8")
{'code': 'en_US.ISO8859-1', 'langname': 'en_US', 'encoding': 'UTF8', 'norm_encoding': 'utf_8', 'defenc': 'ISO8859-1', 'localename': 'en_US.UTF-8', 'lookup_name': 'en_us.utf-8', 'fullname': 'en_us.utf-8'}
'en_US.UTF8'
|
|||
| msg119301 - (view) | Author: Marc-Andre Lemburg (lemburg) * (Python committer) | Date: 2010-10-21 14:15 | |
Stephen Hansen wrote:
>
> Stephen Hansen <me+python@ixokai.io> added the comment:
>
> Mark, the locals() right before "if encoding:" (line 399) are:
>
> >>> locale.normalize("en_US.UTF-8")
> {'code': 'en_US.ISO8859-1', 'langname': 'en_US', 'encoding': 'UTF8', 'norm_encoding': 'utf_8', 'defenc': 'ISO8859-1', 'localename': 'en_US.UTF-8', 'lookup_name': 'en_us.utf-8', 'fullname': 'en_us.utf-8'}
> 'en_US.UTF8'
Thanks. Line 646 in the alias table is wrong:
'utf_8': 'UTF8',
should read:
'utf_8': 'UTF-8',
I wonder why this wasn't reported earlier. Did glibc change the UTF-8 spelling at some point?
I do vaguely remember that I had to remove the hyphen due to problems with setlocale() not accepting 'UTF-8', but that was at the time I wrote that part of locale.py, i.e. many years ago. It doesn't appear to be necessary anymore.
I checked on openSUSE 10.3 and 11.3. Both work fine with 'UTF-8' and 'UTF8'. |
|||
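The effect of the one-line alias change can be illustrated with a toy model. This is not the code in Lib/locale.py: `toy_normalize` and the two single-entry tables are hypothetical stand-ins that only mimic the final alias lookup, using the 'utf_8' key shown in the locals() dump above.

```python
# Hypothetical single-entry stand-ins for the encoding alias table.
ALIASES_BEFORE_FIX = {"utf_8": "UTF8"}
ALIASES_AFTER_FIX = {"utf_8": "UTF-8"}


def toy_normalize(localename, aliases):
    """Toy model: map the encoding part of 'lang.ENCODING' through `aliases`."""
    langname, _, encoding = localename.partition(".")
    if not encoding:
        return langname
    # Reduce the encoding to a lookup key: lower case, '-' replaced by '_'.
    key = encoding.lower().replace("-", "_")
    return langname + "." + aliases.get(key, encoding)


print(toy_normalize("en_US.UTF-8", ALIASES_BEFORE_FIX))  # en_US.UTF8
print(toy_normalize("en_US.UTF-8", ALIASES_AFTER_FIX))   # en_US.UTF-8
```

With the old entry the hyphen is lost in the output, which is the string that setlocale rejects on the Mac. With the corrected entry the input spelling survives the round trip.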
| msg119309 - (view) | Author: Georg Brandl (georg.brandl) * (Python committer) | Date: 2010-10-21 15:27 | |
If other Posix-y systems accept both spellings and only Macs insist on the dash, we should probably indeed change the alias entry to use it. |
|||
| msg122374 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2010-11-25 15:42 | |
Mandriva and Debian also work fine with both "UTF8" and "UTF-8". For the record, the canonical spelling inside /usr/share/locale is "UTF-8". I suppose glibc does its own normalization. |
|||
| msg123553 - (view) | Author: Ronald Oussoren (ronaldoussoren) * (Python committer) | Date: 2010-12-07 14:29 | |
UTF-8 works on SuSE Enterprise Linux 9 and 10 as well.
BTW, neither UTF8 nor UTF-8 works on HPUX 10. That platform requires spelling it as utf8. Sadly enough, this means that this code doesn't work on HPUX 10:
>>> locale.setlocale(locale.LC_ALL, locale.getdefaultlocale())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/python2.7/lib/python2.7/locale.py", line 531, in setlocale
    return _setlocale(category, locale)
locale.Error: unsupported locale setting
That's because getdefaultlocale returns 'UTF8' as the encoding, even though LANG is set to 'nl_NL.utf8' (which is a working locale on the machine I tested).
BTW, I'm +1 on changing the alias table as Marc-Andre proposed. |
|||
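Since the accepted spelling differs between platforms ('UTF-8' on the Mac, 'utf8' on HPUX 10, either 'UTF-8' or 'UTF8' on the Linux systems tested here), application code can work around the problem by trying several spellings in turn. This is a sketch of such a workaround and not something proposed in this issue. The helper `set_utf8_locale` and the order of the candidates are assumptions.

```python
import locale

# Spellings mentioned in this thread; which one works depends on the platform.
UTF8_SPELLINGS = ("UTF-8", "UTF8", "utf8")


def set_utf8_locale(langname, category=locale.LC_ALL, spellings=UTF8_SPELLINGS):
    """Try each spelling in turn; return the accepted locale string or None.

    On success the process locale for `category` is changed.
    """
    for spelling in spellings:
        candidate = langname + "." + spelling
        try:
            return locale.setlocale(category, candidate)
        except locale.Error:
            continue  # this C library rejects the spelling, try the next one
    return None


if __name__ == "__main__":
    print(set_utf8_locale("nl_NL"))
```

A return value of None means that none of the spellings was accepted, which can also happen when the language itself is not installed.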
| msg123667 - (view) | Author: MunSic JEONG (ruseel) | Date: 2010-12-09 02:34 | |
Ubuntu 10.4.1 LTS also works fine with both "UTF8" and "UTF-8". |
|||
| msg129662 - (view) | Author: Boris FELD (Boris.FELD) * | Date: 2011-02-27 22:00 | |
Bug confirmed on python2.5+ and python3.2-. If it works with the dash, I agree with Marc-Andre's solution. |
|||
| msg134271 - (view) | Author: Piotr Sikora (PiotrSikora) | Date: 2011-04-22 16:52 | |
It's the same on OpenBSD (and I'm pretty sure it's true for other BSDs as well).
>>> locale.resetlocale()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.6/locale.py", line 523, in resetlocale
    _setlocale(category, _build_localename(getdefaultlocale()))
locale.Error: unsupported locale setting
>>> locale._build_localename(locale.getdefaultlocale())
'en_US.UTF8'
Works fine with Marc-Andre's alias table fix.
Any chance this will eventually be fixed in 2.x? |
|||
| msg134450 - (view) | Author: Marc-Andre Lemburg (lemburg) * (Python committer) | Date: 2011-04-26 10:18 | |
Piotr Sikora wrote:
>
> Piotr Sikora <piotr.sikora@frickle.com> added the comment:
>
> It's the same on OpenBSD (and I'm pretty sure it's true for other BSDs as well).
>
> >>> locale.resetlocale()
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/usr/local/lib/python2.6/locale.py", line 523, in resetlocale
>     _setlocale(category, _build_localename(getdefaultlocale()))
> locale.Error: unsupported locale setting
> >>> locale._build_localename(locale.getdefaultlocale())
> 'en_US.UTF8'
>
> Works fine with Marc-Andre's alias table fix.
>
> Any chances this will be eventually fixed in 2.x?
This can go into Python 2.7, and, of course, into the 3.x branches. |
|||
| msg135406 - (view) | Author: Ronald Oussoren (ronaldoussoren) * (Python committer) | Date: 2011-05-07 07:19 | |
The attached patch implements the change that Marc-Andre proposed. I intend to apply this patch to all active branches later today (after some more testing). |
|||
| msg136150 - (view) | Author: Roundup Robot (python-dev) (Python triager) | Date: 2011-05-17 12:10 | |
New changeset 932de36903e7 by Ronald Oussoren in branch '2.7':
(backport) Fix #10154 and #10090: locale normalizes the UTF-8 encoding to "UTF-8" instead of "UTF8"
http://hg.python.org/cpython/rev/932de36903e7
New changeset 28e410eb86af by Ronald Oussoren in branch '3.1':
Fix #10154 and #10090: locale normalizes the UTF-8 encoding to "UTF-8" instead of "UTF8"
http://hg.python.org/cpython/rev/28e410eb86af
New changeset 454d13e535ff by Ronald Oussoren in branch '3.2':
(merge) Fix #10154 and #10090: locale normalizes the UTF-8 encoding to "UTF-8" instead of "UTF8"
http://hg.python.org/cpython/rev/454d13e535ff |
|||
| msg136154 - (view) | Author: Roundup Robot (python-dev) (Python triager) | Date: 2011-05-17 12:49 | |
New changeset 3d7cb852a176 by Ronald Oussoren in branch 'default':
Fix for issue 10154, merge from 3.2
http://hg.python.org/cpython/rev/3d7cb852a176 |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:57:07 | admin | set | github: 54363 |
| 2014-10-02 08:28:35 | serhiy.storchaka | link | issue1176504 superseder |
| 2011-05-17 12:49:51 | python-dev | set | messages: + msg136154 |
| 2011-05-17 12:14:26 | ronaldoussoren | set | status: open -> closed |
| 2011-05-17 12:14:03 | ronaldoussoren | set | resolution: fixed; stage: needs patch -> resolved |
| 2011-05-17 12:10:12 | python-dev | set | nosy: + python-dev; messages: + msg136150 |
| 2011-05-07 08:07:34 | vstinner | set | nosy: + vstinner |
| 2011-05-07 07:19:43 | ronaldoussoren | set | files: + issue10154.patch; keywords: + patch; messages: + msg135406 |
| 2011-04-26 10:18:54 | lemburg | set | messages: + msg134450; title: locale.normalize strips "-" from UTF-8, which fails on Mac -> locale.normalize strips "-" from UTF-8, which fails on Mac |
| 2011-04-23 15:46:10 | eric.araujo | set | title: locale.normalize strips "-" from UTF-8, which fails on Mac -> locale.normalize strips "-" from UTF-8, which fails on Mac; stage: needs patch; versions: + Python 3.3, - Python 2.6, Python 2.5 |
| 2011-04-22 16:52:24 | PiotrSikora | set | nosy: + PiotrSikora; messages: + msg134271 |
| 2011-02-27 22:00:08 | Boris.FELD | set | nosy: + Boris.FELD; messages: + msg129662; versions: + Python 2.6, Python 2.5 |
| 2010-12-09 02:34:31 | ruseel | set | messages: + msg123667 |
| 2010-12-07 14:29:42 | ronaldoussoren | set | messages: + msg123553 |
| 2010-11-25 15:42:48 | pitrou | set | nosy: + pitrou; messages: + msg122374 |
| 2010-11-25 02:12:40 | ruseel | set | nosy: + ruseel |
| 2010-10-22 17:37:08 | eric.araujo | link | issue10090 dependencies |
| 2010-10-21 15:27:04 | georg.brandl | set | nosy: + georg.brandl; messages: + msg119309 |
| 2010-10-21 14:15:06 | lemburg | set | messages: + msg119301 |
| 2010-10-21 13:53:57 | ixokai | set | messages: + msg119298 |
| 2010-10-20 21:47:40 | lemburg | set | nosy: + lemburg; title: locale.normalize strips "-" from UTF-8, which fails on Mac -> locale.normalize strips "-" from UTF-8, which fails on Mac; messages: + msg119236 |
| 2010-10-20 15:49:01 | ronaldoussoren | set | files: - smime.p7s |
| 2010-10-20 15:46:22 | ronaldoussoren | set | files: + smime.p7s; messages: + msg119216 |
| 2010-10-20 15:31:23 | ixokai | create | |