Created on 2009-07-01 10:51 by mark.dickinson, last changed 2022-04-11 14:56 by admin. This issue is now closed.
| Files | | |
|---|---|---|
| File name | Uploaded | Description |
| issue6393-fix.patch | ronaldoussoren, 2009-07-09 08:31 | |
| Pull Requests | | |
|---|---|---|
| URL | Status | Linked |
| PR 1555 | merged | vstinner, 2017-05-12 09:28 |
| Messages (21) | |||
|---|---|---|---|
| msg89972 | Author: Mark Dickinson (mark.dickinson) * (Python committer) | Date: 2009-07-01 10:51 | |
There was a report[1] on c.l.p. that python3 from the OS X Python 3.1 dmg download at www.python.org/download/releases/3.1/ crashes on startup. I can reproduce this with the python.org download (using the OS X Terminal), but only with a bad locale setting:

newton:~ dickinsm$ LANG=utf-8 python3
Fatal Python error: Py_Initialize: can't initialize sys standard streams
LookupError: unknown encoding:
Abort trap (core dumped)

The core dump isn't useful: just lots of 'No symbol table info available.' This is on OS X 10.5.7/Intel. I can't reproduce it with either the py3k branch or the release31-maint branch, built from scratch.

I suspect that this has to do with the behaviour of nl_langinfo(CODESET) on OS X. After calling setlocale(LC_CTYPE, "") in C, nl_langinfo(CODESET) appears to return "UTF-8" for well-defined utf-8 locales (e.g., 'en_US.UTF-8') and "US-ASCII" for meaningless locales (e.g., 'invalid'), but "" for locales like 'utf-8' or 'en_US'. This in turn affects Python's locale.getpreferredencoding function.

See also issue 2173, which may be related. Ronald, any ideas?

[1] http://mail.python.org/pipermail/python-list/2009-June/718255.html |
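As a rough illustration of the behaviour described in this report, the following Python sketch queries the codeset the same way: it adopts the LC_CTYPE setting from the environment and then asks nl_langinfo(CODESET). The helper name current_codeset is made up for this example, and locale.nl_langinfo is only available on Unix-like systems. What it prints depends entirely on the platform and on LANG/LC_CTYPE, so no output is shown.

```python
import locale
import os


def current_codeset():
    """Return nl_langinfo(CODESET) after adopting the environment's LC_CTYPE.

    Hypothetical helper for illustration only; Unix-only because it relies
    on locale.nl_langinfo.
    """
    saved = locale.setlocale(locale.LC_CTYPE)  # remember the active locale
    try:
        try:
            # Python-level equivalent of the C call setlocale(LC_CTYPE, "").
            locale.setlocale(locale.LC_CTYPE, "")
        except locale.Error:
            # Unsupported locale name: keep whatever locale was active.
            pass
        return locale.nl_langinfo(locale.CODESET)
    finally:
        # Restore the previous locale so the caller is not affected.
        locale.setlocale(locale.LC_CTYPE, saved)


if __name__ == "__main__":
    # Show the variables that select LC_CTYPE, then the resulting codeset.
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        print(name, "=", repr(os.environ.get(name)))
    print("CODESET =", repr(current_codeset()))
```

Running it as, for example, `LANG=utf-8 python3 script.py` versus `LANG=en_US.UTF-8 python3 script.py` is one way to compare the results on a given machine.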
|||
| msg90285 | Author: Ned Deily (ned.deily) * (Python committer) | Date: 2009-07-08 21:58 | |
This is a side effect of the fix for Issue6202. Prior to r73268, locale.getpreferredencoding always returned "mac-roman" regardless of the setting of LANG, so this wasn't a problem in py3k (or 3.0.x builds) up through 3.1rc1. I can reproduce it on current py3k and release31-maint. |
|||
| msg90302 | Author: Ned Deily (ned.deily) * (Python committer) | Date: 2009-07-09 03:55 | |
Note, you can produce the same error on OS X or linux by setting PYTHONIOENCODING="", which effectively overrides the value returned by nl_langinfo(CODESET). In pythonrun.c, create_stdio passes PYTHONENCODING, if set, on as the "encoding" value to TextIOWrapper. If no encoding was specified, TextIOWrapper uses the value returned by locale.getpreferredencoding(). It then calls PyCodec_IncrementalDecoder and the unknown (or empty) encoding is finally detected.

That raises the question of how far python should go in protecting the user. One *could* add a check in pythonrun.c to substitute some suitable default (UTF-8) if nl_langinfo(CODESET) returns an empty value. Or perhaps just abort there with a more meaningful error message. |
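The last step of that chain, where an empty encoding name is finally rejected, can be observed from Python without going through interpreter startup. This is only a sketch: codecs.getincrementaldecoder is used here as the Python-level counterpart of the codec lookup, and the helper name can_decode_with is invented for the example.

```python
import codecs


def can_decode_with(encoding):
    """Return True if an incremental decoder can be found for `encoding`.

    Hypothetical helper; it only performs the codec lookup and does not
    create any stream objects.
    """
    try:
        codecs.getincrementaldecoder(encoding)
    except LookupError:
        # Raised for unknown names, including the empty string.
        return False
    return True


for name in ("utf-8", ""):
    print(repr(name), can_decode_with(name))
```

The empty name fails the lookup with LookupError, which is the same exception type that appears in the startup failure reported above.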
|||
| msg90303 | Author: Ned Deily (ned.deily) * (Python committer) | Date: 2009-07-09 03:58 | |
"... create_stdio passes PYTHONIOENCODING ..." |
|||
| msg90308 | Author: Mark Dickinson (mark.dickinson) * (Python committer) | Date: 2009-07-09 07:51 | |
> One *could* add a check in pythonrun.c to substitute some suitable default (UTF-8) if nl_langinfo(CODESET) returns an empty value.

While googling for the source of this problem, I found other software projects that take this approach. It doesn't seem totally unreasonable. I just wish I understood *why* nl_langinfo(CODESET) is returning "" in these cases. I've looked for the source at http://www.opensource.apple.com, but can't find it; maybe that part of Darwin isn't open source.

It seems that a lot of people end up with an OS X Terminal setup such that LC_CTYPE is 'UTF-8' (perhaps this is a 10.4 thing; I haven't encountered this myself). I don't think these people should have to deal with a confusing error on startup; defaulting to UTF-8 on OS X seems like a reasonable compromise. |
|||
| msg90310 | Author: Ronald Oussoren (ronaldoussoren) * (Python committer) | Date: 2009-07-09 08:02 | |
The manpage says that nl_langinfo returns an empty string when there is an invalid setting.

There is validity in saying that 'LANG=utf-8' is an invalid setting: the LANG variable is supposed to be a locale name, which would be a language setting (possibly combined with a codeset definition). "utf-8" is not a language.

I wouldn't mind falling back to utf-8 as the default codeset when nl_langinfo returns an empty string, because utf-8 is the default character set on OSX, and furthermore defaulting to some value is way better than crashing.

I do wonder how the user ended up with LANG=utf-8 in the first place. |
|||
| msg90312 | Author: Mark Dickinson (mark.dickinson) * (Python committer) | Date: 2009-07-09 08:11 | |
> There is validity in saying that 'LANG=utf-8' is an invalid setting

Agreed. But that doesn't really explain why e.g. LANG=en_US also produces "", while LANG=invalid produces "US-ASCII".

> I do wonder how the user ended up with LANG=utf-8 in the first place.

Me too. As far as I can gather, it's a result of setting the Terminal preferences (particularly the character encoding and 'Set LANG environment variable on startup' checkbox) in some particular way, on some versions of OS X, for users in some countries, at some particular phases of the moon, etc... |
|||
| msg90314 | Author: Ronald Oussoren (ronaldoussoren) * (Python committer) | Date: 2009-07-09 08:31 | |
The attached patch (issue6393-fix.patch) seems to fix the issue. Could you please test and have a look at the patch? It basically tests whether the output of nl_langinfo(CODESET) is the empty string and defaults to 'UTF-8' in that case (but only on OSX). I intend to apply this patch unless someone objects to that. |
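The logic described for the patch can be sketched roughly as follows. This is not the contents of issue6393-fix.patch: the function name codeset_with_fallback, the query_codeset parameter, and the use of sys.platform == "darwin" to detect OS X are all assumptions made so the idea can be run on any platform.

```python
import sys


def codeset_with_fallback(query_codeset, platform=sys.platform):
    """Return the codeset, substituting UTF-8 for an empty answer on OS X.

    `query_codeset` is a hypothetical stand-in for a call such as
    locale.nl_langinfo(locale.CODESET); it is a parameter only so the
    logic can be exercised without changing the process locale.
    """
    # Query first, then test the value that was obtained.
    result = query_codeset()
    if not result and platform == "darwin":
        # Assumed platform check: only OS X gets the UTF-8 default.
        result = "UTF-8"
    return result


# An empty codeset is replaced on "darwin" and left alone elsewhere.
print(repr(codeset_with_fallback(lambda: "", platform="darwin")))
print(repr(codeset_with_fallback(lambda: "", platform="linux")))
print(repr(codeset_with_fallback(lambda: "US-ASCII", platform="darwin")))
```

Passing the query as a callable is purely a testing convenience here; a real implementation would presumably call nl_langinfo directly.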
|||
| msg90320 | Author: Mark Dickinson (mark.dickinson) * (Python committer) | Date: 2009-07-09 09:55 | |
Thanks, Ronald! The patch fixes the problem for me. (I directly patched the locale.py file installed from the Python dmg, since I still haven't figured out how to build a python executable that exhibits this problem.)

The patch doesn't look quite right, though: in the else clause, it looks as though you're testing 'result' before it exists. Shouldn't the 'result = nl_langinfo(CODESET)' line come before the 'if not result and ....' line?

On the subject of Terminal and LANG, LC_CTYPE settings, I found an interesting link:

http://pastie.textmate.org/111807

Indeed, after setting my region to 'South Africa' in Preferences -> International -> Formats, a newly opened Terminal window gives me:

newton:~ dickinsm$ locale
LANG=
LC_COLLATE="C"
LC_CTYPE="UTF-8"
LC_MESSAGES="C"
LC_MONETARY="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_ALL=

And then python3 crashes on startup as above. This is on a newborn (3-week-old) MacBook Pro that's been barely changed from default settings (and no transfer of files and settings from an old Mac, either). |
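To see why LC_CTYPE="UTF-8" matters even when LANG is empty, it may help to recall the usual POSIX precedence for the character-type category: LC_ALL wins if set and non-empty, then LC_CTYPE, then LANG, and otherwise the "C" locale applies. The sketch below only models that precedence on a dictionary; the function name effective_ctype_name is invented for this example and it does not call setlocale.

```python
import os


def effective_ctype_name(environ=None):
    """Return the locale name that would be used for LC_CTYPE.

    Hypothetical helper modelling the POSIX precedence
    LC_ALL > LC_CTYPE > LANG, with "C" as the final default.
    """
    env = os.environ if environ is None else environ
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:  # unset and empty values are both skipped
            return value
    return "C"


# The environment from the Terminal session quoted above.
terminal_env = {"LANG": "", "LC_CTYPE": "UTF-8", "LC_ALL": ""}
print(effective_ctype_name(terminal_env))
```

With the quoted Terminal settings this selects the name 'UTF-8', which is one of the names for which nl_langinfo(CODESET) was reported to return an empty string.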
|||
| msg90323 | Author: Ronald Oussoren (ronaldoussoren) * (Python committer) | Date: 2009-07-09 10:16 | |
Good catch, the code in the else is indeed in the wrong order. |
|||
| msg90373 | Author: Ned Deily (ned.deily) * (Python committer) | Date: 2009-07-10 03:35 | |
Looks good and the "patched" patch also works in a py3k installer build. BTW, Mark, I was curious as to why you were unable to reproduce the problem with your own build. I should have mentioned that my testing was with complete installer (framework) builds. I subsequently experimented with a non-framework build and found that I could not reproduce the problem running from the ./python in the build directory. Stepping through gdb showed that, during the calls from create_stdio, the import of locale fails in textio.c, so it falls back to using "ascii" as the default encoding (~line 899) and avoids the crash. If I do a make install, the unpatched installed bin/python3 does crash in the same way as with the installer python3. |
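The fallback observed in the non-framework build, where a failed import of locale leads to "ascii" being used, follows a pattern that can be sketched in Python. This is an illustration of the pattern only, not the C code in textio.c; the function name default_stream_encoding and its module_name parameter are invented so the failure path can be triggered on purpose.

```python
import importlib


def default_stream_encoding(module_name="locale"):
    """Pick a text encoding, falling back to "ascii" if the import fails.

    Hypothetical helper; `module_name` exists only so the import failure
    can be simulated with a name that does not exist.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # No locale module available: use the conservative default.
        return "ascii"
    # False: do not change the process locale while asking.
    return module.getpreferredencoding(False)


print(default_stream_encoding())                         # real locale module
print(default_stream_encoding("no_such_locale_module"))  # import fails
```

The first call depends on the local environment; the second always takes the fallback branch because the module name does not exist.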
|||
| msg90445 | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2009-07-12 12:49 | |
Once this patch is checked in, should we do an emergency 3.1.1 release? |
|||
| msg90447 | Author: Mark Dickinson (mark.dickinson) * (Python committer) | Date: 2009-07-12 13:00 | |
I don't know whether this is really worth a 3.1.1, all by itself. There's an easy workaround, which is for affected users to set their locale properly. |
|||
| msg90608 | Author: Graham Dumpleton (grahamd) | Date: 2009-07-17 07:39 | |
I see this problem on both MacOS X 10.5 and on Windows. This is when using Python embedded inside of Apache/mod_wsgi.

On MacOS X the error is:

Fatal Python error: Py_Initialize: can't initialize sys standard streams
ImportError: No module named encodings.utf_8

On Windows the error is:

Fatal Python error: Py_Initialize: can't initialize sys standard streams
LookupError: unknown encoding: cp0

The discussion of the fix mentioned that it only addresses MacOS X. What about the Windows case I am seeing? Will it help with that at all? |
|||
| msg90609 | Author: Graham Dumpleton (grahamd) | Date: 2009-07-17 07:41 | |
Hmmm, actually my MacOS X error is different, although the Windows one is the same, except that an encoding is listed and isn't empty. |
|||
| msg90610 | Author: Graham Dumpleton (grahamd) | Date: 2009-07-17 07:49 | |
You can ignore my MacOS X example as that was caused by something else. My question still stands as to whether the fix will address the similar problem I saw on Windows. |
|||
| msg90617 | Author: Graham Dumpleton (grahamd) | Date: 2009-07-17 10:24 | |
I have created issue6501 for my Windows variant of this problem, given that it appears to be subtly different: there is an encoding listed, whereas the MacOS X variant doesn't have one.

Seeing that the fix for the MacOS X issue is in Python code, I will, when I have a chance, look at whether I can work out a fix for the Windows variant. I'm not sure I have the right tools to compile Python from C code on Windows, so if it is a C code problem, I'm not sure I can really investigate. |
|||
| msg92322 | Author: Ronald Oussoren (ronaldoussoren) * (Python committer) | Date: 2009-09-06 14:02 | |
I've applied the fixed version of my patch in r74687 (3.x) and r74688 (3.1). |
|||
| msg93174 | Author: Svetoslav Agafonkin (slavi) | Date: 2009-09-27 15:58 | |
There is an error in the r74687 (3.x) and r74688 (3.1) fixes: in the 'else' clause there should be a 'return result' at the end. |
|||
| msg95124 | Author: Ned Deily (ned.deily) * (Python committer) | Date: 2009-11-10 18:08 | |
The missing return result in the else case has been subsequently fixed in r75539 (py3k) and r75541 (3.0) so this issue should be re-closed. |
|||
| msg293537 | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2017-05-12 09:51 | |
New changeset 94a3694c3dda97e3bcb51264bf47d948c5424d84 by Victor Stinner in branch '2.7':
bpo-6393: Fix locale.getprerredencoding() on macOS (#1555)
https://github.com/python/cpython/commit/94a3694c3dda97e3bcb51264bf47d948c5424d84 |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:56:50 | admin | set | github: 50642 |
| 2017-05-12 09:51:40 | vstinner | set | nosy: + vstinner; messages: + msg293537 |
| 2017-05-12 09:28:12 | vstinner | set | pull_requests: + pull_request1651 |
| 2010-02-10 18:14:42 | srid | set | nosy: + srid |
| 2009-11-24 16:37:44 | ronaldoussoren | set | status: open -> closed |
| 2009-11-10 18:08:14 | ned.deily | set | messages: + msg95124 |
| 2009-09-27 15:58:14 | slavi | set | status: pending -> open; nosy: + slavi; messages: + msg93174 |
| 2009-09-06 14:02:36 | ronaldoussoren | set | status: open -> pending; resolution: fixed; messages: + msg92322; stage: resolved |
| 2009-07-17 10:24:41 | grahamd | set | messages: + msg90617 |
| 2009-07-17 07:49:10 | grahamd | set | messages: + msg90610 |
| 2009-07-17 07:41:40 | grahamd | set | messages: + msg90609 |
| 2009-07-17 07:39:44 | grahamd | set | nosy: + grahamd; messages: + msg90608 |
| 2009-07-12 13:01:00 | mark.dickinson | set | messages: + msg90447 |
| 2009-07-12 12:49:47 | pitrou | set | priority: critical; versions: + Python 3.2; nosy: + pitrou, benjamin.peterson; messages: + msg90445 |
| 2009-07-10 03:35:39 | ned.deily | set | messages: + msg90373 |
| 2009-07-09 10:16:09 | ronaldoussoren | set | messages: + msg90323 |
| 2009-07-09 09:55:02 | mark.dickinson | set | messages: + msg90320 |
| 2009-07-09 08:31:24 | ronaldoussoren | set | keywords: + needs review, patch; files: + issue6393-fix.patch; messages: + msg90314 |
| 2009-07-09 08:11:50 | mark.dickinson | set | messages: + msg90312 |
| 2009-07-09 08:02:54 | ronaldoussoren | set | messages: + msg90310 |
| 2009-07-09 07:51:47 | mark.dickinson | set | messages: + msg90308 |
| 2009-07-09 03:58:05 | ned.deily | set | messages: + msg90303 |
| 2009-07-09 03:55:40 | ned.deily | set | messages: + msg90302 |
| 2009-07-08 21:58:28 | ned.deily | set | nosy: + ned.deily; messages: + msg90285 |
| 2009-07-08 19:50:34 | Phil | set | nosy: + Phil |
| 2009-07-01 10:51:55 | mark.dickinson | create | |