Created on 2015-02-01 19:24 by skip.montanaro, last changed 2022-04-11 14:58 by admin. This issue is now closed.
| Files | |
|---|---|
| File name | Uploaded |
| pydoc_encoding.patch | serhiy.storchaka, 2015-02-02 15:21 |
| pydoc_encoding_2.patch | serhiy.storchaka, 2015-02-15 12:59 |
| Messages (12) | |||
|---|---|---|---|
| msg235200 - (view) | Author: Skip Montanaro (skip.montanaro) * (Python triager) | Date: 2015-02-01 19:24 | |
I'm probably doing something wrong, but I've tried everything I can think of without any success. In Python 2.7, the pydoc command successfully displays help for the sqlite3 package, though it muffs the output of Gerhard Häring's name, spitting out the original Latin-1 spelling. In Python 3.x, I get a UnicodeEncodeError for my trouble, and it hoses my tty settings to boot, requiring a LF reset LF sequence to put right unless I set PAGER to "cat". Here's a sample run:
% PAGER=cat pydoc3.5 sqlite3
Traceback (most recent call last):
  File "/Users/skip/local/bin/pydoc3.5", line 5, in <module>
    pydoc.cli()
  File "/Users/skip/local/lib/python3.5/pydoc.py", line 2591, in cli
    help.help(arg)
  File "/Users/skip/local/lib/python3.5/pydoc.py", line 1874, in help
    elif request: doc(request, 'Help on %s:', output=self._output)
  File "/Users/skip/local/lib/python3.5/pydoc.py", line 1612, in doc
    pager(render_doc(thing, title, forceload))
  File "/Users/skip/local/lib/python3.5/pydoc.py", line 1412, in pager
    pager(text)
  File "/Users/skip/local/lib/python3.5/pydoc.py", line 1428, in <lambda>
    return lambda text: pipepager(text, os.environ['PAGER'])
  File "/Users/skip/local/lib/python3.5/pydoc.py", line 1455, in pipepager
    pipe.write(text)
UnicodeEncodeError: 'ascii' codec can't encode character '\xe4' in position 600: ordinal not in range(128)
I understand the error, but I see no way to convince it to use any codec other than "ascii". Stuff I tried:
* setting PYTHONIOENCODING to "UTF-8" (suggested by Peter Otten on c.l.py)
* setting LANG to "en_US.utf8"
This is on a Mac running Yosemite with pydoc invoked in Apple's Terminal app. Display is fine in my browser when I run pydoc as a web server. The source it is attempting to display has a coding cookie, so it should know that the code is encoded using Latin-1. The problem seems to all be about generating output.
|
|||
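The failure in the traceback above can be reproduced without pydoc or a pager. The following is a minimal sketch, assuming an in-memory stream stands in for the pager pipe; the helper name `write_to_ascii_stream` is hypothetical and is not part of pydoc.

```python
import io


def write_to_ascii_stream(text):
    """Write text to an in-memory text stream that only accepts ASCII.

    The stream is a stand-in (an assumption of this sketch) for the pipe
    that pydoc opens to the pager. Returns the UnicodeEncodeError that was
    raised, or None if the text could be encoded.
    """
    stream = io.TextIOWrapper(io.BytesIO(), encoding="ascii")
    try:
        stream.write(text)
        stream.flush()
    except UnicodeEncodeError as exc:
        return exc
    return None


error = write_to_ascii_stream("Gerhard H\xe4ring")
print(type(error).__name__, "-", error.reason)
```

Pure ASCII input passes through the same helper without an error, which is consistent with the report that only modules with non-ASCII docstrings trigger the failure.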
| msg235201 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2015-02-01 19:51 | |
What are sys.getfilesystemencoding(), locale.getpreferredencoding(False), os.popen('cat', 'w').encoding?
|
|||
| msg235202 - (view) | Author: Skip Montanaro (skip.montanaro) * (Python triager) | Date: 2015-02-01 20:15 | |
Without setting any environment variables:
>>> import sys
>>> sys.getfilesystemencoding()
'utf-8'
>>> import locale
>>> locale.getpreferredencoding(False)
'US-ASCII'
>>> import os
>>> os.popen('cat', 'w').encoding
'US-ASCII'
If I set PYTHONIOENCODING=UTF-8:
>>> import sys, locale, os
>>> sys.getfilesystemencoding()
'utf-8'
>>> locale.getpreferredencoding(False)
'US-ASCII'
>>> os.popen('cat', 'w').encoding
'US-ASCII'
If I set LANG=en_US.utf8:
>>> import sys, locale, os
>>> sys.getfilesystemencoding()
'utf-8'
>>> locale.getpreferredencoding(False)
'US-ASCII'
>>> os.popen('cat', 'w').encoding
'US-ASCII'
It appears neither of these environment variables does much in my environment.
I should point out that I just updated to Mac OS X 10.10.2 a couple
days ago. I have no idea if this problem existed before that upgrade.
Realizing that perhaps something had changed in the underlying
operating system support, I rebuilt Python 2.6 through 3.5 from
scratch. Same result.
|
|||
| msg235203 - (view) | Author: Skip Montanaro (skip.montanaro) * (Python triager) | Date: 2015-02-01 20:19 | |
Peter Otten posted a solution on c.l.py. The issue is that I didn't mix my case properly when setting LANG:
hgpython% LANG=en_US.UTF-8 python3.5 -c 'import locale; print(locale.getpreferredencoding(False))'
UTF-8
hgpython% LANG=en_US.utf8 python3.5 -c 'import locale; print(locale.getpreferredencoding(False))'
US-ASCII
|
|||
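The same comparison can be scripted. This is a rough sketch, assuming a child interpreter started with a modified environment; the helper name `preferred_encoding_for` is hypothetical. Results depend on the platform and the Python version: Python 3.7 and later may coerce a C or unsupported locale to UTF-8, so the US-ASCII result reported here may not reproduce.

```python
import os
import subprocess
import sys


def preferred_encoding_for(lang):
    """Return locale.getpreferredencoding(False) as seen by a child
    interpreter started with LANG set to `lang`."""
    env = dict(os.environ, LANG=lang)
    # LC_ALL and LC_CTYPE take precedence over LANG, so drop them for the test.
    for name in ("LC_ALL", "LC_CTYPE"):
        env.pop(name, None)
    result = subprocess.run(
        [sys.executable, "-c",
         "import locale; print(locale.getpreferredencoding(False))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


for lang in ("en_US.UTF-8", "en_US.utf8"):
    print(lang, "->", preferred_encoding_for(lang))
```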
| msg235204 - (view) | Author: Skip Montanaro (skip.montanaro) * (Python triager) | Date: 2015-02-01 20:26 | |
On Sun, Feb 1, 2015 at 2:19 PM, Skip Montanaro <report@bugs.python.org> wrote:
> The issue is that I didn't
> mix my case properly when setting LANG:
Actually, it's that the hyphen is required in "utf-8" or "UTF-8".
|
|||
| msg235206 - (view) | Author: Skip Montanaro (skip.montanaro) * (Python triager) | Date: 2015-02-01 20:59 | |
Final note here. Peter also did a bit of digging. Here's his note about what he found on c.l.py:
The pager is invoked by os.popen(), and after some digging I find that it uses a io.TextIOWrapper() to write the help text. This in turn uses locale.getpreferredencoding(False), i.e. you were right to set LANG and PYTHONIOENCODING is not relevant.
I was also able to provoke this problem on an openSuSE 12.2 system with 3.2.3 installed. In that environment (confirmed by Chris Angelico on his Linux system), the case of "utf" didn't matter, nor did it matter if "utf-8" was hyphenated or not. Obviously the Mac continues to be a rather touchy system w.r.t. locale. I don't know if Python should try to be accommodating here, but my inclination is "no". OTOH, maybe io.TextIOWrapper should look at PYTHONIOENCODING, or the pager should be invoked through something other than os.popen (assuming there is a suitable replacement which does pay attention to PYTHONIOENCODING).
|
|||
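Peter's observation can be checked directly: an io.TextIOWrapper created without an explicit encoding takes its default from the locale, while PYTHONIOENCODING only configures sys.stdin, sys.stdout and sys.stderr. The sketch below is illustrative only; the helper `normalized` is hypothetical and exists just to compare codec names that differ in spelling.

```python
import codecs
import io
import locale


def normalized(name):
    """Map codec aliases such as 'UTF-8' and 'utf8' to one canonical name."""
    return codecs.lookup(name).name


# No encoding given: the wrapper falls back to the locale's encoding.
default_stream = io.TextIOWrapper(io.BytesIO())
# Encoding given explicitly: the locale is not consulted.
explicit_stream = io.TextIOWrapper(io.BytesIO(), encoding="utf-8")

print("locale preferred encoding:", locale.getpreferredencoding(False))
print("TextIOWrapper default:    ", default_stream.encoding)
print("TextIOWrapper explicit:   ", explicit_stream.encoding)
```

Passing the encoding explicitly, as in `explicit_stream`, is one way for a caller to avoid depending on the locale at all.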
| msg235208 - (view) | Author: Martin Panter (martin.panter) * (Python committer) | Date: 2015-02-01 21:32 | |
Maybe because a pager sends its bytes more-or-less straight through from input to output, the PYTHONIOENCODING (sys.stdout.encoding?) should be used for the TextIOWrapper to the pager’s input in this case. I’m not so sure this should be assumed in general though. |
|||
| msg235263 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2015-02-02 15:21 | |
There are a few levels to this issue:
1) pydoc doesn't escape characters according to the output encoding. It escapes characters that are unencodable with sys.getfilesystemencoding(), but this encoding can differ from the encoding of sys.stdout or the default encoding.
2) The default encoding for io.TextIOWrapper() and open() can be different from sys.getfilesystemencoding(). And it can unexpectedly be ASCII.
3) Mac OS doesn't support locales with the utf8 encoding (without hyphen).
Here is a patch which solves the first level: it makes pydoc use the appropriate encoding with the backslashreplace error handler.
|
|||
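To illustrate what the backslashreplace error handler does, here is a minimal sketch. It is not the code from pydoc_encoding.patch; the helper name `escape_for` is hypothetical.

```python
def escape_for(text, encoding):
    """Return `text` with every character that `encoding` cannot represent
    replaced by a backslash escape, so the result is safe to write to a
    stream that uses that encoding."""
    return text.encode(encoding, "backslashreplace").decode(encoding)


name = "Gerhard H\xe4ring"

# ASCII cannot represent U+00E4, so it is replaced by the four
# characters backslash, x, e, 4.
print(escape_for(name, "ascii"))

# UTF-8 can represent every character, so the text is unchanged.
print(escape_for(name, "utf-8") == name)
```

The escaped form is lossy for display purposes but never raises, which is the trade-off the message describes.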
| msg236036 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2015-02-15 12:59 | |
Added a test. |
|||
| msg236076 - (view) | Author: Martin Panter (martin.panter) * (Python committer) | Date: 2015-02-15 22:49 | |
Patch looks sensible to me. This is another example of where Issue 15216 would be useful (a standard way to modify the encoding settings of a stream). |
|||
| msg236078 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2015-02-15 23:11 | |
In the case of this issue, pydoc needs to change not the encoding of stdout but the error handler of stdout. There is a similar issue with pprint (issue19100). |
|||
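The distinction between changing a stream's encoding and changing its error handler can be sketched as follows. This assumes an in-memory buffer in place of the real stdout and is not taken from the committed fix.

```python
import io

# An ASCII-only byte sink standing in for stdout or the pager pipe
# (an assumption of this sketch).
raw = io.BytesIO()

# The encoding stays ASCII; only the error handler differs from the
# default "strict". newline="\n" disables platform newline translation.
stream = io.TextIOWrapper(
    raw, encoding="ascii", errors="backslashreplace", newline="\n"
)
stream.write("Gerhard H\xe4ring\n")
stream.flush()

print(raw.getvalue())
```

On Python 3.7 and later, an existing stream such as sys.stdout can be adjusted in place with `reconfigure(errors="backslashreplace")`; that method did not exist when this issue was discussed.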
| msg236334 - (view) | Author: Roundup Robot (python-dev) (Python triager) | Date: 2015-02-20 21:48 | |
New changeset e7b6b1f57268 by Serhiy Storchaka in branch '3.4':
Issue #23374: Fixed pydoc failure with non-ASCII files when stdout encoding
https://hg.python.org/cpython/rev/e7b6b1f57268
New changeset affe167a45f3 by Serhiy Storchaka in branch 'default':
Issue #23374: Fixed pydoc failure with non-ASCII files when stdout encoding
https://hg.python.org/cpython/rev/affe167a45f3
|
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:58:12 | admin | set | github: 67563 |
| 2015-02-20 21:49:08 | serhiy.storchaka | set | status: open -> closed; resolution: fixed; stage: patch review -> resolved |
| 2015-02-20 21:48:31 | python-dev | set | nosy: + python-dev; messages: + msg236334 |
| 2015-02-15 23:11:53 | serhiy.storchaka | set | messages: + msg236078 |
| 2015-02-15 22:49:05 | martin.panter | set | messages: + msg236076 |
| 2015-02-15 12:59:01 | serhiy.storchaka | set | files: + pydoc_encoding_2.patch; messages: + msg236036 |
| 2015-02-15 12:08:41 | serhiy.storchaka | set | assignee: serhiy.storchaka |
| 2015-02-02 15:21:25 | serhiy.storchaka | set | files: + pydoc_encoding.patch; versions: - Python 3.2, Python 3.3; messages: + msg235263; keywords: + patch; type: crash -> behavior; stage: patch review |
| 2015-02-01 23:00:19 | r.david.murray | set | nosy: + r.david.murray |
| 2015-02-01 21:32:48 | martin.panter | set | nosy: + martin.panter; messages: + msg235208 |
| 2015-02-01 20:59:39 | skip.montanaro | set | messages: + msg235206 |
| 2015-02-01 20:26:39 | skip.montanaro | set | messages: + msg235204 |
| 2015-02-01 20:19:07 | skip.montanaro | set | messages: + msg235203 |
| 2015-02-01 20:15:41 | skip.montanaro | set | messages: + msg235202 |
| 2015-02-01 19:51:30 | serhiy.storchaka | set | nosy: + serhiy.storchaka; messages: + msg235201 |
| 2015-02-01 19:24:18 | skip.montanaro | create | |