Created on 2011-11-04 14:37 by stefanholek, last changed 2022-04-11 14:57 by admin. This issue is now closed.
| Files | ||
|---|---|---|
| File name | Uploaded | Description |
| input_readline.patch | pitrou, 2011-11-04 17:25 | |
| Messages (15) | |||
|---|---|---|---|
| msg147005 | Author: Stefan Holek (stefanholek) | Date: 2011-11-04 14:37 | |
The input builtin always uses "strict" error handling for Unicode conversions. This means that when I enter a latin-1 string in a utf-8 environment, input breaks with a UnicodeDecodeError. Now don't tell me not to do that, I have a valid use case. ;-) While "strict" may be a good default choice, it is clearly not sufficient. I would like to propose an optional 'errors' argument to input, similar to the 'errors' argument the decode and encode methods have. I have in fact implemented such an input method for my own use: https://github.com/stefanholek/rl/blob/surrogate-input/rl/input.c While this solves my immediate needs, the fact that my implementation is basically just a copy of bltinmodule.input with one additional argument makes me think that this could be fixed in Python proper. There cannot be a reason input() should be confined to "strict", or can there? ;-) |
|||
| msg147007 | Author: Benjamin Peterson (benjamin.peterson) * (Python committer) | Date: 2011-11-04 14:49 | |
There's no reason you couldn't write your own input() function in Python to do this. |
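
A minimal sketch of what such a pure-Python replacement could look like. The name `input_with_errors` and its `stdin`/`stdout` parameters are hypothetical, not part of any Python API. It reads raw bytes from the stream's binary buffer and decodes them with the requested error handler, so it does not go through GNU readline and offers no line editing or history.

```python
import io
import sys


def input_with_errors(prompt="", errors="strict", stdin=None, stdout=None):
    """Read one line like input(), decoding it with the given error handler.

    Hypothetical helper: it reads raw bytes from the stream's binary
    buffer, so it bypasses GNU readline (no line editing or history).
    """
    stdin = sys.stdin if stdin is None else stdin
    stdout = sys.stdout if stdout is None else stdout

    stdout.write(prompt)
    stdout.flush()

    raw = stdin.buffer.readline()  # bytes, not yet decoded
    if not raw:
        raise EOFError("EOF when reading a line")

    line = raw.decode(stdin.encoding or "utf-8", errors)
    # Like input(), drop a single trailing newline.
    return line[:-1] if line.endswith("\n") else line


# Demo with an in-memory stream: Latin-1 bytes arriving on a UTF-8 stdin.
fake_stdin = io.TextIOWrapper(io.BytesIO(b"h\xe9h\xe9\n"), encoding="utf-8")
text = input_with_errors(errors="surrogateescape", stdin=fake_stdin, stdout=io.StringIO())
print(ascii(text))  # 'h\udce9h\udce9'
```

With `errors="surrogateescape"` the undecodable bytes survive as lone surrogates, so the original bytes can be recovered later with `text.encode("utf-8", "surrogateescape")`.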
|||
| msg147008 | Author: Stefan Holek (stefanholek) | Date: 2011-11-04 15:00 | |
I am not quite sure how I would write a custom, readline-using input function in Python (access to PyOS_Readline seems required), that's why I did it in C. Have an example? |
|||
| msg147010 | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2011-11-04 15:06 | |
> There cannot be a reason input() should be confined to "strict", or can
> there? ;-)

Actually, there's a good reason: in the non-interactive case, input() simply calls sys.stdin.read(), which doesn't have encoding or errors attributes. You want to override sys.stdin so that it has the right error handler.

However, there is a bug in input() in that it ignores sys.stdin's error handler in interactive mode (where it delegates to the readline library, if present):

    >>> import sys, io
    >>> sys.stdin = io.TextIOWrapper(sys.stdin.detach(), "ascii", "replace")
    >>> sys.stdin.read()
    héhé
    'h��h��\n'
    >>> input()
    héhé
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 1: ordinal not in range(128)

If you don't mind losing GNU readline functionality, the immediate workaround for you is to use sys.stdin.read() directly. |
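
The effect of the wrapper's error handler can be reproduced without a terminal. The sketch below is illustrative only: it substitutes an in-memory `io.BytesIO` for the real standard input, and `read_line` is a hypothetical helper name. It shows that the `errors` argument given to `io.TextIOWrapper` decides whether undecodable bytes raise or get replaced.

```python
import io


def read_line(raw: bytes, encoding: str, errors: str) -> str:
    """Decode one line from raw bytes through a TextIOWrapper.

    For the real stdin the equivalent would be:
        sys.stdin = io.TextIOWrapper(sys.stdin.detach(), encoding, errors)
    """
    stream = io.TextIOWrapper(io.BytesIO(raw), encoding=encoding, errors=errors)
    return stream.readline()


utf8_bytes = b"h\xc3\xa9h\xc3\xa9\n"  # "héhé\n" encoded as UTF-8

# "replace": every byte the ASCII codec cannot decode becomes U+FFFD.
replaced = read_line(utf8_bytes, "ascii", "replace")
print(ascii(replaced))  # 'h\ufffd\ufffdh\ufffd\ufffd\n'

# "strict": the same bytes raise UnicodeDecodeError instead.
try:
    read_line(utf8_bytes, "ascii", "strict")
except UnicodeDecodeError as exc:
    strict_error = exc
    print("strict handler failed:", exc.reason)
```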
|||
| msg147020 | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2011-11-04 17:25 | |
Here is a patch. The bugfix itself is quite pedestrian, but the test is more interesting. I did what I could to fork a subprocess into a pseudoterminal so as to trigger the GNU readline code path. The only limitation I've found is that I'm unable to read further on the child's stdout after input() has been called. The test therefore uses a pipe to do the return checking. |
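
The testing approach described here can be sketched roughly as follows. This is not the code from input_readline.patch. It is a simplified, POSIX-only illustration that uses `pty.openpty()` with `subprocess` rather than a fork, and the names `CHILD` and `input_via_pty` are hypothetical. The child's stdin and stdout are both attached to the pseudoterminal, so input() takes its terminal code path (GNU readline itself is only involved if the readline module happens to be loaded), and the result comes back through a separate pipe.

```python
import os
import pty
import subprocess
import sys

# Child program: rewrap stdin with the requested codec and error handler,
# call input(), and report the result on a pipe whose fd is given in argv.
CHILD = r"""
import io, os, sys
fd, encoding, errors = int(sys.argv[1]), sys.argv[2], sys.argv[3]
sys.stdin = io.TextIOWrapper(sys.stdin.detach(), encoding, errors)
os.write(fd, ascii(input()).encode("ascii"))
"""


def input_via_pty(data: bytes, encoding: str = "ascii", errors: str = "replace") -> str:
    """Run input() in a child attached to a pseudoterminal.

    Returns the ascii() form of what input() produced in the child,
    or an empty string if the child failed before reporting.
    """
    master, slave = pty.openpty()
    r, w = os.pipe()
    try:
        proc = subprocess.Popen(
            [sys.executable, "-c", CHILD, str(w), encoding, errors],
            stdin=slave, stdout=slave, pass_fds=(w,),
        )
    finally:
        # The parent keeps only the master side and the pipe's read end.
        os.close(slave)
        os.close(w)
    try:
        os.write(master, data)  # "type" the line into the terminal
        chunks = []
        while chunk := os.read(r, 1024):  # read until the child closes the pipe
            chunks.append(chunk)
        proc.wait(timeout=10)
    finally:
        os.close(master)
        os.close(r)
    return b"".join(chunks).decode("ascii")


print(input_via_pty(b"hello\n"))
```

Reporting through a pipe avoids having to read the child's stdout from the terminal after input() has run, which is the limitation mentioned above.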
|||
| msg147029 | Author: Charles-François Natali (neologix) * (Python committer) | Date: 2011-11-04 20:20 | |
> The bugfix itself is quite pedestrian, but the test is more interesting.

Indeed. Looks good to me. |
|||
| msg147035 | Author: Stefan Holek (stefanholek) | Date: 2011-11-04 20:51 | |
Thank you Antoine, this looks good. However when I try your example I get

    sys.stdin = io.TextIOWrapper(
        sys.stdin.detach(), 'ascii', 'replace')
    ValueError: underlying buffer has been detached

</helpforum> |
|||
| msg147036 | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2011-11-04 20:52 | |
> However when I try your example I get
>
> sys.stdin = io.TextIOWrapper(
>     sys.stdin.detach(), 'ascii', 'replace')
> ValueError: underlying buffer has been detached

Which version of Python, and which OS? It works fine here on the latest 3.2 and 3.3 branches. |
|||
| msg147038 | Author: Stefan Holek (stefanholek) | Date: 2011-11-04 20:59 | |
This is with Python 3.2.2 on Mac OS X 10.6 (SL). I have built Python from source with: ./configure; make; make install. |
|||
| msg147045 | Author: Stefan Holek (stefanholek) | Date: 2011-11-04 21:35 | |
    Python 3.2.2 (default, Nov 4 2011, 22:28:55)
    [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import sys, io
    >>> w = io.TextIOWrapper(sys.stdin.detach(), 'ascii', 'replace')
    >>> input
    <built-in function input>
    >>> input()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: underlying buffer has been detached |
|||
| msg147047 | Author: Stefan Holek (stefanholek) | Date: 2011-11-04 21:37 | |
Oops, the last one wasn't meant for the bug tracker. <blush> |
|||
| msg147049 | Author: Stefan Holek (stefanholek) | Date: 2011-11-04 21:47 | |
I can make it work at the interpreter prompt with your patch applied. Sorry for cluttering up the ticket. ;-) |
|||
| msg147050 | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2011-11-04 21:56 | |
> I can make it work at the interpreter prompt with your patch applied.
> Sorry for cluttering up the ticket. ;-)

That's ok, thanks a lot for testing. |
|||
| msg147123 | Author: Roundup Robot (python-dev) (Python triager) | Date: 2011-11-05 23:46 | |
New changeset 421c8e291221 by Antoine Pitrou in branch '3.2':
Issue #13342: input() used to ignore sys.stdin's and sys.stdout's unicode
http://hg.python.org/cpython/rev/421c8e291221

New changeset 992ba03d60a8 by Antoine Pitrou in branch 'default':
Issue #13342: input() used to ignore sys.stdin's and sys.stdout's unicode
http://hg.python.org/cpython/rev/992ba03d60a8 |
|||
| msg147124 | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2011-11-05 23:47 | |
Committed. I hope the test won't disturb the buildbots. |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:57:23 | admin | set | github: 57551 |
| 2011-11-05 23:47:21 | pitrou | set | status: open -> closed resolution: fixed messages: + msg147124 stage: patch review -> resolved |
| 2011-11-05 23:46:58 | python-dev | set | nosy: + python-dev messages: + msg147123 |
| 2011-11-04 21:56:21 | pitrou | set | messages: + msg147050 |
| 2011-11-04 21:47:01 | stefanholek | set | messages: + msg147049 |
| 2011-11-04 21:37:08 | stefanholek | set | messages: + msg147047 |
| 2011-11-04 21:35:08 | stefanholek | set | messages: + msg147045 |
| 2011-11-04 20:59:58 | stefanholek | set | messages: + msg147038 |
| 2011-11-04 20:52:34 | pitrou | set | messages: + msg147036 |
| 2011-11-04 20:51:04 | stefanholek | set | messages: + msg147035 |
| 2011-11-04 20:20:45 | neologix | set | messages: + msg147029 |
| 2011-11-04 17:25:45 | pitrou | set | files: + input_readline.patch nosy: + neologix messages: + msg147020 keywords: + patch stage: needs patch -> patch review |
| 2011-11-04 15:08:06 | vstinner | set | nosy: + vstinner |
| 2011-11-04 15:06:19 | pitrou | set | versions: - Python 3.4 nosy: + pitrou messages: + msg147010 components: + Interpreter Core, - Unicode stage: needs patch |
| 2011-11-04 15:00:57 | stefanholek | set | messages: + msg147008 |
| 2011-11-04 14:49:17 | benjamin.peterson | set | nosy: + benjamin.peterson messages: + msg147007 |
| 2011-11-04 14:37:14 | stefanholek | create | |