Created on 2012-10-09 20:09 by kunkku, last changed 2022-04-11 14:57 by admin. This issue is now closed.
| Files | | |
|---|---|---|
| File name | Uploaded | Description |
| readline-wide-char-index.patch | kunkku, 2012-10-09 20:09 | Patch for adjusting the scope indices |
| readline_locale.patch | serhiy.storchaka, 2016-05-23 06:02 | |
| readline_locale.v2.patch | martin.panter, 2016-05-31 13:53 | |
| readline_locale.v3.patch | martin.panter, 2016-06-02 09:46 | |
| readline_locale.v4.patch | martin.panter, 2016-06-12 10:57 | |
| Messages (18) | |||
|---|---|---|---|
| msg172517 | Author: Kaarle Ritvanen (kunkku) * | Date: 2012-10-09 20:09 | |
Tab completion in the readline module does not work well with Unicode terminals. The get_line_buffer function converts the line buffer to the str type (a Unicode string in Python 3), but the indices returned by get_begidx and get_endidx are not adjusted for multi-byte characters in the buffer, so they are not very useful. The documentation is a bit vague on the index functions, but I think they should be relative to code points, regardless of the encoding used by the C library. The suggested correction is attached. My second complaint concerns the use of PyUnicode_FromString in the module. The strings returned by the readline library use the current locale encoding, which is not necessarily UTF-8. I wonder if PyUnicode_DecodeLocale should be used instead for more portable code. |
|||
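The index mismatch described in this message can be illustrated in pure Python. This is only a minimal sketch, not the attached patch: the helper name `byte_to_codepoint_index` is hypothetical, and the example assumes the line buffer is encoded as UTF-8.

```python
def byte_to_codepoint_index(buffer: bytes, byte_index: int,
                            encoding: str = "utf-8") -> int:
    """Translate a byte offset into `buffer` to a code point offset.

    Hypothetical helper for illustration. It decodes the prefix that
    precedes the offset and counts the resulting code points, so it raises
    UnicodeDecodeError if the offset falls inside a multi-byte character.
    """
    return len(buffer[:byte_index].decode(encoding))


line = "é = fo"                 # what get_line_buffer() would return as str
raw = line.encode("utf-8")      # what the C library sees: b'\xc3\xa9 = fo'

# Byte offsets of the word "fo" in the C buffer.
beg_bytes, end_bytes = raw.index(b"fo"), len(raw)

# Offsets that are usable for slicing the decoded str.
beg = byte_to_codepoint_index(raw, beg_bytes)
end = byte_to_codepoint_index(raw, end_bytes)
word = line[beg:end]
```

With the unadjusted byte offsets, `line[beg_bytes:end_bytes]` would skip the first letter of the word, because "é" occupies two bytes but only one code point.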
| msg223105 | Author: Mark Lawrence (BreamoreBoy) * | Date: 2014-07-15 13:00 | |
@Kaarle, please accept our apologies for the delay in getting back to you. Can one of our Unicode gurus comment, please? |
|||
| msg266124 | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2016-05-23 06:02 | |
Yes, the readline module is broken in Python 3. The underlying C library operates on C strings and uses locale-dependent C functions to split them into characters. The Python wrapper always uses the UTF-8 encoding for converting between Python strings and C strings, so it works only in UTF-8 locales. get_begidx() and get_endidx() do not work correctly at all for non-ASCII data. We should use the locale encoding for the conversion. The proposed patch makes the readline module use locale-dependent encoding functions instead of always assuming UTF-8. It also corrects the indices for get_begidx() and get_endidx(). |
|||
| msg266318 | Author: Martin Panter (martin.panter) * (Python committer) | Date: 2016-05-25 07:43 | |
I’m a bit worried about flex_complete() covering up errors. If I add calls to PyErr_WriteUnraisable(), I can see both the s1 and s2 decodes failing. Input the following line:
>>> "©"; import x;
and then move the cursor back one place, so it is directly after the "x", but not after the semicolon (;). Then press Tab. Both errors are:
ValueError: embedded null byte
It looks like the reason is that PyUnicode_DecodeLocaleAndSize() requires that str[len] is a null character (the error message is misleading). It seems the len parameter is mainly there to verify that there are no embedded null characters, i.e. you cannot use it to give a truncated string. It looks like Py_DecodeLocale() is used underneath; maybe it is simpler to call that directly. But it does not solve the string truncating problem.
A test case (perhaps using a pseudo-terminal) might also help pick this kind of thing up, if we can’t report errors any other way.
|
|||
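The length check described above can be observed from Python through ctypes. This is a hedged sketch for experimentation only: it assumes a CPython build where `ctypes.pythonapi` exposes `PyUnicode_DecodeLocaleAndSize`, the two wrapper names are hypothetical, and copying the prefix is just one possible workaround, not necessarily what the later patches do.

```python
import ctypes

# Bind the C API function: (const char *str, Py_ssize_t len, const char *errors).
decode = ctypes.pythonapi.PyUnicode_DecodeLocaleAndSize
decode.argtypes = [ctypes.c_char_p, ctypes.c_ssize_t, ctypes.c_char_p]
decode.restype = ctypes.py_object


def decode_prefix_in_place(buf: bytes, length: int) -> str:
    """Ask for a prefix of a longer buffer (hypothetical wrapper).

    buf[length] is not a null byte here, so the call is expected to fail
    with the ValueError quoted in the message above.
    """
    return decode(buf, length, b"strict")


def decode_prefix_copied(buf: bytes, length: int) -> str:
    """Copy the prefix first, so the new buffer ends exactly at `length`."""
    prefix = buf[:length]
    return decode(prefix, len(prefix), b"strict")


try:
    decode_prefix_in_place(b"import x;", 8)
    in_place_failed = False
except ValueError:
    in_place_failed = True

copied = decode_prefix_copied(b"import x;", 8)
```

The example uses ASCII-only input so that the result does not depend on the locale of the process running it.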
| msg266752 | Author: Martin Panter (martin.panter) * (Python committer) | Date: 2016-05-31 13:51 | |
The attached patch adds a possible test, and fixes the truncation problem I mentioned above.
I tried testing set_completer_delims() with a UTF-8 locale, but I suspect GNU Readline does not support non-ASCII delimiters. I called set_completer_delims("\xF6"), which encodes as C3 B6, but it seems to break any UTF-8 sequence in half at a C3 byte. In other words, it treats the delimiter list as a list of bytes, not code points. So I changed to an ASCII delimiter.
|
|||
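The suspected byte-wise treatment of delimiters can be modelled in a few lines of Python. This is a sketch of the behaviour described in the message, not GNU Readline's actual code; the function name `word_start_bytewise` is hypothetical.

```python
def word_start_bytewise(line: bytes, delims: bytes) -> int:
    """Scan backwards from the end of `line` and stop at the first byte
    that appears in `delims`, treating the delimiters as single bytes."""
    i = len(line)
    while i > 0 and line[i - 1] not in delims:
        i -= 1
    return i


delims = "\xf6".encode("utf-8")   # "ö" encodes as the two bytes C3 B6
line = "t\xebxt".encode("utf-8")  # "tëxt" encodes as 74 C3 AB 78 74

start = word_start_bytewise(line, delims)
fragment = line[start:]           # begins in the middle of the "ë" sequence

try:
    fragment.decode("utf-8")
    fragment_is_valid = True
except UnicodeDecodeError:
    fragment_is_valid = False
```

Although "ö" never occurs in the line, the scan stops at the C3 lead byte of "ë", leaving a fragment that is not valid UTF-8. An ASCII delimiter avoids this because ASCII bytes never appear inside a multi-byte UTF-8 sequence.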
| msg266881 | Author: Martin Panter (martin.panter) * (Python committer) | Date: 2016-06-02 09:46 | |
V3 finishes what I started in v2:
* Changed unchecked PyBytes_AsString() → PyBytes_AS_STRING()
* Testing more functions for non-ASCII characters
I tried to test it with Editline on Linux (using my patch for Issue 13501). There seem to be many quirks with my version of Editline, some of which are not easy to work around:
* Initial CR swallowed when entering non-ASCII
* set_completion_display_matches_hook(), set_pre_input_hook(), set_completer_delims() all do nothing useful
* get_history_item() not updated straight after read_history_file()
I suspect Apple has patched their version of Editline, but if these quirks exist on Apple as well, it might be simplest to skip the test for Editline.
|
|||
| msg267070 | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2016-06-03 07:22 | |
Thank you for your comments and updated patches, and especially for the tests, Martin. I added some comments on Rietveld. |
|||
| msg268230 | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2016-06-11 16:40 | |
Martin? |
|||
| msg268361 | Author: Martin Panter (martin.panter) * (Python committer) | Date: 2016-06-12 10:57 | |
I updated the patch to fix the error handling and memory leak. It also now skips the test if the locale cannot encode the test data. |
|||
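A check of the kind mentioned above, skipping when the locale cannot encode the test data, might look like the following. This is a minimal sketch under stated assumptions: the helper name `locale_can_encode` and the test class are hypothetical, and the committed test may implement the skip differently.

```python
import locale
import unittest


def locale_can_encode(text: str) -> bool:
    """Return True if the preferred locale encoding can represent `text`.

    Hypothetical helper for illustration only.
    """
    encoding = locale.getpreferredencoding(False)
    try:
        text.encode(encoding)
    except (UnicodeEncodeError, LookupError):
        return False
    return True


class NonAsciiExample(unittest.TestCase):
    def test_nonascii(self):
        data = "t\xebxt"
        if not locale_can_encode(data):
            self.skipTest("locale cannot encode the test data")
        # Round-trip through the locale encoding as a stand-in for the
        # real terminal interaction.
        encoding = locale.getpreferredencoding(False)
        self.assertEqual(data.encode(encoding).decode(encoding), data)
```

Skipping instead of failing keeps the test meaningful in UTF-8 and Latin-1 locales while not producing spurious failures in the C locale.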
| msg268374 | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2016-06-12 13:15 | |
LGTM. But the test fails with PYTHONIOENCODING=ascii.
$ PYTHONIOENCODING=ascii ./python -m test test_readline
Run tests sequentially
0:00:00 [1/1] test_readline
test test_readline failed -- Traceback (most recent call last):
  File "/home/serhiy/py/cpython/Lib/test/test_readline.py", line 194, in test_nonascii
    self.assertIn(b"result " + expected + b"\r\n", output)
AssertionError: b"result '[\\xefnserted]|t\\xebxt[after]'\r\n" not found in bytearray(b'^A^B^B^B^B^B^B^B\t\tx\t\r\n[\xc3\xafnserted]|t\xc3\xab[after]\x08\x08\x08\x08\x08\x08\x08text \'t\\xeb\'\r\nline \'[\\xefnserted]|t\\xeb[after]\'\r\nindexes 11 13\r\n\x07text \'t\\xeb\'\r\nline \'[\\xefnserted]|t\\xeb[after]\'\r\nindexes 11 13\r\nsubstitution \'t\\xeb\'\r\nmatches [\'t\\xebnt\', \'t\\xebxt\']\r\nx[after]\x08\x08\x08\x08\x08\x08\x08t[after]\x08\x08\x08\x08\x08\x08\x08\r\nTraceback (most recent call last):\r\n File "<string>", line 39, in <module>\r\nUnicodeDecodeError: \'ascii\' codec can\'t decode byte 0xc3 in position 1: ordinal not in range(128)\r\n')
This is a minor problem, since buildbots are rarely configured with PYTHONIOENCODING=ascii.
|
|||
| msg268499 | Author: Martin Panter (martin.panter) * (Python committer) | Date: 2016-06-14 01:44 | |
I get two other test suite failures if I set PYTHONIOENCODING, so I am not going to bother addressing this in test_readline :)
FAIL: test_forced_io_encoding (test.test_capi.EmbeddingTests)
FAIL: test_7 (test.test_pkg.TestPkg)
|
|||
| msg268506 | Author: Roundup Robot (python-dev) (Python triager) | Date: 2016-06-14 03:03 | |
New changeset 5122b3465a38 by Martin Panter in branch '3.5': Issue #16182: Fix readline begidx, endidx, and use locale encoding https://hg.python.org/cpython/rev/5122b3465a38
New changeset 2ae2657d87a6 by Martin Panter in branch 'default': Issue #16182: Merge readline locale fix from 3.5 https://hg.python.org/cpython/rev/2ae2657d87a6
|
|||
| msg268522 | Author: Martin Panter (martin.panter) * (Python committer) | Date: 2016-06-14 05:50 | |
Failures from AMD64 Snow Leop buildbots:
======================================================================
FAIL: test_nonascii_history (test.test_readline.TestHistoryManipulation)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/buildbot/buildarea/3.5.murray-snowleopard/build/Lib/test/test_readline.py", line 102, in test_nonascii_history
    self.assertEqual(readline.get_history_item(1), "entrée 1")
AssertionError: None != 'entrée 1'
======================================================================
FAIL: test_nonascii (test.test_readline.TestReadline)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/buildbot/buildarea/3.5.murray-snowleopard/build/Lib/test/test_readline.py", line 174, in test_nonascii
    self.assertIn(b"line '[\\xefnserted]|t\\xeb[after]'\r\n", output)
AssertionError: b"line '[\\xefnserted]|t\\xeb[after]'\r\n" not found in bytearray(b"^A^B^B^B^B^B^B^B\t\tx\t\r\n|t\xc3\xab[after]\x08\x08\x08\x08\x08\x08\x08text \'t\\xeb\'\r\nline \'|t\\xeb[after]\'\r\nindexes 1 3\r\n\x07text \'t\\xeb\'\r\nline \'|t\\xeb[after]\'\r\nindexes 1 3\r\n\r\nt\xc3\xabxt t\xc3\xabnt \r\n\r\r\n|t\xc3\xab[after]\r|t\xc3\xabx[after]\r|t\xc3\xabxt[after]\r|t\xc3\xabxt\r\nresult \'|t\\xebxt[after]\'\r\nhistory \'|t\\xebxt[after]\'\r\n")
|
|||
| msg268523 | Author: Roundup Robot (python-dev) (Python triager) | Date: 2016-06-14 05:51 | |
New changeset 0794bbfceec6 by Martin Panter in branch '3.5': Issue #16182: Attempted workarounds for Apple Editline https://hg.python.org/cpython/rev/0794bbfceec6
New changeset a1ca9c0ebc05 by Martin Panter in branch 'default': Issue #16182: Merge test_readline from 3.5 https://hg.python.org/cpython/rev/a1ca9c0ebc05
|
|||
| msg268541 | Author: Martin Panter (martin.panter) * (Python committer) | Date: 2016-06-14 08:00 | |
Also need a fix for missing set_pre_input_hook() on the AIX buildbot:
======================================================================
FAIL: test_nonascii (test.test_readline.TestReadline)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/shager/cpython-buildarea/3.5.edelsohn-aix-ppc64/build/Lib/test/test_readline.py", line 173, in test_nonascii
    self.assertIn(b"text 't\\xeb'\r\n", output)
AssertionError: b"text 't\\xeb'\r\n" not found in bytearray(b'\x01\x02\x02\x02\x02\x02\x02\x02 x \r\n\x1b[?1034hTraceback (most recent call last):\r\n File "<string>", line 18, in <module>\r\nAttributeError: module \'readline\' has no attribute \'set_pre_input_hook\'\r\n')
|
|||
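Because set_pre_input_hook() may be absent from some builds, as the AIX failure above shows, calling code can guard for it. The following is a minimal sketch; the helper name `install_pre_input_hook` is hypothetical, and the module is passed in as a parameter so the guard can be exercised without a real readline build.

```python
def install_pre_input_hook(readline_module, hook) -> bool:
    """Install `hook` only if this readline build supports it.

    Returns True if the hook was installed, False if the function is
    missing from the given module.
    """
    setter = getattr(readline_module, "set_pre_input_hook", None)
    if setter is None:
        return False
    setter(hook)
    return True


# Typical use (assumes the readline module is importable on this platform):
#
#     import readline
#     if not install_pre_input_hook(readline, lambda: None):
#         print("pre-input hooks are not available in this build")
```

A test can use the same check to skip the parts that depend on the hook instead of failing with AttributeError.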
| msg268545 | Author: Roundup Robot (python-dev) (Python triager) | Date: 2016-06-14 09:02 | |
New changeset 005cab4f5629 by Martin Panter in branch '3.5': Issue #16182: set_pre_input_hook() may not exist; document, and update test https://hg.python.org/cpython/rev/005cab4f5629
New changeset c4dd384ee3fa by Martin Panter in branch 'default': Issue #16182: Merge readline update from 3.5 https://hg.python.org/cpython/rev/c4dd384ee3fa
New changeset cff695a0b449 by Martin Panter in branch '2.7': Issue #16182: Backport documentation of set_pre_input_hook() availability https://hg.python.org/cpython/rev/cff695a0b449
|
|||
| msg268549 | Author: Roundup Robot (python-dev) (Python triager) | Date: 2016-06-14 11:37 | |
New changeset ef234a5c5817 by Martin Panter in branch '3.5': Issue #16182: One more check for set_pre_input_hook() https://hg.python.org/cpython/rev/ef234a5c5817
New changeset 241bae60cef8 by Martin Panter in branch 'default': Issue #16182: Merge test_readline from 3.5 https://hg.python.org/cpython/rev/241bae60cef8
|
|||
| msg268598 | Author: Martin Panter (martin.panter) * (Python committer) | Date: 2016-06-15 01:08 | |
I think I got all the bugs fixed here. |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:57:37 | admin | set | github: 60386 |
| 2016-06-15 01:08:50 | martin.panter | set | status: open -> closed; resolution: fixed; messages: + msg268598; stage: patch review -> resolved |
| 2016-06-14 11:37:20 | python-dev | set | messages: + msg268549 |
| 2016-06-14 09:02:09 | python-dev | set | messages: + msg268545 |
| 2016-06-14 08:00:21 | martin.panter | set | messages: + msg268541 |
| 2016-06-14 05:51:15 | python-dev | set | messages: + msg268523 |
| 2016-06-14 05:50:29 | martin.panter | set | messages: + msg268522 |
| 2016-06-14 03:03:43 | python-dev | set | nosy: + python-dev; messages: + msg268506 |
| 2016-06-14 01:44:25 | martin.panter | set | messages: + msg268499 |
| 2016-06-12 13:15:50 | serhiy.storchaka | set | messages: + msg268374 |
| 2016-06-12 10:57:35 | martin.panter | set | files: + readline_locale.v4.patch; messages: + msg268361 |
| 2016-06-11 16:40:54 | serhiy.storchaka | set | messages: + msg268230 |
| 2016-06-03 07:22:37 | serhiy.storchaka | set | messages: + msg267070 |
| 2016-06-02 09:46:05 | martin.panter | set | files: + readline_locale.v3.patch; messages: + msg266881 |
| 2016-05-31 13:53:07 | martin.panter | set | files: + readline_locale.v2.patch |
| 2016-05-31 13:52:46 | martin.panter | set | files: - readline_locale.v2.patch |
| 2016-05-31 13:51:44 | martin.panter | set | files: + readline_locale.v2.patch; messages: + msg266752 |
| 2016-05-28 21:11:56 | BreamoreBoy | set | nosy: - BreamoreBoy |
| 2016-05-25 08:03:01 | martin.panter | link | issue25419 dependencies |
| 2016-05-25 07:43:02 | martin.panter | set | messages: + msg266318 |
| 2016-05-23 06:02:47 | serhiy.storchaka | set | files: + readline_locale.patch; versions: + Python 3.6, - Python 3.4; nosy: + martin.panter; messages: + msg266124 |
| 2015-10-17 14:44:55 | serhiy.storchaka | set | nosy: + serhiy.storchaka |
| 2014-07-15 13:00:38 | BreamoreBoy | set | versions: + Python 3.5, - Python 3.2, Python 3.3; nosy: + BreamoreBoy, ezio.melotti, lemburg, loewis; messages: + msg223105; components: + Unicode |
| 2012-10-09 20:18:27 | pitrou | set | nosy: + vstinner; stage: patch review; versions: + Python 3.4 |
| 2012-10-09 20:09:44 | kunkku | create | |