This issue tracker has been migrated to GitHub
and is currently read-only.
For more information,
see the GitHub FAQs in Python's Developer Guide.
Created on 2014-01-23 15:15 by serhiy.storchaka, last changed 2022-04-11 14:57 by admin. This issue is now closed.
| Files | | |
|---|---|---|
| File name | Uploaded | Description |
| tkinter_null_character.patch | serhiy.storchaka, 2014-01-23 15:15 | |
| tkinter_null_character_2.patch | serhiy.storchaka, 2014-02-03 10:08 | |
| Messages (7) | |||
|---|---|---|---|
| msg208954 | Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) | Date: 2014-01-23 15:15 | |
Tcl/Tk uses a modified UTF-8 encoding to represent strings as C strings (char*). Because C strings are NUL-terminated, the null character is represented as the illegal UTF-8 sequence \xc0\x80. The current Tkinter code is not very aware of this. It has special handling for the "\xc0\x80" string (i.e. a single encoded null character) in one place, but doesn't handle an encoded null character contained in a larger string. As a result, Tkinter may truncate strings containing the null character, or return a wrong result. The proposed patch fixes many issues with the null character (when converting from Tcl to Python strings). NUL is still forbidden in string arguments of many methods. The patch also enhances error handling for variable-related commands. |
|||
| msg210012 | Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) | Date: 2014-02-02 20:44 | |
If there are no objections I'll commit this patch tomorrow. |
|||
| msg210075 | Author: Terry J. Reedy (terry.reedy) (Python committer) | Date: 2014-02-03 02:21 | |
The core of the patch is a wrapper that traps UnicodeDecodeErrors, corrects the strings, and re-decodes. A Python version might look like

    def unicodeFromTclStringAndSize(s, size):
        try:
            return <PyUnicode_DecodeUTF8(s, size, NULL)>
        except UnicodeDecodeError:
            if b'\xc0\x80' in s:
                s = s.replace(b'\xc0\x80', b'\x00')
                return <PyUnicode_DecodeUTF8(s, size, NULL)>
            else:
                raise

This is used in a couple of additional wrappers, and all direct decode calls are replaced with wrappers. New tests are added. Overall, a great idea, and I want to see this patch in 3.4. But, how many of the replacement sites are exercised by the tests?

There are a few changes that seem unrelated to nulls, which might have been left for another patch. Example:

    -#if TCL_UTF_MAX==3
         return PyUnicode_FromKindAndData(
    -        PyUnicode_2BYTE_KIND, Tcl_GetUnicode(value),
    +        sizeof(Tcl_UniChar), Tcl_GetUnicode(value),
             Tcl_GetCharLength(value));
    -#else
    -    return PyUnicode_FromKindAndData(
    -        PyUnicode_4BYTE_KIND, Tcl_GetUnicode(value),
    -        Tcl_GetCharLength(value));
    -#endif

Do you know if this code block is tested? |
|||
| msg210106 | Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) | Date: 2014-02-03 10:08 | |
> But, how many of the replacement sites are exercised by the tests?

I added tests for most of the replacement sites, and the updated patch has even more tests.

split() and splitlist() -- tested. Unfortunately they are tested only with a bytes argument, because these methods reject a unicode string argument containing NUL.

Tcl_Obj.string, Tcl_Obj.typename and Tcl_Obj.__str__() -- not tested. There are no explicit tests for these properties and methods. It seems that Tcl_Obj.typename can't be tested for NUL.

eval(), evalfile() -- tested.

Variable's methods -- tested.

exprstring() -- tested. I added tests for exprstring(), exprdouble(), exprlong(), exprboolean() in the patch.

record() -- not tested. There are no explicit tests for record() and I have no idea how it can be used in Python.

C functions: FromObj() and Tkapp_CallResult() -- implicitly tested in a lot of tests, in particular in test_passing_values and test_user_command. PythonCmd() -- tested in test_user_command.

> There are a few changes that seem unrelated to nulls, which might have been left for another patch.

They just make the code more robust. For example, Tcl can be compiled with TCL_UTF_MAX=6. In this case Python will work correctly most of the time, but can work incorrectly or crash on specific rare data. With the proposed changes it will raise SystemError early. Yes, it is worth a separate issue.

> Do you know if this code block is tested?

It is implicitly tested in many tests which test non-ASCII strings. |
|||
| msg210118 | Author: Terry J. Reedy (terry.reedy) (Python committer) | Date: 2014-02-03 12:14 | |
With the additional tests, it seems reasonable to apply. |
|||
| msg210155 | Author: Roundup Robot (python-dev) (Python triager) | Date: 2014-02-03 19:39 | |
New changeset a6ba6db9edb4 by Serhiy Storchaka in branch '2.7':
Issue #20368: Add tests for Tkinter methods exprstring(), exprdouble(),
http://hg.python.org/cpython/rev/a6ba6db9edb4

New changeset 825c8db8b1e2 by Serhiy Storchaka in branch '3.3':
Issue #20368: Add tests for Tkinter methods exprstring(), exprdouble(),
http://hg.python.org/cpython/rev/825c8db8b1e2

New changeset 28ec384e7dcc by Serhiy Storchaka in branch 'default':
Issue #20368: Add tests for Tkinter methods exprstring(), exprdouble(),
http://hg.python.org/cpython/rev/28ec384e7dcc

New changeset 65c29c07bb31 by Serhiy Storchaka in branch '2.7':
Issue #20368: The null character now correctly passed from Tcl to Python (in
http://hg.python.org/cpython/rev/65c29c07bb31

New changeset 08e3343f01a5 by Serhiy Storchaka in branch '3.3':
Issue #20368: The null character now correctly passed from Tcl to Python.
http://hg.python.org/cpython/rev/08e3343f01a5

New changeset 321b714653e3 by Serhiy Storchaka in branch 'default':
Issue #20368: The null character now correctly passed from Tcl to Python.
http://hg.python.org/cpython/rev/321b714653e3 |
|||
| msg210278 | Author: Roundup Robot (python-dev) (Python triager) | Date: 2014-02-04 23:31 | |
New changeset d83ce3a2d954 by Christian Heimes in branch '3.3':
Issue #20515: Fix NULL pointer dereference introduced by issue #20368
http://hg.python.org/cpython/rev/d83ce3a2d954

New changeset 145032f626d3 by Christian Heimes in branch 'default':
Issue #20515: Fix NULL pointer dereference introduced by issue #20368
http://hg.python.org/cpython/rev/145032f626d3 |
|||
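The modified UTF-8 convention described in msg208954, and the decode-and-retry wrapper outlined in msg210075, can be sketched in pure Python roughly as follows. This is an illustration only, not the code from the patch, which works on `char*` buffers in C. The names `TCL_NUL`, `encode_for_tcl` and `decode_from_tcl` are hypothetical and chosen for this example, and `ctypes` is used only to show how a NUL-terminated view cuts a string short.

```python
import ctypes

# Tcl's two-byte stand-in for U+0000 (an overlong form that strict UTF-8 rejects).
TCL_NUL = b"\xc0\x80"


def encode_for_tcl(text: str) -> bytes:
    """Encode text as UTF-8, writing each NUL as the pair b'\\xc0\\x80'.

    Hypothetical helper: in valid UTF-8 the byte 0x00 only ever encodes
    U+0000, so a plain byte-level replace is safe here.
    """
    return text.encode("utf-8").replace(b"\x00", TCL_NUL)


def decode_from_tcl(data: bytes) -> str:
    """Decode bytes received from Tcl, restoring embedded NULs if needed.

    Hypothetical helper: try a strict decode first; on failure, retry only
    if the encoded NUL is present, otherwise re-raise the original error.
    """
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        if TCL_NUL not in data:
            raise
        return data.replace(TCL_NUL, b"\x00").decode("utf-8")


# A NUL-terminated (C string) view of a buffer stops at the first zero byte,
# which is the truncation problem the issue describes.
truncated = ctypes.create_string_buffer(b"a\x00b").value

# With the modified encoding there is no zero byte to stop at,
# and the original text survives a round trip.
encoded = encode_for_tcl("a\x00b")
round_tripped = decode_from_tcl(encoded)
```

Trying the strict decode first keeps the common case (no embedded NUL) on the fast path, and genuinely malformed input still raises `UnicodeDecodeError` instead of being silently accepted.
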
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:57:57 | admin | set | github: 64567 |
| 2014-02-04 23:31:42 | python-dev | set | messages: + msg210278 |
| 2014-02-03 21:49:03 | serhiy.storchaka | set | status: open -> closed; resolution: fixed; stage: patch review -> resolved |
| 2014-02-03 19:39:30 | python-dev | set | nosy: + python-dev; messages: + msg210155 |
| 2014-02-03 12:14:21 | terry.reedy | set | messages: + msg210118 |
| 2014-02-03 10:08:28 | serhiy.storchaka | set | files: + tkinter_null_character_2.patch; messages: + msg210106 |
| 2014-02-03 02:21:06 | terry.reedy | set | messages: + msg210075 |
| 2014-02-02 20:44:03 | serhiy.storchaka | set | assignee: serhiy.storchaka; messages: + msg210012 |
| 2014-01-23 15:15:02 | serhiy.storchaka | create | |