Message135707
| Author |
kbk |
| Recipients |
Bernt.Røskar.Brenna, BreamoreBoy, jsprunck, kbk, mgstrein, ned.deily, r.david.murray |
| Date |
2011-05-10.14:45:04 |
| SpamBayes Score |
2.0235602e-08 |
| Marked as misclassified |
No |
| Message-id |
<1305038706.35.0.232088682441.issue1028@psf.upfronthosting.co.za> |
| In-reply-to |
| Content |
Tcl/Tk uses modified UTF-8 internally. This includes encoding an embedded null as the two-byte sequence 0xC0 0x80, an overlong encoding of U+0000, so that embedded nulls can coexist with C's null-terminated strings. Java does the same.
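To make the difference concrete, here is a small Python 3 sketch showing that strict UTF-8 encodes a null as the single byte 0x00 and rejects the two-byte form. The names MODIFIED_NUL and strict_decode_fails are made up for this illustration and do not come from Tcl or _tkinter.

```python
# Two-byte (overlong) form that modified UTF-8 uses for U+0000.
# MODIFIED_NUL is a name chosen for this sketch only.
MODIFIED_NUL = b"\xc0\x80"


def strict_decode_fails(data: bytes) -> bool:
    """Return True if Python's strict UTF-8 decoder rejects `data`."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return True
    return False


# Strict UTF-8 encodes U+0000 as the single byte 0x00 ...
strict_nul = "\x00".encode("utf-8")
# ... and refuses the overlong two-byte form used by modified UTF-8.
rejected = strict_decode_fails(MODIFIED_NUL)
```

Because the two-byte form contains no 0x00 byte, a C string holding it is not cut short at the embedded null.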
Note that Ctrl-space and Ctrl-2 are conventional ways to enter a null from the keyboard, which is why a null character is associated with those key combinations.
When Tcl exports Unicode, the output is supposed to be strict UTF-8. Until Tcl 8.5, the %A substitution (the Unicode character corresponding to an event) incorrectly leaked the modified UTF-8 null.
_tkinter.c.2.patch is narrowly focused: if PythonCmd raises a UnicodeDecodeError and the string passed in an argument is 0xC0 0x80, the string is replaced with the Unicode null, U+0000. |
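The fallback described above can be sketched in Python 3 roughly as follows. This is an illustration of the logic only, not the patch itself, which is C code in _tkinter.c. The function name decode_tcl_arg is hypothetical, and treating "the string is 0xC0 0x80" as an exact whole-argument match is an assumption.

```python
# Two-byte form that modified UTF-8 uses for U+0000.
MODIFIED_NUL = b"\xc0\x80"


def decode_tcl_arg(raw: bytes) -> str:
    """Decode one argument as strict UTF-8, with a narrow null fallback.

    Hypothetical helper: only an argument that is exactly the two-byte
    modified null is rewritten; every other decode error still propagates.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        if raw == MODIFIED_NUL:
            return "\x00"  # substitute the real Unicode null
        raise
```

Keeping the check this narrow means other malformed input still raises an error instead of being silently altered.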
|