[Python-Dev] Re: Preventing Unicode-related gotchas (Was: pre-PEP: Unicode Security Considerations for Python)


On 11/13/2021 4:35 PM, [email protected] wrote:
I’ve not been following the thread, but Steve Holden forwarded me the 
To explore the extreme case, I wrote a pyparsing transformer to convert 
identifiers in a body of Python source to mixed font, equivalent to the 
original source after NFKC normalization. Here are hello.py and a 
snippet from unittest/util.py:
def hello():
    try:
        hello_ = "Hello"
        world_ = "World"
        print(f"{hello_}, {world_}!")
    except TypeError as exc:
        print("failed: {}".format(exc))

if __name__ == "__main__":
    hello()
# snippet from unittest/util.py
_PLACEHOLDER_LEN = 12

def _shorten(s, prefixlen, suffixlen):
    skip = len(s) - prefixlen - suffixlen
    if skip > _PLACEHOLDER_LEN:
        s = '%s[%d chars]%s' % (s[:prefixlen], skip, s[len(s) - suffixlen:])
    return s
You should be able to paste these into your local UTF-8-aware editor or 
IDE and execute them as-is.
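
The round trip through NFKC can be checked with a few lines of standard-library Python. This is only a rough sketch, not the pyparsing transformer mentioned above: the helper name to_mixed_font and the three alphabets it cycles through (bold, sans-serif, and monospace from the Mathematical Alphanumeric Symbols block) are assumptions chosen for illustration.

```python
import unicodedata

# (code point of capital A, code point of small a) for three gap-free
# alphabets in the Mathematical Alphanumeric Symbols block.
# Which alphabets to use is an arbitrary choice for this sketch.
_ALPHABETS = [
    (0x1D400, 0x1D41A),  # mathematical bold
    (0x1D5A0, 0x1D5BA),  # mathematical sans-serif
    (0x1D670, 0x1D68A),  # mathematical monospace
]


def to_mixed_font(name: str) -> str:
    """Hypothetical helper: swap each ASCII letter for a styled look-alike."""
    out = []
    for i, ch in enumerate(name):
        upper_a, lower_a = _ALPHABETS[i % len(_ALPHABETS)]
        if "A" <= ch <= "Z":
            out.append(chr(upper_a + ord(ch) - ord("A")))
        elif "a" <= ch <= "z":
            out.append(chr(lower_a + ord(ch) - ord("a")))
        else:
            out.append(ch)  # digits and underscores are left alone
    return "".join(out)


fancy = to_mixed_font("_shorten")
assert fancy != "_shorten"                                   # different code points
assert fancy.isidentifier()                                  # still a legal identifier
assert unicodedata.normalize("NFKC", fancy) == "_shorten"    # folds back to ASCII
```

Alphabets with reserved holes (italic, script, fraktur, double-struck) are left out here so that simple offset arithmetic stays valid.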
Wow. After I pasted the util.py snippet into current IDLE, which on my 
Windows machine* displays the complete text, I get:
>>> dir()
['_PLACEHOLDER_LEN', '__annotations__', '__builtins__', '__doc__', 
'__loader__', '__name__', '__package__', '__spec__', '_shorten']
>>> _shorten('abc', 1, 1)
'abc'
>>> _shorten('abcdefghijklmnopqrw', 2, 2)
'ab[15 chars]rw'
* It does not work at all in Command Prompt, even after supposedly 
changing to a utf-8 codepage with 'chcp 65000'.
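
The dir() listing shows plain ASCII names because the parser converts identifiers to NFKC (PEP 3131) before they are stored. A small sketch of the same effect without IDLE follows; the source string and the variable name in it are made up for illustration.

```python
# The escapes are Mathematical Bold small letters spelling "hello".
source = "\U0001d421\U0001d41e\U0001d425\U0001d425\U0001d428 = 'Hello'\n"

namespace = {}
exec(source, namespace)

# exec() also adds '__builtins__', so filter out the dunder entries.
names = [n for n in namespace if not n.startswith("__")]
print(names)  # ['hello']
```

The name is stored under its normalized ASCII spelling, so later lookups through either spelling in source code reach the same object.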
--
Terry Jan Reedy
Message archived at 
https://mail.python.org/archives/list/[email protected]/message/NSGBCZQ2R6G2HGPAID4ZI35YCRMF7ERC/
