Author mark.dickinson
Recipients mark.dickinson
Date 2008-11-30.18:54:07
Message-id <1228071248.49.0.10094276273.issue4474@psf.upfronthosting.co.za>
In-reply-to
Content
On systems (Linux, OS X) where sizeof(wchar_t) is 4 and wchar_t arrays are 
usually encoded as UTF-32, it looks as though PyUnicode_FromWideChar 
simply truncates each 32-bit character to 16 bits, giving incorrect 
results for characters outside the BMP. I expected it to convert the 
UTF-32 encoding to UTF-16.
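For reference, here is a rough sketch in Python of the UTF-32 to UTF-16 conversion I expected. It is not the C code in PyUnicode_FromWideChar; the helper name utf32_to_utf16_units is made up for illustration, and the cross-check against the utf-16-le codec is only there to confirm the arithmetic.

```python
import struct


def utf32_to_utf16_units(code_points):
    """Convert Unicode code points (UTF-32 values) to UTF-16 code units.

    Hypothetical helper for illustration only; not part of CPython's API.
    """
    units = []
    for cp in code_points:
        if cp < 0x10000:
            # BMP character: a single 16-bit code unit.
            units.append(cp)
        else:
            # Non-BMP character: split into a high/low surrogate pair.
            offset = cp - 0x10000
            units.append(0xD800 + (offset >> 10))
            units.append(0xDC00 + (offset & 0x3FF))
    return units


units = utf32_to_utf16_units([65901])
print([hex(u) for u in units])  # ['0xd800', '0xdd6d']

# Cross-check against Python's own UTF-16 encoder.
encoded = chr(65901).encode("utf-16-le")
codec_units = list(struct.unpack("<%dH" % (len(encoded) // 2), encoded))
print(units == codec_units)
```

A non-BMP character therefore needs two 16-bit units, which a plain per-character truncation cannot produce.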
Note that PyUnicode_FromWideChar is used to process command-line 
arguments, so strange things can happen when passing filenames with 
non-BMP characters to a Python script.
Here's an OS X 10.5 Terminal session (current directory is the root of the 
py3k tree).
dickinsm$ cat test𐅭.py
from sys import argv
print("My arguments are: ",argv)
dickinsm$ ./python.exe test𐅭.py
My arguments are: ['testŭ.py']
dickinsm$ ./python.exe Lib/tabnanny.py test𐅭.py
'testŭ.py': I/O Error: [Errno 2] No such file or directory: 'testŭ.py'
(In case the character after 'test' and before '.py' isn't showing up 
correctly, it's chr(65901), 'GREEK ACROPHONIC TROEZENIAN FIVE HUNDRED'.)
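The mangled filename in the session is consistent with keeping only the low 16 bits of each character. The snippet below is a small simulation of that, assuming a Python where strings hold full code points; truncate_to_16_bits is a hypothetical name, not a real CPython function.

```python
def truncate_to_16_bits(text):
    """Simulate dropping the upper 16 bits of every 32-bit character.

    Hypothetical helper that mimics the suspected behaviour; it is not
    the actual implementation of PyUnicode_FromWideChar.
    """
    return "".join(chr(ord(ch) & 0xFFFF) for ch in text)


name = "test" + chr(65901) + ".py"
mangled = truncate_to_16_bits(name)

print(hex(65901), "->", hex(65901 & 0xFFFF))  # 0x1016d -> 0x16d
print(mangled)  # testŭ.py
```

0x16d is 'ŭ' (LATIN SMALL LETTER U WITH BREVE), which matches the filename reported in the error above.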
History
Date                 User            Action  Args
2008-11-30 18:54:08  mark.dickinson  set     recipients: + mark.dickinson
2008-11-30 18:54:08  mark.dickinson  set     messageid: <1228071248.49.0.10094276273.issue4474@psf.upfronthosting.co.za>
2008-11-30 18:54:07  mark.dickinson  link    issue4474 messages
2008-11-30 18:54:07  mark.dickinson  create