

Author chris.jerdonek
Recipients Arfrever, Ariel.Ben-Yehuda, berker.peksag, chris.jerdonek, eric.smith, ezio.melotti, loewis, serhiy.storchaka
Date 2012-09-17.02:02:20
Message-id <1347847342.69.0.862599556531.issue15276@psf.upfronthosting.co.za>
In-reply-to
Content
I did some analysis of this issue.
For starters, I could not reproduce this on Mac OS X 10.7.4. I iterated through all available locales, and the separator was ASCII in all cases.
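A check of this kind could be sketched roughly as follows. This is only an illustration written in Python 3 syntax, not the script actually used; the helper name non_ascii_separators is hypothetical, and how the candidate locale names are gathered (for example from locale.locale_alias.values()) is an assumption.

```python
import locale

def non_ascii_separators(names):
    """Return {locale name: thousands separator} for separators that are not pure ASCII."""
    found = {}
    saved = locale.setlocale(locale.LC_NUMERIC)  # query the current setting
    try:
        for name in names:
            try:
                locale.setlocale(locale.LC_NUMERIC, name)
                sep = locale.localeconv()['thousands_sep']
            except (locale.Error, UnicodeError):
                continue  # locale not installed, or separator not decodable
            if any(ord(ch) > 127 for ch in sep):
                found[name] = sep
    finally:
        # Always restore the original numeric locale.
        locale.setlocale(locale.LC_NUMERIC, saved)
    return found

# The "C" locale has an empty thousands separator, so nothing is reported.
print(non_ascii_separators(["C"]))
```

An empty result for every installed locale would match the observation above that the separator was ASCII in all cases.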
Instead, I was able to fake the issue by changing "," to "\xa0" in the following line:
http://hg.python.org/cpython/file/820032281f49/Objects/stringlib/formatter.h#l651
and then reproduce with:
>>> u'{:,}'.format(10000)
 ..
UnicodeDecodeError: 'ascii' codec can't decode byte 0xa0 in position 2: ordinal not in range(128)
>>> format(10000, u',')
 ..
UnicodeDecodeError: 'ascii' codec can't decode byte 0xa0 in position 2: ordinal not in range(128)
However, note this difference (see also issue 15952):
>>> (10000).__format__(u',')
'10\xa0000'
The issue seems to be that PyObject_Format() in Objects/abstract.c (which, unlike int__format__() in Objects/intobject.c, does respect whether the format string is unicode or not) calls int__format__() to get the formatted string as a byte string. It then passes this to PyObject_Unicode() to convert to unicode. This in turn calls PyUnicode_FromEncodedObject() with a NULL encoding, which causes that code to use PyUnicode_GetDefaultEncoding() for the encoding (i.e. sys.getdefaultencoding()). Since that default is ASCII here, the 0xa0 byte cannot be decoded, which produces the UnicodeDecodeError shown above.
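The decoding step described above can be imitated in isolation. The following is a minimal sketch in Python 3 syntax, where the decode has to be spelled out explicitly; the helper to_unicode is hypothetical and only stands in for the implicit conversion, and latin-1 is an assumed stand-in for a locale encoding that maps 0xa0.

```python
# Byte string as returned by the faked formatter, with 0xa0 as the separator.
formatted = b'10\xa0000'

def to_unicode(data, encoding='ascii'):
    """Decode a byte string, defaulting to ASCII like the default encoding here."""
    return data.decode(encoding)

try:
    to_unicode(formatted)
except UnicodeDecodeError as exc:
    message = str(exc)
    print(message)

# With an encoding that covers 0xa0 (latin-1 is an assumption), decoding works.
print(repr(to_unicode(formatted, 'latin-1')))
```

The failure comes from the choice of codec, not from the format specification itself, which is why calling __format__() directly does not raise.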
The right way to fix this seems to be to make int__format__() return unicode as appropriate, which may mean modifying formatter.h's format_int_or_long_internal() to return unicode, as well as taking the locale encoding into account when reading the locale's thousands separator.
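The direction of that fix could be sketched in Python as below. This is not the C change itself: the function group_thousands and its parameters are hypothetical, the fixed three-digit grouping is a simplifying assumption, and the point is only that the separator is decoded with the locale encoding before the result is built as text.

```python
def group_thousands(value, raw_sep, locale_encoding):
    """Format an int with a thousands separator and return text, not bytes."""
    # Decode the separator with the locale's encoding rather than ASCII.
    sep = raw_sep.decode(locale_encoding)
    digits = str(abs(value))
    groups = []
    while digits:
        groups.append(digits[-3:])  # peel off the last three digits
        digits = digits[:-3]
    sign = '-' if value < 0 else ''
    return sign + sep.join(reversed(groups))

print(repr(group_thousands(10000, b'\xa0', 'latin-1')))
print(group_thousands(-1234567, b',', 'ascii'))
```

Because the result is already text, no later conversion through the default encoding is needed.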
History
Date                 User            Action  Args
2012-09-17 02:02:22  chris.jerdonek  set     recipients: + chris.jerdonek, loewis, eric.smith, ezio.melotti, Arfrever, berker.peksag, serhiy.storchaka, Ariel.Ben-Yehuda
2012-09-17 02:02:22  chris.jerdonek  set     messageid: <1347847342.69.0.862599556531.issue15276@psf.upfronthosting.co.za>
2012-09-17 02:02:22  chris.jerdonek  link    issue15276 messages
2012-09-17 02:02:20  chris.jerdonek  create
