This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer Guide.

classification
Title: format method: c presentation type broken in 2.7
Type: behavior
Stage: resolved
Components: Interpreter Core
Versions: Python 2.7
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: eric.smith
Nosy List: BreamoreBoy, benjamin.peterson, doerwalter, eric.smith, ezio.melotti, francismb, jwilk, python-dev, serhiy.storchaka, terry.reedy, vstinner
Priority: high
Keywords: patch

Created on 2009-11-05 16:22 by doerwalter, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name                Uploaded
issue7267.patch          francismb, 2013-03-23 21:28
int_format_c.patch       vstinner, 2013-07-02 00:34
int_format_c_warn.patch  serhiy.storchaka, 2015-05-13 09:02
Messages (23)
msg94935 - Author: Walter Dörwald (doerwalter) * (Python committer) Date: 2009-11-05 16:22
The c presentation type in the new format method from PEP 3101 seems to
be broken:
Python 2.6.4 (r264:75706, Oct 27 2009, 15:18:04) 
[GCC 4.2.1 (Apple Inc. build 5646)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> u'{0:c}'.format(256)
u'\x00'
The PEP states:
'c' - Character. Converts the integer to the corresponding Unicode
character before printing.
So I would have expected this to return u'\u0100' instead of u'\x00'.
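For comparison, here is a minimal sketch of the behavior the PEP describes. It assumes a Python 3 interpreter, where str is always Unicode; it does not reproduce the 2.x result reported above.

```python
# Python 3: the 'c' presentation type maps an integer to the
# corresponding Unicode character, as PEP 3101 describes.
result = '{0:c}'.format(256)

# ascii() escapes non-ASCII characters so the output is unambiguous.
print(ascii(result))
print(result == '\u0100')
print(result == chr(256))
```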
msg94936 - Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2009-11-05 16:30
I'll look at it.
msg94969 - Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2009-11-06 14:09
This is a bug in the way ints and longs are formatted. They always do
the formatting as str, then convert to unicode. This works everywhere
except with the 'c' presentation type. I'm still trying to decide how
best to handle this.
msg94972 - Author: Walter Dörwald (doerwalter) * (Python committer) Date: 2009-11-06 14:52
I'd say that a value >= 128 should generate a Unicode string (as the PEP
explicitly states that the value is a Unicode code point and not a byte
value).
However str.format() doesn't seem to support mixing str and unicode anyway:
>>> '{0}'.format(u'\u3042')
UnicodeEncodeError: 'ascii' codec can't encode character u'\u3042' in
position 0: ordinal not in range(128)
so str.format() might raise an OverflowError for values >= 128 (or >= 256?)
msg95113 - Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2009-11-10 13:20
> so str.format() might raise an OverflowError for values >= 128 (or >= 256?)
Maybe, but the issue you reported is in unicode.format() (not
str.format()), and I think that should be fixed. I'm trying to think of
how best to address it.
As for the second issue you raise (which I think is that str.format()
can't take a unicode argument), would you mind opening a separate issue
for this and assigning it to me? Thanks.
msg95115 - Author: Walter Dörwald (doerwalter) * (Python committer) Date: 2009-11-10 13:58
Done: issue 7300.
msg98107 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-01-21 11:38
See also issue #7649.
msg98173 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-01-23 00:46
('%c' % 255) == chr(255) == '\xff'
'%c' % 256 raises an "OverflowError: unsigned byte integer is greater than maximum" and chr(256) raises a "ValueError: chr() arg not in range(256)". I prefer the second error ;-)
str.format() should follow the same behaviour.
str is a byte string: it can be used to create a network packet or encode data into a byte stream. '%c' is useful for that, and str.format() should keep this nice feature.
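The byte-oriented use case described here has a rough Python 3 counterpart in the bytes type. The sketch below assumes Python 3.5 or later, where bytes supports %-formatting; the helper try_format is hypothetical and exists only to show which exception type is raised.

```python
# Python 3.5+: bytes supports %-formatting, and %c accepts an int in range(256).
packet = b'%c' % 255
assert packet == b'\xff' == bytes([255])


def try_format(value):
    """Return the formatted byte, or the name of the exception raised."""
    try:
        return b'%c' % value
    except (OverflowError, ValueError) as exc:
        return type(exc).__name__


print(try_format(255))
print(try_format(256))
```

Values outside range(256) are rejected rather than silently truncated, which is the behavior the message argues str.format() should share.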
msg100772 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-03-10 00:25
The u'{0:c}'.format(256) formatter is implemented in Objects/stringlib/formatter.h and this C template is instantiated in... Python/formatter_string.c (and not Python/formatter_unicode.c). Extract of formatter_unicode.c comment:
/* don't define FORMAT_LONG, FORMAT_FLOAT, and FORMAT_COMPLEX, since
 we can live with only the string versions of those. The builtin
 format() will convert them to unicode. */
format_int_or_long_internal() is instantiated (only once) with STRINGLIB_CHAR=char and so "numeric_char = (STRINGLIB_CHAR)x;" becomes "numeric_char = (char)x;" whereas x is a long in [0; 0x10ffff] (or [0; 0xffff] depending on Python unicode build option).
I think that the 'c' format type should have its own function because format_int_or_long_internal() gets locale info, computes the number of digits, and does other things not related to just creating one character from its code (chr(code) / unichr(code)). But it's just a remark, it doesn't fix this issue.
To fix this issue, I think that the FORMAT_LONG and related templates should be instantiated twice (str & unicode).
msg185089 - Author: Francis MB (francismb) * Date: 2013-03-23 20:52
In 2.7.3:
>>> u'{0:c}'.format(127)
u'\x7f'
>>> u'{0:c}'.format(128)
Traceback (most recent call last):
 File "<pyshell#6>", line 1, in <module>
 u'{0:c}'.format(128)
UnicodeDecodeError: 'ascii' codec can't decode byte 0x80 in position 0: ordinal not in range(128)
>>> u'{0:c}'.format(255)
Traceback (most recent call last):
 File "<pyshell#7>", line 1, in <module>
 u'{0:c}'.format(255)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xff in position 0: ordinal not in range(128)
>>> u'{0:c}'.format(256)
u'\x00'
>>> u'{0:c}'.format(257)
u'\x01'
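The results above are consistent with a two-step model: the code point is narrowed to a single byte (as with the "(char)x" cast described earlier in the thread), and the resulting byte string is then decoded to unicode as ASCII. The following Python 3 sketch emulates that model; the function name is hypothetical and this is not the actual C implementation.

```python
def emulate_273_unicode_format_c(value):
    """Model of the observed 2.7.3 behaviour, not the real C code.

    Step 1: narrow the code point to one byte, like ``(char)x`` in C.
    Step 2: decode the one-byte string to unicode using ASCII.
    """
    narrowed = bytes([value & 0xFF])
    return narrowed.decode('ascii')


for value in (127, 128, 255, 256, 257):
    try:
        print(value, ascii(emulate_273_unicode_format_c(value)))
    except UnicodeDecodeError as exc:
        print(value, 'UnicodeDecodeError:', exc.reason)
```

Under this model, 128 and 255 fail in the ASCII decode step, while 256 and 257 wrap around to 0 and 1 before decoding.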
msg185092 - Author: Francis MB (francismb) * Date: 2013-03-23 21:28
Adding a test that triggers the issue; let me know if it is enough.
msg192169 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2013-07-02 00:34
u'{0:c}'.format(256) calls 256.__format__('c') which returns a str (bytes) object, so we must reject values outside range(0, 256). The real fix for this issue is to upgrade to Python 3.
The attached patch works around the initial issue (u'{0:c}'.format(256)) by raising OverflowError on int.__format__('c') if the value is not in range(0, 256).
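The range check described in this message can be modeled as follows. This is a Python 3 sketch of the idea only: the function name and the error message text are hypothetical and are not taken from the attached patch.

```python
def format_c_as_byte(value):
    """Hypothetical model of the workaround: accept only range(0, 256)."""
    if not 0 <= value < 256:
        # Illustrative message; the wording in the real patch may differ.
        raise OverflowError('%c arg not in range(256)')
    return bytes([value])


print(format_c_as_byte(255))
try:
    format_c_as_byte(256)
except OverflowError as exc:
    print('OverflowError:', exc)
```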
msg217674 - Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2014-05-01 01:18
If the purpose of backporting .format was/is to help people writing forward-looking code, or now, to write 2&3 code, then it should work like .format in 3.x, at least when the format string is unicode.
msg242726 - Author: Mark Lawrence (BreamoreBoy) * Date: 2015-05-07 19:07
What harm, if any, can be done by applying the patch with Victor's workaround?
msg242753 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-05-08 10:33
Maybe just emit a warning in -3 mode?
msg243059 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-05-13 09:02
Here is a modification of Victor's patch that just emits a Py3k warning.
Both ways, with OverflowError and Py3k DeprecationWarning, are fine with me. What would you say about this, Benjamin?
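To make the alternative concrete, here is a Python 3 sketch of the warning-based variant: out-of-range values still produce the truncated byte, but a DeprecationWarning is emitted. The function name and warning text are hypothetical and are not taken from int_format_c_warn.patch.

```python
import warnings


def format_c_with_warning(value):
    """Hypothetical model: warn on out-of-range input, keep the old result."""
    if not 0 <= value < 256:
        # Illustrative message; the wording in the real patch may differ.
        warnings.warn("'c' format argument not in range(256)",
                      DeprecationWarning, stacklevel=2)
    return bytes([value & 0xFF])


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    truncated = format_c_with_warning(256)

print(truncated, [w.category.__name__ for w in caught])
```

The trade-off discussed in the thread is visible here: the warning only helps callers who enable it, whereas an OverflowError is seen by everyone.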
msg254373 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-11-09 08:59
Ping.
msg254376 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-11-09 10:14
> Both ways, with OverflowError and Py3k DeprecationWarning, are good to me. What would you say about this Benjamin?
I prefer an OverflowError. I don't like having to enable a flag to fix a bug :-(
According to the issue title, it's really a bug: "format method: c presentation type *broken* in 2.7".
Note: The unit test may check the error message; currently the error message is irrelevant (it mentions unicode whereas bytes (str type) are used).
>>> format(-1, "c")
OverflowError: %c arg not in range(0x110000) (wide Python build)
msg254378 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-11-09 10:51
Then feel free to commit your patch please. It LGTM.
msg254379 - Author: Roundup Robot (python-dev) (Python triager) Date: 2015-11-09 11:22
New changeset 2f2c52c9ff38 by Victor Stinner in branch '2.7':
Issue #7267: format(int, 'c') now raises OverflowError when the argument is not
https://hg.python.org/cpython/rev/2f2c52c9ff38 
msg254380 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-11-09 11:23
> Then feel free to commit your patch please. It LGTM.
Thanks for the review ;-)
@Walter: Sorry for the late fix (6 years later!).
msg254383 - Author: Walter Dörwald (doerwalter) * (Python committer) Date: 2015-11-09 12:38
Don't worry, I switched to Python 3 in 2012, where this isn't a problem. ;)
msg254391 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-11-09 15:29
Walter Dörwald added the comment:
> Don't worry, I've switched to using Python 3 in 2012, where this isn't a problem. ;)
Wow, cool! We still have 1 or 2 customers stuck with Python 2, haha.
History
Date User Action Args
2022-04-11 14:56:54 admin set github: 51516
2015-11-09 19:16:53 berker.peksag set stage: commit review -> resolved
2015-11-09 15:29:28 vstinner set messages: + msg254391
2015-11-09 12:38:45 doerwalter set messages: + msg254383
2015-11-09 11:23:06 vstinner set status: open -> closed; resolution: fixed; messages: + msg254380
2015-11-09 11:22:22 python-dev set nosy: + python-dev; messages: + msg254379
2015-11-09 10:51:06 serhiy.storchaka set messages: + msg254378; stage: patch review -> commit review
2015-11-09 10:14:57 vstinner set messages: + msg254376
2015-11-09 08:59:00 serhiy.storchaka set messages: + msg254373
2015-06-10 18:54:25 jwilk set nosy: + jwilk
2015-05-19 09:19:01 serhiy.storchaka set nosy: + benjamin.peterson
2015-05-13 09:02:11 serhiy.storchaka set files: + int_format_c_warn.patch; messages: + msg243059
2015-05-08 10:33:20 serhiy.storchaka set nosy: + serhiy.storchaka; messages: + msg242753
2015-05-07 19:07:48 BreamoreBoy set nosy: + BreamoreBoy; messages: + msg242726
2014-05-01 01:35:46 terry.reedy set title: format method: c presentation type broken -> format method: c presentation type broken in 2.7
2014-05-01 01:18:37 terry.reedy set nosy: + terry.reedy; messages: + msg217674; stage: needs patch -> patch review
2013-07-02 00:34:30 vstinner set files: + int_format_c.patch; messages: + msg192169
2013-06-23 14:57:49 terry.reedy set stage: test needed -> needs patch
2013-03-23 21:28:00 francismb set files: + issue7267.patch; keywords: + patch; messages: + msg185092
2013-03-23 20:52:57 francismb set nosy: + francismb; messages: + msg185089
2011-11-19 14:03:05 ezio.melotti set versions: - Python 2.6
2010-03-10 00:25:42 vstinner set messages: + msg100772
2010-02-24 18:25:05 eric.smith set priority: normal -> high
2010-02-24 18:04:15 eric.smith set priority: normal
2010-01-23 00:46:34 vstinner set messages: + msg98173
2010-01-21 11:38:47 vstinner set nosy: + vstinner; messages: + msg98107
2010-01-14 00:11:48 ezio.melotti set nosy: + ezio.melotti; stage: test needed
2009-11-10 13:58:23 doerwalter set messages: + msg95115
2009-11-10 13:20:17 eric.smith set messages: + msg95113
2009-11-06 14:52:30 doerwalter set messages: + msg94972
2009-11-06 14:09:20 eric.smith set messages: + msg94969; versions: + Python 2.7
2009-11-05 16:30:22 eric.smith set assignee: eric.smith; messages: + msg94936; nosy: + eric.smith
2009-11-05 16:22:47 doerwalter create