Created on 2009-11-05 16:22 by doerwalter, last changed 2022-04-11 14:56 by admin. This issue is now closed.
| Files | |
|---|---|
| File name | Uploaded |
| issue7267.patch | francismb, 2013-03-23 21:28 |
| int_format_c.patch | vstinner, 2013-07-02 00:34 |
| int_format_c_warn.patch | serhiy.storchaka, 2015-05-13 09:02 |
| Messages (23) | |||
|---|---|---|---|
| msg94935 - (view) | Author: Walter Dörwald (doerwalter) * (Python committer) | Date: 2009-11-05 16:22 | |
The c presentation type in the new format method from PEP 3101 seems to be broken:
Python 2.6.4 (r264:75706, Oct 27 2009, 15:18:04)
[GCC 4.2.1 (Apple Inc. build 5646)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> u'{0:c}'.format(256)
u'\x00'
The PEP states:
'c' - Character. Converts the integer to the corresponding Unicode character before printing,
so I would have expected this to return u'\u0100' instead of u'\x00'. |
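For comparison, here is a minimal sketch of the behaviour the PEP wording describes. It assumes Python 3, where `str` is a Unicode string; the helper name `format_code_point` is made up for illustration and is not part of any API.

```python
# Assumes Python 3, where str is Unicode; this does NOT reproduce the 2.x bug.
def format_code_point(value):
    """Format an integer with the 'c' presentation type (hypothetical helper)."""
    return '{0:c}'.format(value)


# The result is the character whose code point is the integer,
# i.e. the same character that chr() returns.
for code in (65, 256, 0x3042):
    print(hex(code), ascii(format_code_point(code)), format_code_point(code) == chr(code))
```

On Python 3 the `'c'` type and `chr()` agree for every valid code point, which is the result the report expected from `unicode.format()`.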
|||
| msg94936 - (view) | Author: Eric V. Smith (eric.smith) * (Python committer) | Date: 2009-11-05 16:30 | |
I'll look at it. |
|||
| msg94969 - (view) | Author: Eric V. Smith (eric.smith) * (Python committer) | Date: 2009-11-06 14:09 | |
This is a bug in the way ints and longs are formatted. They always do the formatting as str, then convert to unicode. This works everywhere except with the 'c' presentation type. I'm still trying to decide how best to handle this. |
|||
| msg94972 - (view) | Author: Walter Dörwald (doerwalter) * (Python committer) | Date: 2009-11-06 14:52 | |
I'd say that a value >= 128 should generate a Unicode string (as the PEP
explicitly states that the value is a Unicode code point and not a byte
value).
However str.format() doesn't seem to support mixing str and unicode anyway:
>>> '{0}'.format(u'\u3042')
UnicodeEncodeError: 'ascii' codec can't encode character u'\u3042' in
position 0: ordinal not in range(128)
so str.format() might raise an OverflowError for values >= 128 (or >= 256?)
|
|||
| msg95113 - (view) | Author: Eric V. Smith (eric.smith) * (Python committer) | Date: 2009-11-10 13:20 | |
> so str.format() might raise an OverflowError for values >= 128 (or >= 256?)
Maybe, but the issue you reported is in unicode.format() (not str.format()), and I think that should be fixed. I'm trying to think of how best to address it.
As for the second issue you raise (which I think is that str.format() can't take a unicode argument), would you mind opening a separate issue for this and assigning it to me? Thanks. |
|||
| msg95115 - (view) | Author: Walter Dörwald (doerwalter) * (Python committer) | Date: 2009-11-10 13:58 | |
Done: issue 7300. |
|||
| msg98107 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2010-01-21 11:38 | |
See also issue #7649. |
|||
| msg98173 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2010-01-23 00:46 | |
('%c' % 255) == chr(255) == '\xff'
'%c' % 256 raises "OverflowError: unsigned byte integer is greater than maximum" and chr(256) raises "ValueError: chr() arg not in range(256)". I prefer the second error ;-)
str.format() should follow the same behaviour.
str is a byte string: it can be used to create a network packet or encode data into a byte stream. '%c' is useful for that, and str.format() should keep this nice feature.
|
|||
| msg100772 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2010-03-10 00:25 | |
The u'{0:c}'.format(256) formatter is implemented in Objects/stringlib/formatter.h, and this C template is instantiated in... Python/formatter_string.c (and not Python/formatter_unicode.c). Extract from a comment in formatter_unicode.c:
/* don't define FORMAT_LONG, FORMAT_FLOAT, and FORMAT_COMPLEX, since
we can live with only the string versions of those. The builtin
format() will convert them to unicode. */
format_int_or_long_internal() is instantiated (only once) with STRINGLIB_CHAR=char, so "numeric_char = (STRINGLIB_CHAR)x;" becomes "numeric_char = (char)x;" even though x is a long in [0; 0x10ffff] (or [0; 0xffff], depending on the Python unicode build option).
I think that the 'c' format type should have its own function, because format_int_or_long_internal() gets locale info, computes the number of digits, and does other things unrelated to just creating one character from its code (chr(code) / unichr(code)). But that's just a remark; it doesn't fix this issue.
To fix this issue, I think that the FORMAT_LONG and related templates should be instantiated twice (str & unicode).
|
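The narrowing cast described in this message can be imitated in Python to see which byte survives. This is only a sketch: it assumes an 8-bit `char` and looks at the resulting bit pattern as an unsigned value, and `cast_to_c_char` is a hypothetical name, not a CPython function.

```python
def cast_to_c_char(value):
    """Imitate the C cast (char)x by keeping only the low 8 bits.

    Assumption: char is 8 bits wide; the result is shown as an unsigned
    byte value regardless of whether char is signed on the platform.
    """
    return value & 0xFF


# Code points above 255 lose their high bits, so 256 collapses to 0.
for code in (255, 256, 257, 0x3042):
    print(hex(code), '->', hex(cast_to_c_char(code)))
```

Under that assumption, 256 and 257 map to the bytes 0x00 and 0x01, which matches the u'\x00' result in the original report.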
|||
| msg185089 - (view) | Author: Francis MB (francismb) * | Date: 2013-03-23 20:52 | |
In 2.7.3:
>>> u'{0:c}'.format(127)
u'\x7f'
>>> u'{0:c}'.format(128)
Traceback (most recent call last):
File "<pyshell#6>", line 1, in <module>
u'{0:c}'.format(128)
UnicodeDecodeError: 'ascii' codec can't decode byte 0x80 in position 0: ordinal not in range(128)
>>> u'{0:c}'.format(255)
Traceback (most recent call last):
File "<pyshell#7>", line 1, in <module>
u'{0:c}'.format(255)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xff in position 0: ordinal not in range(128)
>>> u'{0:c}'.format(256)
u'\x00'
>>> u'{0:c}'.format(257)
u'\x01'
|
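The three kinds of result in this session (a correct character, a UnicodeDecodeError, a wrapped-around character) are consistent with a two-step pipeline: format into a single byte, then convert that byte to unicode with the ASCII codec. The sketch below imitates that pipeline on Python 3; `emulate_py2_unicode_format_c` is a hypothetical name and the function is an emulation, not the interpreter's actual code path.

```python
def emulate_py2_unicode_format_c(value):
    """Emulate the pre-fix u'{0:c}'.format(value) (hypothetical helper)."""
    # Step 1: the integer is formatted as a one-byte str, truncated to 8 bits.
    raw = bytes([value & 0xFF])
    # Step 2: the byte string is converted to unicode with the ASCII codec.
    return raw.decode('ascii')


for value in (127, 128, 255, 256, 257):
    try:
        print(value, ascii(emulate_py2_unicode_format_c(value)))
    except UnicodeDecodeError as exc:
        print(value, 'UnicodeDecodeError:', exc.reason)
```

Values up to 127 decode cleanly, 128 to 255 produce a non-ASCII byte that fails to decode, and larger values wrap around before decoding.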
|||
| msg185092 - (view) | Author: Francis MB (francismb) * | Date: 2013-03-23 21:28 | |
Adding a test that triggers the issue; let me know if it is enough. |
|||
| msg192169 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2013-07-02 00:34 | |
u'{0:c}'.format(256) calls 256.__format__('c'), which returns a str (bytes) object, so we must reject values outside range(0, 256). The real fix for this issue is to upgrade to Python 3.
The attached patch works around the initial issue (u'{0:c}'.format(256)) by raising OverflowError in int.__format__('c') if the value is not in range(0, 256).
|
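The workaround amounts to a range check before the single byte is produced. Below is a rough Python 3 sketch of that idea, not the patch itself: `format_byte_char` is a hypothetical stand-in for the patched int.__format__('c'), and the error text is illustrative rather than copied from the patch.

```python
def format_byte_char(value):
    """Return the one-byte string for value, rejecting anything outside range(0, 256).

    Hypothetical stand-in for the patched behaviour; the message text is
    illustrative only.
    """
    if not 0 <= value < 256:
        raise OverflowError('%c arg not in range(256)')
    return bytes([value])


print(format_byte_char(255))
try:
    format_byte_char(256)
except OverflowError as exc:
    print('OverflowError:', exc)
```

Raising instead of truncating means an out-of-range value fails loudly rather than silently yielding the wrong character.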
|||
| msg217674 - (view) | Author: Terry J. Reedy (terry.reedy) * (Python committer) | Date: 2014-05-01 01:18 | |
If the purpose of backporting .format was/is to help people writing forward-looking code, or now, to write 2&3 code, then it should work like .format in 3.x, at least when the format string is unicode. |
|||
| msg242726 - (view) | Author: Mark Lawrence (BreamoreBoy) * | Date: 2015-05-07 19:07 | |
What harm, if any, can be done by applying the patch with Victor's workaround? |
|||
| msg242753 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2015-05-08 10:33 | |
Maybe just emit a warning in -3 mode? |
|||
| msg243059 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2015-05-13 09:02 | |
Here is a modification of Victor's patch that just emits a Py3k warning. Both ways, with OverflowError and Py3k DeprecationWarning, are good to me. What would you say about this, Benjamin? |
|||
| msg254373 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2015-11-09 08:59 | |
Ping. |
|||
| msg254376 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2015-11-09 10:14 | |
> Both ways, with OverflowError and Py3k DeprecationWarning, are good to me. What would you say about this Benjamin?
I prefer an OverflowError. I don't like having to enable a flag to fix a bug :-( According to the issue title, it's really a bug: "format method: c presentation type *broken* in 2.7".
Note: The unit test may check the error message; currently the error message is irrelevant (it mentions unicode whereas bytes (str type) are used).
>>> format(-1, "c")
OverflowError: %c arg not in range(0x110000)
(wide Python build) |
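A unit test for out-of-range values could look roughly like the sketch below. It is written against Python 3, where the valid range for `'c'` is range(0x110000) rather than range(256), and it checks only the exception type, not the message. The class name `FormatCharRangeTest` is invented for this example and is not taken from the attached patches.

```python
import unittest


class FormatCharRangeTest(unittest.TestCase):
    """Hypothetical test; assumes Python 3 semantics for format(int, 'c')."""

    def test_out_of_range_raises(self):
        # Values below 0 or above the largest code point must be rejected.
        for value in (-1, 0x110000):
            with self.assertRaises(OverflowError):
                format(value, 'c')

    def test_in_range_returns_character(self):
        self.assertEqual(format(0x100, 'c'), '\u0100')


# Run the tests explicitly so the snippet does not call sys.exit().
suite = unittest.defaultTestLoader.loadTestsFromTestCase(FormatCharRangeTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Asserting on the message as well would need the exact wording of the target branch, so it is left out here.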
|||
| msg254378 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2015-11-09 10:51 | |
Then feel free to commit your patch please. It LGTM. |
|||
| msg254379 - (view) | Author: Roundup Robot (python-dev) (Python triager) | Date: 2015-11-09 11:22 | |
New changeset 2f2c52c9ff38 by Victor Stinner in branch '2.7': Issue #7267: format(int, 'c') now raises OverflowError when the argument is not https://hg.python.org/cpython/rev/2f2c52c9ff38 |
|||
| msg254380 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2015-11-09 11:23 | |
> Then feel free to commit your patch please. It LGTM.
Thanks for the review ;-)
@Walter: Sorry for the late fix (6 years later!). |
|||
| msg254383 - (view) | Author: Walter Dörwald (doerwalter) * (Python committer) | Date: 2015-11-09 12:38 | |
Don't worry, I've switched to using Python 3 in 2012, where this isn't a problem. ;) |
|||
| msg254391 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2015-11-09 15:29 | |
Walter Dörwald added the comment:
> Don't worry, I've switched to using Python 3 in 2012, where this isn't a problem. ;)
Wow, cool! We still have 1 or 2 customers stuck with Python 2, haha. |
|||
| History | | | |
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:56:54 | admin | set | github: 51516 |
| 2015-11-09 19:16:53 | berker.peksag | set | stage: commit review -> resolved |
| 2015-11-09 15:29:28 | vstinner | set | messages: + msg254391 |
| 2015-11-09 12:38:45 | doerwalter | set | messages: + msg254383 |
| 2015-11-09 11:23:06 | vstinner | set | status: open -> closed; resolution: fixed; messages: + msg254380 |
| 2015-11-09 11:22:22 | python-dev | set | nosy: + python-dev; messages: + msg254379 |
| 2015-11-09 10:51:06 | serhiy.storchaka | set | messages: + msg254378; stage: patch review -> commit review |
| 2015-11-09 10:14:57 | vstinner | set | messages: + msg254376 |
| 2015-11-09 08:59:00 | serhiy.storchaka | set | messages: + msg254373 |
| 2015-06-10 18:54:25 | jwilk | set | nosy: + jwilk |
| 2015-05-19 09:19:01 | serhiy.storchaka | set | nosy: + benjamin.peterson |
| 2015-05-13 09:02:11 | serhiy.storchaka | set | files: + int_format_c_warn.patch; messages: + msg243059 |
| 2015-05-08 10:33:20 | serhiy.storchaka | set | nosy: + serhiy.storchaka; messages: + msg242753 |
| 2015-05-07 19:07:48 | BreamoreBoy | set | nosy: + BreamoreBoy; messages: + msg242726 |
| 2014-05-01 01:35:46 | terry.reedy | set | title: format method: c presentation type broken -> format method: c presentation type broken in 2.7 |
| 2014-05-01 01:18:37 | terry.reedy | set | nosy: + terry.reedy; messages: + msg217674; stage: needs patch -> patch review |
| 2013-07-02 00:34:30 | vstinner | set | files: + int_format_c.patch; messages: + msg192169 |
| 2013-06-23 14:57:49 | terry.reedy | set | stage: test needed -> needs patch |
| 2013-03-23 21:28:00 | francismb | set | files: + issue7267.patch; keywords: + patch; messages: + msg185092 |
| 2013-03-23 20:52:57 | francismb | set | nosy: + francismb; messages: + msg185089 |
| 2011-11-19 14:03:05 | ezio.melotti | set | versions: - Python 2.6 |
| 2010-03-10 00:25:42 | vstinner | set | messages: + msg100772 |
| 2010-02-24 18:25:05 | eric.smith | set | priority: normal -> high |
| 2010-02-24 18:04:15 | eric.smith | set | priority: normal |
| 2010-01-23 00:46:34 | vstinner | set | messages: + msg98173 |
| 2010-01-21 11:38:47 | vstinner | set | nosy: + vstinner; messages: + msg98107 |
| 2010-01-14 00:11:48 | ezio.melotti | set | nosy: + ezio.melotti; stage: test needed |
| 2009-11-10 13:58:23 | doerwalter | set | messages: + msg95115 |
| 2009-11-10 13:20:17 | eric.smith | set | messages: + msg95113 |
| 2009-11-06 14:52:30 | doerwalter | set | messages: + msg94972 |
| 2009-11-06 14:09:20 | eric.smith | set | messages: + msg94969; versions: + Python 2.7 |
| 2009-11-05 16:30:22 | eric.smith | set | assignee: eric.smith; messages: + msg94936; nosy: + eric.smith |
| 2009-11-05 16:22:47 | doerwalter | create | |