Created on 2012-09-16 19:59 by chris.jerdonek, last changed 2022-04-11 14:57 by admin. This issue is now closed.
| Files | | |
|---|---|---|
| File name | Uploaded | Description |
| issue-15952-1-branch-27.patch | chris.jerdonek, 2012-09-18 19:39 | |
| Messages (6) | |||
|---|---|---|---|
| msg170575 | Author: Chris Jerdonek (chris.jerdonek) (Python committer) | Date: 2012-09-16 19:59 | |
format(value) and value.__format__() behave differently even though the documentation says otherwise:
"Note: format(value, format_spec) merely calls value.__format__(format_spec)."
(from http://docs.python.org/library/functions.html?#format )
The difference happens when the format string is unicode. For example:
>>> format(10, u'n')
u'10'
>>> (10).__format__(u'n')  # parentheses needed to prevent SyntaxError
'10'
So either the documentation should be changed, or the behavior should be changed to match.
Related to this: neither the "Format Specification Mini-Language" documentation nor the string.Formatter docs seem to say anything about the effect that a unicode format string should have on the return value (in particular, should it cause the return value to be unicode or not):
http://docs.python.org/library/string.html#formatspec
http://docs.python.org/library/string.html#string-formatting
See also issue 15276 (int formatting), issue 15951 (empty format string), and issue 7300 (unicode arguments). |
|||
| msg170587 | Author: Chris Jerdonek (chris.jerdonek) (Python committer) | Date: 2012-09-17 06:26 | |
See this code comment:
/* don't define FORMAT_LONG, FORMAT_FLOAT, and FORMAT_COMPLEX, since we can live with only the string versions of those. The builtin format() will convert them to unicode. */
from http://hg.python.org/cpython/file/19601d451d4c/Python/formatter_unicode.c
In other words, it was deliberate not to make value.__format__(format_spec) return unicode when format_spec is unicode. So the docs should be adjusted to say that they are not always the same. |
|||
| msg170603 | Author: Eric V. Smith (eric.smith) (Python committer) | Date: 2012-09-17 12:46 | |
I believe the conversion is happening in Objects/abstract.c in PyObject_Format, around line 864, near this comment:
/* Convert to unicode, if needed. Required if spec is unicode and result is str */
I think changing the docs will result in more confusion than clarity, but if you can come up with some good wording, I'd be okay with it. I think changing the code will likely break things with little or no benefit. |
|||
| msg170669 | Author: Chris Jerdonek (chris.jerdonek) (Python committer) | Date: 2012-09-18 19:39 | |
Here is a proposed patch. One note on the patch: I feel the second sentence of the note is worth adding because value.__format__() departs from what PEP 3101 says:
"Note for Python 2.x: The 'format_spec' argument will be either a string object or a unicode object, depending on the type of the original format string. The __format__ method should test the type of the specifiers parameter to determine whether to return a string or unicode object. It is the responsibility of the __format__ method to return an object of the proper type."
The extra sentence will help in heading off, and in responding to, issues about value.__format__() that are similar to issue 15951. |
|||
| msg170671 | Author: Chris Jerdonek (chris.jerdonek) (Python committer) | Date: 2012-09-18 19:44 | |
To clarify, one of the sentences above should have read, "I feel the second sentence of the note *in the patch* was worth adding..." (not the second sentence of the PEP note I quoted). |
|||
| msg171026 | Author: Ezio Melotti (ezio.melotti) (Python committer) | Date: 2012-09-23 11:15 | |
``format(value, format_spec)`` merely calls
- ``value.__format__(format_spec)``.
+ ``value.__format__(format_spec)`` and, if *format_spec* is Unicode,
+ converts the value to Unicode if it is not already Unicode.
This is correct, but should be rephrased (and "value" should be "return value").
+ The method ``value.__format__(format_spec)`` may return 8-bit strings
+ for some built-in types when *format_spec* is Unicode.
This is not limited to built-in types. __format__() might return either str or unicode, and format() returns the same -- except for the aforementioned case.
This is a summary of the possible cases.
__format__ can return unicode or str:
>>> class Uni(object):
... def __format__(*args): return u'uni'
...
>>> class Str(object):
... def __format__(*args): return 'str'
...
format() and __format__ return the same value, except when the format_spec is unicode and __format__ returns str:
>>> format(Uni(), 'd'), Uni().__format__( 'd') # same
(u'uni', u'uni')
>>> format(Uni(), u'd'), Uni().__format__(u'd') # same
(u'uni', u'uni')
>>> format(Str(), 'd'), Str().__format__( 'd') # same
('str', 'str')
>>> format(Str(), u'd'), Str().__format__(u'd') # different
(u'str', 'str')
It is also not true that the type of the return value is the same as that of the format_spec, because in the first case the returned type is unicode even though the format_spec is str. Therefore this part of the patch should be changed:
+ Per :pep:`3101`, the function returns a Unicode object if *format_spec* is
+ Unicode. Otherwise, it returns an 8-bit string.
The behavior might be against PEP 3101 (see quotation in msg170669), even though the wording of the PEP is somewhat lenient IMHO ("proper type" doesn't necessarily mean "same type").
|
|||
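The conversion rule discussed in the messages above can be modeled with a short sketch. This is not the CPython implementation: it runs on Python 3 and assumes `bytes` as a stand-in for the Python 2 8-bit `str` and `str` as a stand-in for Python 2 `unicode`. The helper `py2_style_format` and the class `SpecTyped` are hypothetical names introduced only for illustration, and decoding with ASCII is a simplifying assumption.

```python
# Python 3 model of the Python 2 behaviour discussed in this issue.
# Assumption: `bytes` stands in for Python 2's 8-bit `str`, and `str`
# stands in for Python 2's `unicode`. `py2_style_format` is a hypothetical
# helper, not a CPython API.


def py2_style_format(value, format_spec):
    """Call value.__format__ and promote an 8-bit result for a text spec."""
    result = value.__format__(format_spec)
    # Follows the comment quoted from PyObject_Format: convert to unicode
    # only if the spec is unicode and the result is an 8-bit string.
    # Decoding as ASCII is an assumption made for this sketch.
    if isinstance(format_spec, str) and isinstance(result, bytes):
        result = result.decode("ascii")
    return result


class Uni(object):
    """__format__ always returns text (models Python 2 `unicode`)."""

    def __format__(self, format_spec):
        return "uni"


class Str(object):
    """__format__ always returns an 8-bit string (models Python 2 `str`)."""

    def __format__(self, format_spec):
        return b"str"


class SpecTyped(object):
    """Hypothetical class that returns the same type as format_spec,
    which is one reading of the PEP 3101 note quoted in msg170669."""

    def __format__(self, format_spec):
        return b"spec" if isinstance(format_spec, bytes) else "spec"


cases = [
    (Uni(), b"d"),
    (Uni(), "d"),
    (Str(), b"d"),
    (Str(), "d"),
    (SpecTyped(), b"d"),
    (SpecTyped(), "d"),
]
for value, spec in cases:
    via_function = py2_style_format(value, spec)
    via_method = value.__format__(spec)
    same_type = type(via_function) is type(via_method)
    print(type(value).__name__, repr(spec), repr(via_function),
          repr(via_method), same_type)
```

In this model the function and the method disagree only for `Str()` with a text spec, which corresponds to the "different" case in msg171026. A `__format__` written like `SpecTyped` never triggers the conversion, because its result already matches the type of the spec.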
| History | | | |
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:57:36 | admin | set | github: 60156 |
| 2020-05-31 12:11:06 | serhiy.storchaka | set | status: open -> closed; resolution: out of date; stage: patch review -> resolved |
| 2012-09-23 11:15:10 | ezio.melotti | set | messages: + msg171026 |
| 2012-09-22 18:32:04 | chris.jerdonek | link | issue15276 dependencies |
| 2012-09-22 14:06:16 | chris.jerdonek | set | nosy: + ezio.melotti |
| 2012-09-18 19:44:29 | chris.jerdonek | set | messages: + msg170671 |
| 2012-09-18 19:39:25 | chris.jerdonek | set | files: + issue-15952-1-branch-27.patch; keywords: + patch; messages: + msg170669; stage: patch review |
| 2012-09-17 12:46:45 | eric.smith | set | nosy: + eric.smith; messages: + msg170603 |
| 2012-09-17 06:26:10 | chris.jerdonek | set | messages: + msg170587 |
| 2012-09-16 20:01:37 | Arfrever | set | nosy: + Arfrever |
| 2012-09-16 19:59:51 | chris.jerdonek | create | |