This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer Guide.

classification
Title: str.format and string.Formatter bug with French (and other) locale
Type: behavior
Stage: resolved
Components: Interpreter Core
Versions: Python 3.6

process
Status: closed
Resolution: fixed
Dependencies:
Superseder: float.__format__('n') fails with _PyUnicode_CheckConsistency assertion error for locales with non-ascii thousands separator (View: 33954)
Assigned To:
Nosy List: canuck7, eric.smith, vstinner
Priority: normal
Keywords:

Created on 2018-12-06 21:54 by canuck7, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Messages (3)
msg331254 - (view) Author: Bruno Chanal (canuck7) Date: 2018-12-06 21:54
The short story: small numbers are not displayed properly when output is formatted with str.format or string.Formatter() under a French (language) locale or similar. The problem probably extends to other locales.
Long story:
---
$ lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 18.04.1 LTS
Release:	18.04
Codename:	bionic
$ python3
Python 3.6.7 (default, Oct 22 2018, 11:32:17) 
[GCC 8.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale
>>> locale.setlocale(locale.LC_ALL, '')
'fr_CA.UTF-8'
>>> print('{:n}'.format(10)) # Garbled output
>>> print('{:n}'.format(10000)) # OK
10 000
>>> # Note: narrow non-break space used as thousands separator
... pass
>>> locale.format_string('%d', 10, grouping=True) # OK
'10'
>>> locale.format_string('%d', 10123) # OK
'10123'
>>> locale.format_string('%d', 10123, grouping=True) # OK thousands separator \u202f
'10\u202f123'
>>> import string
>>> print(string.Formatter().format('{:n}', 10)) # Same problem with Formatter
AB
>>> print(string.Formatter().format('{:n}', 10000))
10 000
Locale-aware functions implementing the {:n} formatting code, such as str.format and string.Formatter, generate garbled output for small numbers under a French locale.
However, locale.format_string('%d', numeric_value) produces valid strings, so it can serve as a workaround for the time being.
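The workaround can be wrapped in a small helper. This is only a minimal sketch: the name `format_int_grouped` is hypothetical (it does not come from the report), and the only library call it relies on is `locale.format_string`, which the session above shows producing valid output.

```python
import locale


def format_int_grouped(value):
    """Format an integer using the current locale's digit grouping.

    Hypothetical helper (not part of the report): it avoids the '{:n}'
    code path by going through locale.format_string instead.
    """
    return locale.format_string('%d', value, grouping=True)
```

After calling `locale.setlocale(locale.LC_ALL, '')`, as in the session above, `format_int_grouped(10123)` should return the same string as the `locale.format_string('%d', 10123, grouping=True)` call shown there. The exact separator depends on the locale data installed on the system.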
The problem seems to originate from a new version of Ubuntu: I ran the same program about 18 months ago and didn't notice any problem.
My $0.02 worth of analysis: with small numbers, the output of str.format is some random, changing value. The behavior is reminiscent of invalid memory reads in C functions, e.g., a parameter mismatch in a function call, or similar. It feels like format does not expect, or properly deal with, wide (non-ASCII) Unicode characters as part of numbers. However, the thousands separator is a NARROW NO-BREAK SPACE (U+202F) in most Ubuntu French locales (and quite a few others).
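To check which character a given locale actually uses as its thousands separator, something like the following could be used. It is a sketch rather than part of the original report: the function names `describe_chars` and `describe_thousands_sep` are made up for illustration, and the result depends on the locale data installed on the system.

```python
import locale
import unicodedata


def describe_chars(text):
    """Return (code point, Unicode name) pairs for each character in text."""
    return [('U+%04X' % ord(ch), unicodedata.name(ch, '<unnamed>'))
            for ch in text]


def describe_thousands_sep():
    """Describe the thousands separator of the current LC_NUMERIC locale."""
    return describe_chars(locale.localeconv()['thousands_sep'])
```

For the separator seen in the session above, `describe_chars('\u202f')` gives `[('U+202F', 'NARROW NO-BREAK SPACE')]`. Under the default "C" locale, `describe_thousands_sep()` returns an empty list, because that locale defines no thousands separator.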
The problem shows up in Python 3.6 and 3.7.
This might also be a security issue...
msg331259 - (view) Author: Eric V. Smith (eric.smith) * (Python committer) Date: 2018-12-07 01:22
Possibly related to issue 33954?
msg333023 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2019-01-04 23:31
This bug is a duplicate of bpo-33954. Good news: it's now fixed in Python 3.6.8!
https://docs.python.org/3.6/whatsnew/changelog.html#python-3-6-8-release-candidate-1
"bpo-33954: For str.format(), float.__format__() and complex.__format__() methods for non-ASCII decimal point when using the "n" formatter."
Note: I fixed other bugs with special locales (mostly LC_NUMERIC and/or LC_MONETARY using a different encoding than the LC_CTYPE locale).
History
Date User Action Args
2022-04-11 14:59:09 admin set github: 79613
2019-01-04 23:31:03 vstinner set status: open -> closed
    superseder: float.__format__('n') fails with _PyUnicode_CheckConsistency assertion error for locales with non-ascii thousands separator
    messages: + msg333023
    resolution: fixed
    stage: resolved
2018-12-22 10:23:48 xtreak set nosy: + vstinner
2018-12-07 01:22:44 eric.smith set nosy: + eric.smith
    messages: + msg331259
2018-12-06 21:54:23 canuck7 create
