[Python-Dev] Re: PEP 597: Add optional EncodingWarning

Thu, 11 Feb 2021 16:12:03 -0800

On Fri, Feb 12, 2021 at 5:18 AM Jim J. Jewett <[email protected]> wrote:
>
> Inada Naoki wrote:
>
> > Default encoding is used for:
>
> > a. Really need to use locale specific encoding
> > b. UTF-8 (bug. not work on Windows)
> > c. ASCII (not a bug, but slow on Windows)
>
> > I assume most usages are (b) and (c). This PEP can reduce them soon.
>
> Is this just an assumption, based on those times being visible to someone who 
> installs a lot of packages, or has the use of any locale other than UTF-8 and 
> ASCII really gone down a lot? Have browsers stopped using charset sniffing?
>
Using "most" was my mistake. I am not good at English. I should have said "many" here.
You can see many bugs caused by not specifying `encoding="utf-8"` on Q&A sites.
I wrote some numbers about these common bugs in the PEP.
UTF-8 is used by 96.3% of websites [1], although browsers still use
charset sniffing. But how does that relate to this PEP?
[1] https://w3techs.com/technologies/details/en-utf8
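As a minimal sketch of the kind of bug described above (the file name and sample text are made up for illustration): passing `encoding="utf-8"` on both the write and the read makes the round trip independent of the platform's locale encoding.

```python
import os
import tempfile

TEXT = "naïve café ✓\n"  # non-ASCII sample text, chosen only for illustration


def roundtrip_utf8(text):
    """Write text to a temporary file and read it back, both as UTF-8."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "example.txt")  # hypothetical file name
        # Without encoding=..., open() falls back to the locale encoding,
        # which is often not UTF-8 on Windows.
        with open(path, "w", encoding="utf-8") as f:
            f.write(text)
        with open(path, encoding="utf-8") as f:
            return f.read()


print(roundtrip_utf8(TEXT) == TEXT)
```

Dropping either `encoding="utf-8"` argument is what can make the same code fail or produce mojibake on a machine whose locale encoding is not UTF-8.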
> > Additionally, encoding="locale" will be backward/forward compatible
>
> What would be the problem with changing the default from None to locale?
It doesn't work on Python 3.9 and earlier.
So using `encoding="locale"` is not recommended until
users drop Python 3.9 support.
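For code that must still run on Python 3.9 but wants to request the locale encoding explicitly on newer versions, one possible sketch is a small version check. The helper name `locale_encoding_arg` is hypothetical; it is not part of the PEP or the standard library.

```python
import os
import sys


def locale_encoding_arg():
    """Return a value for open()'s encoding argument that selects the locale encoding.

    Hypothetical helper, shown only to illustrate the compatibility problem.
    """
    if sys.version_info >= (3, 10):
        return "locale"  # accepted by open() starting with Python 3.10
    return None          # "locale" does not work on 3.9 and earlier; None behaves the same


# os.devnull is used only so the example runs without creating a file.
with open(os.devnull, encoding=locale_encoding_arg()) as f:
    print(repr(f.read()))
```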
> (I think you mentioned that they are the same 99% of the time; is that other 
> 1% likely to be cases where locale is wrong but None is right? Would there 
> be a better way to represent that 1%?)
>
`encoding="locale"` and `encoding=None` behave the same, except that
`encoding="locale"` doesn't emit EncodingWarning even when the warning
is opted in.
There is little difference between `encoding=None` and
`encoding=locale.getpreferredencoding(False)`. They differ only when:
* Python is running on Windows, and
* the file is a console, and
* (for open()) PYTHONLEGACYWINDOWSSTDIO is set
* (for TextIOWrapper()) the file is not _WindowsConsoleIO
In that case, encoding=None uses the console code page but
encoding=locale.getpreferredencoding(False) uses the ANSI code page.
Otherwise, encoding=None and
encoding=locale.getpreferredencoding(False) are the same.
So `encoding=locale.getpreferredencoding(False)` can be used to
specify locale-specific encoding explicitly.
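A minimal sketch of that explicit form, assuming a throwaway temporary file and ASCII-only sample text so it runs under any locale (the function name is made up for illustration):

```python
import locale
import os
import tempfile


def write_then_read_locale(text):
    """Round-trip text through a file using the locale encoding, named explicitly."""
    enc = locale.getpreferredencoding(False)  # e.g. "UTF-8" or "cp1252"
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "example.txt")  # hypothetical file name
        with open(path, "w", encoding=enc) as f:
            f.write(text)
        with open(path, encoding=enc) as f:
            return enc, f.read()


enc, data = write_then_read_locale("hello\n")
print(enc, repr(data))
```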
But this PEP doesn't recommend it. This PEP recommends using
EncodingWarning just to find missing `encoding="utf-8"` (or any
other specific encoding) arguments.
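To illustrate the opt-in behavior, here is a sketch that assumes Python 3.10 or later, where the warning is enabled with `-X warn_default_encoding`. It runs a child interpreter and reports whether EncodingWarning appears on its stderr. The `-I` and `-S` flags are used only to keep unrelated startup output out of the child's stderr, and the `"default"` marker is a convention of this sketch meaning "omit the encoding argument".

```python
import os
import subprocess
import sys
import tempfile

# Code run in the child interpreter: open a file with or without encoding=...
CHILD = """
import sys
path, enc = sys.argv[1], sys.argv[2]
kwargs = {} if enc == "default" else {"encoding": enc}
with open(path, **kwargs) as f:
    f.read()
"""


def warning_output(enc):
    """Return the child's stderr when it opens a file with the given encoding."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "sample.txt")  # hypothetical file name
        with open(path, "w", encoding="utf-8") as f:
            f.write("hello\n")
        cmd = [sys.executable, "-I", "-S", "-X", "warn_default_encoding",
               "-c", CHILD, path, enc]
        result = subprocess.run(cmd, capture_output=True,
                                encoding="utf-8", errors="replace")
        return result.stderr


for enc in ("default", "locale", "utf-8"):
    print(enc, "EncodingWarning" in warning_output(enc))
```

On Python 3.10+ only the call that omits `encoding` should produce the warning, which is how the flag helps locate call sites that are missing an explicit encoding.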
-- 
Inada Naoki <[email protected]>
_______________________________________________
Python-Dev mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/[email protected]/message/PD4BTBAQHFUYOCF5QKIBDIMHATPVEFPW/
Code of Conduct: http://python.org/psf/codeofconduct/
