Message149948
| Author |
gz |
| Recipients |
benjamin.peterson, gz, poolie, r.david.murray, vstinner |
| Date |
2011-12-21.01:12:29 |
| SpamBayes Score |
3.7858605e-14 |
| Marked as misclassified |
No |
| Message-id |
<1324429950.17.0.0585650740961.issue13643@psf.upfronthosting.co.za> |
| In-reply-to |
| Content |
> During 1 month, we had PYTHONFSENCODING environment variable. It was not a
> good idea.
I strongly agree. There is no sense in having a separate configurable value; anyone who would think about using PYTHONFSENCODING should just change their locale instead. However, completely avoiding the need for manual intervention in a relatively narrow set of cases is still useful.
> Not after Python start. Using two encodings at the same would just adds new
> problems. On UNIX (at least on Linux?), it is mandatory to use the same
> encoding for:
>
> - command line arguments
> - environment variables
> - filenames
> - and more generally, all data exchanged with the system and other programs
Having more than one encoding on Unix is already a reality: there's nothing to stop someone setting, say, LANG=de_DE.UTF-8 and LC_MESSAGES=C.
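To illustrate how two locale categories can end up with different settings, here is a minimal sketch of the usual POSIX lookup order (LC_ALL, then the category's own variable, then LANG). The helper name `effective_locale` is hypothetical, and the fallback to "C" is an assumption, since the default is implementation-defined.

```python
def effective_locale(env, category):
    """Return the locale name that applies to one category.

    Hypothetical helper: follows the common POSIX precedence of
    LC_ALL, then the category variable, then LANG. Empty values
    are treated as unset. The "C" fallback is an assumption.
    """
    for name in ("LC_ALL", category, "LANG"):
        value = env.get(name)
        if value:
            return value
    return "C"


# The environment from the example above.
env = {"LANG": "de_DE.UTF-8", "LC_MESSAGES": "C"}

ctype_locale = effective_locale(env, "LC_CTYPE")        # falls back to LANG
messages_locale = effective_locale(env, "LC_MESSAGES")  # set explicitly
```

With this environment, LC_CTYPE resolves to a UTF-8 locale while LC_MESSAGES resolves to C, so a single process already sees two different settings.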
The real lesson is not that having more than one encoding is dangerous, but that having incompatible encodings is dangerous. As 'ascii' is a strict subset of 'utf-8', the cross-process communication issues are greatly lessened: at worst, things still fail outright rather than being silently corrupted.
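The subset relationship can be checked directly. The following is a small sketch, not part of any proposed patch; the `roundtrip` helper and the sample filenames are made up for illustration. It contrasts a compatible pair (ascii and utf-8), where a mismatch either works or raises, with an incompatible pair (utf-8 and latin-1), where data is silently mangled.

```python
# Every ASCII byte sequence decodes to the same text under UTF-8.
ascii_bytes = bytes(range(128))
same_text = ascii_bytes.decode("ascii") == ascii_bytes.decode("utf-8")


def roundtrip(text, write_encoding, read_encoding):
    """Encode as one process would, decode as another would.

    Hypothetical helper: returns None when either step fails.
    """
    try:
        return text.encode(write_encoding).decode(read_encoding)
    except UnicodeError:
        return None


# Writer uses ascii, reader uses utf-8: always safe.
ascii_to_utf8 = roundtrip("report.txt", "ascii", "utf-8")

# Writer uses utf-8, reader uses ascii: fails loudly on non-ASCII data.
utf8_to_ascii = roundtrip("caf\u00e9.txt", "utf-8", "ascii")

# Incompatible pair: decoding succeeds but the text is mangled.
utf8_to_latin1 = roundtrip("caf\u00e9.txt", "utf-8", "latin-1")
```

In the ascii/utf-8 case the failure mode is an exception, which is the "stuff just breaks" outcome; only the latin-1 case produces wrong text without any error.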
Expanding the filesystem default encoding to UTF-8 should be a very narrow change, mostly affecting only io and os operations. Other actions involving paths will still break if a non-ASCII string is used, but without the possibility of mangling data. |
|