Message105010
| Author |
lemburg |
| Recipients |
Arfrever, lemburg, loewis, pitrou, vstinner |
| Date |
2010-05-05.10:07:00 |
| Message-id |
<4BE14342.2030502@egenix.com> |
| In-reply-to |
<1273051865.07.0.321860913525.issue8610@psf.upfronthosting.co.za> |
| Content |
STINNER Victor wrote:
>
> STINNER Victor <victor.stinner@haypocalc.com> added the comment:
>
>> I think that using ASCII is a safer choice in case of errors.
>
> I chose UTF-8 to keep backward compatibility: PyUnicode_DecodeFSDefaultAndSize() uses utf-8 if Py_FileSystemDefaultEncoding==NULL. If the OS has no nl_langinfo(CODESET) function at all, Python 3 uses utf-8.
Ouch, that was a poor choice. In Python we have a tradition of
avoiding guessing, if possible. Since we cannot guarantee that the
file system will indeed use UTF-8, it would have been safer to
use ASCII. Not sure why this reasoning wasn't applied to
the file system encoding.
Nothing we can do about it now, though.
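To illustrate the difference between guessing and failing, here is a minimal sketch. The file name and the Latin-1 file system are assumptions made up for the example, not something taken from this issue:

```python
# Hypothetical file name containing one non-ASCII character.
name = "café.txt"

# Guessing UTF-8: encoding always succeeds, even when the file system
# actually expects another encoding (Latin-1 is assumed here).
utf8_bytes = name.encode("utf-8")

# What a Latin-1 file system would show for those bytes.
seen_by_latin1_fs = utf8_bytes.decode("latin-1")

# Falling back to ASCII: the problem surfaces as an error instead of
# a silently mangled name.
try:
    name.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

print(repr(seen_by_latin1_fs), ascii_failed)
```

With UTF-8 the file gets created under a mangled name; with ASCII the caller at least learns that the name could not be represented.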
>> Using UTF-8 may be safe for reading file names, but it's not
>> safe for creating files or directories.
>
> Well, I don't know. Maybe you are right. And which encoding should be used if the nl_langinfo(CODESET) function is missing: ASCII or UTF-8?
>
> UTF-8 is also an optimistic choice: I bet that more and more OSes will move to UTF-8.
I think we should also add a new environment variable to override
the automatic determination of the file system encoding, much like
what we have for the I/O encoding:
PYTHONFSENCODING: Encoding[:errors] used for file system.
(that would need to go on a new ticket, though) |
|
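A rough sketch of how the proposed override could be parsed, written in Python for readability. PYTHONFSENCODING is only the proposal made in this message, so the variable is hypothetical here. The helper name `resolve_fs_encoding`, the "strict" default error handler and the fallback to `sys.getfilesystemencoding()` are likewise assumptions for illustration, not the interpreter's actual behaviour:

```python
import codecs
import os
import sys


def resolve_fs_encoding(environ=None):
    """Return (encoding, errors) to use for file names.

    Hypothetical helper: reads the proposed PYTHONFSENCODING variable,
    using the same "encoding[:errors]" syntax as PYTHONIOENCODING.
    """
    if environ is None:
        environ = os.environ
    value = environ.get("PYTHONFSENCODING", "")
    if not value:
        # No override: keep what the interpreter determined by itself.
        return sys.getfilesystemencoding(), "strict"
    encoding, _, errors = value.partition(":")
    if not encoding:
        # Only ":errors" was given; keep the automatic encoding.
        encoding = sys.getfilesystemencoding()
    # Fail early (LookupError) on an unknown codec or error handler.
    codecs.lookup(encoding)
    if errors:
        codecs.lookup_error(errors)
    return encoding, errors or "strict"


print(resolve_fs_encoding({"PYTHONFSENCODING": "latin-1:replace"}))
```

Validating the value up front means a typo in the variable is reported at startup rather than on the first file system call.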