Message105007
| Author |
lemburg |
| Recipients |
Arfrever, lemburg, loewis, pitrou, vstinner |
| Date |
2010-05-05.09:16:42 |
| Message-id |
<4BE13778.3080108@egenix.com> |
| In-reply-to |
<1273049882.6.0.568296637609.issue8610@psf.upfronthosting.co.za> |
| Content |
STINNER Victor wrote:
>
> STINNER Victor <victor.stinner@haypocalc.com> added the comment:
>
>> manpage for nl_langinfo() doesn't mention any errors that could
>> be raised by it
>
> It's more about get_codeset(). This function can fail for different reasons:
>
> - nl_langinfo() result is an empty string: "If item is not valid, a pointer to an empty string is returned." says the manpage
> - _PyCodec_Lookup() failed: unable to import the encoding codec module, there is no such codec, codec machinery is broken, etc.
> - the codec has no "name" attribute
> - strdup() failure (no more memory)
>
> Do you think that you should fall back to ASCII if nl_langinfo() result is an empty string, and to UTF-8 otherwise? get_codeset() failure is very unlikely, and I think that falling back to UTF-8 is just fine. A warning is printed to stderr, so the user should try to understand why get_codeset() failed.
I think that using ASCII is a safer choice in case of errors.
Using UTF-8 may be safe for reading file names, but it's not
safe for creating files or directories.
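
To illustrate the point, here is a rough Python sketch of the fallback being
discussed. It is not the actual get_codeset() code (which is C); the names
resolve_fs_encoding() and encode_filename() are hypothetical and only serve
the example. It assumes the caller passes in the raw nl_langinfo() result.

```python
import codecs


def resolve_fs_encoding(codeset, fallback="ascii"):
    """Return a normalized codec name, or `fallback` if detection fails.

    Hypothetical helper: mirrors two of the failure cases listed above
    (empty nl_langinfo() result, unknown codec), nothing more.
    """
    if not codeset:
        # nl_langinfo() returned an empty string.
        return fallback
    try:
        info = codecs.lookup(codeset)
    except LookupError:
        # No such codec is registered.
        return fallback
    # Guard against a codec without a usable "name" attribute.
    return getattr(info, "name", None) or fallback


def encode_filename(name, encoding):
    """Encode a file name strictly; non-encodable names raise an error."""
    return name.encode(encoding)


name = "caf\u00e9"

# Guessing UTF-8 always "works", even if the file system really uses
# Latin-1, so a wrongly named file would be created silently.
utf8_bytes = encode_filename(name, "utf-8")
seen_by_latin1_tools = utf8_bytes.decode("latin-1")

# With ASCII the same name is rejected instead of being written wrongly.
try:
    encode_filename(name, resolve_fs_encoding(""))
    ascii_rejected = False
except UnicodeEncodeError:
    ascii_rejected = True
```

With the UTF-8 guess, the bytes written to disk would show up as a different
name to tools that use the real encoding. With ASCII, the application gets an
error it can handle before anything is created.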
I also think that an application should be able to update the
file system encoding in such an error case (and only in such a case).
The application may have better knowledge about how it's being
used and can provide correct encoding information by other means. |
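
A minimal sketch of what such a restricted override could look like, written
in Python for brevity. The FsEncodingState class and its override() method
are hypothetical names for this example and do not describe an existing
interpreter API. The sketch assumes that a failed detection is signalled by
passing None.

```python
import codecs


class FsEncodingState:
    """Hypothetical holder for the file system encoding."""

    def __init__(self, detected):
        # `detected` is None when detection failed (assumption of this sketch).
        self.detection_failed = detected is None
        self.encoding = "ascii" if detected is None else detected

    def override(self, encoding):
        """Let the application set the encoding, but only after a failure."""
        if not self.detection_failed:
            raise RuntimeError("file system encoding was detected; "
                               "override is not allowed")
        # codecs.lookup() raises LookupError for unknown codecs, so an
        # invalid override leaves the ASCII fallback in place.
        self.encoding = codecs.lookup(encoding).name


# Detection failed: ASCII is used until the application knows better.
state = FsEncodingState(None)
state.override("UTF-8")

# Detection succeeded: the application may not change the encoding.
detected = FsEncodingState("utf-8")
try:
    detected.override("latin-1")
    override_blocked = False
except RuntimeError:
    override_blocked = True
```

Restricting the override to the error case keeps the encoding stable whenever
the locale information was usable.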
|