Message74237
| Author |
loewis |
| Recipients |
HWJ, amaury.forgeotdarc, bboissin, benjamin.peterson, djc, dlitz, draghuram, gvanrossum, loewis, pitrou, vstinner, zegreek |
| Date |
2008-10-03.10:43:51 |
| Message-id |
<48E5F765.30100@v.loewis.de> |
| In-reply-to |
<200810031238.20821.victor.stinner@haypocalc.com> |
| Content |
> Which charset is used when you use bytes filename?
It's the Windows "ANSI" code page, a system-wide, admin-modifiable
indirection to some real code page (changing it requires a reboot).
In the API, it is referred to as CP_ACP. It is also tied to the
"multi-byte" API, which is why Mark Hammond named the codec that
invokes it "mbcs" (IOW, "mbcs" is always the codec name for the
file system encoding). The specific code page that CP_ACP denotes
can be found with locale.getpreferredencoding(). Using that codec
name (which might be e.g. "cp1252") is not the same as using "mbcs",
because the named codec is a regular (table-driven) Python codec. In
particular, the Python codec will report errors for characters it
cannot encode, whereas the "mbcs" codec will silently substitute
replacement characters. |
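To make the difference concrete, here is a minimal sketch in Python 3 syntax. The file name and the choice of "cp1252" are illustrative assumptions, not taken from the issue. The "mbcs" branch runs only on Windows, and its result is deliberately not asserted: the replacement behaviour described above is that of the codec at the time of writing, and later Python versions may raise an error instead.

```python
import locale
import sys

# Hypothetical file name: Greek alpha has no mapping in cp1252.
name = "\u03b1.txt"

# Table-driven codec: strict by default, so the failure is reported.
try:
    name.encode("cp1252")
    strict_error = None
except UnicodeEncodeError as exc:
    strict_error = exc

# Replacement has to be requested explicitly with a table-driven codec.
replaced = name.encode("cp1252", errors="replace")

# The code page CP_ACP resolves to on Windows; elsewhere this is simply
# the locale's preferred encoding.
preferred = locale.getpreferredencoding()

# "mbcs" exists only on Windows. Whether it substitutes a replacement
# character or raises depends on the Python version, so capture both.
mbcs_result = None
if sys.platform == "win32":
    try:
        mbcs_result = name.encode("mbcs")
    except UnicodeEncodeError as exc:
        mbcs_result = exc

print("strict cp1252 :", repr(strict_error))
print("replace cp1252:", replaced)
print("preferred     :", preferred)
print("mbcs          :", repr(mbcs_result))
```

On a non-Windows system the last line prints None, since the "mbcs" codec is not available there.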
|
History
|
|---|
| Date |
User |
Action |
Args |
| 2008-10-03 10:43:52 | loewis | set | recipients:
+ loewis, gvanrossum, amaury.forgeotdarc, pitrou, vstinner, draghuram, benjamin.peterson, djc, HWJ, dlitz, zegreek, bboissin |
| 2008-10-03 10:43:51 | loewis | link | issue3187 messages |
| 2008-10-03 10:43:51 | loewis | create |
|