Message324225
| Author |
vstinner |
| Recipients |
Michael.Felt, michael-o, terry.reedy, vstinner |
| Date |
2018-08-28 08:59:06 |
| Message-id |
<1535446746.19.0.56676864532.issue34403@psf.upfronthosting.co.za> |
| In-reply-to |
| Content |
...
> byte 0xA7 decoded to Unicode character U+00A7
...
Well, it confirms what I expected: nl_langinfo(CODESET) announces "roman8", but mbstowcs() uses Latin1 encoding in practice.
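To illustrate the mismatch, here is a minimal sketch comparing how byte 0xA7 decodes under the two encodings. It assumes Python's `hp_roman8` codec is a fair stand-in for the platform's "roman8" charset; that codec is not part of the original report, and the helper name `code_point` is hypothetical.

```python
# Sketch: show that byte 0xA7 decodes differently under Latin-1 and HP Roman8.
# Assumption: Python's "hp_roman8" codec approximates the platform's "roman8" charset.

RAW = b"\xa7"


def code_point(data: bytes, encoding: str) -> str:
    """Decode a single byte and return its code point as 'U+XXXX'."""
    char = data.decode(encoding)
    return f"U+{ord(char):04X}"


latin1_result = code_point(RAW, "latin-1")
roman8_result = code_point(RAW, "hp_roman8")

print("latin-1  :", latin1_result)
print("hp_roman8:", roman8_result)
```

If mbstowcs() really followed roman8, the decoded character would be expected to match the second line rather than U+00A7.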
So I wrote PR 8969, which forces the ASCII encoding in that case. I'm not sure how test_utf8_mode is supposed to be fixed in that case.
Michael: you can try to apply PR 8969, and then apply manually PR 8967 patch:
https://patch-diff.githubusercontent.com/raw/python/cpython/pull/8967.patch
But I expect that with both patches, test_utf8_mode will still fail on test_cmd_line(). You can try to modify test_cmd_line() to force encoding to "ascii".
What are the values of sys.getfilesystemencoding() and locale.getpreferredencoding() in the C locale with PR 8969 applied? I expect "roman8", which can cause issues in os.fsencode()/os.fsdecode(). Maybe Python should also force ASCII here? |
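A short script along these lines could collect the requested values. It is only a sketch: the function name `report_encodings` is hypothetical, and the `nl_langinfo` lookup is guarded because that function is not available on every platform.

```python
# Sketch: report the encodings Python selects for the current locale.
import locale
import sys


def report_encodings() -> dict:
    """Collect the encodings relevant to os.fsencode()/os.fsdecode()."""
    info = {
        "filesystem": sys.getfilesystemencoding(),
        "preferred": locale.getpreferredencoding(),
    }
    # nl_langinfo() and CODESET exist only on some platforms, so guard the call.
    if hasattr(locale, "nl_langinfo") and hasattr(locale, "CODESET"):
        info["codeset"] = locale.nl_langinfo(locale.CODESET)
    return info


encodings = report_encodings()
for name, value in encodings.items():
    print(f"{name}: {value}")
```

Running it with the C locale, for example `LC_ALL=C python3 script.py` (the script name is a placeholder), shows what Python picks in that environment.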
|