Message227346
| Author |
pitrou |
| Recipients |
Arfrever, ezio.melotti, lemburg, ncoghlan, pitrou, r.david.murray, serhiy.storchaka, vstinner |
| Date |
2014-09-23.11:23:47 |
| SpamBayes Score |
-1.0 |
| Marked as misclassified |
Yes |
| Message-id |
<1411471427.31.0.791719184051.issue18814@psf.upfronthosting.co.za> |
| In-reply-to |
| Content |
The encoding used impacts the result:
>>> s = 'abc\udcc3\udca9'
>>> s.encode('ascii', 'surrogateescape').decode('ascii', 'replace')
'abc��'
>>> s.encode('utf-8', 'surrogateescape').decode('utf-8', 'replace')
'abcé'
The original string ('abc\udcc3\udca9') was obtained by decoding a valid UTF-8 byte string (b'abc\xc3\xa9') with the 'ascii' codec and the 'surrogateescape' error handler: each non-ASCII byte 0xNN becomes the lone surrogate U+DCNN.
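For reference, a minimal sketch that reproduces the whole sequence from raw bytes. The input bytes and the `roundtrip` helper are assumptions made for illustration; they are not part of any proposed API.

```python
# Assumed input: the UTF-8 encoding of 'abcé'.
raw = b'abc\xc3\xa9'

# Decoding with 'ascii' + 'surrogateescape' does not fail on the two
# non-ASCII bytes; each byte 0xNN is smuggled through as U+DCNN.
s = raw.decode('ascii', 'surrogateescape')


def roundtrip(text, encoding):
    """Hypothetical helper: turn escaped surrogates back into bytes,
    then decode those bytes again, replacing anything undecodable."""
    data = text.encode(encoding, 'surrogateescape')
    return data.decode(encoding, 'replace')


# With 'ascii' the two restored bytes are undecodable, so each one
# becomes U+FFFD. With 'utf-8' they form a valid sequence for 'é'.
ascii_result = roundtrip(s, 'ascii')
utf8_result = roundtrip(s, 'utf-8')

print(ascii(s))
print(ascii(ascii_result))
print(ascii(utf8_result))
```

Both calls recover the same bytes from the surrogates; only the final decode differs, which is why the choice of encoding changes the result.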
If anything, the default encoding should probably be sys.getfilesystemencoding(). |
|