Message340841
| Author |
immerrr again |
| Recipients |
barry, immerrr again, jaraco, jayvdb, r.david.murray, tanzer@swing.co.at |
| Date |
2019-04-25.14:01:12 |
| Message-id |
<1556200872.39.0.16827059931.issue25545@roundup.psfhosted.org> |
| In-reply-to |
| Content |
Hi everyone,
This is my first time using this bug tracker, so apologies in advance if I manage to break something on the first go.
I'm not sure this is the right place to report it, but I have the following repro involving email.message_from_bytes:
In [128]: import email
...: msg_bytes = (
...: b'MIME-Version: 1.0\r\n'
...: b'Content-Type: text/plain;\r\n'
...: b' charset=utf-8\r\n'
...: b'Content-Transfer-Encoding: 8bit\r\n'
...: b'Content-Disposition: attachment;\r\n'
...: b' filename="camper_store.csv"\r\n\r\n'
...: ) + 'Beyoğlu-İst'.encode('utf8')
...: email.message_from_bytes(msg_bytes).get_payload(decode=True)
Out[128]: b'Beyo\xc4\x9flu-\xc4\xb0st'
I have read this issue and some earlier bug reports, where it was clearly explained that message_from_string has its limitations and that message_from_bytes should be used for better results. If I'm not mistaken, my repro has everything set up correctly: CTE=8bit and a body encoded in UTF-8, which is explicitly declared as the content charset. Yet the result is still encoded with 'raw-unicode-escape'.
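To make the two encodings easier to tell apart, here is a minimal sketch that encodes the same text both ways and checks which one a payload matches. The helper name classify_payload is hypothetical and not part of the email package, and the message is a shortened version of the repro above (without the Content-Disposition header), so treat it as an illustration rather than a definitive test.

```python
import email

TEXT = 'Beyoğlu-İst'

# The same text under the two encodings discussed above.
utf8_bytes = TEXT.encode('utf-8')                  # b'Beyo\xc4\x9flu-\xc4\xb0st'
escaped_bytes = TEXT.encode('raw-unicode-escape')  # b'Beyo\\u011flu-\\u0130st'


def classify_payload(payload: bytes, text: str = TEXT) -> str:
    """Hypothetical helper: report which encoding of `text` the payload matches."""
    if payload == text.encode('utf-8'):
        return 'utf-8'
    if payload == text.encode('raw-unicode-escape'):
        return 'raw-unicode-escape'
    return 'unknown'


# Shortened repro: 8bit transfer encoding, charset declared as utf-8.
msg_bytes = (
    b'MIME-Version: 1.0\r\n'
    b'Content-Type: text/plain;\r\n'
    b' charset=utf-8\r\n'
    b'Content-Transfer-Encoding: 8bit\r\n'
    b'\r\n'
) + utf8_bytes

msg = email.message_from_bytes(msg_bytes)
payload = msg.get_payload(decode=True)

print(utf8_bytes)
print(escaped_bytes)
# Which encoding the parsed payload matches may depend on the Python version.
print(classify_payload(payload))
```

If the last line prints 'utf-8', then payload.decode(msg.get_content_charset()) should give back the original text.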
Is there something wrong with the input or is it a bug?
Thanks! |
|