Message 340841 - Python tracker


Author	immerrr again
Recipients	barry, immerrr again, jaraco, jayvdb, r.david.murray, tanzer@swing.co.at
Date	2019-04-25.14:01:12
Message-id	<1556200872.39.0.16827059931.issue25545@roundup.psfhosted.org>

Content

Hi everyone,
This is my first time using this bug tracker, so apologies in advance if I manage to break something on the first try.
I'm not sure whether this is the right place to report this, but I have the following repro involving email.message_from_bytes:
In [128]: import email 
 ...: msg_bytes = ( 
 ...: b'MIME-Version: 1.0\r\n' 
 ...: b'Content-Type: text/plain;\r\n' 
 ...: b' charset=utf-8\r\n' 
 ...: b'Content-Transfer-Encoding: 8bit\r\n' 
 ...: b'Content-Disposition: attachment;\r\n' 
 ...: b' filename="camper_store.csv"\r\n\r\n' 
 ...: ) + 'Beyoğlu-İst'.encode('utf8') 
 ...: email.message_from_bytes(msg_bytes).get_payload(decode=True) 
Out[128]: b'Beyo\xc4\x9flu-\xc4\xb0st'
I have read this and some previous bug reports, where it was clearly explained that message_from_string has its limitations and that message_from_bytes should be used for better results. If I'm not mistaken, my repro has everything set up correctly: CTE=8bit and a body encoded in utf8, which is explicitly declared as the content charset. Yet the result is still encoded with 'raw-unicode-escape'.
Is there something wrong with the input or is it a bug?
Thanks!
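
For anyone checking the repro, here is a minimal sketch of one way to inspect the result. It assumes the goal is to get the body back as text: get_payload(decode=True) returns bytes, which can be compared with the UTF-8 encoding of the body and then decoded using the charset declared in the Content-Type header. The names raw, charset, and text are illustrative and not part of the original report.

```python
import email

# Same message as in the repro above: 8bit CTE, charset=utf-8.
msg_bytes = (
    b'MIME-Version: 1.0\r\n'
    b'Content-Type: text/plain;\r\n'
    b' charset=utf-8\r\n'
    b'Content-Transfer-Encoding: 8bit\r\n'
    b'Content-Disposition: attachment;\r\n'
    b' filename="camper_store.csv"\r\n\r\n'
) + 'Beyoğlu-İst'.encode('utf8')

msg = email.message_from_bytes(msg_bytes)

# decode=True undoes the Content-Transfer-Encoding and returns bytes.
raw = msg.get_payload(decode=True)

# Charset taken from the Content-Type header; the 'ascii' fallback
# is an assumption for messages that do not declare one.
charset = msg.get_content_charset() or 'ascii'

# Turn the payload bytes into str using the declared charset.
text = raw.decode(charset)

print(raw == 'Beyoğlu-İst'.encode('utf8'))
print(text)
```

The first print shows whether the returned bytes match the UTF-8 encoding of the original body, which helps tell apart an escaped payload from an unmodified one.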

History
Date	User	Action	Args
2019-04-25 14:01:12	immerrr again	set	recipients: + immerrr again, barry, jaraco, r.david.murray, jayvdb, tanzer@swing.co.at
2019-04-25 14:01:12	immerrr again	set	messageid: <1556200872.39.0.16827059931.issue25545@roundup.psfhosted.org>
2019-04-25 14:01:12	immerrr again	link	issue25545 messages
2019-04-25 14:01:12	immerrr again	create
