Issue 1685453: email package should work better with unicode

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/44753

classification

Title:	email package should work better with unicode
Type:	behavior	Stage:	resolved
Components:	Library (Lib), Unicode	Versions:	Python 3.1, Python 3.2, Python 2.7, Python 2.6

process

Dependencies:	Superseder:
Status:	closed	Resolution:	out of date
Assigned To:	r.david.murray	Nosy List:	ajaksu2, barry, bgamari, eric.araujo, l0nwlf, ocean-city, pebbe, r.david.murray, sivang
Priority:	normal	Keywords:

Created on 2007-03-21 18:39 by barry, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (7)
msg31612 - (view)	Author: Barry A. Warsaw (barry) * (Python committer)	Date: 2007-03-21 18:39
This is a catch-all issue for improving the email package's handling of unicode. For now, please add issues/problems you find with email & unicode to this tracker item. For example: MIMEText()'s first argument should accept a unicode if _charset is also given. It should not be necessary to manually encode the first argument into an 8-bit string.
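For comparison, here is a minimal sketch of the requested behaviour on a current Python 3, where `str` is already Unicode and can be passed to `MIMEText` together with a charset without encoding it by hand. The sample text and the variable names are illustrative only and do not come from the report.

```python
from email.mime.text import MIMEText

# Pass the Unicode text directly; the charset argument tells the package
# how to encode the body when the message is serialized.
msg = MIMEText('H\u00e9', 'plain', 'utf-8')

# Round-trip check: undo the transfer encoding, then decode the bytes
# with the charset recorded in the Content-Type header.
payload_bytes = msg.get_payload(decode=True)
decoded = payload_bytes.decode(msg.get_content_charset())
```

Here `decoded` should compare equal to the original text, which is a quick way to confirm that no manual 8-bit encoding step was needed.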
msg84700 - (view)	Author: Daniel Diniz (ajaksu2) * (Python triager)	Date: 2009-03-30 22:56
Link to #1681333, #4487, #1443875, #1555842, #4661, #1078919, #963906, #1379416 and #1368247.
msg84753 - (view)	Author: Hirokazu Yamamoto (ocean-city) * (Python committer)	Date: 2009-03-31 06:05
Probably these are related too. #5259 #5304
msg100550 - (view)	Author: Peter Kleiweg (pebbe)	Date: 2010-03-06 22:45
In Python 3.1.1, email.mime.text.MIMEText accepts an 8-bit charset, but not utf-8.

I think you should not have to specify a charset. All strings are unicode now, so I think the package should choose an appropriate charset based on the characters in the text, us-ascii, some iso-8859 charset, or utf-8, whatever fits.

Python 3.1.1 (r311:74480, Oct 2 2009, 11:50:52)
[GCC 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from email.mime.text import MIMEText
>>> text = 'H\u00e9'
>>> msg = MIMEText(text, 'plain', 'iso-8859-1')
>>> print(msg.as_string())
Content-Type: text/plain; charset="iso-8859-1"
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable

H=E9
>>> msg = MIMEText(text, 'plain', 'utf-8')
Traceback (most recent call last):
  File "/my/opt/Python-3/lib/python3.1/email/message.py", line 269, in set_charset
    cte(self)
TypeError: 'str' object is not callable

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/my/opt/Python-3/lib/python3.1/email/mime/text.py", line 30, in __init__
    self.set_payload(_text, _charset)
  File "/my/opt/Python-3/lib/python3.1/email/message.py", line 234, in set_payload
    self.set_charset(charset)
  File "/my/opt/Python-3/lib/python3.1/email/message.py", line 271, in set_charset
    self._payload = charset.body_encode(self._payload)
  File "/my/opt/Python-3/lib/python3.1/email/charset.py", line 380, in body_encode
    return email.base64mime.body_encode(string)
  File "/my/opt/Python-3/lib/python3.1/email/base64mime.py", line 94, in body_encode
    enc = b2a_base64(s[i:i + max_unencoded]).decode("ascii")
TypeError: must be bytes or buffer, not str
>>>
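The "choose whatever charset fits" idea from this message can be sketched as a small helper that tries candidate charsets from narrowest to widest. This is only an illustration for a current Python 3: `pick_charset` and `make_text_part` are hypothetical names, not part of the email package, and the candidate list is an assumption based on the charsets mentioned above.

```python
from email.mime.text import MIMEText

def pick_charset(text, candidates=('us-ascii', 'iso-8859-1', 'utf-8')):
    """Return the first candidate charset that can encode `text`.

    Hypothetical helper: candidates are tried in order, so the list
    should run from the most restrictive charset to the least.
    """
    for charset in candidates:
        try:
            text.encode(charset)
        except UnicodeEncodeError:
            continue  # this charset cannot represent the text, try the next
        return charset
    raise ValueError('no candidate charset can encode the text')

def make_text_part(text, subtype='plain'):
    """Build a MIMEText part using the narrowest charset that fits."""
    return MIMEText(text, subtype, pick_charset(text))
```

With this sketch, plain ASCII text would be labelled us-ascii, 'H\u00e9' would be labelled iso-8859-1, and text containing a character such as '\u20ac' would fall through to utf-8.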
msg124715 - (view)	Author: R. David Murray (r.david.murray) * (Python committer)	Date: 2010-12-27 17:04
Now that we are primarily focused on Python 3 development, collecting "unicode" issues is not really all that useful (at least not to me, and I'm currently doing the email maintenance), so I'm closing this. All the relevant issues are assigned to me anyway, so I'll be dealing with them by and by.
msg148880 - (view)	Author: Sivan Greenberg (sivang)	Date: 2011-12-05 17:12
I am having a hard time parsing all the text/html and text/plain parts of a message and concatenating them into a string. I am thinking of writing some custom code to handle this manually... If this could be fixed, that would be great. The issues are converting from and to ascii/unicode or whatever encoding/charset the part uses.
msg148882 - (view)	Author: R. David Murray (r.david.murray) * (Python committer)	Date: 2011-12-05 18:14
That particular problem will get fixed in the next version of the email package (hopefully in Python 3.3), but that isn't ready yet.
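The task described in msg148880 (collecting every text/plain and text/html part as one string) can be sketched as follows, assuming a Python 3 release recent enough to provide `email.policy.default` and `get_content()`. The helper name `collect_text` and the sample message are illustrative only. Note that `get_content()` may raise `LookupError` if a part declares a charset Python does not know.

```python
import email
from email import policy
from email.message import EmailMessage

def collect_text(raw_bytes):
    """Concatenate the decoded text/plain and text/html parts of a message.

    Hypothetical helper: parsing with policy.default makes each part an
    EmailMessage, whose get_content() undoes the transfer encoding and
    decodes the body using the part's declared charset.
    """
    msg = email.message_from_bytes(raw_bytes, policy=policy.default)
    chunks = []
    for part in msg.walk():  # walk() also yields the multipart containers
        if part.get_content_type() in ('text/plain', 'text/html'):
            chunks.append(part.get_content())
    return ''.join(chunks)

# Build a small multipart/alternative message to exercise the helper.
sample = EmailMessage()
sample['Subject'] = 'demo'
sample.set_content('H\u00e9llo in plain text')
sample.add_alternative('<p>H\u00e9llo in HTML</p>', subtype='html')

combined = collect_text(sample.as_bytes())
```

The result is a single `str` holding both bodies, so no manual conversion between bytes and the part's charset is required in the calling code.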

History
Date	User	Action	Args
2022-04-11 14:56:23	admin	set	github: 44753
2011-12-05 18:14:12	r.david.murray	set	messages: + msg148882
2011-12-05 17:12:44	sivang	set	nosy: + sivang messages: + msg148880
2010-12-27 17:04:58	r.david.murray	set	status: open -> closed nosy: barry, ocean-city, ajaksu2, eric.araujo, r.david.murray, bgamari, l0nwlf, pebbe messages: + msg124715 dependencies: - Add utf8 alias for email charsets, email.parser: impossible to read messages encoded in a different encoding, smtplib is broken in Python3, email/base64mime.py cannot work, Add decode_header_as_string method to email.utils, Unicode email address helper, email.Header (via add_header) encodes non-ASCII content incorrectly, unicode in email.MIMEText and email/Charset.py, email.Header encode() unicode P2.6, email/charset.py convert() patch, email package and Unicode strings handling, email.header unicode fix resolution: out of date stage: test needed -> resolved
2010-07-17 10:20:11	eric.araujo	set	nosy: + eric.araujo
2010-06-24 10:40:09	l0nwlf	set	nosy: + l0nwlf
2010-05-05 13:34:46	barry	set	assignee: barry -> r.david.murray nosy: + r.david.murray
2010-03-06 22:45:26	pebbe	set	nosy: + pebbe messages: + msg100550
2009-06-18 01:37:46	r.david.murray	set	dependencies: + Add decode_header_as_string method to email.utils versions: + Python 3.2, - Python 3.0
2009-05-01 16:00:58	bgamari	set	nosy: + bgamari
2009-03-31 06:05:28	ocean-city	set	nosy: + ocean-city dependencies: + smtplib is broken in Python3, email/base64mime.py cannot work messages: + msg84753
2009-03-30 22:56:23	ajaksu2	set	dependencies: + Add utf8 alias for email charsets, email.parser: impossible to read messages encoded in a different encoding, Unicode email address helper, email.Header (via add_header) encodes non-ASCII content incorrectly, unicode in email.MIMEText and email/Charset.py, email.Header encode() unicode P2.6, email/charset.py convert() patch, email package and Unicode strings handling, email.header unicode fix type: behavior components: + Unicode versions: + Python 3.0, Python 3.1, Python 2.7 nosy: + ajaksu2 messages: + msg84700 stage: test needed
2007-03-21 18:39:03	barry	create
