This issue tracker has been migrated to GitHub and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: email package should work better with unicode
Type: behavior
Stage: resolved
Components: Library (Lib), Unicode
Versions: Python 3.1, Python 3.2, Python 2.7, Python 2.6
process
Status: closed
Resolution: out of date
Dependencies:
Superseder:
Assigned To: r.david.murray
Nosy List: ajaksu2, barry, bgamari, eric.araujo, l0nwlf, ocean-city, pebbe, r.david.murray, sivang
Priority: normal
Keywords:

Created on 2007-03-21 18:39 by barry, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (7)
msg31612 - Author: Barry A. Warsaw (barry) * (Python committer) Date: 2007-03-21 18:39
This is a catch-all issue for improving the email package's handling of unicode. For now, please add issues/problems you find with email & unicode to this tracker item.
For example:
MIMEText()'s first argument should accept a unicode string when _charset is also given. It should not be necessary to manually encode the first argument into an 8-bit string.
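For readers on a current Python 3 release, where every str is already unicode, the behavior requested above can be sketched as follows. This is a minimal illustration, not the patch discussed in this issue; the sample text is made up.

```python
from email.mime.text import MIMEText

# In Python 3, str is already unicode, so the text is passed as-is
# together with the charset; no manual .encode() call is made here.
text = "H\u00e9llo w\u00f6rld"
msg = MIMEText(text, "plain", "utf-8")

# The charset is recorded as a parameter of the Content-Type header.
print(msg["Content-Type"])

# Decoding the transfer-encoded payload with that charset recovers the text.
decoded = msg.get_payload(decode=True).decode(msg.get_content_charset())
print(decoded == text)
```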
msg84700 - Author: Daniel Diniz (ajaksu2) * (Python triager) Date: 2009-03-30 22:56
Link to #1681333, #4487, #1443875, #1555842, #4661, #1078919, #963906,
#1379416 and #1368247.
msg84753 - Author: Hirokazu Yamamoto (ocean-city) * (Python committer) Date: 2009-03-31 06:05
These are probably related too: #5259, #5304.
msg100550 - Author: Peter Kleiweg (pebbe) Date: 2010-03-06 22:45
In Python 3.1.1, email.mime.text.MIMEText accepts an 8-bit charset, but not utf-8.
I think you should not have to specify a charset. All strings are unicode now, so the package should choose an appropriate charset based on the characters in the text: us-ascii, some iso-8859 charset, or utf-8, whichever fits.
Python 3.1.1 (r311:74480, Oct 2 2009, 11:50:52) 
[GCC 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)] on linux2 
Type "help", "copyright", "credits" or "license" for more information. 
>>> from email.mime.text import MIMEText 
>>> text = 'H\u00e9' 
>>> msg = MIMEText(text, 'plain', 'iso-8859-1') 
>>> print(msg.as_string()) 
Content-Type: text/plain; charset="iso-8859-1" 
MIME-Version: 1.0 
Content-Transfer-Encoding: quoted-printable 
 
H=E9 
>>> msg = MIMEText(text, 'plain', 'utf-8') 
Traceback (most recent call last):
  File "/my/opt/Python-3/lib/python3.1/email/message.py", line 269, in set_charset
    cte(self)
TypeError: 'str' object is not callable

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/my/opt/Python-3/lib/python3.1/email/mime/text.py", line 30, in __init__
    self.set_payload(_text, _charset)
  File "/my/opt/Python-3/lib/python3.1/email/message.py", line 234, in set_payload
    self.set_charset(charset)
  File "/my/opt/Python-3/lib/python3.1/email/message.py", line 271, in set_charset
    self._payload = charset.body_encode(self._payload)
  File "/my/opt/Python-3/lib/python3.1/email/charset.py", line 380, in body_encode
    return email.base64mime.body_encode(string)
  File "/my/opt/Python-3/lib/python3.1/email/base64mime.py", line 94, in body_encode
    enc = b2a_base64(s[i:i + max_unencoded]).decode("ascii")
TypeError: must be bytes or buffer, not str
>>>
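The charset selection suggested in this message could look roughly like the sketch below. The helper names pick_charset and make_text_part are hypothetical, and the candidate order (us-ascii, then iso-8859-1, then utf-8) is an assumption made for illustration; it is not what the email package itself does. It assumes a recent Python 3, where passing utf-8 to MIMEText no longer fails as in the session above.

```python
from email.mime.text import MIMEText

# Narrow charsets to try before falling back to utf-8.
# The choice and order of candidates is an assumption for this sketch.
NARROW_CHARSETS = ("us-ascii", "iso-8859-1")


def pick_charset(text):
    """Return the first candidate charset that can encode all of text."""
    for charset in NARROW_CHARSETS:
        try:
            text.encode(charset)
        except UnicodeEncodeError:
            continue
        return charset
    # utf-8 covers every character a well-formed str can contain.
    return "utf-8"


def make_text_part(text):
    """Build a text/plain part using the narrowest charset that fits."""
    return MIMEText(text, "plain", pick_charset(text))


part = make_text_part("H\u00e9")
print(part.get_content_charset())
```

A real implementation would probably want a longer, locale-aware candidate list; the point here is only that the caller never has to name a charset.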
msg124715 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-12-27 17:04
Now that we are primarily focused on Python 3 development, collecting "unicode" issues is not really all that useful (at least not to me, and I'm currently doing the email maintenance), so I'm closing this. All the relevant issues are assigned to me anyway, so I'll be dealing with them by and by.
msg148880 - Author: Sivan Greenberg (sivang) Date: 2011-12-05 17:12
I am having a hard time parsing all the text/html and text/plain parts of a message and concatenating them into a string. I am thinking of writing some custom code to handle this manually...
If this could be fixed that would be great. The issues are converting from and to ascii/unicode or whatever encoding/charset the part uses.
msg148882 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-12-05 18:14
That particular problem will get fixed in the next version of the email package (hopefully in Python 3.3), but that isn't ready yet.
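For readers arriving later: on Python 3.6 and newer, the task described in msg148880 can be approached with the policy-based email API. The sketch below is an illustration under that assumption, not necessarily the fix anticipated in this message; collect_text and the SAMPLE message are hypothetical names and data invented for the example.

```python
from email import message_from_bytes, policy


def collect_text(raw_bytes):
    """Concatenate every text/plain and text/html part as decoded str."""
    # policy.default makes the parser return EmailMessage objects,
    # whose get_content() decodes a text part using its declared charset.
    msg = message_from_bytes(raw_bytes, policy=policy.default)
    chunks = []
    for part in msg.walk():
        if part.get_content_type() in ("text/plain", "text/html"):
            chunks.append(part.get_content())
    return "\n".join(chunks)


# A made-up two-part message mixing charsets and transfer encodings.
SAMPLE = (
    b"MIME-Version: 1.0\n"
    b'Content-Type: multipart/alternative; boundary="b"\n'
    b"\n"
    b"--b\n"
    b'Content-Type: text/plain; charset="iso-8859-1"\n'
    b"Content-Transfer-Encoding: quoted-printable\n"
    b"\n"
    b"H=E9\n"
    b"--b\n"
    b'Content-Type: text/html; charset="utf-8"\n'
    b"Content-Transfer-Encoding: base64\n"
    b"\n"
    b"PHA+SMOpPC9wPg==\n"
    b"--b--\n"
)

combined = collect_text(SAMPLE)
print(combined)
```

Note that this walks every part, so text attachments are included as well; filtering on the Content-Disposition header would be needed to skip them.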
History
Date  User  Action  Args
2022-04-11 14:56:23  admin  set  github: 44753
2011-12-05 18:14:12  r.david.murray  set  messages: + msg148882
2011-12-05 17:12:44  sivang  set  nosy: + sivang
    messages: + msg148880
2010-12-27 17:04:58  r.david.murray  set  status: open -> closed
    nosy: barry, ocean-city, ajaksu2, eric.araujo, r.david.murray, bgamari, l0nwlf, pebbe
    messages: + msg124715
    dependencies: - Add utf8 alias for email charsets, email.parser: impossible to read messages encoded in a different encoding, smtplib is broken in Python3, email/base64mime.py cannot work, Add decode_header_as_string method to email.utils, Unicode email address helper, email.Header (via add_header) encodes non-ASCII content incorrectly, unicode in email.MIMEText and email/Charset.py, email.Header encode() unicode P2.6, email/charset.py convert() patch, email package and Unicode strings handling, email.header unicode fix
    resolution: out of date
    stage: test needed -> resolved
2010-07-17 10:20:11  eric.araujo  set  nosy: + eric.araujo
2010-06-24 10:40:09  l0nwlf  set  nosy: + l0nwlf
2010-05-05 13:34:46  barry  set  assignee: barry -> r.david.murray
    nosy: + r.david.murray
2010-03-06 22:45:26  pebbe  set  nosy: + pebbe
    messages: + msg100550
2009-06-18 01:37:46  r.david.murray  set  dependencies: + Add decode_header_as_string method to email.utils
    versions: + Python 3.2, - Python 3.0
2009-05-01 16:00:58  bgamari  set  nosy: + bgamari
2009-03-31 06:05:28  ocean-city  set  nosy: + ocean-city
    dependencies: + smtplib is broken in Python3, email/base64mime.py cannot work
    messages: + msg84753
2009-03-30 22:56:23  ajaksu2  set  dependencies: + Add utf8 alias for email charsets, email.parser: impossible to read messages encoded in a different encoding, Unicode email address helper, email.Header (via add_header) encodes non-ASCII content incorrectly, unicode in email.MIMEText and email/Charset.py, email.Header encode() unicode P2.6, email/charset.py convert() patch, email package and Unicode strings handling, email.header unicode fix
    type: behavior
    components: + Unicode
    versions: + Python 3.0, Python 3.1, Python 2.7
    nosy: + ajaksu2
    messages: + msg84700
    stage: test needed
2007-03-21 18:39:03  barry  create
