
This issue tracker has been migrated to GitHub and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: Regression in Python3 of email handling of unicode strings in headers
Type: behavior Stage: resolved
Components: Library (Lib) Versions: Python 3.2, Python 3.3
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: r.david.murray Nosy List: aikinci, python-dev, r.david.murray
Priority: high Keywords: easy, patch

Created on 2012-03-13 19:55 by r.david.murray, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name         Uploaded                   Description
Issue14291.patch  aikinci, 2012-03-14 00:51
Messages (4)
msg155656 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-03-13 19:55
In Python2, this works:
 >>> from email.mime.text import MIMEText
 >>> m = MIMEText('abc')
 >>> str(m)
 'From nobody Tue Mar 13 15:44:59 2012\nContent-Type: text/plain; charset="us-ascii"\nMIME-Version: 1.0\nContent-Transfer-Encoding: 7bit\n\nabc'
 >>> m['Subject'] = u'É test'
 >>> str(m)
 'From nobody Tue Mar 13 15:48:11 2012\nContent-Type: text/plain; charset="us-ascii"\nMIME-Version: 1.0\nContent-Transfer-Encoding: 7bit\nSubject: =?utf-8?q?=C3=89_test?=\n\nabc'
That is, unicode strings automatically get turned into encoded words.
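To make the encoded word above concrete, here is a small sketch that decodes it back to text with the standard library's `email.header.decode_header` and `make_header`. The `encoded` literal is copied from the Subject header in the Python2 output; the variable names are only illustrative.

```python
from email.header import decode_header, make_header

# The encoded word copied from the Subject header shown above.
encoded = "=?utf-8?q?=C3=89_test?="

# decode_header returns a list of (decoded value, charset) pairs.
pairs = decode_header(encoded)

# make_header rebuilds a Header object; str() gives the unicode text.
decoded = str(make_header(pairs))
print(decoded)
```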
In Python3 this no longer works:
 >>> from email.mime.text import MIMEText
 >>> m = MIMEText('abc')
 >>> str(m)
 'Content-Type: text/plain; charset="us-ascii"\nMIME-Version: 1.0\nContent-Transfer-Encoding: 7bit\n\nabc'
 >>> m['Subject'] = u'É test'
 >>> str(m)
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "/home/rdmurray/python/p33/Lib/email/message.py", line 154, in __str__
     return self.as_string()
   File "/home/rdmurray/python/p33/Lib/email/message.py", line 168, in as_string
     g.flatten(self, unixfrom=unixfrom)
   File "/home/rdmurray/python/p33/Lib/email/generator.py", line 99, in flatten
     self._write(msg)
   File "/home/rdmurray/python/p33/Lib/email/generator.py", line 152, in _write
     self._write_headers(msg)
   File "/home/rdmurray/python/p33/Lib/email/generator.py", line 186, in _write_headers
     header_name=h)
   File "/home/rdmurray/python/p33/Lib/email/header.py", line 205, in __init__
     self.append(s, charset, errors)
   File "/home/rdmurray/python/p33/Lib/email/header.py", line 286, in append
     s.encode(output_charset, errors)
 UnicodeEncodeError: 'ascii' codec can't encode character '\xc9' in position 0: ordinal not in range(128)
Presumably the problem is that the Python2 code tests for 'string' and, if
the value isn't a string, handles it by CTE encoding it. In Python3 everything
is a string. Probably what should happen is that the encoding error should
be caught and the CTE encoding done at that point, based on the model of how Python2 handled unicode strings.
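The idea in the paragraph above (try ASCII first, and only encode when that fails) can be sketched roughly as follows. This is an illustration under stated assumptions, not the committed patch: `encode_header_value` is a hypothetical helper name, and the choice of utf-8 as the fallback charset follows the Python2 output shown earlier.

```python
from email.header import Header, decode_header


def encode_header_value(value):
    """Hypothetical helper: return an ASCII-safe form of a header value."""
    try:
        # Plain ASCII values can be used in the header unchanged.
        value.encode("ascii")
        return value
    except UnicodeEncodeError:
        # Non-ASCII text: fall back to an encoded word using utf-8.
        return Header(value, "utf-8").encode()


plain = encode_header_value("plain test")
encoded = encode_header_value("É test")
print(plain)
print(encoded)

# Round trip: the encoded word decodes back to the original text.
raw, charset = decode_header(encoded)[0]
print(raw.decode(charset))
```

Catching the error rather than inspecting the type mirrors the suggestion in the message: in Python3 there is no separate unicode type to test for, so the failed ASCII encode is the signal that encoding is needed.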
msg155700 - Author: Ali Ikinci (aikinci) Date: 2012-03-14 00:51
David and I worked together on a fix and a test for this. Thanks, David.
msg155728 - Author: Roundup Robot (python-dev) (Python triager) Date: 2012-03-14 07:03
New changeset fd4b4650856f by R David Murray in branch '3.2':
#14291: if a header has non-ascii unicode, default to CTE using utf-8
http://hg.python.org/cpython/rev/fd4b4650856f
New changeset f5dcb2d58893 by R David Murray in branch 'default':
Merge #14291: if a header has non-ascii unicode, default to CTE using utf-8
http://hg.python.org/cpython/rev/f5dcb2d58893 
msg155729 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-03-14 07:04
Fix committed. Thanks Ali.
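For anyone checking the behavior on their own interpreter, the following sketch repeats the original reproduction. It assumes a Python 3 build that includes the changesets above and uses the default message policy; on such a build the non-ASCII Subject should be flattened as a utf-8 encoded word instead of raising `UnicodeEncodeError`.

```python
from email.mime.text import MIMEText

# Same steps as the original report, written for Python 3.
m = MIMEText('abc')
m['Subject'] = 'É test'

# Flattening the message no longer raises; the header is encoded.
flattened = str(m)
print(flattened)
```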
History
Date                 User            Action  Args
2022-04-11 14:57:27  admin           set     github: 58499
2012-03-14 07:04:58  r.david.murray  set     status: open -> closed
                                             resolution: fixed
                                             messages: + msg155729
                                             stage: needs patch -> resolved
2012-03-14 07:03:46  python-dev      set     nosy: + python-dev
                                             messages: + msg155728
2012-03-14 00:51:38  aikinci         set     files: + Issue14291.patch
                                             keywords: + patch
                                             messages: + msg155700
2012-03-13 19:55:33  r.david.murray  create
