This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: UTF-8 Email Subject problem
Type: behavior
Stage: commit review
Components: Library (Lib)
Versions: Python 3.2, Python 3.3

process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: r.david.murray
Nosy List: loewis, msladek, python-dev, r.david.murray, tati_alchueyr
Priority: normal
Keywords:

Created on 2012-02-20 08:03 by msladek, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name  Uploaded  Description
issue14062_buggy_email_subject.py  tati_alchueyr, 2012-03-13 16:31  Code used as example for issue 14062, but that didn't reproduce the bug locally
a.py  loewis, 2012-03-14 17:16
Messages (11)
msg153766 - Author: Michal Sladek (msladek) Date: 2012-02-20 08:03
Hello!
I think there is a problem when adding a UTF-8 subject to an email message. I wrote the following function (its code is based on examples I found in the official docs), which should send an email with a UTF-8 subject, a UTF-8 plain-text body, and an attached file when all arguments are given.
fromAddr - address of sender
toAddr - address of recipient
subject - subject
body - text of email body
attachment - full path to file we want to attach
Here is the code:
import email.encoders
import email.header
import email.mime.base
import email.mime.multipart
import email.mime.text
import os
import smtplib

# smtpSrv, smtpPort, smtpUser and smtpPass must be defined at module level.
def sendMail (fromAddr, toAddr, subject, body = '', attachment = ''):
    message = email.mime.multipart.MIMEMultipart()
    message.add_header('From',fromAddr)
    message.add_header('To',toAddr)
    message['Subject'] = email.header.Header(subject,'utf-8')
    if (body != ''):
        msgPart = email.mime.text.MIMEText(body,'plain','utf-8')
        message.attach(msgPart)
    if (attachment != ''):
        if os.path.exists(attachment) == True:
            filename = attachment.rpartition(os.sep)[2]
            fp = open(attachment,'rb')
            msgPart = email.mime.base.MIMEBase('application','octet-stream')
            msgPart.set_payload(fp.read())
            fp.close()
            email.encoders.encode_base64(msgPart)
            msgPart.add_header('Content-Disposition','attachment',filename=filename)
            message.attach(msgPart)
    if smtpPort == 25:
        smtpCon = smtplib.SMTP(smtpSrv,smtpPort)
    else:
        smtpCon = smtplib.SMTP_SSL(smtpSrv,smtpPort)
    if (smtpUser != '') and (smtpPass != ''):
        smtpCon.login(smtpUser,smtpPass)
    smtpCon.send_message(message,mail_options=['UTF8SMTP','8BITMIME'])
    smtpCon.quit()
Running the function with the following arguments:
sendMail('rzrobot@seznam.cz','msladek@volny.cz','žluťoučký kůň','úpěl ďábelské ódy')
produces the following output on the receiving side:
Return-Path: <rzrobot@seznam.cz>
Received: from smtp2.seznam.cz (smtp2.seznam.cz [77.75.76.43])
	by mx1.volny.cz (Postfix) with ESMTP id DD6BB2E09CD
	for <msladek@volny.cz>; Mon, 20 Feb 2012 08:34:38 +0100 (CET)
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=seznam.cz;
	h=Received:Content-Type:MIME-Version:From:To:Subject:--===============1029508565==:MIME-Version:Content-Transfer-Encoding:X-Smtpd:X-Seznam-User:X-Session:X-Country:X-Virus-Info:X-Seznam-SPF:X-Seznam-DomainKeys;
	b=cdU1VSRTCDf0x2CeBNbLJxYSOhSy7r9lNp+1s7+bed6AGBI48vufe3q7f8JFxlfTc
	ulZIDptWi6PMvlZYCBkh1uzTKcihZR7MCoxgW0PJLO1LX5elTJsZ/GTc5oe/GZXkTPT
	qwj1EQIlVn0dpZtt4jIzfC2RrO2IRieR2rozeQM=
Received: from dvr.ph.sladkovi.eu (ip-84-42-150-218.net.upcbroadband.cz [84.42.150.218])	by email-relay2.ng.seznam.cz (Seznam SMTPD 1.2.15-6@18976) with ESMTP;	Mon, 20 Feb 2012 08:34:35 +0100 (CET) 
Content-Type: multipart/mixed; boundary="===============1029508565=="
MIME-Version: 1.0
From: rzrobot@seznam.cz
To: msladek@volny.cz
Subject: =?utf-8?b?xb5sdcWlb3XEjWvDvSBrxa/FiA==?= 
X-DKIM-Status: fail
X-Virus: no (m2.volny.internal - Mon, 20 Feb 2012 08:34:40 +0100 (CET))
X-Spam: no (m2.volny.internal - Mon, 20 Feb 2012 08:34:41 +0100 (CET))
X-Received-Date: Mon, 20 Feb 2012 08:34:42 +0100 (CET)
--===============1029508565==:Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: base64
X-Smtpd: 1.2.15-6@18976
X-Seznam-User: rzrobot@seznam.cz
X-Session: 11
X-Country: CZ
X-Virus-Info:clean
X-Seznam-SPF:neutral
X-Seznam-DomainKeys:unknown
w7pwxJtsIMSPw6FiZWxza8OpIMOzZHk=
--===============1029508565==--
Although no attachment argument was given, the client says that the message has an attachment of unknown type and that the message does not contain any text at all. Note that the message part header Content-Type: text/plain; charset="utf-8" is appended to the boundary line of the message part instead of being inside the message part.
When I change the function to generate the subject manually and add it via add_header like this:
import base64  # in addition to the imports used by the first version

def sendMail (fromAddr, toAddr, subject, body = '', attachment = ''):
    message = email.mime.multipart.MIMEMultipart()
    message.add_header('From',fromAddr)
    message.add_header('To',toAddr)
    base64Subject = base64.b64encode(subject.encode('utf-8')).decode()
    encodedSubject = '=?UTF-8?B?{0}?='.format(base64Subject)
    message.add_header('Subject',encodedSubject)
    if (body != ''):
        msgPart = email.mime.text.MIMEText(body,'plain','utf-8')
        message.attach(msgPart)
    if (attachment != ''):
        if os.path.exists(attachment) == True:
            filename = attachment.rpartition(os.sep)[2]
            fp = open(attachment,'rb')
            msgPart = email.mime.base.MIMEBase('application','octet-stream')
            msgPart.set_payload(fp.read())
            fp.close()
            email.encoders.encode_base64(msgPart)
            msgPart.add_header('Content-Disposition','attachment',filename=filename)
            message.attach(msgPart)
    if smtpPort == 25:
        smtpCon = smtplib.SMTP(smtpSrv,smtpPort)
    else:
        smtpCon = smtplib.SMTP_SSL(smtpSrv,smtpPort)
    if (smtpUser != '') and (smtpPass != ''):
        smtpCon.login(smtpUser,smtpPass)
    smtpCon.send_message(message,mail_options=['UTF8SMTP','8BITMIME'])
    smtpCon.quit()
Then everything is OK on the receiving side; both the subject and the plain-text body are visible:
Return-Path: <rzrobot@seznam.cz>
Received: from smtp2.seznam.cz (smtp2.seznam.cz [77.75.76.43])
	by mx1.volny.cz (Postfix) with ESMTP id 177092E0825
	for <msladek@volny.cz>; Mon, 20 Feb 2012 08:51:58 +0100 (CET)
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=seznam.cz;
	h=Received:Content-Type:MIME-Version:From:To:Subject:X-Smtpd:X-Seznam-User:X-Session:X-Country:X-Virus-Info:X-Seznam-SPF:X-Seznam-DomainKeys;
	b=F2A6GhX0TWVjnrB4vx/ayc1BTGDFxBI96oI0fk/gr/tgP0jlV1UC91m4i/O4ay+Bg
	lfka88qa71XZOlHtY2vl7zxYjGPJ97pRCdtqWB+JcNOa5bMsk6lmjMHh+A+FQ2e7+yb
	1F091t0nMcQlarriF8sD5rNjhuRYjvCv7kKbt8s=
Received: from dvr.ph.sladkovi.eu (ip-84-42-150-218.net.upcbroadband.cz [84.42.150.218])	by email-relay1.ng.seznam.cz (Seznam SMTPD 1.2.15-6@18976) with ESMTP;	Mon, 20 Feb 2012 08:51:55 +0100 (CET) 
Content-Type: multipart/mixed; boundary="===============1044203895=="
MIME-Version: 1.0
From: rzrobot@seznam.cz
To: msladek@volny.cz
Subject: =?UTF-8?B?xb5sdcWlb3XEjWvDvSBrxa/FiA==?=
X-Smtpd: 1.2.15-6@18976
X-Seznam-User: rzrobot@seznam.cz
X-Session: 11
X-Country: CZ
X-Virus-Info: clean
X-Seznam-SPF: neutral
X-Seznam-DomainKeys: unknown
X-DKIM-Status: pass seznam.cz
X-Virus: no (m2.volny.internal - Mon, 20 Feb 2012 08:52:00 +0100 (CET))
X-Spam: no (m2.volny.internal - Mon, 20 Feb 2012 08:52:01 +0100 (CET))
X-Received-Date: Mon, 20 Feb 2012 08:52:01 +0100 (CET)
--===============1044203895==
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: base64
w7pwxJtsIMSPw6FiZWxza8OpIMOzZHk=
--===============1044203895==--
I am not a programmer, so I might be overlooking some obvious mistake in my code, but for now I think it's a bug.
msg155629 - Author: Tatiana Al-Chueyr (tati_alchueyr) * Date: 2012-03-13 16:31
Hi msladek!
I tried to reproduce your bug using Python 3.2.2 on Mac OS X, but could not: everything worked fine. I used Gmail both to send and receive the message, over SSL:
 smtpPort = '465'
 smtpSrv = 'smtp.gmail.com'
As I'm neither an SMTP nor an email expert, I asked r.david.murray to review the received email message code, and it looks fine.
Could you provide a smaller example of code that causes the same problem?
I just extracted your code to help other people trying to reproduce the bug. It is attached.
msg155634 - Author: Michal Sladek (msladek) Date: 2012-03-13 17:11
I tested the code again. Using the Gmail SMTP server produces correct results; using the server smtp.seznam.cz leads to a problem (I should mention here that Seznam is the largest free mail provider in the Czech Republic). Here are the differences on the receiving side.
GMAIL:
Return-Path: <michal@sladkovi.eu>
Received: from mail-bk0-f45.google.com (mail-bk0-f45.google.com [209.85.214.45])
	by mx4.volny.cz (Postfix) with ESMTP id 0A3E12E086B
	for <msladek@volny.cz>; Tue, 13 Mar 2012 17:58:03 +0100 (CET)
Received: by bkcjg9 with SMTP id jg9so842625bkc.18
 for <msladek@volny.cz>; Tue, 13 Mar 2012 09:58:03 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=google.com; s=20120113;
 h=message-id:date:content-type:mime-version:from:to:subject
 :x-gm-message-state;
 bh=Sdb8G6CtN+pEzPJHxwbwCprTgWPJUrR3jiU+qeK1WAs=;
 b=X88feHvtpL6zBXYNYSjgUQ+1WirGmU8B69k+4fGlAge6F5+pYd6SzuJ6ExdBsp+brw
 1QuCne97OdVnYoFmg86ZviFz3m6Cn6N8hgPNa2H7hCPQD4O+cjJQQzze4xXYqgPJQs+D
 ke4ISEmxL9UFJUvkTyFhrCDefSxQMY+TnnLwWQR+PCD/uB0FgR2UgBjEx9K7EUKQi6W0
 78+EZYO3cd+SuuadOUvIpe2cj0576ahcP40dGN0kIe+P4NX5Ij7D2cCa/bWiwFdDRUI4
 v8UxJcnbTuOCQFtlItxCAxU9IzZWGekWtpJVnRDBGG63iGXHoTDzp+4+d1FRBGsDQ2pD
 l5tg==
Received: by 10.204.150.73 with SMTP id x9mr6371797bkv.7.1331657883687;
 Tue, 13 Mar 2012 09:58:03 -0700 (PDT)
Received: from dvr.ph.sladkovi.eu (ip-84-42-150-218.net.upcbroadband.cz. [84.42.150.218])
 by mx.google.com with ESMTPS id u14sm2783344bkp.2.2012.03.13.09.58.02
 (version=SSLv3 cipher=OTHER);
 Tue, 13 Mar 2012 09:58:02 -0700 (PDT)
Message-ID: <4f5f7c9a.0e70cc0a.12f5.75a3@mx.google.com>
Date: Tue, 13 Mar 2012 09:58:02 -0700 (PDT)
Content-Type: multipart/mixed; boundary="===============1165280172=="
MIME-Version: 1.0
From: michal@sladkovi.eu
To: msladek@volny.cz
Subject: =?utf-8?b?xb5sdcWlb3XEjWvDvSBrxa/FiA==?=
X-Gm-Message-State: ALoCoQmf6k2GVVKdm0ZNbvSyPpZ0Gl1yv/BDc3h3zrh34hWWp3wa/fSBXbWT9FANzBLd5k1qUnEP
X-DKIM-Status: neutral
X-Virus: no (m2.volny.internal - Tue, 13 Mar 2012 17:58:05 +0100 (CET))
X-Spam: no (m2.volny.internal - Tue, 13 Mar 2012 17:58:07 +0100 (CET))
X-Received-Date: Tue, 13 Mar 2012 17:58:08 +0100 (CET)
--===============1165280172==
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: base64
w7pwxJtsIMSPw6FiZWxza8OpIMOzZHk=
--===============1165280172==--
--------------------------------------------------------------
SEZNAM:
Return-Path: <Michal.Sladek@seznam.cz>
Received: from smtp2.seznam.cz (smtp2.seznam.cz [77.75.76.43])
	by mx4.volny.cz (Postfix) with ESMTP id 542A32E0868
	for <msladek@volny.cz>; Tue, 13 Mar 2012 18:00:05 +0100 (CET)
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=seznam.cz;
	h=Received:Content-Type:MIME-Version:From:To:Subject:--===============1097187749==:MIME-Version:Content-Transfer-Encoding:X-Smtpd:X-Seznam-User:X-Session:X-Country:X-Virus-Info:X-Seznam-SPF:X-Seznam-DomainKeys;
	b=bfwTOSoFJU7vGbB7VvXNIQzhbsj+pDPhwr72BX1aVWAicyK0Cix3evz6c3+srYBba
	lHDeYd74ZXW5553N6ocfy68pRxpI6K5dKfvcKKLgUN7+N/iQOUtj09D4wN81cjPt7qQ
	uH5rjcdsDsbZV31EsxyS1P/rn6F7bYOxrpPeHAk=
Received: from dvr.ph.sladkovi.eu (ip-84-42-150-218.net.upcbroadband.cz [84.42.150.218])	by email-relay1.ng.seznam.cz (Seznam SMTPD 1.2.15-6@18976) with ESMTP;	Tue, 13 Mar 2012 17:59:32 +0100 (CET) 
Content-Type: multipart/mixed; boundary="===============1097187749=="
MIME-Version: 1.0
From: Michal.Sladek@seznam.cz
To: msladek@volny.cz
Subject: =?utf-8?b?xb5sdcWlb3XEjWvDvSBrxa/FiA==?= 
X-DKIM-Status: fail
X-Virus: no (m2.volny.internal - Tue, 13 Mar 2012 18:00:06 +0100 (CET))
X-Spam: no (m2.volny.internal - Tue, 13 Mar 2012 18:00:08 +0100 (CET))
X-Received-Date: Tue, 13 Mar 2012 18:00:08 +0100 (CET)
--===============1097187749==:Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: base64
X-Smtpd: 1.2.15-6@18976
X-Seznam-User: michal.sladek@seznam.cz
X-Session: 5
X-Country: CZ
X-Virus-Info:clean
X-Seznam-SPF:neutral
X-Seznam-DomainKeys:unknown
w7pwxJtsIMSPw6FiZWxza8OpIMOzZHk=
--===============1097187749==--
--------------------------------------------------------------
As you can see, Seznam is adding a lot of headers into mail's body. Anyway, making utf-8 subject manually like this:
 base64Subject = base64.b64encode(subject.encode('utf-8')).decode()
 encodedSubject = '=?UTF-8?B?{0}?='.format(base64Subject)
 message.add_header('Subject',encodedSubject)
works correctly for both SMTP servers. So there must be a difference...
msg155658 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-03-13 20:14
It makes no sense that changing how Subject is generated would affect the later formatting of the mime header. There is no coupling that I'm aware of in the code.
I notice that your handcrafted version uses uppercase for the charset and CTE code. Can you try using lowercase like the email module does, and see if that reproduces the problem?
msg155738 - Author: Michal Sladek (msladek) Date: 2012-03-14 09:31
Changing code to:
 encodedSubject = '=?utf-8?b?{0}?='.format(base64Subject)
still works properly with smtp.seznam.cz server....
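The lowercase comparison can also be checked without any SMTP server. The following is a minimal sketch, assuming only the standard library, that encodes the subject once through email.header.Header and once by hand, then compares the two encoded words; the variable names are illustrative and not taken from the report.

```python
import base64
from email.header import Header, decode_header

subject = 'žluťoučký kůň'

# Encoded word as produced by the email package.
via_header = Header(subject, 'utf-8').encode()

# Encoded word built by hand, as in the workaround (lowercase variant).
b64 = base64.b64encode(subject.encode('utf-8')).decode('ascii')
by_hand = '=?utf-8?b?{0}?='.format(b64)

print(via_header)
print(by_hand)

# decode_header returns (bytes, charset) pairs; both variants should
# decode back to the original subject text.
for value in (via_header, by_hand):
    raw, charset = decode_header(value)[0]
    print(raw.decode(charset))
```

If the two strings match, the encoded word itself is not what distinguishes the failing message from the working one, which points at something else in the serialized output.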
msg155753 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-03-14 15:00
I think the next thing to do would be to replace the call to send_message with code that calls BytesGenerator to write the message out to disk, and diff the output of the two versions (normal subject and hand-encoded subject). Maybe that will give us a clue.
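A rough sketch of that comparison is shown here, with some assumptions: it serializes to an in-memory buffer rather than to disk, uses placeholder example.com addresses, and pins the MIME boundary to a fixed string so that the random boundary does not show up in the diff. The helper names build and flatten are made up for this example.

```python
import difflib
import io
from email.generator import BytesGenerator
from email.header import Header
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build(subject_value):
    """Build a small multipart message with a fixed boundary."""
    message = MIMEMultipart(boundary='BOUNDARY')
    message['From'] = 'sender@example.com'      # placeholder address
    message['To'] = 'recipient@example.com'     # placeholder address
    message['Subject'] = subject_value
    message.attach(MIMEText('úpěl ďábelské ódy', 'plain', 'utf-8'))
    return message


def flatten(message):
    """Serialize with CRLF line endings, the linesep send_message passes."""
    buffer = io.BytesIO()
    BytesGenerator(buffer).flatten(message, linesep='\r\n')
    return buffer.getvalue()


with_header = flatten(build(Header('žluťoučký kůň', 'utf-8')))
hand_encoded = flatten(build('=?utf-8?b?xb5sdcWlb3XEjWvDvSBrxa/FiA==?='))

# repr() makes the line endings visible in the diff output.
diff = difflib.unified_diff(
    [repr(line) for line in with_header.splitlines(keepends=True)],
    [repr(line) for line in hand_encoded.splitlines(keepends=True)],
    fromfile='Header object', tofile='hand-encoded', lineterm='')
print('\n'.join(diff))
```

On an interpreter that still has the bug, the Subject line of the first variant would be expected to differ in its line ending; on a fixed interpreter the diff should be empty or nearly so.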
msg155771 - Author: Martin v. Löwis (loewis) * (Python committer) Date: 2012-03-14 17:10
I dug a little bit further. The data being sent is
'Content-Type: multipart/mixed; boundary="===============1981330074035035012=="\r\nMIME-Version: 1.0\r\nFrom: rzrobot@seznam.cz\r\nTo: msladek@volny.cz\r\nSubject: =?utf-8?b?xb5sdcWlb3XEjWvDvSBrxa/FiA==?=\n\r\n--===============1981330074035035012==\r\nContent-Type: text/plain; charset="utf-8"\r\nMIME-Version: 1.0\r\nContent-Transfer-Encoding: base64\r\n\r\nw7pwxJtsIMSPw6FiZWxza8OpIMOzZHk=\n\r\n--===============1981330074035035012==--'
As you notice, there is a plain \n (without \r) after the subject (and all other places with base64), which might confuse seznam.
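Bare line feeds like this are easy to miss by eye. As a small sketch (the helper name bare_lf_offsets is made up here, and the sample is a shortened excerpt of the data quoted above), a regular expression with a negative lookbehind can list every LF that is not preceded by CR:

```python
import re

# A line feed that is not preceded by a carriage return.
BARE_LF = re.compile(rb'(?<!\r)\n')


def bare_lf_offsets(data):
    """Return the byte offsets of every LF in data that lacks a preceding CR."""
    return [match.start() for match in BARE_LF.finditer(data)]


# Shortened excerpt of the data being sent.
sample = (b'To: msladek@volny.cz\r\n'
          b'Subject: =?utf-8?b?xb5sdcWlb3XEjWvDvSBrxa/FiA==?=\n'
          b'\r\n'
          b'--===============1981330074035035012==\r\n')

for offset in bare_lf_offsets(sample):
    # Show a little context around each offending byte.
    print(offset, sample[max(0, offset - 12):offset + 1])
```

Running the same check over the full serialized message would show whether the Subject header is the only place affected.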
msg155772 - Author: Martin v. Löwis (loewis) * (Python committer) Date: 2012-03-14 17:16
I also attach a stand-alone version. To run this locally, run
smtpdX.Y.py -dn localhost:2525
msg155777 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-03-14 18:00
OK, got it. When I created BytesGenerator I turned the 'NL' constant into a class attribute, but in the line that handles Header objects in BytesGenerator I failed to change NL to self._NL. So when send_message calls flatten with linesep='\r\n', in that one place it was using \n instead of the correct linesep.
I've got a patch which I will commit shortly.
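The kind of mistake described here can be shown with a toy class. This is a hypothetical illustration only, not the CPython source: ToyGenerator and its two methods are invented for this sketch, and only the names NL and _NL mirror the ones mentioned above.

```python
NL = '\n'  # module-level default, analogous to the old constant


class ToyGenerator:
    """Hypothetical stand-in for a generator class; not the CPython code."""

    def __init__(self, linesep=NL):
        # Per-instance line separator chosen by the caller.
        self._NL = linesep

    def write_header_buggy(self, name, value):
        # Bug: ignores the configured separator and uses the module constant.
        return '{0}: {1}{2}'.format(name, value, NL)

    def write_header_fixed(self, name, value):
        # Fix: honour the separator the caller asked for.
        return '{0}: {1}{2}'.format(name, value, self._NL)


gen = ToyGenerator(linesep='\r\n')
print(repr(gen.write_header_buggy('Subject', 'x')))
print(repr(gen.write_header_fixed('Subject', 'x')))
```

With the default separator both methods behave identically, which is why a slip like this only shows up when a caller requests '\r\n'.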
msg155779 - Author: Roundup Robot (python-dev) (Python triager) Date: 2012-03-14 18:24
New changeset d0bf40ff20ef by R David Murray in branch '3.2':
#14062: fix BytesGenerator handling of linesep for Header objects
http://hg.python.org/cpython/rev/d0bf40ff20ef
New changeset 7617f3071320 by R David Murray in branch 'default':
#14062: fix BytesGenerator handling of Header objects
http://hg.python.org/cpython/rev/7617f3071320 
msg155781 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-03-14 18:28
Thanks for the bug report. I thought we had tests for processing Header objects when serializing a message using BytesGenerator, but clearly we didn't.
And thanks Tatiana and Martin for issue review and testing.
History
Date User Action Args
2022-04-11 14:57:26  admin  set  github: 58270
2012-03-14 18:28:06  r.david.murray  set  status: open -> closed
    resolution: fixed
    messages: + msg155781
2012-03-14 18:24:46  python-dev  set  nosy: + python-dev
    messages: + msg155779
2012-03-14 18:00:12  r.david.murray  set  versions: + Python 3.3
    messages: + msg155777
    assignee: r.david.murray
    components: + Library (Lib), - None
    type: behavior
    stage: commit review
2012-03-14 17:16:08  loewis  set  files: + a.py
    messages: + msg155772
2012-03-14 17:10:35  loewis  set  messages: + msg155771
2012-03-14 15:00:59  r.david.murray  set  messages: + msg155753
2012-03-14 09:31:17  msladek  set  messages: + msg155738
2012-03-13 20:14:02  r.david.murray  set  messages: + msg155658
2012-03-13 17:11:40  msladek  set  messages: + msg155634
2012-03-13 16:31:25  tati_alchueyr  set  files: + issue14062_buggy_email_subject.py
    nosy: + tati_alchueyr
    messages: + msg155629
2012-02-20 23:47:43  eric.araujo  set  nosy: + r.david.murray
2012-02-20 15:47:44  loewis  set  nosy: + loewis
2012-02-20 08:03:28  msladek  create