This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: EmailMessage bad encoding for international domain
Type: behavior
Stage: needs patch
Components: email
Versions: Python 3.11, Python 3.10, Python 3.9

process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: barry, drlazor8, iritkatriel, r.david.murray
Priority: normal
Keywords: patch

Created on 2020-02-26 09:36 by drlazor8, last changed 2022-04-11 14:59 by admin.

Pull Requests
URL Status Linked
PR 18667 closed drlazor8, 2020-02-26 10:30
Messages (5)
msg362687 - Author: Julien Castiaux (drlazor8) Date: 2020-02-26 09:36
Affected Python versions: 3.5 and above (I tested them all except 3.9)
Steps to reproduce:
 from email.message import EmailMessage
 from email.policy import SMTP
 msg = EmailMessage(policy=SMTP)
 msg['To'] = 'Joe <joe@examplé.com>' # notice the é in the domain
 print(msg.as_string())
It prints
 To: "Joe <joe@=?utf-8?q?exampl=C3=A9?=.com>"
But it should be
 To: "Joe <joe@xn--exampl-gva.com>"
While b64/qp can be used to encode most non-ASCII headers, the domain part of an email address is an exception. According to IDNA2008 (RFC 5890, RFC 5891), a non-ASCII domain should be encoded using the Punycode algorithm and the ACE prefix.
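A minimal sketch of the expected conversion, assuming the standard library's "idna" codec is acceptable. Note that this codec implements IDNA 2003 (RFC 3490) rather than IDNA2008, so results may differ for some labels. The helper name addr_spec_to_ace is hypothetical and not part of the email package.

```python
def addr_spec_to_ace(addr_spec: str) -> str:
    """Hypothetical helper: convert the domain of local@domain to its ACE form."""
    local_part, sep, domain = addr_spec.rpartition("@")
    if not sep:
        raise ValueError("address has no '@' separator")
    # The stdlib "idna" codec applies Punycode per label and adds the
    # "xn--" ACE prefix. It implements IDNA 2003, not IDNA2008.
    ace_domain = domain.encode("idna").decode("ascii")
    return f"{local_part}@{ace_domain}"


print(addr_spec_to_ace("joe@examplé.com"))  # joe@xn--exampl-gva.com
```

Only the domain is converted here; the local part is left untouched, since the report concerns the domain part only.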
msg362801 - Author: Julien Castiaux (drlazor8) Date: 2020-02-27 14:15
Duplicate of https://bugs.python.org/issue39757 
msg362802 - Author: Julien Castiaux (drlazor8) Date: 2020-02-27 14:17
Whoops, wrong copy/paste. Here is the correct link: https://bugs.python.org/issue11783 
msg362906 - Author: R. David Murray (r.david.murray) (Python committer) Date: 2020-02-28 19:07
This is not actually a duplicate of 11783. Rereading (parts of) that issue, we decided we currently have no good way to do automatic conversion between unicode and internationalized domains, so the user of the library has to do it themselves. This means that the bug *here* is that the new email API is *wrongly* encoding the non-ascii in the domain by using an encoded word. I'm surprised at that; I thought I'd guarded against it.
What should be happening here is that an error should be raised when that header is set (or possibly when it is accessed/serialized, but when set would be better I think) saying that there is non-ascii in the domain part.
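The behavior described here can be approximated on the caller's side. The sketch below is not the library's fix: set_address_header is a hypothetical helper that rejects a non-ASCII domain before the header is set, leaving the IDNA conversion to the caller (done here with the stdlib "idna" codec, which implements IDNA 2003).

```python
from email.message import EmailMessage
from email.policy import SMTP
from email.utils import formataddr


def set_address_header(msg, header, display_name, addr_spec):
    """Hypothetical helper: refuse a non-ASCII domain instead of encoding it."""
    domain = addr_spec.rpartition("@")[2]
    if not domain.isascii():
        raise ValueError(f"non-ASCII character in domain part: {domain!a}")
    msg[header] = formataddr((display_name, addr_spec))


msg = EmailMessage(policy=SMTP)
try:
    set_address_header(msg, "To", "Joe", "joe@examplé.com")
except ValueError as exc:
    print("rejected:", exc)

# The caller converts the domain first, then sets the header.
ace_domain = "examplé.com".encode("idna").decode("ascii")
set_address_header(msg, "To", "Joe", f"joe@{ace_domain}")
print(msg.as_string())
```

Raising at the point where the header is set, rather than at serialization time, matches the preference stated in the message above.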
msg408472 - Author: Irit Katriel (iritkatriel) (Python committer) Date: 2021-12-13 17:50
Reproduced on 3.11.
History
Date User Action Args
2022-04-11 14:59:27 admin set
 github: 83938
2021-12-13 17:50:29 iritkatriel set
 nosy: + iritkatriel
 messages: + msg408472
 versions: + Python 3.9, Python 3.10, Python 3.11, - Python 3.5
2020-02-28 19:07:10 r.david.murray set
 status: closed -> open
 title: EmailMessage wrong encoding for international domain -> EmailMessage bad encoding for international domain
 superseder: email parseaddr and formataddr should be IDNA aware ->
 messages: + msg362906
 resolution: duplicate ->
 stage: resolved -> needs patch
2020-02-27 15:55:53 SilentGhost set
 superseder: email parseaddr and formataddr should be IDNA aware
2020-02-27 14:17:13 drlazor8 set
 messages: + msg362802
2020-02-27 14:15:37 drlazor8 set
 status: open -> closed
 resolution: duplicate
 messages: + msg362801
 stage: patch review -> resolved
2020-02-26 10:30:28 drlazor8 set
 keywords: + patch
 stage: patch review
 pull_requests: + pull_request18023
2020-02-26 09:36:15 drlazor8 create
