homepage

This issue tracker has been migrated to GitHub , and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: email module should not allow some header field repetitions
Type: enhancement Stage: resolved
Components: email Versions: Python 3.6, Python 3.5
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: r.david.murray Nosy List: adrien-saladin, barry, berker.peksag, kxroberto, martin.panter, python-dev, r.david.murray, rhettinger
Priority: high Keywords: patch

Created on 2011年01月05日 21:40 by adrien-saladin, last changed 2022年04月11日 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
max_count.patch r.david.murray, 2012年05月28日 21:50 review
issue10839_27.diff berker.peksag, 2016年06月12日 07:22
Messages (22)
msg125473 - (view) Author: Adrien Saladin (adrien-saladin) Date: 2011年01月05日 21:40
Hi,
The following script shows two problems with email.mime.text.MIMEText:
- first the use of msg['To'] seems confusing because its dictionnary-like syntax made me think it acts as a "set or replace", but in fact is working as a stream operation
- second this behavior allows for the same field to be repeated several times in a header which is discouraged in rfc-822 and forbidden for many fields in rfc-2822. 
#########################################"
from email.mime.text import MIMEText
msg = MIMEText("""Hello World""")
dest = ["one@example.com", "two@example.com", "three@example.com", "four@example.com"]
 
for d in dest:
 msg["From"] = "myself@example.com"
 msg["To"] = d
 msg["subject"] = "just a test"
 print (msg)
 # + send the buggy mail...
###################################
the last sent mail will looks like this: 
---------------------
Hello World
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
From: myself@example.com
To: one@example.com
subject: just a test
From: myself@example.com
To: two@example.com
subject: just a test
From: myself@example.com
To: three@example.com
subject: just a test
From: myself@example.com
To: four@example.com
subject: just a test
Hello World
----------------------
I see some possible modifications:
* make the [] operator work as a dictionnary-like syntax. So calling msg['To'] multiple times would simply replace the previous 'To:' field. The additional constraint is that some fields like 'comments' or 'keywords' can be repeated
* (or) throw an error when some fields are repeated in this list: 
 from, sender, reply-to, to, cc, bcc, message-id, in-reply-to, references, subject
msg125477 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011年01月05日 22:28
The behaviour you observe is by design, and documented. The email package needs to be able to handle RFC-invalid input, which includes messages with multiple instances of fields that are supposed to be singletons. It also needs to keep track of the order of headers. Thus its interface is, as documented, a "mapping-like" interface with duplicable keys and an element order.
That said, it would be a valid feature request to have a way to have it generate errors if a field that is supposed to be a singleton per-RFC is added more than once. This will require a registry of such headers...a registry of headers is planned for the next version of the email package (email6), so that would be an appropriate time for this to be implemented. email6 will also have strict and lenient modes, which will also be useful in this context.
msg150437 - (view) Author: kxroberto (kxroberto) Date: 2012年01月01日 17:38
I think really ill/strange is that kind of item _assignments_ do _add_ multiple.
If msg[field] = xy would just add-first/replace-frist , and only msg.add_xxxx/.append(field, xy) would add multiples that would be clear and understandable/readable. 
(The sophisticated check dictionary is unnecessary IMHO, I don't expect the class to be ever smart enough for a full RFC checklist.)
e.g. I remember a bug like
msg[field] = xy
if special_condition:
 msg[field] = abc # just wanted a alternative
Never ever expected a double header here!
"=" with adding behavior is absurd IMHO. Certainly doesn't allow readable code.
msg150469 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012年01月02日 17:51
Regardless of what anybody thinks about the design, it is what it is and can't be changed for backward compatibility reasons. The best we can do is reject creating duplicate headers for headers that may only appear once. That feature has already been coded in the new version of the email package (see http://pypi.python.org/pypi/email), but has not yet been committed to the trunk, which is why this issue is still open.
msg161811 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012年05月28日 21:50
My original fix for this for email6 got lost in a refactoring. Here is a patch that fixes it in the code I recently checked in. It may not cover all the headers that should be unique, since I haven't implemented parsers for all structured headers yet, but they will all be there before the new code moves from provisional to stable.
msg161877 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2012年05月29日 13:14
New changeset d7881a371c41 by R David Murray in branch 'default':
#10839: raise an error on add of duplicate unique headers in new email policies
http://hg.python.org/cpython/rev/d7881a371c41 
msg161878 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012年05月29日 13:17
Committed. It is almost never the right thing to do to allow duplicates of unique headers, but an application that does need it can create a policy subclass and override the header_max_count method.
msg161897 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2012年05月29日 16:31
New changeset 08857f4e9709 by R David Murray in branch 'default':
#10839: add new test file that was omitted from checkin
http://hg.python.org/cpython/rev/08857f4e9709 
msg268279 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2016年06月11日 21:25
I think we should consider this as an API design bug and backport the fix.
This seems to be the exact cause of this week's email address leak at LetsEncrypt: 
* https://community.letsencrypt.org/t/email-address-disclosures-preliminary-report-june-11-2016/16867
* https://news.ycombinator.com/item?id=11881704
* https://twitter.com/TvdW/status/741482093054550016 
msg268338 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2016年06月12日 07:22
The API in Python 3 is very different and I'm not sure we can backport the expected behavior without breaking other people's code (unless we treat this as a security issue). Here is a naive backport for 2.7.
Known test failures: test_get_all, test_get_decoded_uu_payload, test_multipart_no_boundary
msg268385 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2016年06月12日 14:16
On Jun 11, 2016, at 09:25 PM, Raymond Hettinger wrote:
>I think we should consider this as an API design bug and backport the fix.
No, it's deliberate, required, and expected in some cases as RDM explains.
Certainly for compat32 policy, this can't change.
Other policies can prevent multiple additions of some headers. Probably those
would go in defects if you parsed a message with prohibited duplicates.
msg268403 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2016年06月12日 21:19
Would you consider raising an exception at least for the case of a "To:" header or perhaps a warning or someother failsafe.
Using __setitem__ for appending instead of replacement is surprising and in the case of LetsEncrypt was a small disaster. There is a docstring explaining what is going on but that typically isn't visible to the user of the square brackets operator.
For Python3.6, I think there should be an alternative API that doesn't use the square brackets operator: add_header, replace_header, remove_header or somesuch. The problem is that square brackets never suggests appending which is what is actually happening.
msg268409 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2016年06月13日 00:09
There are already the makings of an alternative API:
https://docs.python.org/3.6/library/email.message.html#email.message.Message.add_header
There is also replace_header(), but it only replaces the _first_ header field, and leaves later ones untouched. However there is only __del__(), which deletes all matching header fields; there is no remove_header() or similar.
I think I would support deprecating the __setitem__() etc methods, perhaps with a cleanup of the alternatives, e.g. add remove_all(). Also, __getitem__() is equivalent to get(), which does not raise KeyError. Although according to Issue 12111, David said things are unlikely to change.
msg268420 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2016年06月13日 05:24
On Jun 13, 2016, at 12:09 AM, Martin Panter wrote:
>I think I would support deprecating the __setitem__() etc methods, perhaps
>with a cleanup of the alternatives, e.g. add remove_all(). Also,
>__getitem__() is equivalent to get(), which does not raise KeyError. Although
>according to Issue 12111, David said things are unlikely to change.
I do not support deprecating __setitem__(). I'm okay with developing an
alternative API, but setitem syntax is just too pervasive and convenient.
It's been there since the very earliest days of mimelib (the precursor to the
email package). If you read and understand the docs, you know exactly what
the semantics are and you know how to use it effectively and safely.
msg268421 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2016年06月13日 05:28
On Jun 12, 2016, at 09:19 PM, Raymond Hettinger wrote:
>Would you consider raising an exception at least for the case of a "To:"
>header or perhaps a warning or someother failsafe.
No, not for compat32 policy. Seriously, I do not want to change the semantics
or syntax for existing code. This API predates even the renaming and
stdlibbing of the email package from the older mimelib.
By all means, let's develop API alternatives for new code, or stricter RFC
compliance with defect registration for newer policies. But please leave
things alone for compat32 and existing code.
msg268424 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2016年06月13日 06:38
I don't think a new API is needed. But we need to promote the policy keyword better in docs. See https://twitter.com/aksiksi/status/741769504817045508 for an example of confusion.
I don't know if it's a good idea or API but can we add a 'policy' keyword argument to email.mime.base.MIMEBase? Right now, this is the only way to change the default policy without using high level functions like email.message_from_string():
 m = MIMEMultipart()
 m.policy = email.policy.default
msg268430 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2016年06月13日 08:19
On Jun 13, 2016, at 06:38 AM, Berker Peksag wrote:
>I don't know if it's a good idea or API but can we add a 'policy' keyword
>argument to email.mime.base.MIMEBase? Right now, this is the only way to
>change the default policy without using high level functions like
>email.message_from_string():
>
> m = MIMEMultipart()
> m.policy = email.policy.default
I think we just need to plumb a `policy` argument through to the ultimate base
class, email.message.Message
msg268431 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2016年06月13日 08:34
> I think we just need to plumb a `policy` argument through to the ultimate base
> class, email.message.Message
That's already possible: https://docs.python.org/dev/library/email.message.html#email.message.Message
It would be nice to be able to customize 'policy' via BaseMime: https://github.com/python/cpython/blob/master/Lib/email/mime/base.py#L23 
msg268432 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2016年06月13日 08:45
On Jun 13, 2016, at 08:34 AM, Berker Peksag wrote:
>Berker Peksag added the comment:
>
>> I think we just need to plumb a `policy` argument through to the ultimate
>> base class, email.message.Message
>
>That's already possible: https://docs.python.org/dev/library/email.message.html#email.message.Message
>
>It would be nice to be able to customize 'policy' via BaseMime:
>https://github.com/python/cpython/blob/master/Lib/email/mime/base.py#L23
Right, that's what I meant by "plumb". :) Basically we want all the
subclasses of email.message.Message to accept a `policy` argument and pass
them to their superclass constructors.
msg268654 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2016年06月16日 10:42
> Right, that's what I meant by "plumb". :)
Sorry, apparently I can't read in the mornings :) I have just opened issue 27331 to implement this idea.
msg268828 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2016年06月18日 19:42
In the new API there's no real reason to use the old MIME classes. If you want to add the keyword I have no objection, though.
I started a documentation revision last year but haven't had time to get back to it. Hopefully I'll dust it off Real Soon Now.
msg274891 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2016年09月07日 21:02
I've committed Berker's patch from #27331, and I'm about to take the new email API out of provisional status.
Barry is committed to not changing this behavior in 2.7 and I agree. In any case 2.7 doesn't differentiate between headers being added by the user and headers coming from the parsed message, and the latter *have* to allow duplicates even if the fields aren't supposed to be. The python3 code does make a distinction between these two cases, which is what allowed me to do the fix in the new policies. (Yes, you *could* fix feedparser and message in 2.7 so it also could tell, but that is an invasive change).
So, I'm re-closing this as fixed. It's "won't fix" for 2.7 and compat32.
History
Date User Action Args
2022年04月11日 14:57:10adminsetgithub: 55048
2016年09月07日 21:02:14r.david.murraysetstatus: open -> closed

messages: + msg274891
versions: + Python 3.5, Python 3.6, - Python 2.7
2016年06月18日 19:42:01r.david.murraysetmessages: + msg268828
2016年06月16日 10:42:48berker.peksagsetmessages: + msg268654
2016年06月13日 08:45:29barrysetmessages: + msg268432
2016年06月13日 08:34:21berker.peksagsetmessages: + msg268431
2016年06月13日 08:19:07barrysetmessages: + msg268430
2016年06月13日 06:38:20berker.peksagsetmessages: + msg268424
2016年06月13日 05:28:03barrysetmessages: + msg268421
2016年06月13日 05:24:02barrysetmessages: + msg268420
2016年06月13日 00:09:40martin.pantersetnosy: + martin.panter
messages: + msg268409
2016年06月12日 21:19:07rhettingersetmessages: + msg268403
2016年06月12日 14:16:57barrysetmessages: + msg268385
2016年06月12日 07:22:45berker.peksagsetfiles: + issue10839_27.diff
nosy: + berker.peksag
messages: + msg268338

2016年06月11日 21:25:46rhettingersetstatus: closed -> open
priority: normal -> high
versions: + Python 2.7, - Python 3.3
nosy: + rhettinger

messages: + msg268279
2012年05月29日 16:31:39python-devsetmessages: + msg161897
2012年05月29日 13:17:35r.david.murraysetstatus: open -> closed
resolution: fixed
messages: + msg161878

stage: test needed -> resolved
2012年05月29日 13:14:55python-devsetnosy: + python-dev
messages: + msg161877
2012年05月28日 21:50:50r.david.murraysetfiles: + max_count.patch
keywords: + patch
messages: + msg161811
2012年05月28日 20:23:17r.david.murraysetnosy: + barry
components: + email, - Library (Lib)
2012年01月02日 17:51:09r.david.murraysetmessages: + msg150469
2012年01月01日 17:38:49kxrobertosetnosy: + kxroberto
messages: + msg150437
2011年01月05日 22:28:56r.david.murraysettype: behavior -> enhancement
versions: + Python 3.3, - Python 2.6, Python 3.1
messages: + msg125477
stage: test needed
2011年01月05日 22:11:06georg.brandlsetassignee: r.david.murray

nosy: + r.david.murray
2011年01月05日 22:03:08adrien-saladinsettype: behavior
2011年01月05日 21:40:02adrien-saladincreate

AltStyle によって変換されたページ (->オリジナル) /