homepage

This issue tracker has been migrated to GitHub , and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: email.utils.formataddr() should be rfc2047 aware
Type: enhancement Stage: resolved
Components: Library (Lib) Versions: Python 3.3
process
Status: closed Resolution: accepted
Dependencies: Superseder:
Assigned To: r.david.murray Nosy List: barry, python-dev, r.david.murray, terry.reedy, torsten.becker
Priority: normal Keywords: easy, patch

Created on 2007年03月29日 13:27 by barry, last changed 2022年04月11日 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
issue-1690608.patch torsten.becker, 2011年03月27日 13:44 Contains a test and fix for this issue review
issue-1690608-v2.patch torsten.becker, 2011年03月27日 17:00 Switched to email.charset.Charset, added more tests review
issue-1690608-v3.patch torsten.becker, 2011年03月27日 21:53 Optional arg can be str or Charset, trimmed down tests, added docs. review
issue-1690608-v4.patch torsten.becker, 2011年03月28日 09:56 Checking for instance of str instead of Charset, added 2 more tests for that review
Messages (13)
msg31677 - (view) Author: Barry A. Warsaw (barry) * (Python committer) Date: 2007年03月29日 13:27
formataddr() should rfc2047 encode its name argument if necessary.
msg113048 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2010年08月05日 20:43
I am just responding so this will not show up on the 'unanswered issues' list.
msg132320 - (view) Author: Torsten Becker (torsten.becker) Date: 2011年03月27日 13:44
I implemented a basic test for the issue and an attempt for a fix.
I am not entirely sure with my implementation, specifically I would like to get comments concerning the following points:
 - Is is OK that formataddr() will now check if address is ascii safe and if not it will raise a UnicodeEncodeError?
 
 - I was not sure on the style how to append new tests to test_email.py, I just put it into the same spot where all the other formataddr() tests where, shall I put it to the end instead?
I am submitting this patch as part of my preparation for the Google Summer of Code to familiarize myself with the contribution process, any feedback on what I should do different is very welcome.
msg132331 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011年03月27日 14:28
The general approach of the patch looks good to me. Since formataddr is designed to be called from user code that is constructing a message, having it raise for non-ascii in the address is probably OK. However, there should be a test for that, and I'm curious to know what happens if you use such an address in an address field in the unmodified email package.
Instead of directly calling bencode, you should use the charset module and its header_encode method. Note that you need to turn the charset into a Charset instance first. The advantage of doing this is that it will choose the "best" encoding to use based on the charset and the contents of the string.
Your choice of location for the new tests is fine; TestMiscelaneous really should be split up a bit, but that will wait until I do a general refactoring of the tests.
Thanks for working on this.
msg132351 - (view) Author: Torsten Becker (torsten.becker) Date: 2011年03月27日 17:00
> However, there should be a test for that, and I'm curious to know what happens if you use such an address in an address field in the unmodified email package.
I added a test to check if the exceptions get thrown when a address is invalid.
I also added a small test to check how a resulting message should look, it looks good to me but I am not a specialist with email. Do you have any other ideas how to check if it does not have a negative impact to other parts of the module?
> Instead of directly calling bencode, you should use the charset module and its header_encode method. Note that you need to turn the charset into a Charset instance first. The advantage of doing this is that it will choose the "best" encoding to use based on the charset and the contents of the string.
The code also uses email.charset.Charset now.
msg132356 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011年03月27日 18:51
You should check if 'charset' is a string, and call Charset on it only if it is (a Charset may be passed directly in other email package interfaces, and so should be supported here as well.
The test doesn't need to cater for the fact that either b or B (or q or Q) are legitimate: we know which one the package is generating, so just test for that.
For the Message['To'], I wasn't clear. What I would like is a test that includes non-ascii characters in the address part, *without* passing it through formataddr, to see what the package currently does with it. This may in fact reveal an additional bug. But, it is really out of scope for this issue, so you can just remove that test (sorry).
There should also be an update to the docs (Doc/library/email.utils.rst) documenting the API change.
msg132367 - (view) Author: Torsten Becker (torsten.becker) Date: 2011年03月27日 21:53
I incorporated the changes as you suggested and added the text to the docs. Just out of curiosity, why are the docs repeated in email.util.rst when they are already in the docstrings?
msg132381 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011年03月28日 01:00
Thanks. Looks good except that it should check isinstance(string) rather than isinstance(Charset), that way someone can pass a custom class that implements the Charset API if they want. (Alternatively, the check could be for an encode_header method...actually that might be better, although it isn't what the other email modules do.)
The doc strings are an abreviated version of what is in the docstrings, and the text is often not-quite-equivalent even when it is not a strict subset of the docs. We believe it produces higher quality documentation to maintain them separately and tune each one for its intended use case (though this does mean that they occasionally get out of sync due to oversights).
msg132389 - (view) Author: Torsten Becker (torsten.becker) Date: 2011年03月28日 09:56
I incorporated that change as well. My rationale behind the previous version was to be consistent with how Lib/email/header.py handled this, unfortunately I did not look around in the other classes and didn't think about that kind of compatibility.
When formataddr() is called with a object which is not a string and which does not have a header_encode it will raise the following exception now:
> AttributeError: 'CharsetMock' object has no attribute 'header_encode'
Thank you for your patience, sorry that it took probably more of your time by taking 4 iterations for this patch than if you had just implemented it yourself.
msg132390 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011年03月28日 10:46
Ah, yes. Header is probably wrong there, I should fix that at some point.
Sorry for the misytpes in my last message (it was late at night for me when I wrote it :)
As for time, it probably didn't take any more time than it would have to write it myself, and the end product is almost certainly better for having had two sets of eyes on it. This kind of back and forth often happens even when it is an experienced developer writing the patch. 
But even if neither of those were true it would be worthwhile to do it in order to support you in learning to contribute. Thanks again for working on this, and I'll probably commit it some time today.
msg133129 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2011年04月06日 13:52
New changeset 184ddd9acd5a by R David Murray in branch 'default':
#1690608: make formataddr RFC2047 aware.
http://hg.python.org/cpython/rev/184ddd9acd5a 
msg133131 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011年04月06日 14:04
Finally got around to committing this; thanks, Torsten. As a reward, I'm going to make you nosy on a new, related issue I'm about to create. It is, of course, your option whether you want to work on it :)
By the way, have you submitted a contributor agreement? This patch isn't really big enough to require one, but having one on file is always a good idea, especially if you are going to keep contributing (and I hope you do).
msg133202 - (view) Author: Torsten Becker (torsten.becker) Date: 2011年04月07日 09:31
Hi David, thank you for polishing up the patch and committing it. :)
I am glad I could help and I was actually about to ask you if you knew
any follow-up issues. I'll definitely continue contributing as time
allows. I did not submit the agreement yet, but I'll look into that
ASAP.
History
Date User Action Args
2022年04月11日 14:56:23adminsetgithub: 44784
2011年04月07日 09:31:56torsten.beckersetmessages: + msg133202
2011年04月06日 14:04:35r.david.murraysetstatus: open -> closed
resolution: accepted
messages: + msg133131

stage: commit review -> resolved
2011年04月06日 13:52:23python-devsetnosy: + python-dev
messages: + msg133129
2011年03月28日 10:46:31r.david.murraysetversions: - Python 3.1, Python 2.7, Python 3.2
2011年03月28日 10:46:13r.david.murraysetmessages: + msg132390
stage: test needed -> commit review
2011年03月28日 09:56:56torsten.beckersetfiles: + issue-1690608-v4.patch

messages: + msg132389
2011年03月28日 01:00:27r.david.murraysetmessages: + msg132381
2011年03月27日 21:53:50torsten.beckersetfiles: + issue-1690608-v3.patch

messages: + msg132367
2011年03月27日 18:51:49r.david.murraysetmessages: + msg132356
2011年03月27日 17:00:43torsten.beckersetfiles: + issue-1690608-v2.patch

messages: + msg132351
2011年03月27日 14:28:54r.david.murraysetmessages: + msg132331
2011年03月27日 13:44:28torsten.beckersetfiles: + issue-1690608.patch

nosy: + torsten.becker
messages: + msg132320

keywords: + patch
2011年03月13日 22:42:29r.david.murraysetversions: + Python 3.1, Python 2.7, Python 3.3
2010年08月05日 20:43:38terry.reedysetnosy: + terry.reedy
messages: + msg113048
2010年07月11日 14:46:27BreamoreBoysetversions: + Python 3.2, - Python 3.1, Python 2.7
2010年05月05日 13:45:45barrysetassignee: barry -> r.david.murray

nosy: + r.david.murray
2009年04月22日 05:08:47ajaksu2setkeywords: + easy
2009年03月30日 22:10:10ajaksu2setstage: test needed
type: enhancement
versions: + Python 3.1, Python 2.7, - Python 2.6
2007年03月29日 13:27:49barrycreate

AltStyle によって変換されたページ (->オリジナル) /