homepage

This issue tracker has been migrated to GitHub , and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: cgi.parse_multipart is broken on 3.x
Type: behavior Stage:
Components: Library (Lib) Versions: Python 3.2, Python 3.3
process
Status: open Resolution: fixed
Dependencies: Superseder:
Assigned To: orsenthil Nosy List: MHordecki, efosmark, eric.araujo, flox, glyph, gvanrossum, iritkatriel, jonas.wagner, milesck, orsenthil, python-dev, quentel, r.david.murray
Priority: normal Keywords: patch

Created on 2011年06月25日 15:09 by jonas.wagner, last changed 2022年04月11日 14:57 by admin.

Files
File name Uploaded Description Edit
cgi-coverage.diff jonas.wagner, 2011年06月25日 15:09 review
cgi-coverage-2.diff jonas.wagner, 2011年07月04日 11:26 review
Messages (14)
msg139082 - (view) Author: Jonas Wagner (jonas.wagner) Date: 2011年06月25日 15:09
While writing tests for the cgi module I came across what looks like a conversion bug.
cgi.parse_multipart is comparing values it reads from a binary file like with a string literal:
line = fp.readline()
...
if line.startswith("--"):
This patch adds fixes the issue and adds test for it.
msg139649 - (view) Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2011年07月02日 13:53
This looks like a conversion bug indeed; network I/O should use bytes. Strange that no-one caught this, but if there was no test and no users, then bugs can slip.
See also #11066, #8077, #4953, #6234 (also adding some people from those bugs’ nosy fields).
msg139668 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011年07月03日 01:26
Indeed, Victor's comments on his patch say that he changed code that was in the posted patch to say 'line.startswith(b'--')', and the original patch did use b'--', but the code he checked in is missing the 'b'. He also asked for more tests.
Victor, any chance you can review this patch, since you were the last one to work on the code in question?
msg139742 - (view) Author: Tal Einat (taleinat) * (Python committer) Date: 2011年07月04日 07:55
The patch seems broken to me.
In cgi.parse_multipart(), the 'boundary' variable can be a string even though it is concatenated to bytes. Its default value is a string, and a string can be given via the pdict argument. There is no validity check other than valid_boundary(), which allows both string and bytes.
Most of the changes to test_cgi.py are entirely unrelated. The one test added which tests cgi.parse_multipart() should fail since it uses a string (not bytes) boundary, while the correct boundary for the test is commented out.
I short this patch seems half-baked. IMO reject this patch and fix just the bytes/strings issue with cgi.parse_multipart. Or, as mentioned in the comments, use FieldStorage to implement it and be done with it.
msg139752 - (view) Author: Jonas Wagner (jonas.wagner) Date: 2011年07月04日 11:26
Hi Tal,
Thanks a lot for your feedback.
My primary objective was to increase the test coverage for cgi.py. If it is a problem to have the additional tests in this patch I'm happy to create a new issue with a separate patch. 
The default value for the boundary was an oversight, sorry for that.
You are right regarding the commented out boundary as well, I forgot to refresh the patch. Again, sorry.
Do you think valid_boundary should contain a check to ensure it is a byte object?
msg139757 - (view) Author: Tal Einat (taleinat) * (Python committer) Date: 2011年07月04日 12:11
Yes, please submit the other additional tests in a separate issue.
The default value for boundary should surely be b"". A simple test should be added where cgi.parse_multipart() uses the default boundary.
If valid_boundary() is used only for cgi.parse_multipart() then it should be changed to validate that the boundary is a bytes instance (which would also make it simpler). Otherwise (if vaild_boundary() is also used elsewhere) cgi.parse_multipart() should itself check that the boundary is indeed a bytes instance, throwing a TypeError otherwise.
Tip: You should run the relevant tests, making sure they all pass, before submitting a patch. That way you really know that the patch actually works (as far as passing all of the tests).
Thanks for adding more stdlib tests :)
msg139803 - (view) Author: Pierre Quentel (quentel) * Date: 2011年07月04日 20:44
When the FieldStorage class was fixed there was a discussion in issue 4953 about the module-level functions parse() and parse_multipart(). The code was very similar to methods of the FieldStorage class so the idea was to use FieldStorage inside the functions
The patch proposed in issue 11066 replaced the code in parse_multipart by just :
def parse_multipart(fp, pdict):
 return FieldStorage(fp,environ=pdict)
Did anyone test it ?
msg180417 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2013年01月22日 18:39
Twisted would really like to see this bug fixed.
msg180425 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2013年01月22日 19:16
Does anyone who was on this bug previously (e.g. the original author
or the reviewers) know what was holding up the patch? Does it need
more review? More tests? Is there any reason to reject fixing this at
all? (I hope not.) As far as replacing the whole thing with a call
into the other code goes, I'm hesitant if only because we don't have
enough unit tests for the edge cases of the implementation that would
be deleted, so if the wholesale replacement were to break user code we
wouldn't find out until after it's been released. Fixing it seems less
risky.
msg180428 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2013年01月22日 19:22
I personally think, that the "grey area" of multipart form encoding and trying to use email's updated features for parsing was holding it, not the tests. This can be submitted IMO after looking at the "related bugs", I shall do a review on this one today.
msg180429 - (view) Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2013年01月22日 19:23
Thank you very much Senthil!
msg180456 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2013年01月23日 11:01
New changeset a46a0dafcb7a by Senthil Kumaran in branch '3.2':
Issue #12411: Fix to cgi.parse_multipart to correctly use bytes boundaries and
http://hg.python.org/cpython/rev/a46a0dafcb7a
New changeset 59ea872d8b6b by Senthil Kumaran in branch '3.3':
merge from 3.2
http://hg.python.org/cpython/rev/59ea872d8b6b
New changeset 3d7000549eb1 by Senthil Kumaran in branch 'default':
merge from 3.3
http://hg.python.org/cpython/rev/3d7000549eb1 
msg180457 - (view) Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2013年01月23日 11:05
I updated the patch addressing Ezio's comments in the review system and also condensed the tests. This fixes the parse_multipart's byte handling at "some" level. 
The docstring of parse_multipart say that, this should be deprecated in favor of FieldStorage completely. I will have to trace through the argument and see what should be done here and then I shall close this bug.
msg408461 - (view) Author: Irit Katriel (iritkatriel) * (Python committer) Date: 2021年12月13日 16:26
Senthil, was this last part done in issue29979?
History
Date User Action Args
2022年04月11日 14:57:19adminsetstatus: pending -> open
github: 56620
2021年12月13日 17:35:22iritkatrielsetstatus: open -> pending
2021年12月13日 17:29:21vstinnersetstatus: pending -> open
nosy: - vstinner
2021年12月13日 16:26:14iritkatrielsetstatus: open -> pending
nosy: + iritkatriel
messages: + msg408461

2014年02月04日 12:15:19taleinatsetnosy: - taleinat
2013年01月23日 11:05:33orsenthilsetassignee: orsenthil
resolution: fixed
messages: + msg180457
2013年01月23日 11:01:36python-devsetnosy: + python-dev
messages: + msg180456
2013年01月22日 19:23:25gvanrossumsetmessages: + msg180429
2013年01月22日 19:22:17orsenthilsetmessages: + msg180428
2013年01月22日 19:16:07gvanrossumsetmessages: + msg180425
2013年01月22日 18:41:29glyphsetnosy: + glyph
2013年01月22日 18:39:04gvanrossumsetnosy: + gvanrossum
messages: + msg180417
2012年07月21日 14:15:03floxsetnosy: + orsenthil
2011年07月04日 20:44:02quentelsetmessages: + msg139803
2011年07月04日 12:11:23taleinatsetmessages: + msg139757
2011年07月04日 11:26:35jonas.wagnersetfiles: + cgi-coverage-2.diff

messages: + msg139752
2011年07月04日 07:55:58taleinatsetnosy: + taleinat
messages: + msg139742
2011年07月03日 01:26:27r.david.murraysetnosy: + r.david.murray
messages: + msg139668
2011年07月02日 13:53:27eric.araujosetnosy: + eric.araujo, efosmark, milesck, MHordecki, flox, quentel

messages: + msg139649
versions: + Python 3.2
2011年06月25日 15:16:37r.david.murraysetnosy: + vstinner
2011年06月25日 15:09:07jonas.wagnercreate

AltStyle によって変換されたページ (->オリジナル) /