This issue tracker has been migrated to GitHub ,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2015年07月31日 16:07 by Peter Landry, last changed 2022年04月11日 14:58 by admin. This issue is now closed.
| Files | ||||
|---|---|---|---|---|
| File name | Uploaded | Description | Edit | |
| cgi_multipart.patch | Peter Landry, 2015年07月31日 16:07 | Test case and naive bugfix | review | |
| cgi_multipart.patch | Peter Landry, 2015年08月07日 15:12 | Improved patch that removes the header when present | review | |
| Messages (14) | |||
|---|---|---|---|
| msg247751 - (view) | Author: Peter Landry (Peter Landry) * | Date: 2015年07月31日 16:07 | |
`cgi.FieldStorage` can't parse a multipart with a `Content-Length` header set on a part:
```Python 3.4.3 (default, May 22 2015, 15:35:46)
[GCC 4.2.1 Compatible Apple LLVM 6.1.0 (clang-602.0.49)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import cgi
>>> from io import BytesIO
>>>
>>> BOUNDARY = "JfISa01"
>>> POSTDATA = """--JfISa01
... Content-Disposition: form-data; name="submit-name"
... Content-Length: 5
...
... Larry
... --JfISa01"""
>>> env = {
... 'REQUEST_METHOD': 'POST',
... 'CONTENT_TYPE': 'multipart/form-data; boundary={}'.format(BOUNDARY),
... 'CONTENT_LENGTH': str(len(POSTDATA))}
>>> fp = BytesIO(POSTDATA.encode('latin-1'))
>>> fs = cgi.FieldStorage(fp, environ=env, encoding="latin-1")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/Cellar/python3/3.4.3/Frameworks/Python.framework/Versions/3.4/lib/python3.4/cgi.py", line 571, in __init__
self.read_multi(environ, keep_blank_values, strict_parsing)
File "/usr/local/Cellar/python3/3.4.3/Frameworks/Python.framework/Versions/3.4/lib/python3.4/cgi.py", line 726, in read_multi
self.encoding, self.errors)
File "/usr/local/Cellar/python3/3.4.3/Frameworks/Python.framework/Versions/3.4/lib/python3.4/cgi.py", line 573, in __init__
self.read_single()
File "/usr/local/Cellar/python3/3.4.3/Frameworks/Python.framework/Versions/3.4/lib/python3.4/cgi.py", line 736, in read_single
self.read_binary()
File "/usr/local/Cellar/python3/3.4.3/Frameworks/Python.framework/Versions/3.4/lib/python3.4/cgi.py", line 758, in read_binary
self.file.write(data)
TypeError: must be str, not bytes
>>>
```
This happens because of a mismatch between the code that creates a temp file to write to and the code that chooses to read in binary mode or not:
* the presence of `filename` in the `Content-Disposition` header triggers creation of a binary mode file
* the present of a `Content-Length` header for the part triggers a binary read
When `Content-Length` is present but `filename` is absent, `bytes` are written to the non-binary temp file, causing the error above.
I've reviewed the relevant RFCs, and I'm not really sure what the correct way to handle this is. I don't believe `Content-Length` is addressed for part bodies in the MIME spec[0], and HTTP has its own semantics[1].
At the very least, I think this behavior is confusing and unexpected. Some libraries, like Retrofit[2], will by default include `Content-Length`, and break when submitting POST data to a python server.
I've made an attempt to work in the way I'd expect, and attached a patch, but I'm really not sure if it's the proper decision. My patch kind of naively accepts the existing semantics of `Content-Length` that presume bytes, and treats the creation of a non-binary file as the "bug".
[0]: http://www.ietf.org/rfc/rfc2045.txt
[1]: http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.4
[2]: http://square.github.io/retrofit/
|
|||
| msg247752 - (view) | Author: Peter Landry (Peter Landry) * | Date: 2015年07月31日 16:08 | |
I realized my formatting was poor, making it hard to quickly test the issue. Here's a cleaner version:
import cgi
from io import BytesIO
BOUNDARY = "JfISa01"
POSTDATA = """--JfISa01
Content-Disposition: form-data; name="submit-name"
Content-Length: 5
Larry
--JfISa01"""
env = {
'REQUEST_METHOD': 'POST',
'CONTENT_TYPE': 'multipart/form-data; boundary={}'.format(BOUNDARY),
'CONTENT_LENGTH': str(len(POSTDATA))}
fp = BytesIO(POSTDATA.encode('latin-1'))
fs = cgi.FieldStorage(fp, environ=env, encoding="latin-1")
|
|||
| msg247753 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2015年07月31日 16:13 | |
@Pierre Quentel: Hi! Are you still working on CGI? Can you please review this patch? Thanks. -- Previous large change in the cgi module: issue #4953. Pierre helped a lot on this one. |
|||
| msg247799 - (view) | Author: Pierre Quentel (quentel) * | Date: 2015年08月01日 07:37 | |
Yes, I will be able to review the patch next week 2015年07月31日 18:13 GMT+02:00 STINNER Victor <report@bugs.python.org>: > > STINNER Victor added the comment: > > @Pierre Quentel: Hi! Are you still working on CGI? Can you please review > this patch? Thanks. > > -- > > Previous large change in the cgi module: issue #4953. Pierre helped a lot > on this one. > > ---------- > nosy: +quentel > > _______________________________________ > Python tracker <report@bugs.python.org> > <http://bugs.python.org/issue24764> > _______________________________________ > |
|||
| msg248185 - (view) | Author: Pierre Quentel (quentel) * | Date: 2015年08月07日 10:47 | |
I don't really see why there is a Content-Length in the headers of a multipart form data. The specification at http://www.w3.org/TR/html401/interact/forms.html#h-17.13.4.2 doesn't mention it, and it is absent in the example that looks like the one tested by Peter : Content-Type: multipart/form-data; boundary=AaB03x --AaB03x Content-Disposition: form-data; name="submit-name" Larry --AaB03x Content-Disposition: form-data; name="files"; filename="file1.txt" Content-Type: text/plain ... contents of file1.txt ... --AaB03x-- In case a user agent would insert it, I think the best would be to ignore it. That is, inside read_multi(), after headers = parser.close() add these lines : if 'content-length' in headers: del headers['content-length'] It's hard to see the potential side effects but I think it's cleaner than the proposed patch, which is not correct anyway for another reason : the attribute value is set to a bytes objects, instead of a string. Peter, does this make sense ? If so, can you submit another patch ? |
|||
| msg248198 - (view) | Author: Peter Landry (Peter Landry) * | Date: 2015年08月07日 15:06 | |
Yeah, I think that makes the most sense to me as well. I tried to make a minimum-impact patch, but this feels cleaner. If we remove the Content-Length header, the `limit` kwarg might occur at an odd place in the part itself, but that feels unavoidable if someone submits an incorrect Content-Length for the request itself. I'll re-work the patch and make sure the tests I added still add value. Thanks! |
|||
| msg248199 - (view) | Author: Peter Landry (Peter Landry) * | Date: 2015年08月07日 15:12 | |
A new patch that simply removes Content-Length from part headers when present. |
|||
| msg248258 - (view) | Author: Pierre Quentel (quentel) * | Date: 2015年08月08日 09:43 | |
Victor, you can apply the patch and close the issue. Le 7 août 2015 17:12, "Peter Landry" <report@bugs.python.org> a écrit : > > Peter Landry added the comment: > > A new patch that simply removes Content-Length from part headers when > present. > > ---------- > Added file: http://bugs.python.org/file40145/cgi_multipart.patch > > _______________________________________ > Python tracker <report@bugs.python.org> > <http://bugs.python.org/issue24764> > _______________________________________ > |
|||
| msg248306 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2015年08月08日 22:56 | |
Not today. I'm in holiday. Ping me in two weeks or find another core dev. |
|||
| msg248720 - (view) | Author: Pradeep (pradeep) | Date: 2015年08月17日 07:09 | |
Hi Victor, I found similar problem at https://bugs.launchpad.net/barbican/+bug/1485452, is problem mentioned in the bug is same with mentioned bug? |
|||
| msg248729 - (view) | Author: Peter Landry (Peter Landry) * | Date: 2015年08月17日 14:20 | |
Pradeep, that error seems to be in Barbican. This bug and patch only addresses content-length headers in MIME multipart messages. |
|||
| msg248778 - (view) | Author: Roundup Robot (python-dev) (Python triager) | Date: 2015年08月18日 17:24 | |
New changeset 11e9f34169d1 by Victor Stinner in branch '3.4': cgi.FieldStorage.read_multi ignores Content-Length https://hg.python.org/cpython/rev/11e9f34169d1 New changeset 5b9209e4c3e4 by Victor Stinner in branch '3.5': (Merge 3.4) cgi.FieldStorage.read_multi ignores Content-Length https://hg.python.org/cpython/rev/5b9209e4c3e4 New changeset 0ff1acc89cf0 by Victor Stinner in branch 'default': (Merge 3.5) cgi.FieldStorage.read_multi ignores Content-Length https://hg.python.org/cpython/rev/0ff1acc89cf0 |
|||
| msg248779 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2015年08月18日 17:26 | |
I applied the fix, thanks Peter for the report and the fix, thanks Pierre for the review. > https://bugs.launchpad.net/barbican/+bug/1485452, > is problem mentioned in the bug is same with mentioned bug? I don't know, but you can try to apply the patch locally if you want, or download the Python 3.4 using Mercurial. |
|||
| msg281076 - (view) | Author: Bert JW Regeer (X-Istence) * | Date: 2016年11月18日 05:21 | |
I've uploaded a patchset to bug #27777 that fixes this issue by fixing make_file, and doesn't cause Python to throw out the content-length information. It also fixes FieldStorage for when you provide it a non-multipart form submission and there is no list in FS. Please see http://bugs.python.org/issue27777 |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022年04月11日 14:58:19 | admin | set | github: 68952 |
| 2020年07月20日 20:52:04 | Rhodri James | set | nosy:
- Rhodri James |
| 2019年12月03日 17:23:21 | ethan.furman | set | assignee: ethan.furman nosy: + ethan.furman, Rhodri James |
| 2016年11月18日 05:21:51 | X-Istence | set | nosy:
+ X-Istence messages: + msg281076 |
| 2016年06月13日 20:03:47 | berker.peksag | link | issue27308 superseder |
| 2015年08月18日 17:45:31 | vstinner | set | status: open -> closed resolution: fixed |
| 2015年08月18日 17:26:58 | vstinner | set | messages: + msg248779 |
| 2015年08月18日 17:24:39 | python-dev | set | nosy:
+ python-dev messages: + msg248778 |
| 2015年08月17日 14:20:35 | Peter Landry | set | messages: + msg248729 |
| 2015年08月17日 07:09:04 | pradeep | set | nosy:
+ pradeep messages: + msg248720 |
| 2015年08月10日 21:16:26 | berker.peksag | set | nosy:
+ berker.peksag stage: patch review type: behavior versions: - Python 3.2, Python 3.3 |
| 2015年08月08日 22:56:01 | vstinner | set | messages: + msg248306 |
| 2015年08月08日 09:43:05 | quentel | set | messages: + msg248258 |
| 2015年08月07日 15:12:11 | Peter Landry | set | files:
+ cgi_multipart.patch messages: + msg248199 |
| 2015年08月07日 15:06:48 | Peter Landry | set | messages: + msg248198 |
| 2015年08月07日 10:48:00 | quentel | set | messages: + msg248185 |
| 2015年08月01日 07:37:06 | quentel | set | messages: + msg247799 |
| 2015年07月31日 16:13:48 | vstinner | set | nosy:
+ quentel messages: + msg247753 |
| 2015年07月31日 16:08:47 | Peter Landry | set | messages: + msg247752 |
| 2015年07月31日 16:07:21 | Peter Landry | create | |