
This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer Guide.

classification
Title: multipart/form-data encoding
Type: enhancement
Stage: needs patch
Components: email
Versions: Python 3.4

process
Status: open
Resolution:
Dependencies: 3243
Superseder:
Assigned To:
Nosy List: Chris.Waigl, Johannes.Hoff, ajaksu2, alexz, atommixz, barry, bgamari, catalin.iacob, catlee, cco3, checat, daniel.ugra, eric.araujo, forest_atq, fsteinel, gotgenes, guettli, jnoller, martin.panter, orsenthil, piotr.dobrogost, pitrou, r.david.murray, raylu, shazow, tamentis
Priority: normal
Keywords: easy, patch

Created on 2008-06-30 18:04 by catlee, last changed 2022-04-11 14:56 by admin.

Files
File name | Uploaded | Description
rfc2388.py | forest_atq, 2010-06-04 04:14 | Simple implementation of multipart/form-data.
http_formdata.patch | forest_atq, 2010-06-04 12:40 | Patch against py3k implementing http.formdata
http_formdata.patch | forest_atq, 2010-06-04 12:53 | New patch with module docstring.
http_formdata.patch | forest_atq, 2010-06-04 13:39 | New patch
Issue3244.patch | orsenthil, 2012-06-25 10:28 |
Messages (42)
msg69015 - Author: Chris AtLee (catlee) * Date: 2008-06-30 18:04
The standard library should provide a way to encode data using the
standard multipart/form-data encoding.
This encoding is required to support file uploads via HTTP POST (or PUT)
requests.
Ideally file data could be streamed to the remote server if httplib
supported iterable request bodies (see issue #3243).
Mailing list thread:
http://mail.python.org/pipermail/python-dev/2008-June/080783.html 
msg76356 - Author: Florian Steinel (fsteinel) Date: 2008-11-24 19:38
See http://code.activestate.com/recipes/146306/ for a user-contributed
implementation.
msg76357 - Author: Chris AtLee (catlee) * Date: 2008-11-24 19:54
I also wrote some software to handle this:
http://atlee.ca/software/poster/poster.encode.html
The reason I wrote this is to avoid having to load the entire file into
memory before posting the request.
This, along with Issue #3243, would allow streaming uploads of files via
HTTP POST.
msg81498 - Author: Daniel Diniz (ajaksu2) * (Python triager) Date: 2009-02-09 21:47
So, what is the best way to go about this (beyond docs and tests)? Beat
the linked recipe into a patch, adapt Chris' implementation?
msg93657 - Author: Andrey Petrov (shazow) Date: 2009-10-06 18:35
Once upon a time I wrote a library that did some of this among other things:
http://code.google.com/p/urllib3/
Or specifically:
http://code.google.com/p/urllib3/source/browse/trunk/urllib3/filepost.py
The code was borrowed from some of the recipes mentioned, but cleaned up
and adjusted a bit. Feel free to use it or borrow from it in any way you
like.
msg93666 - Author: Jesse Noller (jnoller) * (Python committer) Date: 2009-10-06 21:53
This request really does need a patch+tests+doc changes - I don't know if 
anyone with +commit has the time to distill the various implementations 
and generate something.
msg97571 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-01-11 03:03
Daniel suggested marking this as superseding issue 727898, and I agree. But I want to note that in that issue Barry suggested that it was possible services from the email package could be useful in building this support, and that there might be a better place for it to live than urllib.
msg107004 - Author: Forest Bond (forest_atq) * Date: 2010-06-04 04:14
Hi,
I believe the attached implementation is reasonable. I'm not sure if it should be called "email.mime.formdata", "rfc2388", etc.
I'd be happy to attach a proper patch with tests given some quick feedback.
Thanks,
Forest
msg107005 - Author: Forest Bond (forest_atq) * Date: 2010-06-04 04:18
Oh, hm, looks like I left a hard-coded name="files" in attach_file. I'll fix that in the patch after I've received any other feedback.
msg107019 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-06-04 09:39
You should write your patch against Python 3.x (py3k).
msg107024 - Author: Forest Bond (forest_atq) * Date: 2010-06-04 11:23
I haven't yet touched Python 3.0, and may not have time to dig in at the moment. It wouldn't be suitable to provide a patch against 2.7?
msg107025 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010-06-04 11:26
> I haven't yet touched Python 3.0, and may not have time to dig in at
> the moment. It wouldn't be suitable to provide a patch against 2.7?
2.7 is almost in release candidate phase, which means it's much too late
for new features now.
msg107029 - Author: Forest Bond (forest_atq) * Date: 2010-06-04 11:50
Okay, I'll submit against py3k.
msg107031 - Author: Forest Bond (forest_atq) * Date: 2010-06-04 11:52
Should the module be called rfc2388 or should it go into email.mime as formdata? It seems odd to put something HTML/HTTP related into email.mime, but maybe that would be fine. In any case, httplib docs should probably point to this module with an example, right?
msg107032 - Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2010-06-04 11:54
I think it belongs in the http package.
msg107034 - Author: Forest Bond (forest_atq) * Date: 2010-06-04 11:56
As http.formdata?
msg107038 - Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2010-06-04 12:07
Seems good to me, as long as the module docstring clearly states whether it’s useful for the client side, the server side or both.
BTW, isn’t there overlap with cgi.FieldStorage?
msg107041 - Author: Forest Bond (forest_atq) * Date: 2010-06-04 12:40
Hi,
Patch attached. Let me know what needs fixing.
I had to fix a bug in email.encoders for my tests to pass. I have not run the full test suite at this point (need to build py3k to do that, maybe I'll have time later today, but if someone else has time, feel free).
Thanks,
Forest
msg107042 - Author: Forest Bond (forest_atq) * Date: 2010-06-04 12:48
Éric,
Sorry, I just read your message.
I'll post a new patch with a module docstring.
I believe cgi.FieldStorage is only useful for parsing (i.e. on the server side). MIMEMultipartFormData is for generating multipart/form-data messages (i.e. on the client side).
Thanks,
Forest
msg107044 - Author: Forest Bond (forest_atq) * Date: 2010-06-04 12:53
Here's a new patch.
msg107045 - Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2010-06-04 12:56
Would you please open another issue for the email fix? Bonus points if you test it on trunk too, since the release candidate happens in a few days :)
Do you people think we could unify client and server-side code in the new module (with an alias from cgi for backward compatibility), to prevent endless questions?
Minor remark: I think we don’t have to follow the email naming scheme here. A simpler name like FormData could be just fine.
msg107046 - Author: Forest Bond (forest_atq) * Date: 2010-06-04 13:11
Hm, there is one issue. The example in the docstring wouldn't work.
You have to get the headers *after* the body, because the boundary isn't generated until the body has been. So this would work:
 body = msg.get_body()
 headers = dict(msg)
But this won't:
 headers = dict(msg)
 body = msg.get_body()
I'm not sure what the best way to deal with this is. Maybe instead of get_body we should have get_request_data which returns both headers and body. That would provide simpler semantics.
Thoughts?
Thanks,
Forest
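The ordering problem described above can be reproduced with the standard library alone. The sketch below is only an illustration: it uses email.mime.multipart.MIMEMultipart as a stand-in for the class in the patch (the patch itself is not shown here), and the field name "field" is made up. It checks the boundary before and after the body is generated.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def boundary_before_and_after():
    """Return the boundary seen before and after the body is generated."""
    # MIMEMultipart stands in for the patch's class (an assumption).
    msg = MIMEMultipart("form-data")
    part = MIMEText("value")
    part.add_header("Content-Disposition", "form-data", name="field")
    msg.attach(part)

    before = msg.get_boundary()  # no boundary parameter has been set yet
    body = msg.as_string()       # flattening the message picks a boundary
    after = msg.get_boundary()   # now present in the Content-Type header
    return before, after, body


before, after, body = boundary_before_and_after()
print("boundary before flattening:", before)
print("boundary appears in body:", after in body)
```

Reading the headers before flattening would therefore give a Content-Type without a usable boundary, which is why returning headers and body together is the simpler interface.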
msg107050 - Author: Forest Bond (forest_atq) * Date: 2010-06-04 13:39
New patch:
* Renames class to FormData.
* Replaces method get_body with get_request_data to simplify semantics.
* Drops changes to email.encoders. I'll create a new ticket to deal with that bug. Note that tests here fail without that fix.
msg107056 - Author: Forest Bond (forest_atq) * Date: 2010-06-04 14:07
See issue8896 for email.encoders fix.
msg107058 - Author: Forest Bond (forest_atq) * Date: 2010-06-04 14:15
I don't think Python trunk has the encoders issue, as that is related to the base64 moving to the bytes type.
msg108692 - Author: Konstantin Pelepelin (checat) Date: 2010-06-26 08:13
Did you test it against server-side form-data parser implementations? It will be useless if it won't work with the most widespread implementations: PHP's and at least some others (consider some popular Python web frameworks).
MIME-compliance is not enough, because browsers send only a subset of the MIME format and server-side parsers don't expect bodies full of MIME features.
Particularly, I believe most implementations don't expect any "Content-Transfer-Encoding" other than "binary", because browsers implement only 8-bit transfers. You should also check that there is no line-splitting or header-folding in headers and content, and make sure CR+LF (not plain LF, as in the patch) is always used.
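For illustration, here is a minimal sketch of the browser-like subset described above: CR+LF everywhere, raw 8-bit content, no Content-Transfer-Encoding header and no header folding. It is not the code from any patch attached to this issue. The function name encode_form_data and its argument layout are hypothetical, and quoting of unusual characters in field names and filenames is deliberately left out.

```python
import uuid

CRLF = b"\r\n"


def encode_form_data(fields, files, boundary=None):
    """Encode text fields and files as a multipart/form-data body.

    fields: dict mapping field name -> str value
    files:  dict mapping field name -> (filename, bytes content, content type)
    Returns (Content-Type header value, body bytes).
    """
    if boundary is None:
        boundary = uuid.uuid4().hex
    delimiter = b"--" + boundary.encode("ascii")
    lines = []
    for name, value in fields.items():
        disposition = 'Content-Disposition: form-data; name="%s"' % name
        lines += [delimiter, disposition.encode("utf-8"), b"", value.encode("utf-8")]
    for name, (filename, content, content_type) in files.items():
        disposition = 'Content-Disposition: form-data; name="%s"; filename="%s"' % (
            name,
            filename,
        )
        lines += [
            delimiter,
            disposition.encode("utf-8"),
            ("Content-Type: %s" % content_type).encode("ascii"),
            b"",
            content,  # raw bytes, written as-is with no transfer encoding
        ]
    # Closing delimiter, followed by a final CR+LF.
    lines += [delimiter + b"--", b""]
    return "multipart/form-data; boundary=%s" % boundary, CRLF.join(lines)


content_type, body = encode_form_data(
    {"comment": "hello"},
    {"upload": ("data.bin", b"\x00\n\xff", "application/octet-stream")},
    boundary="BOUND",
)
```

Note that this sketch builds the whole body in memory, so it shows the wire format only and does not address streaming.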
msg128608 - Author: Ben Gamari (bgamari) Date: 2011-02-15 17:18
Has there been any progress here?
msg128611 - Author: Forest Bond (forest_atq) * Date: 2011-02-15 17:47
Hi,
Sorry for the long delay. I have tested against a Python web application using restish via various WSGI web servers (CherryPy, wsgiref) and I have not seen problems. It may cause problems with other server-side implementations.
I will not have time to do broad testing against many different server-side implementations. Is there harm in applying the patch and fixing bugs that get reported?
Thanks,
Forest
msg128612 - Author: Forest Bond (forest_atq) * Date: 2011-02-15 17:48
Looks like bgamari and I stepped on each other's requests.
msg128613 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-02-15 18:36
This patch needs a Doc component.
Having finally taken a quick look at the RFC (I haven't read it through yet), I think this does belong in email and not http. The RFC makes it clear that while the most common implementation is http, it is designed to be generic, and as such IMO the logical place for it in the stdlib is with the rest of the MIME types, in email. From a usability standpoint, however, it would be more convenient in http, so if most people think it should go into http I won't object.
msg128618 - Author: Jesse Noller (jnoller) * (Python committer) Date: 2011-02-15 19:14
Yeah, despite what the RFC says, the most common usage is in web clients, and stuffing it in the email module won't be obvious to 95% of the population I think, unless that's where the implementation lives, but we can add a doc stub in the http docs pointing to it and why.
msg128620 - Author: Forest Bond (forest_atq) * Date: 2011-02-15 21:14
Hi,
So is the following enough to get this applied? If so, I'm game.
* Review RFC and enforce Content-Transfer-Encoding: binary if applicable [checat].
* Generate CR+LF line endings [checat].
* Review RFC and address "line-splitting and header-folding" if applicable [checat].
* Write documentation.
I can have this done in a week or so, but I'd like to have some confidence that it will be applied if I spend the time on it.
Thanks,
Forest
msg128621 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2011-02-15 21:55
In principle I think something like this should go in. Since it is a Message subclass, I'd like it to follow the current Message API whether or not it is located in the email package. __str__ and as_string have the right default for line length (no folding). The current default for line endings is \n, and I think the class should stick with that. You can't use __str__ or as_string to generate what you send on the wire if you are supporting binary data.
I am planning additions to the email API that will make integrating this class and adjusting the generated line endings easier. For the latter (assuming the email-sig approves) I plan a __bytes__ method that will generate "wire format", which would include using \r\n line endings and should be just what you need.
The current email package does not support the binary content transfer encoding, only 8bit. Support for the binary CTE is another planned addition for 3.3, and I think it can be prioritized ahead of most other features, given that this code needs it.
So, you might want to wait until the email pieces are in place, and possibly even help me develop them :)
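As a rough sketch of what "wire format" output can look like in later releases: assuming Python 3.4 or newer, where Message.as_bytes() accepts a policy, the line separator can be switched to \r\n by cloning the message's policy. MIMEMultipart is used here as a stand-in for the class in the patch, and the helper name to_wire_bytes is hypothetical.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def to_wire_bytes(msg):
    """Serialize msg to bytes using CR+LF line endings."""
    # Clone the message's own policy, changing only the line separator.
    wire_policy = msg.policy.clone(linesep="\r\n")
    return msg.as_bytes(policy=wire_policy)


msg = MIMEMultipart("form-data")
part = MIMEText("hello")
part.add_header("Content-Disposition", "form-data", name="greeting")
msg.attach(part)

wire = to_wire_bytes(msg)
print(wire.decode("ascii"))
```

The default as_string() output of the same message keeps \n line endings, so the choice of line ending stays with the caller rather than with the class.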
msg141905 - Author: Johannes Hoff (Johannes.Hoff) Date: 2011-08-11 14:31
Forest Bond: Thanks for this patch - I hope it will go in soon. In the meantime, could I get permission to use it as is? (I notice there is a copyright in the file) I would of course keep the attributions in the file.
msg141906 - Author: Forest Bond (forest_atq) * Date: 2011-08-11 14:35
Hi, Johannes. You can assume the Python license for this patch.
-Forest
msg161511 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-05-24 14:34
Forest, could you please submit a contributor agreement?
http://www.python.org/psf/contrib/
Life caught up with me and I haven't made enough progress to do anything with this yet, but I still want to.
msg161512 - Author: Forest Bond (forest_atq) * Date: 2012-05-24 14:40
Sure thing. I'll send it via e-mail later today.
msg161522 - Author: Forest Bond (forest_atq) * Date: 2012-05-24 18:04
Okay, Contributor Agreement sent.
msg161526 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012-05-24 19:12
Thanks.
msg163926 - Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2012-06-25 10:28
Thanks for the patch, Forest Bond.
However, the way I see this feature, it could be added to urllib.request as a separate handler called MultiPostHandler, which a request object could add and use when it needs it.
Here is a first version of this patch, which gives an idea of how it would be added to urllib.request. (Note that this is a Python 3 adaptation of the PyPI package named MultipartPostHandler.) I shall try to add this in 3.3 and shall include the tests/docs/howto.
msg163969 - Author: Forest Bond (forest_atq) * Date: 2012-06-25 13:51
Hi Senthil Kumaran,
Thanks for the feedback & patch.
I agree having support in urllib probably makes some sense. But why not implement basic support elsewhere and then tie it into urllib so those of us using something else can also use it? I'm using httplib in my application.
Thanks,
Forest
msg272655 - Author: Martin Panter (martin.panter) * (Python committer) Date: 2016-08-14 07:37
I think encoding the user’s IP address into the boundary is a bad idea. Forest’s version uses the existing "email" package, which calls random.randrange(sys.maxsize) and searches through the data for conflicts.
I haven’t really researched this, but I suspect it would be even better to use a CSPRNG like the new "secrets" module, or uuid.uuid4(). Otherwise, perhaps there is the possibility of attacks by predicting the boundary and injecting HTTP headers, splitting up requests, etc via a file upload.
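A minimal sketch of that idea, assuming Python 3.6 or newer for the "secrets" module: draw the boundary from a CSPRNG and still search the data for conflicts. The helper name make_boundary is hypothetical, and the sketch assumes every part is already available as bytes.

```python
import secrets


def make_boundary(parts, nbytes=16):
    """Return an unpredictable boundary that occurs in none of the parts.

    parts: iterable of bytes objects that will be sent in the body.
    """
    parts = list(parts)
    while True:
        boundary = secrets.token_hex(nbytes)  # 2 * nbytes hex digits
        marker = b"--" + boundary.encode("ascii")
        # A collision is extremely unlikely, but checking is cheap.
        if not any(marker in part for part in parts):
            return boundary


boundary = make_boundary([b"first part", b"--second part"])
print(boundary)
```

For data that is streamed rather than held in memory the conflict search is not possible up front, so in that case the unpredictability of the boundary is the only protection.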
Both Forest and Senthil’s patches look like they load all the data into memory, so would not be useful for streaming, which was the original request. Hence I am putting this back to "needs patch". Issue 3243 has been resolved, meaning that we can stream upload data as long as the Content-Length has been pre-calculated. The length could be calculated from the length of each piece (e.g. file sizes).
Also, with Issue 12319 (chunked encoding) about to be resolved, if people only need to use HTTP 1.1, it may be easier to upload forms using chunked encoding, where you don’t have to worry about Content-Length.
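The pre-calculated Content-Length approach could be sketched as follows. This is an illustration under stated assumptions, not a proposed API: the names content_length and iter_body, the tuple layout for files, and the fixed application/octet-stream content type are all hypothetical, and only file parts are handled.

```python
import io

CRLF = b"\r\n"


def _part_header(boundary, name, filename):
    """Bytes that precede one file's content in the body."""
    return (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{name}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n"
        "\r\n"
    ).encode("utf-8")


def _closing(boundary):
    """Bytes of the closing delimiter."""
    return f"--{boundary}--\r\n".encode("ascii")


def content_length(boundary, files):
    """Compute the body length from file sizes, without reading any file.

    files: list of (field name, filename, file object, size in bytes)
    """
    total = len(_closing(boundary))
    for name, filename, _fileobj, size in files:
        total += len(_part_header(boundary, name, filename)) + size + len(CRLF)
    return total


def iter_body(boundary, files, chunk_size=64 * 1024):
    """Yield the body piece by piece so no file is held fully in memory."""
    for name, filename, fileobj, _size in files:
        yield _part_header(boundary, name, filename)
        while True:
            chunk = fileobj.read(chunk_size)
            if not chunk:
                break
            yield chunk
        yield CRLF
    yield _closing(boundary)


payload = b"x" * 200_000
files = [("upload", "data.bin", io.BytesIO(payload), len(payload))]
expected = content_length("BOUND", files)
streamed = sum(len(chunk) for chunk in iter_body("BOUND", files))
print(expected == streamed)
```

Since http.client accepts an iterable as the request body, an iterator like this could be passed as the body together with an explicit Content-Length header. With real files, the sizes would come from something like os.path.getsize rather than from an in-memory buffer.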
History
Date | User | Action | Args
2022-04-11 14:56:36 | admin | set | github: 47494
2016-08-14 07:37:52 | martin.panter | set | messages: + msg272655; stage: patch review -> needs patch
2016-08-13 10:44:27 | SilentGhost | set | nosy: + barry, guettli, orsenthil, pitrou, catlee, gotgenes, ajaksu2, jnoller, eric.araujo, forest_atq, fsteinel, r.david.murray, shazow, bgamari, daniel.ugra, alexz, tamentis, checat, catalin.iacob, Chris.Waigl, Johannes.Hoff, martin.panter, cco3, atommixz, piotr.dobrogost, raylu, - lissacoffeyx
2016-08-13 10:44:12 | SilentGhost | set | messages: - msg272583
2016-08-13 10:37:58 | lissacoffeyx | set | nosy: + lissacoffeyx, - barry, guettli, orsenthil, pitrou, catlee, gotgenes, ajaksu2, jnoller, eric.araujo, forest_atq, fsteinel, r.david.murray, shazow, bgamari, daniel.ugra, alexz, tamentis, checat, catalin.iacob, Chris.Waigl, Johannes.Hoff, martin.panter, cco3, atommixz, piotr.dobrogost, raylu; messages: + msg272583
2015-08-13 04:29:49 | raylu | set | nosy: + raylu
2013-11-08 05:26:28 | martin.panter | set | nosy: + martin.panter
2012-11-22 23:36:53 | piotr.dobrogost | set | nosy: + piotr.dobrogost
2012-10-02 05:34:08 | ezio.melotti | set | versions: + Python 3.4, - Python 3.3
2012-06-25 13:51:35 | forest_atq | set | messages: + msg163969
2012-06-25 10:28:03 | orsenthil | set | files: + Issue3244.patch; messages: + msg163926
2012-05-24 19:12:04 | r.david.murray | set | messages: + msg161526
2012-05-24 18:04:10 | forest_atq | set | messages: + msg161522
2012-05-24 14:40:52 | forest_atq | set | messages: + msg161512
2012-05-24 14:34:40 | r.david.murray | set | assignee: r.david.murray ->; messages: + msg161511; components: + email, - Library (Lib); stage: test needed -> patch review
2011-10-21 20:17:32 | daniel.ugra | set | nosy: + daniel.ugra
2011-10-06 20:33:53 | atommixz | set | nosy: + atommixz
2011-08-24 14:26:47 | cco3 | set | nosy: + cco3
2011-08-11 14:35:20 | forest_atq | set | messages: + msg141906
2011-08-11 14:31:12 | Johannes.Hoff | set | nosy: + Johannes.Hoff; messages: + msg141905
2011-04-12 14:09:47 | catalin.iacob | set | nosy: + catalin.iacob
2011-02-15 21:55:43 | r.david.murray | set | nosy: barry, guettli, orsenthil, pitrou, catlee, gotgenes, ajaksu2, jnoller, eric.araujo, forest_atq, fsteinel, r.david.murray, shazow, bgamari, alexz, tamentis, checat, Chris.Waigl; messages: + msg128621
2011-02-15 21:14:17 | forest_atq | set | nosy: barry, guettli, orsenthil, pitrou, catlee, gotgenes, ajaksu2, jnoller, eric.araujo, forest_atq, fsteinel, r.david.murray, shazow, bgamari, alexz, tamentis, checat, Chris.Waigl; messages: + msg128620
2011-02-15 19:14:34 | jnoller | set | nosy: barry, guettli, orsenthil, pitrou, catlee, gotgenes, ajaksu2, jnoller, eric.araujo, forest_atq, fsteinel, r.david.murray, shazow, bgamari, alexz, tamentis, checat, Chris.Waigl; messages: + msg128618
2011-02-15 18:36:56 | r.david.murray | set | nosy: barry, guettli, orsenthil, pitrou, catlee, gotgenes, ajaksu2, jnoller, eric.araujo, forest_atq, fsteinel, r.david.murray, shazow, bgamari, alexz, tamentis, checat, Chris.Waigl; messages: + msg128613
2011-02-15 17:48:58 | forest_atq | set | nosy: barry, guettli, orsenthil, pitrou, catlee, gotgenes, ajaksu2, jnoller, eric.araujo, forest_atq, fsteinel, r.david.murray, shazow, bgamari, alexz, tamentis, checat, Chris.Waigl; messages: + msg128612
2011-02-15 17:47:36 | forest_atq | set | nosy: barry, guettli, orsenthil, pitrou, catlee, gotgenes, ajaksu2, jnoller, eric.araujo, forest_atq, fsteinel, r.david.murray, shazow, bgamari, alexz, tamentis, checat, Chris.Waigl; messages: + msg128611
2011-02-15 17:18:02 | bgamari | set | nosy: + bgamari; messages: + msg128608
2010-12-27 17:28:55 | r.david.murray | set | nosy: barry, guettli, orsenthil, pitrou, catlee, gotgenes, ajaksu2, jnoller, eric.araujo, forest_atq, fsteinel, r.david.murray, shazow, alexz, tamentis, checat, Chris.Waigl; versions: + Python 3.3, - Python 3.2
2010-08-11 12:22:07 | Chris.Waigl | set | nosy: + Chris.Waigl
2010-06-26 08:13:57 | checat | set | nosy: + checat; messages: + msg108692
2010-06-04 14:15:22 | forest_atq | set | messages: + msg107058
2010-06-04 14:07:59 | forest_atq | set | messages: + msg107056
2010-06-04 13:39:59 | forest_atq | set | files: + http_formdata.patch; messages: + msg107050
2010-06-04 13:11:05 | forest_atq | set | messages: + msg107046
2010-06-04 12:56:07 | eric.araujo | set | messages: + msg107045
2010-06-04 12:53:50 | forest_atq | set | files: + http_formdata.patch; messages: + msg107044
2010-06-04 12:48:48 | forest_atq | set | messages: + msg107042
2010-06-04 12:40:47 | forest_atq | set | files: + http_formdata.patch; keywords: + patch; messages: + msg107041
2010-06-04 12:07:02 | eric.araujo | set | messages: + msg107038
2010-06-04 11:56:34 | forest_atq | set | messages: + msg107034
2010-06-04 11:54:58 | eric.araujo | set | nosy: + eric.araujo; messages: + msg107032
2010-06-04 11:52:04 | forest_atq | set | messages: + msg107031
2010-06-04 11:50:26 | forest_atq | set | messages: + msg107029
2010-06-04 11:26:36 | pitrou | set | messages: + msg107025
2010-06-04 11:23:20 | forest_atq | set | messages: + msg107024
2010-06-04 09:39:21 | pitrou | set | nosy: + pitrou; messages: + msg107019
2010-06-04 04:18:05 | forest_atq | set | messages: + msg107005
2010-06-04 04:14:09 | forest_atq | set | files: + rfc2388.py; nosy: + forest_atq; messages: + msg107004
2010-05-05 16:33:20 | r.david.murray | set | assignee: r.david.murray; versions: + Python 3.2, - Python 2.7
2010-05-05 13:38:38 | barry | set | assignee: barry -> (no value)
2010-01-11 03:04:07 | r.david.murray | link | issue727898 superseder
2010-01-11 03:03:26 | r.david.murray | set | nosy: + r.david.murray; messages: + msg97571
2009-11-03 11:58:42 | guettli | set | nosy: + guettli
2009-10-06 21:53:20 | jnoller | set | nosy: + jnoller; messages: + msg93666
2009-10-06 18:35:25 | shazow | set | nosy: + shazow; messages: + msg93657
2009-08-12 18:17:55 | tamentis | set | nosy: + tamentis
2009-07-19 18:40:04 | alexz | set | nosy: + alexz
2009-05-04 15:45:25 | barry | set | assignee: barry; nosy: + barry
2009-04-22 18:48:42 | ajaksu2 | set | priority: normal
2009-02-12 17:47:25 | ajaksu2 | set | keywords: + easy; nosy: + orsenthil; dependencies: + Support iterable bodies in httplib; stage: test needed
2009-02-09 21:47:09 | ajaksu2 | set | nosy: + ajaksu2; messages: + msg81498
2009-01-07 18:29:50 | gotgenes | set | nosy: + gotgenes
2008-11-24 19:54:47 | catlee | set | messages: + msg76357
2008-11-24 19:38:54 | fsteinel | set | nosy: + fsteinel; messages: + msg76356
2008-06-30 18:04:19 | catlee | create |
