
This issue tracker has been migrated to GitHub and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer Guide.

classification
Title: urllib2/httplib doesn't reset file position between requests
Type: behavior
Stage: needs patch
Components: Documentation, Library (Lib)
Versions: Python 3.10, Python 3.9
process
Status: open
Resolution:
Dependencies:
Superseder:
Assigned To: orsenthil
Nosy List: Anthony.Kong, LorenzMende, ajaksu2, dheiberg, ggenellina, jjlee, kc, martin.panter, matejcik, orsenthil
Priority: normal
Keywords: easy, patch

Created on 2009-01-23 17:07 by matejcik, last changed 2022-04-11 14:56 by admin.

Files
File name     Uploaded                         Description
auth-mmap.py  martin.panter, 2018-09-02 13:32  demonstration
Pull Requests
URL       Status  Linked
PR 11843  closed  python-dev, 2019-02-13 18:30
PR 11904  closed  kc, 2019-02-17 07:58
Messages (11)
msg80419 - Author: jan matejek (matejcik) * Date: 2009-01-23 17:06
Since 2.6, httplib supports reading from file-like objects.
Now consider the following situation:
There are two handlers in urllib2: the first is plain HTTP, the second is
basic auth.
I want to POST a file to a service, so I pass the open file object as the
data parameter to urllib2.urlopen.
The first handler is invoked. It sends the file data, but gets a 401
Unauthorized return code and fails with that.
The second handler in the chain is invoked (at least that's how I understand
urllib2; please correct me if I'm talking rubbish). At that point the
open file is at EOF, so empty data is sent.
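The effect described above can be reproduced without any network. The following is a minimal sketch, assuming Python 3 and using io.BytesIO as a stand-in for the open file. The helper send_body is hypothetical: it only mimics a handler consuming the body and is not a urllib function.

```python
import io

def send_body(body):
    """Hypothetical stand-in for one request attempt: read whatever is left."""
    return body.read()

body = io.BytesIO(b"hello world")

first_attempt = send_body(body)    # plain request, which would get the 401
second_attempt = send_body(body)   # retry with credentials, same file object

print(first_attempt)   # the full payload
print(second_attempt)  # empty, because the position is already at EOF
```

Nothing in this sketch rewinds the object between the two reads, which is the situation the report describes.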
Furthermore, the obvious solution, "you can't do this through urllib, so
go read the file yourself", doesn't apply that well: the file object in
question is actually a mmap.mmap instance.
This code has been in production since Python 2.4. Until file object support
was introduced in httplib, it worked fine, handling the mmap'ed file as
a string. Now it is picked up as read()-able and this problem occurs.
The only workaround that comes to mind to restore the pre-2.6 behavior is
building a wrapper class for the mmap object that hides its read() method.
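In Python 3, one possible way to obtain an object that exposes the mapped bytes but has no read() method is a memoryview over the mapping. This is a swapped-in technique, not the wrapper class the report has in mind, and whether a particular urllib version accepts it as request data is an assumption that would need checking. The sketch uses an anonymous mmap as a stand-in for the real file mapping and shows only the local behaviour.

```python
import mmap

# Anonymous 11-byte mapping as a stand-in for the real mmap'ed file.
mm = mmap.mmap(-1, 11)
mm.write(b"hello world")
mm.seek(0)

has_read = hasattr(mm, "read")         # the mmap itself looks file-like
view = memoryview(mm)                  # buffer view with no file position
view_has_read = hasattr(view, "read")

mm.read()                              # exhaust the mmap's own position
first_copy = bytes(view)               # the view still exposes all the data
second_copy = bytes(view)              # and does so again on a second pass

view.release()                         # release the view before closing
mm.close()
```

The view has to be released before the mapping is closed, because an mmap cannot be closed while a buffer export is still alive.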
msg80422 - Author: Gabriel Genellina (ggenellina) Date: 2009-01-23 23:28
This happens in other implementations too, not just urllib2.
If the server supports it, the best way is to send an
'Expect: 100-continue' header field before attempting to send the actual file.
msg185512 - Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2013-03-29 19:49
I think this needs triaging to determine whether the feature request is still applicable. "Expect: 100-continue" is sent by httplib, and support for this was added a few years ago, much later than this bug was originally raised.
msg241191 - Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-04-16 02:13
Actually, I do not think any "Expect: 100-continue" headers are explicitly sent by the Python standard library. The Python client does not support waiting for a "100 Continue" response; see Issue 1346874.
There is Issue 23740 opened about fixing or clarifying the various data types accepted by "http.client".
On the other hand, the documentation for urlopen() says only bytes and iterables are supported. If mmap objects are being treated as file objects by urlopen() that is unexpected, and the documentation or implementation needs fixing there. Also, iterating a mmap() object is different from iterating either the equivalent bytearray() or file object, so there is something weird going on there.
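The difference in iteration mentioned above can be inspected directly. This is a small sketch, assuming Python 3 and an anonymous mmap as a stand-in for a real file mapping. It prints what each object yields without assuming the result for the mmap case.

```python
import io
import mmap

payload = b"a\nb"

mm = mmap.mmap(-1, len(payload))   # anonymous mapping as a stand-in
mm.write(payload)
mm.seek(0)

mmap_items = list(mm)                       # whatever mmap iteration yields
bytearray_items = list(bytearray(payload))  # integers, one per byte
file_items = list(io.BytesIO(payload))      # lines, split after b"\n"

print(mmap_items)
print(bytearray_items)
print(file_items)
mm.close()
```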
msg324476 - Author: Lorenz Mende (LorenzMende) * Date: 2018-09-02 08:26
This issue should be closed, as no reproduction code has been provided.
There is no patch and there have been no comments since 2015.
msg324477 - Author: Martin Panter (martin.panter) * (Python committer) Date: 2018-09-02 13:32
Here is a demonstration script in case it helps. I haven’t tested it with versions before Python 2.6.
Older versions send "Content-Length: 11", but leave the server hanging trying to read the data. Newer versions (I presume since Issue 12319, 3.6+) send a valid HTTP 1.1 chunked request, but with empty data.
msg335624 - Author: kc (kc) * Date: 2019-02-15 17:28
PR 11843 should fix the issue in master; I didn't check Python 2.6 or prior versions. The problem is that the POST data is sent correctly in the first request to the HTTP service. The server then responds with 401 and the request is resent, but by then the mmap file pointer is at the end of the file, because the file was fully read during the first request. The PR simply seeks back to the beginning of the file after it has been read, so the request with auth credentials is sent with the POST body included.
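The approach described for PR 11843 amounts to rewinding the body before it is sent again. The following is a minimal sketch of that idea only, not the code from the pull request. The helper name rewind_body is hypothetical.

```python
import io

def rewind_body(body):
    """Seek a file-like body back to the start if it supports seeking.

    Returns True if the body was rewound, False otherwise.
    """
    seek = getattr(body, "seek", None)
    if seek is None:
        return False      # bytes, generators, etc.: nothing to rewind
    try:
        seek(0)
    except (OSError, ValueError):
        return False      # unseekable stream or closed file
    return True

body = io.BytesIO(b"hello world")
first = body.read()            # first attempt consumes the body
rewound = rewind_body(body)    # rewind before the authenticated retry
second = body.read()           # the retry sees the same bytes again
```

Rewinding to offset 0 assumes the caller wanted the upload to start at the beginning of the file, which may not hold if the file was deliberately positioned elsewhere before the request.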
msg335672 - Author: Martin Panter (martin.panter) * (Python committer) Date: 2019-02-16 07:13
For 3.7+ (where iterable objects are supported), I suggest:
1. Document the problem as a limitation of handlers like AbstractBasicAuthHandler, and consider raising an exception instead of trying to upload a file or iterable a second time.
2. Clarify the behaviour for different types of the "urllib.request" data parameter. I understand "file-like objects" means objects with a "read" attribute, and the "read" method is called in preference to iteration or treating the parameter as a "bytes" object.
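The precedence described in item 2 can be illustrated with a small classifier. This is a sketch of the described order (a "read" attribute first, then bytes-like, then iterable) with a hypothetical function name. It is not the library's actual code.

```python
import io
import mmap
from collections.abc import Iterable

def classify_body(data):
    """Return how a request body would be treated under the described order."""
    if hasattr(data, "read"):
        return "file-like"
    try:
        memoryview(data)          # succeeds for bytes, bytearray, etc.
    except TypeError:
        pass
    else:
        return "bytes-like"
    if isinstance(data, Iterable):
        return "iterable"
    raise TypeError("unsupported body type: %r" % type(data))

mm = mmap.mmap(-1, 4)             # anonymous mapping as a stand-in
kinds = [
    classify_body(b"data"),
    classify_body(io.BytesIO(b"data")),
    classify_body(mm),
    classify_body(iter([b"da", b"ta"])),
]
mm.close()
print(kinds)
```

Under this order an mmap object is classified as file-like even though it also supports the buffer protocol, which matches the behaviour reported in this issue.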
Despite the bug title, I don’t think the library should mess with the file position. Certainly not when making a single request. But it should already be possible for the caller to supply a custom iterable object that resets the file position:
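class FileReiterator:
    # Assumed constructor; the default chunksize is arbitrary.
    def __init__(self, file, chunksize=8192):
        self.file = file            # any seekable object with read()
        self.chunksize = chunksize

    def __iter__(self):
        self.file.seek(0)           # rewind on every new iteration
        while True:
            chunk = self.file.read(self.chunksize)
            yield chunk
            if len(chunk) < self.chunksize:
                break               # a short read means end of file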
msg335760 - Author: kc (kc) * Date: 2019-02-17 08:06
I added a new pull request.
Martin, you are right. While looking through the code, I realized that simply setting the file pointer to zero inside httplib might interfere with requests that don't have authentication enabled.
The new pull request implements number 2 of your suggestions for both Basic and Digest authentication.
Can you please review the code? Thank you.
msg335767 - Author: kc (kc) * Date: 2019-02-17 10:16
I will fix the build errors first.
msg335778 - Author: kc (kc) * Date: 2019-02-17 15:03
The pull request now passes the build checks; please review the code.
History
Date                 User           Action  Args
2022-04-11 14:56:44  admin          set     github: 49288
2020-11-04 11:39:38  iritkatriel    set     stage: patch review -> needs patch
                                            components: + Documentation
                                            versions: + Python 3.9, Python 3.10, - Python 2.6
2019-02-17 15:03:22  kc             set     messages: + msg335778
2019-02-17 10:16:06  kc             set     messages: + msg335767
2019-02-17 08:06:16  kc             set     messages: + msg335760
2019-02-17 07:58:12  kc             set     pull_requests: + pull_request11930
2019-02-16 07:13:24  martin.panter  set     messages: + msg335672
2019-02-15 17:28:06  kc             set     nosy: + kc
                                            messages: + msg335624
2019-02-13 18:30:33  python-dev     set     keywords: + patch
                                            stage: test needed -> patch review
                                            pull_requests: + pull_request11873
2019-01-23 00:57:19  dheiberg       set     nosy: + dheiberg
2018-09-02 13:32:33  martin.panter  set     files: + auth-mmap.py
                                            messages: + msg324477
2018-09-02 08:26:40  LorenzMende    set     nosy: + LorenzMende
                                            messages: + msg324476
2015-04-16 02:13:57  martin.panter  set     nosy: + martin.panter
                                            messages: + msg241191
2013-03-30 09:20:09  Anthony.Kong   set     nosy: + Anthony.Kong
2013-03-29 19:49:56  orsenthil      set     assignee: orsenthil
                                            messages: + msg185512
2009-04-22 17:24:08  ajaksu2        set     priority: normal
                                            keywords: + easy
2009-02-13 01:48:27  ajaksu2        set     nosy: + jjlee
2009-02-12 18:38:39  ajaksu2        set     nosy: + orsenthil, ajaksu2
                                            stage: test needed
2009-01-23 23:28:55  ggenellina     set     nosy: + ggenellina
                                            messages: + msg80422
2009-01-23 17:07:02  matejcik       create
