Created on 2009-06-19 13:53 by ezio.melotti, last changed 2022-04-11 14:56 by admin. This issue is now closed.
| Files | | |
|---|---|---|
| File name | Uploaded | Description |
| 6312.diff | chkneo, 2009-06-29 17:03 | patch for Lib/http/client.py |
| Messages (11) | | | |
|---|---|---|---|
| msg89521 | Author: Ezio Melotti (ezio.melotti) (Python committer) | Date: 2009-06-19 13:53 | |
Try this code (youtube.com uses "transfer-encoding: chunked"):

    import httplib
    url = 'www.youtube.com'
    conn = httplib.HTTPConnection(url)
    conn.request('HEAD', '/')  # send a HEAD request
    res = conn.getresponse()
    print res.getheader('transfer-encoding')

So far it works fine, but when you try:

    res.read()

it just hangs there, where "there" is:

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "C:\Programs\Python26\lib\httplib.py", line 517, in read
        return self._read_chunked(amt)
      File "C:\Programs\Python26\lib\httplib.py", line 553, in _read_chunked
        line = self.fp.readline()
      File "C:\Programs\Python26\lib\socket.py", line 395, in readline
        data = recv(1)
    KeyboardInterrupt

If instead of youtube.com we use the URL of a site that doesn't use "transfer-encoding: chunked" (e.g. url = 'dpaste.com'), res.read() returns an empty string.

When a HEAD request is sent, the content of the page is not returned, so there should be no point in calling .read(), but try this:

    import urllib2
    class HeadRequest(urllib2.Request):
        def get_method(self):
            return 'HEAD'

    url = 'http://www.youtube.com/watch?v=tCVqx2b-c7U'
    # Note: I had this problem with this URL, the video
    # is not available in my country (Finland) and it
    # may work fine for other countries
    req = HeadRequest(url)
    page = urllib2.urlopen(req)

This is what happens here with Python 2.5:

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib/python2.5/urllib2.py", line 124, in urlopen
        return _opener.open(url, data)
      File "/usr/lib/python2.5/urllib2.py", line 387, in open
        response = meth(req, response)
      File "/usr/lib/python2.5/urllib2.py", line 498, in http_response
        'http', request, response, code, msg, hdrs)
      File "/usr/lib/python2.5/urllib2.py", line 419, in error
        result = self._call_chain(*args)
      File "/usr/lib/python2.5/urllib2.py", line 360, in _call_chain
        result = func(*args)
      File "/usr/lib/python2.5/urllib2.py", line 579, in http_error_302
        fp.read()
      File "/usr/lib/python2.5/socket.py", line 291, in read
        data = self._sock.recv(recv_size)
      File "/usr/lib/python2.5/httplib.py", line 509, in read
        return self._read_chunked(amt)
      File "/usr/lib/python2.5/httplib.py", line 548, in _read_chunked
        chunk_left = int(line, 16)
    ValueError: invalid literal for int() with base 16: ''

With Python 2.6 the error is slightly different:

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "C:\Programs\Python26\lib\urllib2.py", line 124, in urlopen
        return _opener.open(url, data, timeout)
      File "C:\Programs\Python26\lib\urllib2.py", line 389, in open
        response = meth(req, response)
      File "C:\Programs\Python26\lib\urllib2.py", line 502, in http_response
        'http', request, response, code, msg, hdrs)
      File "C:\Programs\Python26\lib\urllib2.py", line 421, in error
        result = self._call_chain(*args)
      File "C:\Programs\Python26\lib\urllib2.py", line 361, in _call_chain
        result = func(*args)
      File "C:\Programs\Python26\lib\urllib2.py", line 594, in http_error_302
        fp.read()
      File "C:\Programs\Python26\lib\socket.py", line 327, in read
        data = self._sock.recv(rbufsize)
      File "C:\Programs\Python26\lib\httplib.py", line 517, in read
        return self._read_chunked(amt)
      File "C:\Programs\Python26\lib\httplib.py", line 563, in _read_chunked
        raise IncompleteRead(value)
    httplib.IncompleteRead

With Py3.0 it is the same:

    [...]
    http.client.IncompleteRead: b''

In this case self.fp.readline() (and the data = recv(1) in socket.py) returns and the error happens a few lines later.

This seems to happen when there's a redirection in between (the video is not available in my country, so the server sends back a 303 status code and redirects me to the home page). The redirection is not handled by httplib, so there might be something wrong in urllib2 too (why is it trying to read the content if we sent a HEAD request and there is a redirection in between?), but fixing httplib to return an empty string or something similar could be enough to solve this problem too. If there's actually a problem in urllib2, another issue should probably be created.

With the same code and the URL of a working youtube video (no redirections in between), "page = urllib2.urlopen(req)" works even with "transfer-encoding: chunked", but it fails later if we do "page.read()":

    Traceback (most recent call last):
      File "C:\Programs\Python30\lib\http\client.py", line 520, in _read_chunked
        chunk_left = int(line, 16)
    ValueError: invalid literal for int() with base 16: ''

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "C:\Programs\Python30\lib\http\client.py", line 479, in read
        return self._read_chunked(amt)
      File "C:\Programs\Python30\lib\http\client.py", line 525, in _read_chunked
        raise IncompleteRead(value)
    http.client.IncompleteRead: b''
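For readers on Python 3, the HeadRequest subclass from the report can be written roughly as below. This is only a sketch: it assumes Python 3.3 or later, where urllib.request.Request accepts a method argument, and the helper name make_head_request and the example.com URL are made up for illustration. It builds the request object without touching the network.

```python
import urllib.request


class HeadRequest(urllib.request.Request):
    """Python 3 port of the subclass used in the report."""

    def get_method(self):
        return "HEAD"


def make_head_request(url):
    """Hypothetical helper: the same idea without a subclass (Python 3.3+)."""
    return urllib.request.Request(url, method="HEAD")


if __name__ == "__main__":
    # Neither line opens a connection; they only construct request objects.
    print(HeadRequest("http://www.example.com/").get_method())
    print(make_head_request("http://www.example.com/").get_method())
```

Passing either object to urllib.request.urlopen() would then send a HEAD request instead of a GET.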
| msg89868 | Author: Chandru (chkneo) | Date: 2009-06-29 17:03 | |
A HEAD request won't return any data, so before calling _read_chunked we have to check whether amt is None. If it is None, simply return b''. I've attached a patch, which was made against the py3k branch.
| msg99796 | Author: Michal Božoň (mykhal) | Date: 2010-02-22 17:52 | |
I confirm. In my case, the bug manifested when calling the HEAD method on a different server with chunked transfer encoding (http://obrazky.cz). My workaround is to always call response.read(), except when method == 'HEAD' and resp.getheader('transfer-encoding') == 'chunked'.
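The workaround described above can be sketched as a small helper. The names read_body and StubResponse are hypothetical and only serve the illustration; the stub stands in for a real response object so the example runs without a network connection. The sketch is written for Python 3, so the empty body is b''.

```python
def read_body(method, resp):
    """Read the body unless this is a HEAD request with a chunked response.

    `resp` only needs getheader() and read(), like an httplib response.
    """
    encoding = (resp.getheader('transfer-encoding') or '').lower()
    if method.upper() == 'HEAD' and encoding == 'chunked':
        return b''  # nothing to read; avoids the hang / IncompleteRead
    return resp.read()


class StubResponse:
    """Hypothetical stand-in for a response object (illustration only)."""

    def __init__(self, headers, body):
        self.headers = {k.lower(): v for k, v in headers.items()}
        self.body = body
        self.read_called = False

    def getheader(self, name, default=None):
        return self.headers.get(name.lower(), default)

    def read(self):
        self.read_called = True
        return self.body


if __name__ == "__main__":
    chunked = StubResponse({'Transfer-Encoding': 'chunked'}, b'ignored')
    print(read_body('HEAD', chunked), chunked.read_called)
```

On interpreters that include the fix for this issue the guard should no longer be needed, but it is harmless to keep.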
| msg104404 | Author: Senthil Kumaran (orsenthil) (Python committer) | Date: 2010-04-28 03:38 | |
I can take this up. HEAD requests do not return any data, so when the data is None and the transfer encoding is chunked, we can return an empty value for the next step. There is no need to attempt to read the chunked amt. The patch is fine; tests need to be added.
| msg104443 | Author: Senthil Kumaran (orsenthil) (Python committer) | Date: 2010-04-28 17:48 | |
Whenever the HEAD method is used, httplib now recognizes it in its read method and returns an empty string ('') as expected. Fixed in revision 80583, release26-maint: r80584, py3k: r80587 and release31-maint: r80588.
| msg106457 | Author: Ezio Melotti (ezio.melotti) (Python committer) | Date: 2010-05-25 18:07 | |
Thanks Senthil!
| msg106520 | Author: Dirkjan Ochtman (djc) (Python committer) | Date: 2010-05-26 10:40 | |
The fix in r80583 is bad. It fails to close() the response (which previously worked as expected), meaning that the connection can't be re-used. (I ran into this because Gentoo has backported the 2.6-maint fixes to their 2.6.5 distribution.) Shall I open a new issue, or re-open this one?
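Why an unclosed response blocks re-use can be illustrated offline with Python 3's http.client. This is a sketch under assumptions, not the code from r80583: FakeSocket, HEAD_REPLY and second_head_succeeds are hypothetical names, the canned reply is invented, and the "response left open" case is simulated by simply not calling read(), since current interpreters already contain the later fix.

```python
import io
import http.client

# Hypothetical canned reply: headers of a chunked response to HEAD, no body.
HEAD_REPLY = b"HTTP/1.1 200 OK\r\nTransfer-Encoding: chunked\r\n\r\n"


class FakeSocket:
    """Hypothetical in-memory stand-in for a socket; no network is used."""

    def __init__(self, reply):
        self.reply = reply
        self.sent = b""

    def sendall(self, data):
        self.sent += data

    def makefile(self, mode, *args, **kwargs):
        # Every response object gets a fresh copy of the canned reply.
        return io.BytesIO(self.reply)

    def close(self):
        pass


def second_head_succeeds(read_first):
    """Send two HEAD requests on one connection; report whether the 2nd works."""
    conn = http.client.HTTPConnection("example.com")
    conn.sock = FakeSocket(HEAD_REPLY)  # pre-set the socket so connect() is skipped
    conn.request("HEAD", "/")
    first = conn.getresponse()
    if read_first:
        first.read()  # returns b'' and closes the response
    conn.request("HEAD", "/")
    try:
        conn.getresponse()
    except http.client.ResponseNotReady:
        return False  # the previous response was still open
    return True


if __name__ == "__main__":
    print(second_head_succeeds(read_first=True))
    print(second_head_succeeds(read_first=False))
```

The connection keeps a reference to the last response and refuses to hand out a new one until that response reports itself closed, which is why read() on a HEAD response has to close it.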
| msg106521 | Author: Senthil Kumaran (orsenthil) (Python committer) | Date: 2010-05-26 11:12 | |
I am just reopening this, as per djc's comment.
| msg107076 | Author: Senthil Kumaran (orsenthil) (Python committer) | Date: 2010-06-04 16:46 | |
Fixed in r81687, r81688, r81689 and r81690. Yes, I see that before the original change was made, any chunked encoding went through _read_chunked, which closes the resp before returning. So now, for HEAD, the resp is closed, thus fixing the problem mentioned by djc.
| msg107077 | Author: Dirkjan Ochtman (djc) (Python committer) | Date: 2010-06-04 17:06 | |
Might be useful to have a test for this?
| msg107080 | Author: Senthil Kumaran (orsenthil) (Python committer) | Date: 2010-06-04 17:33 | |
I saw that the earlier test was closing it explicitly. Removed that and added a test which verifies the closed resp obj. Thanks.
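A test along the lines described above might look like the following sketch for Python 3's http.client. It is not the test that was committed; FakeSocket, ChunkedHeadTest and run_tests are hypothetical names, and the canned reply is invented for the example.

```python
import io
import unittest
from http import client


class FakeSocket:
    """Hypothetical in-memory socket: makefile() serves a canned reply."""

    def __init__(self, data):
        self.data = data

    def makefile(self, mode, *args, **kwargs):
        return io.BytesIO(self.data)


class ChunkedHeadTest(unittest.TestCase):
    def test_head_read_closes_response(self):
        sock = FakeSocket(
            b"HTTP/1.1 200 OK\r\n"
            b"Transfer-Encoding: chunked\r\n\r\n"
        )
        resp = client.HTTPResponse(sock, method="HEAD")
        resp.begin()
        self.assertEqual(resp.getheader("transfer-encoding"), "chunked")
        # read() must return an empty body without parsing any chunks ...
        self.assertEqual(resp.read(), b"")
        # ... and must leave the response closed so the connection is reusable.
        self.assertTrue(resp.isclosed())


def run_tests():
    """Run the test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ChunkedHeadTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_tests()
```

The test does not close the response explicitly, so it only passes if read() itself closes it.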
| History | | | |
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:56:50 | admin | set | github: 50561 |
| 2010-06-04 17:33:10 | orsenthil | set | messages: + msg107080 |
| 2010-06-04 17:06:46 | djc | set | messages: + msg107077 |
| 2010-06-04 16:46:07 | orsenthil | set | status: open -> closed; priority: release blocker ->; resolution: accepted -> fixed; messages: + msg107076 |
| 2010-06-04 14:42:02 | djc | set | priority: normal -> release blocker |
| 2010-05-26 13:58:59 | Arfrever | set | nosy: + Arfrever |
| 2010-05-26 11:12:47 | orsenthil | set | status: closed -> open; resolution: fixed -> accepted; messages: + msg106521 |
| 2010-05-26 10:40:36 | djc | set | nosy: + djc; messages: + msg106520 |
| 2010-05-25 18:07:35 | ezio.melotti | set | messages: + msg106457; versions: + Python 3.1, Python 3.2, - Python 2.5, Python 3.0 |
| 2010-04-28 17:48:21 | orsenthil | set | status: open -> closed; resolution: accepted -> fixed; messages: + msg104443; stage: patch review -> resolved |
| 2010-04-28 03:38:07 | orsenthil | set | nosy: + orsenthil; messages: + msg104404; assignee: orsenthil; resolution: accepted |
| 2010-04-28 03:23:46 | rcoup | set | nosy: + rcoup |
| 2010-02-22 17:52:44 | mykhal | set | nosy: + mykhal; messages: + msg99796 |
| 2009-06-30 23:06:00 | ezio.melotti | set | stage: patch review |
| 2009-06-29 17:03:26 | chkneo | set | files: + 6312.diff; nosy: + chkneo; messages: + msg89868; keywords: + patch |
| 2009-06-19 13:53:28 | ezio.melotti | create | |