Message141039
| Author |
royliu |
| Recipients |
royliu |
| Date |
2011-07-24.05:12:19 |
| SpamBayes Score |
0.00031865627 |
| Marked as misclassified |
No |
| Message-id |
<1311484340.86.0.612332840201.issue12628@psf.upfronthosting.co.za> |
| In-reply-to |
| Content |
When testing urllib.request.urlopen in Python 3, I found that it returned empty responses for some sites: reading from the file-like object yields zero bytes. Python 2.x's urllib2.urlopen did not behave this way. I narrowed the problem down to the following difference:
@@ -1137,8 +1137,6 @@
             r = h.getresponse() # an HTTPResponse instance
         except socket.error as err:
             raise URLError(err)
-        finally:
-            h.close()
         r.url = req.get_full_url()
         # This line replaces the .msg attribute of the HTTPResponse
The "finally" clause is absent from Python 2.x's urllib2.py but present in Python 3.2's request.py. I think the problem is that the HTTPConnection is closed before the data can be read. It is puzzling, though, because some sites still return the expected response. I have attached a small test script for "www.wsj.com"; without the above patch, its response body should come back empty. |
|
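The attached script is not reproduced on this page. As a rough sketch of the kind of check described above (not the attached script itself), the following counts the bytes read from a response body. The helper name `body_length`, the `timeout` value, and the `http://` scheme in the example call are assumptions made for illustration only.

```python
import urllib.request


def body_length(url, timeout=10):
    """Return the number of bytes read from the response body at url."""
    response = urllib.request.urlopen(url, timeout=timeout)
    try:
        # On an affected build, the report says this read yields zero bytes
        # for some sites.
        return len(response.read())
    finally:
        # Close the response only after the body has been read.
        response.close()


# Example call (needs network access), using the site named in the report;
# the "http://" scheme is an assumption:
# print(body_length("http://www.wsj.com"))
```

A result of 0 for a page that a browser displays normally would match the symptom described in this message; a non-zero result would match the behaviour reported for urllib2.urlopen.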
History
| Date | User | Action | Args |
|---|---|---|---|
| 2011-07-24 05:12:20 | royliu | set | recipients: + royliu |
| 2011-07-24 05:12:20 | royliu | set | messageid: <1311484340.86.0.612332840201.issue12628@psf.upfronthosting.co.za> |
| 2011-07-24 05:12:20 | royliu | link | issue12628 messages |
| 2011-07-24 05:12:19 | royliu | create | |