Author olemis
Recipients olemis
Date 2009-01-26.19:22:50
Message-id <1232997774.08.0.438260332945.issue5072@psf.upfronthosting.co.za>
In-reply-to
Content
Hello ...
The first thing I have to say is that I searched the open issues and
found nothing similar to what I am about to report. If this ticket is
a duplicate, I apologize ...
Yesterday I was testing how to access the wiki pages in a
Trac [1]_ site and I realized that something was going wrong
(a bug? ...)
Initially the behavior was as follows:
{{{
#!python
>>> u = urllib.urlopen('http://localhost:8000/trac-dev')
>>> u.read()
'Environment not found'
>>> u.close()
}}}
And tracd reported a line like this:
{{{
127.0.0.1 - - [25/Jan/2009 17:32:08] "GET http://localhost:8000/trac-dev HTTP/1.0" 404 -
}}}
This means that a 'Not found' error code was sent back to the urllib
client.
I tried to access the same page from my browser and tracd reported:
{{{
127.0.0.1 - - [25/Jan/2009 18:05:44] "GET /trac-dev HTTP/1.0" 200 -
}}}
The problem is obvious ... urllib was sending the full URL after GET,
when it should send only the part of the URL after the network location.
I applied the following patch to urllib (yours will be better, I am
sure about that ;)
{{{
#!diff
--- /usr/lib/python2.5/urllib.py 2008-07-31 13:40:40.000000000 -0500
+++ /media/urllib_unix.py 2009-01-26 09:48:54.000000000 -0500
@@ -270,6 +270,7 @@
     def open_http(self, url, data=None):
         """Use HTTP protocol."""
         import httplib
+        from urlparse import urlparse
         user_passwd = None
         proxy_passwd= None
         if isinstance(url, str):
@@ -312,12 +313,17 @@
         else:
             auth = None
         h = httplib.HTTP(host)
+        target = ''.join(sep + part for sep, part in \
+                    zip(['', ';', '?', '#'], \
+                        urlparse(selector)[2:]) \
+                    if part)
+        print target
         if data is not None:
-            h.putrequest('POST', selector)
+            h.putrequest('POST', target)
             h.putheader('Content-Type', 'application/x-www-form-urlencoded')
             h.putheader('Content-Length', '%d' % len(data))
         else:
-            h.putrequest('GET', selector)
+            h.putrequest('GET', target)
         if proxy_auth: h.putheader('Proxy-Authorization', 'Basic %s' % proxy_auth)
         if auth: h.putheader('Authorization', 'Basic %s' % auth)
         if realhost: h.putheader('Host', realhost)
}}}
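The expression that builds `target` in the patch can also be tried on its own. The following is only a sketch, ported to Python 3 (`urllib.parse` instead of the `urlparse` module); the function name `request_target` is hypothetical and is not part of urllib.

```python
from urllib.parse import urlparse


def request_target(url):
    """Rebuild path;params?query#fragment, dropping scheme and netloc.

    Hypothetical helper mirroring the join expression used in the patch.
    """
    # urlparse() returns (scheme, netloc, path, params, query, fragment);
    # slicing from index 2 keeps only the last four components.
    parts = urlparse(url)[2:]
    separators = ['', ';', '?', '#']
    # Emit each separator only when its component is non-empty.
    return ''.join(sep + part for sep, part in zip(separators, parts) if part)


if __name__ == '__main__':
    print(request_target('http://localhost:8000/trac-dev'))
    print(request_target('http://localhost:8000/trac-dev/wiki?action=edit'))
```

Note that a URL without a path (for example `http://localhost:8000`) yields an empty string here, so a real fix would probably need to fall back to `/` in that case.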
And everything was «back» to normal ...
{{{
#!python
>>> u = urllib.urlopen('http://localhost:8000/trac-dev')
>>> u.read()
 ... # Lots of beautiful HTML code ;)
>>> u.close()
}}}
... tracd outputted ...
{{{
127.0.0.1 - - [25/Jan/2009 18:05:44] "GET /trac-dev HTTP/1.0" 200 -
}}}
The same picture is shown when using both Python 2.5.1 and 2.5.2 ...
I have not installed Python 2.6.x, so I am not sure whether this
issue has propagated to newer versions of Python ... and I don't
know either whether this issue is also present in urllib2 ...
... so further research is needed, but IMO this is a serious bug :(
PS: If this is a bug ... how could it have stayed hidden so far? Is there any
 test case written to assert this kind of thing? I checked out the
 `test.test_urllib` and `test.test_urllibnet` modules and I saw
 nothing at all ...
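As a rough idea of what such a test could assert, here is a minimal sketch. It is an assumption-laden illustration, not an existing test: it uses Python 3's `http.client` rather than the Python 2.5 `urllib` code path discussed above, and `RecordingConnection` and `request_line_for` are hypothetical names. The connection records the bytes it would send, so no server or socket is needed.

```python
import http.client


class RecordingConnection(http.client.HTTPConnection):
    """Hypothetical helper: records outgoing bytes instead of using a socket."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.sent = b""

    def send(self, data):
        # Overriding send() means no network connection is ever opened.
        self.sent += data

    def request_line(self):
        # The request line is everything up to the first CRLF.
        return self.sent.split(b"\r\n", 1)[0].decode("ascii")


def request_line_for(target):
    """Return the request line produced for a GET of the given target."""
    conn = RecordingConnection("localhost", 8000)
    conn.request("GET", target)
    return conn.request_line()


if __name__ == "__main__":
    # Only the path after the network location.
    print(request_line_for("/trac-dev"))
    # The full URL, as in the failing tracd log line.
    print(request_line_for("http://localhost:8000/trac-dev"))
```

A real regression test would have to go through `urllib.urlopen` itself (for instance against a local test server); this sketch only shows the shape of the assertion on the request line.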
.. [1] Trac
 (http://trac.edgewall.org)
History
Date                 User    Action  Args
2009-01-26 19:22:54  olemis  set     recipients: + olemis
2009-01-26 19:22:54  olemis  set     messageid: <1232997774.08.0.438260332945.issue5072@psf.upfronthosting.co.za>
2009-01-26 19:22:53  olemis  link    issue5072 messages
2009-01-26 19:22:51  olemis  create
