Created on 2009-01-26 19:22 by olemis, last changed 2022-04-11 14:56 by admin. This issue is now closed.
| Messages (9) | |||
|---|---|---|---|
| msg80586 - (view) | Author: Olemis Lang (olemis) | Date: 2009-01-26 19:22 | |
Hello ...
First of all, I searched the open issues and found nothing similar to
what I am about to report. If this ticket is a duplicate, I
apologize ...
Yesterday I was testing how to access the wiki pages in a
Trac [1]_ site and I realized that something was going wrong
(a bug? ...)
Initially the behavior was as follows:
{{{
#!python
>>> u = urllib.urlopen('http://localhost:8000/trac-dev')
>>> u.read()
'Environment not found'
>>> u.close()
}}}
And tracd reported a line like this
{{{
127.0.0.1 - - [25/Jan/2009 17:32:08] "GET http://localhost:8000/trac-dev HTTP/1.0" 404 -
}}}
which means that a 404 'Not found' status was sent back to the urllib
client.
I tried to access the same page from my browser and tracd reported
{{{
127.0.0.1 - - [25/Jan/2009 18:05:44] "GET /trac-dev HTTP/1.0" 200 -
}}}
The problem is obvious ... urllib was sending the full URL after GET
and it should send only the string after the network location.
I applied the following patch to urllib (yours will be better, I am
sure about that ;)
{{{
#!diff
--- /usr/lib/python2.5/urllib.py 2008-07-31 13:40:40.000000000 -0500
+++ /media/urllib_unix.py 2009-01-26 09:48:54.000000000 -0500
@@ -270,6 +270,7 @@
     def open_http(self, url, data=None):
         """Use HTTP protocol."""
         import httplib
+        from urlparse import urlparse
         user_passwd = None
         proxy_passwd= None
         if isinstance(url, str):
@@ -312,12 +313,17 @@
         else:
             auth = None
         h = httplib.HTTP(host)
+        target = ''.join(sep + part for sep, part in \
+                         zip(['', ';', '?', '#'], \
+                             urlparse(selector)[2:]) \
+                         if part)
+        print target
         if data is not None:
-            h.putrequest('POST', selector)
+            h.putrequest('POST', target)
             h.putheader('Content-Type', 'application/x-www-form-urlencoded')
             h.putheader('Content-Length', '%d' % len(data))
         else:
-            h.putrequest('GET', selector)
+            h.putrequest('GET', target)
         if proxy_auth: h.putheader('Proxy-Authorization', 'Basic %s' % proxy_auth)
         if auth: h.putheader('Authorization', 'Basic %s' % auth)
         if realhost: h.putheader('Host', realhost)
}}}
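For readers following along, here is a minimal sketch of what the `target` expression in the patch computes. It is written for Python 3, so it assumes `urllib.parse.urlparse` in place of the Python 2 `urlparse` module used in the patch, and the function name `request_target` is hypothetical. It only shows what the expression produces, not whether urllib should apply it.

```python
from urllib.parse import urlparse


def request_target(selector):
    """Rebuild 'path;params?query#fragment' from a URL, dropping scheme and host.

    Hypothetical helper: it mirrors the join/zip expression from the patch.
    """
    # urlparse(...)[2:] is the 4-tuple (path, params, query, fragment).
    separators = ["", ";", "?", "#"]
    parts = urlparse(selector)[2:]
    # Each non-empty component is prefixed with the separator that introduces it.
    return "".join(sep + part for sep, part in zip(separators, parts) if part)


print(request_target("http://localhost:8000/trac-dev"))
# -> /trac-dev
print(request_target("http://localhost:8000/trac-dev?action=edit#top"))
# -> /trac-dev?action=edit#top
```

Empty components are skipped, so a URL without parameters, query or fragment collapses to its path alone.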
And everything was «back» to normal ...
{{{
#!python
>>> u = urllib.urlopen('http://localhost:8000/trac-dev')
>>> u.read()
... # Lots of beautiful HTML code ;)
>>> u.close()
}}}
... tracd outputted ...
{{{
127.0.0.1 - - [25/Jan/2009 18:05:44] "GET /trac-dev HTTP/1.0" 200 -
}}}
The same behavior shows up with both Python 2.5.1 and 2.5.2 ...
I have not installed Python 2.6.x so I am not sure whether this
issue has propagated to newer versions of Python ... and I don't
know either whether this issue is also present in urllib2 ...
... so further research is needed, but IMO this is a serious bug :(
PS: If this is a bug ... how could it have stayed hidden so far? Is there
any test case written to assert this kind of thing? I checked the
`test.test_urllib` and `test.test_urllibnet` modules and I saw
nothing at all ...
.. [1] Trac
(http://trac.edgewall.org)
|
|||
| msg80588 - (view) | Author: Olemis Lang (olemis) | Date: 2009-01-26 19:28 | |
Ooops ... sorry, remove the print statement. The patch is as follows :
{{{
#!diff
--- /usr/lib/python2.5/urllib.py 2008-07-31 13:40:40.000000000 -0500
+++ /media/urllib_unix.py 2009-01-26 09:48:54.000000000 -0500
@@ -270,6 +270,7 @@
     def open_http(self, url, data=None):
         """Use HTTP protocol."""
         import httplib
+        from urlparse import urlparse
         user_passwd = None
         proxy_passwd= None
         if isinstance(url, str):
@@ -312,12 +313,17 @@
         else:
             auth = None
         h = httplib.HTTP(host)
+        target = ''.join(sep + part for sep, part in \
+                         zip(['', ';', '?', '#'], \
+                             urlparse(selector)[2:]) \
+                         if part)
         if data is not None:
-            h.putrequest('POST', selector)
+            h.putrequest('POST', target)
             h.putheader('Content-Type', 'application/x-www-form-urlencoded')
             h.putheader('Content-Length', '%d' % len(data))
         else:
-            h.putrequest('GET', selector)
+            h.putrequest('GET', target)
         if proxy_auth: h.putheader('Proxy-Authorization', 'Basic %s' % proxy_auth)
         if auth: h.putheader('Authorization', 'Basic %s' % auth)
         if realhost: h.putheader('Host', realhost)
}}}
I apologize once again ...
|
|||
| msg80600 - (view) | Author: Gabriel Genellina (ggenellina) | Date: 2009-01-27 00:09 | |
I could not reproduce this issue with either Python 2.6 or 2.5.2. If I print host and selector near line 313, I get 'localhost:8000' and '/trac-dev', the expected results. Do you have an HTTP proxy? running at the *same* port? (!) |
|||
| msg80651 - (view) | Author: Olemis Lang (olemis) | Date: 2009-01-27 14:02 | |
Actually I am using a proxy hosted on some other machine (i.e. not my
PC ... sorry, I didn't mention it :S ...) I «debugged» urllib and, when
branching at this point (see below ;) in URLopener.open_http :
{{{
#!python
# urllib.py
def open_http(self, url, data=None):
    """Use HTTP protocol."""
    import httplib
    user_passwd = None
    proxy_passwd= None
    if isinstance(url, str): # Branching here !!!!!!!!!!
        host, selector = splithost(url)
        if host:
            user_passwd, host = splituser(host)
            host = unquote(host)
        realhost = host
    else:
        host, selector = url
}}}
the `url` variable is bound to the following 2-tuple
{{{
#!python
('172.18.2.7:3128', 'http://localhost:8000/trac-dev')
}}}
My IP is 172.18.2.99 ... so the `else` branch is the one being executed
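A minimal sketch of that branch, written for Python 3 and heavily simplified, may make the two cases easier to compare. The function name `host_and_selector` is hypothetical, and `urllib.parse.urlsplit` stands in for the `splithost` helper seen in the snippet above; authentication and proxy-bypass handling are left out.

```python
from urllib.parse import urlsplit


def host_and_selector(url):
    """Hypothetical, simplified stand-in for the branch at the top of open_http."""
    if isinstance(url, str):
        # Direct connection: url looks like '//host:port/path?query'.
        parts = urlsplit(url)
        selector = parts.path
        if parts.query:
            selector += "?" + parts.query
        return parts.netloc, selector
    # Proxy case: url is already a (proxy_host, full_url) pair, so the
    # selector that ends up in the request line is the complete URL.
    host, selector = url
    return host, selector


print(host_and_selector("//localhost:8000/trac-dev"))
# -> ('localhost:8000', '/trac-dev')
print(host_and_selector(("172.18.2.7:3128", "http://localhost:8000/trac-dev")))
# -> ('172.18.2.7:3128', 'http://localhost:8000/trac-dev')
```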
If you need further details ... don't hesitate to ask anything you
want ;)
PS: What did you mean when you said?
> Do you have an HTTP proxy? running at the *same* port? (!)
I don't understand this, since *I already said* that *I accessed* my Trac
environment using my web browser (Opera 9.63, I don't know whether this
is relevant at all ... ), *I sent you* the lines printed by tracd to
stdout (or stderr ... I am not very sure right now ... ;) and *I told
you* that, once I applied the patch *I submitted*, everything was *back
to normal* ...
I don't understand how all this could be possible if I were running
tracd and an HTTP proxy on the *same* port, or even if the
`http_proxy` envvar were set to the hostname + port where my Trac
instance is listening for incoming connections ...
Anyway ... CMIIW ...
I also checked that immediately before executing the following
statements ...
{{{
#!python
# urllib.py
h = httplib.HTTP(host)
if data is not None:
    h.putrequest('POST', selector)
    h.putheader('Content-Type', 'application/x-www-form-urlencoded')
    h.putheader('Content-Length', '%d' % len(data))
else:
    h.putrequest('GET', selector)
}}}
... `selector` is bound to 'http://localhost:8000/trac-dev' ... BTW the
`else` clause *is the one executed* in this case, and this is
consistent with tracd reports *I sent before* and is logical since
`data` arg *is missing* in the code snippet I sent before.
|
|||
| msg80653 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2009-01-27 14:37 | |
I suppose 172.18.2.7:3128 is the address:port of your proxy, right? In that case, urllib seems to do the right thing. When talking to an HTTP proxy, requests are of the form "GET http://site.com/path", rather than "GET /path". It's up to the proxy to strip the host part of the URL when forwarding the request to the target server. (But I suppose tracd could also be more permissive and allow the "GET http://site.com/path" variant. It seems Apache does.) |
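The two request forms can be sketched as follows. This is an illustration in Python 3 only, not urllib's actual code; the function name `request_line` and its parameters are hypothetical.

```python
from urllib.parse import urlsplit


def request_line(url, via_proxy, method="GET", version="HTTP/1.0"):
    """Build the first line of an HTTP request (illustrative only)."""
    if via_proxy:
        # A proxy needs the full URL so it knows which server to contact.
        target = url
    else:
        # An origin server only needs the path and query.
        parts = urlsplit(url)
        target = parts.path or "/"
        if parts.query:
            target += "?" + parts.query
    return "%s %s %s" % (method, target, version)


print(request_line("http://localhost:8000/trac-dev", via_proxy=False))
# -> GET /trac-dev HTTP/1.0
print(request_line("http://localhost:8000/trac-dev", via_proxy=True))
# -> GET http://localhost:8000/trac-dev HTTP/1.0
```

The two printed lines correspond to the two request lines that tracd logged earlier in this thread.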
|||
| msg80654 - (view) | Author: Olemis Lang (olemis) | Date: 2009-01-27 15:11 | |
> Quoting Antoine Pitrou ...
> I suppose 172.18.2.7:3128 is the address:port of your proxy, right?
Yes ...
> In which case, urllib seems to do the right thing. When talking to an HTTP proxy, requests are of the form "GET http://site.com/path", rather than "GET /path". It's up to the proxy to strip the host part of the URL when forwarding the request to the target server.
This being said ...
> (but I suppose tracd could also be more permissive and allow the "GET http://site.com/path" variant. It seems Apache does)
... It works with Apache (I am talking about Trac once again ...) therefore I will report this issue to the Trac devs instead ...
Thanks a lot! Sorry if I caused you any trouble ... |
|||
| msg80683 - (view) | Author: Gabriel Genellina (ggenellina) | Date: 2009-01-28 00:38 | |
> > Do you have an HTTP proxy? running at the *same* port?
> (!)
>
> I dont understand this since *I already said* that *I
> accessed* my Trac
> environment using my web browser (Opera 9.63, I dont know
> whether this
> is relevant at all ... ), *I sent you* the lines outputted
> by tracd to
> stdout (or stderr ... I am not very sure right now ... ;)
> and *I told
> you* that, once I applied the path *I submitted*,
> everything was *back
> to normal* ...
If you had configured a proxy at localhost:8000, and *also* a Trac instance at that port, and Trac had "won the race" for the port, then you would observe exactly the symptoms you describe. That is, urllib talking to port 8000 as if it were a proxy, and the Trac instance actually listening there getting confused.
Your patch, as you surely understand now, is not correct; in fact, the code is OK as it is. urllib builds the request in that specific way *because* it thinks there is a proxy. If the proxy is buggy, misconfigured, or nonexistent, it's not the library's fault :)
--
Gabriel Genellina
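For anyone who lands here with the same symptoms, a minimal Python 3 sketch for checking which proxy would be picked for a URL, and for building an opener that ignores proxy settings, could look like this. The helper names `proxy_for` and `direct_opener` are hypothetical; `urllib.request.getproxies`, `ProxyHandler` and `build_opener` are standard-library calls. The sketch ignores `no_proxy` bypass rules, which the real library also consults. In the Python 2 `urllib` discussed in this thread, passing `proxies={}` to `urllib.urlopen` has a similar effect.

```python
import urllib.request
from urllib.parse import urlsplit


def proxy_for(url, proxies=None):
    """Return the proxy URL configured for this URL's scheme, or None.

    Hypothetical helper; by default it reads the detected proxy settings
    (for example the http_proxy environment variable).
    """
    if proxies is None:
        proxies = urllib.request.getproxies()
    return proxies.get(urlsplit(url).scheme)


def direct_opener():
    """Build an opener that talks to servers directly, skipping any proxy."""
    # An empty mapping tells ProxyHandler not to use the detected settings.
    return urllib.request.build_opener(urllib.request.ProxyHandler({}))


settings = {"http": "http://172.18.2.7:3128"}
print(proxy_for("http://localhost:8000/trac-dev", settings))
# -> http://172.18.2.7:3128
print(proxy_for("http://localhost:8000/trac-dev", {}))
# -> None
```

`direct_opener().open(url)` would then send the "GET /path" form straight to the server; that call is left out here because it needs a running server.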
|
|||
| msg81798 - (view) | Author: Daniel Diniz (ajaksu2) * (Python triager) | Date: 2009-02-12 18:37 | |
Anyone against closing this as "works for me"? |
|||
| msg82402 - (view) | Author: Senthil Kumaran (orsenthil) * (Python committer) | Date: 2009-02-18 01:57 | |
Yup, this should be closed too. Thanks. |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:56:44 | admin | set | github: 49322 |
| 2009-02-18 14:38:21 | ajaksu2 | set | status: pending -> closed |
| 2009-02-18 01:57:32 | orsenthil | set | messages: + msg82402 |
| 2009-02-18 01:52:33 | ajaksu2 | set | status: open -> pending; priority: low |
| 2009-02-12 18:37:06 | ajaksu2 | set | keywords: + patch; nosy: + ajaksu2, orsenthil; stage: test needed; messages: + msg81798; versions: + Python 2.6, - Python 2.5 |
| 2009-01-28 00:38:41 | ggenellina | set | messages: + msg80683 |
| 2009-01-27 15:11:52 | olemis | set | messages: + msg80654 |
| 2009-01-27 14:37:56 | pitrou | set | nosy: + pitrou; messages: + msg80653 |
| 2009-01-27 14:02:43 | olemis | set | messages: + msg80651 |
| 2009-01-27 00:09:17 | ggenellina | set | nosy: + ggenellina; messages: + msg80600 |
| 2009-01-26 19:28:43 | olemis | set | messages: + msg80588 |
| 2009-01-26 19:22:53 | olemis | create | |