
This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: urllib2 cannot handle https with proxy requiring auth
Type:
Stage: needs patch
Components: Library (Lib)
Versions: Python 3.10, Python 3.9, Python 3.8, Python 3.7, Python 3.6, Python 3.5, Python 2.7
process
Status: open
Resolution: accepted
Dependencies:
Superseder:
Assigned To: orsenthil
Nosy List: Jessica Ridgley, alexey.namyotkin, b.a.scott, dieresys, martin.panter, mbeachy, orsenthil, ronaldoussoren, tsujikawa, vzafzal, yan12125
Priority: normal
Keywords: patch

Created on 2009-11-09 04:38 by tsujikawa, last changed 2022-04-11 14:56 by admin.

Files
File name | Uploaded | Description
https_proxy_auth.patch | tsujikawa, 2009-11-09 06:56 | patch to send Proxy-Authorization header in CONNECT method (https proxy)
urllib2_with_proxy_auth_comparison.py | dieresys, 2009-12-23 22:16 |
2_7_x.patch | mbeachy, 2011-02-19 17:23 | 2.7 maintenance branch patch
monkey_2_6_4.py | mbeachy, 2011-02-19 17:25 | 2.6.4 monkey patch
urllib2_tests.tar.gz | b.a.scott, 2011-02-21 11:09 | Test code, results and instructions
http_proxy_https.patch | b.a.scott, 2011-02-21 11:19 | Fix handling of 407 and 401 in urllib2 and httplib
new_http_proxy_patch.py | vzafzal, 2014-02-21 13:45 |
Messages (24)
msg95058 - Author: Tatsuhiro Tsujikawa (tsujikawa) Date: 2009-11-09 04:38
urllib2 cannot handle HTTPS through a proxy that requires authentication.
With https_proxy set correctly, urlopen() fails:
Python 2.6.4 (r264:75706, Oct 29 2009, 15:38:25)
[GCC 4.4.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import urllib2
>>> c=urllib2.urlopen("https://sourceforge.net")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/urllib2.py", line 124, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib/python2.6/urllib2.py", line 389, in open
    response = self._open(req, data)
  File "/usr/lib/python2.6/urllib2.py", line 407, in _open
    '_open', req)
  File "/usr/lib/python2.6/urllib2.py", line 367, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.6/urllib2.py", line 1154, in https_open
    return self.do_open(httplib.HTTPSConnection, req)
  File "/usr/lib/python2.6/urllib2.py", line 1121, in do_open
    raise URLError(err)
urllib2.URLError: <urlopen error Tunnel connection failed: 407 Proxy Authentication Required>
This is because HTTPConnection::_tunnel() in httplib.py does not send a
Proxy-Authorization header.
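For illustration, the CONNECT request that such a proxy expects could be assembled roughly as follows. This is a minimal sketch in Python 3, not the httplib code: `build_connect_request` is a hypothetical helper, and Basic authentication is assumed.

```python
import base64


def build_connect_request(host, port, user=None, password=None):
    """Return the bytes of a CONNECT request (hypothetical helper)."""
    lines = [f"CONNECT {host}:{port} HTTP/1.0"]
    if user is not None:
        # Basic credentials are base64("user:password").
        raw = f"{user}:{password}".encode("latin-1")
        token = base64.b64encode(raw).decode("ascii")
        lines.append(f"Proxy-Authorization: Basic {token}")
    # A blank line terminates the header block.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("latin-1")


print(build_connect_request("sourceforge.net", 443, "user", "pass").decode("latin-1"))
```

Without the `Proxy-Authorization` line, a proxy that requires credentials answers the CONNECT with 407, which is the failure shown in the traceback.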
msg95060 - Author: Tatsuhiro Tsujikawa (tsujikawa) Date: 2009-11-09 06:56
I created a patch.
I added an additional argument 'headers' to the HTTPConnection::set_tunnel()
method, which is a mapping of HTTP headers to send with the CONNECT method.
Since the authorization credential is already set on the Request object,
AbstractHTTPHandler::do_open() passes the "Proxy-Authorization" header to
set_tunnel() if it is found.
It works fine for me.
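In Python 3 the same idea is available through the `headers` argument of `http.client.HTTPConnection.set_tunnel()`. The sketch below shows how a caller might use it; the proxy host, port, and credentials are placeholders, and the two helper functions are hypothetical names introduced only for this example. No network connection is opened until a request is made.

```python
import base64
import http.client


def basic_credentials(user, password):
    """Build the value of a Basic Proxy-Authorization header."""
    raw = f"{user}:{password}".encode("latin-1")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def tunnelled_connection(proxy_host, proxy_port, host, port, user, password):
    # Connect to the proxy; set_tunnel() records the real destination and
    # the extra headers to send with the CONNECT request.
    conn = http.client.HTTPSConnection(proxy_host, proxy_port)
    conn.set_tunnel(host, port, headers={
        "Proxy-Authorization": basic_credentials(user, password),
    })
    return conn


conn = tunnelled_connection(
    "proxy-example.com", 3128, "sourceforge.net", 443, "user", "pass"
)
# conn.request("GET", "/") would now issue CONNECT with the header attached.
```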
msg95373 - Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2009-11-17 10:40
The patch looks good to me.
IMHO this should be backported to 2.6 as well.
msg95377 - Author: Ronald Oussoren (ronaldoussoren) * (Python committer) Date: 2009-11-17 10:52
I've tested a backport of the patch to 2.6 (just replace set_proxy by
_set_proxy in the patch) and the resulting version of urllib2 can log in to
the proxy (as expected).
Thanks for the patch.
msg96659 - Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2009-12-20 06:05
Fixed and committed in revision 76908 on the trunk.
msg96660 - Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2009-12-20 07:22
Fixed through reversions r76908, r76909, r76910, r76911
Thanks for the patch, Tatsuhiro Tsujikawa.
msg96661 - Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2009-12-20 07:22
meant revisions.
msg96844 - Author: Manuel Muradás (dieresys) Date: 2009-12-23 22:11
Hi! The 2.6 backport is missing an argument in the _set_tunnel definition. It
should be:
    def _set_tunnel(self, host, port=None, headers=None):
msg96845 - Author: Manuel Muradás (dieresys) Date: 2009-12-23 22:16
The patch only fixes the case where you pass the authentication info in the
proxy handler's URL, like:
    proxy_handler = urllib2.ProxyHandler({'https': 'http://user:pass@proxy-example.com:3128/'})
But setting the authentication using a ProxyBasicAuthHandler is still
broken:
    proxy_auth_handler = urllib2.ProxyBasicAuthHandler()
    proxy_auth_handler.add_password('realm', 'proxy-example.com:3128', 'user', 'pass')
In the attached file (urllib2_with_proxy_auth_comparison.py) we've written
a comparison between what works with HTTP and HTTPS.
The problem is that the 407 error never reaches the ProxyBasicAuthHandler,
because HTTPConnection._tunnel raises an exception when the HTTP
response status code is not 200.
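For readers on Python 3, the two configurations compared above might be set up with `urllib.request` as in this sketch. It only builds the openers and sends nothing; the host, realm, and credentials are the placeholder values from the report.

```python
import urllib.request

PROXY = "proxy-example.com:3128"  # placeholder host from the report

# Variant 1: credentials embedded in the proxy URL (the case the patch covers).
url_opener = urllib.request.build_opener(
    urllib.request.ProxyHandler({"https": f"http://user:pass@{PROXY}/"})
)

# Variant 2: credentials supplied through a password manager
# (the case reported as still broken for HTTPS).
auth_handler = urllib.request.ProxyBasicAuthHandler()
auth_handler.add_password("realm", PROXY, "user", "pass")
handler_opener = urllib.request.build_opener(
    urllib.request.ProxyHandler({"https": f"http://{PROXY}/"}),
    auth_handler,
)
```

In the second variant the handler can only act if the 407 response is delivered to it, which is exactly what the exception raised in `_tunnel` prevents.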
msg96846 - Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2009-12-24 00:53
Thanks for the note, Manuel. Fixed it in revision 77013.
msg100840 - Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2010-03-11 09:31
In this ticket, setting the authentication using a ProxyBasicAuthHandler is not yet addressed (this was reported in the last note). Reopening this one to track it.
msg103090 - Author: Mike Beachy (mbeachy) Date: 2010-04-13 23:11
I have worked up a monkey patch for urllib2/httplib for the issue of setting the authentication using a Proxy(Basic|Digest)AuthHandler.
The basic approach was to create a new httplib exception (ProxyTunnelError) and raise that with the http response attached so that the HTTPSHandler can determine when 407 Proxy authentication required is present, and then reroute the urllib2.OpenerDirector to error handling mode.
Unfortunately, there is a backwards compatibility issue: cases where a socket.error was previously being raised now get a ProxyTunnelError. Not that you could do much useful with the socket.error in the first place, but I suppose you could look for '407' in the text. Ugh.
If you think this might prove useful, let me know and I can rework it into a real patch - just let me know what branch/version to base it off of. (My monkey patch is for python 2.6.4.)
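The approach described in this message might look roughly like the sketch below. It is not the attached monkey patch: the exception's fields, `open_tunnel`, and `needs_proxy_auth` are assumptions made for illustration, and the response is reduced to a status, a reason, and a header mapping.

```python
class ProxyTunnelError(Exception):
    """Hypothetical exception carrying the proxy's CONNECT response."""

    def __init__(self, status, reason, headers):
        super().__init__(f"Tunnel connection failed: {status} {reason}")
        self.status = status
        self.reason = reason
        self.headers = headers


def open_tunnel(status, reason, headers):
    """Stand-in for the tunnel step: raise unless the proxy answered 200."""
    if status != 200:
        raise ProxyTunnelError(status, reason, headers)


def needs_proxy_auth(status, reason, headers):
    """Return the challenge if the proxy asked for credentials, else None."""
    try:
        open_tunnel(status, reason, headers)
    except ProxyTunnelError as exc:
        if exc.status == 407:
            # A handler could now retry with Proxy-Authorization set.
            return exc.headers.get("Proxy-Authenticate")
        raise
    return None
```

Because the exception exposes the status code as data, callers no longer need to search the error text for '407'.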
msg128861 - Author: Mike Beachy (mbeachy) Date: 2011-02-19 17:23
I've been in contact w/ Barry Scott offline re: the monkey patch previously mentioned. I'm attaching a 2.7 maintenance branch patch that he has needed to extend, and plans to follow up on.
msg128862 - Author: Mike Beachy (mbeachy) Date: 2011-02-19 17:25
Attached to this comment (can you attach multiple files at once?) is the somewhat moldy 2.6.4 monkey patch, mercilessly ripped from my own code and probably not good for much.
msg128952 - Author: Barry Scott (b.a.scott) Date: 2011-02-21 10:41
The attached patch builds on Mike's work.
The core of the problem is that the Request object
did not know what was going on. This means that it
was not possible for get_authorization() to work
for proxy-auth and www-auth.
I changed Request to know which of the four types of
connection it represents. There are new methods on
Request that return the right information based on
the connection type.
To understand how to make this work I needed to
instrument the code. There is now a set_debuglevel
on the OpenerDirector object that turns on debug in
all the handlers and the director. I have added
more debug messages to help understand this code.
This code now passes the 72 test cases I run. I'll
attach the code I used to test as a follow up to this.
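As a rough illustration of a request that knows its connection type, consider the sketch below. The message does not list the four types, so the breakdown used here (HTTP or HTTPS, each either direct or through a proxy) is an assumption, as are all the names; the attached patch may be organised differently.

```python
from enum import Enum


class ConnectionType(Enum):
    # Assumed breakdown; the patch itself may name these differently.
    HTTP_DIRECT = "http"
    HTTPS_DIRECT = "https"
    HTTP_VIA_PROXY = "http+proxy"
    HTTPS_VIA_PROXY = "https+proxy"


def classify(scheme, proxy=None):
    """Pick the connection type from the URL scheme and an optional proxy."""
    via_proxy = proxy is not None
    if scheme == "https":
        return ConnectionType.HTTPS_VIA_PROXY if via_proxy else ConnectionType.HTTPS_DIRECT
    return ConnectionType.HTTP_VIA_PROXY if via_proxy else ConnectionType.HTTP_DIRECT


def auth_header_for(status):
    """Map a challenge status to the request header that answers it."""
    return {401: "Authorization", 407: "Proxy-Authorization"}.get(status)
```

With this information available, a 407 from the proxy and a 401 from the origin server can be answered with different credentials and different headers.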
msg128953 - Author: Barry Scott (b.a.scott) Date: 2011-02-21 11:09
Attached is the code I used to test these changes.
See the README.txt file for details, including
the results of a test run.
msg128955 - Author: Barry Scott (b.a.scott) Date: 2011-02-21 11:19
I left out some white space changes to match the style
of the std lib code. Reposting with white space cleanup.
msg211850 - Author: Vackar Afzal (vzafzal) Date: 2014-02-21 11:40
I've found that for the Python 2.6.x patch to play nicely with the popular requests library, I've had to set some defaults in the modified __init__ function so that it reads as follows:
    def __init__(self, *args, **kwargs):
        _orig_init(self, *args, **kwargs)
        self._tunnel_headers = {}
        self._tunnel_host = ''
        self._tunnel_port = ''
Also seems to work with Python 2.6.1. Note: Change the entry condition to:
    if os.environ.get('https_proxy', None) and sys.version_info[:2] == (2, 6):
msg211860 - Author: Vackar Afzal (vzafzal) Date: 2014-02-21 13:45
Also needed to make another minor update to the monkey patch.
Have uploaded the new files as new_http_proxy_patch.py for use with python 2.6.x
msg228406 - Author: Mark Lawrence (BreamoreBoy) * Date: 2014-10-03 22:49
Is there any work still needed here? Surely the 2.6.x patches can't be applied unless there are security issues?
msg244513 - Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-05-31 03:57
I believe this also affects Python 3; see Issue 24333.
I think making the CONNECT response object available to the caller is the right general approach. But I really dislike raising an exception that holds a socket connection to be closed. (I know this is already done with urllib.error.HTTPError; let’s not repeat this in the "http.client" module!)
Ideally, at the "http.client" level I would prefer to avoid all the special set_tunnel() calls, and use the usual request() and getresponse() API to make the CONNECT request. This way the response status and header fields would be available just like any other response. For this approach to work, we would probably need to add a new HTTPConnection.detach() method that released the original socket reader and writer, and add a way to create a new HTTPConnection instance using these socket objects. This enhancement probably wouldn’t be appropriate for Python 2 or a bug fix release. But it seems the cleanest approach to me, and may also allow using HTTPConnection with the Upgrade header (e.g. for opportunistic encryption, HTTP 2, Web etc), and proxying non-HTTP connections, as bonuses.
A less revolutionary approach might be to add a HTTPSConnection.tunnel() method, that always returns the proxy’s response, but only does the SSL wrapping for a successful response.
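A tunnel step that always returns the proxy's response could be sketched as below. This is not an existing `http.client` API: `tunnel` and `TunnelReply` are hypothetical, the SSL wrapping is left to the caller, and the demonstration uses a local socket pair in place of a real proxy.

```python
import collections
import http.client
import socket

TunnelReply = collections.namedtuple("TunnelReply", "status reason headers")


def tunnel(sock, host, port, headers=None):
    """Send CONNECT on a connected socket and return the proxy's reply.

    Hypothetical helper: it never raises for a non-200 status, so the
    caller can inspect a 407 and its headers before deciding what to do.
    """
    request = [f"CONNECT {host}:{port} HTTP/1.1", f"Host: {host}:{port}"]
    request += [f"{name}: {value}" for name, value in (headers or {}).items()]
    sock.sendall(("\r\n".join(request) + "\r\n\r\n").encode("latin-1"))

    # Closing the file object returned by makefile() leaves the socket open.
    with sock.makefile("rb") as reader:
        status_line = reader.readline().decode("latin-1").rstrip("\r\n")
        parts = status_line.split(" ", 2)
        status = int(parts[1])
        reason = parts[2] if len(parts) > 2 else ""
        reply_headers = http.client.parse_headers(reader)
    return TunnelReply(status, reason, reply_headers)


# Demonstration: one end of a socket pair plays a proxy that demands auth.
client, fake_proxy = socket.socketpair()
fake_proxy.sendall(
    b"HTTP/1.1 407 Proxy Authentication Required\r\n"
    b'Proxy-Authenticate: Basic realm="proxy"\r\n'
    b"Content-Length: 0\r\n\r\n"
)
reply = tunnel(client, "sourceforge.net", 443)
client.close()
fake_proxy.close()
```

A caller would wrap the socket in SSL only when `reply.status` is 200, and otherwise hand the status and headers to whatever authentication logic it has.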
msg247805 - Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-08-01 10:14
For the record, a while ago I think I made a patch implementing my HTTPConnection.detach() proposal. I can probably dig it up if anyone is interested.
However I gave up on fixing this bug in "urllib.request". As far as I understand it, the framework does not distinguish the 407 Proxy Authentication Required error of the initial proxy CONNECT request from any potential 407 response from a tunnelled connection. Perhaps a special case could be made; I think there are already lots of special cases. But the maze of urlopen() handlers is already too complicated and I decided this was too hard to bother working on. Sorry :)
msg283306 - Author: (yan12125) * Date: 2016-12-15 14:00
Modify target versions to bugfix and feature branches
msg374687 - Author: Alexey Namyotkin (alexey.namyotkin) * Date: 2020-08-02 19:06
It has been 5 years, and urllib3 is now widely used, but it has inherited this problem: if no authentication data has been received, the _tunnel method raises an OSError that does not contain the response headers, so the exception cannot be handled usefully. This is perhaps an obstacle to building a convenient proxy authentication mechanism in the widely used requests library (it would be nice to be able to just provide a proxy_auth argument, similar to how it is done for server authorization). Currently, if a user wants to send an HTTPS request through a proxy that requires complex authentication (Kerberos, NTLM, Digest, other) using urllib3, they must first send a separate request to the proxy, receive the response, extract the data needed to form the Proxy-Authorization header, generate this header, and pass it to the ProxyManager. With Requests the situation is worse, because you cannot pass proxy headers directly (https://github.com/psf/requests/issues/2708).
If we wanted to simplify proxy authentication for the user, where would we start? Do we need to change http.client so that the error raised by _tunnel contains the headers? Or is it not worth changing anything, and having the user prepare the Proxy-Authorization header in advance is the only correct path? Martin Panter, could you also give your opinion? Thank you in advance.
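The "extract the necessary data" step of the manual workaround described above might start like this sketch, which reads the challenges from a 407 reply and answers only the Basic scheme. The function names are hypothetical and the header values are made up; Kerberos, NTLM, and Digest need further exchanges that are not shown.

```python
import base64
from email.message import Message


def offered_schemes(headers):
    """List the auth schemes named in the Proxy-Authenticate challenges."""
    challenges = headers.get_all("Proxy-Authenticate") or []
    return [c.split(None, 1)[0].lower() for c in challenges if c.strip()]


def proxy_authorization(headers, user, password):
    """Return a Proxy-Authorization value, or None if Basic is not offered."""
    if "basic" in offered_schemes(headers):
        raw = f"{user}:{password}".encode("latin-1")
        return "Basic " + base64.b64encode(raw).decode("ascii")
    return None


# Headers as they might arrive with a 407 reply (made-up example values).
challenge = Message()
challenge["Proxy-Authenticate"] = "Negotiate"
challenge["Proxy-Authenticate"] = 'Basic realm="proxy"'

retry_value = proxy_authorization(challenge, "user", "pass")
```

The resulting value is what the user would then pass along as the `Proxy-Authorization` header on the second attempt.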
History
Date | User | Action | Args
2022-04-11 14:56:54 | admin | set | github: 51540
2020-08-02 19:06:17 | alexey.namyotkin | set | messages: + msg374687
2020-07-29 15:36:59 | Jessica Ridgley | set | nosy: + Jessica Ridgley; versions: + Python 3.8, Python 3.9, Python 3.10
2020-07-29 14:46:07 | alexey.namyotkin | set | nosy: + alexey.namyotkin
2016-12-15 14:58:30 | BreamoreBoy | set | nosy: - BreamoreBoy
2016-12-15 14:00:36 | yan12125 | set | messages: + msg283306; versions: + Python 3.7, - Python 2.6, Python 3.4
2015-12-11 07:25:49 | yan12125 | set | nosy: + yan12125
2015-08-01 10:14:33 | martin.panter | set | messages: + msg247805; stage: needs patch
2015-05-31 04:00:10 | martin.panter | link | issue24333 superseder
2015-05-31 03:57:45 | martin.panter | set | nosy: + martin.panter; messages: + msg244513; versions: + Python 3.4, Python 3.5, Python 3.6
2014-10-03 22:49:46 | BreamoreBoy | set | nosy: + BreamoreBoy; messages: + msg228406
2014-02-21 13:45:45 | vzafzal | set | files: + new_http_proxy_patch.py; messages: + msg211860
2014-02-21 11:40:19 | vzafzal | set | nosy: + vzafzal; messages: + msg211850
2011-02-21 11:19:31 | b.a.scott | set | files: - http_proxy_https.patch; nosy: ronaldoussoren, orsenthil, mbeachy, dieresys, tsujikawa, b.a.scott
2011-02-21 11:19:23 | b.a.scott | set | files: + http_proxy_https.patch; nosy: ronaldoussoren, orsenthil, mbeachy, dieresys, tsujikawa, b.a.scott; messages: + msg128955
2011-02-21 11:09:42 | b.a.scott | set | files: + urllib2_tests.tar.gz; nosy: ronaldoussoren, orsenthil, mbeachy, dieresys, tsujikawa, b.a.scott; messages: + msg128953
2011-02-21 10:41:27 | b.a.scott | set | files: + http_proxy_https.patch; nosy: + b.a.scott; messages: + msg128952
2011-02-19 17:25:17 | mbeachy | set | files: + monkey_2_6_4.py; nosy: ronaldoussoren, orsenthil, mbeachy, dieresys, tsujikawa; messages: + msg128862
2011-02-19 17:23:26 | mbeachy | set | files: + 2_7_x.patch; nosy: ronaldoussoren, orsenthil, mbeachy, dieresys, tsujikawa; messages: + msg128861
2010-04-13 23:11:40 | mbeachy | set | nosy: + mbeachy; messages: + msg103090
2010-03-11 09:31:44 | orsenthil | set | status: closed -> open; resolution: fixed -> accepted; messages: + msg100840
2010-02-22 16:13:43 | flox | link | issue7986 superseder
2009-12-24 00:53:30 | orsenthil | set | messages: + msg96846
2009-12-23 22:16:00 | dieresys | set | files: + urllib2_with_proxy_auth_comparison.py; messages: + msg96845
2009-12-23 22:11:04 | dieresys | set | nosy: + dieresys; messages: + msg96844
2009-12-20 07:22:49 | orsenthil | set | messages: + msg96661
2009-12-20 07:22:10 | orsenthil | set | status: open -> closed; messages: + msg96660
2009-12-20 06:06:00 | orsenthil | set | keywords: - needs review; resolution: accepted -> fixed; messages: + msg96659
2009-11-17 10:52:27 | ronaldoussoren | set | messages: + msg95377
2009-11-17 10:40:36 | ronaldoussoren | set | keywords: + needs review; nosy: + ronaldoussoren; messages: + msg95373
2009-11-15 09:21:35 | orsenthil | set | assignee: orsenthil; resolution: accepted; nosy: + orsenthil
2009-11-11 01:06:09 | tsujikawa | set | versions: + Python 2.7
2009-11-09 06:56:13 | tsujikawa | set | files: + https_proxy_auth.patch; keywords: + patch; messages: + msg95060
2009-11-09 04:38:16 | tsujikawa | create |
