
This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: urllib2 needs to remove scope from IPv6 address when creating Host header
Type: behavior Stage:
Components: Library (Lib) Versions: Python 2.7
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: Nosy List: JonathanGuthrie, gregory.p.smith, martin.panter, ngierman
Priority: normal Keywords:

Created on 2015-02-11 18:40 by ngierman, last changed 2022-04-11 14:58 by admin.

Messages (3)
msg235762 - Author: Neil Gierman (ngierman) Date: 2015-02-11 18:40
Using a scoped IPv6 address with urllib2 creates an invalid Host header that Apache will not accept.
 import json
 import urllib2
 IP = "fe80::0000:0000:0000:0001%eth0"
 req = urllib2.Request("http://[" + IP + "]/")
 req.add_header('Content-Type', 'application/json')
 # data is any JSON-serializable payload defined elsewhere
 res = urllib2.urlopen(req, json.dumps(data))
Apache will reject the above request because the Host header is "[fe80::0000:0000:0000:0001%eth0]". This behavior was reported to Apache at https://issues.apache.org/bugzilla/show_bug.cgi?id=35122 and the Apache devs will not fix this as there are new RFCs prohibiting scopes in the Host header. Firefox had the same issue and their fix was to strip out the scope from the Host header: https://bugzilla.mozilla.org/show_bug.cgi?id=464162 and http://hg.mozilla.org/mozilla-central/rev/bb80e727c531.
My suggestion is to change urllib2.py's do_request_ method from:
 if not request.has_header('Host'):
     request.add_unredirected_header('Host', sel_host)
to:
 if not request.has_header('Host'):
     request.add_unredirected_header('Host', re.compile(r"%.*$").sub("", sel_host, 1))
I have not tested this patch to urllib2.py; however, I am now using similar logic in my own code to override the Host header when I create the request:
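 import json
 import re
 import urllib2
 IP = "fe80::0000:0000:0000:0001%eth0"
 req = urllib2.Request("http://[" + IP + "]/")
 # Strip the "%eth0" scope before re-adding the brackets
 req.add_header('Host', '[' + re.compile(r"%.*").sub("", IP, 1) + ']')
 req.add_header('Content-Type', 'application/json')
 # data is any JSON-serializable payload defined elsewhere
 res = urllib2.urlopen(req, json.dumps(data))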
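One caveat about the proposed change: as noted above, the Host value derived from the URL still carries its brackets ("[fe80::0000:0000:0000:0001%eth0]"), so a pattern like "%.*$" applied to it would also remove the closing "]" and any port that follows. A minimal sketch of a bracket-aware variant is below. The helper name strip_zone_id is hypothetical and not part of urllib2; this is an untested-in-urllib2 illustration, not a proposed patch.

```python
import re

# Hypothetical helper, not part of urllib2. Matches "%zone" only when it
# sits immediately before the closing bracket of an IPv6 literal.
_ZONE_RE = re.compile(r"%[^\]]*(?=\])")


def strip_zone_id(sel_host):
    """Return sel_host with any "%zone" removed from inside the brackets.

    Hosts without a bracketed IPv6 literal are returned unchanged.
    """
    return _ZONE_RE.sub("", sel_host, 1)
```

With this, "[fe80::1%eth0]:8080" would become "[fe80::1]:8080", while a plain host name such as "example.com" is left alone.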
msg235768 - Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-02-11 21:03
I’m no IPv6 expert, but there seem to be a few standards:
* <https://tools.ietf.org/html/rfc6874> (Feb 2013). Encodes as http://[fe80::1%25eth0]/; says Windows uses this form. Also mentions the unencoded http://[fe80::1%eth0]/ form. Says that the HTTP Host header should not include the scope zone identifier, since it is not necessarily relevant to the server.
* <https://tools.ietf.org/html/draft-sweet-uri-zoneid-01> (Nov 2013). Encodes as http://[v1.fe80::1+eth0]/; says CUPS uses this form. Also acknowledges the RFC %25 form. Says that the Host header _should_ include the scope, to help with servers that send back self-referencing absolute URLs.
Also, I would probably find IP.split('%', 1)[0] easier to read than a regular expression.
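As a rough illustration of the split-based approach, the sketch below builds a Host value from an unbracketed address. The function name host_header_value is made up for this example. Because it cuts at the first "%", it treats the plain "%eth0" form and the RFC 6874 "%25eth0" form the same way; it does not attempt to handle the "v1.…+eth0" form from the draft.

```python
def host_header_value(ip_literal):
    """Return a bracketed Host value with any zone ID removed.

    ip_literal is an IPv6 address without brackets, for example
    "fe80::1%eth0". The name of this function is illustrative only.
    """
    # Everything from the first "%" onward is the zone ID, in either the
    # "%eth0" or the percent-encoded "%25eth0" spelling.
    address = ip_literal.split("%", 1)[0]
    return "[" + address + "]"
```

An address with no zone ID passes through unchanged apart from the added brackets.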
msg286334 - Author: Jonathan Guthrie (JonathanGuthrie) Date: 2017-01-26 21:22
Michael Sweet's draft RFC, which says that the scope should be included in the Host line, expired in May 2014, and I can't find any sign that it went any further. Does anyone have any updated information?
History
Date                 User             Action  Args
2022-04-11 14:58:12  admin            set     github: 67636
2017-01-26 21:22:28  JonathanGuthrie  set     nosy: + JonathanGuthrie; messages: + msg286334
2017-01-25 22:23:59  gregory.p.smith  set     nosy: + gregory.p.smith
2015-02-13 01:27:08  demian.brecht    set     nosy: - demian.brecht
2015-02-11 21:23:46  demian.brecht    set     nosy: + demian.brecht
2015-02-11 21:03:41  martin.panter    set     nosy: + martin.panter; messages: + msg235768
2015-02-11 18:40:23  ngierman         create
