This issue tracker has been migrated to GitHub and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.
Created on 2009-08-03 13:33 by albert, last changed 2022-04-11 14:56 by admin. This issue is now closed.
Files

| File name | Uploaded | Description |
|---|---|---|
| unsplit.py | albert, 2009-08-03 13:33 | patched function + some doc/test |
Messages (9)
| msg91222 | Author: albert Mietus (albert) | Date: 2009-08-03 13:33 |
The functions urlparse.url{,un}split() and urllib{,2}.open() do not work
together for relative, local files, due to a bug in urlunsplit.
Given a file f='./rel/path/to/file.html', it can be opened directly by
urllib.open(f), but not by urllib2, as the latter needs a scheme.
We can create a sound URL with split/unsplit and a default scheme:
f2=urlparse.urlunsplit(urlparse.urlsplit(f,'file')), which works in most
cases. HOWEVER, a bogus netloc is added for relative file paths.
I have isolated this "buggy" function, added some local test code, and
made a patch/workaround in my file 'unsplit.py', which is included. I hope
this will contribute to a real patch.
--Regards, Albert
ALbert Mietus
Don't send spam mail!
My mission: http://SoftwareBeterMaken.nl product, process & image.
My life in brief:
http://albert.mietus.nl/Doc/CV_ALbert.html
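A minimal sketch of the split/unsplit round trip described in this message. It is written for Python 3, where the `urlparse` module lives in `urllib.parse` (an assumption; the report targets Python 2). The string that `urlunsplit` returns for a relative path has differed between Python versions, so the sketch prints the result instead of stating one.

```python
from urllib.parse import urlsplit, urlunsplit

f = './rel/path/to/file.html'

# Split with 'file' as the default scheme; a relative path has no netloc.
parts = urlsplit(f, 'file')
print(parts.scheme, repr(parts.netloc), parts.path)

# Reassemble. Depending on the Python version, an empty '//' netloc marker
# (and a leading '/') may be inserted here, which turns the relative path
# into an absolute one.
f2 = urlunsplit(parts)
print(f2)
```

If the printed URL contains `///`, the interpreter in use shows the behavior the report complains about.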
| msg91402 | Author: albert Mietus (albert) | Date: 2009-08-07 12:41 |
There was a bug in the workaround:
if not ( scheme == 'file' and not netloc and url[0] != '/'):
-----------------------------------------=================--
The `and url[0] != '/'` part was missing (the line above is corrected).
The effect: split/unsplit of file:///path resulted in file:/path
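The attached unsplit.py is not reproduced in this thread, so the following is only a sketch of the idea behind the corrected condition, not the author's patch. The helper name `unsplit_relative_file` is hypothetical, and the code uses Python 3's `urllib.parse` instead of the Python 2 `urlparse` module. Unlike `url[0] != '/'`, it also guards against an empty path.

```python
from urllib.parse import urlsplit, urlunsplit

def unsplit_relative_file(parts):
    """Hypothetical helper: behaves like urlunsplit(), but never adds a
    netloc marker to a 'file' URL whose path is relative."""
    scheme, netloc, path, query, fragment = parts
    if scheme == 'file' and not netloc and path and not path.startswith('/'):
        # Relative local file: keep the path untouched after the scheme.
        url = 'file:' + path
        if query:
            url += '?' + query
        if fragment:
            url += '#' + fragment
        return url
    # Everything else (including absolute file paths) goes through the
    # standard library unchanged.
    return urlunsplit(parts)

relative = unsplit_relative_file(urlsplit('./rel/path/to/file.html', 'file'))
absolute = unsplit_relative_file(urlsplit('file:///path'))
print(relative)
print(absolute)
```

The second call covers the case from this message: an absolute `file:///path` must survive the round trip unchanged.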
| msg100175 | Author: Senthil Kumaran (orsenthil) * (Python committer) | Date: 2010-02-26 21:17 |
The bug here seems to me to be that urllib.urlopen() should not allow a relative file path like the one specified, f='./rel/path/to/file.html'. urllib2's behavior seems proper in that it raises an exception. According to the RFCs, local files are to be accessed by:
file://localhost/path/to/file
file:///path/to/file
Both are absolute paths to the file; in the second one, localhost is omitted. Let me see if urllib's urlopen can be made a little stricter.
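One way to get from a relative path to the absolute `file:///...` form mentioned here is `pathlib`, sketched below for Python 3. This is an illustration and not something proposed in the thread; the path is the example from the original report.

```python
from pathlib import Path

f = './rel/path/to/file.html'

# resolve() makes the path absolute against the current working directory
# (the file does not need to exist); as_uri() then renders it as a file URL.
url = Path(f).resolve().as_uri()
print(url)
```

The printed URL depends on the current working directory, so no fixed output is given.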
| msg151715 | Author: Roundup Robot (python-dev) (Python triager) | Date: 2012-01-21 03:43 |
New changeset f6008e936fbc by Senthil Kumaran in branch '2.7':
Fix Issue6631 - Disallow relative files paths in urllib*.open()
http://hg.python.org/cpython/rev/f6008e936fbc
| msg151716 | Author: Roundup Robot (python-dev) (Python triager) | Date: 2012-01-21 03:55 |
New changeset 4366c0df2c73 by Senthil Kumaran in branch '2.7':
NEWS entry for Issue6631
http://hg.python.org/cpython/rev/4366c0df2c73
New changeset 514994d7a9f2 by Senthil Kumaran in branch '3.2':
Fix Issue6631 - Disallow relative file paths in urllib urlopen
http://hg.python.org/cpython/rev/514994d7a9f2
| msg151726 | Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) | Date: 2012-01-21 08:35 |
Sorry, why was this change backported? Does this fix a specific issue in 2.7 or 3.2? On the contrary, it seems to me that code which (incorrectly) used urllib.urlopen() to allow both URLs and local files will suddenly break.
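Code that wants to accept both URLs and plain local paths can make the distinction explicit instead of relying on urlopen() being lenient. The sketch below assumes Python 3; the helper name `open_url_or_path` is hypothetical. The scheme check is deliberately naive: a Windows path such as `C:\dir\file` would be mistaken for a URL with scheme `c`.

```python
import os
import tempfile
from pathlib import Path
from urllib.parse import urlsplit
from urllib.request import urlopen

def open_url_or_path(target):
    """Hypothetical helper: open a URL, or a local path given without a scheme."""
    if not urlsplit(target).scheme:
        # No scheme: treat the string as a local path and build an
        # absolute file URL from it.
        target = Path(target).resolve().as_uri()
    return urlopen(target)

# Usage: write a temporary file, then open it by its plain path.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('hello')
try:
    with open_url_or_path(tmp.name) as response:
        content = response.read()
finally:
    os.remove(tmp.name)
print(content)
```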
| msg151727 | Author: Senthil Kumaran (orsenthil) * (Python committer) | Date: 2012-01-21 09:19 |
Actually, I saw this as a bug in urllib.urlopen; urllib2 had exhibited the proper behaviour previously. Now both will behave consistently. But you are right that an *incorrect* usage of urllib.urlopen would break in 2.7.2. If we need to be lenient on that incorrect usage, then this change can stay in the 3.x series, because urllib.request.urlopen is the interface users will be using there, and it can be reverted from 2.7. Personally, I am +/- 0 on reverting this in 2.7. Initially I saw this as a bug, but later, when I added tests for ValueError and checked them in, I realized that it can break some incorrect usages, as you say.
| msg241084 | Author: Martin Panter (martin.panter) * (Python committer) | Date: 2015-04-15 05:58 |
I’m confused about what the intention of this bug is. The normal urllib.request.urlopen() function (or equivalent) still allows file URLs with relative paths in the various Python versions I tried, ranging from 2.6 to 3.5:
>>> import urllib.request
>>> urllib.request.urlopen("file:README")
<addinfourl at 140061019639088 whose fp = <_io.BufferedReader name='README'>>
Passing a relative URL without a scheme seems to have never been supported, for a different reason:
>>> urllib.request.urlopen("README")
Traceback (most recent call last):
[. . .]
File "/home/proj/python/cpython/Lib/urllib/request.py", line 321, in _parse
raise ValueError("unknown url type: %r" % self.full_url)
ValueError: unknown url type: 'README'
Looking closer at the changes made here, they only seem to affect the urllib.request.URLopener class, which is listed as deprecated in 3.3 and never used by urlopen() as far as I know.
Personally, I would prefer to keep allowing relative paths in "file:" scheme URLs in urlopen(). This is useful e.g. if you don’t want to put the full working directory in a URL on the command line or whatever.
It is inconsistent that urljoin(), urlunsplit(), etc don’t support relative paths, but I would actually prefer that such support be added. (I think my patch for Issue 22852 would probably do the trick.) Currently, urljoin() reinterprets the path as absolute:
>>> from urllib.parse import urljoin
>>> urljoin("file:subdir/index.html", "link.html")
'file:///subdir/link.html'
| msg247907 | Author: Robert Collins (rbcollins) * (Python committer) | Date: 2015-08-02 22:31 |
test_relativelocalfile is still in place in the urllib tests, so it's affecting urlopen to this point. So I think the bug is fixed, at least to the extent of the original report. I'm going to close this.
History

| Date | User | Action | Args |
|---|---|---|---|
| 2022-04-11 14:56:51 | admin | set | github: 50880 |
| 2015-08-02 22:31:35 | rbcollins | set | status: open -> closed; nosy: + rbcollins; messages: + msg247907; stage: commit review -> resolved |
| 2015-04-15 05:58:14 | martin.panter | set | messages: + msg241084 |
| 2014-07-13 01:21:58 | martin.panter | set | nosy: + martin.panter |
| 2012-01-21 09:19:51 | orsenthil | set | status: pending -> open; messages: + msg151727 |
| 2012-01-21 08:35:31 | amaury.forgeotdarc | set | status: closed -> pending; nosy: + amaury.forgeotdarc; messages: + msg151726; stage: resolved -> commit review |
| 2012-01-21 03:57:05 | orsenthil | set | status: open -> closed; type: performance -> behavior; stage: resolved; resolution: fixed; versions: + Python 2.7, Python 3.2, Python 3.3 |
| 2012-01-21 03:55:58 | python-dev | set | messages: + msg151716 |
| 2012-01-21 03:43:26 | python-dev | set | nosy: + python-dev; messages: + msg151715 |
| 2012-01-21 03:42:49 | orsenthil | set | title: urlparse.urlunsplit() can't handle relative files (for urllib*.open() -> Disallow relative files paths in urllib*.open() |
| 2010-02-26 21:17:27 | orsenthil | set | assignee: orsenthil; messages: + msg100175; nosy: + orsenthil |
| 2009-08-07 12:41:03 | albert | set | messages: + msg91402 |
| 2009-08-03 13:33:02 | albert | create | |