This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: urlunparse(urlparse('x://')) now returns 'x:' instead of 'x://'
Type: behavior
Stage: test needed
Components:
Versions: Python 3.1, Python 3.2, Python 2.7, Python 2.6

process
Status: closed
Resolution: wont fix
Dependencies:
Superseder:
Assigned To: orsenthil
Nosy List: Michael Glassford, eric.araujo, orsenthil, r.david.murray
Priority: normal
Keywords: easy

Created on 2010-04-07 21:42 by Michael Glassford, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Messages (9)
msg102569 - Author: Michael Glassford (Michael Glassford) Date: 2010-04-07 21:42
An unfortunate side-effect of this change: 
http://svn.python.org/view/python/branches/release26-maint/Lib/urlparse.py?r1=66717&r2=78235
which was made to fix this issue: 
http://bugs.python.org/issue7904
is that urlparse.urlunparse(urlparse.urlparse('x://')) now returns 'x:' instead of 'x://', and urlparse.urlunparse(urlparse.urlparse('x:///y')) now returns 'x:/y' instead of 'x:///y'.
This behavior exists in at least Python 2.6 and 3.1, but not in 2.5.
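For readers reproducing the report on Python 3, where the module lives at urllib.parse, a minimal sketch of the round trip might look like the following. The helper names `round_trip` and `report` are illustrative and not part of the standard library, and the reassembled strings depend on the Python version in use, so no output is shown.

```python
from urllib.parse import urlparse, urlunparse


def round_trip(url):
    """Parse a URL and reassemble it, returning the reassembled string."""
    return urlunparse(urlparse(url))


def report(urls):
    """Return (original, reassembled, unchanged) tuples for each URL."""
    rows = []
    for u in urls:
        rebuilt = round_trip(u)
        rows.append((u, rebuilt, rebuilt == u))
    return rows


# The first two inputs are the ones from the report; the third has a
# non-empty netloc and is included for comparison.
for original, rebuilt, unchanged in report(["x://", "x:///y", "x://y/z"]):
    print(f"{original!r} -> {rebuilt!r} (unchanged: {unchanged})")
```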
msg102580 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-04-08 00:19
To fix this, urlparse would have to differentiate between a null netloc and no netloc characters specified at all. This could be done by using None for one and '' for the other. I'm not sure that behavior change could be backported to 2.6, though.
Did this issue actually cause a program failure for you in 2.6? If so the original patch might wind up getting classed as a regression for the 2.6 line.
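The None versus '' idea above can be sketched outside the standard library. This is a simplified illustration under stated assumptions, not how urlparse works: `Parts`, `split_keep_netloc` and `unsplit_keep_netloc` are hypothetical names, the scheme is taken to be everything before the first ':' with no validation, and query and fragment are left inside `rest`.

```python
from typing import NamedTuple, Optional


class Parts(NamedTuple):
    scheme: str
    netloc: Optional[str]  # None: no "//" present; "": "//" present but empty
    rest: str              # path, query and fragment, left unsplit


def split_keep_netloc(url: str) -> Parts:
    """Split url, remembering whether a '//' authority marker was present."""
    scheme, sep, rest = url.partition(":")
    if not sep:
        # No ':' at all, so treat the whole string as scheme-less.
        scheme, rest = "", url
    if rest.startswith("//"):
        # The authority ends at the first '/', '?' or '#' after the marker.
        end = len(rest)
        for ch in "/?#":
            pos = rest.find(ch, 2)
            if pos != -1:
                end = min(end, pos)
        return Parts(scheme, rest[2:end], rest[end:])
    return Parts(scheme, None, rest)


def unsplit_keep_netloc(parts: Parts) -> str:
    """Reassemble, emitting '//' whenever netloc is not None, even if empty."""
    url = parts.rest
    if parts.netloc is not None:
        url = "//" + parts.netloc + url
    if parts.scheme:
        url = parts.scheme + ":" + url
    return url
```

With this distinction, 'x://' splits to a netloc of '' and 'x:' to a netloc of None, so both reassemble to their original spelling.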
msg102589 - Author: Michael Glassford (Michael Glassford) Date: 2010-04-08 02:35
It caused a minor issue with the Schemes extension for Mercurial: the output changed, which caused a unit test to fail. I'm pretty sure I have a Mercurial patch to fix that issue, however.
msg102737 - Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2010-04-09 18:00
Hello Michael, 
Looking a bit deeper into this issue, I don't see that 'x://' and 'x:///y' qualify as valid URLs as per RFC 3986. (Well, urlparse has not been strictly conforming to it, but that is a different issue.)
If you look at section 3, it states the following for validity.
 hier-part = "//" authority path-abempty
           / path-absolute
           / path-rootless
           / path-empty
 
In those cases, I would assume that 'x://y', 'x:/y', 'x:/', '/' are valid URLs, but not the two examples you mentioned.
For issue7904, we just went by the RFC's definition to make that minor change, and it has resulted in this issue. I looked at the code to see if this can be addressed, but I see that your examples do not fit in as valid URLs.
Do you have any opinions on this?
We can just adjust test_urlparse.py a little like below, and you might fix the break in your code.
 def test_unparse_parse(self):
-    for u in ['Python', './Python','x-newscheme://foo.com/stuff']:
+    for u in ['Python', './Python','x-newscheme://foo.com/stuff','x://y','x:/y','x:/','/',]:
         self.assertEqual(urlparse.urlunsplit(urlparse.urlsplit(u)), u)
         self.assertEqual(urlparse.urlunparse(urlparse.urlparse(u)), u)
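For anyone running the same check on Python 3, a standalone version of the round-trip test might look like the sketch below. It assumes the urllib.parse module name used in Python 3; the class name `RoundTripTest` and the constant `ROUND_TRIP_URLS` are illustrative and are not the names used in CPython's own test suite.

```python
import unittest
from urllib.parse import urlparse, urlsplit, urlunparse, urlunsplit

# The URLs from the diff above, including the four added examples.
ROUND_TRIP_URLS = [
    'Python', './Python', 'x-newscheme://foo.com/stuff',
    'x://y', 'x:/y', 'x:/', '/',
]


class RoundTripTest(unittest.TestCase):
    def test_unparse_parse(self):
        for u in ROUND_TRIP_URLS:
            # subTest reports each failing URL separately.
            with self.subTest(url=u):
                self.assertEqual(urlunsplit(urlsplit(u)), u)
                self.assertEqual(urlunparse(urlparse(u)), u)
```

Saved to a file, it can be run with `python -m unittest` followed by the file name.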
msg102922 - Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2010-04-12 07:08
Added additional examples of valid urls in r79988 and branches under the roundtrip test cases.
Michael, RDM: If you have any comments on msg102737, let me know. Otherwise we can close this issue as wont-fix.
msg102932 - Author: Éric Araujo (eric.araujo) * (Python committer) Date: 2010-04-12 09:28
By the way, I’ve been meaning to file a bug against the Mercurial schemes extension since I saw how it misused URIs (i.e. "py://trunk/" should be "py:trunk/"); I hope it’s not too late now that this extension is shipped with the program.
Regards
msg102939 - Author: Michael Glassford (Michael Glassford) Date: 2010-04-12 11:50
> In those cases, I would assume that 'x://y', 'x:/y', 'x:/', '/'
> as valid URLS, but not the two examples you mentioned.
Only 2 comments about this:
1) Although the urlparse documentation does mention the relevant RFCs, on a quick read-through I don't see that it actually requires its input to be a valid URL.
2) Obviously, some code is using it for invalid URLs.
> Michael, RDM: If you have any comments on msg102737, 
> let me know. Otherwise we can close this issue as wont-fix.
I don't have a problem with this. I reported the issue to bring attention to the fact that the change affected real code and to see if anyone had suggestions for dealing with the problem in a better way (at the moment, I don't).
> By the way, I’ve been meaning to file a bug against the 
> Mercurial schemes extension...
Please do. I don't have anything to do with the Schemes extension except that I found this issue while investigating a unit test failure, but if you file an issue probably I will look into it--or if I don't, then someone else probably will.
msg102940 - Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2010-04-12 12:25
On Mon, Apr 12, 2010 at 11:50:57AM +0000, Michael Glassford wrote:
> 1) Although the urlparse documentation does mention the relevant RFCs, on a quick read-through I don't see that it actually requires its input to be a valid URL.
> 2) Obviously, some code is using it for invalid URLs.
Which is true, because in some cases like urlparse.urljoin(base, rel),
the rel is a relative url, and it undergoes the same parsing mechanism
applicable to any url, so there is no way a 'rigorous' check for a
valid url is happening.
Checks are done only for valid chars in the scheme and for other parsing
behaviours.
> I don't have a problem with this. I reported the issue to bring attention to the fact that the change affected real code and to see if anyone had suggestions for dealing with the problem in a better way (at the moment, I don't).
>
Yeah, I understand the situation with 'changes which break the
existing code', which we try to avoid whenever we can. I shall see if
this can be addressed without reverting any recent changes and while
still being compliant.
msg103141 - Author: Senthil Kumaran (orsenthil) * (Python committer) Date: 2010-04-14 19:09
It looks like any change for this specific case would make tests such as the ones for handling any scheme fail. I am less inclined to do that just to handle this case for an invalid url.
I suggest this be handled at the user end, by fixing the parsing there and constructing the correct url. I am closing this as wont-fix.
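One way to handle this at the user end is to wrap the round trip and put back a '//' that was present in the input but missing from the result. The sketch below is only an illustration for Python 3's urllib.parse: `round_trip_preserving_slashes` is a hypothetical helper, it covers only the empty-netloc inputs from this report, and it assumes the input has no leading whitespace.

```python
from urllib.parse import urlsplit, urlunsplit


def round_trip_preserving_slashes(url):
    """Split and rejoin url, restoring a '//' that the round trip dropped."""
    parts = urlsplit(url)
    rebuilt = urlunsplit(parts)
    # Length of the "scheme:" prefix. urlsplit lowercases the scheme but
    # does not change its length, so the offset is valid for both strings.
    start = len(parts.scheme) + 1 if parts.scheme else 0
    had_marker = url[start:start + 2] == "//"
    has_marker = rebuilt[start:start + 2] == "//"
    if had_marker and not has_marker:
        rebuilt = rebuilt[:start] + "//" + rebuilt[start:]
    return rebuilt
```

Because the helper compares the input with whatever urlunsplit returns, it leaves the result alone when the '//' is already there.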
History
Date                 User               Action  Args
2022-04-11 14:56:59  admin              set     github: 52586
2010-04-14 19:09:23  orsenthil          set     status: open -> closed
                                                resolution: wont fix
                                                messages: + msg103141
2010-04-12 12:25:17  orsenthil          set     messages: + msg102940
2010-04-12 11:50:55  Michael Glassford  set     messages: + msg102939
2010-04-12 09:28:55  eric.araujo        set     nosy: + eric.araujo
                                                messages: + msg102932
2010-04-12 07:08:08  orsenthil          set     messages: + msg102922
2010-04-09 18:00:03  orsenthil          set     assignee: orsenthil
                                                messages: + msg102737
2010-04-08 02:35:32  Michael Glassford  set     messages: + msg102589
2010-04-08 00:19:46  r.david.murray     set     priority: normal
                                                versions: + Python 2.7, Python 3.2
                                                keywords: + easy
                                                nosy: + r.david.murray, orsenthil
                                                messages: + msg102580
                                                stage: test needed
2010-04-07 21:42:04  Michael Glassford  create