Issue 8339: urlunparse(urlparse('x://')) now returns 'x:' instead of 'x://'

➜

This issue tracker has been migrated to GitHub , and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/52586

classification

Title:	urlunparse(urlparse('x://')) now returns 'x:' instead of 'x://'
Type:	behavior	Stage:	test needed
Components:	Versions:	Python 3.1, Python 3.2, Python 2.7, Python 2.6

process

Dependencies:	Superseder:
Status:	closed	Resolution:	wont fix
Assigned To:	orsenthil	Nosy List:	Michael Glassford, eric.araujo, orsenthil, r.david.murray
Priority:	normal	Keywords:	easy

Created on 2010年04月07日 21:42 by Michael Glassford, last changed 2022年04月11日 14:56 by admin. This issue is now closed.

Messages (9)
msg102569 - (view)	Author: Michael Glassford (Michael Glassford)	Date: 2010年04月07日 21:42
An unfortunate side-effect of this change: http://svn.python.org/view/python/branches/release26-maint/Lib/urlparse.py?r1=66717&r2=78235 which was made to fix this issue: http://bugs.python.org/issue7904 is that urlparse.urlunparse(urlparse.urlparse('x://')) now returns 'x:' instead of 'x://', and urlparse.urlunparse(urlparse.urlparse('x:///y')) now returns 'x:/y' instead of 'x:///y'. This behavior exists in at least Python 2.6 and 3.1, but not in 2.5.
msg102580 - (view)	Author: R. David Murray (r.david.murray) * (Python committer)	Date: 2010年04月08日 00:19
To fix this, urlparse would have to differentiate between a null netloc and no netloc characters specified at all. This could be done by using None for one an '' for the other. I'm not sure that behavior change could be backported to 2.6, though. Did this issue actually cause a program failure for you in 2.6? If so the original patch might wind up getting classed as a regression for the 2.6 line.
msg102589 - (view)	Author: Michael Glassford (Michael Glassford)	Date: 2010年04月08日 02:35
It caused a minor issue with the Schemes extension for Mercurial: the output changed, which caused a unit test to fail. I'm pretty sure I have a Mercurial patch to fix that issue, however.
msg102737 - (view)	Author: Senthil Kumaran (orsenthil) * (Python committer)	Date: 2010年04月09日 18:00
Hello Michael, Looking a bit deeper into this issue, I don't see that 'x://' and 'x:///y' qualifies as valid URLS as per RFC 3986. (Well, urlparse has been not strictly conforming to it, but that is a different issue) If you look at the section 3. it states the following for validity. hier-part = "//" authority path-abempty / path-absolute / path-rootless / path-empty In those cases, I would assume that 'x://y', x:/y','x:/','/' as valid URLS, but not the two examples you mentioned. For the issue7904, we had just gone by the definition of RFC to make that minor change and it has resulted in this issue. I looked at the code to see if this can be addressed, but I see that your examples did not fit in as valid urls. Do you have any opinions on this? We can just the test_urlparse.py a little like below, and you might fix the break your code. def test_unparse_parse(self): - for u in ['Python', './Python','x-newscheme://foo.com/stuff']: + for u in ['Python', './Python','x-newscheme://foo.com/stuff','x://y','x:/y','x:/','/',]: self.assertEqual(urlparse.urlunsplit(urlparse.urlsplit(u)), u) self.assertEqual(urlparse.urlunparse(urlparse.urlparse(u)), u)
msg102922 - (view)	Author: Senthil Kumaran (orsenthil) * (Python committer)	Date: 2010年04月12日 07:08
Added additional examples of valid urls in r79988 and branches under the roundtrip test cases. Michael, RDM: If you have any comments on msg102737, let me know. Otherwise we can close this issue as wont-fix.
msg102932 - (view)	Author: Éric Araujo (eric.araujo) * (Python committer)	Date: 2010年04月12日 09:28
By the way, I’ve been meaning to file a bug against the Mercurial schemes extension since I saw how it misused URIs (i.e. "py://trunk/" should be "py:trunk/"); I hope it’s not too late now that this extension is shipped with the program. Regards
msg102939 - (view)	Author: Michael Glassford (Michael Glassford)	Date: 2010年04月12日 11:50
> In those cases, I would assume that 'x://y', x:/y','x:/','/' > as valid URLS, but not the two examples you mentioned. Only 2 comments about this: 1) Although the urlparse documentation does mention the relevant RFCs, on a quick read-through I don't see that it actually requires its input to be a valid URL. 2) Obviously, some code is using it for invalid URLs. > Michael, RDM: If you have any comments on msg102737, > let me know. Otherwise we can close this issue as wont-fix. I don't have a problem with this. I reported the issue to bring attention to the fact that the change affected real code and to see if anyone had suggestions for dealing with the problem in a better way (at the moment, I don't). > By the way, I’ve been meaning to file a bug against the > Mercurial schemes extension... Please do. I don't have anything to do with the Schemes extension except that I found this issue while investigating a unit test failure, but if you file an issue probably I will look into it--or if I don't, then someone else probably will.
msg102940 - (view)	Author: Senthil Kumaran (orsenthil) * (Python committer)	Date: 2010年04月12日 12:25
On Mon, Apr 12, 2010 at 11:50:57AM +0000, Michael Glassford wrote: > 1) Although the urlparse documentation does mention the relevant RFCs, on a quick read-through I don't see that it actually requires its input to be a valid URL. > 2) Obviously, some code is using it for invalid URLs. Which is true, because in some cases like urlparse.urljoin(base, rel), the rel is a relative url it undergoes the same parsing mechanism applicable to any url and thus there is no way a 'rigorous' check for authorized url is happening. Checks are done for valid chars in scheme and other parsing behaviours. > I don't have a problem with this. I reported the issue to bring attention to the fact that the change affected real code and to see if anyone had suggestions for dealing with the problem in a better way (at the moment, I don't). > Yeah, I understand the situation with 'changes which break the existing code', which we try to avoid many times. I shall if this can be addressed without reverting any recent changes and still being complaint.
msg103141 - (view)	Author: Senthil Kumaran (orsenthil) * (Python committer)	Date: 2010年04月14日 19:09
It looks like any change for this specific case would make tests like handling anyscheme fail. I am less inclined to do that as opposed to handling this case for invalid url, I suggest this be handled at user end to fix the proper parsing and constructing correct url. I am closing this as wont-fix.

History
Date	User	Action	Args
2022年04月11日 14:56:59	admin	set	github: 52586
2010年04月14日 19:09:23	orsenthil	set	status: open -> closed resolution: wont fix messages: + msg103141
2010年04月12日 12:25:17	orsenthil	set	messages: + msg102940
2010年04月12日 11:50:55	Michael Glassford	set	messages: + msg102939
2010年04月12日 09:28:55	eric.araujo	set	nosy: + eric.araujo messages: + msg102932
2010年04月12日 07:08:08	orsenthil	set	messages: + msg102922
2010年04月09日 18:00:03	orsenthil	set	assignee: orsenthil messages: + msg102737
2010年04月08日 02:35:32	Michael Glassford	set	messages: + msg102589
2010年04月08日 00:19:46	r.david.murray	set	priority: normal versions: + Python 2.7, Python 3.2 keywords: + easy nosy: + r.david.murray, orsenthil messages: + msg102580 stage: test needed
2010年04月07日 21:42:04	Michael Glassford	create

homepage