This issue tracker has been migrated to GitHub and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer Guide.
Created on 2010-02-10 23:24 by mbloore, last changed 2022-04-11 14:56 by admin. This issue is now closed.
| Files | | |
|---|---|---|
| File name | Uploaded | Description |
| fix7904.txt | mbloore, 2010-02-11 18:20 | svn diff of fix and unit test against 2.7 repository. |
| fix7904-2.txt | mbloore, 2010-02-17 21:09 | svn diff of fix and unit test against 2.7 repository. |
| Messages (14) | |||
|---|---|---|---|
| msg99181 | Author: mARK (mbloore) | Date: 2010-02-10 23:24 |
urlparse.urlsplit('s3://example/files/photos/161565.jpg')
returns
('s3', '', '//example/files/photos/161565.jpg', '', '')
instead of
('s3', 'example', '/files/photos/161565.jpg', '', '')
according to RFC 3986, 's3' is a valid scheme name, so the '://' indicates a URL with netloc and path parts.
|
|||
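For reference, the expected result from the report can be checked on a current interpreter. This is a minimal sketch that assumes Python 3, where the `urlparse` module lives at `urllib.parse`; the 2.x releases discussed in this issue returned the first tuple shown above before the fix.

```python
from urllib.parse import urlsplit

url = "s3://example/files/photos/161565.jpg"
parts = urlsplit(url)

# Under the generic RFC 3986 syntax, the text between '//' and the next '/'
# is the authority (netloc); everything from that '/' onward is the path.
print(parts.scheme, parts.netloc, parts.path)
```
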
| msg99183 | Author: Ezio Melotti (ezio.melotti) * (Python committer) | Date: 2010-02-10 23:28 |
Thanks for the report, could you provide a patch with unit tests? |
|||
| msg99196 | Author: Senthil Kumaran (orsenthil) * (Python committer) | Date: 2010-02-11 03:48 |
Does s3 stand for the Amazon S3 service? urlparse does not have it in its list of known schemes yet. Does s3 have a specification as such, or is it modeled on one of the known schemes (like http or ftp)? s3 is a valid scheme name according to RFC 3986, but the urlparse module does not recognize it. If we say s3 is much like http, then it can be added to the list of known schemes. Does Amazon say anything about it? |
|||
| msg99198 | Author: mARK (mbloore) | Date: 2010-02-11 04:53 |
it's not actually necessary to have a list of known schemes. any url that has a double slash after the colon is expected to follow that with an authority section (what urlparse calls "netloc"), optionally followed by a path, which starts with a slash. there are various defined schemes with their own syntax within the URL framework, but one is free to invent new ones with the general form scheme://netloc/path |
|||
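The general form `scheme://netloc/path` described above needs no scheme list at all. As a rough sketch (not the code in `urlparse`), the regular expression given in RFC 3986, Appendix B can split any URI reference into its five components. The function name `generic_split` is hypothetical and exists only for this illustration.

```python
import re

# Regular expression from RFC 3986, Appendix B. Every part is optional,
# so the pattern matches any string.
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")


def generic_split(url):
    """Hypothetical helper: split a URL using only the generic syntax.

    Returns (scheme, netloc, path, query, fragment); absent parts are ''.
    """
    m = URI_RE.match(url)
    scheme, netloc, path, query, fragment = m.group(2, 4, 5, 7, 9)
    return (scheme or "", netloc or "", path, query or "", fragment or "")


print(generic_split("s3://example/files/photos/161565.jpg"))
print(generic_split("mailto:user@example.com"))
```

An authority is reported only when `//` follows the scheme, so a URL such as `mailto:user@example.com` keeps everything after the colon in the path.
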
| msg99229 | Author: mARK (mbloore) | Date: 2010-02-11 18:20 |
i have attached an svn diff of my (very simple!) fix and added unit test for python 2.7. |
|||
| msg99256 | Author: Senthil Kumaran (orsenthil) * (Python committer) | Date: 2010-02-12 02:58 |
Hello Mark, thanks for the patch. However, there are reasons why the check is "if scheme in uses_netloc and url[:2] == '//':". It cannot be replaced by just url[:2] == '//' as in your patch, because different protocols have different parsing requirements (e.g., some treat everything after the scheme as their path). The better way is to add 's3' to the uses_netloc list, which should work fine too. I shall add it and include your tests. Thanks. |
|||
| msg99265 | Author: R. David Murray (r.david.murray) * (Python committer) | Date: 2010-02-12 13:41 |
I think Mark is correct. RFC 3986 says:
When authority is present, the path must either be empty or begin with a slash ("/") character. When authority is not present, the path cannot begin with two slash characters ("//").
I think it would make sense to have urlparse fall back to doing a generic RFC 3986 parse when it does not recognize the scheme.
|
|||
| msg99290 | Author: mARK (mbloore) | Date: 2010-02-12 21:12 |
The case which prompted this issue was a purely private set of URLs, sent to me by a client but never sent to Amazon or anywhere else outside our systems (though I'm sure many others have invented this particular scheme for their own use). It would have been convenient if urlparse had handled it properly. That is true for any scheme one may invent at need. On second thought it does make sense to enforce the use of :// for the schemes in uses_netloc, but still not to ignore its meaning for other schemes. It also makes sense to add s3 to uses_netloc despite the fact that it is not (afaik) registered, since it is an obvious invention. I'll make another patch, but I don't have time to do it just now. |
|||
| msg99480 | Author: mARK (mbloore) | Date: 2010-02-17 21:09 |
Doing a fallback test for // would look like
if scheme in uses_netloc and url[:2] == '//' or url[:2] == '//':
but this is equivalent to
if url[:2] == '//':
i.e., an authority appears if and only if there is a // after the scheme. This still allows a uses_netloc scheme to appear without //. I have attached a patch against Python 2.7, svn revision 78212. It adds s3 to uses_netloc. |
|||
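The equivalence claimed above follows from operator precedence: `a and b or b` parses as `(a and b) or b`, which reduces to `b`. A small exhaustive check, written as a sketch with hypothetical function names, confirms it for all four input combinations.

```python
from itertools import product


def fallback_condition(in_uses_netloc, has_double_slash):
    # Python parses "a and b or b" as "(a and b) or b".
    return in_uses_netloc and has_double_slash or has_double_slash


def simplified_condition(in_uses_netloc, has_double_slash):
    # The scheme-list membership no longer affects the outcome.
    return has_double_slash


# Each row: (in_uses_netloc, has_double_slash, fallback result, simplified result)
results = [
    (a, b, fallback_condition(a, b), simplified_condition(a, b))
    for a, b in product([False, True], repeat=2)
]

for row in results:
    print(row)
```
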
| msg99560 | Author: Senthil Kumaran (orsenthil) * (Python committer) | Date: 2010-02-19 07:47 |
Fixed in r78234 and merged back to the other branches. I fell back to the RFC's definition of a scheme as anything before the ://. I did not see the need to add s3 specifically as a valid scheme, because s3 itself is not a registered scheme. So the fix should work for s3 and other undefined schemes as per the RFC. Thanks for the patch. |
|||
| msg104261 | Author: Tres Seaver (tseaver) * | Date: 2010-04-26 17:38 |
The fix for this bug breaks any code which worked with non-standard schemes in 2.6.4 (by working around the issue). This kind of backward incompatibility should be called out prominently in NEWS.txt (assuming that such a fix is considered appropriate in a third-dot release). |
|||
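To illustrate the kind of breakage described above: code that pulled the host out of the front of the path on 2.6.4 finds nothing there once the fix moves it into netloc. The thread does not show any affected application code, so the following is purely a hypothetical sketch (written for Python 3's `urllib.parse`; the helper name `split_host_and_path` is made up) of a workaround that tolerates both behaviors.

```python
from urllib.parse import urlsplit


def split_host_and_path(url):
    """Hypothetical helper: return (host, path) under either parsing behavior."""
    parts = urlsplit(url)
    if parts.netloc:
        # New behavior: the library already separated the authority.
        return parts.netloc, parts.path
    if parts.path.startswith("//"):
        # Old behavior: the authority was left at the front of the path.
        host, sep, rest = parts.path[2:].partition("/")
        return host, sep + rest
    # No authority present at all.
    return "", parts.path


print(split_host_and_path("s3://example/files/photos/161565.jpg"))
```
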
| msg105078 | Author: Éric Araujo (eric.araujo) * (Python committer) | Date: 2010-05-05 19:14 |
I remember seeing a discussion in the python-dev archives about that months or years ago. Someone pointed out to Guido that the new RFC removed the need for uses_netloc thanks to the generic syntax. Isn’t there already a bug about that? |
|||
| msg123300 | Author: Fred Drake (fdrake) (Python committer) | Date: 2010-12-03 22:33 |
Though msg104261 suggests this change be documented in NEWS.txt, it doesn't appear to have made it. Sure enough, we just found application code that this broke. |
|||
| msg123327 | Author: Senthil Kumaran (orsenthil) * (Python committer) | Date: 2010-12-04 10:02 |
On Fri, Dec 03, 2010 at 10:33:50PM +0000, Fred L. Drake, Jr. wrote:
> Though msg104261 suggests this change be documented in NEWS.txt, it
> doesn't appear to have made it.
Better late than never. I just added the NEWS entry in r87014 (py3k), r87015 (release31-maint), and r87016 (release27-maint). |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:56:57 | admin | set | github: 52152 |
| 2010-12-04 10:02:42 | orsenthil | set | messages: + msg123327 |
| 2010-12-03 22:33:47 | fdrake | set | nosy: + fdrake; messages: + msg123300 |
| 2010-05-05 19:14:32 | eric.araujo | set | nosy: + eric.araujo; messages: + msg105078 |
| 2010-04-26 17:38:54 | tseaver | set | nosy: + tseaver; messages: + msg104261 |
| 2010-02-19 07:47:30 | orsenthil | set | status: open -> closed; resolution: fixed; messages: + msg99560 |
| 2010-02-17 21:09:37 | mbloore | set | files: + fix7904-2.txt; messages: + msg99480 |
| 2010-02-12 21:13:58 | mbloore | set | nosy: orsenthil, ezio.melotti, mbloore, r.david.murray; components: + Library (Lib), - Extension Modules; versions: + Python 3.1, Python 3.2 |
| 2010-02-12 21:12:06 | mbloore | set | nosy: orsenthil, ezio.melotti, mbloore, r.david.murray; messages: + msg99290; components: + Extension Modules, - Library (Lib); versions: - Python 3.1, Python 3.2 |
| 2010-02-12 13:41:48 | r.david.murray | set | nosy: + r.david.murray; messages: + msg99265; versions: + Python 3.1, Python 3.2 |
| 2010-02-12 02:58:48 | orsenthil | set | nosy: orsenthil, ezio.melotti, mbloore; messages: + msg99256; components: + Library (Lib), - Extension Modules |
| 2010-02-11 18:20:37 | mbloore | set | files: + fix7904.txt; messages: + msg99229; title: urllib.urlparse mishandles novel schemes -> urlparse.urlsplit mishandles novel schemes |
| 2010-02-11 04:53:11 | mbloore | set | messages: + msg99198 |
| 2010-02-11 03:48:18 | orsenthil | set | assignee: orsenthil; messages: + msg99196; nosy: + orsenthil |
| 2010-02-10 23:28:06 | ezio.melotti | set | priority: normal; versions: + Python 2.6, Python 2.7, - Python 2.5; nosy: + ezio.melotti; messages: + msg99183; stage: test needed |
| 2010-02-10 23:24:49 | mbloore | create | |