Created on 2011-11-06 20:13 by davide.rizzo, last changed 2022-04-11 14:57 by admin.
| Files | | |
|---|---|---|
| File name | Uploaded | Description |
| issue13359.patch | krisys, 2011-11-09 11:26 | Percent-encoding of URLs to fix the reported issue. |
| issue13359.patch | maker, 2012-01-12 15:04 | |
| issue13359_py2.patch | maker, 2012-01-12 15:30 | |
| urllib-request-space-encode.diff | senko, 2013-07-06 10:08 | |
| Messages (10) | |||
|---|---|---|---|
| msg147180 | Author: Davide Rizzo (davide.rizzo) * | Date: 2011-11-06 20:13 | |
urllib2.urlopen('http://foo/url and spaces') will send an HTTP request line like this to the server:
GET /url and spaces HTTP/1.1
which the server obviously does not understand. This contrasts with urllib's behaviour, which replaces the spaces (' ') in the URL with '%20'.
Related: #918368 #1153027
|
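As a rough illustration of the report above, a caller can percent-encode the path before handing the URL to urlopen. This is only a sketch in Python 3 using `urllib.parse`; the helper name `encode_path` is hypothetical and is not taken from any of the attached patches. It assumes the path has not already been percent-encoded.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def encode_path(url):
    """Percent-encode only the path component of ``url``.

    Hypothetical helper for illustration. Assumes the path is NOT already
    percent-encoded; an existing '%' would be escaped a second time.
    """
    parts = urlsplit(url)
    return urlunsplit(parts._replace(path=quote(parts.path)))


encoded = encode_path('http://foo/url and spaces')
print(encoded)  # http://foo/url%20and%20spaces
```

With the encoded URL, the request line would carry `/url%20and%20spaces` instead of a path containing raw spaces.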
|||
| msg147349 | Author: Krishna Bharadwaj (krisys) | Date: 2011-11-09 11:26 | |
I have used the quote method to percent-encode the URL for spaces and similar characters. This is my first patch. Please let me know if there is anything wrong; I will correct and re-submit it. I ran test_urllib2.py, which reported OK for 34 tests. Changes are made in two places: 1. in the open method; 2. in the __init__ of the Request class, so that the same issue is addressed when Request objects are created. |
|||
| msg149441 | Author: Ramchandra Apte (Ramchandra Apte) * | Date: 2011-12-14 12:08 | |
Seems good. |
|||
| msg151126 | Author: Michele Orrù (maker) * | Date: 2012-01-12 15:04 | |
Patch attached for Python 3, with unit tests. |
|||
| msg151127 | Author: Mads Kiilerich (kiilerix) * | Date: 2012-01-12 15:10 | |
FWIW, I don't think it is a good idea to escape automatically. It will change the behaviour in a backward-incompatible way for existing applications that pass encoded URLs to this function. I think the existing behaviour is better. The documentation and the failure mode for passing URLs with spaces could, however, be improved. |
|||
| msg151129 | Author: Michele Orrù (maker) * | Date: 2012-01-12 15:30 | |
Here is the patch for Python 2. kiilerix, RFC 1738 explicitly says that the space character shall not be used. |
|||
| msg151131 | Author: Mads Kiilerich (kiilerix) * | Date: 2012-01-12 15:35 | |
Yes, the URL sent by urllib2 must not contain spaces. In my opinion the only way to handle that correctly is to not pass URLs with spaces to urlopen. Escaping the URLs is not a good solution, even if the API were to be designed from scratch. It would be better to raise an exception if it is passed an invalid URL. Note for example that '/' and the %-encoding of '/' are different, and it must thus be possible to pass a URL containing both to urlopen. That is not possible if it automatically escapes. |
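The objection above can be seen with Python 3's `urllib.parse`. The sketch below shows that quoting an already-encoded path escapes the '%' itself, and that decoding erases the distinction between '/' and '%2F'. The function `require_no_spaces` is a hypothetical stand-in for the "raise an exception" alternative; it is not an existing urllib API.

```python
from urllib.parse import quote, unquote

# A path that the caller has already percent-encoded on purpose:
# '%2F' is a literal slash inside one segment, '%20' is a space.
already_encoded = '/files/a%2Fb%20c'

# Automatic quoting escapes the '%' sign and changes the meaning.
double_encoded = quote(already_encoded)
print(double_encoded)  # /files/a%252Fb%2520c

# Decoding loses the difference between '/' and '%2F'.
print(unquote(already_encoded))  # /files/a/b c


def require_no_spaces(url):
    """Hypothetical validator: reject a URL with raw spaces instead of fixing it."""
    if ' ' in url:
        raise ValueError('URL contains unescaped spaces: %r' % url)
    return url


try:
    require_no_spaces('http://foo/url and spaces')
except ValueError as exc:
    print('rejected:', exc)
```

Under this approach the caller stays responsible for encoding, so a URL that deliberately mixes '/' and '%2F' is passed through untouched.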
|||
| msg183576 | Author: karl (karlcow) * | Date: 2013-03-06 03:20 | |
The issue with the current patch is that it escapes more than just the spaces, with possible indirect side effects. Anne van Kesteren is in the process of creating a parsing/writing specification for URLs. It is not finished, but I am putting it here for future reference: http://url.spec.whatwg.org/ |
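To make the side effect concrete: with its default arguments, `urllib.parse.quote` in Python 3 escapes every character outside its small safe set, including the '?', '=' and '&' that delimit a query string. The selector below is an invented example, and the spaces-only replacement is shown purely for comparison; neither is taken from the attached patches.

```python
from urllib.parse import quote

# Invented selector with a query string, for illustration only.
selector = '/search path?q=a b&lang=en'

# Default quote() keeps only letters, digits, '_.-~' and '/'.
fully_quoted = quote(selector)
print(fully_quoted)  # /search%20path%3Fq%3Da%20b%26lang%3Den

# Replacing only the spaces leaves the query delimiters intact.
spaces_only = selector.replace(' ', '%20')
print(spaces_only)  # /search%20path?q=a%20b&lang=en
```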
|||
| msg192400 | Author: Senko Rasic (senko) * | Date: 2013-07-06 10:08 | |
I vote for the parse method converting the spaces (and only the spaces) explicitly, for the following reasons:
* the spaces must be encoded for the server to accept them
* no user-encoded URL will ever have spaces in it
* space quoting is idempotent: quote(quote(' ')) == quote(' ')
* if the user got an exception from Request for an invalid URL containing spaces, the only thing he or she could do is quote the URL string
Here's a patch implementing this. The change allows for any whitespace character in the selector part of the URL (and in particular, '\n'), not only ' '.
|
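The idempotence argument above holds for a function that encodes only whitespace. It does not hold for the standard `urllib.parse.quote`, which also escapes the '%' produced by the first pass. The sketch below uses a hypothetical helper, `quote_whitespace`, restricted to ASCII whitespace as an assumption; it is not the code from the attached diff.

```python
import re
from urllib.parse import quote

# Assumption: only ASCII whitespace (space, \t, \n, \r, \x0b, \x0c) is encoded.
_WHITESPACE = re.compile(r'\s', re.ASCII)


def quote_whitespace(url):
    """Percent-encode whitespace characters and nothing else (hypothetical)."""
    return _WHITESPACE.sub(lambda m: quote(m.group(0)), url)


once = quote_whitespace('/url and\nspaces')
twice = quote_whitespace(once)
print(once)           # /url%20and%0Aspaces
print(once == twice)  # True: a second pass finds no whitespace left

# The general-purpose quote() is not idempotent, because '%' is escaped too.
print(quote(quote(' ')))  # %2520
```

Because the whitespace-only pass never touches '%', it can be applied to a URL the caller has already encoded without changing it.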
|||
| msg295066 | Author: Martin Panter (martin.panter) * (Python committer) | Date: 2017-06-03 05:54 | |
I think this could be merged with Issue 14826. Maybe it is sensible to handle all control characters the same way. |
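One way to read "handle all control characters the same way" is to apply a single check that covers the space together with the ASCII control range, and then apply one policy (reject or encode) to whatever it finds. The sketch below is an assumption about what such a check could look like; the character range and the name `find_disallowed` are illustrative and are not taken from Issue 14826 or from any patch here.

```python
import re

# Assumption: "control characters" means ASCII 0x00-0x1F and 0x7F,
# and the space (0x20) is grouped with them.
_DISALLOWED = re.compile(r'[\x00-\x20\x7f]')


def find_disallowed(url):
    """Return the distinct disallowed characters in ``url``, in order of first appearance."""
    seen = []
    for match in _DISALLOWED.finditer(url):
        ch = match.group(0)
        if ch not in seen:
            seen.append(ch)
    return seen


found = find_disallowed('http://foo/url and\tspaces\n')
print(found)  # [' ', '\t', '\n']
```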
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:57:23 | admin | set | github: 57568 |
| 2017-06-03 05:54:20 | martin.panter | set | nosy: + martin.panter; messages: + msg295066; resolution: duplicate; superseder: urlopen URL with unescaped space |
| 2013-07-06 10:08:12 | senko | set | files: + urllib-request-space-encode.diff; nosy: + senko; messages: + msg192400 |
| 2013-03-06 03:20:50 | karlcow | set | nosy: + karlcow; messages: + msg183576 |
| 2012-01-12 15:35:57 | kiilerix | set | messages: + msg151131 |
| 2012-01-12 15:30:04 | maker | set | files: + issue13359_py2.patch; messages: + msg151129 |
| 2012-01-12 15:10:58 | kiilerix | set | nosy: + kiilerix; messages: + msg151127 |
| 2012-01-12 15:04:33 | maker | set | files: + issue13359.patch; nosy: + maker; messages: + msg151126 |
| 2011-12-14 12:08:23 | Ramchandra Apte | set | nosy: + Ramchandra Apte; messages: + msg149441 |
| 2011-12-14 10:55:00 | sandro.tosi | set | nosy: + sandro.tosi |
| 2011-11-09 11:26:12 | krisys | set | files: + issue13359.patch; nosy: + krisys; messages: + msg147349; keywords: + patch |
| 2011-11-06 20:14:48 | ezio.melotti | set | nosy: + ezio.melotti; stage: test needed |
| 2011-11-06 20:13:46 | davide.rizzo | create | |