Message271719
| Author |
martin.panter |
| Recipients |
Björn.Lindqvist, martin.panter, orsenthil, r.david.murray |
| Date |
2016-07-31.02:37:10 |
| Message-id |
<1469932632.75.0.192070134574.issue27657@psf.upfronthosting.co.za> |
| In-reply-to |
| Content |
The main backward compatibility consideration would be Issue 754016, but I don’t agree with the changes made for it, and would support reverting them. The original bug reporter wanted urlparse("1.2.3.4:80", "http") to be treated as the URL http://1.2.3.4:80, but the IP address was being parsed as a scheme, so the default "http" scheme was ignored.
The original fix (r83701) affected any URL that had a digit 0–9 immediately after the "scheme:" prefix. In such URLs, the scheme component was no longer parsed. A test case for "path:80" was added, and a demonstration of not parsing any scheme from www.cwi.nl:80/%7Eguido/Python.html was added in the documentation.
Later, the logic was altered to test if the URL looked like an integer (revision 495d12196487, Issue 11467). This restored proper parsing of clsid:85bbd92o-42a0-1o69-a2e4-08002b30309d and mailto:1337@example.org, although another URL given, javascript:123, remains misparsed. The documentation was subsequently adjusted in Issue 16932 to just demonstrate www.cwi.nl/%7Eguido/Python.html being parsed as a path.
The logic was watered down to its current form by revision 9f6b7576c08c, Issue 14072. Now it tests for a non-digit anywhere after the scheme, so that tel:+31641044153 is again parsed properly. But it was pointed out that tel:1234 remains misparsed.
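To make the three stages easier to compare, here is a rough sketch of each heuristic as I have described it above. This is not the actual urllib.parse code: split_scheme(), its rule labels and SCHEME_CHARS are hypothetical names made up for illustration, and the scheme check is simplified.

```python
import string

# Simplified set of characters accepted in a scheme (an assumption of this sketch).
SCHEME_CHARS = set(string.ascii_letters + string.digits + "+-.")


def split_scheme(url, rule, default=""):
    """Return (scheme, rest) under one of the three historical heuristics.

    The rule labels are hypothetical names for this sketch:
      "leading-digit"  r83701: a digit right after "scheme:" suppresses the scheme
      "integer"        495d12196487: suppress only if the rest parses with int()
      "all-digits"     9f6b7576c08c: suppress only if the rest is all digits 0-9
    """
    head, colon, rest = url.partition(":")
    if not colon or not head or not set(head) <= SCHEME_CHARS:
        return default, url  # nothing that looks like "scheme:"

    if rule == "leading-digit":
        looks_like_port = rest[:1].isdigit()
    elif rule == "integer":
        try:
            int(rest)  # note: int() also accepts a leading "+" or "-"
            looks_like_port = True
        except ValueError:
            looks_like_port = False
    elif rule == "all-digits":
        looks_like_port = rest != "" and all(c in string.digits for c in rest)
    else:
        raise ValueError("unknown rule: %r" % (rule,))

    if looks_like_port:
        return default, url  # treat "host:port"-like input as having no scheme
    return head.lower(), rest


if __name__ == "__main__":
    urls = ["1.2.3.4:80", "mailto:1337@example.org", "javascript:123",
            "tel:+31641044153", "tel:1234"]
    for rule in ("leading-digit", "integer", "all-digits"):
        for url in urls:
            print(rule, url, split_scheme(url, rule, "http"))
```

Under all three rules, valid URLs such as tel:1234 lose their scheme, which is the point of my objection.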
What’s the next step in the watering-down process? All the attempts so far break valid URLs in favour of special-casing inputs that are not valid URLs. |
|