
This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer Guide.

Author soilandreyes
Recipients soilandreyes
Date 2014-11-12.10:23:30
SpamBayes Score -1.0
Marked as misclassified Yes
Message-id <1415787811.02.0.365080072485.issue22852@psf.upfronthosting.co.za>
In-reply-to
Content
urllib.parse can't handle URIs with empty #fragments. The empty fragment is dropped when parsing and is not reconstituted when the URI is reassembled.
http://tools.ietf.org/html/rfc3986#section-3.5 permits empty fragment strings:
 URI-reference = [ absoluteURI | relativeURI ] [ "#" fragment ]
 fragment = *( pchar / "/" / "?" )
It even specifies that component recomposition must distinguish between a component that is not defined and one that is an empty string:
http://tools.ietf.org/html/rfc3986#section-5.3
 Note that we are careful to preserve the distinction between a
 component that is undefined, meaning that its separator was not
 present in the reference, and a component that is empty, meaning that
 the separator was present and was immediately followed by the next
 component separator or the end of the reference.
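The recomposition rule in section 5.3 can be sketched in Python roughly as follows. This is only an illustration, assuming that undefined components are passed as None and empty ones as ''. The function name recompose is hypothetical and is not part of urllib.parse.

```python
from typing import Optional


def recompose(scheme: Optional[str], authority: Optional[str], path: str,
              query: Optional[str], fragment: Optional[str]) -> str:
    """Join URI components following the pseudocode in RFC 3986 section 5.3.

    None means "undefined" (separator absent); '' means "present but empty".
    """
    parts = []
    if scheme is not None:
        parts.append(scheme + ":")
    if authority is not None:
        parts.append("//" + authority)
    parts.append(path)  # the path is always defined, though it may be empty
    if query is not None:
        parts.append("?" + query)
    if fragment is not None:
        parts.append("#" + fragment)
    return "".join(parts)


print(recompose("http", "example.com", "/file", None, ""))    # empty fragment
print(recompose("http", "example.com", "/file", None, None))  # no fragment
```

Under this convention an empty fragment keeps its "#", while an undefined fragment contributes nothing to the result.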
This seems to be caused by missing components being represented as '' instead of None.
>>> import urllib.parse
>>> urllib.parse.urlparse("http://example.com/file#")
ParseResult(scheme='http', netloc='example.com', path='/file', params='', query='', fragment='')
>>> urllib.parse.urlunparse(urllib.parse.urlparse("http://example.com/file#"))
'http://example.com/file'
>>> urllib.parse.urlparse("http://example.com/file#").geturl()
'http://example.com/file'
>>> urllib.parse.urlparse("http://example.com/file# ").geturl()
'http://example.com/file# '
>>> urllib.parse.urlparse("http://example.com/file#nonempty").geturl()
'http://example.com/file#nonempty'
>>> urllib.parse.urlparse("http://example.com/file#").fragment
''
The suggested fix is to use None instead of '' to represent missing components, and to check with "if fragment is not None" instead of a truthiness test such as "if fragment".
The same issue applies to query and authority. E.g.
http://example.com/file? != http://example.com/file
... but be careful about the implications of
file:///file != file:/file
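One possible way to get this distinction when parsing is the regular expression given in Appendix B of RFC 3986, since Python's re module reports None for a group that did not take part in the match. The helper split_uri below is hypothetical: it is a sketch of the idea, not urllib.parse's behaviour and not a proposed patch.

```python
import re
from typing import Optional, Tuple

# Regular expression from RFC 3986, Appendix B.
# Groups: 2 = scheme, 4 = authority, 5 = path, 7 = query, 9 = fragment.
_URI_RE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


def split_uri(uri: str) -> Tuple[Optional[str], Optional[str], str,
                                 Optional[str], Optional[str]]:
    """Split a URI reference into (scheme, authority, path, query, fragment).

    Components whose separator is absent come back as None; components
    whose separator is present but followed by nothing come back as ''.
    """
    m = _URI_RE.match(uri)  # every part is optional, so this always matches
    return m.group(2), m.group(4), m.group(5), m.group(7), m.group(9)


for uri in ("http://example.com/file#", "http://example.com/file",
            "http://example.com/file?", "file:///file", "file:/file"):
    print(uri, split_uri(uri))
```

Here "file:///file" has an empty authority ('') while "file:/file" has an undefined one (None), which is the case that needs care if the representation changes.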
History
Date                 User          Action  Args
2014-11-12 10:23:31  soilandreyes  set     recipients: + soilandreyes
2014-11-12 10:23:31  soilandreyes  set     messageid: <1415787811.02.0.365080072485.issue22852@psf.upfronthosting.co.za>
2014-11-12 10:23:30  soilandreyes  link    issue22852 messages
2014-11-12 10:23:30  soilandreyes  create
