Message 69366 - Python tracker

Author	mgiuca
Recipients	loewis, mgiuca
Date	2008-07-07.01:45:23
Message-id	<1215395126.18.0.281556028365.issue3300@psf.upfronthosting.co.za>

Content
Point taken. But the RFC certainly doesn't say that ISO-8859-1 should be used. Since we're outputting a Unicode string in Python 3, we need to decode with some encoding, and UTF-8 seems the most sensible and standardised. (Even the existing test case at test_urllib.py:466 uses a UTF-8-encoded URL, and I had to fix it so that it decodes into a meaningful string.) Having said that, you may wish to use another encoding, and it is legal to do so. Therefore, I'd suggest we add an "encoding" argument to both quote and unquote, defaulting to "utf-8". Note that in the current implementation, unquote is not the inverse of quote, because quote uses UTF-8 to encode characters with code points >= 256, while unquote decodes them as ISO-8859-1. I think it's important that these two functions are inverses of each other.
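
To illustrate the round-trip property argued for above, here is a minimal sketch. It assumes the `encoding` keyword of `urllib.parse.quote` and `urllib.parse.unquote` as found in present-day Python 3, which is not necessarily the patch under discussion in this issue. The sample string and variable names are made up for the example.

```python
from urllib.parse import quote, unquote

# Illustrative input: one Latin-1 character plus two CJK characters.
original = "caf\u00e9/\u6f22\u5b57"

# With matching encodings (UTF-8 on both sides), unquote inverts quote.
encoded_utf8 = quote(original, encoding="utf-8")
roundtrip_utf8 = unquote(encoded_utf8, encoding="utf-8")

# With mismatched encodings (quote as UTF-8, unquote as ISO-8859-1),
# the bytes are reinterpreted and the original text is not recovered.
mismatched = unquote(encoded_utf8, encoding="iso-8859-1")

# Another encoding is fine as long as both sides agree on it.
# ISO-8859-1 can only represent code points below 256, so use a
# string that fits.
latin_text = "caf\u00e9"
encoded_latin = quote(latin_text, encoding="iso-8859-1")
roundtrip_latin = unquote(encoded_latin, encoding="iso-8859-1")

print(encoded_utf8)
print(roundtrip_utf8 == original)    # True
print(mismatched == original)        # False
print(encoded_latin)                 # caf%E9
print(roundtrip_latin == latin_text) # True
```

The point of the sketch is only that the pair behaves as inverses when the same encoding is passed to both calls, and silently produces mojibake when it is not.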
