On Fri, May 30, 2008 at 02:19:23PM +0200, Georg Brandl wrote:
> Python 3.0's urllib.quote() and unquote() handle non-ASCII data strangely.
> quote() encodes characters with codepoint < 256 using latin-1, but others
> using utf-8. unquote() decodes everything using latin-1.
>
> Is the correct behavior to always use utf-8?

Always UTF-8. See
http://en.wikipedia.org/wiki/Percent-encoding#Current_standard

Oleg.
-- 
Oleg Broytmann    http://phd.pp.ru/    phd at phd.pp.ru
Programmers don't die, they just GOSUB without RETURN.
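
For illustration only, here is a minimal sketch of what "always UTF-8" looks like in practice. It assumes a later Python 3 release, where these functions live in urllib.parse and take an encoding argument that defaults to UTF-8; it does not reproduce the 3.0 behaviour described in the quoted message. The sample string is made up for the example.

```python
from urllib.parse import quote, unquote

# Sample text (hypothetical): one character below codepoint 256 and one above.
text = "café ☃"

# quote() encodes every non-ASCII character as UTF-8 bytes by default,
# regardless of codepoint.
encoded = quote(text)
print(encoded)  # caf%C3%A9%20%E2%98%83

# unquote() decodes percent-escapes as UTF-8 by default, so the round trip
# gives back the original string.
decoded = unquote(encoded)
print(decoded == text)  # True

# For contrast: the same "é" encoded as latin-1 is a single byte.
print(quote("é", encoding="latin-1"))  # %E9
```

Using one encoding for every character means the decoder never has to guess which encoding a given escape sequence was written in.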