Message 97130 - Python tracker



Author	lemburg
Recipients	ezio.melotti, lemburg, rhansen
Date	2010-01-02 14:46:41
SpamBayes Score	4.0745185e-14
Marked as misclassified	No
Message-id	<4B3F5C50.2010607@egenix.com>
In-reply-to	<1262316507.32.0.63969926675.issue7615@psf.upfronthosting.co.za>

Content

Richard Hansen wrote:
> 
> New submission from Richard Hansen <rhansen@bbn.com>:
> 
> The description of the unicode_escape codec says that it produces "a
> string that is suitable as Unicode literal in Python source code." [1] 
> Unfortunately, this is not true as it does not escape quotes. For example:
> 
> print u'a\'b"c\'\'\'d"""e'.encode('unicode_escape')
> 
> outputs:
> 
> a'b"c'''d"""e
Indeed. Python only uses the decoder of that codec internally.
> I have attached a patch that fixes this issue by escaping single quotes.
> With the patch applied, the output is:
> 
> a\'b"c\'\'\'d"""e
> 
> I chose to only escape single quotes because:
> 1. it simplifies the patch, and
> 2. it matches string_escape's behavior.
If we change this, the encoder should escape both single and double
quotes, simply because it is not known whether the literal
will use single or double quotes.
The raw_unicode_escape codec would have to be fixed as well.
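
To illustrate the point about escaping both quote styles, here is a minimal sketch in Python 3 syntax. It is not the attached patch and not the codec's actual behavior: escape_for_literal is a hypothetical helper name, and it post-processes the output of the existing unicode_escape encoder rather than changing the codec itself.

```python
import ast


def escape_for_literal(text):
    """Hypothetical helper: escape text for use inside '...' or "..." literals."""
    # unicode_escape already doubles backslashes and escapes non-printable
    # and non-ASCII characters, but it leaves quote characters alone.
    escaped = text.encode("unicode_escape").decode("ascii")
    # Escape both quote styles, since the caller's delimiter is unknown.
    return escaped.replace("'", "\\'").replace('"', '\\"')


sample = 'a\'b"c\'\'\'d"""e'

# The stock codec leaves every quote in this sample unescaped.
raw = sample.encode("unicode_escape").decode("ascii")
assert raw == sample

escaped = escape_for_literal(sample)
# The result parses back to the original text with either delimiter.
assert ast.literal_eval("'" + escaped + "'") == sample
assert ast.literal_eval('"' + escaped + '"') == sample
```

Because the encoder has already doubled every backslash, adding a backslash in front of each quote afterwards cannot create an ambiguous escape sequence.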

History
Date	User	Action	Args
2010-01-02 14:46:44	lemburg	set	recipients: + lemburg, ezio.melotti, rhansen
2010-01-02 14:46:42	lemburg	link	issue7615 messages
2010-01-02 14:46:42	lemburg	create
