Message 195937 - Python tracker

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

In-reply-to
Author	ncoghlan
Recipients	ncoghlan
Date	2013-08-23.04:02:31
Message-id	<1377230552.02.0.907422956718.issue18814@psf.upfronthosting.co.za>

Content

Prompted by issue 18713 and http://lucumr.pocoo.org/2013/7/2/the-updated-guide-to-unicode/, here are some possible utilities we could add to the codecs module to help deal with/debug issues related to surrogate escaped strings:
    def has_escaped_bytes(s):
        """Returns true if string contains surrogate escaped bytes"""
        ...

    def replace_escaped_bytes(s):
        """Replaces each surrogate escaped byte with a valid code point"""
        ...

    def decode_escaped_bytes(s, nominal_encoding, actual_encoding):
        """Reinterprets incorrectly decoded text using a new encoding"""
        return s.encode(nominal_encoding, 'surrogateescape').decode(actual_encoding)
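
For illustration only, here is one way the two elided bodies might be filled in. This is a rough sketch, not a proposed implementation: it assumes that "surrogate escaped bytes" means the lone surrogates U+DC80 through U+DCFF that the 'surrogateescape' error handler produces, and the `replacement` parameter (defaulting to U+FFFD) is a hypothetical addition that is not part of the signatures above.

```python
# Sketch only: the range check and the `replacement` parameter are assumptions.

def _is_escaped_byte(ch):
    # 'surrogateescape' maps an undecodable byte 0xNN (0x80..0xFF) to U+DCNN.
    return '\udc80' <= ch <= '\udcff'

def has_escaped_bytes(s):
    """Returns true if string contains surrogate escaped bytes"""
    return any(_is_escaped_byte(ch) for ch in s)

def replace_escaped_bytes(s, replacement='\ufffd'):
    """Replaces each surrogate escaped byte with a valid code point"""
    # `replacement` is a hypothetical parameter; U+FFFD is one plausible default.
    return ''.join(replacement if _is_escaped_byte(ch) else ch for ch in s)

def decode_escaped_bytes(s, nominal_encoding, actual_encoding):
    """Reinterprets incorrectly decoded text using a new encoding"""
    return s.encode(nominal_encoding, 'surrogateescape').decode(actual_encoding)

# Latin-1 bytes that were wrongly decoded as ASCII with surrogateescape.
raw = b'caf\xe9'
text = raw.decode('ascii', 'surrogateescape')   # the 0xE9 byte becomes U+DCE9

flagged = has_escaped_bytes(text)
cleaned = replace_escaped_bytes(text)
fixed = decode_escaped_bytes(text, 'ascii', 'latin-1')
```

In this example the escaped byte is detected, replacing it gives a string that can be encoded or printed safely, and re-decoding with the actual encoding recovers the original text.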

History
Date	User	Action	Args
2013-08-23 04:02:32	ncoghlan	set	recipients: + ncoghlan
2013-08-23 04:02:32	ncoghlan	set	messageid: <1377230552.02.0.907422956718.issue18814@psf.upfronthosting.co.za>
2013-08-23 04:02:31	ncoghlan	link	issue18814 messages
2013-08-23 04:02:31	ncoghlan	create