Message 130498 - Python tracker


In-reply-to
Author	ply
Recipients	ply
Date	2011-03-10.10:19:57
SpamBayes Score	5.2878697e-08
Marked as misclassified	No
Message-id	<1299752398.22.0.888282532941.issue11461@psf.upfronthosting.co.za>

Content

Reading a UTF-16 text file with the 'codecs' module fails if a surrogate pair straddles the 72-byte read boundary.
The attached Python script fails with the message:
UnicodeDecodeError: 'utf16' codec can't decode bytes in position 70-71: unexpected end of data
The cause is that readline() splits the input data into chunks, namely
 readsize = size or 72
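
The attached script is not reproduced in this message, so the following is only a minimal sketch of how such an input might be built. It assumes Python 3 syntax, and the names build_sample, read_lines and READSIZE and the sample text are hypothetical. A 2-byte BOM followed by 34 BMP characters puts the surrogate pair at bytes 70-73, so a 72-byte chunk ends between its two halves. On an affected interpreter the readline() call is expected to raise the error quoted above; on an interpreter whose decoder keeps a partial pair until the next read, the line comes back intact.

```python
import codecs
import io

READSIZE = 72  # default chunk size quoted in the report: readsize = size or 72


def build_sample(prefix_chars=34):
    """Return (text, data) where a surrogate pair straddles byte 72."""
    # 2-byte BOM + 34 BMP characters (68 bytes) places the pair at bytes 70-73.
    text = "a" * prefix_chars + "\U00010000" + "\n"
    return text, codecs.BOM_UTF16_LE + text.encode("utf-16-le")


def read_lines(data):
    """Read every line through the codecs StreamReader.readline() method."""
    reader = codecs.getreader("utf-16")(io.BytesIO(data))
    # readline() returns an empty string once the stream is exhausted.
    return list(iter(reader.readline, ""))


text, data = build_sample()

# The first 72-byte chunk ends after the high surrogate only, so decoding
# that chunk on its own, as if it were complete, fails.
try:
    data[:READSIZE].decode("utf-16")
except UnicodeDecodeError as exc:
    print("first chunk alone:", exc.reason)

# Raises UnicodeDecodeError on an affected interpreter; otherwise reports
# whether the line survived the chunked read unchanged.
print(read_lines(data) == [text])
```

Changing prefix_chars to 33 or 35 moves the pair off the boundary, which is a quick way to check that the failure depends on the position of the pair rather than on its presence.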

History
Date	User	Action	Args
2011-03-10 10:19:58	ply	set	recipients: + ply
2011-03-10 10:19:58	ply	set	messageid: <1299752398.22.0.888282532941.issue11461@psf.upfronthosting.co.za>
2011-03-10 10:19:57	ply	link	issue11461 messages
2011-03-10 10:19:57	ply	create
