Author vstinner
Recipients ezio.melotti, methane, serhiy.storchaka, vstinner
Date 2015-09-29.11:30:33
Message-id <1443526234.91.0.845116536507.issue25267@psf.upfronthosting.co.za>
In-reply-to
Content
The attached patch optimizes the UTF-8 encoder for the following error handlers: ignore, replace, surrogateescape, surrogatepass. It is based on the patch faster_surrogates_hadling.patch written by Serhiy Storchaka in issue #24870.
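For reference, a minimal sketch of what these four error handlers produce when the UTF-8 encoder meets a lone surrogate. The sample string is an assumption chosen for illustration; it is not taken from the patch or its tests.

```python
# A lone surrogate (U+DC80) cannot be encoded to UTF-8 in strict mode,
# so the encoder has to call the error handler for it.
text = "a\udc80b"

results = {
    handler: text.encode("utf-8", handler)
    for handler in ("ignore", "replace", "surrogateescape", "surrogatepass")
}

for handler, encoded in results.items():
    print(f"{handler:16} {encoded!r}")

# ignore           b'ab'               (character dropped)
# replace          b'a?b'              (replaced with '?')
# surrogateescape  b'a\x80b'           (U+DC80 mapped back to byte 0x80)
# surrogatepass    b'a\xed\xb2\x80b'   (surrogate encoded as 3 bytes)
```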
It also modifies unicode_encode_ucs1() to use memset() for the replace error handler. It should be faster for long sequences of unencodable characters, but it may be slower for short sequences of unencodable characters.
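As a rough illustration of the two cases, assuming unicode_encode_ucs1() is the C function behind the latin-1 codec: with the replace handler, every unencodable character becomes one '?' byte, so a long run of such characters turns into a long run of identical bytes, which is the situation where filling with memset() would be expected to help. The sample strings are made up for this sketch.

```python
# U+20AC (EURO SIGN) is outside the latin-1 range, so the error handler
# is called for it.
short_run = "abc\u20acdef"      # one unencodable character
long_run = "\u20ac" * 1000      # a long sequence of unencodable characters

encoded_short = short_run.encode("latin-1", "replace")
encoded_long = long_run.encode("latin-1", "replace")

print(encoded_short)                  # b'abc?def'
print(encoded_long == b"?" * 1000)    # True
```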
The patch adds new unit tests and fixes existing unit tests to ensure that the utf-8-sig codec is also well tested.
TODO: write a benchmark.
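One possible shape for such a benchmark, as a sketch only: the helper name bench_encode, the sample strings, and the iteration counts are all assumptions, not part of the patch. It times text.encode("utf-8", errors) for each handler on input with short and long runs of lone surrogates, and would need to be run on both patched and unpatched builds to compare.

```python
import timeit


def bench_encode(text, errors, number=200):
    """Return the best time in seconds for `number` encode calls (hypothetical helper)."""
    timer = timeit.Timer(lambda: text.encode("utf-8", errors))
    return min(timer.repeat(repeat=3, number=number))


# Hypothetical inputs: many short runs versus one long run of lone surrogates.
samples = {
    "short runs": "abc\udc80" * 2500,
    "long run": "abc" + "\udc80" * 10_000,
}

timings = {}
for label, text in samples.items():
    for errors in ("ignore", "replace", "surrogateescape", "surrogatepass"):
        timings[label, errors] = bench_encode(text, errors)
        print(f"{label:10} {errors:16} {timings[label, errors] * 1e3:8.3f} ms")
```

Taking the minimum over several repeats reduces noise from other processes; absolute numbers depend on the machine and build.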
See also issue #25227, which optimized the ASCII and latin1 encoders for the surrogateescape error handler.
History
Date                 User      Action  Args
2015-09-29 11:30:34  vstinner  set     recipients: + vstinner, ezio.melotti, methane, serhiy.storchaka
2015-09-29 11:30:34  vstinner  set     messageid: <1443526234.91.0.845116536507.issue25267@psf.upfronthosting.co.za>
2015-09-29 11:30:34  vstinner  link    issue25267 messages
2015-09-29 11:30:34  vstinner  create