Message 79338 - Python tracker

In-reply-to
Author	pitrou
Recipients	lemburg, pitrou
Date	2009-01-07.15:24:58
SpamBayes Score	0.065116405
Marked as misclassified	No
Message-id	<1231341903.79.0.54860955311.issue4868@psf.upfronthosting.co.za>

Content

Here is a patch to speed up UTF-8 decoding. On a 64-bit build, the maximum
speedup is around 30%, and on a 32-bit build around 15%. (*)
The patch may look disturbingly trivial, and I haven't studied the
assembler output, but I think the speedup is explained by the fact that
having a separate loop counter breaks the register dependencies (when the
's' pointer was incremented, other operations had to wait for the
increment to be committed).
[side note: UTF-8 encoding is still much faster than decoding, but that may
be because it allocates a smaller object, regardless of the iteration count]
The same principle can probably be applied to the other decoding
functions in unicodeobject.c, but first I wanted to know whether the
principle is ok to apply. Marc-André, what is your take?
(*) the benchmark I used is:
./python -m timeit -s "import codecs;c=codecs.utf_8_decode;s=b'abcde'*1000" "c(s)"
More complex input also gets a speedup, albeit a smaller one (~10%):
./python -m timeit -s "import codecs;c=codecs.utf_8_decode;s=b'\xc3\xa9\xe7\xb4\xa2'*1000" "c(s)"
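For anyone who prefers to rerun the two measurements from a script rather than the command line, here is a minimal sketch. It is not part of the patch: the helper name best_usec, the repeat/number defaults and the output format are my own assumptions, and the f-strings need Python 3.6 or later. It only uses codecs.utf_8_decode and the timeit module, with the same two inputs as the commands above.

```python
import codecs
import timeit

decode = codecs.utf_8_decode

# The two inputs used by the benchmark commands above.
ascii_input = b"abcde" * 1000
mixed_input = b"\xc3\xa9\xe7\xb4\xa2" * 1000


def best_usec(data, number=1000, repeat=5):
    """Return the best per-call decoding time for `data`, in microseconds.

    `number` and `repeat` are illustrative defaults, not the values that
    the timeit command line would pick automatically.
    """
    timer = timeit.Timer(lambda: decode(data))
    best_total = min(timer.repeat(repeat=repeat, number=number))
    return best_total / number * 1e6


if __name__ == "__main__":
    for label, data in (("ASCII", ascii_input), ("mixed", mixed_input)):
        # utf_8_decode returns (decoded_text, bytes_consumed).
        text, consumed = decode(data)
        assert consumed == len(data)
        print(f"{label}: {len(text)} chars, {best_usec(data):.2f} usec per call")
```

Run it with the freshly built interpreter before and after applying the patch (for example ./python bench.py, where bench.py is just a hypothetical file name) and compare the per-call times. The lambda adds a small constant call overhead that the command-line form does not have, so absolute numbers will differ slightly from the timeit output.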

History
Date	User	Action	Args
2009-01-07 15:25:04	pitrou	set	recipients: + pitrou, lemburg
2009-01-07 15:25:03	pitrou	set	messageid: <1231341903.79.0.54860955311.issue4868@psf.upfronthosting.co.za>
2009-01-07 15:25:02	pitrou	link	issue4868 messages
2009-01-07 15:24:59	pitrou	create
