Message 79360 - Python tracker


Author	lemburg
Recipients	lemburg, loewis, pitrou
Date	2009-01-07.18:35:11
SpamBayes Score	0.032027345
Marked as misclassified	No
Message-id	<4964F5DE.5020408@egenix.com>
In-reply-to	<1231341903.79.0.54860955311.issue4868@psf.upfronthosting.co.za>

Content

On 2009-01-07 16:25, Antoine Pitrou wrote:
> New submission from Antoine Pitrou <pitrou@free.fr>:
> 
> Here is a patch to speedup utf8 decoding. On a 64-bit build, the maximum
> speedup is around 30%, and on a 32-bit build around 15%. (*)
> 
> The patch may look disturbingly trivial, and I haven't studied the
> assembler output, but I think it is explained by the fact that having a
> separate loop counter breaks the register dependencies (when the 's'
> pointer was incremented, other operations had to wait for the
> incrementation to be committed).
> 
> [side note: utf8 encoding is still much faster than decoding, but it may
> be because it allocates a smaller object, regardless of the iteration count]
> 
> The same principle can probably be applied to the other decoding
> functions in unicodeobject.c, but first I wanted to know whether the
> principle is ok to apply. Marc-André, what is your take?

I'm +1 on anything that makes codecs faster :-)

However, the patch should be checked with some other compilers
as well, e.g. using MS VC++.

> (*) the benchmark I used is:
> 
> ./python -m timeit -s "import codecs;c=codecs.utf_8_decode;s=b'abcde'*1000" "c(s)"
> 
> More complex input also gets a speedup, albeit a smaller one (~10%):
> 
> ./python -m timeit -s "import codecs;c=codecs.utf_8_decode;s=b'\xc3\xa9\xe7\xb4\xa2'*1000" "c(s)"
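
For checking the patch on another compiler or platform, the two quoted command lines can also be run from a script. This is only a minimal sketch using the standard `timeit` module: the helper name `best_time` and the `number`/`repeat` values are illustrative assumptions, not part of the original benchmark, and absolute timings will differ between builds.

```python
import codecs
import timeit

# Same callable and inputs as the quoted command lines.
decode = codecs.utf_8_decode
ascii_input = b"abcde" * 1000                      # pure ASCII
multibyte_input = b"\xc3\xa9\xe7\xb4\xa2" * 1000   # 2-byte and 3-byte sequences


def best_time(data, number=10000, repeat=5):
    """Return the best observed per-call time (seconds) for decoding `data`.

    `number` and `repeat` are arbitrary illustrative values (assumption).
    """
    timer = timeit.Timer(lambda: decode(data))
    return min(timer.repeat(repeat=repeat, number=number)) / number


if __name__ == "__main__":
    for label, data in (("ascii", ascii_input), ("multibyte", multibyte_input)):
        usec = best_time(data) * 1e6
        print(f"{label:10s} {usec:8.2f} usec per call")
```

Running the script once against an unpatched build and once against a patched build with each compiler gives the two per-call figures to compare. Note that the lambda adds a small constant call overhead that the command-line form does not have, so only the relative difference between builds is meaningful.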

History
Date	User	Action	Args
2009-01-07 18:35:14	lemburg	set	recipients: + lemburg, loewis, pitrou
2009-01-07 18:35:13	lemburg	link	issue4868 messages
2009-01-07 18:35:12	lemburg	create
