[Python-Dev] Unicode charmap decoders slow

M.-A. Lemburg mal at egenix.com
Thu Oct 6 11:13:50 CEST 2005


Hye-Shik Chang wrote:
> On 10/6/05, M.-A. Lemburg <mal at egenix.com> wrote:
>
>> Hye-Shik, could you please provide some timeit figures for
>> the fastmap encoding?

Thanks for the timings.

> (before applying Walter's patch, charmap decoder)
>
> % ./python Lib/timeit.py -s "s='a'*53*1024; e='iso8859_10'; u=unicode(s, e)" "s.decode(e)"
> 100 loops, best of 3: 3.35 msec per loop
>
> (applied the patch, improved charmap decoder)
>
> % ./python Lib/timeit.py -s "s='a'*53*1024; e='iso8859_10'; u=unicode(s, e)" "s.decode(e)"
> 1000 loops, best of 3: 1.11 msec per loop
>
> (the fastmap decoder)
>
> % ./python Lib/timeit.py -s "s='a'*53*1024; e='iso8859_10_fc'; u=unicode(s, e)" "s.decode(e)"
> 1000 loops, best of 3: 1.04 msec per loop
>
> (utf-8 decoder)
>
> % ./python Lib/timeit.py -s "s='a'*53*1024; e='utf_8'; u=unicode(s, e)" "s.decode(e)"
> 1000 loops, best of 3: 851 usec per loop
>
> Walter's decoder and the fastmap decoder work in mostly the same way,
> so the performance difference is quite minor. Perhaps the small
> difference comes from the wrapper function that each charmap codec has;
> the fastmap codec provides functions usable as Codecs.{en,de}code
> directly.
>
> (encoding, charmap codec)
>
> % ./python Lib/timeit.py -s "s='a'*53*1024; e='iso8859_10'; u=unicode(s, e)" "u.encode(e)"
> 100 loops, best of 3: 3.51 msec per loop
>
> (encoding, fastmap codec)
>
> % ./python Lib/timeit.py -s "s='a'*53*1024; e='iso8859_10_fc'; u=unicode(s, e)" "u.encode(e)"
> 1000 loops, best of 3: 536 usec per loop
>
> (encoding, utf-8 codec)
>
> % ./python Lib/timeit.py -s "s='a'*53*1024; e='utf_8'; u=unicode(s, e)" "u.encode(e)"
> 1000 loops, best of 3: 1.5 msec per loop

I wonder why the UTF-8 codec is slower than the fastmap
codec in this case.
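
For anyone who wants to repeat this kind of measurement on a current interpreter, here is a minimal sketch using the timeit module from Python 3. The helper name time_codec and its parameters are made up for illustration, the 'iso8859_10_fc' codec from the quoted runs is left out because it is not part of a stock install, and the absolute numbers will not be comparable to the 2005 figures above.

```python
import timeit


def time_codec(encoding, size=53 * 1024, number=200):
    """Return (decode_seconds, encode_seconds) per call for an ASCII payload.

    Mirrors the quoted benchmark: the payload is 'a' repeated 53*1024 times,
    and the best of three repeats is reported.
    """
    raw = b"a" * size            # same payload shape as in the quoted runs
    text = raw.decode(encoding)  # pre-decoded text for the encode timing

    decode_s = min(timeit.repeat(lambda: raw.decode(encoding),
                                 repeat=3, number=number)) / number
    encode_s = min(timeit.repeat(lambda: text.encode(encoding),
                                 repeat=3, number=number)) / number
    return decode_s, encode_s


for name in ("iso8859_10", "utf_8"):
    dec, enc = time_codec(name)
    print(f"{name}: decode {dec * 1e6:.1f} usec, encode {enc * 1e6:.1f} usec")
```

Taking the minimum of the repeats follows the "best of 3" convention that the timeit command line prints.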

> If the encoding optimization can easily be done with Walter's approach,
> the fastmap codec would be too expensive a way to reach the objective,
> because we would have to maintain not only fastmap but also charmap for
> backward compatibility.

Indeed. Let's go with a patched charmap codec then.
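
As a rough illustration of what a table-driven charmap codec does, the sketch below uses Python 3 and the stdlib as it exists today, where encodings.iso8859_10 exposes a decoding_table string and codecs provides charmap_decode, charmap_encode and charmap_build. It is not a reconstruction of the patch discussed in this thread; the helper table_decode and the sample bytes are hypothetical and only there to show the per-byte lookup.

```python
import codecs
from encodings import iso8859_10  # stdlib codec module for ISO 8859-10


def table_decode(data: bytes, table: str) -> str:
    """Pure-Python illustration (hypothetical helper): one lookup per byte."""
    return "".join(table[b] for b in data)


sample = b"plain ascii \xa1\xc5\xe6\xff"
table = iso8859_10.decoding_table  # string indexed by byte value

# Decoding: the C implementation walks the bytes and indexes into the table.
text, consumed = codecs.charmap_decode(sample, "strict", table)

# Encoding: build a reverse map from the decoding table, then encode with it.
encoding_map = codecs.charmap_build(table)
encoded, _ = codecs.charmap_encode(text, "strict", encoding_map)

print(text == table_decode(sample, table) == sample.decode("iso8859_10"))
print(encoded == sample)
```

Keeping a single table per encoding and deriving the reverse map from it means only one data structure per codec has to be maintained.
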
-- 
Marc-Andre Lemburg
eGenix.com
Professional Python Services directly from the Source (#1, Oct 06 2005)

