
This issue tracker has been migrated to GitHub and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: Optimize ASCII/latin1 encoder with surrogateescape error handlers
Type: performance Stage:
Components: Versions: Python 3.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: methane, python-dev, serhiy.storchaka, vstinner
Priority: normal Keywords: patch

Created on 2015-09-24 12:48 by vstinner, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name | Uploaded
encode_ucs1_surrogateescape.patch | vstinner, 2015-09-24 12:55
encode_ucs1_surrogateescape-2.patch | vstinner, 2015-09-24 14:15
bench.py | vstinner, 2015-09-24 14:16
Messages (6)
msg251516 - Author: STINNER Victor (vstinner) (Python committer) Date: 2015-09-24 12:48
The attached patch is based on faster_surrogates_hadling.patch, written by Serhiy Storchaka for issue #24870. It optimizes str.encode('ascii', 'surrogateescape') and str.encode('latin1', 'surrogateescape').
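For readers unfamiliar with the handler, here is a minimal sketch of what the two optimized calls do. It is not taken from the patch; the helper name roundtrip and the sample bytes are assumptions made for illustration only.

```python
def roundtrip(raw, encoding):
    """Decode raw bytes, then encode them back, using surrogateescape both ways.

    Hypothetical helper for illustration; it is not part of the patch.
    """
    # Bytes the codec cannot decode become lone surrogates U+DC80..U+DCFF.
    text = raw.decode("ascii", "surrogateescape")
    # Encoding with the same handler maps each lone surrogate back to its byte.
    return text, text.encode(encoding, "surrogateescape")


raw = b"caf\xe9 \xff"  # sample input with two non-ASCII bytes
for encoding in ("ascii", "latin1"):
    text, encoded = roundtrip(raw, encoding)
    print(encoding, ascii(text), encoded == raw)

# Without the handler, the lone surrogates cannot be encoded at all.
try:
    "caf\udce9".encode("ascii")
except UnicodeEncodeError as exc:
    print("strict handler fails:", exc.reason)
```

The encode step in roundtrip() is the code path this issue speeds up: every lone surrogate in the string goes through the error handler.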
msg251518 - Author: Roundup Robot (python-dev) (Python triager) Date: 2015-09-24 12:54
New changeset fa65c32d7134 by Victor Stinner in branch 'default':
Issue #25227: Cleanup unicode_encode_ucs1() error handler
https://hg.python.org/cpython/rev/fa65c32d7134 
msg251525 - Author: STINNER Victor (vstinner) (Python committer) Date: 2015-09-24 14:15
Updated patch, now with more unit tests.
msg251526 - Author: STINNER Victor (vstinner) (Python committer) Date: 2015-09-24 14:16
Result of a micro-benchmark with encode_ucs1_surrogateescape-2.patch.
Common platform:
  Timer info: namespace(adjustable=False, implementation='clock_gettime(CLOCK_MONOTONIC)', monotonic=True, resolution=1e-09)
  Timer: time.perf_counter
  CPU model: Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
  Python unicode implementation: PEP 393
  Platform: Linux-4.1.6-200.fc22.x86_64-x86_64-with-fedora-22-Twenty_Two
  CFLAGS: -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes
  Bits: int=32, long=64, long long=64, size_t=64, void*=64
Platform of campaign before:
  Date: 2015-09-24 16:12:35
  Timer precision: 54 ns
  Python version: 3.6.0a0 (default:fa65c32d7134, Sep 24 2015, 16:11:44) [GCC 5.1.1 20150618 (Red Hat 5.1.1-4)]
  SCM: hg revision=fa65c32d7134 tag=tip branch=default date="2015-09-24 14:45 +0200"
Platform of campaign after:
  Python version: 3.6.0a0 (default:fa65c32d7134+, Sep 24 2015, 16:13:20) [GCC 5.1.1 20150618 (Red Hat 5.1.1-4)]
  Timer precision: 55 ns
  SCM: hg revision=fa65c32d7134+ tag=tip branch=default date="2015-09-24 14:45 +0200"
  Date: 2015-09-24 16:13:21

-----------------------+-------------+---------------
ascii                  |      before |          after
-----------------------+-------------+---------------
100 x 10**1 characters | 6.65 us (*) | 1.93 us (-71%)
100 x 10**2 characters | 52.2 us (*) | 16.2 us (-69%)
100 x 10**3 characters |  512 us (*) |  158 us (-69%)
100 x 10**4 characters | 5.09 ms (*) | 1.59 ms (-69%)
-----------------------+-------------+---------------
Total                  | 5.66 ms (*) | 1.77 ms (-69%)
-----------------------+-------------+---------------

-----------------------+-------------+---------------
latin1                 |      before |          after
-----------------------+-------------+---------------
100 x 10**1 characters | 6.24 us (*) | 1.89 us (-70%)
100 x 10**2 characters |   51 us (*) | 16.3 us (-68%)
100 x 10**3 characters |  500 us (*) |  160 us (-68%)
100 x 10**4 characters |    5 ms (*) | 1.59 ms (-68%)
-----------------------+-------------+---------------
Total                  | 5.56 ms (*) | 1.77 ms (-68%)
-----------------------+-------------+---------------

--------+-------------+---------------
Summary |      before |          after
--------+-------------+---------------
ascii   | 5.66 ms (*) | 1.77 ms (-69%)
latin1  | 5.56 ms (*) | 1.77 ms (-68%)
--------+-------------+---------------
Total   | 11.2 ms (*) | 3.53 ms (-69%)
--------+-------------+---------------
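
The attached bench.py is not reproduced on this page. The following is only a rough sketch of how a comparable measurement could be made with the standard timeit module. The input shape (a string made entirely of lone surrogates) and the names make_text and time_encode are assumptions, so the numbers it prints are not directly comparable to the tables above.

```python
import timeit


def make_text(length):
    # Assumed input: every character is a lone surrogate, so every
    # character goes through the surrogateescape error handler.
    return "\udc80" * length


def time_encode(encoding, length, number=1000, repeat=5):
    """Return the best observed time, in seconds, for one encode() call."""
    text = make_text(length)
    timings = timeit.repeat(
        lambda: text.encode(encoding, "surrogateescape"),
        number=number,
        repeat=repeat,
    )
    # Take the minimum of the repeats to reduce noise from other processes.
    return min(timings) / number


if __name__ == "__main__":
    for encoding in ("ascii", "latin1"):
        for exponent in range(1, 5):
            seconds = time_encode(encoding, 10 ** exponent)
            print(f"{encoding:6} 10**{exponent} characters: {seconds * 1e6:.2f} us")
```

Running the script once on a build without the patch and once on a build with it would give a before/after comparison in the spirit of the tables above.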
msg251841 - Author: Roundup Robot (python-dev) (Python triager) Date: 2015-09-29 10:35
New changeset 128a3f03ddeb by Victor Stinner in branch 'default':
Optimize ascii/latin1+surrogateescape encoders
https://hg.python.org/cpython/rev/128a3f03ddeb 
msg251843 - Author: STINNER Victor (vstinner) (Python committer) Date: 2015-09-29 10:39
INADA Naoki: The ASCII and latin1 encoders are now up to 3 times as fast when the surrogateescape error handler is used in Python 3.6.
History
Date | User | Action | Args
2022-04-11 14:58:21 | admin | set | github: 69414
2015-09-29 10:39:51 | vstinner | set | status: open -> closed; nosy: + methane, serhiy.storchaka; messages: + msg251843; resolution: fixed
2015-09-29 10:35:12 | python-dev | set | messages: + msg251841
2015-09-24 14:16:18 | vstinner | set | files: + bench.py; messages: + msg251526
2015-09-24 14:15:38 | vstinner | set | files: + encode_ucs1_surrogateescape-2.patch; messages: + msg251525
2015-09-24 12:56:00 | vstinner | set | files: + encode_ucs1_surrogateescape.patch; keywords: + patch
2015-09-24 12:54:23 | python-dev | set | nosy: + python-dev; messages: + msg251518
2015-09-24 12:48:03 | vstinner | create |
