classification
Title: Add _PyBytesWriter API to optimize Unicode encoders
Type: performance Stage:
Components: Unicode Versions: Python 3.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: ezio.melotti, python-dev, serhiy.storchaka, vstinner
Priority: normal Keywords: patch

Created on 2015-10-05 12:01 by vstinner, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name Uploaded
bench_utf8_result.txt vstinner, 2015-10-05 12:02
bench_ucs1_result.txt vstinner, 2015-10-05 12:04
bytes_writer.patch vstinner, 2015-10-05 12:05
Messages (14)
msg252322 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-10-05 12:01
The attached patch is the first step toward optimizing the Unicode encoders: it adds a _PyBytesWriter API. This API is responsible for choosing the most efficient buffer for the situation:
* it's possible to use a small buffer allocated directly on the C stack
* otherwise a Python bytes object is allocated
* it's possible to overallocate the bytes object to reduce the number of calls to _PyBytes_Resize()
The patch only adds the new API, so don't expect any speedup. I added just one small optimization: overallocation is disabled in the UCS1 encoder (ASCII and Latin1) for the last write.
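The buffer strategy described above can be sketched outside CPython. The following is a toy model only, not the code from bytes_writer.patch: every name (toy_writer, toy_prepare, toy_count_resizes, ...) is hypothetical, malloc()/realloc() stand in for the Python bytes object and _PyBytes_Resize(), and the 50% growth factor is an arbitrary assumption.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy model: names and layout are hypothetical, not CPython's. */
typedef struct {
    char small[256];   /* small buffer, lives on the C stack with the struct */
    char *heap;        /* NULL while the small buffer is still in use */
    size_t allocated;  /* capacity of the active buffer */
    int overallocate;  /* nonzero: grow by more than strictly needed */
    int resizes;       /* number of heap (re)allocations, for illustration */
} toy_writer;

static void toy_init(toy_writer *w) {
    w->heap = NULL;
    w->allocated = sizeof(w->small);
    w->overallocate = 0;
    w->resizes = 0;
}

static char *toy_start(toy_writer *w) {
    return w->heap ? w->heap : w->small;
}

/* Make room for `extra` bytes at `pos`. Returns the write position inside
   the (possibly moved) buffer, or NULL on allocation failure. */
static char *toy_prepare(toy_writer *w, char *pos, size_t extra) {
    size_t used = (size_t)(pos - toy_start(w));
    size_t needed = used + extra;
    if (needed <= w->allocated)
        return pos;
    size_t newsize = needed;
    if (w->overallocate)
        newsize += newsize / 2;  /* assumed growth factor */
    char *buf;
    if (w->heap == NULL) {
        /* first spill: move the data out of the small buffer */
        buf = malloc(newsize);
        if (buf == NULL)
            return NULL;
        memcpy(buf, w->small, used);
    } else {
        buf = realloc(w->heap, newsize);
        if (buf == NULL)
            return NULL;
    }
    w->heap = buf;
    w->allocated = newsize;
    w->resizes++;
    return buf + used;
}

static void toy_free(toy_writer *w) {
    free(w->heap);
    w->heap = NULL;
}

/* Write `total` bytes one at a time; return how many heap resizes happened
   (or -1 on allocation failure). */
static int toy_count_resizes(size_t total, int overallocate) {
    toy_writer w;
    toy_init(&w);
    w.overallocate = overallocate;
    char *p = toy_start(&w);
    for (size_t i = 0; i < total; i++) {
        p = toy_prepare(&w, p, 1);
        if (p == NULL) {
            toy_free(&w);
            return -1;
        }
        *p++ = 'x';
    }
    int n = w.resizes;
    toy_free(&w);
    return n;
}
```

In this toy model, writing 100 bytes never touches the heap, while writing 1000 bytes one at a time needs one reallocation per byte past the small buffer unless overallocation is enabled.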
msg252323 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-10-05 12:02
Results of bench.py (attached to issue #25267); see the attached bench_utf8_result.txt.
------------------------------------------------------+-------------+---------------
Summary                                               | utf8_before | utf8_after
------------------------------------------------------+-------------+---------------
ignore: "\udcff" * length                             | 7.63 us (*) | 7.91 us
ignore: "a" * length + "\udcff"                       | 10.7 us (*) | 10.8 us
ignore: ("a" * 99 + "\udcff" * 99) * length           | 2.17 ms (*) | 2.16 ms
ignore: ("\udcff" * 99 + "a") * length                | 843 us (*)  | 866 us
ignore: "\udcff" + "a" * length                       | 10.7 us (*) | 11 us
replace: "\udcff" * length                            | 7.87 us (*) | 8.43 us (+7%)
replace: "a" * length + "\udcff"                      | 10.8 us (*) | 10.9 us
replace: ("a" * 99 + "\udcff" * 99) * length          | 2.46 ms (*) | 2.46 ms
replace: ("\udcff" * 99 + "a") * length               | 907 us (*)  | 939 us
replace: "\udcff" + "a" * length                      | 10.9 us (*) | 11 us
surrogateescape: "\udcff" * length                    | 14.2 us (*) | 17.2 us (+21%)
surrogateescape: "a" * length + "\udcff"              | 10.6 us (*) | 10.7 us
surrogateescape: ("a" * 99 + "\udcff" * 99) * length  | 3.19 ms (*) | 3.4 ms (+7%)
surrogateescape: ("\udcff" * 99 + "a") * length       | 1.64 ms (*) | 1.87 ms (+13%)
surrogateescape: "\udcff" + "a" * length              | 10.6 us (*) | 10.7 us
surrogatepass: "\udcff" * length                      | 23.1 us (*) | 23.5 us
surrogatepass: "a" * length + "\udcff"                | 10.7 us (*) | 10.8 us
surrogatepass: ("a" * 99 + "\udcff" * 99) * length    | 4.39 ms (*) | 4.44 ms
surrogatepass: ("\udcff" * 99 + "a") * length         | 2.43 ms (*) | 2.47 ms
surrogatepass: "\udcff" + "a" * length                | 10.6 us (*) | 10.8 us
backslashreplace: "\udcff" * length                   | 65.7 us (*) | 64.3 us
backslashreplace: "a" * length + "\udcff"             | 15.7 us (*) | 15 us
backslashreplace: ("a" * 99 + "\udcff" * 99) * length | 12 ms (*)   | 15.9 ms (+32%)
backslashreplace: ("\udcff" * 99 + "a") * length      | 11.1 ms (*) | 13.5 ms (+22%)
backslashreplace: "\udcff" + "a" * length             | 16.4 us (*) | 15.1 us (-8%)
------------------------------------------------------+-------------+---------------
Total                                                 | 41.4 ms (*) | 48.3 ms (+17%)
------------------------------------------------------+-------------+---------------
msg252324 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-10-05 12:04
Results of bench.py (attached to issue #25227, ASCII and Latin1 encoders); see the attached bench_ucs1_result.txt file.
--------+-------------+-----------
Summary | ucs1_before | ucs1_after
--------+-------------+-----------
ascii | 1.69 ms (*) | 1.69 ms
latin1 | 1.7 ms (*) | 1.69 ms
--------+-------------+-----------
Total | 3.39 ms (*) | 3.39 ms
--------+-------------+-----------
msg252325 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-10-05 12:12
A few months ago, I wrote a previous implementation of the _PyBytesWriter API which embedded the "current pointer" inside _PyBytesWriter. The problem was that GCC produced less efficient code than expected for the hotspot of the encoder.
In the new implementation (attached patch), the "current pointer" is unchanged: it's still a variable local to the encoder function. Instead, the current pointer became a *parameter* to all _PyBytesWriter *functions*.
I expect this not to affect the performance of encoders for validly encodable strings (when the code calling error handlers is not reached), which is important since performance is already very good there.
_PyBytesWriter is not restricted to the code to allocate the buffer.
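The "current pointer as a parameter" design can be illustrated with a small sketch. This is not the patch itself: toy_writer, toy_prepare and toy_encode_ascii are hypothetical names, realloc() stands in for resizing a bytes object, and the \uXXXX escaping is just a stand-in for an error handler (code points are assumed to be below 0x10000). The point it tries to show is that the write position stays a local variable in the encoder loop, and the writer only sees it when the buffer has to grow.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Toy writer: note there is no "current position" field. */
typedef struct {
    char *buf;         /* heap buffer */
    size_t allocated;  /* capacity of buf */
} toy_writer;

/* Ensure `extra` bytes can be written at `p`. Returns `p` relocated into
   the (possibly moved) buffer, or NULL on allocation failure. */
static char *toy_prepare(toy_writer *w, char *p, size_t extra) {
    size_t used = (size_t)(p - w->buf);
    if (used + extra <= w->allocated)
        return p;
    size_t newsize = used + extra;
    char *nb = realloc(w->buf, newsize);
    if (nb == NULL)
        return NULL;
    w->buf = nb;
    w->allocated = newsize;
    return nb + used;
}

/* Encode code points to ASCII; non-ASCII code points become \uXXXX.
   Returns a malloc'ed NUL-terminated string, or NULL on failure. */
static char *toy_encode_ascii(const uint32_t *in, size_t n) {
    toy_writer w;
    w.allocated = n + 1;  /* common case: one byte per character + NUL */
    w.buf = malloc(w.allocated);
    if (w.buf == NULL)
        return NULL;
    char *p = w.buf;      /* local write position, never stored in w */
    for (size_t i = 0; i < n; i++) {
        uint32_t ch = in[i];
        if (ch < 128) {
            *p++ = (char)ch;  /* hot path: no call into the writer */
            continue;
        }
        /* slow path: 6 bytes for the escape, 1 byte per remaining
           character, and 1 byte for the final NUL */
        p = toy_prepare(&w, p, 6 + (n - i - 1) + 1);
        if (p == NULL) {
            free(w.buf);
            return NULL;
        }
        p += sprintf(p, "\\u%04x", (unsigned)(ch & 0xffff));
    }
    *p = '\0';
    return w.buf;
}
```

For example, encoding the three code points 'a', U+00E9, 'b' with this sketch yields the string a\u00e9b, and a pure ASCII input never calls toy_prepare() at all.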
--
bytes_writer.patch:
+ char stackbuf[256];
Oh, I forgot to mention this other small optimization. I also added a small buffer allocated on the C stack for the UCS1 encoder (ASCII, Latin1). It may slightly speed up encoding when the output string is smaller than 256 bytes and the error handler is used.
The optimization is borrowed from the very efficient UTF-8 encoder.
msg252335 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-10-05 16:17
My previous abandoned attempt was issue #17742.
"Add _PyBytesWriter API to optimize Unicode encoders"
Oh, I forgot to mention that it may also be used to optimize bytes % args. More generally, it applies to any code generating a bytes object whose length is not known in advance. Said differently: _PyBytesWriter can be used when precomputing the output length is more expensive.
str % args now uses _PyUnicodeWriter, but building a Unicode string is even more complex because of the different Unicode "kinds": 1, 2 or 4 bytes per character.
msg252570 - Author: Roundup Robot (python-dev) (Python triager) Date: 2015-10-08 22:59
New changeset 1a2175149c5e by Victor Stinner in branch 'default':
Issue #25318: Add _PyBytesWriter API
https://hg.python.org/cpython/rev/1a2175149c5e 
msg252571 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-10-08 23:04
Oh, I was surprised to see the same or worse performance for UTF-8/backslashreplace. In fact, I forgot to enable overallocation. With overallocation, it is now faster ;-)
I modified the API to put the "stack buffer" directly inside _PyBytesWriter. I also reworked _PyBytesWriter_Alloc() to call _PyBytesWriter_Prepare(), so _PyBytesWriter_Alloc() now supports overallocation as well. Supporting overallocation at the first allocation (_PyBytesWriter_Alloc) was part of the _PyBytesWriter design; that's why we have _PyBytesWriter_Alloc() *and* _PyBytesWriter_Init(): it's possible to set overallocate=1 between init and alloc.
I pushed my change since it didn't kill performance. It's only a little bit slower, and only on very short encodes: less than 500 ns. In other cases, performance is the same or better.
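The reason for separating init from alloc can be sketched as follows. This is a toy model under stated assumptions, not the committed code: all names (toy_writer, toy_init, toy_alloc, toy_prepare, toy_capacity_after_alloc) are hypothetical, the 64-byte embedded buffer and the 50% growth factor are arbitrary, and realloc() stands in for resizing a bytes object. It only tries to show that a flag set between the two calls can already influence the first allocation, because alloc is routed through prepare.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy model: names, sizes and growth factor are assumptions. */
typedef struct {
    char small[64];    /* "stack buffer" embedded in the writer itself */
    char *heap;        /* NULL while the embedded buffer is in use */
    size_t allocated;  /* 0 until the first allocation */
    int overallocate;  /* may be set between toy_init() and toy_alloc() */
} toy_writer;

static void toy_init(toy_writer *w) {
    w->heap = NULL;
    w->allocated = 0;
    w->overallocate = 0;
}

/* Ensure `extra` bytes can be written at `pos`; returns the position in
   the (possibly moved) buffer, or NULL on allocation failure. */
static char *toy_prepare(toy_writer *w, char *pos, size_t extra) {
    char *start = w->heap ? w->heap : w->small;
    size_t used = (size_t)(pos - start);
    size_t needed = used + extra;
    if (needed <= w->allocated)
        return pos;
    if (w->heap == NULL && needed <= sizeof(w->small)) {
        /* the embedded buffer is large enough: no heap allocation */
        w->allocated = sizeof(w->small);
        return pos;
    }
    size_t newsize = needed;
    if (w->overallocate)
        newsize += needed / 2;
    /* realloc(NULL, n) behaves like malloc(n) */
    char *buf = realloc(w->heap, newsize);
    if (buf == NULL)
        return NULL;
    if (w->heap == NULL)
        memcpy(buf, w->small, used);  /* spill out of the embedded buffer */
    w->heap = buf;
    w->allocated = newsize;
    return buf + used;
}

/* The first allocation is just a prepare from the start of the buffer,
   so it honours the overallocate flag too. */
static char *toy_alloc(toy_writer *w, size_t size) {
    return toy_prepare(w, w->small, size);
}

static void toy_dealloc(toy_writer *w) {
    free(w->heap);
    w->heap = NULL;
}

/* Run init, set the flag, alloc; report the resulting capacity
   (0 on allocation failure). */
static size_t toy_capacity_after_alloc(size_t size, int overallocate) {
    toy_writer w;
    toy_init(&w);
    w.overallocate = overallocate;  /* set between init and alloc */
    char *p = toy_alloc(&w, size);
    size_t cap = (p != NULL) ? w.allocated : 0;
    toy_dealloc(&w);
    return cap;
}
```

In this sketch a 10-byte request stays in the embedded buffer, a 100-byte request allocates exactly 100 bytes by default, and the same request with the flag set reserves extra room up front.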
msg252573 - Author: Roundup Robot (python-dev) (Python triager) Date: 2015-10-08 23:46
New changeset 59f4806a5add by Victor Stinner in branch 'default':
Optimize backslashreplace error handler
https://hg.python.org/cpython/rev/59f4806a5add 
msg252574 - Author: Roundup Robot (python-dev) (Python triager) Date: 2015-10-09 00:32
New changeset c134eddcb347 by Victor Stinner in branch 'default':
Issue #25318: Move _PyBytesWriter to bytesobject.c
https://hg.python.org/cpython/rev/c134eddcb347 
msg252579 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-10-09 00:51
I created issue #25349, "Use _PyBytesWriter for bytes%args".
msg252580 - Author: Roundup Robot (python-dev) (Python triager) Date: 2015-10-09 00:52
New changeset e9c1404d6bd9 by Victor Stinner in branch 'default':
Issue #25318: Fix compilation error
https://hg.python.org/cpython/rev/e9c1404d6bd9 
msg252582 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-10-09 01:27
The FreeBSD 9.x buildbot is grumpy.
http://buildbot.python.org/all/builders/AMD64%20FreeBSD%209.x%203.x/builds/3495/steps/test/logs/stdio
Assertion failed: (start[writer->allocated] == 0), function _PyBytesWriter_CheckConsistency, file Objects/bytesobject.c, line 3809.
Fatal Python error: Aborted
Current thread 0x0000000801807400 (most recent call first):
 File "/usr/home/buildbot/python/3.x.koobs-freebsd9/build/Lib/test/test_pep277.py", line 150 in test_listdir
 File "/usr/home/buildbot/python/3.x.koobs-freebsd9/build/Lib/unittest/case.py", line 600 in run
 File "/usr/home/buildbot/python/3.x.koobs-freebsd9/build/Lib/unittest/case.py", line 648 in __call__
 File "/usr/home/buildbot/python/3.x.koobs-freebsd9/build/Lib/unittest/suite.py", line 122 in run
 File "/usr/home/buildbot/python/3.x.koobs-freebsd9/build/Lib/unittest/suite.py", line 84 in __call__
 File "/usr/home/buildbot/python/3.x.koobs-freebsd9/build/Lib/unittest/suite.py", line 122 in run
 File "/usr/home/buildbot/python/3.x.koobs-freebsd9/build/Lib/unittest/suite.py", line 84 in __call__
 File "/usr/home/buildbot/python/3.x.koobs-freebsd9/build/Lib/unittest/runner.py", line 176 in run
...
msg252583 - Author: Roundup Robot (python-dev) (Python triager) Date: 2015-10-09 01:39
New changeset 9cf89366bbcb by Victor Stinner in branch 'default':
Issue #25318: Avoid sprintf() in backslashreplace()
https://hg.python.org/cpython/rev/9cf89366bbcb
New changeset 0a522f68d275 by Victor Stinner in branch 'default':
Issue #25318: Fix backslashreplace()
https://hg.python.org/cpython/rev/0a522f68d275
New changeset c53dcf1d6967 by Victor Stinner in branch 'default':
Issue #25318: cleanup code _PyBytesWriter
https://hg.python.org/cpython/rev/c53dcf1d6967 
msg252602 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-10-09 12:18
Buildbots still like this new API :-) (no recent test failures)
I reworked the API a little bit to make it simpler to use in Unicode encoders. I have started opening new issues to use this new API in more functions producing byte strings.
I consider that this issue can now be closed. I'm happy: the API looks good to me and the modified code is faster.
History
Date | User | Action | Args
2022-04-11 14:58:22 | admin | set | github: 69505
2015-10-09 12:18:57 | vstinner | set | status: open -> closed; resolution: fixed; messages: + msg252602
2015-10-09 01:39:00 | python-dev | set | messages: + msg252583
2015-10-09 01:27:37 | vstinner | set | messages: + msg252582
2015-10-09 00:52:49 | python-dev | set | messages: + msg252580
2015-10-09 00:51:25 | vstinner | set | messages: + msg252579
2015-10-09 00:32:57 | python-dev | set | messages: + msg252574
2015-10-08 23:46:53 | python-dev | set | messages: + msg252573
2015-10-08 23:04:15 | vstinner | set | messages: + msg252571
2015-10-08 22:59:55 | python-dev | set | nosy: + python-dev; messages: + msg252570
2015-10-05 16:17:29 | vstinner | set | messages: + msg252335
2015-10-05 12:12:22 | vstinner | set | messages: + msg252325
2015-10-05 12:05:32 | vstinner | set | files: + bytes_writer.patch; keywords: + patch
2015-10-05 12:04:41 | vstinner | set | files: + bench_ucs1_result.txt
2015-10-05 12:04:04 | vstinner | set | messages: + msg252324
2015-10-05 12:02:50 | vstinner | set | files: + bench_utf8_result.txt; messages: + msg252323
2015-10-05 12:01:28 | vstinner | create |
