This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer Guide.

classification
Title: Faster utf-16 decoder
Type: performance
Stage: resolved
Components: Interpreter Core, Unicode
Versions: Python 3.3
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: Arfrever, asvetlov, ezio.melotti, loewis, pitrou, python-dev, serhiy.storchaka, vstinner
Priority: normal
Keywords: patch

Created on 2012-04-19 20:59 by serhiy.storchaka, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name Uploaded
decode_utf16.patch serhiy.storchaka, 2012-04-19 20:58
decodebench.py serhiy.storchaka, 2012-04-23 21:01
bench-diff.py serhiy.storchaka, 2012-04-23 21:01
decode_utf16_2.patch serhiy.storchaka, 2012-05-03 13:21
decode_utf16_3.patch serhiy.storchaka, 2012-05-11 19:24
decode_utf16_4.patch serhiy.storchaka, 2012-05-14 22:14
decode_utf16_5.patch serhiy.storchaka, 2012-05-15 21:29
decode_utf16_6.patch serhiy.storchaka, 2012-05-15 21:29
Messages (15)
msg158748 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2012-04-19 20:58
I propose a patch that speeds up the UTF-16 decoder. With PEP 393 the UTF-16 decoder became 3-4x slower; this patch brings performance back to the level of Python 3.2 and even beyond it (+10-30% over 3.2).
In addition, it fixes a few bugs in the UTF-16 decoder. As a side effect, it may also make it possible to speed up other decoders.
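Editorial note: the benchmark inputs later in this thread ('A', '\x80', '\u0100', '\u8000', '\U00010000') appear to be chosen to exercise the different internal string widths introduced by PEP 393. The following is a minimal sketch, not taken from the patch, that estimates the per-character storage on CPython and counts UTF-16 code units. The helper names `bytes_per_char` and `utf16_units` are hypothetical, and the storage figures are a CPython implementation detail (on CPython 3.3 or later they would be expected to come out as 1, 1, 2, 2 and 4 bytes).

```python
import sys

def bytes_per_char(ch, n=1000):
    """Estimate bytes of storage per character for a string made of `ch`.

    Comparing two lengths cancels out the fixed object header.
    This relies on CPython's PEP 393 layout (an implementation detail).
    """
    return (sys.getsizeof(ch * (2 * n)) - sys.getsizeof(ch * n)) / n

def utf16_units(ch):
    """Number of 16-bit code units needed to encode `ch` in UTF-16."""
    return len(ch.encode('utf-16-le')) // 2

for ch in ('A', '\x80', '\u0100', '\u8000', '\U00010000'):
    print(ascii(ch), bytes_per_char(ch), utf16_units(ch))
```

Only '\U00010000' lies outside the Basic Multilingual Plane, so it is the only input here that needs a surrogate pair (two code units) in UTF-16.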
msg158751 - Author: STINNER Victor (vstinner) (Python committer) Date: 2012-04-19 21:03
See also #14625 for UTF-32 decoder.
msg158753 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2012-04-19 21:09
See also issue #14579 for utf-16 decoder bugs.
msg158772 - Author: Martin v. Löwis (loewis) (Python committer) Date: 2012-04-19 23:08
Serhiy: can you please submit a contributor form?
msg159077 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2012-04-23 21:01
Here are the benchmark results (throughput in MB/s; each percentage in parentheses is the gain of the patched build over the version to its left).
On 32-bit Linux, AMD Athlon 64 X2 4600+ @ 2.4GHz:
codec, input: Py2.7 (gain) Py3.2 (gain) Py3.3 (gain) patch
utf-16le 'A'*10000 504 (+282%) 1905 (+1%) 565 (+241%) 1927
utf-16le '\x80'*10000 503 (+264%) 1894 (-3%) 417 (+340%) 1833
utf-16le '\x80'+'A'*9999 504 (+264%) 1890 (-3%) 422 (+335%) 1834
utf-16le '\u0100'*10000 503 (+249%) 1896 (-7%) 357 (+391%) 1754
utf-16le '\u0100'+'A'*9999 504 (+252%) 1896 (-6%) 360 (+393%) 1776
utf-16le '\u0100'+'\x80'*9999 503 (+249%) 1890 (-7%) 357 (+392%) 1756
utf-16le '\u8000'*10000 503 (-18%) 355 (+16%) 75 (+449%) 412
utf-16le '\u8000'+'A'*9999 504 (+254%) 1892 (-6%) 359 (+397%) 1783
utf-16le '\u8000'+'\x80'*9999 503 (+249%) 1896 (-7%) 357 (+392%) 1755
utf-16le '\u8000'+'\u0100'*9999 503 (+258%) 1901 (-5%) 359 (+402%) 1802
utf-16le '\U00010000'*10000 484 (-14%) 379 (+9%) 103 (+303%) 415
utf-16le '\U00010000'+'A'*9999 504 (+244%) 1905 (-9%) 353 (+392%) 1735
utf-16le '\U00010000'+'\x80'*9999 503 (+245%) 1899 (-9%) 348 (+398%) 1733
utf-16le '\U00010000'+'\u0100'*9999 503 (+244%) 1882 (-8%) 348 (+397%) 1729
utf-16le '\U00010000'+'\u8000'*9999 503 (-18%) 355 (+16%) 71 (+482%) 413
utf-16be 'A'*10000 504 (+284%) 1553 (+24%) 469 (+312%) 1933
utf-16be '\x80'*10000 504 (+251%) 1551 (+14%) 387 (+357%) 1770
utf-16be '\x80'+'A'*9999 504 (+261%) 1549 (+17%) 386 (+371%) 1819
utf-16be '\u0100'*10000 503 (+175%) 1544 (-10%) 333 (+316%) 1384
utf-16be '\u0100'+'A'*9999 505 (+178%) 1548 (-9%) 335 (+319%) 1403
utf-16be '\u0100'+'\x80'*9999 503 (+179%) 1552 (-9%) 336 (+318%) 1405
utf-16be '\u8000'*10000 503 (-2%) 415 (+19%) 75 (+559%) 494
utf-16be '\u8000'+'A'*9999 504 (+179%) 1551 (-9%) 335 (+320%) 1408
utf-16be '\u8000'+'\x80'*9999 504 (+178%) 1551 (-10%) 336 (+317%) 1402
utf-16be '\u8000'+'\u0100'*9999 504 (+179%) 1549 (-9%) 336 (+318%) 1404
utf-16be '\U00010000'*10000 483 (-7%) 407 (+10%) 105 (+326%) 447
utf-16be '\U00010000'+'A'*9999 504 (+149%) 1554 (-19%) 317 (+295%) 1253
utf-16be '\U00010000'+'\x80'*9999 503 (+153%) 1543 (-17%) 317 (+302%) 1275
utf-16be '\U00010000'+'\u0100'*9999 503 (+153%) 1537 (-17%) 317 (+302%) 1274
utf-16be '\U00010000'+'\u8000'*9999 503 (-2%) 415 (+19%) 71 (+597%) 495
On 32-bit Linux, Intel Atom N570 @ 1.66GHz:
codec, input: Py2.7 (gain) Py3.2 (gain) Py3.3 (gain) patch
utf-16le 'A'*10000 136 (+417%) 584 (+20%) 184 (+282%) 703
utf-16le '\x80'*10000 136 (+392%) 580 (+15%) 160 (+318%) 669
utf-16le '\x80'+'A'*9999 136 (+398%) 582 (+16%) 159 (+326%) 677
utf-16le '\u0100'*10000 137 (+346%) 583 (+5%) 129 (+374%) 611
utf-16le '\u0100'+'A'*9999 136 (+358%) 582 (+7%) 129 (+383%) 623
utf-16le '\u0100'+'\x80'*9999 136 (+348%) 580 (+5%) 129 (+372%) 609
utf-16le '\u8000'*10000 136 (+18%) 127 (+27%) 38 (+324%) 161
utf-16le '\u8000'+'A'*9999 136 (+357%) 582 (+7%) 129 (+382%) 622
utf-16le '\u8000'+'\x80'*9999 136 (+351%) 581 (+6%) 128 (+380%) 614
utf-16le '\u8000'+'\u0100'*9999 136 (+349%) 581 (+5%) 129 (+374%) 611
utf-16le '\U00010000'*10000 153 (-3%) 140 (+6%) 53 (+181%) 149
utf-16le '\U00010000'+'A'*9999 136 (+296%) 581 (-7%) 131 (+311%) 538
utf-16le '\U00010000'+'\x80'*9999 136 (+289%) 584 (-9%) 131 (+304%) 529
utf-16le '\U00010000'+'\u0100'*9999 136 (+290%) 579 (-8%) 130 (+308%) 530
utf-16le '\U00010000'+'\u8000'*9999 136 (+25%) 128 (+33%) 38 (+347%) 170
utf-16be 'A'*10000 136 (+331%) 441 (+33%) 166 (+253%) 586
utf-16be '\x80'*10000 136 (+309%) 440 (+26%) 145 (+283%) 556
utf-16be '\x80'+'A'*9999 136 (+312%) 442 (+27%) 145 (+286%) 560
utf-16be '\u0100'*10000 136 (+231%) 441 (+2%) 120 (+275%) 450
utf-16be '\u0100'+'A'*9999 136 (+232%) 442 (+2%) 120 (+276%) 451
utf-16be '\u0100'+'\x80'*9999 136 (+231%) 438 (+3%) 119 (+278%) 450
utf-16be '\u8000'*10000 136 (+22%) 127 (+31%) 38 (+337%) 166
utf-16be '\u8000'+'A'*9999 136 (+232%) 439 (+3%) 120 (+276%) 451
utf-16be '\u8000'+'\x80'*9999 136 (+230%) 439 (+2%) 120 (+274%) 449
utf-16be '\u8000'+'\u0100'*9999 136 (+232%) 439 (+3%) 120 (+276%) 451
utf-16be '\U00010000'*10000 153 (-1%) 139 (+9%) 52 (+192%) 152
utf-16be '\U00010000'+'A'*9999 136 (+211%) 440 (-4%) 121 (+250%) 423
utf-16be '\U00010000'+'\x80'*9999 136 (+210%) 440 (-4%) 122 (+246%) 422
utf-16be '\U00010000'+'\u0100'*9999 136 (+210%) 441 (-5%) 121 (+248%) 421
utf-16be '\U00010000'+'\u8000'*9999 136 (+27%) 128 (+35%) 38 (+355%) 173
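Editorial note: the attached decodebench.py is not reproduced in this thread. As a rough idea of how figures like the ones above could be obtained, here is a minimal sketch using `timeit`. The function name `decode_throughput`, the repeat counts, and the choice of 1 MB = 10^6 bytes are assumptions for illustration, not details of the original script.

```python
import timeit

def decode_throughput(text, encoding, repeat=5, number=200):
    """Return decoding throughput in MB/s (best of `repeat` timing runs).

    Assumes 1 MB = 10**6 bytes; the original benchmark may differ.
    """
    data = text.encode(encoding)
    timings = timeit.repeat(lambda: data.decode(encoding),
                            repeat=repeat, number=number)
    # Bytes decoded per run, divided by the fastest run time.
    return len(data) * number / min(timings) / 1e6

# A few of the input shapes used in the tables above.
cases = [
    ("'A'*10000", 'A' * 10000),
    ("'\\u8000'*10000", '\u8000' * 10000),
    ("'\\U00010000'+'A'*9999", '\U00010000' + 'A' * 9999),
]

for encoding in ('utf-16le', 'utf-16be'):
    for label, text in cases:
        print(encoding, label, round(decode_throughput(text, encoding)))
```

Taking the minimum over several runs is a common way to reduce noise from other processes; absolute numbers will vary widely between machines and Python versions.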
msg159090 - Author: Antoine Pitrou (pitrou) (Python committer) Date: 2012-04-23 21:57
64-bit Linux, Intel Core i5-2500K @ 3.30GHz:
codec, input: vanilla 3.3 (gain) patched
utf-16le 'A'*10000 1384 (+278%)	5233
utf-16le 'A'*9999+'\x80' 1303 (+259%)	4684
utf-16le 'A'*9999+'\u0100' 953 (+195%)	2813
utf-16le 'A'*9999+'\u8000' 953 (+195%)	2814
utf-16le 'A'*9999+'\U00010000' 979 (+197%)	2903
utf-16le '\x80'*10000 1243 (+321%)	5230
utf-16le '\x80'+'A'*9999 1256 (+313%)	5188
utf-16le '\x80'*9999+'\u0100' 880 (+214%)	2765
utf-16le '\x80'*9999+'\u8000' 880 (+214%)	2763
utf-16le '\x80'*9999+'\U00010000' 899 (+218%)	2860
utf-16le '\u0100'*10000 1047 (+370%)	4917
utf-16le '\u0100'+'A'*9999 1046 (+369%)	4906
utf-16le '\u0100'+'\x80'*9999 1047 (+370%)	4920
utf-16le '\u0100'*9999+'\u8000' 1047 (+369%)	4906
utf-16le '\u0100'*9999+'\U00010000' 791 (+253%)	2793
utf-16le '\u8000'*10000 230 (+410%)	1173
utf-16le '\u8000'+'A'*9999 1043 (+371%)	4911
utf-16le '\u8000'+'\x80'*9999 1044 (+345%)	4645
utf-16le '\u8000'+'\u0100'*9999 1041 (+350%)	4681
utf-16le '\u8000'*9999+'\U00010000' 215 (+357%)	983
utf-16le '\U00010000'*10000 362 (+170%)	976
utf-16le '\U00010000'+'A'*9999 985 (+210%)	3052
utf-16le '\U00010000'+'\x80'*9999 985 (+211%)	3066
utf-16le '\U00010000'+'\u0100'*9999 983 (+209%)	3042
utf-16le '\U00010000'+'\u8000'*9999 245 (+329%)	1052
utf-16be 'A'*10000 1268 (+313%)	5240
utf-16be 'A'*9999+'\x80' 1199 (+297%)	4758
utf-16be 'A'*9999+'\u0100' 896 (+211%)	2786
utf-16be 'A'*9999+'\u8000' 897 (+211%)	2788
utf-16be 'A'*9999+'\U00010000' 919 (+214%)	2885
utf-16be '\x80'*10000 1154 (+341%)	5087
utf-16be '\x80'+'A'*9999 1155 (+343%)	5112
utf-16be '\x80'*9999+'\u0100' 829 (+229%)	2728
utf-16be '\x80'*9999+'\u8000' 828 (+229%)	2726
utf-16be '\x80'*9999+'\U00010000' 852 (+232%)	2832
utf-16be '\u0100'*10000 981 (+332%)	4241
utf-16be '\u0100'+'A'*9999 981 (+330%)	4220
utf-16be '\u0100'+'\x80'*9999 977 (+331%)	4213
utf-16be '\u0100'*9999+'\u8000' 982 (+331%)	4237
utf-16be '\u0100'*9999+'\U00010000' 748 (+237%)	2520
utf-16be '\u8000'*10000 230 (+413%)	1180
utf-16be '\u8000'+'A'*9999 979 (+331%)	4218
utf-16be '\u8000'+'\x80'*9999 974 (+333%)	4215
utf-16be '\u8000'+'\u0100'*9999 972 (+335%)	4226
utf-16be '\u8000'*9999+'\U00010000' 215 (+361%)	992
utf-16be '\U00010000'*10000 362 (+170%)	978
utf-16be '\U00010000'+'A'*9999 924 (+232%)	3064
utf-16be '\U00010000'+'\x80'*9999 921 (+223%)	2979
utf-16be '\U00010000'+'\u0100'*9999 921 (+233%)	3064
utf-16be '\U00010000'+'\u8000'*9999 245 (+329%)	1052
msg159847 - Author: Roundup Robot (python-dev) (Python triager) Date: 2012-05-03 10:34
New changeset 830eeff4fe8f by Victor Stinner in branch 'default':
Issue #14624, #14687: Optimize unicode_widen()
http://hg.python.org/cpython/rev/830eeff4fe8f 
msg159858 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2012-05-03 13:21
Here is an updated patch that takes into account that unicode_widen is already
optimized.
msg160442 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2012-05-11 19:24
The patch has been updated to match the style of the UTF-8 decoder. Decoding of non-surrogate UCS2 characters is a little faster (+15%).
msg160572 - Author: Antoine Pitrou (pitrou) (Python committer) Date: 2012-05-13 20:38
New performance figures under 64-bit Linux, Intel Core i5-2500K @ 3.30GHz:
codec, input: vanilla 3.3 (gain) patched
utf-16le 'A'*10000 1411 (+290%)	5504
utf-16le 'A'*9999+'\x80' 1368 (+263%)	4970
utf-16le 'A'*9999+'\u0100' 1145 (+151%)	2871
utf-16le 'A'*9999+'\u8000' 1144 (+151%)	2870
utf-16le 'A'*9999+'\U00010000' 1164 (+154%)	2957
utf-16le '\x80'*10000 1403 (+271%)	5209
utf-16le '\x80'+'A'*9999 1406 (+272%)	5235
utf-16le '\x80'*9999+'\u0100' 1138 (+138%)	2713
utf-16le '\x80'*9999+'\u8000' 1138 (+139%)	2716
utf-16le '\x80'*9999+'\U00010000' 1155 (+151%)	2897
utf-16le '\u0100'*10000 1477 (+243%)	5062
utf-16le '\u0100'+'A'*9999 1478 (+243%)	5072
utf-16le '\u0100'+'\x80'*9999 1477 (+243%)	5062
utf-16le '\u0100'*9999+'\u8000' 1478 (+242%)	5055
utf-16le '\u0100'*9999+'\U00010000' 1201 (+131%)	2776
utf-16le '\u8000'*10000 246 (+347%)	1100
utf-16le '\u8000'+'A'*9999 1475 (+244%)	5069
utf-16le '\u8000'+'\x80'*9999 1474 (+243%)	5062
utf-16le '\u8000'+'\u0100'*9999 1473 (+243%)	5057
utf-16le '\u8000'*9999+'\U00010000' 236 (+295%)	932
utf-16le '\U00010000'*10000 393 (+164%)	1039
utf-16le '\U00010000'+'A'*9999 1325 (+134%)	3106
utf-16le '\U00010000'+'\x80'*9999 1326 (+134%)	3103
utf-16le '\U00010000'+'\u0100'*9999 1326 (+134%)	3104
utf-16le '\U00010000'+'\u8000'*9999 253 (+331%)	1091
utf-16be 'A'*10000 1341 (+298%)	5342
utf-16be 'A'*9999+'\x80' 1305 (+275%)	4888
utf-16be 'A'*9999+'\u0100' 1101 (+157%)	2834
utf-16be 'A'*9999+'\u8000' 1102 (+157%)	2831
utf-16be 'A'*9999+'\U00010000' 1115 (+162%)	2917
utf-16be '\x80'*10000 1326 (+296%)	5253
utf-16be '\x80'+'A'*9999 1322 (+298%)	5258
utf-16be '\x80'*9999+'\u0100' 1088 (+156%)	2781
utf-16be '\x80'*9999+'\u8000' 1088 (+155%)	2770
utf-16be '\x80'*9999+'\U00010000' 1103 (+159%)	2854
utf-16be '\u0100'*10000 1344 (+221%)	4308
utf-16be '\u0100'+'A'*9999 1342 (+223%)	4330
utf-16be '\u0100'+'\x80'*9999 1343 (+221%)	4307
utf-16be '\u0100'*9999+'\u8000' 1343 (+221%)	4306
utf-16be '\u0100'*9999+'\U00010000' 1109 (+128%)	2529
utf-16be '\u8000'*10000 248 (+341%)	1094
utf-16be '\u8000'+'A'*9999 1340 (+223%)	4331
utf-16be '\u8000'+'\x80'*9999 1341 (+221%)	4307
utf-16be '\u8000'+'\u0100'*9999 1341 (+221%)	4309
utf-16be '\u8000'*9999+'\U00010000' 239 (+290%)	931
utf-16be '\U00010000'*10000 399 (+160%)	1037
utf-16be '\U00010000'+'A'*9999 1230 (+152%)	3101
utf-16be '\U00010000'+'\x80'*9999 1218 (+154%)	3095
utf-16be '\U00010000'+'\u0100'*9999 1220 (+154%)	3095
utf-16be '\U00010000'+'\u8000'*9999 257 (+318%)	1074
msg160672 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2012-05-14 22:14
The patch has been updated with slightly clearer code and added comments.
msg160766 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2012-05-15 21:29
Here are two new patches. The check for out-of-range characters has been
moved, which makes the code simpler. In theory this slightly slows down
decoding of short UCS1 strings (up to 1 and 3 chars on 32- and 64-bit), but
in practice there is no difference. The second patch differs from the first
in that the masks are not calculated but specified explicitly.
I am not sure that this improves readability. The committer has the choice.
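Editorial note: the patches themselves are not shown in this thread, so the following is only an illustrative sketch of the general technique the message seems to describe: testing a whole machine word of UTF-16 code units against a mask, where the mask can either be calculated or written out explicitly. The 64-bit word size, the little-endian layout, and all names (`WORD_BYTES`, `CALCULATED_MASK`, `EXPLICIT_MASK`, `count_leading_ucs1_units`) are assumptions, not code from the patch.

```python
import struct

WORD_BYTES = 8  # assumed 64-bit word: four UTF-16 code units per word

# Calculated mask: replicate 0xFF00 into every 16-bit lane of the word.
CALCULATED_MASK = ((1 << (8 * WORD_BYTES)) - 1) // 0xFFFF * 0xFF00
# Explicit mask: the same constant written out by hand.
EXPLICIT_MASK = 0xFF00FF00FF00FF00

def count_leading_ucs1_units(data):
    """Count leading UTF-16-LE code units below 0x100 in `data`."""
    units = 0
    pos = 0
    # Fast path: one mask test covers four code units at once.
    while pos + WORD_BYTES <= len(data):
        (word,) = struct.unpack_from('<Q', data, pos)
        if word & EXPLICIT_MASK:
            break  # some code unit in this word has a non-zero high byte
        pos += WORD_BYTES
        units += WORD_BYTES // 2
    # Slow path: finish one code unit at a time.
    while pos + 2 <= len(data):
        (unit,) = struct.unpack_from('<H', data, pos)
        if unit >= 0x100:
            break
        pos += 2
        units += 1
    return units

print(hex(CALCULATED_MASK), CALCULATED_MASK == EXPLICIT_MASK)
print(count_leading_ucs1_units(('A' * 10 + '\u0100' + 'B').encode('utf-16-le')))
```

In this sketch, inputs shorter than one full word never reach the fast path; that is one plausible reading of the remark about short strings of up to 1 or 3 characters on 32- and 64-bit builds, though the thread does not confirm it.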
msg160768 - Author: Roundup Robot (python-dev) (Python triager) Date: 2012-05-15 21:50
New changeset cdcc816dea85 by Antoine Pitrou in branch 'default':
Issue #14624: UTF-16 decoding is now 3x to 4x faster on various inputs.
http://hg.python.org/cpython/rev/cdcc816dea85 
msg160769 - Author: Antoine Pitrou (pitrou) (Python committer) Date: 2012-05-15 21:52
Thank you Serhiy! Now committed.
msg161100 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2012-05-19 09:02
Thank you, Antoine. Now only issue14625 is waiting for review.
> changeset: 77012:3430d7329a3b
> +* UTF-8 and UTF-16 decoding is now 2x to 4x faster.
In fact, UTF-16 decoding is now at most 25% faster than in Python 3.2 on my computers (and sometimes still a little slower). It is 2x to 4x faster only compared to the earlier Python 3.3, which had been slowed down by PEP 393.
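Editorial note: the difference between the two baselines can be checked against the figures posted earlier in this thread. The sketch below uses the utf-16le 'A'*10000 row from the AMD Athlon table (Py3.2: 1905, Py3.3: 565, patch: 1927 MB/s); the helper name `gain_percent` is hypothetical.

```python
def gain_percent(old_mb_s, new_mb_s):
    """Relative throughput gain of `new_mb_s` over `old_mb_s`, in percent."""
    return round((new_mb_s / old_mb_s - 1) * 100)

# utf-16le 'A'*10000, 32-bit Linux, AMD Athlon 64 X2 (figures from msg159077).
py32, py33, patched = 1905, 565, 1927

print("vs Python 3.2:", gain_percent(py32, patched), "%")
print("vs unpatched 3.3:", gain_percent(py33, patched), "%")
print("ratio vs unpatched 3.3:", round(patched / py33, 1), "x")
```

For this row the patched build is about 1% faster than Python 3.2 but about 3.4x faster than the unpatched 3.3, which matches the distinction drawn in the message above.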
History
Date User Action Args
2022-04-11 14:57:29 admin set github: 58829
2012-05-19 09:02:56 serhiy.storchaka set messages: + msg161100
2012-05-15 21:52:07 pitrou set status: open -> closed; resolution: fixed; messages: + msg160769; stage: resolved
2012-05-15 21:50:53 python-dev set messages: + msg160768
2012-05-15 21:29:28 serhiy.storchaka set files: + decode_utf16_5.patch, decode_utf16_6.patch; messages: + msg160766
2012-05-14 22:14:49 serhiy.storchaka set files: + decode_utf16_4.patch; messages: + msg160672
2012-05-13 20:38:32 pitrou set messages: + msg160572
2012-05-11 19:46:24 serhiy.storchaka set nosy: + ezio.melotti; components: + Unicode
2012-05-11 19:24:30 serhiy.storchaka set files: + decode_utf16_3.patch; messages: + msg160442
2012-05-03 13:21:47 serhiy.storchaka set files: + decode_utf16_2.patch; messages: + msg159858
2012-05-03 10:34:06 python-dev set nosy: + python-dev; messages: + msg159847
2012-04-23 21:57:20 pitrou set messages: + msg159090
2012-04-23 21:01:19 serhiy.storchaka set files: + decodebench.py, bench-diff.py; messages: + msg159077
2012-04-20 21:38:25 asvetlov set nosy: + asvetlov
2012-04-20 06:35:36 Arfrever set nosy: + Arfrever
2012-04-19 23:08:21 loewis set nosy: + loewis; messages: + msg158772
2012-04-19 21:09:26 serhiy.storchaka set messages: + msg158753
2012-04-19 21:03:29 vstinner set messages: + msg158751
2012-04-19 21:02:57 vstinner set nosy: + vstinner
2012-04-19 21:01:54 vstinner set nosy: + pitrou
2012-04-19 20:59:00 serhiy.storchaka create