homepage

This issue tracker has been migrated to GitHub , and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: BUILD_MAP_UNPACK_WITH_CALL is slow
Type: performance Stage: resolved
Components: Interpreter Core Versions: Python 3.7, Python 3.6
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: serhiy.storchaka Nosy List: Demur Rumed, abarry, berker.peksag, python-dev, rhettinger, serhiy.storchaka, vstinner
Priority: normal Keywords: patch

Created on 2016年06月21日 01:46 by Demur Rumed, last changed 2022年04月11日 14:58 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
mapaca.patch Demur Rumed, 2016年06月21日 01:45 review
mapaca2.patch Demur Rumed, 2016年06月21日 13:03 review
faster_build_map_unpack_with_call.patch serhiy.storchaka, 2016年09月10日 22:56 review
faster_build_map_unpack_with_call2.patch serhiy.storchaka, 2016年09月23日 20:47 review
Pull Requests
URL Status Linked Edit
PR 552 closed dstufft, 2017年03月31日 16:36
Messages (17)
msg268951 - (view) Author: Philip Dubé (Demur Rumed) * Date: 2016年06月21日 01:45
BUILD_MAP_UNPACK_WITH_CALL is _really_ slow, wasting much of its time asserting that keys are non overlapping. This patch optimizes a fast path for distinct dicts, especially useful for #27213 where BUILD_MAP_UNPACK_WITH_CALL is generated for a single **kw rather than needing **kw1,**kw2 to hit this slow opcode
This patch tracks size of dictionary, if size doesn't increase by same size as the dict we updated sum with, then we scan to find where collision is. Further optimization can be done by scanning size of dicts to preallocate dictionary
Microbenchmark:
from timeit import timeit
def f(**x):return x
timeit(lambda:f(**{'a':2},**{'b':2}))
Unpatched takes ~15s
Patched takes ~5s
msg268967 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016年06月21日 05:00
The problem is that var-keyword argument can be not a dict, but general mapping. This optimization is applicable only if it is a dict.
msg268989 - (view) Author: Philip Dubé (Demur Rumed) * Date: 2016年06月21日 13:03
mapaca2 heavy handily deals with the must-work-with-all-mappings by converting any non dict mappings on the stack with dicts when with_call is true
I'm not sure if it'd be better to seek a less opcode centric fix-- ie introduce a method to dictobject which returns None if no collision occurs otherwise it returns the first key which collides and stops updating at that point. PyDict_DistinctUpdate
msg268990 - (view) Author: Philip Dubé (Demur Rumed) * Date: 2016年06月21日 13:05
(returning None wouldn't work because that may be the key, but something like returning the dict itself (ie an unhashable) or keeping this as a C API & returning NULL would work)
msg268992 - (view) Author: Anilyka Barry (abarry) * (Python triager) Date: 2016年06月21日 13:15
`dict` subclasses can be hashable - it's a very bad idea, but not guarded against. If you were to do this, I'd suggest going the exception way, à la StopIteration.
I wonder how feasible it would be for e.g. dict.__setitem__ to check if a key already exists; it would be a special path that's not taken normally, but filling the mapping on call enables it, and it raises if it sees a duplicate. I think that inserting the check in there would reduce complexity and probably have a negligible impact on performance. I don't know if that's a good idea, just throwing this out here.
msg268997 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016年06月21日 13:43
I think this kills the optimization effect for non-dicts.
See on PyDict_Merge(). It takes the boolean parameter that controls the behavior in case of matching keys. I think the best would be to rename it to say _PyDict_MergeEx(), extend the boolean parameter to ternary parameter, and raise an exception if it is in the third state and matching keys are found. PyDict_Merge() would be implemented as a simple wrapper around _PyDict_MergeEx(). We should check wherever this affects the performance of dict.update().
msg273680 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016年08月25日 21:45
See also issue #27845: "Optimize update_keyword_args() function".
msg275710 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016年09月10日 22:56
Here is a patch that gets rid of calculating intersections. I didn't make benchmarking still.
msg277279 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016年09月23日 14:08
The side effect of the last patch is fixing a regression in 3.6 and a bug in 3.5 (issue28257).
msg277283 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016年09月23日 15:31
Microbenchmarks:
$ ./python -m timeit -s "def f(**kw): pass" -s "b = {'b': 2}" -- "f(a=1, **b)"
Unpatched: 100000 loops, best of 3: 7.64 usec per loop
Patched: 100000 loops, best of 3: 3.14 usec per loop
$ ./python -m timeit -s "def f(**kw): pass" -s "a = {'a': 1}; b = {'b': 2}" -- "f(**a, **b)"
Unpatched: 100000 loops, best of 3: 6.93 usec per loop
Patched: 100000 loops, best of 3: 2.66 usec per loop
$ ./python -m timeit -s "def f(a=None, b=None): pass" -s "b = {'b': 2}" -- "f(a=1, **b)"
Unpatched: 100000 loops, best of 3: 7.27 usec per loop
Patched: 100000 loops, best of 3: 2.83 usec per loop
$ ./python -m timeit -s "def f(a=None, b=None): pass" -s "a = {'a': 1}; b = {'b': 2}" -- "f(**a, **b)"
Unpatched: 100000 loops, best of 3: 6.47 usec per loop
Patched: 100000 loops, best of 3: 2.31 usec per loop
msg277285 - (view) Author: STINNER Victor (vstinner) * (Python committer) Date: 2016年09月23日 16:23
> $ ./python -m timeit -s "def f(**kw): pass" -s "b = {'b': 2}" -- "f(a=1, **b)"
> Unpatched: 100000 loops, best of 3: 7.64 usec per loop
> Patched: 100000 loops, best of 3: 3.14 usec per loop
Impressive numbers but... is it a perf regression compared to Python 3.5 (and even 2.7)? Or a perf speedup compared to Python 3.5?
msg277287 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016年09月23日 16:58
Yes, it is expected perf regression compared to Python 3.5 and 2.7 when pass keyword arguments and single var-keyword argument (because 3.6 uses BUILD_MAP_UNPACK_WITH_CALL, while 2.7 and 3.5 don't need it for this case). In case of multiple var-keyword arguments (all versions need BUILD_MAP_UNPACK_WITH_CALL) even unpatched 3.6 is faster than 3.5.
$ ./python -m timeit -s "def f(**kw): pass" -s "b = {'b': 2}" -- "f(a=1, **b)"
Python 2.7: 100000 loops, best of 3: 2.21 usec per loop
Python 3.5: 100000 loops, best of 3: 4.31 usec per loop
Python 3.6 unpatched: 100000 loops, best of 3: 7.64 usec per loop
Python 3.6 patched: 100000 loops, best of 3: 3.14 usec per loop
$ ./python -m timeit -s "def f(**kw): pass" -s "a = {'a': 1}; b = {'b': 2}" -- "f(**a, **b)"
Python 3.5: 100000 loops, best of 3: 11.6 usec per loop
Python 3.6 unpatched: 100000 loops, best of 3: 6.93 usec per loop
Python 3.6 patched: 100000 loops, best of 3: 2.66 usec per loop
$ ./python -m timeit -s "def f(a=None, b=None): pass" -s "b = {'b': 2}" -- "f(a=1, **b)"
Python 2.7: 100000 loops, best of 3: 1.97 usec per loop
Python 3.5: 100000 loops, best of 3: 3.75 usec per loop
Python 3.6 unpatched: 100000 loops, best of 3: 7.27 usec per loop
Python 3.6 patched: 100000 loops, best of 3: 2.83 usec per loop
$ ./python -m timeit -s "def f(a=None, b=None): pass" -s "a = {'a': 1}; b = {'b': 2}" -- "f(**a, **b)"
Python 3.5: 100000 loops, best of 3: 10.6 usec per loop
Python 3.6 unpatched: 100000 loops, best of 3: 6.47 usec per loop
Python 3.6 patched: 100000 loops, best of 3: 2.31 usec per loop
msg277297 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016年09月23日 20:47
Updated patch addresses Victor's comments.
msg277859 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016年10月02日 08:12
New changeset 148172f75d43 by Serhiy Storchaka in branch '3.6':
Issue #27358: Optimized merging var-keyword arguments and improved error
https://hg.python.org/cpython/rev/148172f75d43
New changeset 489ad68b35c0 by Serhiy Storchaka in branch 'default':
Issue #27358: Optimized merging var-keyword arguments and improved error
https://hg.python.org/cpython/rev/489ad68b35c0
New changeset ec84d815e90f by Serhiy Storchaka in branch '3.5':
Issue #27358: Backported tests.
https://hg.python.org/cpython/rev/ec84d815e90f
New changeset 29a658d47ae8 by Serhiy Storchaka in branch '2.7':
Issue #27358: Backported tests.
https://hg.python.org/cpython/rev/29a658d47ae8 
msg277866 - (view) Author: Berker Peksag (berker.peksag) * (Python committer) Date: 2016年10月02日 09:15
test_unpack_ex fails on several buildbots:
http://buildbot.python.org/all/builders/x86-64%20Ubuntu%2015.10%20Skylake%20CPU%203.6/builds/120/steps/test/logs/stdio
test test_unpack_ex failed
**********************************************************************
File "/home/buildbot/buildarea/3.6.intel-ubuntu-skylake/build/Lib/test/test_unpack_ex.py", line ?, in test.test_unpack_ex.__test__.doctests
Failed example:
 {**1}
Expected:
 Traceback (most recent call last):
 ...
 TypeError: 'int' object is not a mapping
Got:
 Traceback (most recent call last):
 File "/home/buildbot/buildarea/3.6.intel-ubuntu-skylake/build/Lib/doctest.py", line 1330, in __run
 compileflags, 1), test.globs)
 File "<doctest test.test_unpack_ex.__test__.doctests[38]>", line 1, in <module>
 {**1}
 TypeError: 'int' object is not a mapping1
**********************************************************************
File "/home/buildbot/buildarea/3.6.intel-ubuntu-skylake/build/Lib/test/test_unpack_ex.py", line ?, in test.test_unpack_ex.__test__.doctests
Failed example:
 {**[]}
Expected:
 Traceback (most recent call last):
 ...
 TypeError: 'list' object is not a mapping
Got:
 Traceback (most recent call last):
 File "/home/buildbot/buildarea/3.6.intel-ubuntu-skylake/build/Lib/doctest.py", line 1330, in __run
 compileflags, 1), test.globs)
 File "<doctest test.test_unpack_ex.__test__.doctests[39]>", line 1, in <module>
 {**[]}
 TypeError: 'list' object is not a mapping1
**********************************************************************
1 items had failures:
 2 of 85 in test.test_unpack_ex.__test__.doctests
***Test Failed*** 2 failures.
msg277877 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2016年10月02日 10:06
New changeset e2d4e077cfb2 by Berker Peksag in branch '3.6':
Issue #27358: Fix typo in error message
https://hg.python.org/cpython/rev/e2d4e077cfb2
New changeset 9abb316f1593 by Berker Peksag in branch 'default':
Issue #27358: Merge from 3.6
https://hg.python.org/cpython/rev/9abb316f1593 
msg277878 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016年10月02日 10:10
Thanks Berker!
This was temporary change for debugging. I forgot to remove it.
History
Date User Action Args
2022年04月11日 14:58:32adminsetgithub: 71545
2017年03月31日 16:36:38dstufftsetpull_requests: + pull_request1104
2016年10月02日 10:10:28serhiy.storchakasetmessages: + msg277878
2016年10月02日 10:06:36python-devsetmessages: + msg277877
2016年10月02日 09:15:55berker.peksagsetnosy: + berker.peksag
messages: + msg277866
2016年10月02日 08:39:02serhiy.storchakasetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2016年10月02日 08:12:37python-devsetnosy: + python-dev
messages: + msg277859
2016年10月02日 07:41:56serhiy.storchakasetassignee: serhiy.storchaka
versions: + Python 3.7
2016年09月23日 20:47:07serhiy.storchakasetfiles: + faster_build_map_unpack_with_call2.patch

messages: + msg277297
2016年09月23日 16:58:57serhiy.storchakasetmessages: + msg277287
2016年09月23日 16:23:29vstinnersetmessages: + msg277285
2016年09月23日 15:31:57serhiy.storchakasetmessages: + msg277283
2016年09月23日 14:08:53serhiy.storchakasetmessages: + msg277279
2016年09月23日 14:06:55serhiy.storchakalinkissue28257 dependencies
2016年09月10日 22:56:44serhiy.storchakasetfiles: + faster_build_map_unpack_with_call.patch

messages: + msg275710
2016年08月25日 21:45:08vstinnersetnosy: + vstinner
messages: + msg273680
2016年06月21日 13:43:10serhiy.storchakasetnosy: + rhettinger
messages: + msg268997
2016年06月21日 13:15:19abarrysetnosy: + abarry
messages: + msg268992
2016年06月21日 13:05:39Demur Rumedsetmessages: + msg268990
2016年06月21日 13:03:20Demur Rumedsetfiles: + mapaca2.patch

messages: + msg268989
2016年06月21日 05:00:19serhiy.storchakasetmessages: + msg268967
stage: patch review
2016年06月21日 01:49:20Demur Rumedsetnosy: + serhiy.storchaka
2016年06月21日 01:46:29Demur Rumedsettype: performance
components: + Interpreter Core
versions: + Python 3.6
2016年06月21日 01:46:01Demur Rumedcreate

AltStyle によって変換されたページ (->オリジナル) /