homepage

This issue tracker has been migrated to GitHub , and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: LOOKUP_METHOD and CALL_METHOD optimization
Type: performance Stage: resolved
Components: Interpreter Core Versions: Python 3.2
process
Status: closed Resolution:
Dependencies: Superseder: Speedup method calls 1.2x
View: 26110
Assigned To: Nosy List: benjamin.peterson, collinwinter, jyasskin, pitrou, r.david.murray, rnk
Priority: low Keywords: patch

Created on 2009年05月16日 01:43 by benjamin.peterson, last changed 2022年04月11日 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
call_method.patch benjamin.peterson, 2009年05月16日 01:43
Messages (10)
msg87850 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2009年05月16日 01:43
This is an optimization ported from PyPy. [1] It tries to prevent bound
methods from being created by using the stack as a cache. I couldn't
apply this to builtin methods because those use a method-wrapper
descriptor. The results were not very impressive. However, I'm attaching
the patch to see if anyone else wants to look at it.
[1]
http://codespeak.net/pypy/dist/pypy/doc/interpreter-optimizations.html#id1
Test minimum run-time average run-time
 this other diff this other 
 diff
-------------------------------------------------------------------------------
 BuiltinFunctionCalls: 342ms 340ms +0.6% 378ms 361ms
 +4.7%
 BuiltinMethodLookup: 315ms 308ms +2.3% 333ms 319ms
 +4.5%
 CompareFloats: 250ms 251ms -0.3% 257ms 258ms
 -0.7%
 CompareFloatsIntegers: 266ms 265ms +0.4% 273ms 273ms
 -0.2%
 CompareIntegers: 233ms 232ms +0.4% 238ms 238ms
 -0.2%
 CompareInternedStrings: 279ms 275ms +1.3% 285ms 284ms
 +0.3%
 CompareLongs: 225ms 223ms +1.1% 231ms 229ms
 +0.8%
 CompareStrings: 238ms 235ms +1.3% 244ms 243ms
 +0.6%
 CompareUnicode: 243ms 246ms -0.9% 252ms 252ms
 -0.1%
 ComplexPythonFunctionCalls: 307ms 301ms +2.0% 315ms 309ms
 +1.8%
 ConcatStrings: 372ms 366ms +1.6% 376ms 385ms
 -2.1%
 ConcatUnicode: 260ms 259ms +0.4% 266ms 269ms
 -0.9%
 CreateInstances: 351ms 336ms +4.7% 365ms 346ms
 +5.5%
 CreateNewInstances: 265ms 256ms +3.6% 281ms 264ms
 +6.5%
 CreateStringsWithConcat: 290ms 289ms +0.1% 304ms 301ms
 +1.2%
 CreateUnicodeWithConcat: 220ms 219ms +0.8% 227ms 223ms
 +1.8%
 DictCreation: 201ms 200ms +0.4% 204ms 206ms
 -1.0%
 DictWithFloatKeys: 400ms 418ms -4.4% 410ms 424ms
 -3.4%
 DictWithIntegerKeys: 298ms 294ms +1.2% 306ms 304ms
 +0.6%
 DictWithStringKeys: 260ms 264ms -1.5% 270ms 275ms
 -2.1%
 ForLoops: 224ms 223ms +0.2% 232ms 232ms
 +0.2%
 IfThenElse: 160ms 160ms +0.0% 168ms 182ms
 -8.0%
 ListSlicing: 293ms 292ms +0.5% 302ms 306ms
 -1.2%
 NestedForLoops: 301ms 300ms +0.2% 305ms 308ms
 -0.8%
 NestedListComprehensions: 323ms 328ms -1.7% 331ms 335ms
 -1.3%
 NormalClassAttribute: 313ms 314ms -0.1% 323ms 330ms
 -2.1%
 NormalInstanceAttribute: 284ms 283ms +0.4% 289ms 288ms
 +0.4%
 PythonFunctionCalls: 259ms 278ms -6.7% 274ms 289ms
 -5.3%
 PythonMethodCalls: 358ms 357ms +0.3% 371ms 365ms
 +1.6%
 Recursion: 389ms 398ms -2.1% 395ms 407ms
 -2.9%
 SecondImport: 335ms 319ms +5.2% 346ms 380ms
 -8.9%
 SecondPackageImport: 338ms 326ms +3.6% 350ms 337ms
 +4.0%
 SecondSubmoduleImport: 413ms 403ms +2.5% 426ms 411ms
 +3.6%
 SimpleComplexArithmetic: 341ms 345ms -1.2% 351ms 355ms
 -1.1%
 SimpleDictManipulation: 288ms 298ms -3.7% 293ms 303ms
 -3.1%
 SimpleFloatArithmetic: 272ms 275ms -1.1% 279ms 286ms
 -2.7%
 SimpleIntFloatArithmetic: 211ms 204ms +3.3% 216ms 215ms
 +0.5%
 SimpleIntegerArithmetic: 207ms 203ms +1.7% 214ms 213ms
 +0.5%
 SimpleListComprehensions: 275ms 273ms +0.6% 281ms 281ms
 -0.1%
 SimpleListManipulation: 224ms 229ms -2.5% 234ms 241ms
 -2.9%
 SimpleLongArithmetic: 252ms 253ms -0.6% 263ms 266ms
 -1.0%
 SmallLists: 290ms 301ms -3.8% 299ms 311ms
 -3.9%
 SmallTuples: 254ms 253ms +0.3% 261ms 266ms
 -1.9%
 SpecialClassAttribute: 311ms 309ms +0.7% 320ms 321ms
 -0.2%
 SpecialInstanceAttribute: 358ms 358ms +0.1% 370ms 371ms
 -0.4%
 StringMappings: 817ms 833ms -1.9% 823ms 852ms
 -3.4%
 StringPredicates: 488ms 538ms -9.2% 495ms 547ms
 -9.4%
 StringSlicing: 295ms 296ms -0.2% 306ms 323ms
 -5.5%
 TryExcept: 282ms 280ms +1.0% 291ms 288ms
 +1.0%
 TryFinally: 290ms 255ms +14.1% 300ms 263ms
 +14.1%
 TryRaiseExcept: 261ms 256ms +1.7% 271ms 263ms
 +3.1%
 TupleSlicing: 281ms 270ms +4.3% 289ms 277ms
 +4.3%
 UnicodeMappings: 329ms 337ms -2.4% 337ms 348ms
 -3.0%
 UnicodePredicates: 295ms 329ms -10.4% 303ms 338ms
 -10.5%
 UnicodeProperties: 272ms 307ms -11.4% 284ms 313ms
 -9.3%
 UnicodeSlicing: 258ms 261ms -1.2% 266ms 275ms
 -3.2%
 WithFinally: 384ms 398ms -3.6% 394ms 408ms
 -3.5%
 WithRaiseExcept: 320ms 294ms +8.9% 336ms 304ms
 +10.6%
-------------------------------------------------------------------------------
msg87870 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009年05月16日 09:13
Can you give results of the "richards" benchmark?
msg87889 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2009年05月16日 12:49
With the patch:
Richards benchmark (Python) starting... [<function entry_point at 0x63b430>]
finished.
Total time for 10 iterations: 8.49 secs
Average time per iteration: 848.90 ms
Without:
Richards benchmark (Python) starting... [<function entry_point at 0x637530>]
finished.
Total time for 10 iterations: 10.46 secs
Average time per iteration: 1045.88 ms
msg87896 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2009年05月16日 14:25
Similar results here.
With the patch:
Richards benchmark (Python) starting... [<function entry_point at
0x7fc15b3f5848>]
finished.
Total time for 4 iterations: 1.78 secs
Average time per iteration: 443.90 ms
Without the patch:
Richards benchmark (Python) starting... [<function entry_point at
0x7f8b9fb20848>]
finished.
Total time for 4 iterations: 2.02 secs
Average time per iteration: 503.79 ms
msg95707 - (view) Author: Reid Kleckner (rnk) (Python committer) Date: 2009年11月25日 02:21
One thing I was wondering about the current patch is what about objects
that have attributes that shadow methods? For example:
class C(object):
 def foo(self):
 return 1
c = c()
print c.foo()
c.foo = lambda: 2
print c.foo()
Shouldn't the above print 1 and 2? With the current patch, it seems
that you might still print 1.
There's also the possible performance drawback where you're loading
builtin C methods, so the optimization fails, but you end up calling
_PyType_Lookup twice. :(
I'm doing the same optimization for unladen swallow, and these were some
of the things I ran into. I think I'm going to write a
PyObject_GetMethod that tries to get a method without binding it, but if
it can't for any reason, it behaves just like PyObject_GetAttr and sets
a status code.
msg95710 - (view) Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2009年11月25日 03:27
Yes, my patch introduces that regression you mention. PyPy solves this
by having the instances dictionary keep track of shadowing of the type
dictionary. Not easy for CPython... I wish you luck on your patch!
msg110253 - (view) Author: Reid Kleckner (rnk) (Python committer) Date: 2010年07月14日 04:56
I have an patch for unladen-swallow out for review here:
http://codereview.appspot.com/160063/show
It resolves the correctness issues I mentioned previously by emitting guards if necessary. If the type is predictable and uses slots, then we don't need to check the instance dict.
It gives a 5% speedup on the unpickle benchmark. Presumably the other benchmarks do not do as many method calls.
msg110272 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2010年07月14日 12:08
First, are these benchmark results jitted or non-jitted? Right now, non-jitted results are a stronger motivation for inclusion in main CPython, IMHO. 
Second, 2.7 won't receive any features / performance improvements anymore. It would be nice to have 3.2 (non-jitted) benchmark results.
Third, if removing intermediate allocations is the kind of optimizations a JIT will do anyway, does it make sense or not to make these optimizations explicit at the bytecode level? (this is really an open question, not a rhetorical one)
msg110298 - (view) Author: Reid Kleckner (rnk) (Python committer) Date: 2010年07月14日 16:45
Sorry, I was just posting it so Benjamin could see what this bought us. I'm not pushing to get this in CPython.
The results are for JITed code. I forget what the interpreted results are. I think they are good for the microbenchmarks, but not as good for the macro.
msg168170 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2012年08月14日 03:39
Benjamin confirms a regression in his patch, and the other patch was not intended for CPython. So I'm closing this issue.
History
Date User Action Args
2022年04月11日 14:56:48adminsetgithub: 50283
2016年01月14日 17:55:24yselivanovsetsuperseder: Speedup method calls 1.2x
2012年08月14日 03:39:29r.david.murraysetstatus: open -> closed

nosy: + r.david.murray
messages: + msg168170

stage: patch review -> resolved
2010年07月14日 16:45:22rnksetmessages: + msg110298
2010年07月14日 12:11:47pitrousetstage: patch review
versions: - Python 2.7
2010年07月14日 12:08:10pitrousetnosy: + collinwinter, jyasskin
messages: + msg110272
2010年07月14日 04:56:43rnksetmessages: + msg110253
2009年11月25日 03:27:40benjamin.petersonsetmessages: + msg95710
2009年11月25日 02:21:16rnksetmessages: + msg95707
2009年11月24日 17:15:04rnksetnosy: + rnk
2009年05月16日 14:25:05pitrousetmessages: + msg87896
2009年05月16日 12:49:25benjamin.petersonsetmessages: + msg87889
2009年05月16日 09:13:54pitrousetnosy: + pitrou
messages: + msg87870
2009年05月16日 01:43:58benjamin.petersoncreate

AltStyle によって変換されたページ (->オリジナル) /