This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: Add opcode cache for LOAD_ATTR
Type:
Stage: resolved
Components: Interpreter Core
Versions: Python 3.10

process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: pablogsal
Nosy List: Guido.van.Rossum, gvanrossum, pablogsal, vstinner, yselivanov
Priority: normal
Keywords: patch

Created on 2020-10-20 03:18 by pablogsal, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Pull Requests
URL Status Linked
PR 22803 merged pablogsal, 2020-10-20 03:19
PR 24070 merged pablogsal, 2021-01-03 02:53
PR 24582 merged vstinner, 2021-02-19 14:05
Messages (9)
msg379083 - Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-10-20 03:18
From the creators of "opcode cache for LOAD_GLOBAL" (https://bugs.python.org/issue26219), now it's time for "opcode cache for LOAD_ATTR: the revenge". This issue/PR builds on top of Yury's original patch in the same way https://bugs.python.org/issue26219 did for LOAD_GLOBAL.
These are the benchmark results for the pyperformance test suite, built with PGO and LTO and run with CPU isolation on a system tuned with pyperf:
+-------------------------+--------------------------------------+-------------------------------------+
| Benchmark | 2020-10-20_01-18-master-de73d432bb29 | 2020-10-20_02-28-cache-68f931f6938a |
+=========================+======================================+=====================================+
| go | 407 ms | 349 ms: 1.17x faster (-14%) |
+-------------------------+--------------------------------------+-------------------------------------+
| raytrace | 822 ms | 730 ms: 1.13x faster (-11%) |
+-------------------------+--------------------------------------+-------------------------------------+
| unpickle_pure_python | 497 us | 447 us: 1.11x faster (-10%) |
+-------------------------+--------------------------------------+-------------------------------------+
| scimark_sor | 311 ms | 280 ms: 1.11x faster (-10%) |
+-------------------------+--------------------------------------+-------------------------------------+
| hexiom | 15.4 ms | 14.0 ms: 1.10x faster (-9%) |
+-------------------------+--------------------------------------+-------------------------------------+
| logging_silent | 302 ns | 276 ns: 1.10x faster (-9%) |
+-------------------------+--------------------------------------+-------------------------------------+
| chaos | 176 ms | 163 ms: 1.08x faster (-7%) |
+-------------------------+--------------------------------------+-------------------------------------+
| pyflate | 1.01 sec | 948 ms: 1.06x faster (-6%) |
+-------------------------+--------------------------------------+-------------------------------------+
| scimark_lu | 246 ms | 232 ms: 1.06x faster (-6%) |
+-------------------------+--------------------------------------+-------------------------------------+
| pickle_pure_python | 712 us | 674 us: 1.06x faster (-5%) |
+-------------------------+--------------------------------------+-------------------------------------+
| regex_effbot | 4.49 ms | 4.26 ms: 1.05x faster (-5%) |
+-------------------------+--------------------------------------+-------------------------------------+
| scimark_monte_carlo | 160 ms | 153 ms: 1.05x faster (-5%) |
+-------------------------+--------------------------------------+-------------------------------------+
| richards | 120 ms | 115 ms: 1.05x faster (-4%) |
+-------------------------+--------------------------------------+-------------------------------------+
| 2to3 | 458 ms | 442 ms: 1.04x faster (-4%) |
+-------------------------+--------------------------------------+-------------------------------------+
| regex_v8 | 33.7 ms | 32.5 ms: 1.04x faster (-3%) |
+-------------------------+--------------------------------------+-------------------------------------+
| scimark_sparse_mat_mult | 7.16 ms | 6.93 ms: 1.03x faster (-3%) |
+-------------------------+--------------------------------------+-------------------------------------+
| deltablue | 12.1 ms | 11.7 ms: 1.03x faster (-3%) |
+-------------------------+--------------------------------------+-------------------------------------+
| regex_dna | 268 ms | 261 ms: 1.03x faster (-3%) |
+-------------------------+--------------------------------------+-------------------------------------+
| meteor_contest | 152 ms | 148 ms: 1.03x faster (-3%) |
+-------------------------+--------------------------------------+-------------------------------------+
| genshi_xml | 89.0 ms | 87.1 ms: 1.02x faster (-2%) |
+-------------------------+--------------------------------------+-------------------------------------+
| logging_simple | 12.8 us | 12.5 us: 1.02x faster (-2%) |
+-------------------------+--------------------------------------+-------------------------------------+
| genshi_text | 42.4 ms | 41.5 ms: 1.02x faster (-2%) |
+-------------------------+--------------------------------------+-------------------------------------+
| nbody | 215 ms | 211 ms: 1.02x faster (-2%) |
+-------------------------+--------------------------------------+-------------------------------------+
Not significant (35): chameleon; django_template; dulwich_log; fannkuch; float; json_dumps; json_loads; logging_format; mako; nqueens; pathlib; pickle; pickle_dict; pickle_list; pidigits; python_startup; python_startup_no_site; regex_compile; scimark_fft; spectral_norm; sqlalchemy_declarative; sqlalchemy_imperative; sqlite_synth; sympy_expand; sympy_sum; sympy_str; telco; tornado_http; unpack_sequence; unpickle; unpickle_list; xml_etree_parse; xml_etree_iterparse; xml_etree_generate; xml_etree_process; sympy_integrate
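Editorial note: the right-hand column of the table can be read as two views of the same pair of timings. The sketch below is an assumption about how those figures relate (ratio = baseline / patched, percentage = relative change against the baseline); it is not taken from pyperf's source, and `describe_speedup` is a hypothetical helper name. Because the table shows rounded timings, recomputing some rows from the displayed values can differ in the last digit.

```python
def describe_speedup(baseline: float, patched: float) -> str:
    """Format a timing pair like the table's right-hand column.

    Both timings must be in the same unit, and patched must be the
    smaller (faster) value for the word "faster" to make sense.
    """
    ratio = baseline / patched                       # how many times faster
    change = (patched - baseline) / baseline * 100   # relative change in percent
    return f"{ratio:.2f}x faster ({change:+.0f}%)"


print(describe_speedup(407, 349))  # "go" row, timings in ms
print(describe_speedup(822, 730))  # "raytrace" row, timings in ms
```

For the "go" and "raytrace" rows this reproduces the strings shown in the table: "1.17x faster (-14%)" and "1.13x faster (-11%)".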
msg379087 - Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-10-20 05:22
New changeset 109826c8508dd02e06ae0f1784f1d202495a8680 by Pablo Galindo in branch 'master':
bpo-42093: Add opcode cache for LOAD_ATTR (GH-22803)
https://github.com/python/cpython/commit/109826c8508dd02e06ae0f1784f1d202495a8680
msg383859 - Author: Guido van Rossum (Guido.van.Rossum) Date: 2020-12-27 20:04
Wow, this is amazing. I just found that this is now faster than slots. Should we mention that in What's New?
(Of course there's an optimization possible for slots as well, but it would require complicating the cache struct. Maybe in 3.11. :-)
msg384087 - Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2020-12-30 20:34
>Wow, this is amazing. I just found that this is now faster than slots. 
Not only that: it is even faster than the highly tuned namedtuple access descriptors, which used to be the fastest way to access an attribute:
3.9 results
-----------
 17.6 ns read_classvar_from_class
 16.3 ns read_classvar_from_instance
 23.2 ns read_instancevar
 19.7 ns read_instancevar_slots
 17.9 ns read_namedtuple
 39.2 ns read_boundmethod
Now this is the fastest way to get an attribute:
3.10 results
------------
 17.9 ns read_classvar_from_class
 16.9 ns read_classvar_from_instance
 14.1 ns read_instancevar
 20.0 ns read_instancevar_slots
 18.0 ns read_namedtuple
 40.7 ns read_boundmethod
> Should we mention that in What's New?
Good idea! I will prepare a PR complementing the current paragraph.
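Editorial note: the message above does not show how the timings were collected. The following is a minimal sketch, using only the standard `timeit` module, of how a similar comparison could be run locally. The class names and the `get_x` and `ns_per_call` helpers are hypothetical, and the result labels are borrowed from the listing above only for readability. Each measurement includes the cost of a Python function call, so absolute numbers will not match the figures quoted in the thread; only the relative ordering on a given interpreter is meaningful.

```python
import timeit
from collections import namedtuple


class WithDict:
    """Plain class: attributes live in the instance __dict__."""

    def __init__(self):
        self.x = 1


class WithSlots:
    """Slotted class: attributes are reached through slot descriptors."""

    __slots__ = ("x",)

    def __init__(self):
        self.x = 1


Point = namedtuple("Point", ["x"])


def get_x(obj):
    # The attribute read being timed; called many times so that any
    # per-code-object caching in the interpreter has a chance to warm up.
    return obj.x


def ns_per_call(obj, number=200_000, repeat=5):
    """Best-of-`repeat` time for one get_x(obj) call, in nanoseconds."""
    timer = timeit.Timer(lambda: get_x(obj))
    return min(timer.repeat(repeat=repeat, number=number)) / number * 1e9


results = {
    "read_instancevar": ns_per_call(WithDict()),
    "read_instancevar_slots": ns_per_call(WithSlots()),
    "read_namedtuple": ns_per_call(Point(1)),
}
for label, ns in results.items():
    print(f"{ns:6.1f} ns  {label}")
```

Taking the minimum over several repeats is a common way to reduce noise from other processes; a tuned system and a dedicated benchmarking tool would give more trustworthy numbers.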
msg384254 - Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-01-03 04:38
New changeset 9e8fe1986cb4205fb9f883c89b9d5d76a9847e0b by Pablo Galindo in branch 'master':
bpo-42093: Tweak the what's new message about the new LOAD_ATTR opcode cache (GH-24070)
https://github.com/python/cpython/commit/9e8fe1986cb4205fb9f883c89b9d5d76a9847e0b
msg384354 - Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2021-01-04 23:05
Thanks! Do you have any plans for further inline caches? I was wondering if we could reverse the situation for slots again by adding slots support to the LOAD_ATTR opcode inline cache...
msg384364 - Author: Pablo Galindo Salgado (pablogsal) * (Python committer) Date: 2021-01-05 02:19
> Thanks! Do you have any plans for further inline caches?
Yeah, we are experimenting with some ideas here: https://bugs.python.org/issue42115. 
> I was wondering if we could reverse the situation for slots again by adding slots support to the LOAD_ATTR opcode inline cache...
I think we can do it as long as we can easily detect whether a given descriptor is immutable. The problem with mutability is illustrated by this code:
class Descriptor:
    pass

class C:
    def __init__(self):
        self.x = 1
    x = Descriptor()

def f(o):
    return o.x

o = C()
for i in range(10000):
    assert f(o) == 1

Descriptor.__get__ = lambda self, instance, value: 2
Descriptor.__set__ = lambda *args: None

print(f(o))
In this case, if we do not skip the cache for mutable descriptors, the code will not reflect the new result (2 instead of 1). __slots__ are immutable descriptors so we should be good as long as we can detect them.
msg384366 - Author: Guido van Rossum (gvanrossum) * (Python committer) Date: 2021-01-05 03:57
Hm, I was thinking of recognizing the specific type of descriptor used by slots and caching only that. Though we would still have to consider updates to C.__dict__ (that's handled by looking at the dict version, right?).
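Editorial note: the two ideas discussed above (slot descriptors being immutable, and recognizing them by their specific type) can be illustrated from pure Python. This is only a sketch of the observable behaviour on CPython, not of the interpreter's C-level cache logic; `is_slot_descriptor` and `type_is_mutable` are hypothetical helper names introduced for illustration.

```python
import types


class Slotted:
    __slots__ = ("x",)


class Descriptor:
    pass


class Plain:
    x = Descriptor()


def is_slot_descriptor(cls, name):
    """Return True if `name` resolves on `cls` to a __slots__ descriptor."""
    for klass in cls.__mro__:
        if name in vars(klass):
            # On CPython, __slots__ entries are member_descriptor objects.
            return type(vars(klass)[name]) is types.MemberDescriptorType
    return False


def type_is_mutable(tp):
    """Probe whether attributes can be set on the type `tp`."""
    try:
        tp._probe = None      # built-in types reject this with TypeError
    except TypeError:
        return False
    del tp._probe             # clean up the throwaway attribute
    return True


print(is_slot_descriptor(Slotted, "x"))             # slot descriptor
print(is_slot_descriptor(Plain, "x"))               # user-defined descriptor object
print(type_is_mutable(types.MemberDescriptorType))  # cannot gain a new __get__
print(type_is_mutable(Descriptor))                  # can gain __get__/__set__ later
```

The last two lines show the asymmetry the thread relies on: a user-defined class such as `Descriptor` can acquire `__get__` and `__set__` after the fact, while the built-in type behind slot descriptors cannot be modified from Python code.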
msg387451 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2021-02-21 11:02
New changeset d5fc99873769f0d0d5c5d5d99059177a75a4e46e by Victor Stinner in branch 'master':
bpo-42093: Cleanup _PyDict_GetItemHint() (GH-24582)
https://github.com/python/cpython/commit/d5fc99873769f0d0d5c5d5d99059177a75a4e46e
History
Date User Action Args
2022-04-11 14:59:37 admin set github: 86259
2021-02-21 11:02:18 vstinner set messages: + msg387451
2021-02-19 14:05:05 vstinner set nosy: + vstinner; pull_requests: + pull_request23361
2021-01-05 03:57:20 gvanrossum set messages: + msg384366
2021-01-05 02:19:54 pablogsal set messages: + msg384364
2021-01-04 23:05:56 gvanrossum set nosy: + gvanrossum; messages: + msg384354
2021-01-03 04:38:11 pablogsal set messages: + msg384254
2021-01-03 02:53:43 pablogsal set pull_requests: + pull_request22903
2020-12-30 20:34:07 pablogsal set messages: + msg384087
2020-12-27 20:04:36 Guido.van.Rossum set nosy: + Guido.van.Rossum; messages: + msg383859
2020-10-20 05:27:36 pablogsal set status: open -> closed; resolution: fixed; stage: patch review -> resolved
2020-10-20 05:22:52 pablogsal set messages: + msg379087
2020-10-20 03:31:21 pablogsal set nosy: + yselivanov
2020-10-20 03:19:20 pablogsal set keywords: + patch; stage: patch review; pull_requests: + pull_request21760
2020-10-20 03:18:36 pablogsal create
