homepage

This issue tracker has been migrated to GitHub , and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: C implementation of functools.lru_cache
Type: performance Stage: resolved
Components: Extension Modules, Library (Lib) Versions: Python 3.5
process
Status: closed Resolution: fixed
Dependencies: 24276 Superseder:
Assigned To: serhiy.storchaka Nosy List: Aaron.Meurer, BreamoreBoy, amaury.forgeotdarc, asvetlov, brechtm, eric.snow, ezio.melotti, giampaolo.rodola, jaraco, jcea, josh.r, kachayev, kmike, meador.inge, ned.deily, pitrou, poke, python-dev, rhettinger, scoder, serhiy.storchaka
Priority: normal Keywords: patch

Created on 2012年03月20日 13:40 by anacrolix, last changed 2022年04月11日 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
lru_cache_class.py rhettinger, 2012年03月31日 20:10 Updated pure python roadmap for the C implementation. Adds handling for kwd_mark and try/except
functools.lru_cache-in-c.patch anacrolix, 2012年04月02日 07:20 review
14373.fixed.diff kachayev, 2012年12月19日 21:54 Python version works without C acceleration, test cases for both implementations review
14373.v3.diff kachayev, 2012年12月21日 19:38 review
14373.v4.diff kachayev, 2012年12月21日 22:13 review
lru_cache_bench.py serhiy.storchaka, 2012年12月22日 20:21 Benchmark script
14373.v7.diff kachayev, 2012年12月30日 13:43 review
14373.v9.diff kachayev, 2012年12月30日 20:04 review
clru_cache.patch serhiy.storchaka, 2013年11月22日 17:49 review
clru_cache2.patch serhiy.storchaka, 2013年11月22日 21:48 review
clru_cache_3.patch serhiy.storchaka, 2015年05月24日 07:13 review
clru_cache_crasher.py serhiy.storchaka, 2015年05月24日 07:13
clru_cache_4.patch serhiy.storchaka, 2015年05月24日 13:18 Removed fast path in make_key review
clru_cache_descr_get.patch serhiy.storchaka, 2015年06月07日 06:44 review
test_lru_cache_threaded.patch serhiy.storchaka, 2015年06月07日 18:56 review
clru_cache_new.patch serhiy.storchaka, 2015年07月14日 06:23 review
Messages (66)
msg156405 - (view) Author: Matt Joiner (anacrolix) Date: 2012年03月20日 13:40
functools.lru_cache is optimized to the point that it may benefit from a C implementation.
msg156449 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2012年03月20日 19:14
Thank you for working on this. I look forward to reviewing it.
msg156492 - (view) Author: Matt Joiner (anacrolix) Date: 2012年03月21日 12:10
Updated patch to fix a crash if maxsize isn't given, and add a unit test for that.
Possible issues:
 * I've tried to emulate object() by calling PyBaseObject_Type. Not sure if there's a more lightweight object for this that just provides the needed __hash__ and __eq__ attributes.
 * Not sure if kwd_mark is deallocated correctly.
 * Can't quite work out the best way to wrap the C cache_info() method to output the _CacheInfo named tuple. The current mechanism might not be wrapping the attributes correctly.
msg156803 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2012年03月26日 08:29
I've just started looking at this. Nice job and good attention to detail on the error checking. Expect to have a few high-level suggestions and a ton of minor edits.
Here are a couple of quick thoughts:
* The comment style needs to switch to the /* style */
* Use the pure python _cache_info by looking it up on the module object.
* I had expected PyList objects rather than links between C structs (following the pure python lru_cache as closely as possible).
msg156808 - (view) Author: Matt Joiner (anacrolix) Date: 2012年03月26日 10:53
I've fixed the commenting, and cache_info use.
I've left the element management in pure C as it reduces memory use (56 bytes for 4 element list, vs. 16 for lru_cache_elem), and avoids ref counting overhead (3 refs per link, plus GC). The difference might become quite marked for very large caches. There's also a nice invariant that links the key to the cache dict, and the result object to the lru_cache_elem. I'm happy to change this if it doesn't matter.
My only concern now is the wrapping of the lru cache object. In the Python version, @wraps allows the lru_cache to masquerade as the wrapped function wrt str/repr. The C version is wrapped, but str/repr remain unchanged. Not sure if this is a problem.
msg157224 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2012年03月31日 19:50
Matt, I'm attaching a pure python version to serve as a better map for how to implement this in C.
* it incorporate the recent lru_cache algorithmic updates (moving the root around the circular queue to re-use old links).
* it shows which parts should be implemented in C using a regular type and shows how to call it from pure python.
* it gives hints on use of #defines and PyDict_GetItem
* the critical sections are marked so you can use the GIL rather than using locks.
* there are hints for what datatypes to use (only the hits and misses need the ability to grow beyond sys.maxsize).
* it shows how to access the named tuple from within the C code (using a straight PyObject_Call).
* this code passes the test suite and should be directly translatable (and very fast).
* please follow the model and use PyList objects instead of C structure for links
* let me know if there are any questions.
msg157226 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2012年03月31日 20:10
Updated to show how to handle the kwd_mark and the try/except.
msg157343 - (view) Author: Matt Joiner (anacrolix) Date: 2012年04月02日 07:20
> * it incorporate the recent lru_cache algorithmic updates (moving the root around the circular queue to re-use old links).
The existing C patch already does this.
> * it shows which parts should be implemented in C using a regular type and shows how to call it from pure python.
 
The existing patch already did this.
> * it gives hints on use of #defines and PyDict_GetItem
I'm familiar with C. I've retained PyDict_GetItemWithError so as not to suppress typing errors from keys constructed from bad arguments from the user.
> * the critical sections are marked so you can use the GIL rather than using locks.
The existing patch is already using the GIL, it didn't have any locks.
> * there are hints for what datatypes to use (only the hits and misses need the ability to grow beyond sys.maxsize).
The existing patch already had this.
> * it shows how to access the named tuple from within the C code (using a straight PyObject_Call).
I incorporated this.
> * this code passes the test suite and should be directly translatable (and very fast).
So did the old code.
> * please follow the model and use PyList objects instead of C structure for links
I've left this as is, the linked list in C is idiomatic, avoids a lot of branching and reference counting and is also more readable than the equivalent C/Python.
msg175533 - (view) Author: Matt Joiner (anacrolix) Date: 2012年11月14日 04:16
I look forward to your feedback Ezio.
msg175540 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2012年11月14日 07:06
I wonder if we should keep the original Python implementation alongside the new C version. If we do, it would be nice to see a 100% coverage of the Python version and have the tests check both the implementations.
msg175545 - (view) Author: Ezio Melotti (ezio.melotti) * (Python committer) Date: 2012年11月14日 07:20
See also #12428.
msg176287 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012年11月24日 14:19
I have added a lot of comments in Rietveld. In general the patch looks good, but I have some style nitpicks and some more strong comments. Matt, please update the patch.
Tests should test both Python and C implementations.
I doubt if get rid of locking is right. Hash calculation and comparison of arguments can call Python code and this code can recursive use the wrapped function.
Some benchmarks will be interesting.
I think this acceleration must necessarily be in 3.4.
msg177784 - (view) Author: Alexey Kachayev (kachayev) * Date: 2012年12月19日 21:54
Updated patch with next points:
* fix typedefs
* python implementation works normally without C acceleration
* test cases cover both python/c implementations
* several style fixes in C module
msg177808 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012年12月20日 10:04
Thanks, Alexey.
However most of my comments were not considered. I have repeated them and added some new comments.
msg177867 - (view) Author: Alexey Kachayev (kachayev) * Date: 2012年12月21日 08:05
Serhiy, thank you for review. Working further on fixes.
msg177889 - (view) Author: Alexey Kachayev (kachayev) * Date: 2012年12月21日 19:38
Fixed my previous patch according to all comments from review, except removing keyword arguments from lru_cache_new function.
msg177903 - (view) Author: Alexey Kachayev (kachayev) * Date: 2012年12月21日 22:13
Added additional Py_DECREF(s) for key and value.
msg177953 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012年12月22日 20:21
LGTM. Thank you, Matt and Alexey.
Here is a simple benchmark. On my computer it shows 10-25x speedup using the C accelerator.
msg177955 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012年12月22日 21:00
Antoine reminded me about a lock. In Python implementation it needed because linked list modifications are not atomic. In C implementation linked list modifications are atomic. However dict operations can call Python code and therefore they are not atomic. I don't know what bad things can happened with concurrent cache updating, however using lock will be safer and cheap enought.
Please add lock field to lru_cache_object and use it as in Python implementation. If no one can prove that a lock is not needed.
msg177967 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2012年12月23日 10:08
Just for the record, I've compiled Raymond's roadmap version in Cython (with only slight changes to make 'self.maxsize' a Py_ssize_t and using an external .pxd for typing) and ran Serhiy's benchmark over it (Ubuntu 12.10, 64bit). This is what I get in Py3.4:
 0.022 untyped_cy(i)
 0.023 untyped_cy("spam", i)
 0.024 untyped_cy("spam", "spam", i)
 0.106 untyped_cy(a=i)
 0.133 untyped_cy(a="spam", b=i)
 0.152 untyped_cy(a="spam", b="spam", c=i)
 0.033 typed_cy(i)
 0.038 typed_cy("spam", i)
 0.039 typed_cy("spam", "spam", i)
 0.129 typed_cy(a=i)
 0.168 typed_cy(a="spam", b=i)
 0.183 typed_cy(a="spam", b="spam", c=i)
 0.143 untyped_py(i)
 0.234 untyped_py("spam", i)
 0.247 untyped_py("spam", "spam", i)
 0.368 untyped_py(a=i)
 0.406 untyped_py(a="spam", b=i)
 0.425 untyped_py(a="spam", b="spam", c=i)
 0.447 typed_py(i)
 0.469 typed_py("spam", i)
 0.480 typed_py("spam", "spam", i)
 0.745 typed_py(a=i)
 0.783 typed_py(a="spam", b=i)
 0.819 typed_py(a="spam", b="spam", c=i)
Looking at the factors, that's about the same speedup that the dedicated hand tuned C implementation presented according to Serhiy's own runs (he reported 10-25x). Makes me wonder why we should have two entirely separate implementations for this.
Here's the lru_cache_class.pxd file I used:
"""
cimport cython
cdef make_key(tuple args, dict kwds, bint typed, tuple kwd_mark)
@cython.final
@cython.internal
cdef class c_lru_cache:
 cdef dict cache
 cdef Py_ssize_t hits
 cdef Py_ssize_t misses
 cdef Py_ssize_t maxsize
 cdef bint typed
 cdef object user_function
 cdef object cache_info_type
 cdef tuple kwd_mark
 cdef list root
"""
msg177976 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2012年12月23日 12:10
Hmm. Judging by the numbers for the Python version, my machine appears
to be slower than Stefan (Behnel)'s machine, and yet the C version is
much faster here than the posted Cython numbers.
If I adjust the results for the machine differences, the C version
would appear to be 2.5x faster than the Cython version.
msg177980 - (view) Author: Stefan Krah (skrah) * (Python committer) Date: 2012年12月23日 12:40
I've managed to build the Cython version now. It's in fact between 4 and 6
times slower here than the C version.
msg177993 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2012年12月23日 16:20
Yep, I basically didn't do any optimisation, it's the plain Python code compiled, just with the class being converted into an extension type.
msg178573 - (view) Author: Alexey Kachayev (kachayev) * Date: 2012年12月30日 13:43
Thread-safe implementation for cache cleanup.
msg178589 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2012年12月30日 17:16
Alexey, as I see, you have missed some Antoine's comments (and my comments about whitespaces). Please, be more careful.
msg178611 - (view) Author: Alexey Kachayev (kachayev) * Date: 2012年12月30日 20:04
Updated diff with:
 * fix object leaks
 * tp_clear
 * additional test for maxsize < 0
I also reimplemented multithreading test (fixed error and added skip rule). But actually, I'm not sure that it's enough for ensuring thread-safety of clear operation. I'm working on other variant now. I will be appreciated for any advice about where to find example of the same (or close enough) test cases from other modules.
Regarding to previous comments from review. 
1. "guard against negative numbers"
I added special test case for negative maxsize in order to show, that now C version works the same as Python one. So, should we change both implementation? What behavior is most logical here? Reimplementation will change public API, so it's not only about acceleration.
2. Use regular PyObject instead of lru_list_elem.
What the problems are with current implementation?
msg203597 - (view) Author: Brecht Machiels (brechtm) Date: 2013年11月21日 09:34
What's the status of this patch? What still needs to be done for it to be accepted?
msg203599 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013年11月21日 09:45
I'm working on this.
msg203654 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2013年11月21日 16:43
After Serhiy looks at this, I need to review it for reentrancy thread-safety and reentrancy issues.
msg203821 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013年11月22日 17:49
Here is a revised patch. It is synchronized with tip. Got rid of capsules (sped up about 10%). lru_list_elem is now true Python object. Got rid of the lock, the new code must be thread-safe with GIL only.
I very much hope that this code will be included in 3.4. We can fix bugs later if there are any. I am also planning to optimize the make_key() functions.
msg203846 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2013年11月22日 19:42
> Got rid of the lock, the new code must be thread-safe with GIL only.
I'm not convinced by this (see the review I posted).
msg203894 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2013年11月22日 21:48
Updated patch fixes bugs found by Antoine. Thank you Antoine.
msg214208 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014年03月20日 11:51
Could you please make a review Raymond?
msg214209 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2014年03月20日 11:57
> Could you please make a review Raymond?
Yes, I will take a look. I looking a making other changes to the lru_cache and don't want the C implementation to go it first. There are still some open questions about re-entrancy that still need to be addressed before a C implementation gets added to the mix.
msg223439 - (view) Author: Mark Lawrence (BreamoreBoy) * Date: 2014年07月18日 21:42
https://pypi.python.org/pypi/fastcache/0.4.0 also seems relevant.
msg224938 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2014年08月06日 15:22
Raymond, do you have time for the review? It will be easier to support both implementation at the same time, than change only Python implementation and asynchronously update the patch with the risk to lost changes in C implementation.
I think we should increase the priority of this issue. Efficient lru_cache will affect other parts of the stdlib. In particular the use of lru_cache was withdrawed in the re module due to large overhead of Python implementation. The ipaddress module now uses own specialized implementation of the caching instead of general lru_cache for the same reason. Issue13299 proposition will be more acceptable with faster lru_cache.
msg224941 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014年08月06日 15:34
I think a lock is still needed for cache misses. The dict operations (set and del) can release the GIL (as well as course as the PyObject_Call()), therefore you might end up with duplicate list links for a given key.
(and given cache misses are supposed to be much more expensive anyway, I don't think acquiring a lock there is detrimental)
msg224980 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2014年08月07日 01:19
> I think we should increase the priority of this issue.
I don't think so at all. The LRU cache we have now is plenty efficient for its intended use cases (caching I/O bound functions and expensive functions). If is only unsuitable for functions that are already blazingly fast. 
Getting the locks right and carefully looking for re-entrancy issues is important. Also, keeping the memory footprint of the keys small is important (if people didn't care about space, they wouldn't be using an LRU at all).
I will look at this but currently have much higher priorities elsewhere in Python (adding C accelerators for tricky code is less important for the time being -- we have a long time until 3.5).
msg224981 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2014年08月07日 01:51
Le 06/08/2014 21:19, Raymond Hettinger a écrit :
>
> I don't think so at all. The LRU cache we have now is plenty
> efficient
for its intended use cases (caching I/O bound functions and expensive
functions). If is only unsuitable for functions that are already
blazingly fast.
This is an unrealistic simplification. Many functions can be either 
expensive or blazingly fast, depending on their input (typical examples 
are re.compile(), math.factorial()...). But the decision of applying the 
lru_cache decorator is a compile-time binary decision: it cannot encode 
the varying properties of the function depending on its inputs.
msg243932 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2015年05月23日 17:15
Serhiy, go ahead and apply your patch, lru_cache2.patch. I'll fix-up the make_key() code after the beta1 (like the pure python version, it needs to guarantee that hash is called no more than once per key).
msg243939 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2015年05月23日 19:43
New changeset 57776eee74f2 by Serhiy Storchaka in branch 'default':
Issue #14373: Added C implementation of functools.lru_cache(). Based on
https://hg.python.org/cpython/rev/57776eee74f2 
msg243940 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015年05月23日 19:48
Thanks Raymond.
I tried to write an implementation with locking, but it would be too complicated (much more complex than all proposed before patches), because recursive locks are needed. In any case I think that GIL is enough here. Locking in Python implementation is needed for atomic modifying linked list.
msg243962 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2015年05月24日 02:41
Thanks Serhiy. I'll work on the next steps after the beta.
msg243967 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015年05月24日 04:21
Unfortunately the patch caused a crash in test_ipaddress. Larry granted special exclusion for adding this feature after feature freeze if it will be fixed before beta 2.
msg243970 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015年05月24日 07:13
Here are the recent version of the patch and minimal crash reproducer.
msg243982 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015年05月24日 13:18
The problem is in property descriptor getter. It uses cached tuple for args and can unexpectedly modify it. Opened issue24276 for fixing this bug.
Another way is to remove fast path in lru_cache_make_key() and always copy args tuple.
msg244002 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015年05月24日 20:33
Backed out the backout in cb30db9cc029.
msg244928 - (view) Author: Tim Graham (Tim.Graham) * Date: 2015年06月06日 20:30
This changed caused a problem in Django's test suite (bisected to 0d0989359bbb0).
 File "/home/tim/code/django/django/db/models/options.py", line 709, in _populate_directed_relation_graph
 all_models = self.apps.get_models(include_auto_created=True)
TypeError: get_models() missing 1 required positional argument: 'self'
The get_models() method is decorated with @lru_cache.lru_cache(maxsize=None)
https://github.com/django/django/blob/c2b4967e76fd671e6199e4dd54d2a2c1f096b8eb/django/apps/registry.py#L157-L179 
msg244944 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015年06月07日 06:44
Here is a patch that adds __get__ for lru_cache object.
msg244950 - (view) Author: Tim Graham (Tim.Graham) * Date: 2015年06月07日 10:43
Thanks, that does resolve the issue.
msg244966 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015年06月07日 18:56
test_lru_cache_threaded is randomly failed on PPC64 PowerLinux buildbot: http://buildbot.python.org/all/builders/PPC64%20PowerLinux%203.x/builds/3573/steps/test/logs/stdio
======================================================================
FAIL: test_lru_cache_threaded (test.test_functools.TestLRUPy)
----------------------------------------------------------------------
Traceback (most recent call last):
 File "/home/shager/cpython-buildarea/3.x.edelsohn-powerlinux-ppc64/build/Lib/test/test_functools.py", line 1129, in test_lru_cache_threaded
 self.assertEqual(hits, 45)
AssertionError: 43 != 45
----------------------------------------------------------------------
The tests looks incorrect, it should fail when cached function is called simultaneously from different threads. Unfortunately it is hard reproduce the failure on other platform.
Proposed patch fixes the test and adds other threaded test, specially for simultaneous call.
msg244992 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2015年06月08日 08:21
New changeset 19dbee688a30 by Serhiy Storchaka in branch '3.5':
Issue #14373: Fixed threaded test for lru_cache(). Added new threaded test.
https://hg.python.org/cpython/rev/19dbee688a30
New changeset da331f50aad4 by Serhiy Storchaka in branch 'default':
Issue #14373: Fixed threaded test for lru_cache(). Added new threaded test.
https://hg.python.org/cpython/rev/da331f50aad4
New changeset eff50d543c79 by Serhiy Storchaka in branch '3.5':
Issue #14373: C implementation of functools.lru_cache() now can be used with
https://hg.python.org/cpython/rev/eff50d543c79
New changeset f4b785d14792 by Serhiy Storchaka in branch 'default':
Issue #14373: C implementation of functools.lru_cache() now can be used with
https://hg.python.org/cpython/rev/f4b785d14792 
msg244994 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2015年06月08日 09:45
New changeset c52f381fe674 by Serhiy Storchaka in branch '3.5':
Issue #14373: Other attempt to fix threaded test for lru_cache().
https://hg.python.org/cpython/rev/c52f381fe674
New changeset 73acb8c187d4 by Serhiy Storchaka in branch 'default':
Issue #14373: Other attempt to fix threaded test for lru_cache().
https://hg.python.org/cpython/rev/73acb8c187d4 
msg245606 - (view) Author: Raymond Hettinger (rhettinger) * (Python committer) Date: 2015年06月21日 17:37
If the C version is to remain in Python3.5, please make sure it provides all of the carefully designed features of the pure python version:
 * The hash function is called no more than once per element
 * The key is constructed to be flat as possible
 * Thread-safe where needed:
 - make_key called outside locks 
 - within a reentrant lock (either a real lock or the GIL),
 check to see if the key is in the cache and if so, 
 update the stats and return the value
 + outside of locks, call the user function
 (allowing for recursive functions)
 + within a reentrant lock (either a real lock or the GIL)
 handle cache misses by 1) verifying there was not an
 intervening cache update by the user function, 2)
 updating links, 3) updating stats. During the updates,
 make only one potentially reentrant cache[key]
 assignment after all invariants have been restored.
 Save all decrefs until all other state updates
 have been completed.
A number of features of the lru_cache were designed for space savings over speed (lru is all about eviction to make space for a new entry), for thread safety and to not fall apart during reentrancy. It also provides a guarantee that the hash function is not called more than once per element and is called *before* any of the lru structure updates or lookups (this makes reasoning about correctness *much* easier).
In these capabilities can't be reliably reproduced for 3.5, I think the whole thing should be reverted and deferred to 3.6.
msg245618 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015年06月22日 04:10
> If the C version is to remain in Python3.5, please make sure it provides all
> of the carefully designed features of the pure python version:
> 
> * The hash function is called no more than once per element
Will be satisfied by issue24483.
I think all other capabilities are satisfied. The GIL is used to satisfy 
thread-safe.
msg246573 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2015年07月10日 17:59
I'm witnessing a crash in the C implementation during garbage collection. Interestingly, it only shows in the Py3.6 branch, not in Py3.5 (both latest). Here's the gdb session:
Program received signal SIGSEGV, Segmentation fault.
lru_cache_tp_traverse (self=0x7ffff2a80ae8, visit=0x43c528 <visit_decref>, arg=0x0) at ./Modules/_functoolsmodule.c:1040
1040 lru_list_elem *next = link->next;
(gdb) list
1035 static int
1036 lru_cache_tp_traverse(lru_cache_object *self, visitproc visit, void *arg)
1037 {
1038 lru_list_elem *link = self->root.next;
1039 while (link != &self->root) {
1040 lru_list_elem *next = link->next;
1041 Py_VISIT(link);
1042 link = next;
1043 }
1044 Py_VISIT(self->maxsize_O);
(gdb) print link
1ドル = (lru_list_elem *) 0x0
(gdb) print self
2ドル = (lru_cache_object *) 0x7ffff2a80ae8
(gdb) print self->root.next
3ドル = (struct lru_list_elem *) 0x0
(gdb) print self->root
4ドル = {ob_base = {_ob_next = 0x7ffff2a26458, _ob_prev = 0x90e860 <refchain>, ob_refcnt = 1, ob_type = 0x92c500 <lru_cache_type>}, prev = 0x0, next = 0x0, key = 0x0, result = 0x0}
IIUC correctly, there is only one entry and the code doesn't expect that. An additional "is empty?" NULL check may or may not be the right fix. It might also be the linking itself that's incorrect here. The code seems to expect a cyclic data structure which is apparently not maintained that way.
msg246576 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015年07月10日 18:48
self->root.next and self->root.prev should never be NULL. Could you please provide minimal example of code that produces a crash?
msg246580 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2015年07月10日 19:46
It's not actually my own code using the lru_cache here. From a quick grep
over the code tree, the only potentially related usage I found was in
Python's fnmatch module, on the "_compile_pattern()" function. Commenting
that out then made the crash go away, so this was the culprit.
However, I ran test_functools.py of the same installation and it passes, so
not every usage is broken here. Simple manual testing didn't reveal
anything either so far.
msg246582 - (view) Author: Stefan Behnel (scoder) * (Python committer) Date: 2015年07月10日 19:49
test_fnmatch.py also passes, BTW.
msg246592 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2015年07月11日 03:09
I've also seen a crash in lru_cache_tp_traverse but with the 3.5.0b3 release build for the OS X 64-bit/32-bit installer. I just stumbled across the segfault by bringing up the interactive interpreter and typing "import ssl". After a lot of playing around, I reduced the failing case to: 1. have an "import pprint" in a startup file referred to by PYTHONSTARTUP *and* 2. "import ssl" must be the very first command entered in the interactive interpreter. Odd! Unfortunately, the release build is a non-debug build and, so far, I have not been able to reproduce the segfault with any other build, debug or non-debug. So, whatever the problem is, it's very build dependent. Here is the OS X system traceback from the segfault: 
Path: /Library/Frameworks/Python.framework/Versions/3.5.0b3_10_6/Resources/Python.app/Contents/MacOS/Python
Identifier: Python
Version: ???
Code Type: X86-64 (Native)
Parent Process: bash [51285]
Responsible: iTerm [754]
User ID: 503
Date/Time: 2015年07月10日 19:57:16.086 -0700
OS Version: Mac OS X 10.10.4 (14E46)
Report Version: 11
Anonymous UUID: CFED52E3-698C-835B-D61C-F4B1F18D2CB6
Time Awake Since Boot: 800000 seconds
Crashed Thread: 0 Dispatch queue: com.apple.main-thread
Exception Type: EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_INVALID_ADDRESS at 0x0000000000000018
VM Regions Near 0x18:
--> 
 __TEXT 0000000100000000-0000000100001000 [ 4K] r-x/rwx SM=COW /Library/Frameworks/Python.framework/Versions/3.5.0b3_10_6/Resources/Python.app/Contents/MacOS/Python
Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0 org.python.python 	0x000000010015e5e5 lru_cache_tp_traverse + 37
1 org.python.python 	0x000000010013a2d7 collect + 439
2 org.python.python 	0x000000010013aee5 _PyObject_GC_Alloc + 357
3 org.python.python 	0x000000010013afe7 _PyObject_GC_New + 23
4 org.python.python 	0x0000000100059bce PyDict_New + 334
5 org.python.python 	0x000000010015f029 lru_cache_new + 249
6 org.python.python 	0x00000001000795a6 type_call + 38
7 org.python.python 	0x000000010000dc93 PyObject_Call + 99
8 org.python.python 	0x00000001000e9fd8 PyEval_EvalFrameEx + 7656
9 org.python.python 	0x00000001000f1d00 _PyEval_EvalCodeWithName + 2400
10 org.python.python 	0x00000001000f035d PyEval_EvalFrameEx + 33133
11 org.python.python 	0x00000001000f1d00 _PyEval_EvalCodeWithName + 2400
12 org.python.python 	0x00000001000f1e07 PyEval_EvalCodeEx + 71
13 org.python.python 	0x00000001000e5ff5 builtin___build_class__ + 485
14 org.python.python 	0x0000000100065549 PyCFunction_Call + 281
15 org.python.python 	0x00000001000f0768 PyEval_EvalFrameEx + 34168
16 org.python.python 	0x00000001000f1d00 _PyEval_EvalCodeWithName + 2400
17 org.python.python 	0x00000001000f1e61 PyEval_EvalCode + 81
18 org.python.python 	0x00000001000e5683 builtin_exec + 627
19 org.python.python 	0x0000000100065519 PyCFunction_Call + 233
20 org.python.python 	0x00000001000f0a9b PyEval_EvalFrameEx + 34987
21 org.python.python 	0x00000001000f1d00 _PyEval_EvalCodeWithName + 2400
22 org.python.python 	0x00000001000f035d PyEval_EvalFrameEx + 33133
23 org.python.python 	0x00000001000f07fd PyEval_EvalFrameEx + 34317
24 org.python.python 	0x00000001000f07fd PyEval_EvalFrameEx + 34317
25 org.python.python 	0x00000001000f07fd PyEval_EvalFrameEx + 34317
26 org.python.python 	0x00000001000f1d00 _PyEval_EvalCodeWithName + 2400
27 org.python.python 	0x00000001000f1e07 PyEval_EvalCodeEx + 71
28 org.python.python 	0x000000010004017a function_call + 186
29 org.python.python 	0x000000010000dc93 PyObject_Call + 99
30 org.python.python 	0x0000000100010ff6 _PyObject_CallMethodIdObjArgs + 454
31 org.python.python 	0x000000010010d6d3 PyImport_ImportModuleLevelObject + 1171
32 org.python.python 	0x00000001000e5e03 builtin___import__ + 131
33 org.python.python 	0x0000000100065549 PyCFunction_Call + 281
34 org.python.python 	0x000000010000dc93 PyObject_Call + 99
35 org.python.python 	0x00000001000e64f7 PyEval_CallObjectWithKeywords + 87
36 org.python.python 	0x00000001000ea43e PyEval_EvalFrameEx + 8782
37 org.python.python 	0x00000001000f1d00 _PyEval_EvalCodeWithName + 2400
38 org.python.python 	0x00000001000f1e61 PyEval_EvalCode + 81
39 org.python.python 	0x00000001000e5683 builtin_exec + 627
40 org.python.python 	0x0000000100065519 PyCFunction_Call + 233
41 org.python.python 	0x00000001000f0a9b PyEval_EvalFrameEx + 34987
42 org.python.python 	0x00000001000f1d00 _PyEval_EvalCodeWithName + 2400
43 org.python.python 	0x00000001000f035d PyEval_EvalFrameEx + 33133
44 org.python.python 	0x00000001000f07fd PyEval_EvalFrameEx + 34317
45 org.python.python 	0x00000001000f07fd PyEval_EvalFrameEx + 34317
46 org.python.python 	0x00000001000f07fd PyEval_EvalFrameEx + 34317
47 org.python.python 	0x00000001000f1d00 _PyEval_EvalCodeWithName + 2400
48 org.python.python 	0x00000001000f1e07 PyEval_EvalCodeEx + 71
49 org.python.python 	0x000000010004017a function_call + 186
50 org.python.python 	0x000000010000dc93 PyObject_Call + 99
51 org.python.python 	0x0000000100010ff6 _PyObject_CallMethodIdObjArgs + 454
52 org.python.python 	0x000000010010d6d3 PyImport_ImportModuleLevelObject + 1171
53 org.python.python 	0x00000001000e5e03 builtin___import__ + 131
54 org.python.python 	0x0000000100065549 PyCFunction_Call + 281
55 org.python.python 	0x000000010000dc93 PyObject_Call + 99
56 org.python.python 	0x00000001000e64f7 PyEval_CallObjectWithKeywords + 87
57 org.python.python 	0x00000001000ea43e PyEval_EvalFrameEx + 8782
58 org.python.python 	0x00000001000f1d00 _PyEval_EvalCodeWithName + 2400
59 org.python.python 	0x00000001000f1e61 PyEval_EvalCode + 81
60 org.python.python 	0x000000010011fa7a PyRun_InteractiveOneObject + 474
61 org.python.python 	0x000000010011fdfe PyRun_InteractiveLoopFlags + 110
62 org.python.python 	0x000000010012076c PyRun_AnyFileExFlags + 76
63 org.python.python 	0x00000001001390a9 Py_Main + 3785
64 org.python.python 	0x0000000100000e32 0x100000000 + 3634
65 org.python.python 	0x0000000100000c84 0x100000000 + 3204
Thread 0 crashed with X86 Thread State (64-bit):
 rax: 0x000000010023d6c0 rbx: 0x00000001006fdb70 rcx: 0x0000000100382048 rdx: 0x0000000000000000
 rdi: 0x0000000000000000 rsi: 0x00000001001392e0 rbp: 0x00007fff5bffb530 rsp: 0x00007fff5bffb510
 r8: 0x0000000000000000 r9: 0x0000000000000001 r10: 0x0000000000000016 r11: 0x00000001001392e0
 r12: 0x00000001006fdb88 r13: 0x0000000000000000 r14: 0x00000001001392e0 r15: 0x000000010102fdb8
 rip: 0x000000010015e5e5 rfl: 0x0000000000010206 cr2: 0x0000000000000018
msg246718 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015年07月14日 06:23
Stefan, Ned, can you reproduce the same crash with following patch?
msg246720 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2015年07月14日 07:25
Serhiy, I'll try making a 3.5.0b3+patch installer build in the same manner as the 3.5.0b3 installer build. But I may not get to it for several days.
msg247309 - (view) Author: Ned Deily (ned.deily) * (Python committer) Date: 2015年07月25日 02:07
Sorry about the delay in testing the patch. I just confirmed (1) that I am still able to produce a segfault on OS X as described above under the specific conditions with a 10.6-like installer built with the current 3.5 tip and (2) that, with clru_cache_new.patch applied to the same current 3.5 tip, I no longer see the segfault. So it looks like a fix.
msg247320 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2015年07月25日 09:11
New changeset 5345e5ce2eed by Serhiy Storchaka in branch '3.5':
Issue #14373: Fixed segmentation fault when gc.collect() is called during
https://hg.python.org/cpython/rev/5345e5ce2eed
New changeset 9c6d11d22801 by Serhiy Storchaka in branch 'default':
Issue #14373: Fixed segmentation fault when gc.collect() is called during
https://hg.python.org/cpython/rev/9c6d11d22801 
msg247321 - (view) Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015年07月25日 09:13
Thank you Ned.
Is anything left to do with this issue or it can be closed?
msg253235 - (view) Author: Jason R. Coombs (jaraco) * (Python committer) Date: 2015年10月20日 16:41
I suspect this change is implicated in issue25447.
History
Date User Action Args
2022年04月11日 14:57:28adminsetgithub: 58581
2015年11月01日 08:32:11serhiy.storchakasetstatus: open -> closed
resolution: fixed
stage: patch review -> resolved
2015年10月21日 07:26:05anacrolixsetnosy: - anacrolix
2015年10月20日 16:41:34jaracosetstatus: pending -> open
nosy: + jaraco
messages: + msg253235

2015年10月06日 04:20:50rhettingersetstatus: open -> pending
assignee: serhiy.storchaka
2015年10月06日 03:21:13rhettingersetstatus: pending -> open
assignee: rhettinger -> (no value)
2015年07月25日 09:13:21serhiy.storchakasetpriority: release blocker -> normal
status: open -> pending
messages: + msg247321
2015年07月25日 09:11:39python-devsetmessages: + msg247320
2015年07月25日 02:07:22ned.deilysetmessages: + msg247309
2015年07月21日 19:46:01serhiy.storchakasetpriority: normal -> release blocker
2015年07月14日 07:25:52ned.deilysetmessages: + msg246720
2015年07月14日 06:23:54serhiy.storchakasetfiles: + clru_cache_new.patch

messages: + msg246718
2015年07月11日 03:09:37ned.deilysetnosy: + ned.deily
messages: + msg246592
2015年07月10日 19:49:58Tim.Grahamsetnosy: - Tim.Graham
2015年07月10日 19:49:10scodersetmessages: + msg246582
2015年07月10日 19:46:03scodersetmessages: + msg246580
2015年07月10日 18:48:02serhiy.storchakasetmessages: + msg246576
2015年07月10日 17:59:52scodersetmessages: + msg246573
2015年06月22日 04:10:51serhiy.storchakasetmessages: + msg245618
2015年06月21日 17:37:09rhettingersetmessages: + msg245606
2015年06月08日 09:45:16python-devsetmessages: + msg244994
2015年06月08日 08:21:11python-devsetmessages: + msg244992
2015年06月07日 18:56:06serhiy.storchakasetfiles: + test_lru_cache_threaded.patch

messages: + msg244966
2015年06月07日 10:43:37Tim.Grahamsetmessages: + msg244950
2015年06月07日 06:44:39serhiy.storchakasetfiles: + clru_cache_descr_get.patch

messages: + msg244944
2015年06月06日 20:30:52Tim.Grahamsetnosy: + Tim.Graham
messages: + msg244928
2015年05月24日 20:33:07serhiy.storchakasetmessages: + msg244002
2015年05月24日 13:26:31serhiy.storchakasetdependencies: + Correct reuse argument tuple in property descriptor
2015年05月24日 13:18:47serhiy.storchakasetfiles: + clru_cache_4.patch

messages: + msg243982
2015年05月24日 07:13:31serhiy.storchakasetfiles: + clru_cache_3.patch, clru_cache_crasher.py

messages: + msg243970
2015年05月24日 04:21:35serhiy.storchakasetmessages: + msg243967
2015年05月24日 02:41:23rhettingersetassignee: serhiy.storchaka -> rhettinger
messages: + msg243962
2015年05月23日 19:48:36serhiy.storchakasetmessages: + msg243940
2015年05月23日 19:43:16python-devsetnosy: + python-dev
messages: + msg243939
2015年05月23日 17:15:17rhettingersetassignee: rhettinger -> serhiy.storchaka
messages: + msg243932
2015年04月10日 18:10:54kmikesetnosy: + kmike
2014年10月14日 15:44:57skrahsetnosy: - skrah
2014年08月07日 01:51:22pitrousetmessages: + msg224981
2014年08月07日 01:19:01rhettingersetmessages: + msg224980
2014年08月06日 15:34:01pitrousetmessages: + msg224941
2014年08月06日 15:30:20pitrousetpriority: low -> normal
2014年08月06日 15:22:45serhiy.storchakasetmessages: + msg224938
2014年07月18日 21:42:01BreamoreBoysetnosy: + BreamoreBoy
messages: + msg223439
2014年06月26日 19:12:45Aaron.Meurersetnosy: + Aaron.Meurer
2014年04月07日 16:14:56pokesetnosy: + poke
2014年03月20日 11:57:03rhettingersetpriority: high -> low

messages: + msg214209
2014年03月20日 11:51:15serhiy.storchakasetmessages: + msg214208
2014年03月06日 22:38:45josh.rsetnosy: + josh.r
2013年11月22日 21:48:52serhiy.storchakasetfiles: + clru_cache2.patch

messages: + msg203894
2013年11月22日 19:42:59pitrousetmessages: + msg203846
2013年11月22日 17:49:54serhiy.storchakasetpriority: low -> high
files: + clru_cache.patch
messages: + msg203821
2013年11月21日 16:43:43rhettingersetpriority: normal -> low

messages: + msg203654
versions: + Python 3.5, - Python 3.4
2013年11月21日 09:45:30serhiy.storchakasetmessages: + msg203599
2013年11月21日 09:34:58brechtmsetnosy: + brechtm
messages: + msg203597
2013年02月16日 00:39:48jceasetnosy: + jcea
2012年12月30日 20:04:12kachayevsetfiles: + 14373.v9.diff

messages: + msg178611
2012年12月30日 17:16:04serhiy.storchakasetmessages: + msg178589
2012年12月30日 13:44:02kachayevsetfiles: + 14373.v7.diff

messages: + msg178573
2012年12月23日 16:20:54scodersetmessages: + msg177993
2012年12月23日 12:40:00skrahsetmessages: + msg177980
2012年12月23日 12:10:56skrahsetnosy: + skrah
messages: + msg177976
2012年12月23日 10:08:24scodersetnosy: + scoder
messages: + msg177967
2012年12月22日 21:01:00serhiy.storchakasetmessages: + msg177955
stage: commit review -> patch review
2012年12月22日 20:21:15serhiy.storchakasetfiles: + lru_cache_bench.py

messages: + msg177953
stage: needs patch -> commit review
2012年12月21日 22:13:52kachayevsetfiles: + 14373.v4.diff

messages: + msg177903
2012年12月21日 19:38:44kachayevsetfiles: + 14373.v3.diff

messages: + msg177889
2012年12月21日 08:05:05kachayevsetmessages: + msg177867
2012年12月20日 11:57:36nedbatsetnosy: - nedbat
2012年12月20日 10:04:57serhiy.storchakasetmessages: + msg177808
stage: patch review -> needs patch
2012年12月20日 09:01:41serhiy.storchakasetstage: needs patch -> patch review
2012年12月19日 22:10:44asvetlovsetnosy: + amaury.forgeotdarc, pitrou
2012年12月19日 21:54:18kachayevsetfiles: + 14373.fixed.diff
nosy: + kachayev
messages: + msg177784

2012年11月24日 14:19:26serhiy.storchakasetmessages: + msg176287
components: + Extension Modules, - Interpreter Core
stage: patch review -> needs patch
2012年11月14日 10:45:09asvetlovsetnosy: + asvetlov
2012年11月14日 08:48:35serhiy.storchakasetnosy: + serhiy.storchaka
2012年11月14日 07:20:34ezio.melottisetmessages: + msg175545
2012年11月14日 07:06:11ezio.melottisetmessages: + msg175540
2012年11月14日 04:16:12anacrolixsetmessages: + msg175533
2012年11月13日 05:13:09ezio.melottisetnosy: + ezio.melotti
stage: patch review

versions: + Python 3.4, - Python 3.3
2012年11月13日 03:47:59eric.snowsetnosy: + eric.snow
2012年04月02日 07:20:41anacrolixsetfiles: - functools.lru_cache-in-c.patch
2012年04月02日 07:20:31anacrolixsetfiles: - functools.lru_cache-in-c.patch
2012年04月02日 07:20:15anacrolixsetfiles: + functools.lru_cache-in-c.patch

messages: + msg157343
2012年03月31日 20:10:34rhettingersetfiles: - lru_cache_class.py
2012年03月31日 20:10:00rhettingersetfiles: + lru_cache_class.py

messages: + msg157226
2012年03月31日 19:50:28rhettingersetfiles: + lru_cache_class.py

messages: + msg157224
2012年03月26日 10:54:47anacrolixsetfiles: - functools.lru_cache-in-c
2012年03月26日 10:53:33anacrolixsetfiles: + functools.lru_cache-in-c.patch

messages: + msg156808
2012年03月26日 08:29:55rhettingersetmessages: + msg156803
2012年03月25日 18:57:06anacrolixsetnosy: + nedbat
2012年03月21日 14:52:09meador.ingesetnosy: + meador.inge
2012年03月21日 12:10:43anacrolixsetfiles: + functools.lru_cache-in-c.patch
keywords: + patch
messages: + msg156492
2012年03月20日 19:14:14rhettingersetassignee: rhettinger
messages: + msg156449
2012年03月20日 18:30:15giampaolo.rodolasetnosy: + giampaolo.rodola
2012年03月20日 14:14:47anacrolixsetfiles: + functools.lru_cache-in-c
2012年03月20日 13:40:44anacrolixcreate

AltStyle によって変換されたページ (->オリジナル) /