This issue tracker has been migrated to GitHub ,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2012年03月20日 13:40 by anacrolix, last changed 2022年04月11日 14:57 by admin. This issue is now closed.
| Files | ||||
|---|---|---|---|---|
| File name | Uploaded | Description | Edit | |
| lru_cache_class.py | rhettinger, 2012年03月31日 20:10 | Updated pure python roadmap for the C implementation. Adds handling for kwd_mark and try/except | ||
| functools.lru_cache-in-c.patch | anacrolix, 2012年04月02日 07:20 | review | ||
| 14373.fixed.diff | kachayev, 2012年12月19日 21:54 | Python version works without C acceleration, test cases for both implementations | review | |
| 14373.v3.diff | kachayev, 2012年12月21日 19:38 | review | ||
| 14373.v4.diff | kachayev, 2012年12月21日 22:13 | review | ||
| lru_cache_bench.py | serhiy.storchaka, 2012年12月22日 20:21 | Benchmark script | ||
| 14373.v7.diff | kachayev, 2012年12月30日 13:43 | review | ||
| 14373.v9.diff | kachayev, 2012年12月30日 20:04 | review | ||
| clru_cache.patch | serhiy.storchaka, 2013年11月22日 17:49 | review | ||
| clru_cache2.patch | serhiy.storchaka, 2013年11月22日 21:48 | review | ||
| clru_cache_3.patch | serhiy.storchaka, 2015年05月24日 07:13 | review | ||
| clru_cache_crasher.py | serhiy.storchaka, 2015年05月24日 07:13 | |||
| clru_cache_4.patch | serhiy.storchaka, 2015年05月24日 13:18 | Removed fast path in make_key | review | |
| clru_cache_descr_get.patch | serhiy.storchaka, 2015年06月07日 06:44 | review | ||
| test_lru_cache_threaded.patch | serhiy.storchaka, 2015年06月07日 18:56 | review | ||
| clru_cache_new.patch | serhiy.storchaka, 2015年07月14日 06:23 | review | ||
| Messages (66) | |||
|---|---|---|---|
| msg156405 - (view) | Author: Matt Joiner (anacrolix) | Date: 2012年03月20日 13:40 | |
functools.lru_cache is optimized to the point that it may benefit from a C implementation. |
|||
| msg156449 - (view) | Author: Raymond Hettinger (rhettinger) * (Python committer) | Date: 2012年03月20日 19:14 | |
Thank you for working on this. I look forward to reviewing it. |
|||
| msg156492 - (view) | Author: Matt Joiner (anacrolix) | Date: 2012年03月21日 12:10 | |
Updated patch to fix a crash if maxsize isn't given, and add a unit test for that. Possible issues: * I've tried to emulate object() by calling PyBaseObject_Type. Not sure if there's a more lightweight object for this that just provides the needed __hash__ and __eq__ attributes. * Not sure if kwd_mark is deallocated correctly. * Can't quite work out the best way to wrap the C cache_info() method to output the _CacheInfo named tuple. The current mechanism might not be wrapping the attributes correctly. |
|||
| msg156803 - (view) | Author: Raymond Hettinger (rhettinger) * (Python committer) | Date: 2012年03月26日 08:29 | |
I've just started looking at this. Nice job and good attention to detail on the error checking. Expect to have a few high-level suggestions and a ton of minor edits. Here are a couple of quick thoughts: * The comment style needs to switch to the /* style */ * Use the pure python _cache_info by looking it up on the module object. * I had expected PyList objects rather than links between C structs (following the pure python lru_cache as closely as possible). |
|||
| msg156808 - (view) | Author: Matt Joiner (anacrolix) | Date: 2012年03月26日 10:53 | |
I've fixed the commenting, and cache_info use. I've left the element management in pure C as it reduces memory use (56 bytes for 4 element list, vs. 16 for lru_cache_elem), and avoids ref counting overhead (3 refs per link, plus GC). The difference might become quite marked for very large caches. There's also a nice invariant that links the key to the cache dict, and the result object to the lru_cache_elem. I'm happy to change this if it doesn't matter. My only concern now is the wrapping of the lru cache object. In the Python version, @wraps allows the lru_cache to masquerade as the wrapped function wrt str/repr. The C version is wrapped, but str/repr remain unchanged. Not sure if this is a problem. |
|||
| msg157224 - (view) | Author: Raymond Hettinger (rhettinger) * (Python committer) | Date: 2012年03月31日 19:50 | |
Matt, I'm attaching a pure python version to serve as a better map for how to implement this in C. * it incorporate the recent lru_cache algorithmic updates (moving the root around the circular queue to re-use old links). * it shows which parts should be implemented in C using a regular type and shows how to call it from pure python. * it gives hints on use of #defines and PyDict_GetItem * the critical sections are marked so you can use the GIL rather than using locks. * there are hints for what datatypes to use (only the hits and misses need the ability to grow beyond sys.maxsize). * it shows how to access the named tuple from within the C code (using a straight PyObject_Call). * this code passes the test suite and should be directly translatable (and very fast). * please follow the model and use PyList objects instead of C structure for links * let me know if there are any questions. |
|||
| msg157226 - (view) | Author: Raymond Hettinger (rhettinger) * (Python committer) | Date: 2012年03月31日 20:10 | |
Updated to show how to handle the kwd_mark and the try/except. |
|||
| msg157343 - (view) | Author: Matt Joiner (anacrolix) | Date: 2012年04月02日 07:20 | |
> * it incorporate the recent lru_cache algorithmic updates (moving the root around the circular queue to re-use old links). The existing C patch already does this. > * it shows which parts should be implemented in C using a regular type and shows how to call it from pure python. The existing patch already did this. > * it gives hints on use of #defines and PyDict_GetItem I'm familiar with C. I've retained PyDict_GetItemWithError so as not to suppress typing errors from keys constructed from bad arguments from the user. > * the critical sections are marked so you can use the GIL rather than using locks. The existing patch is already using the GIL, it didn't have any locks. > * there are hints for what datatypes to use (only the hits and misses need the ability to grow beyond sys.maxsize). The existing patch already had this. > * it shows how to access the named tuple from within the C code (using a straight PyObject_Call). I incorporated this. > * this code passes the test suite and should be directly translatable (and very fast). So did the old code. > * please follow the model and use PyList objects instead of C structure for links I've left this as is, the linked list in C is idiomatic, avoids a lot of branching and reference counting and is also more readable than the equivalent C/Python. |
|||
| msg175533 - (view) | Author: Matt Joiner (anacrolix) | Date: 2012年11月14日 04:16 | |
I look forward to your feedback Ezio. |
|||
| msg175540 - (view) | Author: Ezio Melotti (ezio.melotti) * (Python committer) | Date: 2012年11月14日 07:06 | |
I wonder if we should keep the original Python implementation alongside the new C version. If we do, it would be nice to see a 100% coverage of the Python version and have the tests check both the implementations. |
|||
| msg175545 - (view) | Author: Ezio Melotti (ezio.melotti) * (Python committer) | Date: 2012年11月14日 07:20 | |
See also #12428. |
|||
| msg176287 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2012年11月24日 14:19 | |
I have added a lot of comments in Rietveld. In general the patch looks good, but I have some style nitpicks and some more strong comments. Matt, please update the patch. Tests should test both Python and C implementations. I doubt if get rid of locking is right. Hash calculation and comparison of arguments can call Python code and this code can recursive use the wrapped function. Some benchmarks will be interesting. I think this acceleration must necessarily be in 3.4. |
|||
| msg177784 - (view) | Author: Alexey Kachayev (kachayev) * | Date: 2012年12月19日 21:54 | |
Updated patch with next points: * fix typedefs * python implementation works normally without C acceleration * test cases cover both python/c implementations * several style fixes in C module |
|||
| msg177808 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2012年12月20日 10:04 | |
Thanks, Alexey. However most of my comments were not considered. I have repeated them and added some new comments. |
|||
| msg177867 - (view) | Author: Alexey Kachayev (kachayev) * | Date: 2012年12月21日 08:05 | |
Serhiy, thank you for review. Working further on fixes. |
|||
| msg177889 - (view) | Author: Alexey Kachayev (kachayev) * | Date: 2012年12月21日 19:38 | |
Fixed my previous patch according to all comments from review, except removing keyword arguments from lru_cache_new function. |
|||
| msg177903 - (view) | Author: Alexey Kachayev (kachayev) * | Date: 2012年12月21日 22:13 | |
Added additional Py_DECREF(s) for key and value. |
|||
| msg177953 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2012年12月22日 20:21 | |
LGTM. Thank you, Matt and Alexey. Here is a simple benchmark. On my computer it shows 10-25x speedup using the C accelerator. |
|||
| msg177955 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2012年12月22日 21:00 | |
Antoine reminded me about a lock. In Python implementation it needed because linked list modifications are not atomic. In C implementation linked list modifications are atomic. However dict operations can call Python code and therefore they are not atomic. I don't know what bad things can happened with concurrent cache updating, however using lock will be safer and cheap enought. Please add lock field to lru_cache_object and use it as in Python implementation. If no one can prove that a lock is not needed. |
|||
| msg177967 - (view) | Author: Stefan Behnel (scoder) * (Python committer) | Date: 2012年12月23日 10:08 | |
Just for the record, I've compiled Raymond's roadmap version in Cython (with only slight changes to make 'self.maxsize' a Py_ssize_t and using an external .pxd for typing) and ran Serhiy's benchmark over it (Ubuntu 12.10, 64bit). This is what I get in Py3.4:
0.022 untyped_cy(i)
0.023 untyped_cy("spam", i)
0.024 untyped_cy("spam", "spam", i)
0.106 untyped_cy(a=i)
0.133 untyped_cy(a="spam", b=i)
0.152 untyped_cy(a="spam", b="spam", c=i)
0.033 typed_cy(i)
0.038 typed_cy("spam", i)
0.039 typed_cy("spam", "spam", i)
0.129 typed_cy(a=i)
0.168 typed_cy(a="spam", b=i)
0.183 typed_cy(a="spam", b="spam", c=i)
0.143 untyped_py(i)
0.234 untyped_py("spam", i)
0.247 untyped_py("spam", "spam", i)
0.368 untyped_py(a=i)
0.406 untyped_py(a="spam", b=i)
0.425 untyped_py(a="spam", b="spam", c=i)
0.447 typed_py(i)
0.469 typed_py("spam", i)
0.480 typed_py("spam", "spam", i)
0.745 typed_py(a=i)
0.783 typed_py(a="spam", b=i)
0.819 typed_py(a="spam", b="spam", c=i)
Looking at the factors, that's about the same speedup that the dedicated hand tuned C implementation presented according to Serhiy's own runs (he reported 10-25x). Makes me wonder why we should have two entirely separate implementations for this.
Here's the lru_cache_class.pxd file I used:
"""
cimport cython
cdef make_key(tuple args, dict kwds, bint typed, tuple kwd_mark)
@cython.final
@cython.internal
cdef class c_lru_cache:
cdef dict cache
cdef Py_ssize_t hits
cdef Py_ssize_t misses
cdef Py_ssize_t maxsize
cdef bint typed
cdef object user_function
cdef object cache_info_type
cdef tuple kwd_mark
cdef list root
"""
|
|||
| msg177976 - (view) | Author: Stefan Krah (skrah) * (Python committer) | Date: 2012年12月23日 12:10 | |
Hmm. Judging by the numbers for the Python version, my machine appears to be slower than Stefan (Behnel)'s machine, and yet the C version is much faster here than the posted Cython numbers. If I adjust the results for the machine differences, the C version would appear to be 2.5x faster than the Cython version. |
|||
| msg177980 - (view) | Author: Stefan Krah (skrah) * (Python committer) | Date: 2012年12月23日 12:40 | |
I've managed to build the Cython version now. It's in fact between 4 and 6 times slower here than the C version. |
|||
| msg177993 - (view) | Author: Stefan Behnel (scoder) * (Python committer) | Date: 2012年12月23日 16:20 | |
Yep, I basically didn't do any optimisation, it's the plain Python code compiled, just with the class being converted into an extension type. |
|||
| msg178573 - (view) | Author: Alexey Kachayev (kachayev) * | Date: 2012年12月30日 13:43 | |
Thread-safe implementation for cache cleanup. |
|||
| msg178589 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2012年12月30日 17:16 | |
Alexey, as I see, you have missed some Antoine's comments (and my comments about whitespaces). Please, be more careful. |
|||
| msg178611 - (view) | Author: Alexey Kachayev (kachayev) * | Date: 2012年12月30日 20:04 | |
Updated diff with: * fix object leaks * tp_clear * additional test for maxsize < 0 I also reimplemented multithreading test (fixed error and added skip rule). But actually, I'm not sure that it's enough for ensuring thread-safety of clear operation. I'm working on other variant now. I will be appreciated for any advice about where to find example of the same (or close enough) test cases from other modules. Regarding to previous comments from review. 1. "guard against negative numbers" I added special test case for negative maxsize in order to show, that now C version works the same as Python one. So, should we change both implementation? What behavior is most logical here? Reimplementation will change public API, so it's not only about acceleration. 2. Use regular PyObject instead of lru_list_elem. What the problems are with current implementation? |
|||
| msg203597 - (view) | Author: Brecht Machiels (brechtm) | Date: 2013年11月21日 09:34 | |
What's the status of this patch? What still needs to be done for it to be accepted? |
|||
| msg203599 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2013年11月21日 09:45 | |
I'm working on this. |
|||
| msg203654 - (view) | Author: Raymond Hettinger (rhettinger) * (Python committer) | Date: 2013年11月21日 16:43 | |
After Serhiy looks at this, I need to review it for reentrancy thread-safety and reentrancy issues. |
|||
| msg203821 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2013年11月22日 17:49 | |
Here is a revised patch. It is synchronized with tip. Got rid of capsules (sped up about 10%). lru_list_elem is now true Python object. Got rid of the lock, the new code must be thread-safe with GIL only. I very much hope that this code will be included in 3.4. We can fix bugs later if there are any. I am also planning to optimize the make_key() functions. |
|||
| msg203846 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2013年11月22日 19:42 | |
> Got rid of the lock, the new code must be thread-safe with GIL only. I'm not convinced by this (see the review I posted). |
|||
| msg203894 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2013年11月22日 21:48 | |
Updated patch fixes bugs found by Antoine. Thank you Antoine. |
|||
| msg214208 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2014年03月20日 11:51 | |
Could you please make a review Raymond? |
|||
| msg214209 - (view) | Author: Raymond Hettinger (rhettinger) * (Python committer) | Date: 2014年03月20日 11:57 | |
> Could you please make a review Raymond? Yes, I will take a look. I looking a making other changes to the lru_cache and don't want the C implementation to go it first. There are still some open questions about re-entrancy that still need to be addressed before a C implementation gets added to the mix. |
|||
| msg223439 - (view) | Author: Mark Lawrence (BreamoreBoy) * | Date: 2014年07月18日 21:42 | |
https://pypi.python.org/pypi/fastcache/0.4.0 also seems relevant. |
|||
| msg224938 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2014年08月06日 15:22 | |
Raymond, do you have time for the review? It will be easier to support both implementation at the same time, than change only Python implementation and asynchronously update the patch with the risk to lost changes in C implementation. I think we should increase the priority of this issue. Efficient lru_cache will affect other parts of the stdlib. In particular the use of lru_cache was withdrawed in the re module due to large overhead of Python implementation. The ipaddress module now uses own specialized implementation of the caching instead of general lru_cache for the same reason. Issue13299 proposition will be more acceptable with faster lru_cache. |
|||
| msg224941 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2014年08月06日 15:34 | |
I think a lock is still needed for cache misses. The dict operations (set and del) can release the GIL (as well as course as the PyObject_Call()), therefore you might end up with duplicate list links for a given key. (and given cache misses are supposed to be much more expensive anyway, I don't think acquiring a lock there is detrimental) |
|||
| msg224980 - (view) | Author: Raymond Hettinger (rhettinger) * (Python committer) | Date: 2014年08月07日 01:19 | |
> I think we should increase the priority of this issue. I don't think so at all. The LRU cache we have now is plenty efficient for its intended use cases (caching I/O bound functions and expensive functions). If is only unsuitable for functions that are already blazingly fast. Getting the locks right and carefully looking for re-entrancy issues is important. Also, keeping the memory footprint of the keys small is important (if people didn't care about space, they wouldn't be using an LRU at all). I will look at this but currently have much higher priorities elsewhere in Python (adding C accelerators for tricky code is less important for the time being -- we have a long time until 3.5). |
|||
| msg224981 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2014年08月07日 01:51 | |
Le 06/08/2014 21:19, Raymond Hettinger a écrit : > > I don't think so at all. The LRU cache we have now is plenty > efficient for its intended use cases (caching I/O bound functions and expensive functions). If is only unsuitable for functions that are already blazingly fast. This is an unrealistic simplification. Many functions can be either expensive or blazingly fast, depending on their input (typical examples are re.compile(), math.factorial()...). But the decision of applying the lru_cache decorator is a compile-time binary decision: it cannot encode the varying properties of the function depending on its inputs. |
|||
| msg243932 - (view) | Author: Raymond Hettinger (rhettinger) * (Python committer) | Date: 2015年05月23日 17:15 | |
Serhiy, go ahead and apply your patch, lru_cache2.patch. I'll fix-up the make_key() code after the beta1 (like the pure python version, it needs to guarantee that hash is called no more than once per key). |
|||
| msg243939 - (view) | Author: Roundup Robot (python-dev) (Python triager) | Date: 2015年05月23日 19:43 | |
New changeset 57776eee74f2 by Serhiy Storchaka in branch 'default': Issue #14373: Added C implementation of functools.lru_cache(). Based on https://hg.python.org/cpython/rev/57776eee74f2 |
|||
| msg243940 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2015年05月23日 19:48 | |
Thanks Raymond. I tried to write an implementation with locking, but it would be too complicated (much more complex than all proposed before patches), because recursive locks are needed. In any case I think that GIL is enough here. Locking in Python implementation is needed for atomic modifying linked list. |
|||
| msg243962 - (view) | Author: Raymond Hettinger (rhettinger) * (Python committer) | Date: 2015年05月24日 02:41 | |
Thanks Serhiy. I'll work on the next steps after the beta. |
|||
| msg243967 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2015年05月24日 04:21 | |
Unfortunately the patch caused a crash in test_ipaddress. Larry granted special exclusion for adding this feature after feature freeze if it will be fixed before beta 2. |
|||
| msg243970 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2015年05月24日 07:13 | |
Here are the recent version of the patch and minimal crash reproducer. |
|||
| msg243982 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2015年05月24日 13:18 | |
The problem is in property descriptor getter. It uses cached tuple for args and can unexpectedly modify it. Opened issue24276 for fixing this bug. Another way is to remove fast path in lru_cache_make_key() and always copy args tuple. |
|||
| msg244002 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2015年05月24日 20:33 | |
Backed out the backout in cb30db9cc029. |
|||
| msg244928 - (view) | Author: Tim Graham (Tim.Graham) * | Date: 2015年06月06日 20:30 | |
This changed caused a problem in Django's test suite (bisected to 0d0989359bbb0). File "/home/tim/code/django/django/db/models/options.py", line 709, in _populate_directed_relation_graph all_models = self.apps.get_models(include_auto_created=True) TypeError: get_models() missing 1 required positional argument: 'self' The get_models() method is decorated with @lru_cache.lru_cache(maxsize=None) https://github.com/django/django/blob/c2b4967e76fd671e6199e4dd54d2a2c1f096b8eb/django/apps/registry.py#L157-L179 |
|||
| msg244944 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2015年06月07日 06:44 | |
Here is a patch that adds __get__ for lru_cache object. |
|||
| msg244950 - (view) | Author: Tim Graham (Tim.Graham) * | Date: 2015年06月07日 10:43 | |
Thanks, that does resolve the issue. |
|||
| msg244966 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2015年06月07日 18:56 | |
test_lru_cache_threaded is randomly failed on PPC64 PowerLinux buildbot: http://buildbot.python.org/all/builders/PPC64%20PowerLinux%203.x/builds/3573/steps/test/logs/stdio ====================================================================== FAIL: test_lru_cache_threaded (test.test_functools.TestLRUPy) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/shager/cpython-buildarea/3.x.edelsohn-powerlinux-ppc64/build/Lib/test/test_functools.py", line 1129, in test_lru_cache_threaded self.assertEqual(hits, 45) AssertionError: 43 != 45 ---------------------------------------------------------------------- The tests looks incorrect, it should fail when cached function is called simultaneously from different threads. Unfortunately it is hard reproduce the failure on other platform. Proposed patch fixes the test and adds other threaded test, specially for simultaneous call. |
|||
| msg244992 - (view) | Author: Roundup Robot (python-dev) (Python triager) | Date: 2015年06月08日 08:21 | |
New changeset 19dbee688a30 by Serhiy Storchaka in branch '3.5': Issue #14373: Fixed threaded test for lru_cache(). Added new threaded test. https://hg.python.org/cpython/rev/19dbee688a30 New changeset da331f50aad4 by Serhiy Storchaka in branch 'default': Issue #14373: Fixed threaded test for lru_cache(). Added new threaded test. https://hg.python.org/cpython/rev/da331f50aad4 New changeset eff50d543c79 by Serhiy Storchaka in branch '3.5': Issue #14373: C implementation of functools.lru_cache() now can be used with https://hg.python.org/cpython/rev/eff50d543c79 New changeset f4b785d14792 by Serhiy Storchaka in branch 'default': Issue #14373: C implementation of functools.lru_cache() now can be used with https://hg.python.org/cpython/rev/f4b785d14792 |
|||
| msg244994 - (view) | Author: Roundup Robot (python-dev) (Python triager) | Date: 2015年06月08日 09:45 | |
New changeset c52f381fe674 by Serhiy Storchaka in branch '3.5': Issue #14373: Other attempt to fix threaded test for lru_cache(). https://hg.python.org/cpython/rev/c52f381fe674 New changeset 73acb8c187d4 by Serhiy Storchaka in branch 'default': Issue #14373: Other attempt to fix threaded test for lru_cache(). https://hg.python.org/cpython/rev/73acb8c187d4 |
|||
| msg245606 - (view) | Author: Raymond Hettinger (rhettinger) * (Python committer) | Date: 2015年06月21日 17:37 | |
If the C version is to remain in Python3.5, please make sure it provides all of the carefully designed features of the pure python version: * The hash function is called no more than once per element * The key is constructed to be flat as possible * Thread-safe where needed: - make_key called outside locks - within a reentrant lock (either a real lock or the GIL), check to see if the key is in the cache and if so, update the stats and return the value + outside of locks, call the user function (allowing for recursive functions) + within a reentrant lock (either a real lock or the GIL) handle cache misses by 1) verifying there was not an intervening cache update by the user function, 2) updating links, 3) updating stats. During the updates, make only one potentially reentrant cache[key] assignment after all invariants have been restored. Save all decrefs until all other state updates have been completed. A number of features of the lru_cache were designed for space savings over speed (lru is all about eviction to make space for a new entry), for thread safety and to not fall apart during reentrancy. It also provides a guarantee that the hash function is not called more than once per element and is called *before* any of the lru structure updates or lookups (this makes reasoning about correctness *much* easier). In these capabilities can't be reliably reproduced for 3.5, I think the whole thing should be reverted and deferred to 3.6. |
|||
| msg245618 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2015年06月22日 04:10 | |
> If the C version is to remain in Python3.5, please make sure it provides all > of the carefully designed features of the pure python version: > > * The hash function is called no more than once per element Will be satisfied by issue24483. I think all other capabilities are satisfied. The GIL is used to satisfy thread-safe. |
|||
| msg246573 - (view) | Author: Stefan Behnel (scoder) * (Python committer) | Date: 2015年07月10日 17:59 | |
I'm witnessing a crash in the C implementation during garbage collection. Interestingly, it only shows in the Py3.6 branch, not in Py3.5 (both latest). Here's the gdb session:
Program received signal SIGSEGV, Segmentation fault.
lru_cache_tp_traverse (self=0x7ffff2a80ae8, visit=0x43c528 <visit_decref>, arg=0x0) at ./Modules/_functoolsmodule.c:1040
1040 lru_list_elem *next = link->next;
(gdb) list
1035 static int
1036 lru_cache_tp_traverse(lru_cache_object *self, visitproc visit, void *arg)
1037 {
1038 lru_list_elem *link = self->root.next;
1039 while (link != &self->root) {
1040 lru_list_elem *next = link->next;
1041 Py_VISIT(link);
1042 link = next;
1043 }
1044 Py_VISIT(self->maxsize_O);
(gdb) print link
1ドル = (lru_list_elem *) 0x0
(gdb) print self
2ドル = (lru_cache_object *) 0x7ffff2a80ae8
(gdb) print self->root.next
3ドル = (struct lru_list_elem *) 0x0
(gdb) print self->root
4ドル = {ob_base = {_ob_next = 0x7ffff2a26458, _ob_prev = 0x90e860 <refchain>, ob_refcnt = 1, ob_type = 0x92c500 <lru_cache_type>}, prev = 0x0, next = 0x0, key = 0x0, result = 0x0}
IIUC correctly, there is only one entry and the code doesn't expect that. An additional "is empty?" NULL check may or may not be the right fix. It might also be the linking itself that's incorrect here. The code seems to expect a cyclic data structure which is apparently not maintained that way.
|
|||
| msg246576 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2015年07月10日 18:48 | |
self->root.next and self->root.prev should never be NULL. Could you please provide minimal example of code that produces a crash? |
|||
| msg246580 - (view) | Author: Stefan Behnel (scoder) * (Python committer) | Date: 2015年07月10日 19:46 | |
It's not actually my own code using the lru_cache here. From a quick grep over the code tree, the only potentially related usage I found was in Python's fnmatch module, on the "_compile_pattern()" function. Commenting that out then made the crash go away, so this was the culprit. However, I ran test_functools.py of the same installation and it passes, so not every usage is broken here. Simple manual testing didn't reveal anything either so far. |
|||
| msg246582 - (view) | Author: Stefan Behnel (scoder) * (Python committer) | Date: 2015年07月10日 19:49 | |
test_fnmatch.py also passes, BTW. |
|||
| msg246592 - (view) | Author: Ned Deily (ned.deily) * (Python committer) | Date: 2015年07月11日 03:09 | |
I've also seen a crash in lru_cache_tp_traverse but with the 3.5.0b3 release build for the OS X 64-bit/32-bit installer. I just stumbled across the segfault by bringing up the interactive interpreter and typing "import ssl". After a lot of playing around, I reduced the failing case to: 1. have an "import pprint" in a startup file referred to by PYTHONSTARTUP *and* 2. "import ssl" must be the very first command entered in the interactive interpreter. Odd! Unfortunately, the release build is a non-debug build and, so far, I have not been able to reproduce the segfault with any other build, debug or non-debug. So, whatever the problem is, it's very build dependent. Here is the OS X system traceback from the segfault: Path: /Library/Frameworks/Python.framework/Versions/3.5.0b3_10_6/Resources/Python.app/Contents/MacOS/Python Identifier: Python Version: ??? Code Type: X86-64 (Native) Parent Process: bash [51285] Responsible: iTerm [754] User ID: 503 Date/Time: 2015年07月10日 19:57:16.086 -0700 OS Version: Mac OS X 10.10.4 (14E46) Report Version: 11 Anonymous UUID: CFED52E3-698C-835B-D61C-F4B1F18D2CB6 Time Awake Since Boot: 800000 seconds Crashed Thread: 0 Dispatch queue: com.apple.main-thread Exception Type: EXC_BAD_ACCESS (SIGSEGV) Exception Codes: KERN_INVALID_ADDRESS at 0x0000000000000018 VM Regions Near 0x18: --> __TEXT 0000000100000000-0000000100001000 [ 4K] r-x/rwx SM=COW /Library/Frameworks/Python.framework/Versions/3.5.0b3_10_6/Resources/Python.app/Contents/MacOS/Python Thread 0 Crashed:: Dispatch queue: com.apple.main-thread 0 org.python.python 0x000000010015e5e5 lru_cache_tp_traverse + 37 1 org.python.python 0x000000010013a2d7 collect + 439 2 org.python.python 0x000000010013aee5 _PyObject_GC_Alloc + 357 3 org.python.python 0x000000010013afe7 _PyObject_GC_New + 23 4 org.python.python 0x0000000100059bce PyDict_New + 334 5 org.python.python 0x000000010015f029 lru_cache_new + 249 6 org.python.python 0x00000001000795a6 type_call + 38 7 org.python.python 0x000000010000dc93 PyObject_Call + 99 8 org.python.python 0x00000001000e9fd8 PyEval_EvalFrameEx + 7656 9 org.python.python 0x00000001000f1d00 _PyEval_EvalCodeWithName + 2400 10 org.python.python 0x00000001000f035d PyEval_EvalFrameEx + 33133 11 org.python.python 0x00000001000f1d00 _PyEval_EvalCodeWithName + 2400 12 org.python.python 0x00000001000f1e07 PyEval_EvalCodeEx + 71 13 org.python.python 0x00000001000e5ff5 builtin___build_class__ + 485 14 org.python.python 0x0000000100065549 PyCFunction_Call + 281 15 org.python.python 0x00000001000f0768 PyEval_EvalFrameEx + 34168 16 org.python.python 0x00000001000f1d00 _PyEval_EvalCodeWithName + 2400 17 org.python.python 0x00000001000f1e61 PyEval_EvalCode + 81 18 org.python.python 0x00000001000e5683 builtin_exec + 627 19 org.python.python 0x0000000100065519 PyCFunction_Call + 233 20 org.python.python 0x00000001000f0a9b PyEval_EvalFrameEx + 34987 21 org.python.python 0x00000001000f1d00 _PyEval_EvalCodeWithName + 2400 22 org.python.python 0x00000001000f035d PyEval_EvalFrameEx + 33133 23 org.python.python 0x00000001000f07fd PyEval_EvalFrameEx + 34317 24 org.python.python 0x00000001000f07fd PyEval_EvalFrameEx + 34317 25 org.python.python 0x00000001000f07fd PyEval_EvalFrameEx + 34317 26 org.python.python 0x00000001000f1d00 _PyEval_EvalCodeWithName + 2400 27 org.python.python 0x00000001000f1e07 PyEval_EvalCodeEx + 71 28 org.python.python 0x000000010004017a function_call + 186 29 org.python.python 0x000000010000dc93 PyObject_Call + 99 30 org.python.python 0x0000000100010ff6 _PyObject_CallMethodIdObjArgs + 454 31 org.python.python 0x000000010010d6d3 PyImport_ImportModuleLevelObject + 1171 32 org.python.python 0x00000001000e5e03 builtin___import__ + 131 33 org.python.python 0x0000000100065549 PyCFunction_Call + 281 34 org.python.python 0x000000010000dc93 PyObject_Call + 99 35 org.python.python 0x00000001000e64f7 PyEval_CallObjectWithKeywords + 87 36 org.python.python 0x00000001000ea43e PyEval_EvalFrameEx + 8782 37 org.python.python 0x00000001000f1d00 _PyEval_EvalCodeWithName + 2400 38 org.python.python 0x00000001000f1e61 PyEval_EvalCode + 81 39 org.python.python 0x00000001000e5683 builtin_exec + 627 40 org.python.python 0x0000000100065519 PyCFunction_Call + 233 41 org.python.python 0x00000001000f0a9b PyEval_EvalFrameEx + 34987 42 org.python.python 0x00000001000f1d00 _PyEval_EvalCodeWithName + 2400 43 org.python.python 0x00000001000f035d PyEval_EvalFrameEx + 33133 44 org.python.python 0x00000001000f07fd PyEval_EvalFrameEx + 34317 45 org.python.python 0x00000001000f07fd PyEval_EvalFrameEx + 34317 46 org.python.python 0x00000001000f07fd PyEval_EvalFrameEx + 34317 47 org.python.python 0x00000001000f1d00 _PyEval_EvalCodeWithName + 2400 48 org.python.python 0x00000001000f1e07 PyEval_EvalCodeEx + 71 49 org.python.python 0x000000010004017a function_call + 186 50 org.python.python 0x000000010000dc93 PyObject_Call + 99 51 org.python.python 0x0000000100010ff6 _PyObject_CallMethodIdObjArgs + 454 52 org.python.python 0x000000010010d6d3 PyImport_ImportModuleLevelObject + 1171 53 org.python.python 0x00000001000e5e03 builtin___import__ + 131 54 org.python.python 0x0000000100065549 PyCFunction_Call + 281 55 org.python.python 0x000000010000dc93 PyObject_Call + 99 56 org.python.python 0x00000001000e64f7 PyEval_CallObjectWithKeywords + 87 57 org.python.python 0x00000001000ea43e PyEval_EvalFrameEx + 8782 58 org.python.python 0x00000001000f1d00 _PyEval_EvalCodeWithName + 2400 59 org.python.python 0x00000001000f1e61 PyEval_EvalCode + 81 60 org.python.python 0x000000010011fa7a PyRun_InteractiveOneObject + 474 61 org.python.python 0x000000010011fdfe PyRun_InteractiveLoopFlags + 110 62 org.python.python 0x000000010012076c PyRun_AnyFileExFlags + 76 63 org.python.python 0x00000001001390a9 Py_Main + 3785 64 org.python.python 0x0000000100000e32 0x100000000 + 3634 65 org.python.python 0x0000000100000c84 0x100000000 + 3204 Thread 0 crashed with X86 Thread State (64-bit): rax: 0x000000010023d6c0 rbx: 0x00000001006fdb70 rcx: 0x0000000100382048 rdx: 0x0000000000000000 rdi: 0x0000000000000000 rsi: 0x00000001001392e0 rbp: 0x00007fff5bffb530 rsp: 0x00007fff5bffb510 r8: 0x0000000000000000 r9: 0x0000000000000001 r10: 0x0000000000000016 r11: 0x00000001001392e0 r12: 0x00000001006fdb88 r13: 0x0000000000000000 r14: 0x00000001001392e0 r15: 0x000000010102fdb8 rip: 0x000000010015e5e5 rfl: 0x0000000000010206 cr2: 0x0000000000000018 |
|||
| msg246718 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2015年07月14日 06:23 | |
Stefan, Ned, can you reproduce the same crash with following patch? |
|||
| msg246720 - (view) | Author: Ned Deily (ned.deily) * (Python committer) | Date: 2015年07月14日 07:25 | |
Serhiy, I'll try making a 3.5.0b3+patch installer build in the same manner as the 3.5.0b3 installer build. But I may not get to it for several days. |
|||
| msg247309 - (view) | Author: Ned Deily (ned.deily) * (Python committer) | Date: 2015年07月25日 02:07 | |
Sorry about the delay in testing the patch. I just confirmed (1) that I am still able to produce a segfault on OS X as described above under the specific conditions with a 10.6-like installer built with the current 3.5 tip and (2) that, with clru_cache_new.patch applied to the same current 3.5 tip, I no longer see the segfault. So it looks like a fix. |
|||
| msg247320 - (view) | Author: Roundup Robot (python-dev) (Python triager) | Date: 2015年07月25日 09:11 | |
New changeset 5345e5ce2eed by Serhiy Storchaka in branch '3.5': Issue #14373: Fixed segmentation fault when gc.collect() is called during https://hg.python.org/cpython/rev/5345e5ce2eed New changeset 9c6d11d22801 by Serhiy Storchaka in branch 'default': Issue #14373: Fixed segmentation fault when gc.collect() is called during https://hg.python.org/cpython/rev/9c6d11d22801 |
|||
| msg247321 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2015年07月25日 09:13 | |
Thank you Ned. Is anything left to do with this issue or it can be closed? |
|||
| msg253235 - (view) | Author: Jason R. Coombs (jaraco) * (Python committer) | Date: 2015年10月20日 16:41 | |
I suspect this change is implicated in issue25447. |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022年04月11日 14:57:28 | admin | set | github: 58581 |
| 2015年11月01日 08:32:11 | serhiy.storchaka | set | status: open -> closed resolution: fixed stage: patch review -> resolved |
| 2015年10月21日 07:26:05 | anacrolix | set | nosy:
- anacrolix |
| 2015年10月20日 16:41:34 | jaraco | set | status: pending -> open nosy: + jaraco messages: + msg253235 |
| 2015年10月06日 04:20:50 | rhettinger | set | status: open -> pending assignee: serhiy.storchaka |
| 2015年10月06日 03:21:13 | rhettinger | set | status: pending -> open assignee: rhettinger -> (no value) |
| 2015年07月25日 09:13:21 | serhiy.storchaka | set | priority: release blocker -> normal status: open -> pending messages: + msg247321 |
| 2015年07月25日 09:11:39 | python-dev | set | messages: + msg247320 |
| 2015年07月25日 02:07:22 | ned.deily | set | messages: + msg247309 |
| 2015年07月21日 19:46:01 | serhiy.storchaka | set | priority: normal -> release blocker |
| 2015年07月14日 07:25:52 | ned.deily | set | messages: + msg246720 |
| 2015年07月14日 06:23:54 | serhiy.storchaka | set | files:
+ clru_cache_new.patch messages: + msg246718 |
| 2015年07月11日 03:09:37 | ned.deily | set | nosy:
+ ned.deily messages: + msg246592 |
| 2015年07月10日 19:49:58 | Tim.Graham | set | nosy:
- Tim.Graham |
| 2015年07月10日 19:49:10 | scoder | set | messages: + msg246582 |
| 2015年07月10日 19:46:03 | scoder | set | messages: + msg246580 |
| 2015年07月10日 18:48:02 | serhiy.storchaka | set | messages: + msg246576 |
| 2015年07月10日 17:59:52 | scoder | set | messages: + msg246573 |
| 2015年06月22日 04:10:51 | serhiy.storchaka | set | messages: + msg245618 |
| 2015年06月21日 17:37:09 | rhettinger | set | messages: + msg245606 |
| 2015年06月08日 09:45:16 | python-dev | set | messages: + msg244994 |
| 2015年06月08日 08:21:11 | python-dev | set | messages: + msg244992 |
| 2015年06月07日 18:56:06 | serhiy.storchaka | set | files:
+ test_lru_cache_threaded.patch messages: + msg244966 |
| 2015年06月07日 10:43:37 | Tim.Graham | set | messages: + msg244950 |
| 2015年06月07日 06:44:39 | serhiy.storchaka | set | files:
+ clru_cache_descr_get.patch messages: + msg244944 |
| 2015年06月06日 20:30:52 | Tim.Graham | set | nosy:
+ Tim.Graham messages: + msg244928 |
| 2015年05月24日 20:33:07 | serhiy.storchaka | set | messages: + msg244002 |
| 2015年05月24日 13:26:31 | serhiy.storchaka | set | dependencies: + Correct reuse argument tuple in property descriptor |
| 2015年05月24日 13:18:47 | serhiy.storchaka | set | files:
+ clru_cache_4.patch messages: + msg243982 |
| 2015年05月24日 07:13:31 | serhiy.storchaka | set | files:
+ clru_cache_3.patch, clru_cache_crasher.py messages: + msg243970 |
| 2015年05月24日 04:21:35 | serhiy.storchaka | set | messages: + msg243967 |
| 2015年05月24日 02:41:23 | rhettinger | set | assignee: serhiy.storchaka -> rhettinger messages: + msg243962 |
| 2015年05月23日 19:48:36 | serhiy.storchaka | set | messages: + msg243940 |
| 2015年05月23日 19:43:16 | python-dev | set | nosy:
+ python-dev messages: + msg243939 |
| 2015年05月23日 17:15:17 | rhettinger | set | assignee: rhettinger -> serhiy.storchaka messages: + msg243932 |
| 2015年04月10日 18:10:54 | kmike | set | nosy:
+ kmike |
| 2014年10月14日 15:44:57 | skrah | set | nosy:
- skrah |
| 2014年08月07日 01:51:22 | pitrou | set | messages: + msg224981 |
| 2014年08月07日 01:19:01 | rhettinger | set | messages: + msg224980 |
| 2014年08月06日 15:34:01 | pitrou | set | messages: + msg224941 |
| 2014年08月06日 15:30:20 | pitrou | set | priority: low -> normal |
| 2014年08月06日 15:22:45 | serhiy.storchaka | set | messages: + msg224938 |
| 2014年07月18日 21:42:01 | BreamoreBoy | set | nosy:
+ BreamoreBoy messages: + msg223439 |
| 2014年06月26日 19:12:45 | Aaron.Meurer | set | nosy:
+ Aaron.Meurer |
| 2014年04月07日 16:14:56 | poke | set | nosy:
+ poke |
| 2014年03月20日 11:57:03 | rhettinger | set | priority: high -> low messages: + msg214209 |
| 2014年03月20日 11:51:15 | serhiy.storchaka | set | messages: + msg214208 |
| 2014年03月06日 22:38:45 | josh.r | set | nosy:
+ josh.r |
| 2013年11月22日 21:48:52 | serhiy.storchaka | set | files:
+ clru_cache2.patch messages: + msg203894 |
| 2013年11月22日 19:42:59 | pitrou | set | messages: + msg203846 |
| 2013年11月22日 17:49:54 | serhiy.storchaka | set | priority: low -> high files: + clru_cache.patch messages: + msg203821 |
| 2013年11月21日 16:43:43 | rhettinger | set | priority: normal -> low messages: + msg203654 versions: + Python 3.5, - Python 3.4 |
| 2013年11月21日 09:45:30 | serhiy.storchaka | set | messages: + msg203599 |
| 2013年11月21日 09:34:58 | brechtm | set | nosy:
+ brechtm messages: + msg203597 |
| 2013年02月16日 00:39:48 | jcea | set | nosy:
+ jcea |
| 2012年12月30日 20:04:12 | kachayev | set | files:
+ 14373.v9.diff messages: + msg178611 |
| 2012年12月30日 17:16:04 | serhiy.storchaka | set | messages: + msg178589 |
| 2012年12月30日 13:44:02 | kachayev | set | files:
+ 14373.v7.diff messages: + msg178573 |
| 2012年12月23日 16:20:54 | scoder | set | messages: + msg177993 |
| 2012年12月23日 12:40:00 | skrah | set | messages: + msg177980 |
| 2012年12月23日 12:10:56 | skrah | set | nosy:
+ skrah messages: + msg177976 |
| 2012年12月23日 10:08:24 | scoder | set | nosy:
+ scoder messages: + msg177967 |
| 2012年12月22日 21:01:00 | serhiy.storchaka | set | messages:
+ msg177955 stage: commit review -> patch review |
| 2012年12月22日 20:21:15 | serhiy.storchaka | set | files:
+ lru_cache_bench.py messages: + msg177953 stage: needs patch -> commit review |
| 2012年12月21日 22:13:52 | kachayev | set | files:
+ 14373.v4.diff messages: + msg177903 |
| 2012年12月21日 19:38:44 | kachayev | set | files:
+ 14373.v3.diff messages: + msg177889 |
| 2012年12月21日 08:05:05 | kachayev | set | messages: + msg177867 |
| 2012年12月20日 11:57:36 | nedbat | set | nosy:
- nedbat |
| 2012年12月20日 10:04:57 | serhiy.storchaka | set | messages:
+ msg177808 stage: patch review -> needs patch |
| 2012年12月20日 09:01:41 | serhiy.storchaka | set | stage: needs patch -> patch review |
| 2012年12月19日 22:10:44 | asvetlov | set | nosy:
+ amaury.forgeotdarc, pitrou |
| 2012年12月19日 21:54:18 | kachayev | set | files:
+ 14373.fixed.diff nosy: + kachayev messages: + msg177784 |
| 2012年11月24日 14:19:26 | serhiy.storchaka | set | messages:
+ msg176287 components: + Extension Modules, - Interpreter Core stage: patch review -> needs patch |
| 2012年11月14日 10:45:09 | asvetlov | set | nosy:
+ asvetlov |
| 2012年11月14日 08:48:35 | serhiy.storchaka | set | nosy:
+ serhiy.storchaka |
| 2012年11月14日 07:20:34 | ezio.melotti | set | messages: + msg175545 |
| 2012年11月14日 07:06:11 | ezio.melotti | set | messages: + msg175540 |
| 2012年11月14日 04:16:12 | anacrolix | set | messages: + msg175533 |
| 2012年11月13日 05:13:09 | ezio.melotti | set | nosy:
+ ezio.melotti stage: patch review versions: + Python 3.4, - Python 3.3 |
| 2012年11月13日 03:47:59 | eric.snow | set | nosy:
+ eric.snow |
| 2012年04月02日 07:20:41 | anacrolix | set | files: - functools.lru_cache-in-c.patch |
| 2012年04月02日 07:20:31 | anacrolix | set | files: - functools.lru_cache-in-c.patch |
| 2012年04月02日 07:20:15 | anacrolix | set | files:
+ functools.lru_cache-in-c.patch messages: + msg157343 |
| 2012年03月31日 20:10:34 | rhettinger | set | files: - lru_cache_class.py |
| 2012年03月31日 20:10:00 | rhettinger | set | files:
+ lru_cache_class.py messages: + msg157226 |
| 2012年03月31日 19:50:28 | rhettinger | set | files:
+ lru_cache_class.py messages: + msg157224 |
| 2012年03月26日 10:54:47 | anacrolix | set | files: - functools.lru_cache-in-c |
| 2012年03月26日 10:53:33 | anacrolix | set | files:
+ functools.lru_cache-in-c.patch messages: + msg156808 |
| 2012年03月26日 08:29:55 | rhettinger | set | messages: + msg156803 |
| 2012年03月25日 18:57:06 | anacrolix | set | nosy:
+ nedbat |
| 2012年03月21日 14:52:09 | meador.inge | set | nosy:
+ meador.inge |
| 2012年03月21日 12:10:43 | anacrolix | set | files:
+ functools.lru_cache-in-c.patch keywords: + patch messages: + msg156492 |
| 2012年03月20日 19:14:14 | rhettinger | set | assignee: rhettinger messages: + msg156449 |
| 2012年03月20日 18:30:15 | giampaolo.rodola | set | nosy:
+ giampaolo.rodola |
| 2012年03月20日 14:14:47 | anacrolix | set | files: + functools.lru_cache-in-c |
| 2012年03月20日 13:40:44 | anacrolix | create | |