This issue tracker has been migrated to GitHub ,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2015年06月21日 12:34 by serhiy.storchaka, last changed 2022年04月11日 14:58 by admin. This issue is now closed.
| Files | ||||
|---|---|---|---|---|
| File name | Uploaded | Description | Edit | |
| clru_cache_known_hash.patch | serhiy.storchaka, 2015年06月21日 12:34 | review | ||
| clru_cache_known_hash_2.patch | serhiy.storchaka, 2015年06月21日 17:15 | review | ||
| clru_cache_known_hash_3.patch | serhiy.storchaka, 2015年06月22日 04:18 | review | ||
| clru_cache_known_hash_4.patch | serhiy.storchaka, 2015年06月22日 20:30 | review | ||
| clru_cache_known_hash_5.larry.patch | larry, 2015年06月23日 08:24 | review | ||
| Messages (24) | |||
|---|---|---|---|
| msg245593 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2015年06月21日 12:34 | |
There is a difference between Python and C implementations of functools.lru_cache(). Python implementation caches the hash of the key, C implementation doesn't. May be this is not too important (at least I have no an example that shows the benefit of caching the hash), but there is a place for possible optimization. Proposed patch uses private functions _PyDict_GetItem_KnownHash() and _PyDict_SetItem_KnownHash() to avoid the second calculating of the hash for new keys. The hash is still calculated second time when the key is deleted from full cache. To avoid this _PyDict_DelItem_KnownHash() is needed. |
|||
| msg245601 - (view) | Author: Raymond Hettinger (rhettinger) * (Python committer) | Date: 2015年06月21日 16:53 | |
I would like the full functionality of the Python version to be implemented. Guaranteeing that the hash is only calculated once prevents a reentrancy hole and provides a speed benefit as well. Please implement exactly what the pure python version does (no more than one call to hash ever). |
|||
| msg245604 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2015年06月21日 17:15 | |
New patch adds _PyDict_DelItem_KnownHash() and uses it to guarantee that the hash is only calculated once. |
|||
| msg245619 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2015年06月22日 04:18 | |
New patch touches also unbounded cache version. Larry, do you allow to commit such patch in 3.5? |
|||
| msg245623 - (view) | Author: Larry Hastings (larry) * (Python committer) | Date: 2015年06月22日 10:23 | |
I can accept this change, but I don't like that code. Is it really considered acceptable to have that much copy-and-paste code in the dict implementation for KnownHash calls? Could the common code be split off into a Py_LOCAL_INLINE function? |
|||
| msg245627 - (view) | Author: Raymond Hettinger (rhettinger) * (Python committer) | Date: 2015年06月22日 13:17 | |
Serhiy's code looks like the cleanest way to do it. |
|||
| msg245645 - (view) | Author: Larry Hastings (larry) * (Python committer) | Date: 2015年06月22日 20:11 | |
Patch doesn't build for me against current trunk: gcc -pthread -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -Werror=declaration-after-statement -I. -IInclude -I./Include -DPy_BUILD_CORE -c ./Modules/_functoolsmodule.c -o Modules/_functoolsmodule.o ./Modules/_functoolsmodule.c: In function ‘infinite_lru_cache_wrapper’: ./Modules/_functoolsmodule.c:769:14: error: too few arguments to function ‘_PyDict_GetItem_KnownHash’ result = _PyDict_GetItem_KnownHash(self->cache, key); ^ In file included from Include/Python.h:87:0, from ./Modules/_functoolsmodule.c:2: Include/dictobject.h:59:24: note: declared here PyAPI_FUNC(PyObject *) _PyDict_GetItem_KnownHash(PyObject *mp, PyObject *key, ^ ./Modules/_functoolsmodule.c:785:9: error: too few arguments to function ‘_PyDict_SetItem_KnownHash’ if (_PyDict_SetItem_KnownHash(self->cache, key, result) < 0) { ^ In file included from Include/Python.h:87:0, from ./Modules/_functoolsmodule.c:2: Include/dictobject.h:71:17: note: declared here PyAPI_FUNC(int) _PyDict_SetItem_KnownHash(PyObject *mp, PyObject *key, ^ Makefile:1695: recipe for target 'Modules/_functoolsmodule.o' failed make: *** [Modules/_functoolsmodule.o] Error 1 |
|||
| msg245646 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2015年06月22日 20:30 | |
My bad. I submitted the last patch without checking (rebuilding Python takes too much time on my slow netbook). Here is fixed and tested patch. |
|||
| msg245672 - (view) | Author: Larry Hastings (larry) * (Python committer) | Date: 2015年06月23日 08:24 | |
I suggest that 20 lines of identical code copy-and-pasted between two functions is not the cleanest way to do it. Find attached my version of the patch, which splits this common code out into a static function. |
|||
| msg245720 - (view) | Author: Raymond Hettinger (rhettinger) * (Python committer) | Date: 2015年06月24日 05:43 | |
Sorry Larry, but I prefer Serhiy's version. The known_hash variants need to have their own function intact version and gum up the rest of the code. See the current known_hash function for comparison (Serhiy modeled on what we had already decided to do much earlier on a separate issue). |
|||
| msg245733 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2015年06月24日 12:16 | |
I would prefer do not repeat the code, but I afraid this can affect the performance, and the performance of PyDict_GetItem is critical for Python. |
|||
| msg245754 - (view) | Author: Yury Selivanov (yselivanov) * (Python committer) | Date: 2015年06月24日 15:14 | |
> I would prefer do not repeat the code, but I afraid this can affect the performance, and the performance of PyDict_GetItem is critical for Python. Static function calls like that one will most likely be inlined by the compiler... But if you don't want to rely on that, maybe we can use a macro to avoid code duplication? |
|||
| msg245755 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2015年06月24日 15:44 | |
> But if you don't want to rely on that, maybe we can use a macro > to avoid code duplication? This will not make the code cleaner. In any case this is other issue. |
|||
| msg245775 - (view) | Author: Raymond Hettinger (rhettinger) * (Python committer) | Date: 2015年06月24日 20:40 | |
Please let this go forward as Serhiy has written it. That is the way I wrote the existing known hash function. You all should be focused on correctness, not on bike-shedding my and Serhiy's coding style. |
|||
| msg245799 - (view) | Author: Alyssa Coghlan (ncoghlan) * (Python committer) | Date: 2015年06月25日 05:10 | |
+1 to what Raymond says here. Concerns regarding code duplication need to be weighed against the likelihood of that code changing in the future, and the impact on the readability of each copy of the code in isolation. In this particular case, I think the likelihood of future change is low, and the likely negative impact of consolidation on readability is high, so it makes sense to me to defer consolidation until there's a clear benefit to sharing the implementations. |
|||
| msg245813 - (view) | Author: Meador Inge (meador.inge) * (Python committer) | Date: 2015年06月25日 15:36 | |
I did some regression testing and reviewed the code; LGTM. As for the code structure issues, I agree that the duplication is undesirable (the readability argument is not that convincing), but Serhiy's patch is consistent with the existing design. As such, I think the structure issue is a separate one and definitely should not hold this patch up. |
|||
| msg245933 - (view) | Author: Tal Einat (taleinat) * (Python committer) | Date: 2015年06月29日 12:12 | |
With clru_cache_known_hash_4.patch on the current default branch, the entire test suite passes here (OSX 10.10). |
|||
| msg246111 - (view) | Author: Larry Hastings (larry) * (Python committer) | Date: 2015年07月03日 00:50 | |
This can go in for 3.5 beta 3. |
|||
| msg246715 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2015年07月14日 04:33 | |
Can you allow me to commit the patch or will commit it yourself Raymond? |
|||
| msg246974 - (view) | Author: Tal Einat (taleinat) * (Python committer) | Date: 2015年07月20日 09:58 | |
Ping? Let's not miss the final 3.5 beta. |
|||
| msg251777 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2015年09月28日 17:10 | |
Ping. |
|||
| msg251832 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2015年09月29日 09:43 | |
clru_cache_known_hash_5.larry.patch looks good to me. Python 3.5 has no more restriction to push patches, go ahead Serhiy, push it to 3.5 & 3.6. |
|||
| msg252100 - (view) | Author: Roundup Robot (python-dev) (Python triager) | Date: 2015年10月02日 09:51 | |
New changeset 3f4c319a822f by Serhiy Storchaka in branch '3.5': Issue #24483: C implementation of functools.lru_cache() now calculates key's https://hg.python.org/cpython/rev/3f4c319a822f New changeset 5758b85627c9 by Serhiy Storchaka in branch 'default': Issue #24483: C implementation of functools.lru_cache() now calculates key's https://hg.python.org/cpython/rev/5758b85627c9 |
|||
| msg252105 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2015年10月02日 10:27 | |
> New changeset 3f4c319a822f by Serhiy Storchaka in branch '3.5': > Issue #24483: C implementation of functools.lru_cache() now calculates key's > https://hg.python.org/cpython/rev/3f4c319a822f I didn't follow this issue, but I like the change. Good job :) |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022年04月11日 14:58:18 | admin | set | github: 68671 |
| 2015年10月02日 10:27:08 | vstinner | set | messages: + msg252105 |
| 2015年10月02日 10:16:19 | serhiy.storchaka | set | status: open -> closed resolution: fixed stage: patch review -> resolved |
| 2015年10月02日 09:51:11 | python-dev | set | nosy:
+ python-dev messages: + msg252100 |
| 2015年10月02日 09:42:50 | serhiy.storchaka | link | issue25295 dependencies |
| 2015年09月29日 09:43:44 | vstinner | set | nosy:
+ vstinner messages: + msg251832 |
| 2015年09月28日 17:10:46 | serhiy.storchaka | set | messages: + msg251777 |
| 2015年07月20日 09:58:23 | taleinat | set | messages: + msg246974 |
| 2015年07月14日 04:33:45 | serhiy.storchaka | set | messages: + msg246715 |
| 2015年07月03日 00:50:57 | larry | set | messages: + msg246111 |
| 2015年06月29日 12:12:52 | taleinat | set | nosy:
+ taleinat messages: + msg245933 |
| 2015年06月25日 15:36:18 | meador.inge | set | nosy:
+ meador.inge messages: + msg245813 |
| 2015年06月25日 05:10:28 | ncoghlan | set | messages: + msg245799 |
| 2015年06月24日 20:40:13 | rhettinger | set | messages: + msg245775 |
| 2015年06月24日 15:44:57 | serhiy.storchaka | set | messages:
+ msg245755 title: Avoid repeated hash calculation in C implementation of functools.lru_cache() -> Avoid repeated hash calculation in C implementation of functools.lru_cache() |
| 2015年06月24日 15:14:47 | yselivanov | set | nosy:
+ yselivanov messages: + msg245754 |
| 2015年06月24日 12:16:19 | serhiy.storchaka | set | messages: + msg245733 |
| 2015年06月24日 05:43:49 | rhettinger | set | messages: + msg245720 |
| 2015年06月23日 08:24:17 | larry | set | files:
+ clru_cache_known_hash_5.larry.patch messages: + msg245672 |
| 2015年06月22日 20:30:09 | serhiy.storchaka | set | files:
+ clru_cache_known_hash_4.patch messages: + msg245646 |
| 2015年06月22日 20:11:31 | larry | set | messages: + msg245645 |
| 2015年06月22日 13:17:14 | rhettinger | set | messages: + msg245627 |
| 2015年06月22日 10:23:49 | larry | set | messages: + msg245623 |
| 2015年06月22日 04:18:26 | serhiy.storchaka | set | files:
+ clru_cache_known_hash_3.patch versions: + Python 3.5 nosy: + larry messages: + msg245619 |
| 2015年06月21日 17:15:59 | serhiy.storchaka | set | files:
+ clru_cache_known_hash_2.patch messages: + msg245604 |
| 2015年06月21日 16:53:20 | rhettinger | set | messages: - msg245600 |
| 2015年06月21日 16:53:13 | rhettinger | set | messages: + msg245601 |
| 2015年06月21日 16:50:02 | rhettinger | set | assignee: rhettinger messages: + msg245600 |
| 2015年06月21日 12:34:02 | serhiy.storchaka | create | |