I've just written a simple caching / memoization Python decorator. Its purpose is to cache what the function returns for every combination of arguments it has ever been invoked with. So if, say, we had a function my_function(a, b=2, c=3) invoked once as my_function(1, 2, c=3), the result should then come from the cache whenever I call it as my_function(1, 2, 3), my_function(1, c=3, b=2), or my_function(c=3, b=2, a=1).

Is what's below a good approach, and could it be somehow improved?
    import functools
    import pickle


    def memoize_all(func):
        """
        This is a caching decorator. It caches the function results for
        all the arguments combinations, so use it with care. It does not matter whether the arguments
        are passed as keywords or not.
        """
        cache = {}

        @functools.wraps(func)
        def cached(*args, **kwargs):
            arg_names = func.func_code.co_varnames
            arg_dict = {}
            for i, arg in enumerate(args):
                arg_dict[arg_names[i]] = args[i]
            arg_dict.update(**kwargs)
            key = pickle.dumps(arg_dict)
            if key not in cache:
                cache[key] = func(*args, **kwargs)
            return cache[key]
        return cached
1 Answer
I don't know what's up with the spacing in the docstring, but fix it.
Your memoize_all doesn't work with everything:

    memoize_all(int)(1)
    #>>> Traceback (most recent call last):
    #>>> ...
    #>>> AttributeError: type object 'int' has no attribute 'func_code'
I suppose this doesn't bother you much, but it irks me ;).
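As a side note (a sketch, not part of the reviewed code): in Python 3 the attribute is spelled __code__ rather than func_code, and builtins such as int carry no code object under either name, which is exactly why this kind of introspection fails on them.

```python
# In Python 3 the attribute is __code__ (func_code is the Python 2 name).
def plain(a, b):
 return a + b

print(plain.__code__.co_varnames) # ('a', 'b')

# Builtins have no code object at all, so attribute-based introspection
# fails on them regardless of which attribute name is used.
print(hasattr(int, "__code__")) # False
```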
Consider

    for i, arg in enumerate(args):
        arg_dict[arg_names[i]] = args[i]
There's no point doing args[i]; it's just arg. Further, one should do
    for name, arg in zip(arg_names, args):
        arg_dict[name] = arg
But then one can just do

    arg_dict = dict(zip(arg_names, args))
You could even use

    arg_dict = dict(zip(arg_names, args), **kwargs)
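To illustrate the one-step construction (a sketch; my_function and the sample call are taken from the question):

```python
def my_function(a, b=2, c=3):
 return a + b + c

# Positional parameter names, read from the code object as in the
# question (func_code in Python 2, __code__ in Python 3).
arg_names = my_function.__code__.co_varnames

# One-step equivalent of the loop-plus-update version: zip pairs the
# positional arguments with their names, **kwargs merges in the keywords.
arg_dict = dict(zip(arg_names, (1, 2)), **{"c": 3})
print(arg_dict) # {'a': 1, 'b': 2, 'c': 3}
```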
You try to save this with

    key = pickle.dumps(arg_dict)

I'm not at all fond of this method. pickle is a bit fragile and more complicated than I would expect; equal dicts aren't even guaranteed to serialize to identical bytes, so cache hits can silently be missed. I would personally do something like

    key = frozenset(arg_dict.items())
If you skip generating arg_dict, this is

    key = frozenset(kwargs.items()) | frozenset(zip(arg_names, args))
This does prevent accepting mutable arguments, but it's normally best to enforce purity before caching arguments anyway.
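A quick sanity check that this frozenset key really is insensitive to how the arguments were passed (a sketch using the parameter names of the question's my_function(a, b=2, c=3)):

```python
def key_for(arg_names, args, kwargs):
 # Same expression as above: keyword items unioned with the
 # name/value pairs of the positional arguments.
 return frozenset(kwargs.items()) | frozenset(zip(arg_names, args))

names = ("a", "b", "c") # co_varnames of my_function(a, b=2, c=3)
k1 = key_for(names, (1, 2), {"c": 3}) # my_function(1, 2, c=3)
k2 = key_for(names, (1, 2, 3), {}) # my_function(1, 2, 3)
k3 = key_for(names, (), {"c": 3, "b": 2, "a": 1}) # my_function(c=3, b=2, a=1)
print(k1 == k2 == k3) # True
```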
You then have just

    def cached(*args, **kwargs):
        arg_names = func.func_code.co_varnames
        key = frozenset(kwargs.items()) | frozenset(zip(arg_names, args))
        if key not in cache:
            cache[key] = func(*args, **kwargs)
        return cache[key]
Note that in Python 3, this would be better with a signature object:
    import functools
    from inspect import signature


    def memoize_all(func):
        """
        This is a caching decorator. It caches the function results for
        all the arguments combinations, so use it with care. It does not
        matter whether the arguments are passed as keywords or not.
        """
        cache = {}
        func_sig = signature(func)

        @functools.wraps(func)
        def cached(*args, **kwargs):
            bound = func_sig.bind(*args, **kwargs)
            key = frozenset(bound.arguments.items())
            if key not in cache:
                cache[key] = func(*args, **kwargs)
            return cache[key]
        return cached
- \$\begingroup\$ Thank you! My way of building arg_dict was totally screwed compared to yours. And, actually, while I realize a mutable type as a default argument should not happen, why would passing a list as a function argument be against purity in this case? \$\endgroup\$ – rroszkowiak, Jan 24, 2015 at 19:31
- \$\begingroup\$ If I understand the question, it's because {'x': some_unhashable_object} isn't hashable. \$\endgroup\$ – Veedrac, Jan 24, 2015 at 19:43
- \$\begingroup\$ Right, but my point is: why would one enforce the use of only hashable objects as function arguments (which has to happen to use frozenset)? I mean, do you consider using unhashable objects as function arguments not "pure"? \$\endgroup\$ – rroszkowiak, Jan 24, 2015 at 19:49
- \$\begingroup\$ @rroszkowiak It's not an inherent thing; it's just that, sans the use of globals, having immutable inputs requires a pure function. \$\endgroup\$ – Veedrac, Jan 24, 2015 at 20:08
- \$\begingroup\$ lru_cache does this, and this SO question lists some 2.7 alternatives / backports \$\endgroup\$