Here is a lazy dictionary with test cases.

    """
    A dictionary whose values are only evaluated on access.
    """
    from functools import partial
    from UserDict import IterableUserDict


    class LazyDict(IterableUserDict, object):
        """
        A lazy dictionary implementation which will try
        to evaluate all values on access and cache the
        result for later access.
        """

        def set_lazy(self, key, item, *args, **kwargs):
            """
            Allow the setting of a callable and arguments
            as value of dictionary.
            """
            if callable(item):
                item = partial(item, *args, **kwargs)
            super(LazyDict, self).__setitem__(key, item)

        def __getitem__(self, key):
            item = super(LazyDict, self).__getitem__(key)
            try:
                self[key] = item = item()
            except TypeError:
                pass
            return item

    def test_lazy_dict():
        """
        Simple test cases for `LazyDict`.
        """
        lazy_dict = LazyDict({1: 1, 2: lambda: 2})
        assert lazy_dict[2] == 2
        lazy_dict[3] = 3
        assert lazy_dict[3] == 3

        def joiner(*args, **kwargs):
            sep = kwargs.pop('sep', ' ')
            kwargs = [
                '%s=%s' % (k, v)
                for k, v in sorted(kwargs.iteritems())]
            return sep.join(list(args) + kwargs)

        lazy_dict.set_lazy(
            4, joiner, 'foo', 'bar', name='test', other='muah', sep=' ')
        assert lazy_dict[4] == 'foo bar name=test other=muah'
        assert lazy_dict.get('5') is None

        # Test caching functionality.
        def call_at_max(count):
            counter = [0]

            def inner():
                counter[0] += 1
                if counter[0] > count:
                    raise AssertionError('Called more than once')
                return 'happy'
            return inner

        call_once = call_at_max(1)
        lazy_dict[5] = call_once
        assert lazy_dict[5] == 'happy'
        assert lazy_dict[5] == 'happy'

        # Test for helper function.
        try:
            call_once()
        except AssertionError:
            assert True

1 Answer
Per the documentation for UserDict:

"The need for this class has been largely supplanted by the ability to subclass directly from dict (a feature that became available starting with Python version 2.2)."

Therefore, rather than:

    class LazyDict(IterableUserDict, object):

unless you have a really good reason to want to support 2.1 and earlier, I would use:

    class LazyDict(dict):

or base it on collections.MutableMapping. This also makes compatibility with 3.x (where the UserDict module doesn't exist) less complex.
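
As a rough illustration of the MutableMapping route, here is a minimal sketch. It assumes Python 3, where the base class lives at collections.abc.MutableMapping; the `_data` attribute is a name chosen for this example. It also checks `callable(item)` instead of catching TypeError, which is a deliberate difference from the original (a TypeError raised inside the stored function is no longer swallowed).

```python
from collections.abc import MutableMapping
from functools import partial


class LazyDict(MutableMapping):
    """Sketch: callable values are evaluated on first access, then cached."""

    def __init__(self, *args, **kwargs):
        # Plain dict used as backing storage (name is an assumption).
        self._data = dict(*args, **kwargs)

    def set_lazy(self, key, item, *args, **kwargs):
        # Same idea as the original set_lazy: bind arguments now, call later.
        if callable(item):
            item = partial(item, *args, **kwargs)
        self._data[key] = item

    def __getitem__(self, key):
        item = self._data[key]
        if callable(item):
            # Evaluate once and store the result in place of the callable.
            item = self._data[key] = item()
        return item

    def __setitem__(self, key, value):
        self._data[key] = value

    def __delitem__(self, key):
        del self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)
```

Because the mixin methods of MutableMapping (get, items, values and so on) are built on `__getitem__`, they trigger the lazy evaluation too, which is not the case when subclassing dict directly.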
An edge case you may not have thought of: what if the function stored in the dictionary returns a callable object? Then you will get different results depending on when you access the dictionary. If this is intentional, it should be documented. If not, one solution would be to create a LazyCallable class (effectively a custom partial) to store the function and its arguments, so you can check if isinstance(item, LazyCallable) to distinguish between items added via set_lazy and any other callable values.
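
One possible shape for that idea, as a sketch only: LazyCallable is the name suggested above, but its attributes and the dict-based LazyDict here are assumptions made for the example, written for Python 3.

```python
class LazyCallable:
    """Marks a deferred call that was registered through set_lazy."""

    def __init__(self, func, *args, **kwargs):
        self.func = func
        self.args = args
        self.kwargs = kwargs

    def __call__(self):
        return self.func(*self.args, **self.kwargs)


class LazyDict(dict):
    def set_lazy(self, key, func, *args, **kwargs):
        # Only values stored through set_lazy get wrapped.
        super().__setitem__(key, LazyCallable(func, *args, **kwargs))

    def __getitem__(self, key):
        item = super().__getitem__(key)
        if isinstance(item, LazyCallable):
            # Evaluate once; the result is cached as-is, even if it is
            # itself callable, so later accesses never call it again.
            item = item()
            super().__setitem__(key, item)
        return item
```

With this, a function stored via plain assignment is returned untouched, and a lazy value that evaluates to a callable is returned unchanged on every access. Note that with a dict subclass, methods such as get() bypass the overridden `__getitem__`.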
Overall, I'm not convinced I see the point of this. The function only gets called once, but I'm not sure the layer of additional complexity needed for:

    lazy_dict.set_lazy(4, joiner, 'foo', 'bar', name='test', other='muah', sep=' ')

is better than:

    vanilla_dict[4] = joiner('foo', 'bar', name='test', other='muah', sep=' ')

In both, the function only gets called once (at slightly different times, admittedly), and the latter doesn't require the reader to know about LazyDict. This also doesn't provide the functionality (that e.g. regular "memoization" does) to dynamically store results of calls to the function with different arguments, so it's only called once for each set of arguments.
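
For comparison, this is roughly what regular memoization looks like in Python 3 using functools.lru_cache. The `joiner` below is a simplified stand-in for the one in the question, and the `calls` list exists only to make the caching visible.

```python
from functools import lru_cache

calls = []  # records every real execution, for demonstration only


@lru_cache(maxsize=None)
def joiner(*args, sep=' '):
    calls.append(args)
    return sep.join(args)


joiner('foo', 'bar')   # executes the function
joiner('foo', 'bar')   # same arguments: served from the cache
joiner('foo', 'baz')   # new arguments: executes again
```

Here the function body runs once per distinct set of arguments, whereas LazyDict caches a single result per key.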
I suppose one advantage would be in cases where you aren't sure, when you add the function to the dictionary, whether or not you will ever need to call it. If the function is very computationally complex but not actually needed, you can optimise one call down to zero, but there are probably easier ways to do that. There's no reason the user couldn't put a partial into the dictionary themselves.
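
A sketch of that last point, with a plain dict and a partial stored by hand; `expensive` and the key name are hypothetical stand-ins.

```python
from functools import partial


def expensive(n):
    # Stand-in for a slow computation.
    return sum(range(n))


# Store the deferred call; nothing is computed at this point.
deferred = {'total': partial(expensive, 1000)}

# Evaluate only if and when the value is actually needed,
# then overwrite the entry with the result.
value = deferred['total']
if callable(value):
    value = deferred['total'] = value()
```

The cost is that every reader of the dictionary has to repeat the callable check, which is the boilerplate LazyDict hides.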
Perhaps off-topic, but:

    # Test caching functionality.
    def call_at_max(count):
        counter = [0]

        def inner():
            counter[0] += 1
            if counter[0] > count:
                raise AssertionError('Called more than once')
            return 'happy'
        return inner

You have hard-coded 'Called more than once', which won't make sense for count != 1. Also, using a list to make a "mutable integer" isn't very neat. I would either make it non-generic (i.e. hard-code the 1, too), or implement it as something like:

    MULTIPLES = {1: 'once', 2: 'twice'}  # add 3: 'thrice' if you like!

    # Test caching functionality.
    def call_at_most(times):
        def inner():
            inner.counter += 1
            if inner.counter > times:
                raise AssertionError(
                    'Called more than {}'.format(
                        MULTIPLES.get(times, '{} times').format(times)
                    )
                )
            return 'happy'
        inner.counter = 0
        return inner

In use:

    >>> test = call_at_most(2)
    >>> test()
    'happy'
    >>> test()
    'happy'
    >>> test()
    Traceback (most recent call last):
      File "<pyshell#51>", line 1, in <module>
        test()
      File "<pyshell#47>", line 8, in inner
        MULTIPLES.get(times, '{} times').format(times)
    AssertionError: Called more than twice
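
If the code only needs to run on Python 3, the counter could also be an ordinary closure variable rebound with `nonlocal`, which avoids both the list and the function attribute. This is a sketch under that assumption; the message wording is simplified and is not the one used above.

```python
def call_at_most(times):
    counter = 0

    def inner():
        nonlocal counter  # Python 3 only: rebind the enclosing variable
        counter += 1
        if counter > times:
            raise AssertionError(
                'Called more than {} time(s)'.format(times))
        return 'happy'

    return inner
```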

I'd also replace assert True with pass.
For future readers also wondering what would be the point of deferring the function execution (rather than calling it immediately when building the dictionary): this is extremely useful if said function is slow or expensive (like a remote API call, or something incurring a lot of I/O or CPU), but won't be needed all the time. It is then a very good idea to use some kind of lazy evaluation. – jpetazzo, Sep 15, 2015 at 15:06
[...] IterableUserDict?

IterableUserDict is a dictionary meant for subclassing that supports iteration, see docs.python.org/2/library/userdict.html for documentation.

[...] collections.MutableMapping instead. Note, per the docs, "The need for [UserDict] has been largely supplanted by the ability to subclass directly from dict (a feature that became available starting with Python version 2.2)."