I wrote a Python module to learn about memoization, and I have some questions on what I did.
- How pythonic is this?
- Does it make a difference if I used a class attribute instead of a function attribute for my
lookup_table
? - How can I modify
multi_run_chrono
to use it like Python's decorators (with@
)? - Should I keep the
try... except KeyError
or rather doif key in dictionary...
? - Are there some nice tricks I could do to improve this code's elegance and/or performance?
Here is the code:
from __future__ import print_function
from random import randint
from time import clock
def naive_factorial(n):
"""calculates n! with a simple recursive algorithm"""
if n == 0:
return 1
else:
return n * naive_factorial(n-1)
def memo_factorial(n):
"""calculates n! using memoization in a recursive algorithm"""
if not hasattr(memo_factorial, 'lookup_table'):
memo_factorial.lookup_table = {0: 1}
try:
return memo_factorial.lookup_table[n]
except KeyError:
val = n * memo_factorial(n - 1)
memo_factorial.lookup_table[n] = val
return val
def check_correctness():
print('Checking both functions do the same things')
for i in [0, 1, 4, 10, 22, 63]:
n = naive_factorial(i)
m = memo_factorial(i)
print('n = %d -- Naive : %d, Memo : %d' % (i, n, m))
if n != m:
print('Error in functions implementation')
exit()
def multi_run_chrono(func, nb_runs, min_val, max_val):
print('Running %s %d times, with n from %d to %d'
% (func.__name__, nb_runs, min_val, max_val))
tests = [randint(min_val, max_val) for _ in xrange(nb_runs)]
starting_time = clock()
for n in tests:
func(n)
ending_time = clock()
print('Duration : %fs' % (ending_time - starting_time))
def main():
# check_correctness() # comment/uncomment to check or not
for func in [naive_factorial, memo_factorial]:
multi_run_chrono(func, 20000, 10, 200)
if __name__ == '__main__':
main()
Benchmarking on my computer :
Running naive_factorial 20000 times, with n from 10 to 200
Duration : 0.596933s
Running memo_factorial 20000 times, with n from 10 to 200
Duration : 0.006060s
All remarks are welcome, thank you very much!
2 Answers 2
Regarding the memoization, here are two different ways to do this (for single-valued functions):
import functools
def memoize(func):
cache = func.cache = {}
@functools.wraps(func)
def wrapper(n):
if n not in cache:
cache[n] = func(n)
return cache[n]
return wrapper
def memodict(f):
""" Memoization decorator for a function taking a single argument """
class memodict(dict):
def __missing__(self, key):
ret = self[key] = f(key)
return ret
return memodict().__getitem__
@memodict
# @memoize
def factorial(n):
"""calculates n! with a simple recursive algorithm"""
if n == 0:
return 1
else:
return n * factorial(n-1)
The latter is faster because it saves one additional lookup, but only works for single-valued functions, whereas the first can be extended to also pass along multiple arguments.
The memoize
is also nice, because it keeps the documentation of the function because of functools.wraps
(try help(factorial)
in both cases to see the difference).
Which implementation is faster depends on how often your try
block misses and succeeds.
As for timings:
Using @memoize
to calculate all factorials from 1 to 100000000 takes 2.66s on my machine, while doing the same with @memodict
takes 2.51s, so not a big difference.
If you want to use a decorator to time the execution time of a function, this will not work very nicely here, since if you decorate factorial
, every execution of the function will be timed (because of recursion this will be very often).
You can however use something like this:
def timeit(func, *args, **kwargs):
starting_time = clock()
result = func(*args, **kwargs)
ending_time = clock()
print 'Duration: {}'.format(ending_time - starting_time)
return result
timeit(factorial, 100)
If you do want to use this as decorator, just insert a wrapper:
def timeit(func):
def wrapper(*args, **kwargs):
starting_time = clock()
result = func(*args, **kwargs)
ending_time = clock()
print 'Duration: {}'.format(ending_time - starting_time)
return result
return wrapper
@timeit
def f():
pass
-
\$\begingroup\$ I'm a big fan of your
memodict
implementation, but it should still be possible to use it with multiple-valued functions. You just need an extra level of indirection that takes the args & kwargs and turns them into a tuple (or something hashable) and uses that as a key. \$\endgroup\$Dan Oberlam– Dan Oberlam2016年08月05日 16:01:12 +00:00Commented Aug 5, 2016 at 16:01 -
\$\begingroup\$ Yes, indeed. I took the implementation from code.activestate.com/recipes/… where different ways to get it to work with multiple arguments are discussed. All take a small performance hit when applied to the special case of one argument, though. \$\endgroup\$Graipher– Graipher2016年08月06日 08:42:42 +00:00Commented Aug 6, 2016 at 8:42
check_correctness
def check_correctness():
Functions should better take arguments to allow reuse, I would give the two functions and the inputs as arguments.
print('Checking both functions do the same things')
Unavoidable printing inside a function makes it impossible to reuse and violates the single responsability principle (computation + information)
for i in [0, 1, 4, 10, 22, 63]:
i
is not descriptive you should use testcase
to convey more information.
n = naive_factorial(i)
m = memo_factorial(i)
After removing print there is no need for these temporary variables.
print('n = %d -- Naive : %d, Memo : %d' % (i, n, m))
Remove as above.
if n != m:
print('Error in functions implementation') exit()
Raise an exception instead to allow the caller to catch it if he wants (further increasing re-usability)
Explore related questions
See similar questions with these tags.
functools.lru_cache
, also consider separating the algoritmh from the memoization logic. \$\endgroup\$naive_factorial
to make it use memoization instead of creating a whole new function with memoization integrated? \$\endgroup\$