4
\$\begingroup\$

I wrote a Python module to learn about memoization, and I have some questions on what I did.

  • How pythonic is this?
  • Does it make a difference if I used a class attribute instead of a function attribute for my lookup_table?
  • How can I modify multi_run_chrono to use it like Python's decorators (with @)?
  • Should I keep the try... except KeyError or rather do if key in dictionary...?
  • Are there some nice tricks I could do to improve this code's elegance and/or performance?

Here is the code:

from __future__ import print_function
from random import randint
from time import clock
def naive_factorial(n):
 """calculates n! with a simple recursive algorithm"""
 if n == 0:
 return 1
 else:
 return n * naive_factorial(n-1)
def memo_factorial(n):
 """calculates n! using memoization in a recursive algorithm"""
 if not hasattr(memo_factorial, 'lookup_table'):
 memo_factorial.lookup_table = {0: 1}
 try:
 return memo_factorial.lookup_table[n]
 except KeyError:
 val = n * memo_factorial(n - 1)
 memo_factorial.lookup_table[n] = val
 return val
def check_correctness():
 print('Checking both functions do the same things')
 for i in [0, 1, 4, 10, 22, 63]:
 n = naive_factorial(i)
 m = memo_factorial(i)
 print('n = %d -- Naive : %d, Memo : %d' % (i, n, m))
 if n != m:
 print('Error in functions implementation')
 exit()
def multi_run_chrono(func, nb_runs, min_val, max_val):
 print('Running %s %d times, with n from %d to %d'
 % (func.__name__, nb_runs, min_val, max_val))
 tests = [randint(min_val, max_val) for _ in xrange(nb_runs)]
 starting_time = clock()
 for n in tests:
 func(n)
 ending_time = clock()
 print('Duration : %fs' % (ending_time - starting_time))
def main():
 # check_correctness() # comment/uncomment to check or not
 for func in [naive_factorial, memo_factorial]:
 multi_run_chrono(func, 20000, 10, 200)
if __name__ == '__main__':
 main()

Benchmarking on my computer :

Running naive_factorial 20000 times, with n from 10 to 200
Duration : 0.596933s
Running memo_factorial 20000 times, with n from 10 to 200
Duration : 0.006060s

All remarks are welcome, thank you very much!

asked Aug 5, 2016 at 8:46
\$\endgroup\$
6
  • 1
    \$\begingroup\$ That's a nice piece of code regarding PEP8 style guide. Good job. Nothing to comment on this part. \$\endgroup\$ Commented Aug 5, 2016 at 9:20
  • 4
    \$\begingroup\$ Consider updating to Python 3 where memoization is built-in functools.lru_cache, also consider separating the algoritmh from the memoization logic. \$\endgroup\$ Commented Aug 5, 2016 at 9:28
  • \$\begingroup\$ @Dex'ter thanks, this is not the first time I post here, glad to see I learned my lessons well. \$\endgroup\$ Commented Aug 5, 2016 at 9:31
  • \$\begingroup\$ @Caridorc I'll look it up thanks. By separating the algoritmh from the memoization logic, do you mean like decorating the naive_factorial to make it use memoization instead of creating a whole new function with memoization integrated? \$\endgroup\$ Commented Aug 5, 2016 at 9:32
  • \$\begingroup\$ @BusyAnt Exactly, so you can reuse the memoization capability with different functions. \$\endgroup\$ Commented Aug 5, 2016 at 9:35

2 Answers 2

8
\$\begingroup\$

Regarding the memoization, here are two different ways to do this (for single-valued functions):

import functools
def memoize(func):
 cache = func.cache = {}
 @functools.wraps(func)
 def wrapper(n):
 if n not in cache:
 cache[n] = func(n)
 return cache[n]
 return wrapper
def memodict(f):
 """ Memoization decorator for a function taking a single argument """
 class memodict(dict):
 def __missing__(self, key):
 ret = self[key] = f(key)
 return ret
 return memodict().__getitem__
@memodict
# @memoize
def factorial(n):
 """calculates n! with a simple recursive algorithm"""
 if n == 0:
 return 1
 else:
 return n * factorial(n-1)

The latter is faster because it saves one additional lookup, but only works for single-valued functions, whereas the first can be extended to also pass along multiple arguments.

The memoize is also nice, because it keeps the documentation of the function because of functools.wraps (try help(factorial) in both cases to see the difference).

Which implementation is faster depends on how often your try block misses and succeeds.

As for timings: Using @memoize to calculate all factorials from 1 to 100000000 takes 2.66s on my machine, while doing the same with @memodict takes 2.51s, so not a big difference.

If you want to use a decorator to time the execution time of a function, this will not work very nicely here, since if you decorate factorial, every execution of the function will be timed (because of recursion this will be very often).

You can however use something like this:

def timeit(func, *args, **kwargs):
 starting_time = clock()
 result = func(*args, **kwargs)
 ending_time = clock()
 print 'Duration: {}'.format(ending_time - starting_time)
 return result
timeit(factorial, 100)

If you do want to use this as decorator, just insert a wrapper:

def timeit(func):
 def wrapper(*args, **kwargs):
 starting_time = clock()
 result = func(*args, **kwargs)
 ending_time = clock()
 print 'Duration: {}'.format(ending_time - starting_time)
 return result
 return wrapper
@timeit
def f():
 pass
answered Aug 5, 2016 at 9:58
\$\endgroup\$
2
  • \$\begingroup\$ I'm a big fan of your memodict implementation, but it should still be possible to use it with multiple-valued functions. You just need an extra level of indirection that takes the args & kwargs and turns them into a tuple (or something hashable) and uses that as a key. \$\endgroup\$ Commented Aug 5, 2016 at 16:01
  • \$\begingroup\$ Yes, indeed. I took the implementation from code.activestate.com/recipes/… where different ways to get it to work with multiple arguments are discussed. All take a small performance hit when applied to the special case of one argument, though. \$\endgroup\$ Commented Aug 6, 2016 at 8:42
2
\$\begingroup\$

check_correctness

def check_correctness():

Functions should better take arguments to allow reuse, I would give the two functions and the inputs as arguments.

print('Checking both functions do the same things')

Unavoidable printing inside a function makes it impossible to reuse and violates the single responsability principle (computation + information)

for i in [0, 1, 4, 10, 22, 63]:

i is not descriptive you should use testcase to convey more information.

n = naive_factorial(i) 
m = memo_factorial(i)

After removing print there is no need for these temporary variables.

 print('n = %d -- Naive : %d, Memo : %d' % (i, n, m))

Remove as above.

if n != m:
 print('Error in functions implementation') exit() 

Raise an exception instead to allow the caller to catch it if he wants (further increasing re-usability)

answered Aug 5, 2016 at 11:06
\$\endgroup\$

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.