import numpy as np

def uniop_nested(func, o_list):
    # Apply func to every ndarray leaf of an arbitrarily nested list.
    def inner(i_list):
        if isinstance(i_list[0], np.ndarray):
            return list(map(func, i_list))   # list(...) so this also works on Python 3
        else:
            return list(map(inner, i_list))
    return inner(o_list)

def binop_nested(func, o1, o2):
    # Combine two parallel nested structures leaf-by-leaf with func.
    if not isinstance(o1, np.ndarray):
        return [binop_nested(func, i1, i2) for (i1, i2) in zip(o1, o2)]
    else:
        return func(o1, o2)

def add_nested(s1, s2):
    return binop_nested(np.add, s1, s2)
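For readers trying this out, here is a minimal, self-contained smoke test of add_nested on a mixed nested list (the definitions are repeated so the snippet runs standalone; the array shapes are illustrative, not from the original post):

```python
import numpy as np

def binop_nested(func, o1, o2):
    if not isinstance(o1, np.ndarray):
        return [binop_nested(func, i1, i2) for (i1, i2) in zip(o1, o2)]
    else:
        return func(o1, o2)

def add_nested(s1, s2):
    return binop_nested(np.add, s1, s2)

a = np.arange(3)                    # array([0, 1, 2])
b = np.asarray([0.5, 0.5])          # a differently-shaped leaf
values = [[a, 0.1 * a], b]          # list containing a list of ndarrays and an ndarray
out = add_nested(values, values)    # doubles every leaf
```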
My code needs to work with lists of ndarrays, and with lists of lists of ndarrays. Profiling shows this is some of the most performance-critical code in my project.
- How can I optimise it?
- Can I rewrite the recursion as a loop?
- Is there a nice way to rewrite them as a Cython or C extension? (I have no experience with this.)
My Stack Overflow question here indicates that changing datatypes is probably not going to be the solution.
More info:
- Operands (o1, o2, s1, s2) are short lists. Profiling has shown me that using it.izip is slower.
- Function return results are unlikely to be repeated, as the ndarrays are full of floats being tweaked with mathematical operations based on floats. (We are talking about a large segment of Rn of possible values.)
- The functions being applied are simple; add_nested is the most common op by far, but there are a few others like uniop_nested(np.zeros_like, o_list).
- ndarrays are of different sizes/shapes (so a multidimensional ndarray won't work).
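On the "recursion as a loop" question above: one possible shape for an iterative rewrite is an explicit work stack, sketched below (this is not from the original post, and whether it actually beats the recursive version on short lists would need profiling):

```python
import numpy as np

def binop_nested_iter(func, o1, o2):
    """Loop-based equivalent of the recursive binop_nested.

    Each stack item is (left, right, destination-list); no Python-level
    recursion is involved, only an explicit while loop.
    """
    if isinstance(o1, np.ndarray):
        return func(o1, o2)
    result = []
    stack = [(o1, o2, result)]
    while stack:
        a, b, dest = stack.pop()
        for x, y in zip(a, b):
            if isinstance(x, np.ndarray):
                dest.append(func(x, y))     # leaf: apply the binary op
            else:
                sub = []                    # nested list: reserve a slot
                dest.append(sub)
                stack.append((x, y, sub))   # and queue it for processing
    return result

a = np.arange(3)
nested = [[a, 0.5 * a], a]
out = binop_nested_iter(np.add, nested, nested)
```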
Context:
This is being used for training Restricted Boltzmann Machines (RBMs) and neural networks.
I have a generic "Trainer" class that takes a Trainee class as a parameter. The Trainee class exposes a few methods like:
- Get_update_gradient: a function that returns (for an RBM [Restricted Boltzmann Machine]) a list containing an ndarray of weight changes and 2 ndarrays of bias changes, or (for a multilayer neural net) a list containing a list of weight matrix changes and a list of bias changes
- knowledge: a property exposing either (for an RBM) a list containing 2 bias arrays and a weight matrix, or (for a neural net) a list of weight matrices and bias arrays
It may seem that the Trainer class is simple and unnecessary; however, it is moderately complex, and its use is common between the RBM and neural net classes. (Both benefit from the use of momentum and minibatches.) A typical use is:
trainee.knowledge = binop_nested(lambda current_value, update: learning_rate*update + current_value, trainee.knowledge, updates)
2 Answers
For recursion, you can try adding an @memoize decorator to it, to speed it up.

import functools

def memoize(f):
    cache = {}
    @functools.wraps(f)
    def memf(*x):
        if x not in cache:
            cache[x] = f(*x)
        return cache[x]
    return memf
Not sure if it will speed things up a lot in your case, but if you try it with Fibonacci, you will see that it vastly speeds up the recursion, since it caches previous results.
This is something that speeds up recursive code in general. For your specific case, we need a bit more info on what your functions are and what you want to achieve in order to give a more specific answer.
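To illustrate the Fibonacci point, here is the decorator applied to a naive recursive implementation (the fib function is my own demo, not from either post):

```python
import functools

def memoize(f):
    cache = {}
    @functools.wraps(f)
    def memf(*x):
        if x not in cache:          # only compute each argument tuple once
            cache[x] = f(*x)
        return cache[x]
    return memf

@memoize
def fib(n):
    # Naive double recursion: exponential without memoization,
    # linear in the number of distinct calls with it.
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(100))  # 354224848179261915075 -- instant; unmemoized this would never finish
```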
- Won't work, as this is not a pure function. It has state. Actually, it might be pure; but in any case the same result is almost never returned twice. – Frames Catherine White, Mar 11, 2014 at 9:59
- Added more info to question. – Frames Catherine White, Mar 11, 2014 at 10:09
- @Oxinabox, not sure what that's about "not a pure function," but the function decorated with this memoize does always return the same output for the same input arguments (with the limitation that they must be hashable/immutable). – K3---rnc, Mar 12, 2014 at 12:02
- Memoisation can only be done on pure functions: en.wikipedia.org/wiki/Pure_function. In my case the functions may be pure (maybe), but it doesn't matter, because I never call the function with the same inputs twice. Memoisation is a great technique for recursive functions under the right circumstances; unfortunately for me, this is not them. (Not saying anything is wrong with your answer though; I +1'd it.) – Frames Catherine White, Mar 13, 2014 at 6:03
Cython function calls are much faster. (I suspect Cython type checking is too.)
I recompiled your code with Cython: a simple matter of changing the file name to .pyx, adding cimport cython to the top, and compiling it.
Here is a simple test:

a = np.arange(788)
b = np.asarray([0.01]*200)
values = [[a, 0.1*a, 0.01*a], [a, 0.1*a, 0.01*a], b, b]

%timeit add_nested(values, values)            # Defined in this IPython notebook
10000 loops, best of 3: 32 µs per loop
%timeit c_add_nested(values, values)          # Defined in this IPython notebook with cythonmagic
10000 loops, best of 3: 32 µs per loop
%timeit numpyutil.add_nested(values, values)  # In a separate file called numpyutil.pyx/so
10000 loops, best of 3: 32 µs per loop

That is about a 25% speed increase.
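For anyone unfamiliar with the compile step mentioned above, a .pyx file like numpyutil.pyx is typically built with a small setup.py along these lines (a sketch, assuming Cython is installed; the module name is taken from the timing run above):

```python
# setup.py -- sketch: compile numpyutil.pyx into an importable extension module
from setuptools import setup
from Cython.Build import cythonize

setup(ext_modules=cythonize("numpyutil.pyx"))
```

Build it in place with `python setup.py build_ext --inplace`, after which `import numpyutil` picks up the compiled .so.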
- isinstance is what's causing the performance hit. Getting rid of the branching would be better, though.
- Where is func called from?