Evaluating a Python lambda function with NumPy's np.fromfunction

Question 1

Let A1 and A2 be NumPy arrays of the same shape, say (d1, d2). I want to build a (d1, d1) array from them such that its [i, j]th entry is defined by applying a function to the pair (A1[i], A2[j]). I use np.fromfunction in the form

f=lambda i,j: np.inner(A1[i],A2[j])
A=np.fromfunction(f, shape=(d1, d1))

(as suggested in "Fastest way to initialize numpy array with values given by function").

However, I get the error "IndexError: arrays used as indices must be of integer (or boolean) type". This is strange, because changing the lambda function to, for example,

 f=lambda i,j: i*j

works fine! It seems that calling another function inside the lambda leads to trouble with np.fromfunction (np.inner is just an example, and I'd like to be able to replace it with other such functions).

Answer

To debug the situation, make f a proper function and add a print statement to see the value of i and j:

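import numpy as np

np.random.seed(2015)
d1, d2 = 5, 3
A1 = np.random.random((d1, d2))
A2 = np.random.random((d1, d2))

def f(i, j):
    # Show what np.fromfunction actually passes in.
    print(i, j)
    return np.inner(A1[i], A2[j])

# Raises IndexError: with the default dtype, i and j are float arrays.
A = np.fromfunction(f, shape=(d1, d1))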

You'll see (i, j) equals:

(array([[ 0., 0., 0., 0., 0.],
 [ 1., 1., 1., 1., 1.],
 [ 2., 2., 2., 2., 2.],
 [ 3., 3., 3., 3., 3.],
 [ 4., 4., 4., 4., 4.]]), array([[ 0., 1., 2., 3., 4.],
 [ 0., 1., 2., 3., 4.],
 [ 0., 1., 2., 3., 4.],
 [ 0., 1., 2., 3., 4.],
 [ 0., 1., 2., 3., 4.]]))

Aha. The problem is that these arrays are float-valued. As the error message says, indices have to be of integer or boolean type.
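The difference between the two lambdas can be reproduced directly. This is a minimal sketch, assuming np.indices yields coordinate grids equivalent to the ones np.fromfunction passes to f. The names rows, cols, product and err are illustrative and not from the original post.

```python
import numpy as np

A1 = np.random.random((5, 3))

# Float-valued coordinate grids, like the ones printed above.
rows, cols = np.indices((5, 5), dtype=float)

# Arithmetic on float arrays is fine, which is why i*j works.
product = rows * cols

# Indexing with a float array is rejected.
try:
    A1[rows]
    err = None
except IndexError as exc:
    err = exc

print(type(err).__name__)
```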

Perusing the docstring for np.fromfunction reveals it has a third parameter, dtype, which controls the data type of coordinate arrays:

Parameters
dtype : data-type, optional
 Data-type of the coordinate arrays passed to `function`.
 By default, `dtype` is float.

Therefore the solution is to add dtype=int to the call to np.fromfunction:

A = np.fromfunction(f, shape=(d1, d1), dtype=int)
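As a quick check of the fix, the sketch below reruns the example with dtype=int. One caveat seems worth noting: with 2-D index arrays, A1[i] and A2[j] each have shape (d1, d1, d2), and np.inner contracts only the last axis, so the result has shape (d1, d1, d1, d1) rather than (d1, d1). The variant g, which multiplies elementwise and sums over the last axis, is my own illustration and not part of the original answer.

```python
import numpy as np

np.random.seed(2015)
d1, d2 = 5, 3
A1 = np.random.random((d1, d2))
A2 = np.random.random((d1, d2))

def f(i, j):
    # i and j are integer arrays of shape (d1, d1), so indexing now works.
    return np.inner(A1[i], A2[j])

A = np.fromfunction(f, shape=(d1, d1), dtype=int)
print(A.shape)  # (5, 5, 5, 5)

def g(i, j):
    # Illustrative variant: elementwise product, summed over the last axis.
    return (A1[i] * A2[j]).sum(axis=-1)

B = np.fromfunction(g, shape=(d1, d1), dtype=int)
print(B.shape)  # (5, 5)
```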

Comment 1

Sorry, I'm still confused: with 'i*j' as the function, one gets arrays of integers '(i,j)', whereas with np.inner one gets what you wrote? Shouldn't 'np.fromfunction' just apply 'f' to all tuples '(i,j)' that are combinations of indices (between 0 and d1)?

Comment 2

I don't think np.fromfunction is the right function for this purpose. You see what the indices i and j look like. They are superfluous -- no i,j or f is necessary -- since the entire computation can be done with np.inner(A1,A2).
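The claim that np.inner(A1, A2) covers the whole computation can be checked against the element-by-element definition from the question. This is a small sketch that reuses the setup from the answer; the name expected is illustrative.

```python
import numpy as np

np.random.seed(2015)
d1, d2 = 5, 3
A1 = np.random.random((d1, d2))
A2 = np.random.random((d1, d2))

# One call on the whole arrays.
A = np.inner(A1, A2)

# Reference: the element-by-element definition from the question.
expected = np.empty((d1, d1))
for i in range(d1):
    for j in range(d1):
        expected[i, j] = np.inner(A1[i], A2[j])

print(np.allclose(A, expected))  # True
```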

Comment 3

Regarding "Shouldn't 'np.fromfunction' just apply 'f' to all tuples '(i,j)'": no, this is not what np.fromfunction does. There is no function in NumPy to do this because calling a Python function f for each tuple would be terribly slow for large arrays. To leverage NumPy effectively, you generally want to express the computation with as few function calls as possible, and pass the largest possible arrays to those functions. This off-loads the most work to NumPy's fast underlying C/Fortran functions and relies the least on slower Python code.
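One way to see this behaviour is to record how often f is called and with what. In the sketch below, the calls list is a hypothetical helper added purely for illustration.

```python
import numpy as np

calls = []

def f(i, j):
    # Record the shape of the arguments on each call.
    calls.append((np.shape(i), np.shape(j)))
    return i * j

A = np.fromfunction(f, shape=(4, 4), dtype=int)

print(len(calls))  # 1
print(calls[0])    # ((4, 4), (4, 4))
```

f runs once and receives the full coordinate arrays, not one (i, j) pair at a time.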

Comment 4

Don't try to express the computation element-by-element (as you would in C). Instead try to find the NumPy function which achieves the same result while operating on whole arrays.
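For pairwise functions other than np.inner, broadcasting is one common way to stay with whole-array operations. This is a minimal sketch, assuming the desired function is the Euclidean distance between A1[i] and A2[j]. That choice of function is my own example and does not come from the thread.

```python
import numpy as np

np.random.seed(2015)
d1, d2 = 5, 3
A1 = np.random.random((d1, d2))
A2 = np.random.random((d1, d2))

# Shapes (d1, 1, d2) and (1, d1, d2) broadcast to (d1, d1, d2).
diff = A1[:, None, :] - A2[None, :, :]

# Reduce over the last axis to get one value per (i, j) pair.
D = np.linalg.norm(diff, axis=-1)

print(D.shape)  # (5, 5)
```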

Comment 5

Yes, thanks! I wasn't aware of this. I just benchmarked np.inner(A1,A2) versus a double loop over the indices with A[i,j]=np.inner(A1[i],A2[j]) and your solution is MUCH, MUCH faster!
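A comparison along these lines could be timed roughly as follows. This is only a sketch: the array size d1 = 200 and the repeat count are assumptions chosen for illustration, and the actual timings depend on the machine.

```python
import timeit
import numpy as np

np.random.seed(2015)
d1, d2 = 200, 3  # assumed size, larger than the example in the answer
A1 = np.random.random((d1, d2))
A2 = np.random.random((d1, d2))

def double_loop():
    # One np.inner call per (i, j) pair.
    A = np.empty((d1, d1))
    for i in range(d1):
        for j in range(d1):
            A[i, j] = np.inner(A1[i], A2[j])
    return A

def whole_array():
    # A single np.inner call on the full arrays.
    return np.inner(A1, A2)

t_loop = timeit.timeit(double_loop, number=3)
t_vec = timeit.timeit(whole_array, number=3)
print(f"double loop: {t_loop:.4f} s, np.inner: {t_vec:.4f} s")
```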

unutbu · Accepted Answer · 2015-07-30 10:57:47Z

