Let A1 and A2 be numpy arrays of the same shape, say (d1, d2). I want to build a (d1, d1) array from them such that its [i, j]th entry is defined by applying a function to the pair A1[i], A2[j]. I use np.fromfunction in the form
f=lambda i,j: np.inner(A1[i],A2[j])
A=np.fromfunction(f, shape=(d1, d1))
(as suggested in "Fastest way to initialize numpy array with values given by function").
However, I get the error "IndexError: arrays used as indices must be of integer (or boolean) type". This is strange, because changing the lambda function to, for example,
f=lambda i,j: i*j
works fine! It seems that calling another function inside the lambda leads to trouble with np.fromfunction (np.inner is just an example, and I'd like to be able to replace it with other such functions).
1 Answer
To debug the situation, make f a proper function and add a print statement to see the value of i and j:
import numpy as np
np.random.seed(2015)
d1, d2 = 5, 3
A1 = np.random.random((d1,d2))
A2 = np.random.random((d1,d2))
def f(i, j):
    print(i, j)
    return np.inner(A1[i], A2[j])
A = np.fromfunction(f, shape=(d1, d1))
You'll see (i, j) equals:
(array([[ 0., 0., 0., 0., 0.],
[ 1., 1., 1., 1., 1.],
[ 2., 2., 2., 2., 2.],
[ 3., 3., 3., 3., 3.],
[ 4., 4., 4., 4., 4.]]), array([[ 0., 1., 2., 3., 4.],
[ 0., 1., 2., 3., 4.],
[ 0., 1., 2., 3., 4.],
[ 0., 1., 2., 3., 4.],
[ 0., 1., 2., 3., 4.]]))
Aha. The problem is that these arrays are float-valued. As the error message says, indices have to be of integer or boolean type.
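The same error can be reproduced without np.fromfunction at all. The following is only a minimal sketch with made-up arrays (float_idx and int_idx are illustrative names, not part of the question): indexing with a float-valued array raises the IndexError, while the same values cast to int are accepted.

```python
import numpy as np

A1 = np.arange(15.0).reshape(5, 3)

# Float-valued index array, like the coordinates fromfunction passes by default
float_idx = np.array([0.0, 1.0])
# The same indices as integers
int_idx = float_idx.astype(int)

try:
    A1[float_idx]
    raised = False
except IndexError as exc:
    raised = True
    print("IndexError:", exc)

# Integer index arrays are accepted and select rows 0 and 1
rows = A1[int_idx]
print(rows.shape)
```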
Perusing the docstring for np.fromfunction reveals it has a third parameter, dtype, which controls the data type of coordinate arrays:
Parameters
dtype : data-type, optional
Data-type of the coordinate arrays passed to `function`.
By default, `dtype` is float.
Therefore the fix for the IndexError is to pass dtype=int in the call to np.fromfunction:
A = np.fromfunction(f, shape=(d1, d1), dtype=int)
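One caveat worth checking for yourself: np.fromfunction calls f once, with whole index arrays, and returns whatever f returns. With np.inner in particular, the result therefore does not appear to have shape (d1, d1). The sketch below reuses the setup above; f_inner and f_elementwise are names introduced here for illustration, and the elementwise variant is just one possible way to get one value per (i, j) pair.

```python
import numpy as np

np.random.seed(2015)
d1, d2 = 5, 3
A1 = np.random.random((d1, d2))
A2 = np.random.random((d1, d2))

def f_inner(i, j):
    # i and j are (d1, d1) integer arrays, so A1[i] and A2[j] have shape (d1, d1, d2).
    # np.inner contracts the last axes and keeps all the others.
    return np.inner(A1[i], A2[j])

def f_elementwise(i, j):
    # Multiply matching rows and sum over the last axis: one value per (i, j).
    return (A1[i] * A2[j]).sum(axis=-1)

A_inner = np.fromfunction(f_inner, shape=(d1, d1), dtype=int)
A_elem = np.fromfunction(f_elementwise, shape=(d1, d1), dtype=int)

print(A_inner.shape)  # four-dimensional, not (d1, d1)
print(A_elem.shape)   # (d1, d1)
print(np.allclose(A_elem, np.inner(A1, A2)))
```

So dtype=int removes the IndexError, but the function passed to np.fromfunction still has to be written to work on whole index arrays.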
5 Comments
np.fromfunction is the right function for this purpose. You see what the indices i and j look like. They are superfluous (no i, j or f is necessary), since the entire computation can be done with np.inner(A1, A2).

There is no function in NumPy to do this, because calling a Python function f for each tuple would be terribly slow for large arrays. To leverage NumPy effectively, you generally want to express the computation with as few function calls as possible, and pass the biggest arrays possible to those functions. This off-loads the most work to NumPy's fast underlying C/Fortran functions and relies the least on slower Python code.
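To illustrate the point about few function calls on large arrays, here is a rough sketch for a pairwise function other than np.inner. The squared Euclidean distance and the helper name pair_func are assumptions chosen for the example, not something from the question. The broadcasting version replaces d1*d1 Python-level calls with a couple of array operations.

```python
import numpy as np

np.random.seed(2015)
d1, d2 = 5, 3
A1 = np.random.random((d1, d2))
A2 = np.random.random((d1, d2))

def pair_func(u, v):
    # Hypothetical per-pair function: squared Euclidean distance of two rows
    return np.sum((u - v) ** 2)

# Slow reference: one Python call per (i, j) pair
slow = np.array([[pair_func(A1[i], A2[j]) for j in range(d1)]
                 for i in range(d1)])

# Broadcasting: (d1, 1, d2) - (1, d1, d2) -> (d1, d1, d2), then reduce the last axis
diff = A1[:, None, :] - A2[None, :, :]
fast = (diff ** 2).sum(axis=-1)

print(fast.shape)
print(np.allclose(slow, fast))
```

Whether a given function can be rewritten this way depends on the function; the intermediate (d1, d1, d2) array can also become large, which is a trade-off to keep in mind.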