How to fill an array without a for loop

Question 1

I have some code which generates the coordinates of a cylindrically-symmetric surface, with coordinates given as (r, theta, phi). At the moment, I generate the coordinates of one phi slice, and store that in a 2xN array (for N bins), and then in a for loop I copy this array for each value of phi from 0 to 2pi:

import numpy as np
# this is the number of bins that my surface is chopped into
numbins = 50
# these are the values for r
r_vals = np.linspace(0.0001, 50, numbins, endpoint = True)
# these are the values for theta
theta_vals = np.linspace(0, np.pi / 2, numbins, endpoint = True)
# I combine the r and theta values into a 2xnumbins array for one "slice" of phi
phi_slice = np.vstack([r_vals,theta_vals]).T
# this is the array which will store all of the coordinates of the surface
surface = np.zeros((numbins**2,3))
lower_bound = 0
upper_bound = numbins
# this is the size of the bin for phi
dphi = (2 * np.pi) / numbins
# this is the for loop I'd like to eliminate.
# For every value of phi, it puts a copy of the phi_slice array into
# the surface array, so that the surface is cylindrical about phi.
for phi in np.linspace(0, (2 * np.pi) - dphi, numbins):
 surface[lower_bound:upper_bound, :2] = phi_slice
 surface[lower_bound:upper_bound,2] = phi
 lower_bound += numbins
 upper_bound += numbins

I'm calling this routine in a numerical integration of 1e6 or 1e7 steps, and while numbins is 50 in the example above, in practice it'll be in the thousands. This for loop is a choking point, and I'd really like to eliminate it to speed things up. Is there a good NumPythonic way to do the same thing as this for loop?

Question 2

I'm voting to close this question as off-topic because the question belongs on another site in the Stack Exchange network: codereview.stackexchange.com

Question 3

Keep it here; eliminating loops from numpy code is a common SO question. There are more numpy experts here.

Question 4

But it helps if the example is MCVe - minimal, complete, verifiable. That way we can easily copy-n-paste, test results and test changes. Show some inputs and outputs.

Question 5

It's hard to picture what's going on by just reading the code. It needs to be simpler with some concrete values. Add some desciption. How many times are you looping?

Question 6

Thanks for the feedback! I've edited it, and it's now MCV; let me know if there's any more I can elaborate.

Question 7

Timing that loop:

In [9]: %%timeit
 ...: lower_bound, upper_bound = 0, numbins 
 ...: for phi in np.linspace(0, (2 * np.pi) - dphi, numbins):
 ...: surface[lower_bound:upper_bound,:2] = phi_slice
 ...: surface[lower_bound:upper_bound,2] = phi
 ...: lower_bound += numbins
 ...: upper_bound += numbins
10000 loops, best of 3: 176 μs per loop

Off hand that doesn't look bad, though if repeated in some larger context that does matter. You are looping 50 times to fill an array of 75000 items. For the size of the task the number of loops isn't large.

Daniel's alternative is a little faster, but not drastic

In [12]: %%timeit 
 ...: phi_slices = np.tile(phi_slice.T, numbins).T
 ...: phi_indices = np.repeat(np.linspace(0, (2 * np.pi) - dphi, numbins), numbins)
 ...: surface1 = np.c_[phi_slices, phi_indices]
 ...: 
10000 loops, best of 3: 137 μs per loop

kazemakase'sis a bit better still:

In [15]: %%timeit 
 ...: phis = np.repeat(np.linspace(0, (2 * np.pi) - dphi, numbins), numbins)[:, np.newaxis]
 ...: slices = np.repeat(phi_slice[np.newaxis, :, :], numbins, axis=0).reshape(-1, 2)
 ...: surface2 = np.hstack([slices, phis])
 ...: 
10000 loops, best of 3: 115 μs per loop

And my nomination (thanks to the others for helping me see the pattern); I'm taking advantage of broadcasting in assignment.

surface3 = np.zeros((numbins,numbins,3))
phis = np.linspace(0, (2 * np.pi) - dphi, numbins)
surface3[:,:,2] = phis[:,None]
surface3[:,:,:2] = phi_slice[None,:,:]
surface3.shape = (numbins**2,3)

A bit better:

In [50]: %%timeit
 ...: surface3 = np.zeros((numbins,numbins,3))
 ...: phis=np.linspace(0, (2 * np.pi) - dphi, numbins)
 ...: surface3[:,:,2]=phis[:,None]
 ...: surface3[:,:,:2]=phi_slice[None,:,:]
 ...: surface3.shape=(numbins**2,3)
 ...: 
10000 loops, best of 3: 73.3 μs per loop

edit

Replacing:

surface3[:,:,:2]=phi_slice[None,:,:]

with

surface3[:,:,0]=r_vals[None,:]
surface3[:,:,1]=theta_vals[None,:]

squeezes out a bit more of a time improvement, especially if phi_slice was constructed solely for this use.

Question 8

Interested to see how things stack up when numbins is "in the thousands" since both of you have a reshape at the end.

Question 9

yeah, that's when things start getting really slow because surface is necessarily numbins^2

Question 10

I don't think reshape, especially the inplace one, changes relative timings. It's a trivial operation.

Question 11

Building this sort of blockwise array is done with np.repeat and np.tile

phi_slices = np.tile(phi_slice.T, numbins).T
phi_indices = np.repeat(np.linspace(0, (2 * np.pi) - dphi, numbins), numbins)
surface = np.c_[phi_slices, phi_indices]

Question 12

Thanks! I've not seen np.repeat before, seems like exactly what I need

Question 13

This is a possible way to vectorize the loop:

phis = np.repeat(np.linspace(0, (2 * np.pi) - dphi, numbins), numbins)[:, np.newaxis]
slices = np.repeat(phi_slice[np.newaxis, :, :], numbins, axis=0).reshape(-1, 2)
surface2 = np.hstack([slices, phis])
print(np.allclose(surface, surface2))
# True

That's what is going on in detail:

np.repeat(np.linspace(0, (2 * np.pi) - dphi, numbins), numbins) takes the array with all values of phi and repeats each element numbins times. [:, np.newaxis] brings the result into 2D shape (numbins**2, 1).
phi_slice[np.newaxis, :, :] brings phi_slice into 3D shape (1, numbins, 2.. This array is repeated along the first axis, resulting in shape (numbins, numbins, 2). Finally reshape(-1, 2) brings the first two dimensions back together to final shape (numbins**2, 2).
np.hstack([slices, phis]) combines both parts of the final array.

hpaulj 233k14 gold badges260 silver badges392 bronze badges · Accepted Answer · 2017-03-03 08:15:15Z

Timing that loop:

In [9]: %%timeit
 ...: lower_bound, upper_bound = 0, numbins 
 ...: for phi in np.linspace(0, (2 * np.pi) - dphi, numbins):
 ...: surface[lower_bound:upper_bound,:2] = phi_slice
 ...: surface[lower_bound:upper_bound,2] = phi
 ...: lower_bound += numbins
 ...: upper_bound += numbins
10000 loops, best of 3: 176 μs per loop

Off hand that doesn't look bad, though if repeated in some larger context that does matter. You are looping 50 times to fill an array of 75000 items. For the size of the task the number of loops isn't large.

Daniel's alternative is a little faster, but not drastic

In [12]: %%timeit 
 ...: phi_slices = np.tile(phi_slice.T, numbins).T
 ...: phi_indices = np.repeat(np.linspace(0, (2 * np.pi) - dphi, numbins), numbins)
 ...: surface1 = np.c_[phi_slices, phi_indices]
 ...: 
10000 loops, best of 3: 137 μs per loop

kazemakase'sis a bit better still:

In [15]: %%timeit 
 ...: phis = np.repeat(np.linspace(0, (2 * np.pi) - dphi, numbins), numbins)[:, np.newaxis]
 ...: slices = np.repeat(phi_slice[np.newaxis, :, :], numbins, axis=0).reshape(-1, 2)
 ...: surface2 = np.hstack([slices, phis])
 ...: 
10000 loops, best of 3: 115 μs per loop

And my nomination (thanks to the others for helping me see the pattern); I'm taking advantage of broadcasting in assignment.

surface3 = np.zeros((numbins,numbins,3))
phis = np.linspace(0, (2 * np.pi) - dphi, numbins)
surface3[:,:,2] = phis[:,None]
surface3[:,:,:2] = phi_slice[None,:,:]
surface3.shape = (numbins**2,3)

A bit better:

In [50]: %%timeit
 ...: surface3 = np.zeros((numbins,numbins,3))
 ...: phis=np.linspace(0, (2 * np.pi) - dphi, numbins)
 ...: surface3[:,:,2]=phis[:,None]
 ...: surface3[:,:,:2]=phi_slice[None,:,:]
 ...: surface3.shape=(numbins**2,3)
 ...: 
10000 loops, best of 3: 73.3 μs per loop

edit

Replacing:

surface3[:,:,:2]=phi_slice[None,:,:]

with

surface3[:,:,0]=r_vals[None,:]
surface3[:,:,1]=theta_vals[None,:]

squeezes out a bit more of a time improvement, especially if phi_slice was constructed solely for this use.

Interested to see how things stack up when numbins is "in the thousands" since both of you have a reshape at the end.
yeah, that's when things start getting really slow because surface is necessarily numbins^2
I don't think reshape, especially the inplace one, changes relative timings. It's a trivial operation.

CollectivesTM on Stack Overflow

How to fill an array without a for loop

3 Answers 3

edit

3 Comments

1 Comment

Comments

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

CollectivesTM on Stack Overflow

3 Answers 3

edit

3 Comments

1 Comment

Comments

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related