I have some code which generates the coordinates of a cylindrically-symmetric surface, with coordinates given as (r, theta, phi). At the moment, I generate the coordinates of one phi slice, and store that in a 2xN array (for N bins), and then in a for loop I copy this array for each value of phi from 0 to 2pi:
import numpy as np
# this is the number of bins that my surface is chopped into
numbins = 50
# these are the values for r
r_vals = np.linspace(0.0001, 50, numbins, endpoint = True)
# these are the values for theta
theta_vals = np.linspace(0, np.pi / 2, numbins, endpoint = True)
# I combine the r and theta values into a 2xnumbins array for one "slice" of phi
phi_slice = np.vstack([r_vals,theta_vals]).T
# this is the array which will store all of the coordinates of the surface
surface = np.zeros((numbins**2,3))
lower_bound = 0
upper_bound = numbins
# this is the size of the bin for phi
dphi = (2 * np.pi) / numbins
# this is the for loop I'd like to eliminate.
# For every value of phi, it puts a copy of the phi_slice array into
# the surface array, so that the surface is cylindrical about phi.
for phi in np.linspace(0, (2 * np.pi) - dphi, numbins):
surface[lower_bound:upper_bound, :2] = phi_slice
surface[lower_bound:upper_bound,2] = phi
lower_bound += numbins
upper_bound += numbins
I'm calling this routine in a numerical integration of 1e6 or 1e7 steps, and while numbins is 50 in the example above, in practice it'll be in the thousands. This for loop is a choking point, and I'd really like to eliminate it to speed things up. Is there a good NumPythonic way to do the same thing as this for loop?
-
I'm voting to close this question as off-topic because the question belongs on another site in the Stack Exchange network: codereview.stackexchange.comAChampion– AChampion2017年03月03日 03:57:56 +00:00Commented Mar 3, 2017 at 3:57
-
Keep it here; eliminating loops from numpy code is a common SO question. There are more numpy experts here.hpaulj– hpaulj2017年03月03日 04:11:47 +00:00Commented Mar 3, 2017 at 4:11
-
1But it helps if the example is MCVe - minimal, complete, verifiable. That way we can easily copy-n-paste, test results and test changes. Show some inputs and outputs.hpaulj– hpaulj2017年03月03日 04:14:39 +00:00Commented Mar 3, 2017 at 4:14
-
It's hard to picture what's going on by just reading the code. It needs to be simpler with some concrete values. Add some desciption. How many times are you looping?hpaulj– hpaulj2017年03月03日 04:49:29 +00:00Commented Mar 3, 2017 at 4:49
-
Thanks for the feedback! I've edited it, and it's now MCV; let me know if there's any more I can elaborate.Alex– Alex2017年03月03日 05:57:21 +00:00Commented Mar 3, 2017 at 5:57
3 Answers 3
Timing that loop:
In [9]: %%timeit
...: lower_bound, upper_bound = 0, numbins
...: for phi in np.linspace(0, (2 * np.pi) - dphi, numbins):
...: surface[lower_bound:upper_bound,:2] = phi_slice
...: surface[lower_bound:upper_bound,2] = phi
...: lower_bound += numbins
...: upper_bound += numbins
10000 loops, best of 3: 176 μs per loop
Off hand that doesn't look bad, though if repeated in some larger context that does matter. You are looping 50 times to fill an array of 75000 items. For the size of the task the number of loops isn't large.
Daniel's alternative is a little faster, but not drastic
In [12]: %%timeit
...: phi_slices = np.tile(phi_slice.T, numbins).T
...: phi_indices = np.repeat(np.linspace(0, (2 * np.pi) - dphi, numbins), numbins)
...: surface1 = np.c_[phi_slices, phi_indices]
...:
10000 loops, best of 3: 137 μs per loop
kazemakase'sis a bit better still:
In [15]: %%timeit
...: phis = np.repeat(np.linspace(0, (2 * np.pi) - dphi, numbins), numbins)[:, np.newaxis]
...: slices = np.repeat(phi_slice[np.newaxis, :, :], numbins, axis=0).reshape(-1, 2)
...: surface2 = np.hstack([slices, phis])
...:
10000 loops, best of 3: 115 μs per loop
And my nomination (thanks to the others for helping me see the pattern); I'm taking advantage of broadcasting in assignment.
surface3 = np.zeros((numbins,numbins,3))
phis = np.linspace(0, (2 * np.pi) - dphi, numbins)
surface3[:,:,2] = phis[:,None]
surface3[:,:,:2] = phi_slice[None,:,:]
surface3.shape = (numbins**2,3)
A bit better:
In [50]: %%timeit
...: surface3 = np.zeros((numbins,numbins,3))
...: phis=np.linspace(0, (2 * np.pi) - dphi, numbins)
...: surface3[:,:,2]=phis[:,None]
...: surface3[:,:,:2]=phi_slice[None,:,:]
...: surface3.shape=(numbins**2,3)
...:
10000 loops, best of 3: 73.3 μs per loop
edit
Replacing:
surface3[:,:,:2]=phi_slice[None,:,:]
with
surface3[:,:,0]=r_vals[None,:]
surface3[:,:,1]=theta_vals[None,:]
squeezes out a bit more of a time improvement, especially if phi_slice was constructed solely for this use.
3 Comments
numbins is "in the thousands" since both of you have a reshape at the end.surface is necessarily numbins^2Building this sort of blockwise array is done with np.repeat and np.tile
phi_slices = np.tile(phi_slice.T, numbins).T
phi_indices = np.repeat(np.linspace(0, (2 * np.pi) - dphi, numbins), numbins)
surface = np.c_[phi_slices, phi_indices]
1 Comment
This is a possible way to vectorize the loop:
phis = np.repeat(np.linspace(0, (2 * np.pi) - dphi, numbins), numbins)[:, np.newaxis]
slices = np.repeat(phi_slice[np.newaxis, :, :], numbins, axis=0).reshape(-1, 2)
surface2 = np.hstack([slices, phis])
print(np.allclose(surface, surface2))
# True
That's what is going on in detail:
np.repeat(np.linspace(0, (2 * np.pi) - dphi, numbins), numbins)takes the array with all values ofphiand repeats each elementnumbinstimes.[:, np.newaxis]brings the result into 2D shape(numbins**2, 1).phi_slice[np.newaxis, :, :]bringsphi_sliceinto 3D shape(1, numbins, 2.. This array is repeated along the first axis, resulting in shape(numbins, numbins, 2). Finallyreshape(-1, 2)brings the first two dimensions back together to final shape(numbins**2, 2).np.hstack([slices, phis])combines both parts of the final array.