0

I have some code which generates the coordinates of a cylindrically-symmetric surface, with coordinates given as (r, theta, phi). At the moment, I generate the coordinates of one phi slice, and store that in a 2xN array (for N bins), and then in a for loop I copy this array for each value of phi from 0 to 2pi:

import numpy as np
# this is the number of bins that my surface is chopped into
numbins = 50
# these are the values for r
r_vals = np.linspace(0.0001, 50, numbins, endpoint = True)
# these are the values for theta
theta_vals = np.linspace(0, np.pi / 2, numbins, endpoint = True)
# I combine the r and theta values into a 2xnumbins array for one "slice" of phi
phi_slice = np.vstack([r_vals,theta_vals]).T
# this is the array which will store all of the coordinates of the surface
surface = np.zeros((numbins**2,3))
lower_bound = 0
upper_bound = numbins
# this is the size of the bin for phi
dphi = (2 * np.pi) / numbins
# this is the for loop I'd like to eliminate.
# For every value of phi, it puts a copy of the phi_slice array into
# the surface array, so that the surface is cylindrical about phi.
for phi in np.linspace(0, (2 * np.pi) - dphi, numbins):
 surface[lower_bound:upper_bound, :2] = phi_slice
 surface[lower_bound:upper_bound,2] = phi
 lower_bound += numbins
 upper_bound += numbins

I'm calling this routine in a numerical integration of 1e6 or 1e7 steps, and while numbins is 50 in the example above, in practice it'll be in the thousands. This for loop is a choking point, and I'd really like to eliminate it to speed things up. Is there a good NumPythonic way to do the same thing as this for loop?

asked Mar 3, 2017 at 3:54
9
  • I'm voting to close this question as off-topic because the question belongs on another site in the Stack Exchange network: codereview.stackexchange.com Commented Mar 3, 2017 at 3:57
  • Keep it here; eliminating loops from numpy code is a common SO question. There are more numpy experts here. Commented Mar 3, 2017 at 4:11
  • 1
    But it helps if the example is MCVe - minimal, complete, verifiable. That way we can easily copy-n-paste, test results and test changes. Show some inputs and outputs. Commented Mar 3, 2017 at 4:14
  • It's hard to picture what's going on by just reading the code. It needs to be simpler with some concrete values. Add some desciption. How many times are you looping? Commented Mar 3, 2017 at 4:49
  • Thanks for the feedback! I've edited it, and it's now MCV; let me know if there's any more I can elaborate. Commented Mar 3, 2017 at 5:57

3 Answers 3

2

Timing that loop:

In [9]: %%timeit
 ...: lower_bound, upper_bound = 0, numbins 
 ...: for phi in np.linspace(0, (2 * np.pi) - dphi, numbins):
 ...: surface[lower_bound:upper_bound,:2] = phi_slice
 ...: surface[lower_bound:upper_bound,2] = phi
 ...: lower_bound += numbins
 ...: upper_bound += numbins
10000 loops, best of 3: 176 μs per loop

Off hand that doesn't look bad, though if repeated in some larger context that does matter. You are looping 50 times to fill an array of 75000 items. For the size of the task the number of loops isn't large.

Daniel's alternative is a little faster, but not drastic

In [12]: %%timeit 
 ...: phi_slices = np.tile(phi_slice.T, numbins).T
 ...: phi_indices = np.repeat(np.linspace(0, (2 * np.pi) - dphi, numbins), numbins)
 ...: surface1 = np.c_[phi_slices, phi_indices]
 ...: 
10000 loops, best of 3: 137 μs per loop

kazemakase'sis a bit better still:

In [15]: %%timeit 
 ...: phis = np.repeat(np.linspace(0, (2 * np.pi) - dphi, numbins), numbins)[:, np.newaxis]
 ...: slices = np.repeat(phi_slice[np.newaxis, :, :], numbins, axis=0).reshape(-1, 2)
 ...: surface2 = np.hstack([slices, phis])
 ...: 
10000 loops, best of 3: 115 μs per loop

And my nomination (thanks to the others for helping me see the pattern); I'm taking advantage of broadcasting in assignment.

surface3 = np.zeros((numbins,numbins,3))
phis = np.linspace(0, (2 * np.pi) - dphi, numbins)
surface3[:,:,2] = phis[:,None]
surface3[:,:,:2] = phi_slice[None,:,:]
surface3.shape = (numbins**2,3)

A bit better:

In [50]: %%timeit
 ...: surface3 = np.zeros((numbins,numbins,3))
 ...: phis=np.linspace(0, (2 * np.pi) - dphi, numbins)
 ...: surface3[:,:,2]=phis[:,None]
 ...: surface3[:,:,:2]=phi_slice[None,:,:]
 ...: surface3.shape=(numbins**2,3)
 ...: 
10000 loops, best of 3: 73.3 μs per loop

edit

Replacing:

surface3[:,:,:2]=phi_slice[None,:,:]

with

surface3[:,:,0]=r_vals[None,:]
surface3[:,:,1]=theta_vals[None,:]

squeezes out a bit more of a time improvement, especially if phi_slice was constructed solely for this use.

answered Mar 3, 2017 at 8:15
Sign up to request clarification or add additional context in comments.

3 Comments

Interested to see how things stack up when numbins is "in the thousands" since both of you have a reshape at the end.
yeah, that's when things start getting really slow because surface is necessarily numbins^2
I don't think reshape, especially the inplace one, changes relative timings. It's a trivial operation.
1

Building this sort of blockwise array is done with np.repeat and np.tile

phi_slices = np.tile(phi_slice.T, numbins).T
phi_indices = np.repeat(np.linspace(0, (2 * np.pi) - dphi, numbins), numbins)
surface = np.c_[phi_slices, phi_indices]
answered Mar 3, 2017 at 7:57

1 Comment

Thanks! I've not seen np.repeat before, seems like exactly what I need
1

This is a possible way to vectorize the loop:

phis = np.repeat(np.linspace(0, (2 * np.pi) - dphi, numbins), numbins)[:, np.newaxis]
slices = np.repeat(phi_slice[np.newaxis, :, :], numbins, axis=0).reshape(-1, 2)
surface2 = np.hstack([slices, phis])
print(np.allclose(surface, surface2))
# True

That's what is going on in detail:

  1. np.repeat(np.linspace(0, (2 * np.pi) - dphi, numbins), numbins) takes the array with all values of phi and repeats each element numbins times. [:, np.newaxis] brings the result into 2D shape (numbins**2, 1).
  2. phi_slice[np.newaxis, :, :] brings phi_slice into 3D shape (1, numbins, 2.. This array is repeated along the first axis, resulting in shape (numbins, numbins, 2). Finally reshape(-1, 2) brings the first two dimensions back together to final shape (numbins**2, 2).
  3. np.hstack([slices, phis]) combines both parts of the final array.
answered Mar 3, 2017 at 8:00

Comments

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.