Code Review of small scientific project, particuarly array vs. list perform

Question 1

I'm new to python but not to programming in general. This is my first project beyond the basics.

I was wondering if I could get some feedback on the code in general, in particular on any bad Java/C++ habits I might be dragging along.

I have some specific questions regarding performance of numpy's array vs. lists. As you can see from this diff, I have rewritten some of the code to use the array class, which seems more "pythonic" (code seems to read better) but I find that it is sometimes almost twice as slow. Is this an inevitable performance hit for the abstraction, or am I doing something obviously wrong?

To get a proper working copy of the code:

git clone git://github.com/hemmer/pyDLA.git

And to checkout the array version use (afterwards):

git checkout numpyarray_slow

Any advice, however small, is greatly appreciated!

EDIT: Here is the source of the older (faster code). For the changes I made to use numpy arrays, please see the diff above.

#!/usr/bin/env python
import time
from math import pi, sqrt, cos, sin
from random import choice, random, seed
from pylab import pcolormesh, axes, show
from numpy import zeros, int32, arange
twopi = 2 * pi # 2 pi for choosing starting position
hits = 0 # how many seeds have stuck
birthradius = 5 # starting radius for walkers
deathradius = 10 # radius to kill off walkers
maxradius = -1 # the extent of the growth
numParticles = 1000 # how many walkers to release
seed(42) # fixed seed for debugging
L = 500 # lattice goes from -L : L
size = (2 * L) + 1 # so total lattice width is 2L + 1
# preallocate and initialise centre point as "seed"
lattice = zeros((size, size), dtype=int32)
lattice[L, L] = -1
# possible (relative) nearest neighbour sites
nnsteps = ((0, 1), (0, -1), (1, 0), (-1, 0))
# returns whether site pos = (x, y) has
# an occupied nearest-neighbour site
def nnOccupied(pos):
 # convert from lattice coords to array coords
 latticepos = (pos[0] + L, pos[1] + L)
 for step in nnsteps:
 if lattice[latticepos[0] + step[0], latticepos[1] + step[1]] != 0:
 return True
 else:
 return False
# end of nnOccupied
# check if a point is within the
# allowed radius
def inCircle(pos):
 if (pos[0] ** 2 + pos[1] ** 2) > deathradius ** 2: # faster than sqrt
 return False
 else:
 return True
# end of inCircle
# registers an extension on the seed
def registerHit(pos):
 global hits, birthradius, deathradius, maxradius
 # check if this "hit" extends the max radius
 norm2 = (pos[0] ** 2 + pos[1] ** 2)
 if norm2 > maxradius ** 2:
 maxradius = int(sqrt(norm2))
 birthradius = maxradius + 5 if (maxradius + 5) < L else L
 deathradius = maxradius + 20 if (maxradius + 20) < L else L
 hits += 1
 lattice[pos[0] + L, pos[1] + L] = hits
# end of registerHit
starttime = time.time()
print "Running", numParticles, "particles..."
for particle in xrange(numParticles):
 #print particle
 # find angle on [0, 2pi)
 angle = random() * twopi
 # and convert to a starting position, pos = (x, y),
 # on a circle of radius "birthradius" around the centre seed
 pos = [int(sin(angle) * birthradius), int(cos(angle) * birthradius)]
 isDead = False # walker starts off alive
 while not isDead:
 # pick one of the nearest neighbour sites to explore
 moveDir = choice(nnsteps)
 # and apply the selected move to position coordinate, pos
 pos[0] += moveDir[0]
 pos[1] += moveDir[1]
 if not inCircle(pos):
 isDead = True
 break
 elif nnOccupied(pos):
 registerHit(pos)
 isDead = True
 break
endtime = time.time()
print "Ran in time:", (endtime - starttime)
print "Maximum radius:", maxradius
# select only the interesting parts
M = maxradius
grph = L - M, L + M
# and plot
axis = arange(-M, M + 1)
pcolormesh(axis, axis, lattice[grph[0]:grph[1], grph[0]:grph[1]])
axes().set_aspect('equal', 'datalim')
show()

Question 2

Welcome to the Code Review stack. You are asking the kind of question this stack was designed for, but you need to include a reasonable amount of the code in your question. Please check the faq.

Question 3

Thanks! The code is around 100 lines, I assumed that would be too much to paste in full? It probably needs most of the context to make sense...

Question 4

@Hemmer, 100 lines is large, but not completely unreasonable. You'll probably get more people willing to respond if you can find a shorter piece to get advice on.

Question 5

@Hemmer 100 lines is not necessarily too much to post here. You are much more likely to get responses with 100 lines here than with a link to somewhere else. You should be able to edit your post to add the code.

Question 6

@jimreeed ok thanks, quite a few of those 100 are whitespace so its not as bad as it seems!

Question 7

Use of global variables is frowned upon. You should probably define a class, and hold the data as attributes on it.
Don't put logic at the module level, i.e. your main for loop. That is better placed inside of a main() function
In your diff, you use python's sum rather then numpy's sum. when dealing with numpy array's numpy's sum will be much faster. Basically, you should never use python math functions on numpy arrays. Always use the numpy equivalents.
Numpy is really not designed for having small arrays of two elements. Its really focused on having larger arrays so you are going to have trouble getting the speed benefits.
You should avoid writing loops over numpy arrays, instead you should use numpy's vector operation features

To get an effective use of numpy, you probably need if possible to process multiple particles in parallel so that you are operating on large arrays. The numpy's processing abilities will be able to help.

EDIT: Quick Sample of numpy vector operations

angles = numpy.random.rand(numParticles) * twopi
positions = numpy.empty((numParticles, 2))
positions[0,:] = numpy.sin(angles) * birthradius
positions[1,:] = numpy.cos(angles) * birthradius

This constructs a 2D array of positions. No python loops are used, instead operations over arrays are used. I'm not seeing a good way to use it in your case. Ordinarilly, I'd use them to run all particles at the same time. But you have dependency between your different runs.

More thoughts:

By keeping an array of bools marking the positions next to "hit" locations, you can avoid checking the neighbours in nnOccupied. Its cheaper to update the "next_to" array when a hit is detected.
Using two coordinate systems 0-based and L-based, confusion results. Standardise on one.
Setting isDead and using break is redundant. Pick one and stick with it
Don't if expr: return True else return False just return it
Why is nnsteps redecared in registerHit?

** EDIT: My rewrite of your code **

#!/usr/bin/env python
import time
from pylab import pcolormesh, axes, show
import numpy as np
L = 500 # lattice goes from -L : L
SIZE = (2 * L) + 1 # so total lattice width is 2L + 1
NNSTEPS = np.array([
 (0, 1), 
 (0, -1), 
 (1, 0), 
 (-1, 0)
])
CHUNK_SIZE = 1000
class Simulation(object):
 def __init__(self, num_particles):
 self.hits = 0
 self.birthradius = 5
 self.deathradius = 10
 self.maxradius = -1
 self.lattice = np.zeros((SIZE, SIZE), dtype=np.int32)
 self.lattice[L, L] = -1
 # initialize center point as the seed
 self.hit_zone = np.zeros((SIZE, SIZE), dtype=bool)
 self.hit_zone[L, L] = True
 self.hit_zone[ NNSTEPS[:,0] + L, NNSTEPS[:,1] + L ] = True
 # the hit_zone is all of the position next to places that have
 # actually hit, these count as hits
 self.num_particles = num_particles
 def register_hit(self, pos):
 neighbors = NNSTEPS + pos
 self.hit_zone[neighbors[:,0], neighbors[:,1]] = True
 # update hit_zone
 # check if this "hit" extends the max radius
 norm2 = (pos[0] - L)**2 + (pos[1] - L)**2
 if norm2 > self.maxradius ** 2:
 self.maxradius = int(np.sqrt(norm2))
 self.birthradius = min(self.maxradius + 5, L)
 self.deathradius = min(self.maxradius + 20, L)
 self.hits += 1
 self.lattice[pos] = self.hits
 def run(self):
 for particle in xrange(self.num_particles):
 # find angle on [0, 2pi)
 angle = np.random.random() * 2 * np.pi
 # and convert to a starting position, pos = (x, y),
 # on a circle of radius "birthradius" around the centre seed
 pos = (np.sin(angle) * self.birthradius + L, np.cos(angle) * self.birthradius + L)
 while True:
 moves = np.random.randint(0, 4, CHUNK_SIZE) # pick the move numbers
 moves = np.cumsum(NNSTEPS[moves], axis = 0) # grab the move and do a sum
 # add starting position to all of that
 moves[:,0] += pos[0]
 moves[:,1] += pos[1]
 # calculate distance to center for all the points
 from_center = moves - L
 distances_to_center = from_center[:,0]**2 + from_center[:,1]**2
 alive = distances_to_center < self.deathradius ** 2
 alive = np.minimum.accumulate(alive)
 particle_hits = self.hit_zone[moves[:,0], moves[:,1]]
 if np.any(particle_hits):
 first_hit = particle_hits.nonzero()[0][0]
 if alive[first_hit]:
 pos = tuple(moves[first_hit])
 self.register_hit(pos)
 break
 else:
 if np.all(alive):
 pos = tuple(moves[-1])
 else:
 break
NUMBER_PARTICLES = 1000
def main():
 np.random.seed(42)
 simulation = Simulation(NUMBER_PARTICLES)
 starttime = time.time()
 print "Running", NUMBER_PARTICLES, "particles..."
 simulation.run()
 endtime = time.time()
 print "Ran in time:", (endtime - starttime)
 print "Maximum radius:", simulation.maxradius
 # select only the interesting parts
 M = simulation.maxradius
 grph = L - M, L + M
 # and plot
 axis = np.arange(-M, M + 1)
 pcolormesh(axis, axis, simulation.lattice[grph[0]:grph[1], grph[0]:grph[1]])
 axes().set_aspect('equal', 'datalim')
 show()
if __name__ == '__main__':
 main()

The tricky part here was to see how I could numpy's vector operations to help in this situation. Here I'll try to explain that.

 moves = np.random.randint(0, 4, CHUNK_SIZE) # pick the move numbers

Here we pick 1000 different numbers 0, 1, 2, or 3, corresponding to the directions we can move

 moves = np.cumsum(NNSTEPS[moves], axis = 0) # grab the move and do a sum

We use those numbers to index into the NNSTEPS array, thus we have an array of all the moves. Cumsum adds cumulatively over the whole array. So the nth element is the sum of the first n elements.

 # add starting position to all of that
 moves[:,0] += pos[0]
 moves[:,1] += pos[1]

Add the start position to all of that producing an array of the next 1000 positions that will be visited.

 # calculate distance to center for all the points
 from_center = moves - L
 distances_to_center = from_center[:,0]**2 + from_center[:,1]**2

We calculate all of the distances to the center point. Since we don't have to write explicit loops this is pretty efficient.

 alive = distances_to_center < self.deathradius ** 2
 alive = np.minimum.accumulate(alive)

alive indicates whether the particular is alive during a particular move. We start by marking those element as alive when the distance is less than deathradius. The minimum.accumulate sets the nth element to the be the minimum of the first n elements. The consequence is that after the first false (dead) element, all the rest are dead

 particle_hits = self.hit_zone[moves[:,0], moves[:,1]]

Here we do a vector operation checking the hit_zone. particle_hits becomes a boolean array of every that the path will hit something.

 if np.any(particle_hits):
 first_hit = particle_hits.nonzero()[0][0]

If anything of the 1000 we are considering hits, figure out where

 if alive[first_hit]:

Make sure we are still alive

 pos = tuple(moves[first_hit])
 self.register_hit(pos)

Record that hit break else: if np.all(alive): pos = tuple(moves[-1])

If we survived, take the tuple and run another round.

 else:
 break

Basically, this way we run 1000 elements of a path at a time avoiding much of the overhead of python.

Question 8

Thanks, not looked at python classes yet so this is a good excuse to learn. The rest of the advice is really solid too, already seeing speedups! I was wondering if there were any specific examples for point 5. that I could use?

Question 9

@Hemmer, I've added a rewrite of your code.

Question 10

Wow this is a fantastic update, really good stuff here. With the numpy stuff, it really requires a different way of thinking about the problem, I had just been forcing it into my traditional style (of manually iterating through with loops). This is exactly the sort of response I was looking for thanks (a lot to take in too!), I might try applying this kind of thinking to the Ising Model next.

Question 11

@Hemmer, numpy is bizarre at first, then somehow it eventually becomes oddly natural.

Winston Ewert Winston Ewert 30.7k4 gold badges52 silver badges79 bronze badges · Accepted Answer · 2011-08-23 17:37:12Z

Use of global variables is frowned upon. You should probably define a class, and hold the data as attributes on it.
Don't put logic at the module level, i.e. your main for loop. That is better placed inside of a main() function
In your diff, you use python's sum rather then numpy's sum. when dealing with numpy array's numpy's sum will be much faster. Basically, you should never use python math functions on numpy arrays. Always use the numpy equivalents.
Numpy is really not designed for having small arrays of two elements. Its really focused on having larger arrays so you are going to have trouble getting the speed benefits.
You should avoid writing loops over numpy arrays, instead you should use numpy's vector operation features

To get an effective use of numpy, you probably need if possible to process multiple particles in parallel so that you are operating on large arrays. The numpy's processing abilities will be able to help.

EDIT: Quick Sample of numpy vector operations

angles = numpy.random.rand(numParticles) * twopi
positions = numpy.empty((numParticles, 2))
positions[0,:] = numpy.sin(angles) * birthradius
positions[1,:] = numpy.cos(angles) * birthradius

This constructs a 2D array of positions. No python loops are used, instead operations over arrays are used. I'm not seeing a good way to use it in your case. Ordinarilly, I'd use them to run all particles at the same time. But you have dependency between your different runs.

More thoughts:

By keeping an array of bools marking the positions next to "hit" locations, you can avoid checking the neighbours in nnOccupied. Its cheaper to update the "next_to" array when a hit is detected.
Using two coordinate systems 0-based and L-based, confusion results. Standardise on one.
Setting isDead and using break is redundant. Pick one and stick with it
Don't if expr: return True else return False just return it
Why is nnsteps redecared in registerHit?

** EDIT: My rewrite of your code **

#!/usr/bin/env python
import time
from pylab import pcolormesh, axes, show
import numpy as np
L = 500 # lattice goes from -L : L
SIZE = (2 * L) + 1 # so total lattice width is 2L + 1
NNSTEPS = np.array([
 (0, 1), 
 (0, -1), 
 (1, 0), 
 (-1, 0)
])
CHUNK_SIZE = 1000
class Simulation(object):
 def __init__(self, num_particles):
 self.hits = 0
 self.birthradius = 5
 self.deathradius = 10
 self.maxradius = -1
 self.lattice = np.zeros((SIZE, SIZE), dtype=np.int32)
 self.lattice[L, L] = -1
 # initialize center point as the seed
 self.hit_zone = np.zeros((SIZE, SIZE), dtype=bool)
 self.hit_zone[L, L] = True
 self.hit_zone[ NNSTEPS[:,0] + L, NNSTEPS[:,1] + L ] = True
 # the hit_zone is all of the position next to places that have
 # actually hit, these count as hits
 self.num_particles = num_particles
 def register_hit(self, pos):
 neighbors = NNSTEPS + pos
 self.hit_zone[neighbors[:,0], neighbors[:,1]] = True
 # update hit_zone
 # check if this "hit" extends the max radius
 norm2 = (pos[0] - L)**2 + (pos[1] - L)**2
 if norm2 > self.maxradius ** 2:
 self.maxradius = int(np.sqrt(norm2))
 self.birthradius = min(self.maxradius + 5, L)
 self.deathradius = min(self.maxradius + 20, L)
 self.hits += 1
 self.lattice[pos] = self.hits
 def run(self):
 for particle in xrange(self.num_particles):
 # find angle on [0, 2pi)
 angle = np.random.random() * 2 * np.pi
 # and convert to a starting position, pos = (x, y),
 # on a circle of radius "birthradius" around the centre seed
 pos = (np.sin(angle) * self.birthradius + L, np.cos(angle) * self.birthradius + L)
 while True:
 moves = np.random.randint(0, 4, CHUNK_SIZE) # pick the move numbers
 moves = np.cumsum(NNSTEPS[moves], axis = 0) # grab the move and do a sum
 # add starting position to all of that
 moves[:,0] += pos[0]
 moves[:,1] += pos[1]
 # calculate distance to center for all the points
 from_center = moves - L
 distances_to_center = from_center[:,0]**2 + from_center[:,1]**2
 alive = distances_to_center < self.deathradius ** 2
 alive = np.minimum.accumulate(alive)
 particle_hits = self.hit_zone[moves[:,0], moves[:,1]]
 if np.any(particle_hits):
 first_hit = particle_hits.nonzero()[0][0]
 if alive[first_hit]:
 pos = tuple(moves[first_hit])
 self.register_hit(pos)
 break
 else:
 if np.all(alive):
 pos = tuple(moves[-1])
 else:
 break
NUMBER_PARTICLES = 1000
def main():
 np.random.seed(42)
 simulation = Simulation(NUMBER_PARTICLES)
 starttime = time.time()
 print "Running", NUMBER_PARTICLES, "particles..."
 simulation.run()
 endtime = time.time()
 print "Ran in time:", (endtime - starttime)
 print "Maximum radius:", simulation.maxradius
 # select only the interesting parts
 M = simulation.maxradius
 grph = L - M, L + M
 # and plot
 axis = np.arange(-M, M + 1)
 pcolormesh(axis, axis, simulation.lattice[grph[0]:grph[1], grph[0]:grph[1]])
 axes().set_aspect('equal', 'datalim')
 show()
if __name__ == '__main__':
 main()

The tricky part here was to see how I could numpy's vector operations to help in this situation. Here I'll try to explain that.

 moves = np.random.randint(0, 4, CHUNK_SIZE) # pick the move numbers

Here we pick 1000 different numbers 0, 1, 2, or 3, corresponding to the directions we can move

 moves = np.cumsum(NNSTEPS[moves], axis = 0) # grab the move and do a sum

We use those numbers to index into the NNSTEPS array, thus we have an array of all the moves. Cumsum adds cumulatively over the whole array. So the nth element is the sum of the first n elements.

 # add starting position to all of that
 moves[:,0] += pos[0]
 moves[:,1] += pos[1]

Add the start position to all of that producing an array of the next 1000 positions that will be visited.

 # calculate distance to center for all the points
 from_center = moves - L
 distances_to_center = from_center[:,0]**2 + from_center[:,1]**2

We calculate all of the distances to the center point. Since we don't have to write explicit loops this is pretty efficient.

 alive = distances_to_center < self.deathradius ** 2
 alive = np.minimum.accumulate(alive)

alive indicates whether the particular is alive during a particular move. We start by marking those element as alive when the distance is less than deathradius. The minimum.accumulate sets the nth element to the be the minimum of the first n elements. The consequence is that after the first false (dead) element, all the rest are dead

 particle_hits = self.hit_zone[moves[:,0], moves[:,1]]

Here we do a vector operation checking the hit_zone. particle_hits becomes a boolean array of every that the path will hit something.

 if np.any(particle_hits):
 first_hit = particle_hits.nonzero()[0][0]

If anything of the 1000 we are considering hits, figure out where

 if alive[first_hit]:

Make sure we are still alive

 pos = tuple(moves[first_hit])
 self.register_hit(pos)

Record that hit break else: if np.all(alive): pos = tuple(moves[-1])

If we survived, take the tuple and run another round.

 else:
 break

Basically, this way we run 1000 elements of a path at a time avoiding much of the overhead of python.

Thanks, not looked at python classes yet so this is a good excuse to learn. The rest of the advice is really solid too, already seeing speedups! I was wondering if there were any specific examples for point 5. that I could use?
Wow this is a fantastic update, really good stuff here. With the numpy stuff, it really requires a different way of thinking about the problem, I had just been forcing it into my traditional style (of manually iterating through with loops). This is exactly the sort of response I was looking for thanks (a lot to take in too!), I might try applying this kind of thinking to the Ising Model next.
@Hemmer, numpy is bizarre at first, then somehow it eventually becomes oddly natural.

Stack Exchange Network

Code Review of small scientific project, particuarly array vs. list perform

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Code Review of small scientific project, particuarly array vs. list perform

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions