6

I developed an application which simulates N robots moving in a grid which try to maximize the amount of visited grid cells in a limited amount of steps, meeting in a goal point. It all works correctly, but is horrible slow. It is currently python+numpy+mathplotlib.

Maximum robots there can be have a soft limit of 100 (if it can get higher, it is nice to have).

To do that, I do the following, simplified:

while steps > 0:
 for robot in robots:
 agent.calc(robot,steps)

A robot is a 1x2 numpy array (x-and-y-coordinates).

The agent here decides what to do. Since I need to switch the tactic and strategy on the fly, I cannot move that logic.

agent.calc updates a robot in place, one after another.

cProfiling it returns the following. Extracting the top

 39014272 function calls (39010490 primitive calls) in 150.314 seconds
 Ordered by: internal time
 ncalls tottime percall cumtime percall filename:lineno(function)
 12417735 62.807 0.000 62.807 0.000 distance.py:8(taxicab_distance)
 124596 36.882 0.000 36.882 0.000 {numpy.core.multiarray.array}
 113657 30.204 0.000 100.800 0.001 logical_agent.py:16(choose_max_distance_to...)
 12417013 6.579 0.000 69.384 0.000 squaregrid.py:30(distance)
 113700 2.900 0.000 109.769 0.001 logical_agent.py:73(calc)
 11652363 2.625 0.000 2.625 0.000 {method 'append' of 'list' objects}
 161849 1.653 0.000 1.653 0.000 distance.py:11(euclidean_distance)
 113664 1.632 0.000 1.632 0.000 {sorted}
 114834 1.185 0.000 1.185 0.000 {method 'keys' of 'dict' objects}
 113700 0.695 0.000 1.134 0.000 squaregrid.py:19(neighbours)

I implemented different environments for the robots, the most important is squaregird. Every environment has its own distance function, since I intended to use different metrics, i.e. Manhattan/taxicab and euclidean. I extracted the distance function into an own distance.py file, since I use it in several occasions.

One can see that taxicab_distance is called alot, since the agent needs to evaluate the distances of a robots four neighbours and itself to a goal point to see whether the next position can still reach the goal and to maximize the distance to all other robots as a optimizing heuristics.

The function does not do anything fancy, just

def taxicab_distance(u, v):
 return np.abs(u[0] - v[0]) + np.abs(u[1] - v[1])

I know that python has a pretty high function call overhead, and I assume that that hits the performance. The {numpy.core.multiarray.array} can be ignored, I think I know what I am doing wrong there.

Distance call chain: agent -> environment.distance -> taxicab_distance

The question is, how can I reduce the overhead of calling the function? I strongly considered using pythons c extensibility, cython, to be more concrete. Will it work? can there be another reason why it is so slow?

asked Jan 1, 2014 at 14:45
8
  • I doing very much that the overhead of calling the function is significant, seeing as it is almost certainly dominated by the actual work done by the function. Commented Jan 1, 2014 at 15:06
  • 7
    As an experiment, temporarily stop using taxicab_distance. Instead, execute function's code directly: np.abs(u[0] - v[0]) + np.abs(u[1] - v[1]). Does it help the performance in an appreciable way? My guess is that you should be hunting big (a better algorithm, caching strategies, etc), not small (micro-optimizations like function call overhead). Commented Jan 1, 2014 at 15:11
  • 1
    Are you taking advantage of vectorized operations? It doesn't look like it. If you don't, using NumPy will actually be slower than using regular lists. Commented Jan 1, 2014 at 15:15
  • 1
    method 'keys' of 'dict' objects - this is unlikely to have a major impact on your runtime, but why are you calling that method? Most of the time, it's unnecessary. If you want to iterate over a dict's keys, just use for key in d. Commented Jan 1, 2014 at 15:17
  • @user2357112 I dont need to iterate over the dict keys, I store them to animate them later. For every step, I snapshot the grid cells and robot positions for matplotlib animate. Commented Jan 1, 2014 at 15:45

2 Answers 2

5

First, I'd rewrite it into:

def taxicab_distance(u, v):
 return np.sum(np.abs(u - v))

Can you compute taxicab_distance for many robots at once?

answered Jan 1, 2014 at 15:02
Sign up to request clarification or add additional context in comments.

3 Comments

exactly. to take advantage of the performance benefits of numpy, you need to vectorize your code; which lifts the inner loops from python to the C-level.
How would I vectorize my agent.calc(robot,steps) ? The underlying data structure is an array of robots(2d-points), and I potentially could return the value of the call instead of updating in place.
You need to show us what agent.calc does. I suggest creating a question a la "How to vectorize ..."
3

I benchmarked it with inlining, and it took off ~15sec. In the end, I rewrote the number crunching in C++ and used cython for integrating. After that, it took only 1 second.

EDIT: cpython -> cython

answered Jun 22, 2014 at 22:47

3 Comments

do you really mean 'cpython', or something else?
I meant cython to call C++ from Python.
You gave yourself an XY problem; numpy is designed to work with aggregate data because of the exceptionally high overhead python has for making function calls, see stackoverflow.com/questions/1401712/…. You could have used np.sum(np.abs(np.subtract(robot, neighbors)), axis=1), and perhaps 'neighbors' would be the list of all robots, for cache coherence.

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.