
The "vectorizing" of fancy indexing by Python's NumPy library sometimes gives unexpected results.

For example:

import numpy
a = numpy.zeros((1000,4), dtype='uint32')
b = numpy.zeros((1000,4), dtype='uint32')
i = numpy.random.random_integers(0,999,1000)
j = numpy.random.random_integers(0,3,1000)
a[i,j] += 1
for k in xrange(1000):
    b[i[k],j[k]] += 1

It gives different results in the arrays 'a' and 'b': each (i,j) pair that appears is set to 1 in 'a' regardless of how often it repeats, whereas repeats are counted in 'b'.

This is easily verified as follows:

>>> numpy.sum(a)
883
>>> numpy.sum(b)
1000

It is also notable that the fancy indexing version is almost two orders of magnitude faster than the for loop.

Is there an efficient way for NumPy to compute the repeat counts as implemented using the for loop in the provided example?
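For what it's worth, NumPy releases from 1.8 onward (newer than this question) ship numpy.ufunc.at, which performs exactly this unbuffered accumulation, so repeated index pairs each count. A minimal sketch, assuming a recent NumPy:

```python
import numpy

a = numpy.zeros((1000, 4), dtype='uint32')
i = numpy.random.randint(0, 1000, 1000)
j = numpy.random.randint(0, 4, 1000)

# Unbuffered in-place addition: repeated (i, j) pairs each add 1,
# unlike the buffered a[i, j] += 1.
numpy.add.at(a, (i, j), 1)
assert a.sum() == 1000
```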

asked Jun 12, 2012 at 16:34
  • I'm not quite sure I understand the question. Wouldn't that always be the length of i and j? Is there a case where it isn't? Commented Jun 12, 2012 at 16:39
  • Nope. In the fancy indexing case, a[i,j] += 1, any repeated (i,j) combinations will only increment a[i,j] once. It is as if the right-hand side is evaluated all at once, a[:] + 1, where all values in 'a' are 0, and the result is saved back at once setting some values in 'a' to 1. In the for-loop case, b[i[k],j[k]] += 1, repeated (i,j) pairs each increment their count by 1. Commented Jun 12, 2012 at 17:06
  • I see that the two indexing methods yield different results, but wouldn't the for-loop indexing always result in numpy.sum(b) == j.size? Because for each element in i, j some element of b is incremented once. Commented Jun 12, 2012 at 18:10
  • Yes. The for-loop version of b always adds up to j.size. However, the result I need is the final counts in b. Is there a fast way to compute b? Commented Jun 12, 2012 at 18:42
  • Ah, now I understand. You're not interested in sum(b), but in b itself. Me blockhead today. Commented Jun 12, 2012 at 18:45
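The buffering described in the comments above can be reproduced with a minimal example, where three writes to the same index collapse into one:

```python
import numpy

a = numpy.zeros(4, dtype='uint32')
# Fancy-index += is buffered: the three increments of index 0
# are evaluated against the original array and written back once.
a[[0, 0, 0]] += 1
print(a)  # [1 0 0 0]
```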

2 Answers


Ben,

The following modification to your code is a more accurate reflection of my larger problem:


 from timeit import timeit
 import numpy

 # Dimensions
 ROWS = 10000000
 SAMPLES = 10000
 COLS = 4

 # Number of times to run timeit
 TEST_COUNT = 100

 # Matrices
 a = numpy.zeros((ROWS, COLS), dtype='uint32')
 b = numpy.zeros((ROWS, COLS), dtype='uint32')
 c = numpy.zeros((ROWS, COLS), dtype='uint32')

 # Indices
 i = numpy.random.random_integers(0, ROWS - 1, SAMPLES)
 j = numpy.random.random_integers(0, COLS - 1, SAMPLES)

 # Fancy
 def fancy():
     a[i,j] += 1
 print "Fancy:", timeit("fancy()", setup="from __main__ import fancy", number=TEST_COUNT)

 # Loop
 def loop():
     for k in xrange(SAMPLES):
         b[i[k],j[k]] += 1
 print "Loop:", timeit("loop()", setup="from __main__ import loop", number=TEST_COUNT)

 # Flat
 def flat():
     flatIndices = i*COLS + j
     sums = numpy.bincount(flatIndices)
     c.flat[:len(sums)] += sums
 print "Flat:", timeit("flat()", setup="from __main__ import flat", number=TEST_COUNT)

 # Check that the loop and flat versions are equivalent
 print "Equivalent:", (c == b).all()

Note that at these dimensions there are probably few samples that are actually repeated; let's ignore that difference for now. On my machine, your code gave the following timings:


 Fancy: 0.109203100204
 Loop: 7.37937998772
 Flat: 126.571173906
 Equivalent: True

This slowdown is largely attributable to bincount() allocating a count array spanning the full flat index range (up to ROWS * COLS entries), which at these dimensions is enormous and mostly zero.
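One way to sidestep the huge sparse bincount output is to count only the flat indices that actually occur. A sketch, assuming NumPy 1.9+ for the return_counts argument of numpy.unique (smaller dimensions used here for illustration):

```python
import numpy

ROWS, COLS, SAMPLES = 100000, 4, 10000
c = numpy.zeros((ROWS, COLS), dtype='uint32')
i = numpy.random.randint(0, ROWS, SAMPLES)
j = numpy.random.randint(0, COLS, SAMPLES)

flat = i * COLS + j
# Counts are computed only for distinct indices, so the intermediate
# arrays scale with the number of samples, not with ROWS * COLS.
uniq, counts = numpy.unique(flat, return_counts=True)
c.flat[uniq] += counts.astype(c.dtype)
assert c.sum() == SAMPLES
```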

answered Jun 12, 2012 at 21:02
  • See the second edit in my post. I'm off to bed. Commented Jun 12, 2012 at 21:50

This is what I came up with (see the flat function at the bottom):

from timeit import timeit
import numpy

# Dimensions
ROWS = 1000
COLS = 4

# Number of times to run timeit
TEST_COUNT = 100

# Matrices
a = numpy.zeros((ROWS, COLS), dtype='uint32')
b = numpy.zeros((ROWS, COLS), dtype='uint32')
c = numpy.zeros((ROWS, COLS), dtype='uint32')

# Indices
i = numpy.random.random_integers(0, ROWS - 1, ROWS)
j = numpy.random.random_integers(0, COLS - 1, ROWS)

# Fancy
def fancy():
    a[i,j] += 1
print "Fancy:", timeit("fancy()", setup="from __main__ import fancy", number=TEST_COUNT)

# Loop
def loop():
    for k in xrange(ROWS):
        b[i[k],j[k]] += 1
print "Loop:", timeit("loop()", setup="from __main__ import loop", number=TEST_COUNT)

# Flat
def flat():
    flatIndices = i*COLS + j
    sums = numpy.bincount(flatIndices)
    c.flat[:len(sums)] += sums
print "Flat:", timeit("flat()", setup="from __main__ import flat", number=TEST_COUNT)

# Check that the loop and flat versions are equivalent
print "Equivalent:", (c == b).all()

On my machine, that prints:

Fancy: 0.0117284889374
Loop: 0.937140445391
Flat: 0.0165873246295
Equivalent: True

So the flat version is about 50% slower than fancy, but much faster than loop. It does take up some memory for the flatIndices and sums, though. Note that c.flat does not create a copy of c, it only provides a linear interface to the array.
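As an aside, the i*COLS + j linearisation assumes a C-ordered array; numpy.ravel_multi_index computes the same flat offsets and additionally bounds-checks the indices. A small sketch:

```python
import numpy

ROWS, COLS = 1000, 4
i = numpy.random.randint(0, ROWS, ROWS)
j = numpy.random.randint(0, COLS, ROWS)

# Same linearisation as i * COLS + j for a C-ordered (ROWS, COLS)
# array, but out-of-range indices raise instead of silently aliasing.
flat = numpy.ravel_multi_index((i, j), (ROWS, COLS))
assert (flat == i * COLS + j).all()
```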

Edit: Out of curiosity, I ran another test with COLS = 1e6 with the following results:

Fancy: 24.6063425241
Loop: N/A
Flat: 22.4316268267

Memory usage according to the Windows task manager (not the most accurate measuring tool, I know) was at about 80 MB base-line, spiking up to 130 MB during flat().

Edit Nr 2: Another attempt with your benchmark:

def flat():
    flatIndices = i*COLS + j
    uniqueIndices = numpy.unique(flatIndices)
    bins = numpy.append(uniqueIndices, uniqueIndices.max() + 1)
    histo = numpy.histogram(flatIndices, bins)
    c.flat[uniqueIndices] += histo[0]

This gives me

Fancy: 0.143004997089
Loop: 9.48711325557
Flat: 0.474580977267
Equivalent: True

This looks much better. Unfortunately, I couldn't find a way to avoid the bins array since numpy.histogram treats the last bin as a closed interval. So if you only pass uniqueIndices as bins, histogram will put both the last and second to last index in the same bin, which is usually wrong.
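The closed last bin can be seen directly in a tiny example:

```python
import numpy

# With bin edges [1, 2, 3], the first bin is the half-open interval
# [1, 2) but the last bin [2, 3] is closed, so 2 and 3 land together.
counts, edges = numpy.histogram([1, 2, 3], bins=[1, 2, 3])
print(counts)  # [1 2]
```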

answered Jun 12, 2012 at 19:24