I have an array of identifiers that have been grouped into threes. For each group, I would like to randomly assign its members to one of three sets and to store those assignments in another array. So, for a given array of grouped identifiers (I presort them):
groupings = array([1,1,1,2,2,2,3,3,3])
A possible output would be
assignments = array([0,1,2,1,0,2,2,0,1])
Ultimately, I would like to be able to generate many of these assignment lists and to do so efficiently. My current method is just to create a zeros array and set each consecutive subarray of length 3 to a random permutation of 0, 1, 2.
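import numpy

# 12 identifiers (4 groups of 3) by 10 independent assignment lists
assignment = numpy.zeros((12, 10), dtype=int)
for i in range(0, 12, 3):
    for j in range(10):
        # each group of 3 rows in column j gets a random ordering of 0, 1, 2
        assignment[i:i+3, j] = numpy.random.permutation(3)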
Is there a better/faster way?
- So I understand the '10' is a dummy example value that you'd like much bigger. What about the '12', is it also a dummy value or will it always be 12? – Julien, Oct 26, 2015 at 23:47
- It is also a dummy value. In reality, for my case, it's closer to 12k. – dunstantom, Oct 26, 2015 at 23:54
1 Answer
Two things I can think of:
- Instead of visiting the 2D array as 3 rows * 1 column in your inner loop, try to visit it as 1 row * 3 columns. Accessing a 2D array horizontally first is usually faster than vertically first, since it gives you better spatial locality, which is good for caching.
- Instead of running numpy.random.permutation(3) each time, if 3 is fixed and is a small number, try generating all the permutations beforehand and saving them in a constant array of arrays, like (array([0,1,2]), array([0,2,1]), array([1,0,2]), ...). You then just need to randomly pick one array from it each time.
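The two suggestions above could be combined roughly as follows. This is only a sketch, not a benchmarked solution: the names PERMS and random_assignments are made up for illustration, it assumes NumPy 1.17 or later (for numpy.random.default_rng), and it lays the result out as one assignment list per row, i.e. the transpose of the (12, 10) array in the question. It replaces both Python loops with a single random index lookup into the precomputed table of the 6 permutations.

```python
import itertools
import numpy as np

# All 6 orderings of (0, 1, 2), built once: shape (6, 3).
# PERMS is a hypothetical name used only for this sketch.
PERMS = np.array(list(itertools.permutations(range(3))))

def random_assignments(n_groups, n_samples, rng=None):
    """Return an (n_samples, 3 * n_groups) int array in which every
    consecutive block of 3 columns in a row is a permutation of 0, 1, 2."""
    rng = np.random.default_rng() if rng is None else rng
    # One random index into PERMS per (sample, group) pair.
    idx = rng.integers(0, len(PERMS), size=(n_samples, n_groups))
    # Fancy indexing yields shape (n_samples, n_groups, 3);
    # flattening the last two axes puts each group's triple side by side.
    return PERMS[idx].reshape(n_samples, 3 * n_groups)

# Same sizes as the question: 4 groups of 3, 10 assignment lists.
assignment = random_assignments(n_groups=4, n_samples=10)
```

If the original orientation is needed, assignment.T gives the (12, 10) layout. Picking uniformly among the 6 precomputed rows should be equivalent to calling numpy.random.permutation(3) for each group, since every ordering is equally likely in both cases.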