Iterative uniform recombination of integer lists in python with numba

Question 1

I have a list of N integers and I iteratively take two integers at random, which are not at the same position in the list and uniformly recombine their binary representation, e.g. in intList = [1,4,1,5] (N=4) the number on the second and third position is chosen, aka 4 & 1, and their binary represenations 4=100 & 1=001 is uniformly recombined. Uniformly recombined means that I chose either the 1 or the 0 of the first position, one of the two 0's of the second position and either the 0 or 1 of the third position. This could result in 000 or 101 or 100 or 001. The result is saved as integer in the list. In each iteration I do this recombination for all N integers. This happens in a function with a numba decorator. My code is:

@nb.njit()
def comb():
 iterations = 100000
 N = 1000
 intList = list(range(N)) # this is just an example of possible integer lists
 l = 10 # length of the largest binary representation. 
 intList_temp = [0]*N
 for _ in range(iterations):
 for x3 in range(N):
 intList_temp[x3] = intList[x3]
 for x3 in range(N):
 randint1 = random.randint(0, N - 1)
 randint2 = random.randint(0, N - 1)
 while randint1 == randint2:
 randint1 = random.randint(0, N - 1)
 randint2 = random.randint(0, N - 1)
 a = intList[randint1]
 b = intList[randint2]
 c = a ^ ((a ^ b) & random.randint(0, (1 << l) - 1))
 intList_temp[x3] = c
 for x3 in range(N):
 intList[x3] = intList_temp[x3]
 return intList
print(timeit(lambda: comb(), number=1))
>>>2.59s

My question is, can this be improved?

Question 2

Are you sure the code works? (1 << l) throws l is undefined

Question 3

my mistake, l should be the length of the binary representation of the largest integer. If you choose l to be 10 in my example code it should work.

Question 4

The numbers in intList are chosen, such that they are equal or smaller than l but I think that is not important.

Question 5

asked this on the answer I took down, is the output supposed to be a list of the same number repeated 1000 times?

Question 6

In each iteration, we choose for each entry two random variables, which are recombined as described above. The update process is sequential and that is the reason why I create intList_temp. Did I make myself more clear? Sorry for the misunderstanding.

Question 7

No significant performance improvement but cleaner code.

Because temp_list is overwritten element-wise you can create it once and then leave it. At the end of each iteration you can then copy the entire list into int_list for the next iteration.

Similarly you can simplify creating the random ints a and b a bit. There are a lot of ways that get close to this in numpy but nothing I can find that beats the naive sampling directly works with numba and numba beats any overall solution I can think of without it.

Unfortunately (or maybe fortunately depending on your view), the numba compiler is compiling it down to the best solution I can think of for both your original version and this. Only difference is readabilitiy:

@nb.njit()
def comb(int_list, l, iterations):
 n = len(int_list)
 temp_list = int_list
 for _ in range(iterations):
 for i in range(n):
 a, b = 0, 0
 while a == b:
 a = random.randint(0, n - 1)
 b = random.randint(0, n - 1)
 temp_list[i] = a ^ ((a ^ b) & random.randint(0, l))
 int_list = temp_list
 return int_list
print(timeit(lambda: comb(list(range(1000)), (1 << 10) - 1, 100000), number=10))

Question 8

Thanks for the help! The numba decorator is kinda necessary, since other stuff is also happening in the function which would be difficult to optimize without numba. Unfortunately Random.sample does not work with numba. Do you know an efficient workaround for that?

Question 9

Do you mean the function that calls comb needs numba to work? I can get it working with numba or multiprocessing, another speedup approach, but both were slower than the solution I provided when I tested them. If it's faster not to use numba do you still need it for this particular function? A 6x speed up seems worth dropping it if possible

Question 10

Yes, unfortunately at the current state of my code I need comb to work with numba. Could you maybe add a version for that? Maybe I can make use of the non numba code later. Generally speaking, for me it is strange that it is slower with the numba decorator.

Question 11

Yeah, overall my results with numba seem to be hit or miss. Simple code tends to beat optimized complex code in my experience. I can add in the numba version for you see. It's possible your overall code would benefit from dropping numba and reducing complexity but it'd be up to you to decide if refactoring is worth it or not, especially without a guarantee of improved performance. I will say your native solution without numba was super slow though, 466 seconds vs .4

Question 12

I probably really have to think about it then. It’s always interesting to see the possible performance jumps in python. For now it would be really helpful for me if you would add a numba version.

Question 13

One obvious improvement is the repeated calculations of N-1 and 1<<l - 1 -- those can be calculated once outside the loop and put in variables.

Another wasted effort is the triple initialization of intList_temp. There is no reason to set it to 0, set it to intList, then set it to the actually calculated values.

Also that while loop to ensure the values are distinct is unnecessary. There's plenty of code out there to generate two distinct random numbers with just two random calls. (Hint: if the first random number is chosen arbitrarily, how many valid choices are there for the second random number?)

But I don't think either of these will make much of a difference in speed.

Have you profiled this?

Question 14

Predefining N-1 and 1<<l - 1 makes no measurable difference. In Numba I think there is no faster way to copy a list. Also the possibilites to generate two random numbers is limited.

Question 15

How does profiling works?

Question 16

en.wikipedia.org/wiki/Profiling_(computer_programming) toucantoco.com/en/tech-blog/tech/… stackoverflow.com/questions/55213687/…

Question 17

for x3 in range(N):
 intList_temp[x3] = intList[x3]

and

for x3 in range(N):
 intList[x3] = intList_temp[x3]

look pretty messy, I'm not proficient in python, but there should be an implementation of copying a vector much faster.

Question 18

Which one is faster in numba?

Question 19

intList_temp = intList[:] also does the job but seems to be a tiny bit slower

Coupcoup Coupcoup 1,0766 silver badges14 bronze badges · Accepted Answer · 2020-11-25 20:34:55Z

4

\$\begingroup\$

No significant performance improvement but cleaner code.

Because temp_list is overwritten element-wise you can create it once and then leave it. At the end of each iteration you can then copy the entire list into int_list for the next iteration.

Similarly you can simplify creating the random ints a and b a bit. There are a lot of ways that get close to this in numpy but nothing I can find that beats the naive sampling directly works with numba and numba beats any overall solution I can think of without it.

Unfortunately (or maybe fortunately depending on your view), the numba compiler is compiling it down to the best solution I can think of for both your original version and this. Only difference is readabilitiy:

@nb.njit()
def comb(int_list, l, iterations):
 n = len(int_list)
 temp_list = int_list
 for _ in range(iterations):
 for i in range(n):
 a, b = 0, 0
 while a == b:
 a = random.randint(0, n - 1)
 b = random.randint(0, n - 1)
 temp_list[i] = a ^ ((a ^ b) & random.randint(0, l))
 int_list = temp_list
 return int_list
print(timeit(lambda: comb(list(range(1000)), (1 << 10) - 1, 100000), number=10))

Share

edited Nov 26, 2020 at 17:08

answered Nov 25, 2020 at 20:34

Coupcoup's user avatar

Coupcoup Coupcoup

1,0766 silver badges14 bronze badges

\$\endgroup\$

10

\$\begingroup\$ Thanks for the help! The numba decorator is kinda necessary, since other stuff is also happening in the function which would be difficult to optimize without numba. Unfortunately Random.sample does not work with numba. Do you know an efficient workaround for that? \$\endgroup\$

HighwayJohn
– HighwayJohn

2020年11月25日 22:29:55 +00:00
Commented Nov 25, 2020 at 22:29
\$\begingroup\$ Do you mean the function that calls comb needs numba to work? I can get it working with numba or multiprocessing, another speedup approach, but both were slower than the solution I provided when I tested them. If it's faster not to use numba do you still need it for this particular function? A 6x speed up seems worth dropping it if possible \$\endgroup\$

Coupcoup
– Coupcoup

2020年11月25日 22:38:16 +00:00
Commented Nov 25, 2020 at 22:38
\$\begingroup\$ Yes, unfortunately at the current state of my code I need comb to work with numba. Could you maybe add a version for that? Maybe I can make use of the non numba code later. Generally speaking, for me it is strange that it is slower with the numba decorator. \$\endgroup\$

HighwayJohn
– HighwayJohn

2020年11月25日 22:43:22 +00:00
Commented Nov 25, 2020 at 22:43
\$\begingroup\$ Yeah, overall my results with numba seem to be hit or miss. Simple code tends to beat optimized complex code in my experience. I can add in the numba version for you see. It's possible your overall code would benefit from dropping numba and reducing complexity but it'd be up to you to decide if refactoring is worth it or not, especially without a guarantee of improved performance. I will say your native solution without numba was super slow though, 466 seconds vs .4 \$\endgroup\$

Coupcoup
– Coupcoup

2020年11月25日 22:47:56 +00:00
Commented Nov 25, 2020 at 22:47
1

\$\begingroup\$ I probably really have to think about it then. It’s always interesting to see the possible performance jumps in python. For now it would be really helpful for me if you would add a numba version. \$\endgroup\$

HighwayJohn
– HighwayJohn

2020年11月25日 22:54:28 +00:00
Commented Nov 25, 2020 at 22:54

| Show 5 more comments

Stack Exchange Network

Iterative uniform recombination of integer lists in python with numba

3 Answers 3

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Iterative uniform recombination of integer lists in python with numba

3 Answers 3

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions