
I implemented the Miller–Rabin primality test in Python 3, but I have the feeling it is very slow. I know that there are faster versions in the gmpy2 package, but I want to compare it with other code I have in Python, so I don't want to use an existing implementation. I used the pseudocode from Wikipedia. What am I doing wrong?

Here is my code.

import gmpy2
import random
import time

def rab_test(n, rounds):
    start = time.time()
    if n == 1:
        return False
    if n in [2, 3]:
        return True
    if n % 2 == 0:
        return False
    d = n - 1
    s = 1
    while d % 2 == 0:
        d = d // 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = gmpy2.powmod(a, d, n)
        for _ in range(s):
            y = gmpy2.powmod(x, 2, n)
            if y == 1 and x != 1 and x != n - 1:
                return False
            x = y
        if y != 1:
            return False
    return True
asked Mar 15, 2024 at 9:05
  • Let me start with "does the code produce the correct output?" – Commented Mar 15, 2024 at 10:55
  • @possum yes it does – Commented Mar 15, 2024 at 11:30
  • When in doubt, profile. Where does the code spend most of its time? That said, my first instinct is that gmpy2.powmod(x, 2, n) is the culprit. A simple x = (x*x) % n might be faster. I don't have gmpy2 installed to confirm. – Commented Mar 15, 2024 at 21:40
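The comment's instinct about modular squaring can be spot-checked with a stdlib-only micro-benchmark. This is a sketch: Python's built-in three-argument pow stands in for gmpy2.powmod (which may not be installed), and the modulus is just an arbitrary odd number in the question's range.

```python
import random
import timeit

# Compare two ways of squaring modulo n. The built-in pow(x, 2, n) stands in
# for gmpy2.powmod(x, 2, n); the two compute the same value.
random.seed(1)
n = random.randrange(10**11, 10**12) | 1   # arbitrary odd modulus, ~12 digits
x = random.randrange(2, n)

t_pow = timeit.timeit(lambda: pow(x, 2, n), number=100_000)
t_mul = timeit.timeit(lambda: (x * x) % n, number=100_000)
print(f"pow(x, 2, n): {t_pow:.3f}s   (x*x) % n: {t_mul:.3f}s")
```

For operands this small, the plain multiply-and-reduce tends to win because it avoids the generality of the modular-exponentiation machinery; the only way to know on your machine is to run it.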

1 Answer


I don't have many suggestions for how to improve the actual function. That said, I see some evidence that you could use a better context for checking both performance and correctness. Specifically, you're assigning start = time.time() and then never using it; don't do that.

Here's a quick test rig:

import gmpy2
import random
import timeit

def rab_test(n, rounds):
    if n == 1:
        return False
    if n in [2, 3]:
        return True
    if n % 2 == 0:
        return False
    d = n - 1
    s = 1
    while d % 2 == 0:
        d = d // 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = gmpy2.powmod(a, d, n)
        for _ in range(s):
            y = gmpy2.powmod(x, 2, n)
            if y == 1 and x != 1 and x != n - 1:
                return False
            x = y
        if y != 1:
            return False
    return True

def main():
    for _ in range(1000 * 10):
        n = random.randrange(1000 * 1000 * 1000, 1000 * 1000 * 1000 * 1000)
        mine = rab_test(n, 50)
        theirs = gmpy2.is_prime(n, 50)
        if mine != theirs:
            print(f"{n}, {mine}, {theirs}\n")
            raise
    print(timeit.timeit("rab_test(random.randrange(1000*1000*1000*100, 1000*1000*1000*1000), 50)", globals=globals()))
    print(timeit.timeit("gmpy2.is_prime(random.randrange(1000*1000*1000*100, 1000*1000*1000*1000), 50)", globals=globals()))

if __name__ == "__main__":
    main()

The standard-library module timeit is the normal choice in Python, and probably good enough. Whether randomly checking 10k values for correctness is good enough is less clear; you can and should tweak the above code to convince yourself that it's doing a good test.
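One way to tighten the timing comparison is to pre-generate a shared input list, so both candidates are measured on exactly the same numbers and random-number generation stays outside the timed region. This is a sketch; fermat_2 below is a hypothetical stand-in for whichever functions you actually want to compare.

```python
import random
import timeit

# One shared, reproducible input list for every candidate function.
random.seed(0)
samples = [random.randrange(10**11, 10**12) | 1 for _ in range(1_000)]

def fermat_2(n):
    # Base-2 Fermat test: a placeholder candidate, not a full primality test.
    return pow(2, n - 1, n) == 1

# Time one full pass over the fixed inputs; no randrange inside the timing.
elapsed = timeit.timeit(lambda: [fermat_2(n) for n in samples], number=1)
print(f"{elapsed:.4f}s for {len(samples)} inputs")
```

Timing each candidate over the same samples list removes both the per-call randrange overhead and the run-to-run input variance from the comparison.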

So what's it output?

6.803874525998253
0.7338983130030101

Your code is 10x slower than the library! But the library is written in C, so it's impossible to say whether it's doing something trickier algorithmically or whether this is just C's natural advantage. (Actually it's not impossible: you could always check their source code.)

Now that you have a decent way of comparing alternatives, you might try out alterations to the function. My guess is that, while you could maybe make it fancier looking in various ways, none of them will increase speed, with the possible exception of parallelism.
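One algorithmic alteration worth trying: for inputs below a known bound, the random bases can be replaced by a small fixed set. It is a known result that testing the first twelve primes as bases is deterministic for all n < 3,317,044,064,679,887,385,961,981, which covers the range used above. A minimal sketch, using Python's built-in pow rather than gmpy2 so it runs without dependencies:

```python
def is_prime(n: int) -> bool:
    """Deterministic Miller-Rabin for n below ~3.3e24 (first twelve prime bases)."""
    if n < 2:
        return False
    bases = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)
    # Trial division by the bases themselves handles small n and cheap rejects.
    for p in bases:
        if n % p == 0:
            return n == p
    # Write n - 1 = d * 2**s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for a in bases:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a is a witness: n is composite
    return True
```

Plugged into the rig above (in place of the 50 random rounds), this does at most 12 modular exponentiations per call instead of 50, with no loss of correctness in that range.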

answered Mar 15, 2024 at 18:26
\$\endgroup\$
  • I think this can benefit from using timeit(..., setup=...) - on my machine the gmpy2 timeit spends about 20% of its time just generating the random number. Or even better, generate 1000 or so random numbers before the timeits, and test on those, so the two functions have exactly the same input. It may be a somewhat unfair comparison - for numbers in this range it suffices to just test the bases 2, 3, 5, 7, 11, and 13. I wouldn't be surprised if gmpy is aware of this type of fact. – Commented Mar 16, 2024 at 8:11
  • One thing to note is that for large N, gmpy2 is limiting the number of cases that are actually run. See the source code here. This means that your tests might not be apples to apples, and you are running more iterations than gmpy2 is. – Commented Mar 21, 2024 at 21:55
  • Interesting point. I don't think it affects any of the tests we've actually done though; the cap is 1k. And, if I understand the math correctly, 1k is actually quite generous. – Commented Mar 21, 2024 at 23:15
  • I dug into the GMP source. gmpy2.is_prime wraps the GMP function mpz_probab_prime_p. This does trial division by small primes, then calls mpz_millerrabin. For numbers below 64 bits, mpz_millerrabin just does a BPSW test, which is deterministic. In particular it totally ignores the reps argument. You can observe that if you change reps to 1000 in the call to gmpy2.is_prime, the runtime barely changes. You could even the playing field a bit by testing much larger integers. – Commented Mar 25, 2024 at 11:41
  • That's important! I re-ran the tests, only changing the sampling range and iterations (couldn't be bothered to get them both using the same pre-gen samples). At (2**65, 2**68), 100 the gmpy2 advantage is down to 4, and at (2**74, 2**75), 1k it's down to ~2. So maybe for really large numbers they're the same... – Commented Mar 26, 2024 at 20:34
