4
\$\begingroup\$

I have updated my code from the last time it got reviewed. If anyone has some more suggestions on how I can improve this and/or make it faster it would be appreciated!

def is_prime(num):
 '''Calculates whether num is a prime number,
 if it is return True otherwise return False'''
 # SPECIALCASE: These are special cases that are not primes
 if num % 2 == 0 and num != 2 or num <= 1:
 return False
 # Look for a factor
 for i in range(3, int(num ** (0.5)) + 1, 2):
 # If there is a factor
 if num % i == 0:
 # Its not a prime number
 return False
 # If there is no factors and its not a special case
 return True
asked Jun 28, 2016 at 10:46
\$\endgroup\$
1
  • \$\begingroup\$ do you want it to be faster in detecting that a number is a prime or that a number is not one ? ☺ \$\endgroup\$ Commented Jun 28, 2016 at 12:27

1 Answer 1

12
\$\begingroup\$

0. Overview

Your code is well written and what it does is clear. However I will argue that if you want a drastic increase in performance, you need to switch your algorithm. Testing primality can be done in polynomial time, but is generally a problem which requires sophisticated math. The problem can be solved in polynomal time as shown by the AKS_primality_test. However there are far faster algorithms than this one. As you can see, it is not the easiest thing to understand without a background in higher mathematics.

  1. A review of your algorithm
  2. Show a probabilistic approach
  3. Compare your algorithm, the probabilistic, and the isprime function from the primefac package.

1. Review

As your code is fairly short, there is not much to comment on. I really like that you use a docstring in the beginning of your code to document it. The code looks good.

  1. Use the if __name__ == "__main__": module. It makes it easier to test and easier to import.
  2. Some of the comments are unnecessary and there is too much space in the last return.
  3. The num <= 1 is a bit strange. It catches everything from inputing \0ドル\$ and \1ドル\$ to negative and decimal numbers. In my opinion it is better to treat negative numbers and non-integers with an error instead. Then we can explicitly remove \0ドル\$ and \1ドル\$ as special cases.

One way to handle this could be

try:
 int(num)
except ValueError:
 raise ValueError("Ops. Input must be a positive integer")
if int(num) != num or num < 0:
 raise ValueError("Ops. Input must be a positive integer")
if num in [0, 1]: return False

The last line could also have been written as

if num <= 1: return False

Since we have already tested that num => 0. However I prefer the first version as it is much clearer what it actually checks. You should always try to make it as clear as possible what you want to catch.

1.1 Some improvements on the algorithm

Say you check some number with your algorithm. If the number is not divisible by \3ドル\$ you do not need to ever check any multiples of \3ドル\$ again. Yet your code checks \6ドル\,ドル \9ドル\$ and so forth. A better approach would be to iterate over the numbers not divisible by \2ドル\$ or \3ドル\$. This removes the checking of \2ドル/3\$ of all numbers in comparison to only removing \1ドル/2\$. So, what numbers are not divisible by \3ドル\$ or \2ドル\$? Well all numbers are on the form \6ドルk+1\,ドル \6ドルk+2\,ドル \6ドルk+3\,ドル \6ドルk+4\$ or \6ドルk+5\$. Since they are either \0ドル\,ドル \1ドル\,ドル \2ドル\,ドル \3ドル\,ドル \4ドル\$ or \5ドル\$ numbers away from being a multiple of \6ドル\$. This means that only numbers on the form \6ドルk+1\$ or \6ドルk+5\$ are not divisible by \2ドル\$ or \3ドル\$.

Implementation is simple increase counter by \6ドル\,ドル test \$+1\$ and \$+5\$.

primes_to_remove = [2, 3, 5]
for prime in primes_to_remove:
 if num == prime: return True
 elif num % prime == 0: return False
# Look for a factor
limit = int(num**(0.5))+1
for multiple in range(6, int(num**(0.5))+1, 6):
 for k in [1, 5]:
 # If there is a factor
 if num % (multiple+k) == 0:
 # Its not a prime number
 return False

This can further be generalized by removing [2, 3, 5] as shown below

primes_to_remove = [2, 3, 5, 7, 11, 13, 19, 23, 29]
for prime in primes_to_remove:
 if num == prime: return True
 elif num % prime == 0: return False
# Look for a factor
limit = int(num**(0.5))+1
for multiple in range(30, int(num**(0.5))+1, 30):
 for k in [1, 7, 11, 13, 19, 23, 29]:
 # If there is a factor
 if num % (multiple+k) == 0:
 # Its not a prime number
 return False
return True

This idea of removing the first primes can again be modified into removing [2,3,5,7] and then check the remaining numbers. However the advantages of this method quickly diminish for larger primes as we are removing fewer and fewer numbers. As a last example, it is possible to only iterate over the primes. From the primesieve library we can import a prime generator.

from primesieve import Iterator
def is_prime_primegenerator(num):
 it = Iterator()
 prime = it.next_prime()
 # Iterate over the primes below sqrt(n)
 limit = int(num**(0.5))+1
 while prime < limit:
 prime = it.next_prime()
 if num % prime == 0:
 print num, prime
 return False
 return True

This is the best we can do following down your path of checking primes from the smallest to the largest. It assumes however that the next_prime() is extremely fast. Which it luckily is.

Using the 235 version of the sieve above, it gave me a \2ドル.5\$ times speed increase of your code. Putting it all together gives

def is_prime(num):
 '''Calculates whether num is a prime number,
 if it is return True otherwise return Flase'''
 try:
 int(num)
 except ValueError:
 raise ValueError("Ops. Input must be a positive integer")
 if int(num) != num or num < 0:
 raise ValueError("Ops. Input must be a positive integer")
 # SPECIALCASE: These are special cases that are not primes
 if num in [0, 1]: return False
 primes_to_remove = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
 for prime in primes_to_remove:
 if num == prime: return True
 elif num % prime == 0: return False
 # Look for a factor
 limit = int(num**(0.5))+1
 for multiple in range(30, int(num**(0.5))+1, 30):
 for k in [1, 7, 11, 13, 19, 23, 29]:
 # If there is a factor
 if num % (multiple+k) == 0:
 # Its not a prime number
 return False
 return True
if __name__ == '__main__':
 print is_prime(600851)

Using some quick comparisons it seems that the 235 version is the fastest for solving larger primes while 23 and the naive are pretty similar in running times.

 naive 23 235 primes
600851475151 | 117ms 71ms 40ms 19ms

2. Probabilistic primality test

A short explanation can be found on Wikipedia.

Probabilistic tests

Probabilistic tests are more rigorous than heuristics in that they provide provable bounds on the probability of being fooled by a composite number. Many popular primality tests are probabilistic tests. These tests use, apart from the tested number \$n\,ドル some other numbers a which are chosen at random from some sample space; the usual randomized primality tests never report a prime number as composite, but it is possible for a composite number to be reported as prime. The probability of error can be reduced by repeating the test with several independently chosen values of a; for two commonly used tests, for any composite \$n\$ at least half the a's detect n's compositeness, so \$k\$ repetitions reduce the error probability to at most \2ドル^{−k}\,ドル which can be made arbitrarily small by increasing \$k\$.

The basic structure of randomized primality tests is as follows:

  1. Randomly pick a number \$a\$.
  2. Check some equality (corresponding to the chosen test) involving \$a\$ and the given number \$n\$. If the equality fails to hold true, then \$n\$ is a composite number, a is known as a witness for the compositeness, and the test stops.
  3. Repeat from step 1 until the required accuracy is achieved.

After one or more iterations, if \$n\$ is not found to be a composite number, then it can be declared probably prime.

The important bit is the three steps at the bottom. We put our number through some test and if it passes there is a \1ドル/2\$ chance of it being composite. After passing \$k\$ tests, the chances of being composite is \2ドル^{-k}\$. When big banks check for primality they often use a probabilistic prime test where \$k\$ is around \80ドル\$. The chances of a composite passing \80ドル\$ tests is just

$$ P_{80}(\text{composite}) = 2^{-80} = \frac{1}{1208925819614629174706176} $$

So if you test \1208925819614629174706176ドル\$ numbers, on average one of them will be a composite passing all the tests. Below I will show a simple implementation of the Miller-Rabin primality test. It is one of the fastest and simplest ways to implement a primetest. This algorithm can be divided into three steps:

  1. Generate a small list of primes below some limit (I used 1 million)
  2. Check if the prime number is even, and some other small tests. Then check if it is in the primeset.
  3. If the prime is bigger than the limit apply the Miller-Rabin primality test \25ドル\$ times. The number \25ドル\$ is arbitrary, however it does not need to be particularly big as the chances of a composite slipping through is \2ドル^{-k}\$.

The reason for step 1 is both because it is faster, but also because the Miller-Rabin test have a harder time detecting small primes. I imported the fastest prime_generator I could find through google. However if you allow for external libaries primesieve is faster. However generating primes is just a small part of the excecution time, and not worth optimizing further. \${}{}\$

import numpy as np
def primesbelow(n):
 # https://stackoverflow.com/questions/2068372/fastest-way-to-list-all-primes-below-n-in-python/3035188#3035188
 """ Input n>=6, Returns a array of primes, 2 <= p < n """
 sieve = np.ones(n/3 + (n%6==2), dtype=np.bool)
 sieve[0] = False
 for i in xrange(int(n**0.5)/3+1):
 if sieve[i]:
 k=3*i+1|1
 sieve[ ((k*k)/3) ::2*k] = False
 sieve[(k*k+4*k-2*k*(i&1))/3::2*k] = False
 return np.r_[2,3,((3*np.nonzero(sieve)[0]+1)|1)]
smallprimeset = set(primesbelow(100000))
_smallprimeset = 100000
def isprime(n, precision=25):
 # http://en.wikipedia.org/wiki/Miller-Rabin_primality_test#Algorithm_and_running_time
 try:
 n = abs(int(n))
 except ValueError:
 print("Input must be a non-negative integer")
 if n in [0, 1]:
 return False
 elif n in [2, 3, 5]:
 return True
 elif n % 2 == 0:
 return False
 elif n < _smallprimeset:
 return n in smallprimeset
# Miller-Rabin primality test
d = n - 1
s = 0
while d % 2 == 0:
 d //= 2
 s += 1
for repeat in range(precision):
 a = random.randrange(2, n - 2)
 x = pow(a, d, n)
 if x == 1 or x == n - 1: continue
 for r in range(s - 1):
 x = pow(x, 2, n)
 if x == 1: return False
 if x == n - 1: break
 else: return False
return True

3. Comparison

I used

pip install primefac

To be able to use the isprime function from the primefac package. Then I did some simple primality testing on a few semi-large numbers.

All the tests where done a \100ドル\$ times and then i took the average. I did a \1000ドル\$ averages for Miller-Rabin and primefac for better accuracy. There is no need to test composite numbers, because all the implementations stop once a prime has been found. All the tests are therefore done on primes.

 number naive 23 235 primegen miller primefac
 -----------------------------------------------------------------------------------------
 60085159 0.69 ms 1.09 ms 0.47 ms 1.02 ms 0.25 ms 0.24 ms
 6008514763 22 ms 61 ms 7 ms 3.8 ms 0.104 ms 0.070 ms
 60085147517 56 ms 30 ms 12 ms 6.5 ms 0.072 ms 0.090 ms
 600851475151 283 ms 281 ms 155 ms 56 ms 0.364 ms 0.331ms
 99194853094755497 117328 ms 113053 ms 58010 ms 19117 ms 0.638 ms 0.239180 ms
 -----------------------------------------------------------------------------------------

As you can see you do fairly well for small values. For \60085159ドル\$ your code is about \2ドル.75\$ times as slow as the Miller-Rabin. However as the values increase so does the difference. For \600851475151ドル\$ your code is about \777ドル\$ times slower. Finally for \99194853094755497ドル\$ the Miller-Rabin is \183899ドル\$ times faster.

There might be something wrong with my testing since the 23-version is about as fast as the naive implementation (yours). However the 235 is a simple speed increase. If you want a drastic improvement however you need to switch to another implementation. Either use a library which has a builtin isprime function, or use a probabilistic approach: Fermat primality test, Miller-Rabin, Solovay-Strassen primality test, Frobenius primality are just some of the many fast implementations.

answered Jun 29, 2016 at 1:44
\$\endgroup\$
3
  • \$\begingroup\$ I used 'num <= 1' as a special case because "A prime number is a whole number greater than 1, whose only two whole-number factors are 1 and itself." and 1 is not a prime number. \$\endgroup\$ Commented Jun 29, 2016 at 2:48
  • \$\begingroup\$ Doesn't 17 need to be in 235? \$\endgroup\$ Commented Aug 22, 2016 at 17:10
  • \$\begingroup\$ Funny you should choose 60085159 as first test case! \$\endgroup\$ Commented Oct 9, 2016 at 13:32

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.