This code seems surprisingly fast at checking if a number is prime.
def is_prime(number: int) -> bool:
"""Checks if a number is prime or not"""
# number must be integer
if type(number) != int:
raise TypeError("Non-integers cannot be tested for primality.")
# negatives aren't prime, 0 and 1 aren't prime either
if number <= 1:
return False
# 2 and 3 are prime
elif number <= 3:
return True
# multiples of 2 and 3 aren't prime
elif number % 2 == 0 or number % 3 == 0:
return False
# only need to check if divisible by potential prime numbers
# all prime numbers are of form 6k +/- 1
# only need to check up to square root of number
for i in range(5, int(number**0.5) + 1, 6):
if number % i == 0 or number % (i + 2) == 0:
return False
return True
However, is there any way to speed this prime testing up?
4 Answers 4
Congratulations, you just came up with a prime sieve! And indeed, it's faster than always considering all the multiples of 2 of 3. But why do you stop at multiples of 2 and 3? What about 5? And 7? But why stop somewhere?
Let's be more ambitious. What about skipping all the multiples of all the primes encountered so far? This is called the sieve of Eratosthenes. It uses more memory (if you double the maximum number you're interested in, you need twice the memory), but it will be faster and will tell you if any number below your maximum number is prime.
If you want more improvements, note that other prime sieves exist: both the sieve of Sundaram and the sieve of Atkin improve over the sieve of Eratosthenes. And of course, if you're interested in raw speed, use numpy or even a language like C.
def is_prime(number: int) -> bool: ... # number must be integer if type(number) != int: raise TypeError("Non-integers cannot be tested for primality.")
By providing a type annotation, you have already documented that only int
s should be passed to the function, and you could even use static analysis tools to verify this. There is no need to do a dynamic type check on top of that.
Even without type annotations, there are good arguments for not doing type checks, as explained very well, for example, in section 4 of this answer to another question.
You can use Memoization. That is use a dictionary to cache the results of the is_prime function and perform a lookup. This will make the function run in O(1) time once the result for a given input is cached.
-
2\$\begingroup\$ Great idea, but it only speeds up if I am checking the same number twice. \$\endgroup\$Neil A.– Neil A.2016年09月11日 18:49:27 +00:00Commented Sep 11, 2016 at 18:49
When testing your solution compared to a regex, the result looks like the regex is slightly faster in this test.
import re
import timeit
start_time = timeit.default_timer()
re.match(r'^1?$|^(11+?)1円+$', '1'*12739)
elapsed = timeit.default_timer() - start_time
start_time = timeit.default_timer()
re.match(r'^1?$|^(11+?)1円+$', '1'*14)
elapsed = timeit.default_timer() - start_time
print(elapsed)
def is_prime(number):
"""Checks if a number is prime or not"""
# number must be integer
if type(number) != int:
raise TypeError("Non-integers cannot be tested for primality.")
# negatives aren't prime
if number <= 1:
return False
# 2 and 3 are prime
elif number <= 3:
return True
# multiples of 2 and 3 aren't prime
elif number % 2 == 0 or number % 3 == 0:
return False
# only need to check if divisible by potential prime numbers
# all prime numbers are of form 6k +/- 1
# only need to check up to square root of number
for i in range(5, int(number**0.5) + 1, 6):
if number % i == 0 or number % (i + 2) == 0:
return False
return True
start_time = timeit.default_timer()
is_prime(12739)
elapsed = timeit.default_timer() - start_time
print(elapsed)
You can try the above code online.
-
2\$\begingroup\$ Could you please explain what that expression
re.match(r'^1?$|^(11+?)1円+$', '1'*12739)
does? Also, wouldn't that compare it to a string of 12739 "1"'s (since"1"*12739
repeats that string)? \$\endgroup\$Neil A.– Neil A.2016年09月11日 06:30:25 +00:00Commented Sep 11, 2016 at 6:30 -
\$\begingroup\$ (1) This is not a review of the posted code. (2) The first
print(elapsed)
prints the result of the regex using the number 14, not 12739. \$\endgroup\$mkrieger1– mkrieger12016年09月11日 11:01:30 +00:00Commented Sep 11, 2016 at 11:01 -
1\$\begingroup\$ Interesting...If I put the first print to after the test of 12739, the regex code (whatever it means) is actually much slower... \$\endgroup\$Neil A.– Neil A.2016年09月11日 16:16:49 +00:00Commented Sep 11, 2016 at 16:16
-
2\$\begingroup\$ You're using the primality-testing regex to test 14, but calling
is_prime()
for 12739. That's an unfair test. The regex, as expected, is still slower. \$\endgroup\$200_success– 200_success2018年01月29日 22:55:19 +00:00Commented Jan 29, 2018 at 22:55
return False
in the second to last row. \$\endgroup\$if number <= 1
, so in fact it’s excluding any negative number or 0 and 1 – your comment should make this clear. \$\endgroup\$