Fast prime generator

Question 1

I wanted to test my performance optimizing skills, and so wanted to find how fast I could get the first \$n\$ prime numbers. I limited myself to only the standard library, as I'm sure numpy or another library written in C has a prime generator way faster than Python ever will be - and offloading to a library doesn't really improve my ability to improve performance.

I've implemented both the Sieve of Eratosthenes and the Sieve of Sundaram. The SoE was based off my answer here, and the SoS was based off Wikipedias definition. These are available at the end of the answer.

I improved performance by:

SoE: Vectorizing the creation of primes.

primes[base*2:limit:base] = [False]*(ceil(limit / base) - 2)

SoE: Change the start of the slice from \2ドルb\$ to \$b^2\$. ^[1]

primes[base*base:limit:base] = [False]*((((limit - base*base) - 1) // base) + 1)

SoE: Simplify the calculations - addition seems to be faster than multiplication.
```
primes[base * base::base] = [False] * ((limit - 1) // base - base + 1)
```
SoE: Use itertools.compress, rather than a comprehension.

SoS: Vectorize the inner loop.

start = 1 + 3*j
step = 1 + 3*j
primes[start::step] = [False] * ceil((n - start) / step)

SoS: Vectorize the creation of values that only have one value in the sequence.

When \$\frac{n - \text{start}}{\text{stop}} = \frac{n - (1 + 3j)}{1 + 2j} \le 1\$ is equivalent to \$n \le 2 + 5j\$ we know we can stop at \$j = \frac{n - 2}{5}\$.

multi_stop = (n - 2) // 5
for j in range(1, multi_stop):
 start = 1 + 3*j
 step = 1 + 2*j
 primes[start::step] = [False] * ceil((n - start) / step)
if multi_stop >= 1:
 single_start = multi_stop * 3 + 1
 primes[single_start::3] = [False] * ceil((n - single_start) / 3)

SoS: It doesn't seem like you need the if created above, and so you can just save wasted cycles.

I tried defining false = [False]*limit and slice it, but found it to be slower than creating new lists in the loop.

This got the following prime sieves:

from math import ceil
from itertools import compress
def sieve_eratosthenes(limit):
 if limit <= 1:
 return []
 primes = [True] * limit
 for base in range(2, int(limit**0.5 + 1)):
 if primes[base]:
 primes[base * base::base] = [False] * ((limit - 1) // base - base + 1)
 primes[0] = primes[1] = False
 return list(compress(range(limit), primes))
def sieve_sundaram(limit):
 if limit <= 1:
 return []
 n = (limit - 1) // 2
 primes = [True] * n
 for j in range(1, (n - 2) // 5):
 start = 1 + 3*j
 step = 1 + 2*j
 primes[start::step] = [False] * ceil((n - start) / step)
 return [2] + [2*i + 1 for i, p in enumerate(primes) if p][1:]

Both are faster than the both the original functions.

enter image description here enter image description here

Code to generate graphs:

from math import ceil
from itertools import compress
import numpy as np
import matplotlib.pyplot as plt
from graphtimer import Plotter, MultiTimer
def sieve_eratosthenes_orig(limit):
 if limit <= 1:
 return []
 primes = [True] * limit
 for base in range(2, int(limit**0.5 + 1)):
 if primes[base]:
 for composite in range(base * 2, limit, base):
 primes[composite] = False
 return [num for num, is_prime in enumerate(primes) if is_prime][2:]
def sieve_eratosthenes(limit):
 if limit <= 1:
 return []
 primes = [True] * limit
 for base in range(2, int(limit**0.5 + 1)):
 if primes[base]:
 primes[base * base::base] = [False] * ((limit - 1) // base - base + 1)
 primes[0] = primes[1] = False
 return list(compress(range(limit), primes))
def sieve_sundaram_orig(limit):
 if limit <= 1:
 return []
 n = (limit - 1) // 2
 primes = [True] * n
 for j in range(1, n):
 for i in range(1, j + 1):
 value = i + j + 2*i*j
 if value < n:
 primes[value] = False
 return [2] + [2*i + 1 for i, p in enumerate(primes) if p][1:]
def sieve_sundaram(limit):
 if limit <= 1:
 return []
 n = (limit - 1) // 2
 primes = [True] * n
 for j in range(1, (n - 2) // 5):
 start = 1 + 3*j
 step = 1 + 2*j
 primes[start::step] = [False] * ceil((n - start) / step)
 return [2] + [2*i + 1 for i, p in enumerate(primes) if p][1:]
def sieve_test(limit):
 if limit <= 1:
 return []
 n = (limit - 1) // 2
 primes = [True] * n
 multi_stop = (n - 2) // 5
 for j in range(1, multi_stop):
 start = 1 + 3*j
 step = 1 + 2*j
 primes[start::step] = [False] * ceil((n - start) / step)
 return [2] + [2*i + 1 for i, p in enumerate(primes) if p][1:]
def test():
 for exp in range(6):
 limit = 10 ** exp
 assert sieve_test(limit) == sieve_eratosthenes(limit)
def main():
 fig, axs = plt.subplots()
 axs.set_yscale('log')
 axs.set_xscale('log')
 (
 Plotter(MultiTimer([
 sieve_eratosthenes_orig,
 sieve_eratosthenes,
 sieve_sundaram,
 sieve_sundaram_orig,
 # sieve_test,
 ]))
 .repeat(5, 5, np.logspace(0.35, 2), args_conv=int)
 .min()
 .plot(axs, x_label='limit')
 )
 fig.show()
if __name__ == '__main__':
 test()
 main()

To use the above code snippet you need to install numpy, matplotlib and graphtimer. All should be available via pypi.

Can they be made faster, or is a different sieve faster?

Question 2

Did you try the Sieve of Atkin? Something tells me it's faster than the Sieve of Erathosthenes.

Question 3

@Justin No, I didn't. I'm not too sure about it being faster than SoE in Python. But I'd be happy to be proven wrong.

Question 4

Here, I might have just found something - programmingpraxis.com/2010/02/19/sieve-of-atkin-improved. Does this help?

Question 5

@Justin Since it was written for Python 2 it may be unfair for it to be performance tested in Python 3. However with xrange = range and ensuring no IndexErrors I get the following graph. Which makes it faster than SoS at 1000 and always slower than SoE.

Question 6

Does this help - stackoverflow.com/questions/2068372/…?

Question 7

def sieve_eratosthenes(limit):
 if limit <= 1:
 return []
 primes = [True] * limit
 for base in range(2, int(limit**0.5 + 1)):
 if primes[base]:
 primes[base * base::base] = [False] * ((limit - 1) // base - base + 1)
 primes[0] = primes[1] = False
 return list(compress(range(limit), primes))

No attempt at all to use a wheel? I get roughly a 25% speedup just by special-casing the prime 2 with:

def sieve_eratosthenes_wheel(limit):
 if limit <= 1:
 return []
 primes = [True] * limit
 if limit > 4:
 primes[4::2] = [False] * ((limit - 1) // 2 - 2 + 1)
 for base in range(3, int(limit**0.5 + 1), 2):
 if primes[base]:
 # We require off + (len-1)*step < limit <= off + len*step
 # So len = ceil((limit - off) / step)
 primes[base*base::2*base] = [False] * ((limit - base*base + 2*base - 1) // (2*base))
 primes[0] = primes[1] = False
 return list(compress(range(limit), primes))

Using primes 2 and 3 it's possible to do two range updates with step sizes of 6*base, but it gets more complicated to calculate the initial offsets, which depend on base % 6:

def sieve_eratosthenes_wheel3(limit):
 if limit <= 1:
 return []
 primes = [True] * limit
 def mark_composite(off, step):
 # We require off + (len-1)*step < limit <= off + len*step
 # So len = ceil((limit - off) / step)
 primes[off::step] = [False] * ((limit - off + step - 1) // step)
 mark_composite(4, 2)
 mark_composite(9, 6)
 base = 5
 max_base = int(limit**0.5)
 while base <= max_base:
 # base == 5 (mod 6)
 if primes[base]:
 mark_composite(base*base, 6*base)
 mark_composite(base*(base+2), 6*base)
 base += 2
 # base == 1 (mod 6)
 if primes[base]:
 mark_composite(base*base, 6*base)
 mark_composite(base*(base+4), 6*base)
 base += 4
 primes[0] = primes[1] = False
 return list(compress(range(limit), primes))

For limit 50 million, taking sieve_eratosthenes as the baseline of 100 time units, I measure sieve_eratosthenes_wheel at about 73 time units and sieve_eratosthenes_wheel3 at about 63 time units.

Question 8

Could you explain what "a wheel" is? I know car wheels, but this meaning is definitely new to me :)

Question 9

@Peilonrayz en.wikipedia.org/wiki/Wheel_factorization

Peter Taylor Peter Taylor 24.5k1 gold badge49 silver badges94 bronze badges · Answer 1 · 2019-08-28 16:38:32Z

def sieve_eratosthenes(limit):
 if limit <= 1:
 return []
 primes = [True] * limit
 for base in range(2, int(limit**0.5 + 1)):
 if primes[base]:
 primes[base * base::base] = [False] * ((limit - 1) // base - base + 1)
 primes[0] = primes[1] = False
 return list(compress(range(limit), primes))

No attempt at all to use a wheel? I get roughly a 25% speedup just by special-casing the prime 2 with:

def sieve_eratosthenes_wheel(limit):
 if limit <= 1:
 return []
 primes = [True] * limit
 if limit > 4:
 primes[4::2] = [False] * ((limit - 1) // 2 - 2 + 1)
 for base in range(3, int(limit**0.5 + 1), 2):
 if primes[base]:
 # We require off + (len-1)*step < limit <= off + len*step
 # So len = ceil((limit - off) / step)
 primes[base*base::2*base] = [False] * ((limit - base*base + 2*base - 1) // (2*base))
 primes[0] = primes[1] = False
 return list(compress(range(limit), primes))

Using primes 2 and 3 it's possible to do two range updates with step sizes of 6*base, but it gets more complicated to calculate the initial offsets, which depend on base % 6:

def sieve_eratosthenes_wheel3(limit):
 if limit <= 1:
 return []
 primes = [True] * limit
 def mark_composite(off, step):
 # We require off + (len-1)*step < limit <= off + len*step
 # So len = ceil((limit - off) / step)
 primes[off::step] = [False] * ((limit - off + step - 1) // step)
 mark_composite(4, 2)
 mark_composite(9, 6)
 base = 5
 max_base = int(limit**0.5)
 while base <= max_base:
 # base == 5 (mod 6)
 if primes[base]:
 mark_composite(base*base, 6*base)
 mark_composite(base*(base+2), 6*base)
 base += 2
 # base == 1 (mod 6)
 if primes[base]:
 mark_composite(base*base, 6*base)
 mark_composite(base*(base+4), 6*base)
 base += 4
 primes[0] = primes[1] = False
 return list(compress(range(limit), primes))

For limit 50 million, taking sieve_eratosthenes as the baseline of 100 time units, I measure sieve_eratosthenes_wheel at about 73 time units and sieve_eratosthenes_wheel3 at about 63 time units.

Could you explain what "a wheel" is? I know car wheels, but this meaning is definitely new to me :)

Stack Exchange Network

Fast prime generator

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Hot Network Questions

Fast prime generator

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Related

Hot Network Questions