2
\$\begingroup\$

I was Googling for a 'correct' or 'efficient' implementation of the Sieve of Eratosthenes. I found them all more complicated than what I came up with. I am by no means a Python expert, so I was wondering if something is wrong with my code.

def get_primes(max_int):
 numbers = range(2, max_int) 
 while len(numbers) > 0:
 numbers = [ num for num in numbers if num % numbers[0] != 0 or num == numbers[0] ] # only keep numbers that are NOT divisible by the prime 
 yield numbers[0] # yield the prime
 numbers = numbers[1:] # make a new numbers-array, the first num is a new prime

Call it with

primes = get_primes(1000)

And it gives you a generator of primes up until 1000.

Martin R
24.2k2 gold badges37 silver badges95 bronze badges
asked May 19, 2018 at 13:06
\$\endgroup\$
0

2 Answers 2

7
\$\begingroup\$

Your code looks correct to me, it does produce prime numbers in the given range.

PEP8 Online reports some code style violations with respect to whitespace and line length, but it is not too bad.

But: This is not the Sieve of Eratosthenes. The Sieve of Eratosthenes keeps a (single) list of all candidate numbers, and computes multiples of each found prime to mark subsequent composite numbers in the list. Your algorithm computes the remainder of all remaining candidates instead, and creates new lists in each step.

As a simple benchmark I ran

print(sum(p for p in get_primes(100000)))

with your code, this takes approximately 5 seconds on my MacBook.

It can be improved slightly by filtering the list only once in each step and not twice:

def get_primes(max_int):
 numbers = range(2, max_int)
 while len(numbers) > 0:
 yield numbers[0]
 numbers = [num for num in numbers if num % numbers[0] != 0]

This reduces the time to 4.6 seconds.

But any "real" Eratosthenes sieve is much faster. As an example, with this one from Rosetta code

def primes_upto(limit):
 is_prime = [False] * 2 + [True] * (limit - 1) 
 for n in range(int(limit**0.5 + 1.5)): # stop at ``sqrt(limit)``
 if is_prime[n]:
 for i in range(n*n, limit+1, n):
 is_prime[i] = False
 return [i for i, prime in enumerate(is_prime) if prime]

the above benchmark runs in 0.08 seconds.

answered May 19, 2018 at 14:48
\$\endgroup\$
1
  • \$\begingroup\$ It seems I did not understand The Sieve enough, your answer helped me clarify. I've coded the implementation correctly now and it is indeed way faster. \$\endgroup\$ Commented Jun 2, 2018 at 17:01
3
\$\begingroup\$

If you want to speed it up further, you can use slice assignment instead of the inner "for" loop:

def primes_slice(limit):
 is_prime = [False] * 2 + [True] * (limit - 1)
 for n in range(int(limit**0.5 + 1.5)): # stop at ``sqrt(limit)``
 if is_prime[n]:
 is_prime[n*n::n] = [False] * ((limit - n*n)/n + 1)
 return [i for i, prime in enumerate(is_prime) if prime]

That lets C code do the former inner loop for you under the covers, at the expense of some complicated stuff on the right side of the slice assignment. Even that can go away if you use the bitarray package for is_prime, which also reduces the memory footprint considerably. This tweak gives me about a 2x speed boost with n = 10,000,000

$ time ./prime_rosetta_slice.py 1e7
664579
real 0m1.134s
user 0m0.797s
sys 0m0.344s
$ time ./prime_rosetta_upto.py 1e7
664579
real 0m2.242s
user 0m1.875s
sys 0m0.281s

For the next performance improvement, try keeping only odd numbers> 2 in is_prime and then manually returning 2 as a special case.

answered Sep 28, 2019 at 21:57
\$\endgroup\$

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.