I've written a more powerful version of the Sieve of Eratosthenes, this Sieve returns the number of factors of all the numbers below the limit
.Sadly the running time is \$O(n^2)\,ドル so it is much slower than the ordinary sieve. Can you suggest a way to improve and optimize it?
import doctest
def factors_sieve(limit):
"""
Returns the number of factors of all the numbers below the limit.
For better convenience use list(enumerate(factors_sieve(limit)))
>>> list(enumerate(factors_sieve(9)))
[(0, 0), (1, 1), (2, 2), (3, 2), (4, 3), (5, 2), (6, 4), (7, 2), (8, 4)]
>>> factors_sieve(100000).count(2) # How many prime numbers below 100000?
9592
"""
sieve = [2]*limit
sieve[0],sieve[1] = 0,1
for number,number_of_divisors in enumerate(sieve):
if number >= 2:
for multiple in range(number*2, len(sieve), number):
sieve[multiple] += 1
return sieve
if __name__ == "__main__":
doctest.testmod()
1 Answer 1
The running time is not \$ O(n^2) \$. It is \$ O(n \log n) \$. Why?
$$ T(n) = O \bigg( { n - 2 · 2 \over 2 } + { n - 2 · 3 \over 3 } + \dotsb + { n - 2 · { n \over 2 } \over { n \over 2 }} \bigg) = O(n \log n). $$ This is very similar to the sum of a harmonic series. So it is pretty good in terms of time complexity.
It is possible to optimize the code a little bit:
divisors_count = [2] * limit
divisors_count[0], divisors_count[1] = 0, 1
min_divisor = 2
max_divisor = len(divisors_count) // 2
for divisor in range(min_divisor, max_divisor + 1):
for multiple in range(divisor * 2, len(divisors_count), divisor):
divisors_count[multiple] += 1
return divisors_count
I changed the loop over the enumerated list to a loop over a range of divisors and removed the if
statement inside the loop's body. In my opinion, it also makes the logic more clear because we do not need a list itself in the outer loop. It is also sufficient to iterate only up to len(sieve) // 2
, because otherwise range(number * 2, len(sieve), number)
is empty.
I also gave the variables more meaningful names: divisor
instead of number
and so on to make the code easier to understand (but the initial version is rather clear, too, in my opinion).
But either way, this code shows a good scalability: it runs for about 1.8–1.9 seconds with my optimizations, and 2.1–2.2 seconds without them, for limit = 10 ** 6
.
-
\$\begingroup\$ Thanks for the complexity analysis, as soon as I saw two nested loops I thought it were O(N^2), but I didn't realize that the second loop was smaller, thanks! On a side note, isn't
len(divisors_count)
always equal tolimit
? \$\endgroup\$Caridorc– Caridorc2015年02月27日 20:17:26 +00:00Commented Feb 27, 2015 at 20:17 -
1\$\begingroup\$ @Caridorc Yes,
limit == len(divisors_count)
. \$\endgroup\$kraskevich– kraskevich2015年02月27日 20:45:35 +00:00Commented Feb 27, 2015 at 20:45
Explore related questions
See similar questions with these tags.