4
\$\begingroup\$

I don't understand what is taking the program so long:

#!/bin/python
primes=[]
i=0
j=0
k=0
for i in range(2,2000000): #fill in the list
 primes.append(i)
i=0
while i<len(primes):
 j=primes[i]
 print(j)
 k=0
 while j*(j+k)<primes[len(primes)-1]: ##referred as 'line A'
 try:
 primes.remove(j*(j+k))
 except ValueError:
 k=k+1
 continue
 k=k+1
 i=i+1
sump=0
i=0
for i in range(len(primes)):
 sump=sump+primes[i]
print(sump)

I understand why the overall code is very inefficient, but line A takes 2 hours for j=2, and I don't understand why that is. Surely the list.remove(x) method is very inefficient?

Jamal
35.2k13 gold badges134 silver badges238 bronze badges
asked Feb 10, 2014 at 16:11
\$\endgroup\$
8
  • \$\begingroup\$ There is a lot you can do to improve performances in this script. The first for loop can be replaced with primes = list(range(2,2000000)), primes[len(primes)-1] is primes[-1] and the last for loop is sump = sum(primes). I don't think that's your problem, but it's a start. \$\endgroup\$ Commented Feb 10, 2014 at 16:16
  • \$\begingroup\$ Is this Python 2? Have you done any profiling before asking? \$\endgroup\$ Commented Feb 10, 2014 at 16:19
  • \$\begingroup\$ Try moving the value of len(primes) into a variable and use that in the while loop. And try modifying the expression to "j*(j+k)" to "jj + j * k" where "jj" is pre calculated and put into some variable which would be used in computing the value of the expression \$\endgroup\$ Commented Feb 10, 2014 at 16:20
  • \$\begingroup\$ Tight loops are not exactly Python strongest point. Removing the inner loop with an equivalent idiomatic construct would improve things a lot. Eg. primes = [x for x in primes if x % j or x == j]. Even creating a completely new list each loop, this is orders of magnitude faster \$\endgroup\$ Commented Feb 10, 2014 at 16:22
  • 1
    \$\begingroup\$ Anyway, yes, list.remove is an O(n) operation; first it needs to search through the list to find the item, then move all of the items after it in the list one spot to the left. Using a set would be much more efficient. \$\endgroup\$ Commented Feb 10, 2014 at 16:28

1 Answer 1

7
\$\begingroup\$

I don't believe it is actually the specific Lne A that is the problem, but everything that is happening inside that loop, and most importantly, the line: primes.remove(...). From this StackOverflow answer you can see that remove(...) is an O(n) operation, thus it's performance is relative to the amount of data in the array. In this case, there's a lot.... and you are doing a lot of removes() since you are essentially removing all even values except 2.

What you should be doing is using a more efficient algorithm. You should read up on the Sieve of Eratosthenes, and introduce that algorithm here.

it will be much faster because it does not do any array-size changes. There are a number of posts here on CodeReview that have this problem solved quite nicely:

(Primes in Python...)

(Sieve in other languages)

answered Feb 10, 2014 at 16:34
\$\endgroup\$
1
  • \$\begingroup\$ Thanks, this is helpful (: This was actually a (failed) implementation of Sieves of Eratosthenes. I see that the list.remove method isn't the right thing to use, thanks! \$\endgroup\$ Commented Feb 10, 2014 at 16:43

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.