Project Euler problem # 3 is this the most efficient approach in Python or can I do better?

Question 1

The problem statement: The prime factors of 13195 are 5, 7, 13 and 29.

What is the largest prime factor of the number 600851475143 ?

This code takes 9 seconds!, is there a better way?

from time import time
def is_prime(number):
 for divisor in range(2, int(number ** 0.5) + 1):
 if number % divisor == 0:
 return False
 return True
def calculate_largest(number):
 # checking backwards to get the largest number first
 for divisor in range(int(number ** 0.5) + 1, 0, -1): 
 if is_prime(divisor) and number % divisor == 0:
 return divisor
if __name__ == '__main__':
 time1 = time()
 print(calculate_largest(600851475143))
 print(time() - time1)

Question 2

That is not the most efficient solution. I suggest to check some related questions

Question 3

I have rolled back your last edit. Please don't change or add to the code in your question after you have received answers. See What should I do when someone answers my question? Thank you.

Question 4

didn't know that, thanks for letting me know, anyway I want to mark it as answered or so.

Question 5

@emadboctor If you want to mark the question as answered then you should upvote answers if they helped you. If you think on of the answers helped the most, then you should press the tick next to that answer, which marks it as the best answer.

Question 6

order

You wrote:

 if is_prime(divisor) and number % divisor == 0:

that is, if A and B:.

But B is typically false and is quickly computed. Prefer if B and A:

stride

Rather than

 for divisor in range(int(number ** 0.5) + 1, 0, -1):

you might want to begin with an odd number and then use a stride of -2. You needn't special case for when number is a power of two, since the input clearly isn't.

sieve

Consider sieving for the solution.

Question 7

Yeah I don't think sieving is the way to go with this one. Whilst you can use similar fundamentals they achieve fundamentally different things.

Question 8

Thanks, I edited the code and ran it again, time decreased dramatically (0.11 secs)

Question 9

Sieving is one of the easiest ways to speed up searching for small primes, and anyone can understand how it works. I think it's a fine way of speeding up this code. EDIT: I should note that the way to use a sieve in this case is to make a cache of primes, so you don't have to check every time if a number is prime.

Question 10

Problems

There is one correcntness and two performance problems with your approach, both connected with running through the problem from the high numbers and down.

For the correctness problem consider the number 91, it has the factorization of 7*13, but your approach would find 7 to be the largest, since 9 < 91**0.5 < 10 < 13, but the correct one would be 13. The reason we usually only need to consider primes up to the square-root of the target, is that whatever number is left when we have factorized it would be a prime number itself. This includes the number itself if we have found no prime factors.

The first performance problem is that you have no cached data in is_prime, so you have to run over all of the numbers to check if it is a prime every time, this is as opposed to building up knowledge of the primes in the region, which would likely save you a lot of time, as you will likely need to test is_prime for a lot of numbers before you hit the right one.

The second performance problem is that since you are running from above, you cannot make use of decreases to the target when you find smaller prime factors, which would have reduced the space you needed to look at.

Alternative

It is much simpler to write out the entire factorization, and then return the largest factor, than to try to go strictly after the last one.

Another thing to notice is that primes will come before any other numbers that is a factor of that prime, so we can simply factorize from the lowest number and not try to remove non-primes, because we will already have factored out the parts those numbers would factor out, then they will never show up in the factorization.

Note that if you want to do factorization for many numbers, you might want to construct the set of relevant prime numbers, while for a single use it is hard to eliminate a number faster than a single modulus operation.

Here is an implementation of the above principles:

def factorize(target):
 factors = []
 while target % 2 == 0 and target > 1:
 factors.append(2)
 target = target // 2
 max_factor = target **0.5
 factor = 3
 while target > 1 and factor <= max_factor:
 while target % factor == 0 and target > 1:
 factors.append(factor)
 target = target // factor
 max_factor = target ** 0.5
 factor += 2
 if target > 1:
 factors.append(target)
 return factors
print(factorize(600851475143)[-1])

When I measure the time of the above code, it takes roughly 0.3ms to run (a bit more for a single run and a bit less when I run it in a loop, about 0.35ms vs 0.29ms per run).

Question 11

In the nested while loop, shouldn't target = target // 2 be target = target // factor?

Question 12

@RootTwo That was indeed my intent, and I somehow ended up writing it wrong. I have corrected it now.

J_H J_H 41.4k3 gold badges38 silver badges157 bronze badges · Answer 1 · 2019-07-13 19:49:05Z

2

\$\begingroup\$

order

You wrote:

 if is_prime(divisor) and number % divisor == 0:

that is, if A and B:.

But B is typically false and is quickly computed. Prefer if B and A:

stride

Rather than

 for divisor in range(int(number ** 0.5) + 1, 0, -1):

you might want to begin with an odd number and then use a stride of -2. You needn't special case for when number is a power of two, since the input clearly isn't.

sieve

Consider sieving for the solution.

Share

edited Jun 10, 2020 at 13:24

Community's user avatar

Community Bot

1

answered Jul 13, 2019 at 19:49

J_H's user avatar

J_H J_H

41.4k3 gold badges38 silver badges157 bronze badges

\$\endgroup\$

3

2

\$\begingroup\$ Yeah I don't think sieving is the way to go with this one. Whilst you can use similar fundamentals they achieve fundamentally different things. \$\endgroup\$

Peilonrayz
– Peilonrayz ♦

2019年07月13日 19:51:45 +00:00
Commented Jul 13, 2019 at 19:51
2

\$\begingroup\$ Thanks, I edited the code and ran it again, time decreased dramatically (0.11 secs) \$\endgroup\$

user203258
– user203258

2019年07月13日 20:06:47 +00:00
Commented Jul 13, 2019 at 20:06
1

\$\begingroup\$ Sieving is one of the easiest ways to speed up searching for small primes, and anyone can understand how it works. I think it's a fine way of speeding up this code. EDIT: I should note that the way to use a sieve in this case is to make a cache of primes, so you don't have to check every time if a number is prime. \$\endgroup\$

Turksarama
– Turksarama

2019年07月14日 12:14:36 +00:00
Commented Jul 14, 2019 at 12:14

Add a comment |

Ninetails Ninetails 4692 silver badges5 bronze badges · Answer 2 · 2019-07-14 12:51:01Z

Problems

There is one correcntness and two performance problems with your approach, both connected with running through the problem from the high numbers and down.

For the correctness problem consider the number 91, it has the factorization of 7*13, but your approach would find 7 to be the largest, since 9 < 91**0.5 < 10 < 13, but the correct one would be 13. The reason we usually only need to consider primes up to the square-root of the target, is that whatever number is left when we have factorized it would be a prime number itself. This includes the number itself if we have found no prime factors.

The first performance problem is that you have no cached data in is_prime, so you have to run over all of the numbers to check if it is a prime every time, this is as opposed to building up knowledge of the primes in the region, which would likely save you a lot of time, as you will likely need to test is_prime for a lot of numbers before you hit the right one.

The second performance problem is that since you are running from above, you cannot make use of decreases to the target when you find smaller prime factors, which would have reduced the space you needed to look at.

Alternative

It is much simpler to write out the entire factorization, and then return the largest factor, than to try to go strictly after the last one.

Another thing to notice is that primes will come before any other numbers that is a factor of that prime, so we can simply factorize from the lowest number and not try to remove non-primes, because we will already have factored out the parts those numbers would factor out, then they will never show up in the factorization.

Note that if you want to do factorization for many numbers, you might want to construct the set of relevant prime numbers, while for a single use it is hard to eliminate a number faster than a single modulus operation.

Here is an implementation of the above principles:

def factorize(target):
 factors = []
 while target % 2 == 0 and target > 1:
 factors.append(2)
 target = target // 2
 max_factor = target **0.5
 factor = 3
 while target > 1 and factor <= max_factor:
 while target % factor == 0 and target > 1:
 factors.append(factor)
 target = target // factor
 max_factor = target ** 0.5
 factor += 2
 if target > 1:
 factors.append(target)
 return factors
print(factorize(600851475143)[-1])

When I measure the time of the above code, it takes roughly 0.3ms to run (a bit more for a single run and a bit less when I run it in a loop, about 0.35ms vs 0.29ms per run).

In the nested while loop, shouldn't target = target // 2 be target = target // factor?
@RootTwo That was indeed my intent, and I somehow ended up writing it wrong. I have corrected it now.

Stack Exchange Network

Project Euler problem # 3 is this the most efficient approach in Python or can I do better?

2 Answers 2

order

stride

sieve

Problems

Alternative

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Project Euler problem # 3 is this the most efficient approach in Python or can I do better?

2 Answers 2

order

stride

sieve

Problems

Alternative

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions