The problem statement: The prime factors of 13195 are 5, 7, 13 and 29.
What is the largest prime factor of the number 600851475143 ?
This code takes 9 seconds!, is there a better way?
from time import time
def is_prime(number):
for divisor in range(2, int(number ** 0.5) + 1):
if number % divisor == 0:
return False
return True
def calculate_largest(number):
# checking backwards to get the largest number first
for divisor in range(int(number ** 0.5) + 1, 0, -1):
if is_prime(divisor) and number % divisor == 0:
return divisor
if __name__ == '__main__':
time1 = time()
print(calculate_largest(600851475143))
print(time() - time1)
-
3\$\begingroup\$ That is not the most efficient solution. I suggest to check some related questions \$\endgroup\$Martin R– Martin R2019年07月13日 19:21:44 +00:00Commented Jul 13, 2019 at 19:21
-
\$\begingroup\$ I have rolled back your last edit. Please don't change or add to the code in your question after you have received answers. See What should I do when someone answers my question? Thank you. \$\endgroup\$Peilonrayz– Peilonrayz ♦2019年07月13日 20:05:25 +00:00Commented Jul 13, 2019 at 20:05
-
\$\begingroup\$ didn't know that, thanks for letting me know, anyway I want to mark it as answered or so. \$\endgroup\$user203258– user2032582019年07月13日 20:09:43 +00:00Commented Jul 13, 2019 at 20:09
-
\$\begingroup\$ @emadboctor If you want to mark the question as answered then you should upvote answers if they helped you. If you think on of the answers helped the most, then you should press the tick next to that answer, which marks it as the best answer. \$\endgroup\$Peilonrayz– Peilonrayz ♦2019年07月16日 00:51:40 +00:00Commented Jul 16, 2019 at 0:51
2 Answers 2
order
You wrote:
if is_prime(divisor) and number % divisor == 0:
that is, if A and B:
.
But B
is typically false and is quickly computed.
Prefer if B and A:
stride
Rather than
for divisor in range(int(number ** 0.5) + 1, 0, -1):
you might want to begin with an odd number and then use a stride of -2
.
You needn't special case for when number
is a power of two,
since the input clearly isn't.
sieve
Consider sieving for the solution.
-
2\$\begingroup\$ Yeah I don't think sieving is the way to go with this one. Whilst you can use similar fundamentals they achieve fundamentally different things. \$\endgroup\$2019年07月13日 19:51:45 +00:00Commented Jul 13, 2019 at 19:51
-
2\$\begingroup\$ Thanks, I edited the code and ran it again, time decreased dramatically (0.11 secs) \$\endgroup\$user203258– user2032582019年07月13日 20:06:47 +00:00Commented Jul 13, 2019 at 20:06
-
1\$\begingroup\$ Sieving is one of the easiest ways to speed up searching for small primes, and anyone can understand how it works. I think it's a fine way of speeding up this code. EDIT: I should note that the way to use a sieve in this case is to make a cache of primes, so you don't have to check every time if a number is prime. \$\endgroup\$Turksarama– Turksarama2019年07月14日 12:14:36 +00:00Commented Jul 14, 2019 at 12:14
Problems
There is one correcntness and two performance problems with your approach, both connected with running through the problem from the high numbers and down.
For the correctness problem consider the number 91
, it has the factorization of 7*13
, but your approach would find 7
to be the largest, since 9 < 91**0.5 < 10 < 13
, but the correct one would be 13
. The reason we usually only need to consider primes up to the square-root of the target, is that whatever number is left when we have factorized it would be a prime number itself. This includes the number itself if we have found no prime factors.
The first performance problem is that you have no cached data in is_prime
, so you have to run over all of the numbers to check if it is a prime every time, this is as opposed to building up knowledge of the primes in the region, which would likely save you a lot of time, as you will likely need to test is_prime
for a lot of numbers before you hit the right one.
The second performance problem is that since you are running from above, you cannot make use of decreases to the target when you find smaller prime factors, which would have reduced the space you needed to look at.
Alternative
It is much simpler to write out the entire factorization, and then return the largest factor, than to try to go strictly after the last one.
Another thing to notice is that primes will come before any other numbers that is a factor of that prime, so we can simply factorize from the lowest number and not try to remove non-primes, because we will already have factored out the parts those numbers would factor out, then they will never show up in the factorization.
Note that if you want to do factorization for many numbers, you might want to construct the set of relevant prime numbers, while for a single use it is hard to eliminate a number faster than a single modulus operation.
Here is an implementation of the above principles:
def factorize(target):
factors = []
while target % 2 == 0 and target > 1:
factors.append(2)
target = target // 2
max_factor = target **0.5
factor = 3
while target > 1 and factor <= max_factor:
while target % factor == 0 and target > 1:
factors.append(factor)
target = target // factor
max_factor = target ** 0.5
factor += 2
if target > 1:
factors.append(target)
return factors
print(factorize(600851475143)[-1])
When I measure the time of the above code, it takes roughly 0.3ms
to run (a bit more for a single run and a bit less when I run it in a loop, about 0.35ms
vs 0.29ms
per run).
-
\$\begingroup\$ In the nested
while
loop, shouldn'ttarget = target // 2
betarget = target // factor
? \$\endgroup\$RootTwo– RootTwo2019年07月20日 03:09:10 +00:00Commented Jul 20, 2019 at 3:09 -
\$\begingroup\$ @RootTwo That was indeed my intent, and I somehow ended up writing it wrong. I have corrected it now. \$\endgroup\$Ninetails– Ninetails2019年07月20日 06:44:59 +00:00Commented Jul 20, 2019 at 6:44