Efficient way of code reuse for finding the largest prime factor

Question 1

I was doing timing with variations of function largest_prime_factor. I was able to write a better one largest_prime_factor2. There were some repetitions in the new function so I wrote another function largest_prime_factor3.

I thought the 3rd one has to be faster than the 2nd because I broke it up into more functions but it wasn't faster but slower.

I wanted to know whether my method of code reuse is good or not?
Is there a better way of reusing code in Python?
Did I miss something due to which the new function became slower?

My functions alongwith the testing() function used to test all three.

def largest_prime_factor(num):
 '''returns the largest prime factor of a number'''
 ans = 0
 if num % 2 == 0:
 ans = 2
 num /= 2
 while num % 2 == 0:
 num /= 2
 i = 3
 while i <= num:
 if num % i == 0:
 ans = i
 num /= i
 while num % i == 0:
 num /= i
 i += 2
 return ans
def largest_prime_factor2(num):
 '''returns the largest prime factor of a number''' 
 ans = 0
 if num % 2:
 pass
 else:
 ans = 2
 while True:
 num //= 2
 if num % 2:
 break
 i = 3
 while i <= num:
 if num % i:
 pass
 else:
 ans = i
 while True:
 num //= i
 if num % i:
 break
 i += 2
 return ans
def largest_prime_factor3(num):
 '''returns the largest prime factor of a number'''
 def check(j):
 nonlocal ans
 nonlocal num
 if num % j:
 return
 ans = j
 while True:
 num //= j
 if num % j:
 return
 ans = 0
 check(2)
 i = 3
 while i <= num:
 check(i)
 i += 2
 return ans
def testing():
 from timeit import Timer
 import random
 def tests(f, times):
 def test1(f):
 f(random.randint(1, 1000))
 def test2(f):
 f(random.randint(100000, 1000000))
 print(f.__name__)
 print(Timer(lambda: test1(f)).timeit(number = times))
 print(Timer(lambda: test2(f)).timeit(number = times//10))
 print()
 tests(largest_prime_factor, 10000)
 tests(largest_prime_factor2, 10000)
 tests(largest_prime_factor3, 10000)
if __name__ == "__main__":
 testing()

The timing results that I found:

largest_prime_factor
0.338549207387318
16.599112750324068
largest_prime_factor2
0.21575289594063918
12.086949738861868
largest_prime_factor3
0.36032199674939847
18.893444539398857

This format of results is produced by the testing function. See that for details.

Question 2

As this question is about time efficiency any discussions for this can take place in this chat room. Please use this to avoid extended discussions.

Question 3

"I thought the 3rd one has to be faster than the 2nd because I broke it up into more functions but it wasn't faster but slower. " Why would you think that? Using function call instead of "inlining the code" can only produce a slowdown. Depending on the interpreter/compiler the code might be optimized to obtain the same efficiency as the inlined version, otherwise you end up with function-call overhead which means slower code. This is especially true in python where the interpreter doesn't do this kind of optimization and function calls are relatively expensive.

Question 4

In general no. In your example you had to copy the loop to special case 2. But you can create an iterable that first yields 2 and then only the odd numbers. For example using itertools.chain: for i in it.chain([2], range(3, num, 2)): ... would do what you want. (Last remark: instead of while True you could use while n % i == 0 and remove the break).

Question 5

Did you remove the if when timing that? for i in range(3, num, 2): while n %i == 0: ... n //= i. Also, do not obsess to much about timing. Readable code is much better than a code that take 0.1 msec less. Having while Trues around doesn't make code readable.

Question 6

By the way: you are considering only micro-optimizations when you could optimize the number of iterations in a much more significant way. You don't have to check numbers over sqrt(n) since they cannot be factors of n. the only case is when n is prime).

Question 7

In order to avoid code duplication and the performance regression that you get when putting that code into a function you can use itertools.chain:

import itertools as it
def largest_prime_factor(num):
 '''returns the largest prime factor of a number'''
 ans = num
 for div in it.chain([2], range(3, num, 2)):
 while not num % div:
 num //= div
 ans = div
 return ans

To improve performances in a significant way you can reduce the number of iterations noting that you don't have to check factors bigger than sqrt(num), since if num isn't a prime, then it can have at most one prime factor bigger than its square root(otherwise, if p and q where prime factors bigger than sqrt(n) you'd get that n = A * p *q > A * sqrt(n) * sqrt(n) = A * n > n which cannot be true when A != 1).

This leads to:

def largest_prime_factor(num):
 '''returns the largest prime factor of a number'''
 ans = num
 sqrt_num = int(num**.5)
 for div in it.chain([2], range(3, sqrt_num + 1, 2)):
 while not num % div:
 num //= div
 ans = div
 return ans if num == 1 else num

Note that, at the end of the for loop, the value of num is 1 if all its prime factors were smaller than the square root, otherwise it is equal to the only prime factor bigger than the square root.

Sample run:

In [34]: list(map(largest_prime_factor, range(2, 25)))
Out[34]: [2, 3, 2, 5, 3, 7, 2, 3, 5, 11, 3, 13, 7, 5, 2, 17, 3, 19, 5, 7, 11, 23, 3]

In fact we can further optimize the code noting that, whenever num becomes 1 inside the for loop we can safely break out of it(while the current code tries all the other factors up to sqrt(n) anyway).

def largest_prime_factor(num):
 '''returns the largest prime factor of a number'''
 ans = num
 sqrt_num = int(num**.5)
 for div in it.chain([2], range(3, sqrt_num + 1, 2)):
 while not num % div:
 num //= div
 ans = div
 if div > num // div:
 return ans
 return num

We could also do something like:

def largest_prime_factor(num):
 '''returns the largest prime factor of a number'''
 ans = num
 sqrt_num = int(num**.5)
 candidates = it.chain([2], range(3, sqrt_num + 1, 2))
 for div in it.takewhile(lambda div: div <= num // div, candidates):
 while not num % div:
 num //= div
 ans = div
 return max(ans, num)

However this last version does a function call for each iteration and it will be slightly slower.

Question 8

I figured that there is a bigger number. There is slight formatting problem in the beginning of the question when importing itertools. Also when sqrt_num = int(num**.5) + 1 then you don't need the + 1 in the range. Redundant. Please fix that

Question 9

@AseemBansal Fixed.

Question 10

largest prime factor of (2^200 * 9) is 3. To find this, the maximal divisor to try by is 3, not (3 * 2^100). :)

Question 11

welcome; your code isn't right still though. What about (2^100 * 10000000019)? What is the biggest divisor to consider?

Question 12

@WillNess I see, changed; now it should be as fast as possible for a simple implementation. Regardin the rest I don't see why should I use a while loop where a for will do(note that in python using a while will decrease performances, even though in this case, hopefully, the operations inside the loop should take most of the time). Also the OP specifically asked how to do this without testing 2 separately. Regarding the not: it's common practice in python to use not or simply if expression to check for truth values(e.g. if some_sequence: means some_sequence isn't empty.

Question 13

Usually, splitting your code into different smaller functions is a good idea as it makes things easier to read, to reuse and to test.

However, this comes at a (cheap) price as far as performances are concerned because functions calls are not for free (nor are function definitions).

Now, for additional advices, whenever you want to compare performances of functions, it might also be a good idea to return what they compare. It's not good having a fast function if it doesn't return the right thing. By using assert, you could ensure you get notified quickly whenever something goes wrong.

This being said, I'd like to run and comment your code but I don't have a python3 handy and it seems that it is the version you are using so I'll try to do this later :-)

Question 14

I am not using many features of Python3. You just need to change the print in the testing function. Maybe the nested function in that also. Everything else is same. Otherwise you can use ideone.com if you want.

Question 15

ideone.com is your friend :-) The code: ideone.com/565hyx (Note that I divided the number of tests by 100.)

Question 16

@Lstor Making the numbers too small would give wrong results. the large numbers were the reason it is giving reproducible results. Instead just changing the testing function should suffice. Not much Python3 in this.

Question 17

@AseemBansal I fully agree, and larger numbers are definitely better. But ideone.com has a time limitation, and the purpose of running the code there is running it, not timing it.

Question 18

Thanks for the advice about assert. About function calls costing performance then what about this?

Bakuriu Bakuriu 2961 silver badge8 bronze badges · Accepted Answer · 2013-07-23 08:44:36Z

In order to avoid code duplication and the performance regression that you get when putting that code into a function you can use itertools.chain:

import itertools as it
def largest_prime_factor(num):
 '''returns the largest prime factor of a number'''
 ans = num
 for div in it.chain([2], range(3, num, 2)):
 while not num % div:
 num //= div
 ans = div
 return ans

To improve performances in a significant way you can reduce the number of iterations noting that you don't have to check factors bigger than sqrt(num), since if num isn't a prime, then it can have at most one prime factor bigger than its square root(otherwise, if p and q where prime factors bigger than sqrt(n) you'd get that n = A * p *q > A * sqrt(n) * sqrt(n) = A * n > n which cannot be true when A != 1).

This leads to:

def largest_prime_factor(num):
 '''returns the largest prime factor of a number'''
 ans = num
 sqrt_num = int(num**.5)
 for div in it.chain([2], range(3, sqrt_num + 1, 2)):
 while not num % div:
 num //= div
 ans = div
 return ans if num == 1 else num

Note that, at the end of the for loop, the value of num is 1 if all its prime factors were smaller than the square root, otherwise it is equal to the only prime factor bigger than the square root.

Sample run:

In [34]: list(map(largest_prime_factor, range(2, 25)))
Out[34]: [2, 3, 2, 5, 3, 7, 2, 3, 5, 11, 3, 13, 7, 5, 2, 17, 3, 19, 5, 7, 11, 23, 3]

In fact we can further optimize the code noting that, whenever num becomes 1 inside the for loop we can safely break out of it(while the current code tries all the other factors up to sqrt(n) anyway).

def largest_prime_factor(num):
 '''returns the largest prime factor of a number'''
 ans = num
 sqrt_num = int(num**.5)
 for div in it.chain([2], range(3, sqrt_num + 1, 2)):
 while not num % div:
 num //= div
 ans = div
 if div > num // div:
 return ans
 return num

We could also do something like:

def largest_prime_factor(num):
 '''returns the largest prime factor of a number'''
 ans = num
 sqrt_num = int(num**.5)
 candidates = it.chain([2], range(3, sqrt_num + 1, 2))
 for div in it.takewhile(lambda div: div <= num // div, candidates):
 while not num % div:
 num //= div
 ans = div
 return max(ans, num)

However this last version does a function call for each iteration and it will be slightly slower.

I figured that there is a bigger number. There is slight formatting problem in the beginning of the question when importing itertools. Also when sqrt_num = int(num**.5) + 1 then you don't need the + 1 in the range. Redundant. Please fix that
largest prime factor of (2^200 * 9) is 3. To find this, the maximal divisor to try by is 3, not (3 * 2^100). :)
welcome; your code isn't right still though. What about (2^100 * 10000000019)? What is the biggest divisor to consider?
@WillNess I see, changed; now it should be as fast as possible for a simple implementation. Regardin the rest I don't see why should I use a while loop where a for will do(note that in python using a while will decrease performances, even though in this case, hopefully, the operations inside the loop should take most of the time). Also the OP specifically asked how to do this without testing 2 separately. Regarding the not: it's common practice in python to use not or simply if expression to check for truth values(e.g. if some_sequence: means some_sequence isn't empty.

Stack Exchange Network

Efficient way of code reuse for finding the largest prime factor

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Efficient way of code reuse for finding the largest prime factor

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions