I was doing timing with variations of function largest_prime_factor
. I was able to write a better one largest_prime_factor2
. There were some repetitions in the new function so I wrote another function largest_prime_factor3
.
I thought the 3rd one has to be faster than the 2nd because I broke it up into more functions but it wasn't faster but slower.
- I wanted to know whether my method of code reuse is good or not?
- Is there a better way of reusing code in Python?
- Did I miss something due to which the new function became slower?
My functions alongwith the testing()
function used to test all three.
def largest_prime_factor(num):
'''returns the largest prime factor of a number'''
ans = 0
if num % 2 == 0:
ans = 2
num /= 2
while num % 2 == 0:
num /= 2
i = 3
while i <= num:
if num % i == 0:
ans = i
num /= i
while num % i == 0:
num /= i
i += 2
return ans
def largest_prime_factor2(num):
'''returns the largest prime factor of a number'''
ans = 0
if num % 2:
pass
else:
ans = 2
while True:
num //= 2
if num % 2:
break
i = 3
while i <= num:
if num % i:
pass
else:
ans = i
while True:
num //= i
if num % i:
break
i += 2
return ans
def largest_prime_factor3(num):
'''returns the largest prime factor of a number'''
def check(j):
nonlocal ans
nonlocal num
if num % j:
return
ans = j
while True:
num //= j
if num % j:
return
ans = 0
check(2)
i = 3
while i <= num:
check(i)
i += 2
return ans
def testing():
from timeit import Timer
import random
def tests(f, times):
def test1(f):
f(random.randint(1, 1000))
def test2(f):
f(random.randint(100000, 1000000))
print(f.__name__)
print(Timer(lambda: test1(f)).timeit(number = times))
print(Timer(lambda: test2(f)).timeit(number = times//10))
print()
tests(largest_prime_factor, 10000)
tests(largest_prime_factor2, 10000)
tests(largest_prime_factor3, 10000)
if __name__ == "__main__":
testing()
The timing results that I found:
largest_prime_factor 0.338549207387318 16.599112750324068 largest_prime_factor2 0.21575289594063918 12.086949738861868 largest_prime_factor3 0.36032199674939847 18.893444539398857
This format of results is produced by the testing
function. See that for details.
2 Answers 2
In order to avoid code duplication and the performance regression that you get when putting that code into a function you can use itertools.chain
:
import itertools as it
def largest_prime_factor(num):
'''returns the largest prime factor of a number'''
ans = num
for div in it.chain([2], range(3, num, 2)):
while not num % div:
num //= div
ans = div
return ans
To improve performances in a significant way you can reduce the number of iterations noting that you don't have to check factors bigger than sqrt(num)
, since if num
isn't a prime, then it can have at most one prime factor bigger than its square root(otherwise, if p
and q
where prime factors bigger than sqrt(n)
you'd get that n = A * p *q > A * sqrt(n) * sqrt(n) = A * n > n
which cannot be true when A != 1
).
This leads to:
def largest_prime_factor(num):
'''returns the largest prime factor of a number'''
ans = num
sqrt_num = int(num**.5)
for div in it.chain([2], range(3, sqrt_num + 1, 2)):
while not num % div:
num //= div
ans = div
return ans if num == 1 else num
Note that, at the end of the for
loop, the value of num
is 1 if all its prime factors were smaller than the square root, otherwise it is equal to the only prime factor bigger than the square root.
Sample run:
In [34]: list(map(largest_prime_factor, range(2, 25)))
Out[34]: [2, 3, 2, 5, 3, 7, 2, 3, 5, 11, 3, 13, 7, 5, 2, 17, 3, 19, 5, 7, 11, 23, 3]
In fact we can further optimize the code noting that, whenever num
becomes 1
inside the for
loop we can safely break
out of it(while the current code tries all the other factors up to sqrt(n)
anyway).
def largest_prime_factor(num):
'''returns the largest prime factor of a number'''
ans = num
sqrt_num = int(num**.5)
for div in it.chain([2], range(3, sqrt_num + 1, 2)):
while not num % div:
num //= div
ans = div
if div > num // div:
return ans
return num
We could also do something like:
def largest_prime_factor(num):
'''returns the largest prime factor of a number'''
ans = num
sqrt_num = int(num**.5)
candidates = it.chain([2], range(3, sqrt_num + 1, 2))
for div in it.takewhile(lambda div: div <= num // div, candidates):
while not num % div:
num //= div
ans = div
return max(ans, num)
However this last version does a function call for each iteration and it will be slightly slower.
-
\$\begingroup\$ I figured that there is a bigger number. There is slight formatting problem in the beginning of the question when importing
itertools
. Also whensqrt_num = int(num**.5) + 1
then you don't need the+ 1
in the range. Redundant. Please fix that \$\endgroup\$Aseem Bansal– Aseem Bansal2013年07月23日 08:49:45 +00:00Commented Jul 23, 2013 at 8:49 -
\$\begingroup\$ @AseemBansal Fixed. \$\endgroup\$Bakuriu– Bakuriu2013年07月23日 09:26:53 +00:00Commented Jul 23, 2013 at 9:26
-
2\$\begingroup\$ largest prime factor of (2^200 * 9) is 3. To find this, the maximal divisor to try by is 3, not (3 * 2^100). :) \$\endgroup\$Will Ness– Will Ness2013年07月23日 15:02:15 +00:00Commented Jul 23, 2013 at 15:02
-
\$\begingroup\$ welcome; your code isn't right still though. What about (2^100 * 10000000019)? What is the biggest divisor to consider? \$\endgroup\$Will Ness– Will Ness2013年07月23日 18:28:30 +00:00Commented Jul 23, 2013 at 18:28
-
1\$\begingroup\$ @WillNess I see, changed; now it should be as fast as possible for a simple implementation. Regardin the rest I don't see why should I use a
while
loop where afor
will do(note that in python using awhile
will decrease performances, even though in this case, hopefully, the operations inside the loop should take most of the time). Also the OP specifically asked how to do this without testing2
separately. Regarding thenot
: it's common practice in python to usenot
or simplyif expression
to check for truth values(e.g.if some_sequence:
meanssome_sequence
isn't empty. \$\endgroup\$Bakuriu– Bakuriu2013年07月23日 19:18:05 +00:00Commented Jul 23, 2013 at 19:18
Usually, splitting your code into different smaller functions is a good idea as it makes things easier to read, to reuse and to test.
However, this comes at a (cheap) price as far as performances are concerned because functions calls are not for free (nor are function definitions).
Now, for additional advices, whenever you want to compare performances of functions, it might also be a good idea to return what they compare. It's not good having a fast function if it doesn't return the right thing. By using assert
, you could ensure you get notified quickly whenever something goes wrong.
This being said, I'd like to run and comment your code but I don't have a python3 handy and it seems that it is the version you are using so I'll try to do this later :-)
-
\$\begingroup\$ I am not using many features of Python3. You just need to change the
print
in thetesting
function. Maybe the nested function in that also. Everything else is same. Otherwise you can use ideone.com if you want. \$\endgroup\$Aseem Bansal– Aseem Bansal2013年07月19日 13:35:04 +00:00Commented Jul 19, 2013 at 13:35 -
\$\begingroup\$ ideone.com is your friend :-) The code: ideone.com/565hyx (Note that I divided the number of tests by 100.) \$\endgroup\$Lstor– Lstor2013年07月19日 13:35:29 +00:00Commented Jul 19, 2013 at 13:35
-
\$\begingroup\$ @Lstor Making the numbers too small would give wrong results. the large numbers were the reason it is giving reproducible results. Instead just changing the
testing
function should suffice. Not much Python3 in this. \$\endgroup\$Aseem Bansal– Aseem Bansal2013年07月19日 13:37:49 +00:00Commented Jul 19, 2013 at 13:37 -
\$\begingroup\$ @AseemBansal I fully agree, and larger numbers are definitely better. But ideone.com has a time limitation, and the purpose of running the code there is running it, not timing it. \$\endgroup\$Lstor– Lstor2013年07月19日 13:39:07 +00:00Commented Jul 19, 2013 at 13:39
-
1\$\begingroup\$ Thanks for the advice about
assert
. About function calls costing performance then what about this? \$\endgroup\$Aseem Bansal– Aseem Bansal2013年07月19日 14:01:25 +00:00Commented Jul 19, 2013 at 14:01
Explore related questions
See similar questions with these tags.
2
. But you can create an iterable that first yields 2 and then only the odd numbers. For example usingitertools.chain
:for i in it.chain([2], range(3, num, 2)): ...
would do what you want. (Last remark: instead ofwhile True
you could usewhile n % i == 0
and remove thebreak
). \$\endgroup\$if
when timing that?for i in range(3, num, 2): while n %i == 0: ... n //= i
. Also, do not obsess to much about timing. Readable code is much better than a code that take 0.1 msec less. Havingwhile True
s around doesn't make code readable. \$\endgroup\$sqrt(n)
since they cannot be factors ofn
. the only case is whenn
is prime). \$\endgroup\$