This is my attempt at problem 12 from Project Euler. However, my code freezes my old, linux computer when I run it. I've only got 512MB of RAM and an old Intel Core 2 Duo processor on my machine which, I assume, is why it can't run the code.
Is there a way to optimize this code without using up too much resources?
#!/bin/python
def nth_triangle_number(nth):
""" Sequence of triangle numbers is generated by adding the natural numbers.
So the 7th triangle number would be 1 +たす 2 +たす 3 +たす 4 +たす 5 +たす 6 +たす 7 =わ 28. """
return sum([i for i in range(nth+1)])
def get_number_of_factors(num):
quantity = 0
for i in range(1, int(num/2)+1): # I think the problem lies here in range()
if not num % i: # BTW, using math.sqrt() produces the wrong
quantity += 1 # answer for smaller numbers. I don't know why.
return quantity + 1 # Added 1 to count for num (as in 1 * num)
def get_triangle_number(divisors):
""" Find the triangle number with the given amount of divisors """
i = 1
while get_number_of_factors(nth_triangle_number(i)) <= divisors:
i += 1
return nth_triangle_number(i)
if __name__ == '__main__':
number_of_divisors = 500
print "Answer found: ", get_triangle_number(number_of_divisors)
I've tried initializing get_triangle_number(), i, at higher values, but it didn't seem to help. I believe the problem is with get_number_of_factors(). I had originally stored all the factors of a given integer into a list then compared its length to number_of_divisors. I figured counting each integer that satisfied the condition would save some resources.
The code does run properly and produces the correct answer when I tested the example provided on Project Euler and the following values:
number_of_divisors = 5 # Runs properly and produces correct answer
number_of_divisors = 50 # Runs properly and " "
number_of_divisors = 150 # Runs properly and " "
number_of_divisors = 200 # Crash
1 Answer 1
First, you can easily find the bottleneck in your code by running it with the profiler (ran with number_of_divisors = 100
):
~> python -m cProfile t.py
Answer found: 73920
1926 function calls in 0.522 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 0.522 0.522 t.py:1(<module>)
385 0.006 0.000 0.008 0.000 t.py:1(nth_triangle_number)
1 0.000 0.000 0.522 0.522 t.py:13(get_triangle_number)
384 0.472 0.001 0.514 0.001 t.py:6(get_number_of_factors)
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
769 0.043 0.000 0.043 0.000 {range}
385 0.001 0.000 0.001 0.000 {sum}
You can immediately see that:
- Indeed,
get_number_of_factors()
slows everything down; - You can slightly optimize things by replacing
range
withxrange
(which does not create a list).
In your case get_number_of_factors()
is a good example of a function that requires algorithmic optimization. Currently it has O(N) complexity. As you correctly guessed, iterating only till sqrt(num)
will make it faster:
def get_number_of_factors(num):
limit = math.sqrt(num)
int_sqrt = int(round(limit))
if int_sqrt ** 2 == num:
int_limit = int_sqrt
quantity = 1
else:
int_limit = int(limit) + 1
quantity = 0
for i in xrange(1, int_limit):
if not num % i:
quantity += 2
return quantity
Some highlights:
math.sqrt
returns a float, so if you want to be extra-safe, you need to convert it to integer properly (ifsqrt(4)
gives you3.9999999
, justint()
will produce3
).- When you are iterating to the square root, each factor you found stands for two factors:
i
andnum / i
. - But you do not want to include the root itself, because if
num
is a perfect root, you only need to countsqrt(num)
once.
This makes your algorithm run in O(sqrt(N)). Edit: if you want even more performance, you can try to implement number field sieve.
To add more performance, you can replace the direct sum in nth_triangle_number()
with a known formula for arithmetic progression.
The result is:
import math
def nth_triangle_number(nth):
""" Sequence of triangle numbers is generated by adding the natural numbers.
So the 7th triangle number would be 1 +たす 2 +たす 3 +たす 4 +たす 5 +たす 6 +たす 7 =わ 28.
Using the fact that 1 + ... + n = n * (n + 1) / 2
"""
return nth * (nth + 1) / 2
def get_number_of_factors(num):
limit = math.sqrt(num)
int_sqrt = int(round(limit))
if int_sqrt ** 2 == num:
int_limit = int_sqrt
quantity = 1
else:
int_limit = int(limit) + 1
quantity = 0
for i in xrange(1, int_limit):
if not num % i:
quantity += 2
return quantity
def get_triangle_number(divisors):
""" Find the triangle number with the given amount of divisors """
i = 1
while get_number_of_factors(nth_triangle_number(i)) <= divisors:
i += 1
return nth_triangle_number(i)
if __name__ == '__main__':
number_of_divisors = 100
print "Answer found: ", get_triangle_number(number_of_divisors)
And it runs much faster now. For 100
:
~> python -m cProfile t.py
Answer found: 73920
1540 function calls in 0.007 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 0.007 0.007 t.py:1(<module>)
384 0.006 0.000 0.006 0.000 t.py:10(get_number_of_factors)
1 0.000 0.000 0.006 0.006 t.py:28(get_triangle_number)
385 0.000 0.000 0.000 0.000 t.py:3(nth_triangle_number)
384 0.000 0.000 0.000 0.000 {math.sqrt}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
384 0.000 0.000 0.000 0.000 {round}
And for 500
:
~> python -m cProfile t.py
Answer found: 76576500
49504 function calls in 5.364 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 5.364 5.364 t.py:1(<module>)
12375 5.342 0.000 5.349 0.000 t.py:10(get_number_of_factors)
1 0.010 0.010 5.363 5.363 t.py:28(get_triangle_number)
12376 0.005 0.000 0.005 0.000 t.py:3(nth_triangle_number)
12375 0.003 0.000 0.003 0.000 {math.sqrt}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
12375 0.004 0.000 0.004 0.000 {round}
-
1\$\begingroup\$ Excellent! So much useful information in your review! I especially like the algorithms you've introduced. I'm at the end my first year in my CS course and I have yet to explore this stuff. Thanks for the head start! \$\endgroup\$alexako– alexako2013年07月16日 14:22:13 +00:00Commented Jul 16, 2013 at 14:22