I just figured out Project Euler problem #12. I'm still really new to Python, so I'm putting my code up here in the hopes that someone will be able to give me some tips for optimizing my code to run faster. It took an incredibly long time to run, so I'm hoping that someone will be able to give me pointers.
If anyone is willing, would you mind looking over my solution and letting me know how I can improve on it, either through optimization or style?
The link to the problem is here: Project Euler #12
run = True
p = 2
numberInUse = 3
def checkDivisors(number):
numberOfDivisors = 0
for x in range(1, int(number+1)):
if number%x ==0:
numberOfDivisors += 1
return(numberOfDivisors)
while run == True:
if checkDivisors(numberInUse) > 500:
run = False
print(numberInUse, "is the answer!")
else:
p += 1
numberInUse += p
-
2\$\begingroup\$ "If you can't solve it, then you can't solve it!" -- quote Project Euler. If you can already solve it, then you can read the forum thread. \$\endgroup\$user158419– user1584192018年01月19日 08:01:45 +00:00Commented Jan 19, 2018 at 8:01
5 Answers 5
First some stylistic stuff
Variable names
According to the PEP 8 style guide, Python variables use snake_case
instead of camelCase
so, for instance, numberInUse
should be number_in_use
.
Add a __main__
The equivalent of a main
function in Python is:
def main():
...
if __name__ == '__main__':
main()
The reasons for this are described in this answer. You should move your while
loop into the main function.
Refactoring the while loop
Don't use a run
flag. Just break
when you are done:
while True:
if checkDivisors(numberInUse) > 500:
print(numberInUse, "is the answer!")
break
else:
p += 1
numberInUse += p
Optimizations
There is a lot that can be optimized when it comes to checkDivisors
(check_divsors
). Recall that a number can be factorized as a product of prime numbers:
$$p_1^{k_1}p_2^{k_2} \ldots p_n^{k_n}$$
The number of factors then becomes:
$$(k_1 + 1)(k_2 + 1)\cdots(k_n + 1)$$
So this is the idea:
- Keep track of the factors in some sort of
counter
variable. (Initialize it at1
) - Create a variable
i = 2
. - Keep incrementing
i
untilnumber
is equal to1
. - If
i
dividesnumber
, dividenumber
byi
. Don't increment and keep track of how many times you dividednumber
byi
call thisdiv
. - Otherwise, update
counter
by multiplyingcouneter
withdiv + 1
. Incrementi
.
So let's see what this does with 6
:
2
divides6
so now assignnumber
to6/2 = 3
and updatediv = 1
.- Update
counter
by multiplying it bydiv + 1
socounter = 1*(1+1) = 2
and incrementi
. 3
divides3
so now assignnumber
to3/3 = 1
and updatediv = 1
.- Update
counter
by multiplying it bydiv + 1
socounter = 2*(1+1) = 4
and incrementi
. number = 1
so terminate the algorithm. Returncounter = 4
which is the number of divisors for6
.
I will leave it to you to implement the algorithm as an exercise.
-
\$\begingroup\$ Thank you so much for your help! I was able to easily implement this solution, and it ran in a very short amount of time. I really appreciate you helping me out! \$\endgroup\$readjfb– readjfb2018年01月18日 13:56:55 +00:00Commented Jan 18, 2018 at 13:56
-
\$\begingroup\$ @user9207170: How does your code look like now? You can write an answer to yourself to show it to others, without modifying your question. \$\endgroup\$Eric Duminil– Eric Duminil2018年01月18日 14:30:27 +00:00Commented Jan 18, 2018 at 14:30
Project Euler problems are more about math than programming.
You must prove (or at least observe) the following:
An
n
th triangular number is \$\dfrac{n(n+1)}{2}\$. One of the numerator's factors is obviously even.The
number_of_divisors
is a multiplicative function (known as \$\sigma\$), meaning that if \$m\$ and \$n\$ are coprime, then \$\sigma(mn) = \sigma(m)\cdot\sigma(n)\$\$n\$ and \$n+1\$ are necessarily coprime.
Once you computed \$\sigma(n)\$ and \$\sigma(n+1)\$ on a certain iteration, on the next iteration you'd only need to compute \$\sigma(n+2)\$. Using memoization, you may even avoid this when \$n+2\$ is even.
This should be enough to build an efficient solution.
-
\$\begingroup\$ Very nice. It works fine, and is 3 times faster than my naive approach. See here. \$\endgroup\$Eric Duminil– Eric Duminil2018年01月18日 14:47:34 +00:00Commented Jan 18, 2018 at 14:47
-
\$\begingroup\$ That's the spirit :) +1 \$\endgroup\$Ma0– Ma02018年01月19日 09:07:45 +00:00Commented Jan 19, 2018 at 9:07
First things first...
Pep8
Python has a strong idea of how the code should be styled, and it is expressed in pep8.
I suggest you get a style/lint checker. I use the pycharm ide which will show you style and compile issues right in the editor.
This is important when you are sharing code with others. (Like maybe on Code Review)
Comprehensions are our friend
Your code has a function:
def checkDivisors(number):
numberOfDivisors = 0
for x in range(1, int(number+1)):
if number%x ==0:
numberOfDivisors += 1
return(numberOfDivisors)
To greatly speed things up, we can convert to a comprehension, and change the loop to terminate earlier.
number_of_divisors = len({
factor for i in range(1, int(number_in_use ** 0.5) + 1) if
number_in_use % i == 0 for factor in [i, number_in_use // i]})
How does this work
This loop is similar to yours, but it only loops up to the square root of the number we are getting divisors for, and then when it finds a divisor, also keeps the number divided by the divisor as a divisor.
If only one exit from permaloop, use break
, not boolean flag
Instead of:
while run == True:
if ready_to_finsh:
run = False
use:
while True:
....
if ready_to_finish:
break
This saves the need for the intermediate variable
Resulting Code
from math import sqrt
number_of_divisors, p, number_in_use = 0, 2, 3
while True:
number_of_divisors = len({
factor for i in range(1, int(sqrt(number_in_use)) + 1) if
number_in_use % i == 0 for factor in [i, number_in_use // i]})
if number_of_divisors > finish:
break
p += 1
number_in_use += p
Theory
Calculate prime factors
For a semi-naive approach, you could calculate the prime factors of n
first:
from collections import Counter
from math import floor
def factors(n, found=Counter(), start=2):
if not found:
found = Counter()
for i in range(start, floor(n**0.5)+1):
if n % i == 0:
found[i] += 1
return factors(n // i, found, start=i)
found[n] += 1
return found
The function returns a Counter
, with prime factors as keys and exponent as values:
print(factors(1))
# Counter({1: 1})
print(factors(7))
# Counter({7: 1})
print(factors(24))
# Counter({2: 3, 3: 1})
print(factors(25))
# Counter({5: 2})
print(factors(35))
# Counter({5: 1, 7: 1})
print(factors(1024))
# Counter({2: 10})
Calculate number of divisors
You can then use this counter to calculate the number of divisors:
def number_of_divisors(n):
if n == 1:
return 1
result = 1
for exponent in factors(n).values():
result *= exponent + 1
return result
Example:
print(number_of_divisors(60))
# 12
Solution
You simply need to iterate over n
until the n-th triangle number has more than 500 divisors:
from itertools import count
for n in count():
triangle_number = n * (n + 1) // 2
divisors_count = number_of_divisors(triangle_number)
if divisors_count > 500:
print("The %d-th triangle number is %d and has %d divisors." %
(n, triangle_number, divisors_count))
break
It returns a solution in less than 300ms on my slowish computer:
The 12375-th triangle number is 76576500 and has 576 divisors.
Optimization
- Since
factors
returns aCounter
, you can use the fact thatfactors(n1*n2)
isfactors(n1) + factors(n2)
for any integersn1, n2
larger than 1. - You could cache the result of
factors(n+1)
, and use it asfactors(n)
during the next iteration. - @vnp has written an excellent answer with further optimization.
The complete code becomes:
from collections import Counter
from math import floor
from functools import lru_cache
from itertools import count
def factors(n, found=Counter(), start=2):
if not found:
found = Counter()
for i in range(start, floor(n**0.5) + 1):
if n % i == 0:
found[i] += 1
return factors(n // i, found, start=i)
found[n] += 1
return found
@lru_cache(maxsize=1024)
def number_of_divisors(n):
if n == 1:
return 1
result = 1
for exponent in factors(n).values():
result *= exponent + 1
return result
for n in count():
if n % 2 == 0:
even = n
odd = n + 1
else:
odd = n
even = n + 1
divisors_count = number_of_divisors(even // 2) * number_of_divisors(odd)
if divisors_count > 500:
print("The %d-th triangle number is %d and has %d divisors." %
(n, even * odd // 2, divisors_count))
break
It is 3 times faster than the previous solution:
The 12375-th triangle number is 76576500 and has 576 divisors.
Just for fun, it also finds the solution for 5000+ divisors in less than a minute:
The 2203200-th triangle number is 2427046221600 and has 5760 divisors.
Your checkDivisors
function is incredibly inefficient; you can calculate the number of divisors from the prime factorization rather than checking whether each number is a factor (and even if you were brute forcing, you could improve by a factor of two by realizing that the only divisor greater than half the number is the number itself). Consider the number 24 = 2^3 * 3. Since the only prime factors of 24 are 2 and 3, a factor of 24 can't have any prime factors other than 2 or 3. So a factor of 24 will be in the form of n = 2^a * 3^b. But if a>3, then n will not divide 24. So the allowable values of a are 0,1,2, and 3. Notice that there are four allowable values of a, and this is one more than the power of 2 (the power of 2 in 24 is 3). Similarly, there two allowable values of b. So the total number of combinations of a and b is 4*2 = 8. Thus, there are 8 factors of 24: 1,2,3,4,6,8,12,24.
So you should break checkDivisors
into two functions. One should find the prime factorization of n, and the other should multiply together one more than the powers of the prime factorization. You can improve the prime factorization function by maintaining a dictionary of numbers that you've factored. Then once you've found a prime factor p of n, you can use the prime factorization of n/p. And since prime factorization uses prime numbers, you should keep track of what numbers are prime.
You can also use the fact that the nth triangle number is equal to (n+1)n/2, so you just have to check the number of factors of those two numbers. I recommend checking two numbers at a time, so you don't have to deal with checking which number is even. That is, if you check 3 and 4 in the first iteration, then 5 and 6, etc., then you know that you can divide the second number by two.
Also, you can get rid of the "magic number" of 500 by setting a constant target_factors
.
from functools import reduce
from operator import mul
target_factors = 500
prime_numbers = [2,3]
factorizations = {2:{2:1},3:{3:1}}
def number_of_factors(prime_factorization):
return(reduce(mul, [power+1 for power in prime_factorization.values()], 1))
def factor(n):
if n in factorizations.keys():
return(factorizations[n].copy())
for prime in prime_numbers:
if not n%prime: #n%p give the remainder of n/p, so if there's no remainder, that means p is a factor of n
factorization = factorizations[n/prime].copy()
factorization[prime] = factorization.get(prime,0)+1
factorizations[n] = factorization
break
if prime**2 > n:
prime_numbers.append(n)
factorizations[n] = {n:1}
break
return(factorizations[n].copy())
n = 4
while True:
previous_factorization = factor(n-1)
previous_num_of_factors = number_of_factors(previous_factorization)
current_factorization = factor(n)
current_factorization[2] -=1 #dividing by 2 is the same as subtracting one from the power of 2
current_num_of_factors = number_of_factors(current_factorization)
if previous_num_of_factors*current_num_of_factors > target_factors:
print('The {}th triangle number has {} factors'.format(n-1,previous_num_of_factors*current_num_of_factors))
break
next_factorization = factor(n+1)
next_num_of_factors = number_of_factors(next_factorization)
if next_num_of_factors*current_num_of_factors > target_factors:
print('The {}th triangle number has {} factors'.format(n,next_num_of_factors*current_num_of_factors))
break
n+=2
I found that the time taken varied a bit, which is odd since this should be deterministic, but on average this took about half as much time as @Eric Duminil 's solution.