Optimizing Project Euler #12 (Python 3)

Question 1

I just figured out Project Euler problem #12. I'm still really new to Python, so I'm putting my code up here in the hopes that someone will be able to give me some tips for optimizing my code to run faster. It took an incredibly long time to run, so I'm hoping that someone will be able to give me pointers.

If anyone is willing, would you mind looking over my solution and letting me know how I can improve on it, either through optimization or style?

The link to the problem is here: Project Euler #12

run = True
p = 2
numberInUse = 3
def checkDivisors(number):
 numberOfDivisors = 0
 for x in range(1, int(number+1)):
 if number%x ==0:
 numberOfDivisors += 1
 return(numberOfDivisors)
while run == True: 
 if checkDivisors(numberInUse) > 500:
 run = False
 print(numberInUse, "is the answer!")
 else:
 p += 1
 numberInUse += p

Question 2

"If you can't solve it, then you can't solve it!" -- quote Project Euler. If you can already solve it, then you can read the forum thread.

Question 3

First some stylistic stuff

Variable names

According to the PEP 8 style guide, Python variables use snake_case instead of camelCase so, for instance, numberInUse should be number_in_use.

Add a `main`

The equivalent of a main function in Python is:

def main():
 ...
if __name__ == '__main__':
 main()

The reasons for this are described in this answer. You should move your while loop into the main function.

Refactoring the while loop

Don't use a run flag. Just break when you are done:

while True: 
 if checkDivisors(numberInUse) > 500:
 print(numberInUse, "is the answer!")
 break
 else:
 p += 1
 numberInUse += p

Optimizations

There is a lot that can be optimized when it comes to checkDivisors (check_divsors). Recall that a number can be factorized as a product of prime numbers:

$$p_1^{k_1}p_2^{k_2} \ldots p_n^{k_n}$$

The number of factors then becomes:

$$(k_1 + 1)(k_2 + 1)\cdots(k_n + 1)$$

So this is the idea:

Keep track of the factors in some sort of counter variable. (Initialize it at 1)
Create a variable i = 2.
Keep incrementing i until number is equal to 1.
If i divides number, divide number by i. Don't increment and keep track of how many times you divided number by i call this div.
Otherwise, update counter by multiplying couneter with div + 1. Increment i.

So let's see what this does with 6:

2 divides 6 so now assign number to 6/2 = 3 and update div = 1.
Update counter by multiplying it by div + 1 so counter = 1*(1+1) = 2 and increment i.
3 divides 3 so now assign number to 3/3 = 1 and update div = 1.
Update counter by multiplying it by div + 1 so counter = 2*(1+1) = 4 and increment i.
number = 1 so terminate the algorithm. Return counter = 4 which is the number of divisors for 6.

I will leave it to you to implement the algorithm as an exercise.

Question 4

Thank you so much for your help! I was able to easily implement this solution, and it ran in a very short amount of time. I really appreciate you helping me out!

Question 5

@user9207170: How does your code look like now? You can write an answer to yourself to show it to others, without modifying your question.

Question 6

Project Euler problems are more about math than programming.

You must prove (or at least observe) the following:

An nth triangular number is \$\dfrac{n(n+1)}{2}\$. One of the numerator's factors is obviously even.
The number_of_divisors is a multiplicative function (known as \$\sigma\$), meaning that if \$m\$ and \$n\$ are coprime, then \$\sigma(mn) = \sigma(m)\cdot\sigma(n)\$
\$n\$ and \$n+1\$ are necessarily coprime.
Once you computed \$\sigma(n)\$ and \$\sigma(n+1)\$ on a certain iteration, on the next iteration you'd only need to compute \$\sigma(n+2)\$. Using memoization, you may even avoid this when \$n+2\$ is even.

This should be enough to build an efficient solution.

Question 7

Very nice. It works fine, and is 3 times faster than my naive approach. See here.

Question 8

That's the spirit :) +1

Question 9

First things first...

Pep8

Python has a strong idea of how the code should be styled, and it is expressed in pep8.

I suggest you get a style/lint checker. I use the pycharm ide which will show you style and compile issues right in the editor.

This is important when you are sharing code with others. (Like maybe on Code Review)

Comprehensions are our friend

Your code has a function:

def checkDivisors(number):
 numberOfDivisors = 0
 for x in range(1, int(number+1)):
 if number%x ==0:
 numberOfDivisors += 1
 return(numberOfDivisors)

To greatly speed things up, we can convert to a comprehension, and change the loop to terminate earlier.

number_of_divisors = len({
 factor for i in range(1, int(number_in_use ** 0.5) + 1) if
 number_in_use % i == 0 for factor in [i, number_in_use // i]})

How does this work

This loop is similar to yours, but it only loops up to the square root of the number we are getting divisors for, and then when it finds a divisor, also keeps the number divided by the divisor as a divisor.

If only one exit from permaloop, use `break`, not boolean flag

Instead of:

while run == True: 
 if ready_to_finsh:
 run = False

use:

while True:
 ....
 if ready_to_finish:
 break

This saves the need for the intermediate variable

Resulting Code

from math import sqrt
number_of_divisors, p, number_in_use = 0, 2, 3
while True:
 number_of_divisors = len({
 factor for i in range(1, int(sqrt(number_in_use)) + 1) if
 number_in_use % i == 0 for factor in [i, number_in_use // i]})
 if number_of_divisors > finish:
 break
 p += 1
 number_in_use += p

Question 10

Theory

Calculate prime factors

For a semi-naive approach, you could calculate the prime factors of n first:

from collections import Counter
from math import floor
def factors(n, found=Counter(), start=2):
 if not found:
 found = Counter()
 for i in range(start, floor(n**0.5)+1):
 if n % i == 0:
 found[i] += 1
 return factors(n // i, found, start=i)
 found[n] += 1
 return found

The function returns a Counter, with prime factors as keys and exponent as values:

print(factors(1))
# Counter({1: 1})
print(factors(7))
# Counter({7: 1})
print(factors(24))
# Counter({2: 3, 3: 1})
print(factors(25))
# Counter({5: 2})
print(factors(35))
# Counter({5: 1, 7: 1})
print(factors(1024))
# Counter({2: 10})

Calculate number of divisors

You can then use this counter to calculate the number of divisors:

def number_of_divisors(n):
 if n == 1:
 return 1 
 result = 1 
 for exponent in factors(n).values():
 result *= exponent + 1
 return result

Example:

print(number_of_divisors(60))
# 12

Solution

You simply need to iterate over n until the n-th triangle number has more than 500 divisors:

from itertools import count
for n in count():
 triangle_number = n * (n + 1) // 2
 divisors_count = number_of_divisors(triangle_number)
 if divisors_count > 500:
 print("The %d-th triangle number is %d and has %d divisors." %
 (n, triangle_number, divisors_count))
 break

It returns a solution in less than 300ms on my slowish computer:

The 12375-th triangle number is 76576500 and has 576 divisors.

Optimization

Since factors returns a Counter, you can use the fact that factors(n1*n2) is factors(n1) + factors(n2) for any integers n1, n2 larger than 1.
You could cache the result of factors(n+1), and use it as factors(n) during the next iteration.
@vnp has written an excellent answer with further optimization.

The complete code becomes:

from collections import Counter
from math import floor
from functools import lru_cache
from itertools import count
def factors(n, found=Counter(), start=2):
 if not found:
 found = Counter()
 for i in range(start, floor(n**0.5) + 1):
 if n % i == 0:
 found[i] += 1
 return factors(n // i, found, start=i)
 found[n] += 1
 return found
@lru_cache(maxsize=1024)
def number_of_divisors(n):
 if n == 1:
 return 1
 result = 1
 for exponent in factors(n).values():
 result *= exponent + 1
 return result
for n in count():
 if n % 2 == 0:
 even = n
 odd = n + 1
 else:
 odd = n
 even = n + 1
 divisors_count = number_of_divisors(even // 2) * number_of_divisors(odd)
 if divisors_count > 500:
 print("The %d-th triangle number is %d and has %d divisors." %
 (n, even * odd // 2, divisors_count))
 break

It is 3 times faster than the previous solution:

The 12375-th triangle number is 76576500 and has 576 divisors.

Just for fun, it also finds the solution for 5000+ divisors in less than a minute:

The 2203200-th triangle number is 2427046221600 and has 5760 divisors.

Question 11

Your checkDivisors function is incredibly inefficient; you can calculate the number of divisors from the prime factorization rather than checking whether each number is a factor (and even if you were brute forcing, you could improve by a factor of two by realizing that the only divisor greater than half the number is the number itself). Consider the number 24 = 2^3 * 3. Since the only prime factors of 24 are 2 and 3, a factor of 24 can't have any prime factors other than 2 or 3. So a factor of 24 will be in the form of n = 2^a * 3^b. But if a>3, then n will not divide 24. So the allowable values of a are 0,1,2, and 3. Notice that there are four allowable values of a, and this is one more than the power of 2 (the power of 2 in 24 is 3). Similarly, there two allowable values of b. So the total number of combinations of a and b is 4*2 = 8. Thus, there are 8 factors of 24: 1,2,3,4,6,8,12,24.

So you should break checkDivisors into two functions. One should find the prime factorization of n, and the other should multiply together one more than the powers of the prime factorization. You can improve the prime factorization function by maintaining a dictionary of numbers that you've factored. Then once you've found a prime factor p of n, you can use the prime factorization of n/p. And since prime factorization uses prime numbers, you should keep track of what numbers are prime.

You can also use the fact that the nth triangle number is equal to (n+1)n/2, so you just have to check the number of factors of those two numbers. I recommend checking two numbers at a time, so you don't have to deal with checking which number is even. That is, if you check 3 and 4 in the first iteration, then 5 and 6, etc., then you know that you can divide the second number by two.

Also, you can get rid of the "magic number" of 500 by setting a constant target_factors.

from functools import reduce
from operator import mul
target_factors = 500
prime_numbers = [2,3]
factorizations = {2:{2:1},3:{3:1}} 
def number_of_factors(prime_factorization): 
 return(reduce(mul, [power+1 for power in prime_factorization.values()], 1))
def factor(n):
 if n in factorizations.keys():
 return(factorizations[n].copy())
 for prime in prime_numbers:
 if not n%prime: #n%p give the remainder of n/p, so if there's no remainder, that means p is a factor of n
 factorization = factorizations[n/prime].copy()
 factorization[prime] = factorization.get(prime,0)+1 
 factorizations[n] = factorization
 break
 if prime**2 > n:
 prime_numbers.append(n)
 factorizations[n] = {n:1}
 break
 return(factorizations[n].copy()) 
n = 4 
while True:
 previous_factorization = factor(n-1)
 previous_num_of_factors = number_of_factors(previous_factorization)
 current_factorization = factor(n)
 current_factorization[2] -=1 #dividing by 2 is the same as subtracting one from the power of 2
 current_num_of_factors = number_of_factors(current_factorization)
 if previous_num_of_factors*current_num_of_factors > target_factors:
 print('The {}th triangle number has {} factors'.format(n-1,previous_num_of_factors*current_num_of_factors))
 break
 next_factorization = factor(n+1)
 next_num_of_factors = number_of_factors(next_factorization)
 if next_num_of_factors*current_num_of_factors > target_factors:
 print('The {}th triangle number has {} factors'.format(n,next_num_of_factors*current_num_of_factors))
 break
 n+=2

I found that the time taken varied a bit, which is odd since this should be deterministic, but on average this took about half as much time as @Eric Duminil 's solution.

Dair Dair 6,2201 gold badge21 silver badges45 bronze badges · Accepted Answer · 2018-01-18 05:40:02Z

First some stylistic stuff

Variable names

According to the PEP 8 style guide, Python variables use snake_case instead of camelCase so, for instance, numberInUse should be number_in_use.

Add a `main`

The equivalent of a main function in Python is:

def main():
 ...
if __name__ == '__main__':
 main()

The reasons for this are described in this answer. You should move your while loop into the main function.

Refactoring the while loop

Don't use a run flag. Just break when you are done:

while True: 
 if checkDivisors(numberInUse) > 500:
 print(numberInUse, "is the answer!")
 break
 else:
 p += 1
 numberInUse += p

Optimizations

There is a lot that can be optimized when it comes to checkDivisors (check_divsors). Recall that a number can be factorized as a product of prime numbers:

$$p_1^{k_1}p_2^{k_2} \ldots p_n^{k_n}$$

The number of factors then becomes:

$$(k_1 + 1)(k_2 + 1)\cdots(k_n + 1)$$

So this is the idea:

Keep track of the factors in some sort of counter variable. (Initialize it at 1)
Create a variable i = 2.
Keep incrementing i until number is equal to 1.
If i divides number, divide number by i. Don't increment and keep track of how many times you divided number by i call this div.
Otherwise, update counter by multiplying couneter with div + 1. Increment i.

So let's see what this does with 6:

2 divides 6 so now assign number to 6/2 = 3 and update div = 1.
Update counter by multiplying it by div + 1 so counter = 1*(1+1) = 2 and increment i.
3 divides 3 so now assign number to 3/3 = 1 and update div = 1.
Update counter by multiplying it by div + 1 so counter = 2*(1+1) = 4 and increment i.
number = 1 so terminate the algorithm. Return counter = 4 which is the number of divisors for 6.

I will leave it to you to implement the algorithm as an exercise.

Thank you so much for your help! I was able to easily implement this solution, and it ran in a very short amount of time. I really appreciate you helping me out!
@user9207170: How does your code look like now? You can write an answer to yourself to show it to others, without modifying your question.

Stack Exchange Network

Optimizing Project Euler #12 (Python 3)

5 Answers 5

First some stylistic stuff

Variable names

Add a `main`

Refactoring the while loop

Optimizations

Pep8

Comprehensions are our friend

How does this work

If only one exit from permaloop, use `break`, not boolean flag

Resulting Code

Theory

Calculate prime factors

Calculate number of divisors

Solution

Optimization

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Hot Network Questions

Optimizing Project Euler #12 (Python 3)

5 Answers 5

First some stylistic stuff

Variable names

Add a __main__

Refactoring the while loop

Optimizations

Pep8

Comprehensions are our friend

How does this work

If only one exit from permaloop, use break, not boolean flag

Resulting Code

Theory

Calculate prime factors

Calculate number of divisors

Solution

Optimization

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Related

Hot Network Questions

Add a `main`

If only one exit from permaloop, use `break`, not boolean flag