Python generator for "barely obtuse" triangles (Project Euler 224)

Question 1

I am trying to solve Project Euler Problem 224:

Let us call an integer sided triangle with sides a ≤ b ≤ c barely obtuse if the sides satisfy a² + b² = c² - 1.

How many barely obtuse triangles are there with perimeter ≤ 75,000,000?

For large v_max and v_max_n, the generator below yields very slow when I have 750000000 and 500000000, respectively. Having tried for days, I am still looking for an algorithm that can do this in under a minute.

from itertools import product
import math
n = 15 * 10 ** 8
v_max = math.floor(n / 2 - 1 / (2 * n))
v_max_n = n // 3
(i for i in product(range(2, v_max_n + 1, 2), 
 range(2, v_max + 1, 2), 
 range(3, v_max + 1)) 
 if i[0] <= i[1] and i[1] <= i[2] and sum(i) <= n and
 i[0] ** 2 + i[1] ** 2 == i[2] ** 2 - 1)

I have tried using it as map(lambda x: x, gen), nested for loops, and have generally been racking my brain to no avail.

I tried re-writing b, c as expressions. This tends to be quick but misses some of the triples.

n = 12
v_max = math.floor(n / 2 - 1 / (2 * n))
v_max_n = n // 3
j = 0 # tested this with low n and compared: 12, 150 to see 
for a in range(1, v_max_n + 1, 1):
 b = (n ** 2 - 2 * n * a - 1) / (2 * (n - a))
 c = (a ** 2 + b ** 2 + 1) ** 0.5
 if c % 1 == 0 and b % 1 == 0 and a + int(b) + int(c) <= n:
 # print(a, int(b), int(c))
 j += 1
print(j)

Question 2

For me this does not even fit into memory (16 GB). Which is weird, since product should not make any intermediate lists and neither should range...

Question 3

@Graipher you should just get a generator object when you initialize the generator which shouldn't use any memory

Question 4

@Graipher sum(i) <= n. Added the variable definitions.

Question 5

It would probably help to tell what problem you are actually trying to achieve.

Question 6

A running time "under a minute" (or in a few minutes, but somewhere in that ballpark) is certainly possible, but not using a brute-force-search over (10^6)^3 possibilities.

Question 7

It seems like you want to find a solution for a2+b2=c2-1 with some added constraints, like a<=b<=c, a+b+c<=n, etc. by trying all possible combinations of a, b, and c and testing whether they satisfy all your conditions at once. This is pretty wasteful.

Instead of itertools.product, you can use a generator expression with three regular for loops. This has several advantages:

you could move some of the if clauses up after the first or second loop, i.e. you don't have to run the other loops if those conditions already fail at this point, e.g.
```
 for x in ... if test(x) for y in ... if test(y) ...
```
you can use the values of your previous variables in the lower and upper bounds for the later variables (in your case this replaces most of your if conditions)
you can basically calculate the value for c from a and b without testing all possible values

This is how I'd do it:

gen = ((a, b, int(c))
 for a in range(2, v_max_n + 1, 2)
 for b in range(a, min(v_max + 1, n - a - a), 2) # use a for lower and upper bounds
 for c in [(a ** 2 + b ** 2 + 1)**0.5] # just a single candidate for c
 if c % 1 == 0) # whole-numbered c found?

Note that the calculation of c using very_large_number**0.5 might be imprecise with float; using decimal might work, though. However, even with those optimizations, testing much fewer values than your original loop (on the order of O(n2) instead of O(n3)), it might not be feasible to find solution for large values of a, b and c.

Also note that I did not thoroughly test this for off-by-one errors, since performance was the main concern.

Another thing that you might try: Invert the loops for a and for b, i.e. instead of testing all b for each a, no matter how large the difference, test all a up to b first. This does not decrease the number of combinations to test, but it seems to yield much more valid combinations in the "smaller" number much faster. Like this (upper bound for b can probably be reduced):

 for b in range(2, v_max + 1, 2)
 for a in range(2, b+1, 2)

Question 8

It is feasible to find solutions for large N people have solved it with many languages just not an easy task. projecteuler.net/problem=224

Question 9

@dustin Surely it is possible, but certainly not using three nested loops with ~10**18 combinations. I don't know what those folks were using, but probably some high-level fancy math.

Question 10

I tried all kinds of equation re-written and have like 10 code snippets. This generator one was over 1000x better at lower n but just couldn't hang as n increased.

Question 11

I think you need to look at the math more. One thing that will help is that you can put upper bounds on how high to search for one of the numbers based on the gaps between consecutive squares.

Question 12

yeah I don't disagree. I'm seeing if I can get an answer up that uses this now.

Question 13

v_max_n = n // 3

The name of this variable is a mystery to me, but you seem to be using it as an upper bound on \$a\$. You can get a notably better upper bound on \$a\$ very easily: since \$a \le b\$ and \$c^2 = a^2 + b^2 + 1\$, \$a + b + c \ge 2a + \sqrt{2a^2 + 1} > 2a + \sqrt{2a^2} = (2 + \sqrt 2)a\$. Then your upper bound on \$a\$ can be reduced by 12.13%.

Given a value of \$a\$, the constraints on \$b\$ are \$a \le b\$ and \$a + b + \sqrt{a^2+b^2+1} \le n\$. Rearrange and square: \$a^2 + b^2 + 1 \le n^2 + a^2 + b^2 - 2an +2ab -2bn\$ or \$b \le \frac{n^2 - 2an - 1}{2n - 2a}\$. I'm not sure why you appear to be using this bound as a forced value in the code you added to the question.

$$\sum_{a=1}^{\frac{n}{2 + \sqrt 2}} \left(\frac{n^2 - 2an - 1}{2n - 2a} - a\right) = \left(\sum_{a=1}^{\frac{n}{2 + \sqrt 2}} \frac{n^2 - 2an - 1}{2n - 2a}\right) - \frac{n(n + 2 + \sqrt 2)}{2(2 + \sqrt 2)^2} \\ = \frac{n^2 + 1}2 \left(\psi\left(\frac{n}{2 + \sqrt 2}-n+1\right) - \psi(1-n)\right) + \frac{n^2}{2 + \sqrt 2} - \frac{n(n + 2 + \sqrt 2)}{2(2 + \sqrt 2)^2} \\ = \frac{n^2 + 1}2 \left(\psi\left(1-\frac{1+\sqrt2}{2 + \sqrt 2}n\right) - \psi(1-n)\right) + \frac{(3 + 2\sqrt 2)n^2 + (2 + \sqrt 2)n}{2(2 + \sqrt 2)^2} \\ > 0.9n^2 $$ The step from the sum to an expression in terms of the digamma function is courtesy of Wolfram Alpha.

As a rule of thumb, you don't want to be repeating a loop body \10ドル^{12}\$ times, so looping over all values of \$a\$ and all values of \$b\$ is a non-starter.

Hint: try rearranging \$a^2 + b^2 = c^2 - 1\$.

Question 14

One improvement over @tobias_k s answer is to add the extra upper bound on b of b<1 + a*a//2. This is valid because after that, the distance between b^2 and (b+1)^2 will be bigger than a^2, so c^2 does not exist. This provides a massive speedup at least a first.

Question 15

Did you actually test this? Try 10 ** 3. On my machine, it has poorer performance

Question 16

Nice. That's a massive speedup, but it still takes ages until a gets anywhere close to it's upper bound. (Did not thorougly check whether this actually does not skip any values, though)

Question 17

@dustin How did you add that "extra check"? I added it to the min; did you do something else?

Question 18

@tobias_k I am using timeit for this edit. I added to the min as you say and your solution is still faster without it by 2.3ns and a smaller std. Also, beyond 10 ** 3, it is still going over a minute and I end up stoping the interrupt since it is running very long.

Question 19

@dustin "faster by 2.3 ns" is nothing. I did not test it, but maybe the condition is not triggered very often for small n so the additioal check adds a tiny amount of computational overhead without much use. How are you printing your results? Are you getting all the valid results first, or printing them as they come? As I said, waiting for all results still takes ages (after some time, a barely climbed into the lower 1000s) , but the individual results come in much faster.

tobias_k tobias_k 1,69914 silver badges18 bronze badges · Answer 1 · 2018-12-17 14:51:20Z

It seems like you want to find a solution for a2+b2=c2-1 with some added constraints, like a<=b<=c, a+b+c<=n, etc. by trying all possible combinations of a, b, and c and testing whether they satisfy all your conditions at once. This is pretty wasteful.

Instead of itertools.product, you can use a generator expression with three regular for loops. This has several advantages:

you could move some of the if clauses up after the first or second loop, i.e. you don't have to run the other loops if those conditions already fail at this point, e.g.
```
 for x in ... if test(x) for y in ... if test(y) ...
```
you can use the values of your previous variables in the lower and upper bounds for the later variables (in your case this replaces most of your if conditions)
you can basically calculate the value for c from a and b without testing all possible values

This is how I'd do it:

gen = ((a, b, int(c))
 for a in range(2, v_max_n + 1, 2)
 for b in range(a, min(v_max + 1, n - a - a), 2) # use a for lower and upper bounds
 for c in [(a ** 2 + b ** 2 + 1)**0.5] # just a single candidate for c
 if c % 1 == 0) # whole-numbered c found?

Note that the calculation of c using very_large_number**0.5 might be imprecise with float; using decimal might work, though. However, even with those optimizations, testing much fewer values than your original loop (on the order of O(n2) instead of O(n3)), it might not be feasible to find solution for large values of a, b and c.

Also note that I did not thoroughly test this for off-by-one errors, since performance was the main concern.

Another thing that you might try: Invert the loops for a and for b, i.e. instead of testing all b for each a, no matter how large the difference, test all a up to b first. This does not decrease the number of combinations to test, but it seems to yield much more valid combinations in the "smaller" number much faster. Like this (upper bound for b can probably be reduced):

 for b in range(2, v_max + 1, 2)
 for a in range(2, b+1, 2)

It is feasible to find solutions for large N people have solved it with many languages just not an easy task. projecteuler.net/problem=224
@dustin Surely it is possible, but certainly not using three nested loops with ~10**18 combinations. I don't know what those folks were using, but probably some high-level fancy math.
I tried all kinds of equation re-written and have like 10 code snippets. This generator one was over 1000x better at lower n but just couldn't hang as n increased.
I think you need to look at the math more. One thing that will help is that you can put upper bounds on how high to search for one of the numbers based on the gaps between consecutive squares.
yeah I don't disagree. I'm seeing if I can get an answer up that uses this now.

Peter Taylor Peter Taylor 24.4k1 gold badge49 silver badges94 bronze badges · Answer 2 · 2018-12-18 10:15:17Z

v_max_n = n // 3

The name of this variable is a mystery to me, but you seem to be using it as an upper bound on \$a\$. You can get a notably better upper bound on \$a\$ very easily: since \$a \le b\$ and \$c^2 = a^2 + b^2 + 1\$, \$a + b + c \ge 2a + \sqrt{2a^2 + 1} > 2a + \sqrt{2a^2} = (2 + \sqrt 2)a\$. Then your upper bound on \$a\$ can be reduced by 12.13%.

Given a value of \$a\$, the constraints on \$b\$ are \$a \le b\$ and \$a + b + \sqrt{a^2+b^2+1} \le n\$. Rearrange and square: \$a^2 + b^2 + 1 \le n^2 + a^2 + b^2 - 2an +2ab -2bn\$ or \$b \le \frac{n^2 - 2an - 1}{2n - 2a}\$. I'm not sure why you appear to be using this bound as a forced value in the code you added to the question.

$$\sum_{a=1}^{\frac{n}{2 + \sqrt 2}} \left(\frac{n^2 - 2an - 1}{2n - 2a} - a\right) = \left(\sum_{a=1}^{\frac{n}{2 + \sqrt 2}} \frac{n^2 - 2an - 1}{2n - 2a}\right) - \frac{n(n + 2 + \sqrt 2)}{2(2 + \sqrt 2)^2} \\ = \frac{n^2 + 1}2 \left(\psi\left(\frac{n}{2 + \sqrt 2}-n+1\right) - \psi(1-n)\right) + \frac{n^2}{2 + \sqrt 2} - \frac{n(n + 2 + \sqrt 2)}{2(2 + \sqrt 2)^2} \\ = \frac{n^2 + 1}2 \left(\psi\left(1-\frac{1+\sqrt2}{2 + \sqrt 2}n\right) - \psi(1-n)\right) + \frac{(3 + 2\sqrt 2)n^2 + (2 + \sqrt 2)n}{2(2 + \sqrt 2)^2} \\ > 0.9n^2 $$ The step from the sum to an expression in terms of the digamma function is courtesy of Wolfram Alpha.

As a rule of thumb, you don't want to be repeating a loop body \10ドル^{12}\$ times, so looping over all values of \$a\$ and all values of \$b\$ is a non-starter.

Hint: try rearranging \$a^2 + b^2 = c^2 - 1\$.

Oscar Smith Oscar Smith 3,71718 silver badges31 bronze badges · Answer 3 · 2018-12-17 16:08:23Z

1

\$\begingroup\$

One improvement over @tobias_k s answer is to add the extra upper bound on b of b<1 + a*a//2. This is valid because after that, the distance between b^2 and (b+1)^2 will be bigger than a^2, so c^2 does not exist. This provides a massive speedup at least a first.

Share

answered Dec 17, 2018 at 16:08

Oscar Smith's user avatar

Oscar Smith Oscar Smith

3,71718 silver badges31 bronze badges

\$\endgroup\$

8

\$\begingroup\$ Did you actually test this? Try 10 ** 3. On my machine, it has poorer performance \$\endgroup\$

dustin
– dustin

2018年12月17日 16:17:27 +00:00
Commented Dec 17, 2018 at 16:17
\$\begingroup\$ Nice. That's a massive speedup, but it still takes ages until a gets anywhere close to it's upper bound. (Did not thorougly check whether this actually does not skip any values, though) \$\endgroup\$

tobias_k
– tobias_k

2018年12月17日 16:23:12 +00:00
Commented Dec 17, 2018 at 16:23
\$\begingroup\$ @dustin How did you add that "extra check"? I added it to the min; did you do something else? \$\endgroup\$

tobias_k
– tobias_k

2018年12月17日 16:26:26 +00:00
Commented Dec 17, 2018 at 16:26
\$\begingroup\$ @tobias_k I am using timeit for this edit. I added to the min as you say and your solution is still faster without it by 2.3ns and a smaller std. Also, beyond 10 ** 3, it is still going over a minute and I end up stoping the interrupt since it is running very long. \$\endgroup\$

dustin
– dustin

2018年12月17日 16:35:17 +00:00
Commented Dec 17, 2018 at 16:35
\$\begingroup\$ @dustin "faster by 2.3 ns" is nothing. I did not test it, but maybe the condition is not triggered very often for small n so the additioal check adds a tiny amount of computational overhead without much use. How are you printing your results? Are you getting all the valid results first, or printing them as they come? As I said, waiting for all results still takes ages (after some time, a barely climbed into the lower 1000s) , but the individual results come in much faster. \$\endgroup\$

tobias_k
– tobias_k

2018年12月17日 16:39:05 +00:00
Commented Dec 17, 2018 at 16:39

| Show 3 more comments

Stack Exchange Network

Python generator for "barely obtuse" triangles (Project Euler 224)

3 Answers 3

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Python generator for "barely obtuse" triangles (Project Euler 224)

3 Answers 3

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions