My code in Julia, almost identical to the Python code (see below), runs in 4.6 s while the Python version runs in 2.4 s. Obviously there is a lot or room for improvement.
function Problem12()
#=
The sequence of triangle numbers is generated by adding the natural
numbers. So the 7th triangle number would be:
1 +たす 2 +たす 3 +たす 4 +たす 5 +たす 6 +たす 7 =わ 28.
The first ten terms would be:
1, 3, 6, 10, 15, 21, 28, 36, 45, 55, ...
Let us list the factors of the first seven triangle numbers:
1: 1
3: 1,3
6: 1,2,3,6
10: 1,2,5,10
15: 1,3,5,15
21: 1,3,7,21
28: 1,2,4,7,14,28
We can see that 28 is the first triangle number to have over five divisors.
What is the value of the first triangle number to have over five hundred
divisors?
=#
function num_divisors(n)
res = floor(sqrt(n))
divs = []
for i in 1:res
if n%i == 0
append!(divs,i)
end
end
if res^2 == n
pop!(divs)
end
return 2*length(divs)
end
triangle = 0
for i in Iterators.countfrom(1)
triangle += i
if num_divisors(triangle) > 500
return string(triangle)
end
end
end
Python version below:
import itertools
from math import sqrt, floor
# Returns the number of integers in the range [1, n] that divide n.
def num_divisors(n):
end = floor(sqrt(n))
divs = []
for i in range(1, end + 1):
if n % i == 0:
divs.append(i)
if end**2 == n:
divs.pop()
return 2*len(divs)
def compute():
triangle = 0
for i in itertools.count(1):
# This is the ith triangle number, i.e. num = 1 + 2 + ... + i =
# = i*(i+1)/2
triangle += i
if num_divisors(triangle) > 500:
return str(triangle)
3 Answers 3
The performance discrepancy in your Julia code comes from
res = floor(sqrt(n)) # is a Float64
which is a floating-point number (specifically, Float64) due to sqrt
. This causes the loop variable i
to be a Float64 as well, leading to a massive slowdown when you test for divisibility via taking the modulus (if n%i == 0
) because Euclidean division (i.e., mod
/rem
and div
/fld
/cld
) is very slow for floats. It is much faster for integers.
Furthermore, this does not always give the mathematically correct answer due to the inherently limited precision of floats. For example,
julia> Int(floor(sqrt(67108865^2 - 1)))
67108865
Instead, you should call isqrt
to take the integer square root
res = isqrt(n) # is an Int
which also gives the exact answer
julia> isqrt(67108865^2 - 1)
67108864
Avoid Int(floor(sqrt(n)))
and floor(Int, sqrt(n))
due to the limited precision outlined above.
-
\$\begingroup\$ The biggest performance issue is not isqrt vs sqrt, it's using trial division instead of a better factorisation method. \$\endgroup\$Emily L.– Emily L.2021年07月24日 12:42:15 +00:00Commented Jul 24, 2021 at 12:42
-
1\$\begingroup\$ @EmilyL. Thanks, I've edited my answer to clarify what it answers. Note that the OP's question is why their translated Julia code is slower than their Python code. The answer as explained is that the inner loop in the Julia code is taking floating-point modulus (via
libc
'sfmod
) whereas the Python code is taking integer modulus, ultimately becausefloor(sqrt(n))
is a float in Julia. The solution is to useisqrt(n)
. \$\endgroup\$Vincent Yu– Vincent Yu2021年07月24日 13:56:30 +00:00Commented Jul 24, 2021 at 13:56
You're applying a brute force solution by counting all the divisors through trial division. This is excessively slow and we can do better.
First, realize that all integers can be written as a product of prime numbers (the Fundamental theorem of arithmetic). Equivalently it can be written as a list of powers of the prime numbers.
I.e: any integer \$n\$ can be written as \$n = \prod p_i^{k_i}\$ where \$p_i\$ is a prime number and \$k_i\$ is an integer.
Note further that a "triangle number" is just the arithmetic sum from 1 to n.
So we're looking for \$n\$ such that \$n(n+1)/2 = \prod p_i^{k_i}\$ under the condition that \$\prod k_i+1 > 500\$. Note here that product of powers plus one computes the number of ways you can pick zero (the +1) to all of the powers of a prime to contribute to a number, which is the number of divisors you can have.
Your program would then be something like this:
- Generate large list of primes using a prime sieve or hard code.
- For each n starting at 1, compute the "triangle number", do prime factorisation using the list of primes; this is our speed-up - we only need to try dividing by known primes. Then count the prime powers, compute their product to get the divisors and check if larger than 500.
This will be faster than OP's code because we only test division with known primes rather than with all integers.
You might be able to do some fancy math to determine a better starting point than n=1 in step 2 by using the expressions above, but that's left as an academic exercise for the reader.
-
\$\begingroup\$ Using some of these tricks (and a few more, mainly that
gcd(n, n+1)==1)
which allows you to factor the triangular number by factoringn
andn+1
), I have the time down to .5ms forProblem12(500)
(or half a second forProblem12(5000)
` using Primes function Problem12(n) n < 2 && return 1 prev_num_fact = 1 for i in Iterators.countfrom(3) i = iseven(i) ? i>>1 : i num_fact = prod(e+1 for (p, e) in eachfactor(i)) if num_fact*prev_num_fact > n return i-1 end prev_num_fact = num_fact end end ` \$\endgroup\$Oscar Smith– Oscar Smith2023年10月09日 04:25:06 +00:00Commented Oct 9, 2023 at 4:25
You need the number of divisors, not divisors. So you don't need to gather all divisors, just remember the count. Increase it by 1 instead of push!/append, decrease instead of pop/
-
\$\begingroup\$ Thx for your answer! Although your suggestions do help in terms of allocations, they don't help in terms of performance :( \$\endgroup\$Maxwell's Daemon– Maxwell's Daemon2021年07月22日 08:31:57 +00:00Commented Jul 22, 2021 at 8:31
res
being a floating-point number. Replaceres = floor(sqrt(n))
byres = isqrt(n)
and you should get an order of magnitude improvement in speed. I'll write up an answer that includes this. \$\endgroup\$