Problem statement: In 1769 Leonhard Euler formulated a generalized version of Fermat's Last Theorem, conjecturing that at least \$n\$ \$\text{n}_{\text{th}}\$ powers are needed to obtain a sum that is itself an \$\text{n}_{\text{th}}\$ power, for \$n > 2\$. Write a program to disprove Euler's conjecture (which stood until 1967). That is, find \$a\$, \$b\$, \$c\$, \$d\$, and \$e\$ such that \$a^5 + b^5 + c^5 + d^5 = e^5\$.
This is one of my self-imposed challenges in Rust to become better at it. The problem was taken from Sedgewick Exercise 1.3.46.
Here is my code:
use clap::Parser;
use std::ops::RangeFrom;
use std::process::exit;
type Euler = Vec<(usize, usize, usize, usize, usize)>;
const VALID_SEARCH_UPPER_BOUNDS: RangeFrom<usize> = 1..;
#[derive(Debug, Parser)]
struct Arguments {
#[arg(index = 1)]
search_upper_bound: usize,
}
fn main() {
let arguments = Arguments::parse();
let search_upper_bound = arguments.search_upper_bound;
let Ok(numbers) = disprove_euler_conjecture(search_upper_bound) else {
eprintln!("Search upper bound must be at least {}.", VALID_SEARCH_UPPER_BOUNDS.start);
exit(1);
};
if numbers.is_empty() {
eprintln!("Search upper bound is too small.")
} else {
for tuple in numbers {
let a = tuple.0;
let b = tuple.1;
let c = tuple.2;
let d = tuple.3;
let e = tuple.4;
println!("{a}5 + {b}5 + {c}5 + {d}5 = {e}5");
}
}
}
fn disprove_euler_conjecture(search_upper_bound: usize) -> Result<Euler, String> {
if !VALID_SEARCH_UPPER_BOUNDS.contains(&search_upper_bound) {
return Err(format!(
"Search upper bound must be at least {}.",
VALID_SEARCH_UPPER_BOUNDS.start,
));
}
let mut numbers: Vec<(usize, usize, usize, usize, usize)> = Vec::new();
for e in 1..search_upper_bound {
let e5 = e.pow(5);
for a in 1..search_upper_bound {
let a5 = a.pow(5);
if a5 > e5 {
break;
}
for b in a..search_upper_bound {
let b5 = b.pow(5);
if a5 + b5 > e5 {
break;
}
for c in b..search_upper_bound {
let c5 = c.pow(5);
if a5 + b5 + c5 > e5 {
break;
}
for d in c..search_upper_bound {
let d5 = d.pow(5);
if a5 + b5 + c5 + d5 > e5 {
break;
}
if a5 + b5 + c5 + d5 == e5 {
numbers.push((a, b, c, d, e));
}
}
}
}
}
}
Ok(numbers)
}
Is there any way that I can improve my code?
2 Answers 2
Another good question, and much of the advice I gave on your previous one applies. I’ll try to repeat myself as little as possible.
The Loops Could be Improved
Currently you have a deeply-nested loop whose layers all have code like
for a in 1..search_upper_bound {
let a5 = a.pow(5);
if a5 > e5 {
break;
}
This performs two checks and conditional branches per iteration, which are slow. A vectorizing or parallelizing compiler probably won’t know what to do with it, either.
It also means that the declared range of the loop is actively misleading, and the actual termination check is a break
inside the loop, where a maintainer might not think to look. It would be better to compute the bounds of the search before starting it, and set the range of the loop to that. Next best would be to set the range to 1,,
, a..
, b..
or something like that, or even use a loop
with no condition, to make it obvious that something within the loop must break
.
Eliminate at Least One Level of the Loop
What jumps out at me is, there’s a valid Diophantine solution for a5 + b5 + c5 + d5 = e5, if and only if, for some a, there is a solution to b5 + c5 + d5 = e5 - a5, there’s a solution to that if and only if, given a and b, there’s a solution for c5 + d5 = e5 - a5 - b5. and at that point, given c, you can directly solve for d.
Consider a Tail-Recursive Implementation
I wouldn’t recommend this as easily in Rust as I would in a language that guaranteed tail-call optimization. However, the four steps of the algorithm I outlined above could be expressed as four helper functions, three of which might be one-liners using flat_map
. That would be easier to read and maintain.
Use Pattern Destructuring
You currently have:
for tuple in numbers {
let a = tuple.0;
let b = tuple.1;
let c = tuple.2;
let d = tuple.3;
let e = tuple.4;
This could be:
for (a,b,c,d,e) in numbers {
Avoid Passing Unrecoverable Logic Errors up the Stack
Currently, you’re returning a Result
that can only fail if the search is called with an incorrect set of bounds. But that is a logic error, not something you want to have to check every call for. It could happen at runtime, if you’re reading the bounds from user input, but in this case, I think I would have disprove_euler_conjecture
just panic if called with invalid inputs. In fact, is there a compelling reason the bound needs to be user-specified at all? The function just searches for a counterexample, and could treat any suggestion you give it about a bound as an advisory hint.
The caller is currently checking for this error anyway (in fact, it completely ignores the Err
value and assumes there’s only one kind of error, an indication that this should actually be an Option
instead of a Result
), and aborting, so there’s no benefit to returning Err
in this program. In fact, that’s worse for debugging, because you’ve partially unwound the stack and lost both the value that caused the bug and the line of code where it was detected! If you had instead called panic!
within disprove_euler_conjecture
, you can often set RUST_BACKTRACE=1
and get a lot of the information reproducing the bug in a debugger would have given you. Although you never want the end user to see a panic, it’s a great way to catch logic errors that should be eliminated.
In the rare situation that you are reading in the bound dynamically at runtime, the caller can check the value.
Consider a More Intuitive Return Type
You currently return a Result
where failure to find a counterexample is represented by Ok (vec![])
. It’s counterintuitive to have to unwrap the return object and then also check that the unwrapped value is valid. You also are searching the entire space first and then finding all counterexamples, rather than stopping when you find the first.
The return type should probably be Option<(usize, usize, usize, usize, usize)>
.
Prefer Iterator Expressions to Appending
There would be some new syntax to learn if you turn the loop into an expression with type impl Iterator<Item = (usize, usize, usize, usize, usize)>
, but that have several advantages.
One is that iterators can evaluate lazily and short-circuit when they find a counterexample. That probably removes the motivation for you to implement this as a function that searches a block and can restart at a certain range. You’re probably only doing it that way because you have to search the whole range before returning any results.
Another is that you can easily take the same code to generate the solutions and have it return a short-circuiting Option
by adding .nth(1)
at the end, or fill any collection with them by adding .collect()
. Or you could call Iterator::for_each
on it, or use it as the range expression of a for
loop, to print each value found immediately and keep searching. It’s extremely flexible!
Another is that you can often get an expression that it’s easier to prove correct than a loop whose termination conditions are buried in a break
statement nested five levels deep.
Mainly, though, mutable variables can have several categories of bugs that immutable state cannot, so you should be getting some significant benefit from mutable data to make up for that.
This is a good scenario for a struct again, which represents our desired states.
On it, we can implement its generation function and implement the Display
trait to be directly able to print it.
As other reviewers have already pointed out, the loops can be optimized as well. Again I'll use chained maps and filters here.
Cargo.toml
[package]
name = "euler"
version = "0.1.0"
edition = "2021"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
clap = { version = "4.3.12", features = ["derive"] }
src/lib.rs
use std::fmt::{Display, Formatter};
pub struct EulerQuintet {
a: u64,
b: u64,
c: u64,
d: u64,
e: u64,
}
impl EulerQuintet {
pub fn generate(limit: u64) -> impl Iterator<Item = Self> {
(1..=limit).flat_map(move |a| {
(a..=limit).flat_map(move |b| {
(b..=limit).flat_map(move |c| {
(c..=limit).flat_map(move |d| {
(d..=limit).filter_map(move |e| {
if a.pow(5) + b.pow(5) + c.pow(5) + d.pow(5) == e.pow(5) {
Some(Self { a, b, c, d, e })
} else {
None
}
})
})
})
})
})
}
}
impl Display for EulerQuintet {
fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
write!(
f,
"{}5 + {}5 + {}5 + {}5 = {}5",
self.a, self.b, self.c, self.d, self.e
)
}
}
src/main.rs
use clap::Parser;
use euler::EulerQuintet;
use std::process::exit;
#[derive(Debug, Parser)]
struct Arguments {
#[arg(index = 1)]
search_upper_bound: u64,
}
fn main() {
let arguments = Arguments::parse();
let euler_quintets: Vec<_> = EulerQuintet::generate(arguments.search_upper_bound).collect();
if euler_quintets.is_empty() {
eprintln!("Upper bound too low.");
exit(1);
} else {
for quintet in euler_quintets {
println!("{quintet}");
}
}
}
Also for numerical work I'd not use the rather magic usize
, but one of the explicit integer sizes (u64
here). usize
is meant to be used when handling memory and addresses, but it's weird to read in algorithms that do higher level math stuff.