Consider the following function that implements optimised O(log n) exponentiation by squaring:
#include <cstdint> // uintmax
// log-n optimised integer power function. Computes x to the power of y
constexpr std::uintmax_t pow(std::uintmax_t x, std::uintmax_t y) {
// base cases for efficiency and guaranteed termination
switch (y) {
case 0:
return 1;
case 1:
return x;
case 2:
return x * x;
case 3:
return x * x * x;
}
// OTHERWISE:
std::uintmax_t square_root = pow(x, y / 2);
// otherwise, work out if y is a multiple of 2 or not
if (y % 2 == 0) {
return square_root * square_root;
} else {
return square_root * square_root * x;
}
}
The base cases switch could be rewritten to take advantage of deliberate fallthrough between the various cases to "aggregate" the exponentiation of x from the 0th up to the 3rd power:
// log-n optimised integer power function. Computes x to the power of y
constexpr std::uintmax_t pow(std::uintmax_t x, std::uintmax_t y) {
// base cases for efficiency and guaranteed termination
std::uintmax_t result = 1;
switch (y) {
case 3:
result *= x;
[[fallthrough]];
case 2:
result *= x;
[[fallthrough]];
case 1:
result *= x;
[[fallthrough]];
case 0:
return result;
}
// OTHERWISE:
std::uintmax_t square_root = pow(x, y / 2);
// otherwise, work out if y is a multiple of 2 or not
if (y % 2 == 0) {
return square_root * square_root;
} else {
return square_root * square_root * x;
}
}
Now, I am aware that the general consensus is that such use of fallthrough in a switch-case is seen as a cardinal sin. I am also aware that the function would work if base cases were provided for just the zeroth and first power (or even just for the zeroth!). The reason I provide base cases up to the third is that it feels just a bit of a waste of a function call to recur into to calculate such simple powers, although perhaps to the third power is a bit overkill..?
So my question is primarily:
- Is such use of deliberate fallthrough justifiable here, or could it be justifiable for a similarly-structured example that's a bit more elaborate than chaining multiplication but which still has the commutative chaining as a property?
Secondarily:
- What do you think about the number of base cases provided here? Is this astute use of deliberately avoiding recursive calls for such simple cases, or is it premature optimisation?
2 Answers 2
Firstly, I wouldn't call the function pow
- that's confusingly similar to std::pow
, which can be invoked with the same arguments.
In both cases, I expect the usual (iterative) binary exponentiation to be both faster and easier for readers to follow than this recursive implementation. Given that the question tags mention performance is important, I encourage you to benchmark both recursive and iterative functions.
Here's a simple implementation of the iterative method:
#include <concepts>
template<typename T>
constexpr T unit_value = 1;
// Return xn
// If calculation overflows, behaviour may be undefined!
template<typename T>
constexpr auto ipow(const T x, std::unsigned_integral auto n)
requires requires(T t) { t *= t; }
{
auto result = unit_value<T>;
for (auto y = x; true; y *= y) {
if (n % 2 == 1) { result *= y; }
if ((n /= 2) == 0) { break; }
}
return result;
}
The infinite loop with break
is used rather than testing n > 0
in the for
condition to avoid an unnecessary y *= y
in the last iteration.
I provided the unit_value
template so that it can be used with non-arithmetic types (complex numbers, square matrices, etc) by specialising that value with the appropriate multiplicative identity. For example:
#include <complex>
template<typename T>
constexpr std::complex<T> unit_value<std::complex<T>> = {1, 0};
(This one isn't strictly needed, since there's implicit conversion from T
to complex<T>
, but it demonstrates the point).
-
\$\begingroup\$ Thanks, I like this implementation better than the two I offered. Shorter and easier to follow. The
unit_value
is a neat trick too —I've sometimes needed an overloadable "0" or "1" value for arbitrary numeric-like types before. I can't believe I missed the trick of doing a for-loop with division for the loop-advance clause! \$\endgroup\$saxbophone– saxbophone2022年11月11日 09:53:25 +00:00Commented Nov 11, 2022 at 9:53 -
1\$\begingroup\$ I've been a bit unconventional with the
for
loop by declaringy
but usingn
as control variable - some reviewers will object to that, but it's easily changed to an equivalent that should compile to the same object code. \$\endgroup\$Toby Speight– Toby Speight2022年11月11日 10:34:59 +00:00Commented Nov 11, 2022 at 10:34 -
1\$\begingroup\$ Nice use of
std::unsigned_integral
too btw. I like using concepts but forgot they can be used in this way also. \$\endgroup\$saxbophone– saxbophone2022年11月11日 13:56:30 +00:00Commented Nov 11, 2022 at 13:56 -
2\$\begingroup\$ Good point about break before squaring
y
- this is one case where the loop fits shell syntax better than C++ (because shell permits multiple commands in both the condition and the body). \$\endgroup\$Toby Speight– Toby Speight2022年11月15日 07:54:20 +00:00Commented Nov 15, 2022 at 7:54 -
2\$\begingroup\$ @Matthieu, I updated to avoid the final multiplication (I'm not sure whether a compiler could optimise that away; I suspect it depends on how
*=
is defined for typeT
, and whether it can prove there are no side-effects). \$\endgroup\$Toby Speight– Toby Speight2022年11月15日 08:02:52 +00:00Commented Nov 15, 2022 at 8:02
I would indeed not use [[fallthrough]]
here, the version without is clearer in my opinion.
Instead though I would move everything inside the switch
-statement:
switch (y) {
case 0:
return 1;
case 1:
return x;
case 2:
return x * x;
case 3:
return x * x * x;
default:
return pow(x, y / 2) * pow(x, y - y / 2);
}
Also consider making it a template, possibly in combination with concepts to restrict the type of the exponent to unsigned integers:
template <typename T, std::unsigned_integral Exponent>
constexpr T pow(T x, Exponent y) {
...
}
-
1\$\begingroup\$ Thanks, that's a useful point about putting the whole thing in a switch. Does the implementation in the default case that you provide still exhibit O(log n) performance? I was under the impression that one can implement with fewer recursive calls if one computes
pow(x, y / 2)
into a temporary and computes the square of this, multiplied by the remaining product (if any) when odd. I can always make the default a braced-case I suppose \$\endgroup\$saxbophone– saxbophone2022年11月11日 09:11:11 +00:00Commented Nov 11, 2022 at 9:11 -
2\$\begingroup\$ You're right, that might not be the case. I think @TobySpeight's answer is even better, it avoids recursion altogether. \$\endgroup\$G. Sliepen– G. Sliepen2022年11月11日 13:19:19 +00:00Commented Nov 11, 2022 at 13:19
Explore related questions
See similar questions with these tags.