Recursively calculating powers by squaring

Question 1

Consider the following function that implements optimised O(log n) exponentiation by squaring:

#include <cstdint> // std::uintmax_t
// log-n optimised integer power function. Computes x to the power of y
constexpr std::uintmax_t pow(std::uintmax_t x, std::uintmax_t y) {
    // base cases for efficiency and guaranteed termination
    switch (y) {
    case 0:
        return 1;
    case 1:
        return x;
    case 2:
        return x * x;
    case 3:
        return x * x * x;
    }
    // OTHERWISE:
    std::uintmax_t square_root = pow(x, y / 2);
    // otherwise, work out if y is a multiple of 2 or not
    if (y % 2 == 0) {
        return square_root * square_root;
    } else {
        return square_root * square_root * x;
    }
}

The base cases switch could be rewritten to take advantage of deliberate fallthrough between the various cases to "aggregate" the exponentiation of x from the 0th up to the 3rd power:

// log-n optimised integer power function. Computes x to the power of y
constexpr std::uintmax_t pow(std::uintmax_t x, std::uintmax_t y) {
    // base cases for efficiency and guaranteed termination
    std::uintmax_t result = 1;
    switch (y) {
    case 3:
        result *= x;
        [[fallthrough]];
    case 2:
        result *= x;
        [[fallthrough]];
    case 1:
        result *= x;
        [[fallthrough]];
    case 0:
        return result;
    }
    // OTHERWISE:
    std::uintmax_t square_root = pow(x, y / 2);
    // otherwise, work out if y is a multiple of 2 or not
    if (y % 2 == 0) {
        return square_root * square_root;
    } else {
        return square_root * square_root * x;
    }
}

Now, I am aware that the general consensus is that such use of fallthrough in a switch-case is seen as a cardinal sin. I am also aware that the function would work if base cases were provided for just the zeroth and first power (or even just for the zeroth!). The reason I provide base cases up to the third is that it feels like a bit of a waste of a function call to recurse just to calculate such simple powers, although perhaps going up to the third power is overkill?

So my question is primarily:

Is such use of deliberate fallthrough justifiable here, or could it be justifiable for a similarly-structured example that's a bit more elaborate than chaining multiplication but which still has the commutative chaining as a property?

Secondarily:

What do you think about the number of base cases provided here? Is this astute use of deliberately avoiding recursive calls for such simple cases, or is it premature optimisation?

Answer 1

Firstly, I wouldn't call the function pow - that's confusingly similar to std::pow, which can be invoked with the same arguments.

In both cases, I expect the usual (iterative) binary exponentiation to be both faster and easier for readers to follow than this recursive implementation. Given that the question tags mention performance is important, I encourage you to benchmark both recursive and iterative functions.

Here's a simple implementation of the iterative method:

#include <concepts>
template<typename T>
constexpr T unit_value = 1;
// Return x raised to the power n
// If calculation overflows, behaviour may be undefined!
template<typename T>
constexpr auto ipow(const T x, std::unsigned_integral auto n)
    requires requires(T t) { t *= t; }
{
    auto result = unit_value<T>;
    for (auto y = x; true; y *= y) {
        if (n % 2 == 1) { result *= y; }
        if ((n /= 2) == 0) { break; }
    }
    return result;
}

The infinite loop with break is used rather than testing n > 0 in the for condition to avoid an unnecessary y *= y in the last iteration.

I provided the unit_value template so that it can be used with non-arithmetic types (complex numbers, square matrices, etc) by specialising that value with the appropriate multiplicative identity. For example:

#include <complex>
template<typename T>
constexpr std::complex<T> unit_value<std::complex<T>> = {1, 0};

(This one isn't strictly needed, since there's implicit conversion from T to complex<T>, but it demonstrates the point).

Comment on Answer 1

Thanks, I like this implementation better than the two I offered. Shorter and easier to follow. The unit_value is a neat trick too; I've sometimes needed an overloadable "0" or "1" value for arbitrary numeric-like types before. I can't believe I missed the trick of doing a for-loop with division for the loop-advance clause!

Comment on Answer 1

I've been a bit unconventional with the for loop by declaring y but using n as control variable - some reviewers will object to that, but it's easily changed to an equivalent that should compile to the same object code.

Comment on Answer 1

Nice use of std::unsigned_integral too btw. I like using concepts but forgot they can be used in this way also.

Comment on Answer 1

Good point about break before squaring y - this is one case where the loop fits shell syntax better than C++ (because shell permits multiple commands in both the condition and the body).

Comment on Answer 1

@Matthieu, I updated to avoid the final multiplication (I'm not sure whether a compiler could optimise that away; I suspect it depends on how *= is defined for type T, and whether it can prove there are no side-effects).

Answer 2

I would indeed not use [[fallthrough]] here, the version without is clearer in my opinion.

Instead though I would move everything inside the switch-statement:

switch (y) {
case 0:
    return 1;
case 1:
    return x;
case 2:
    return x * x;
case 3:
    return x * x * x;
default:
    return pow(x, y / 2) * pow(x, y - y / 2);
}

Also consider making it a template, possibly in combination with concepts to restrict the type of the exponent to unsigned integers:

template <typename T, std::unsigned_integral Exponent>
constexpr T pow(T x, Exponent y) {
 ...
}

Comment on Answer 2

Thanks, that's a useful point about putting the whole thing in a switch. Does the implementation in the default case that you provide still exhibit O(log n) performance? I was under the impression that one can implement it with fewer recursive calls by computing pow(x, y / 2) into a temporary and returning the square of that, multiplied by the remaining factor of x when y is odd. I can always make the default a braced case, I suppose.
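
The difference can be checked by counting calls. In the sketch below, `pow_two_calls`, `pow_one_call`, `Run` and the `calls` counter are hypothetical names added for illustration. With two recursive calls per level the call count satisfies C(y) = 1 + C(⌊y/2⌋) + C(⌈y/2⌉), which grows linearly in y, whereas the braced default with a temporary makes one call per halving.

```cpp
#include <cstdint>

// Default case with two recursive calls per level.
constexpr std::uintmax_t pow_two_calls(std::uintmax_t x, std::uintmax_t y,
                                       unsigned& calls) {
    ++calls;
    switch (y) {
    case 0: return 1;
    case 1: return x;
    case 2: return x * x;
    case 3: return x * x * x;
    default:
        return pow_two_calls(x, y / 2, calls)
             * pow_two_calls(x, y - y / 2, calls);
    }
}

// Braced default case: one recursive call, result reused.
constexpr std::uintmax_t pow_one_call(std::uintmax_t x, std::uintmax_t y,
                                      unsigned& calls) {
    ++calls;
    switch (y) {
    case 0: return 1;
    case 1: return x;
    case 2: return x * x;
    case 3: return x * x * x;
    default: {
        const std::uintmax_t half = pow_one_call(x, y / 2, calls);
        return (y % 2 == 0) ? half * half : half * half * x;
    }
    }
}

// Result value together with the number of calls it took.
struct Run { std::uintmax_t value; unsigned calls; };

constexpr Run run_two_calls(std::uintmax_t x, std::uintmax_t y) {
    unsigned calls = 0;
    const std::uintmax_t value = pow_two_calls(x, y, calls);
    return {value, calls};
}

constexpr Run run_one_call(std::uintmax_t x, std::uintmax_t y) {
    unsigned calls = 0;
    const std::uintmax_t value = pow_one_call(x, y, calls);
    return {value, calls};
}

static_assert(run_two_calls(2, 40).value == run_one_call(2, 40).value);
static_assert(run_two_calls(2, 40).calls == 31);
static_assert(run_one_call(2, 40).calls == 5);
static_assert(run_two_calls(1, 1024).calls == 1023);  // roughly linear in y
static_assert(run_one_call(1, 1024).calls == 10);     // roughly log2(y)
```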

Comment on Answer 2

You're right, that might not be the case. I think @TobySpeight's answer is even better, it avoids recursion altogether.

Accepted Answer (the iterative ipow answer above) · Toby Speight · 2022-11-11 09:20:51Z

