
Use-case in a nutshell: Interpolate fixed-size data (a lookup table) that provides output values for combinations of a (fixed) number of inputs. The appropriate scaling is done externally; each axis starts at 0 and increments by 1. The output is a floating-point value regardless of the table data type, the idea being that the user of this class can determine whether rounding/casting is needed.

This function attempts to generalise (bi|tri|etc)linear interpolation in a way that's efficient on a microcontroller (with hardware FPU).

A working example that can be copy-pasted:

#include <cmath>
#include <array>
#include <cstddef>
#include <cstdint>
#include <type_traits>
#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif
namespace interpolate {
    /**
     * @brief Linear interpolation for N-dimensional data. Saturates / clamps search values to dataset range.
     *
     * This function is mainly meant to interpolate singleton values -such as lookup tables-, not entire images.
     *
     * @tparam DIMS Size of each dimension in supplied data array, excluding the last. For example 7 for a table with 7 columns and 3 rows.
     * @tparam Td Type of data
     * @tparam N Count of data
     * @tparam Ts Type of search values / indices
     * @param search... Search indices, the amount equal to the amount of dimensions.
     * @return Td Interpolated value
     */
    template <std::size_t... DIMS,
              typename Td = decltype(0.0),
              std::size_t N,
              typename... Ts>
    constexpr auto linear(const Td(&data)[N], const Ts& ...search) {
        using index_t = std::size_t;
        using fp_t = decltype(0.0); // Use the system default literal floating point type
        using result_t = std::conditional_t<std::is_floating_point_v<Td> && (sizeof(Td) >= sizeof(fp_t)), Td, fp_t>; // Ensure we're always working in precise and reasonably performant floating point
        constexpr index_t D {sizeof...(Ts)};
        using search_set = std::array<std::common_type_t<Ts...>, D>;
        if constexpr (sizeof...(Ts) > 1) {
            constexpr index_t LAST_S { N / (DIMS * ...) };
            constexpr std::size_t SIZES[D] { DIMS..., LAST_S };
            constexpr std::size_t MAX[D] { (DIMS - 1u)..., LAST_S - 1 };
            constexpr std::size_t L = 0;
            constexpr std::size_t H = 1;

            static_assert(sizeof...(DIMS) + 1 == sizeof...(Ts), "Amount of dimension sizes doesn't match search parameter count");
            static_assert((DIMS * ...) != 0 && LAST_S != 0, "Can't interpolate singularities");
            static_assert(N == (LAST_S * (DIMS * ...)), "Data size doesn't match provided dimension sizes");
            index_t i[D][2] {{0}}; // High & low indices
            fp_t w[D][2] {{0}};    // Associated weights
            result_t result {0};
            for (index_t j = 0; j < D; ++j) {
                const auto searchval = search_set{{search...}}[j];
                // Perform input clamping
                if (searchval <= 0) {
                    // Keeping the indices one apart should do better in vectorisation
                    i[j][L] = 0;
                    i[j][H] = 1;
                    w[j][L] = 1;
                    w[j][H] = 0;
                } else if (searchval >= MAX[j]) {
                    i[j][L] = MAX[j]-1;
                    i[j][H] = MAX[j];
                    w[j][L] = 0;
                    w[j][H] = 1;
                } else {
                    i[j][L] = static_cast<index_t>(searchval);
                    if constexpr (false) {
                        // Sinus curve based fitting. TODO: Find an elegant way to toggle this
                        i[j][H] = i[j][L] + ((search_set{{search...}}[j] - i[j][L]) > 0);
                        w[j][H] = std::sin((i[j][L] + 0.5 - searchval) * M_PI) / -2.0 + 0.5;
                    } else {
                        // Linear
                        w[j][H] = searchval - static_cast<index_t>(searchval);
                        i[j][H] = i[j][L] + 1;
                    }
                    w[j][L] = 1 - w[j][H];
                }
            }
            // Accumulate 2^D weighed results
            for (index_t j = 0; j < (0b1 << D); ++j) {
                fp_t weight { w[0][j & 0b1] };
                index_t offset { 1 };
                index_t idx { i[0][j & 0b1] };
                for (index_t k = 1; k < D; ++k) {
                    const bool n = (j >> k) & 0b1;
                    weight *= w[k][n];
                    offset *= SIZES[k-1];
                    idx += i[k][n] * offset;
                }
                result += weight * data[idx];
            }
            return result;
        } else {
            const auto searchval = search_set{{search...}}[0];
            if (searchval <= 0) {
                return static_cast<result_t>(data[0]);
            } else if (searchval >= N) {
                return static_cast<result_t>(data[N-1]);
            } else {
                const auto i = static_cast<index_t>(searchval);
                const auto w = searchval - i;
                if (w == 0) {
                    return static_cast<result_t>(data[i]);
                } else {
                    return static_cast<result_t>(data[i] * (1 - w) + data[i + 1] * w);
                }
            }
        }
    }
}
// volatile
constexpr
float
// unsigned
data[] {
    //          0      1      2      3
    /* 0 */  1000,  2000,  3000,  4000,
    /* 1 */  5000,  6000,  7000,  8000,
    /* 2 */  9000, 10000, 11000, 12000,
    /* 3 */  1000,  2000,  3000,  4000,
    /* 4 */  5000,  6000,  7000,  8000,
    /* 5 */  9000, 10000, 11000, 12000,
    /* 6 */  1000,  2000,  3000,  4000,
    /* 7 */  5000,  6000,  7000,  8000,
    /* 8 */  9000, 10000, 11000, 12000
};
int main() {
    volatile unsigned interpolated1 __attribute__((unused)) = interpolate::linear(data, 2.5);
    volatile unsigned interpolated2 __attribute__((unused)) = interpolate::linear<4>(data, 2.5, 1.5);
    volatile unsigned interpolated3 __attribute__((unused)) = interpolate::linear<4,3>(data, 2.25, 1.25, 0.5);
}

I'd love to hear comments & critiques!

asked Mar 11, 2021 at 11:09

1 Answer


First, the input type const Td(&data)[N] is rather strange: normally one would want a std::span or equivalent, and I don't think that fixing all of the dimension sizes as template parameters is helpful. The number of dimensions, however, should indeed be a template parameter, as otherwise the implementation would be somewhat more problematic and would require dynamic memory.

Second, instead of taking the search values as an arbitrary parameter sequence const Ts& ...search, you should wrap them in a std::array<floating_type, dims> or something of the sort.
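
To illustrate the packaging, a thin wrapper along these lines could accept the coordinates as one std::array and unpack them into the existing variadic call. The name linear_at and the fixed double coordinate type are invented for this sketch, and it assumes the question's interpolate::linear is in scope:

#include <array>
#include <cstddef>
#include <tuple> // std::apply

// Hypothetical wrapper (name made up): takes all search coordinates as a
// single std::array and forwards them to the variadic interpolate::linear
// from the question.
template <std::size_t... DIMS, typename Td, std::size_t N, std::size_t D>
auto linear_at(const Td (&data)[N], const std::array<double, D>& coords) {
    return std::apply(
        [&](auto... c) { return interpolate::linear<DIMS...>(data, c...); },
        coords);
}

// Usage with the question's data table:
//   auto v = linear_at<4, 3>(data, std::array{2.25, 1.25, 0.5});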

Perhaps this is an oddity of programming for a microcontroller, but generally I'd advise wrapping it all in a class and storing important parameters like SIZES[D], MAX[D] and, more importantly, the step sizes as members of the class, and hopefully making the whole class constexpr. I am not certain which C++ version you use, so I am not entirely certain that this is feasible.
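
A very rough outline of that idea, with every identifier invented for the illustration and the actual lookup left out:

#include <array>
#include <cstddef>

// Sketch only: a table wrapper that computes the sizes, maxima and strides
// once at construction instead of re-deriving them inside every call.
template <typename Td, std::size_t D>
class InterpolatedTable {
public:
    constexpr InterpolatedTable(const Td* data, std::array<std::size_t, D> sizes)
        : data_{data}, sizes_{sizes}
    {
        std::size_t stride = 1;
        for (std::size_t d = 0; d < D; ++d) {
            strides_[d] = stride;       // step size between neighbours along axis d
            max_[d] = sizes_[d] - 1;
            stride *= sizes_[d];
        }
    }

    // lookup(coords) would clamp, weigh and accumulate using the precomputed
    // members; omitted because only the stored-state idea is shown here.

private:
    const Td* data_;
    std::array<std::size_t, D> sizes_;
    std::array<std::size_t, D> max_{};
    std::array<std::size_t, D> strides_{};
};

// e.g. constexpr InterpolatedTable<float, 2> table{data, {4, 9}};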

Lastly, the interpolation algorithm is inefficient. What you do is, for each vertex of the n-dimensional cube, compute the product of the weights and add everything together. That is inefficient because you perform O(n) products per vertex, with a bunch of rather random memory accesses.

A faster approach is to interpolate along a single dimension, obtain an (n-1)-dimensional cube of values, and repeat. This way you perform O(1) products per vertex. I suppose what I mean is confusing without an example.

Let me give you one: you have the 2x2x2 block {{{1,2},{3,4}},{{5,6},{7,8}}} and you want to compute the interpolation at the point (0.3, 0.4, 0.8). First you interpolate along the first dimension and get a 2x2 block:

{{1.3, 3.3}, {5.3, 7.3}} // here 1*(1-0.3) + 2*0.3 = 1.3, 3*(1-0.3) + 4*0.3 = 3.3 etc...

Then you interpolate along the second dimension (interpolation value = 0.4) to get the block of size 2 {2.1, 6.1}, and finally along the last dimension (interpolation value = 0.8) to get 2.1*(1-0.8) + 6.1*0.8 = 5.3.

As you can see, this is a rather simple double loop that just applies a[k] = a[2k]*(1-p) + a[2k+1]*p for the majority of the operations.
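
A minimal sketch of that double loop, with made-up names and the corner values assumed to be flattened so that the dimension interpolated first varies fastest:

#include <cassert>
#include <cstddef>
#include <vector>

// Successive reduction: each pass interpolates along one dimension, halving
// the number of corner values, until a single result remains.
double lerp_cube(std::vector<double> corners, const std::vector<double>& t)
{
    for (double p : t) {
        const std::size_t half = corners.size() / 2;
        for (std::size_t k = 0; k < half; ++k)
            corners[k] = corners[2 * k] * (1 - p) + corners[2 * k + 1] * p;
        corners.resize(half); // the cube has lost one dimension
    }
    return corners[0];
}

int main() {
    // The 2x2x2 example from above, interpolated at (0.3, 0.4, 0.8).
    const double v = lerp_cube({1, 2, 3, 4, 5, 6, 7, 8}, {0.3, 0.4, 0.8});
    assert(v > 5.299 && v < 5.301); // expected 5.3
}

In total this performs 2^D - 1 single-dimension lerps, each a fixed pair of multiplications, instead of a D-term weight product per corner.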

answered Mar 12, 2021 at 11:06
