
Use-case in a nutshell: Interpolate fixed-size data (a lookup table) that provides output values for combinations of a (fixed) number of inputs. The appropriate scaling is done externally; each axis starts at 0 and increments by 1. The output is a floating-point value regardless of the table data type, the idea being that the user of this class can determine whether rounding/casting is needed.

This function attempts to generalise (bi|tri|etc)linear interpolation in a way that's efficient on a microcontroller (with hardware FPU).

A working example that can be copy-pasted:

#include <cmath>
#include <array>
#include <cstddef>
#include <cstdint>
#include <type_traits>
#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif
namespace interpolate {
    /**
     * @brief Linear interpolation for N-dimensional data. Saturates / clamps search values to dataset range.
     *
     * This function is mainly meant to interpolate singleton values -such as lookup tables-, not entire images.
     *
     * @tparam DIMS Size of each dimension in supplied data array, excluding the last. For example 7 for a table with 7 columns and 3 rows.
     * @tparam Td Type of data
     * @tparam N Count of data
     * @tparam Ts Type of search values / indices
     * @param search... Search indices, the amount equal to the amount of dimensions.
     * @return Td Interpolated value
     */
    template <std::size_t... DIMS,
              typename Td = decltype(0.0),
              std::size_t N,
              typename... Ts>
    constexpr auto linear(const Td(&data)[N], const Ts& ...search) {
        using index_t = std::size_t;
        using fp_t = decltype(0.0); // Use the system default literal floating point type
        using result_t = std::conditional_t<std::is_floating_point_v<Td> && (sizeof(Td) >= sizeof(fp_t)), Td, fp_t>; // Ensure we're always working in precise and reasonably performant floating point
        constexpr index_t D {sizeof...(Ts)};
        using search_set = std::array<std::common_type_t<Ts...>, D>;
        if constexpr (sizeof...(Ts) > 1) {
            constexpr index_t LAST_S { N / (DIMS * ...) };
            constexpr std::size_t SIZES[D] { DIMS..., LAST_S };
            constexpr std::size_t MAX[D] { (DIMS - 1u)..., LAST_S - 1 };
            constexpr std::size_t L = 0;
            constexpr std::size_t H = 1;

            static_assert(sizeof...(DIMS) + 1 == sizeof...(Ts), "Amount of dimension sizes doesn't match search parameter count");
            static_assert((DIMS * ...) != 0 && LAST_S != 0, "Can't interpolate singularities");
            static_assert(N == (LAST_S * (DIMS * ...)), "Data size doesn't match provided dimension sizes");
            index_t i[D][2] {{0}}; // High & low indices
            fp_t w[D][2] {{0}};    // Associated weights
            result_t result {0};
            for (index_t j = 0; j < D; ++j) {
                const auto searchval = search_set{{search...}}[j];
                // Perform input clamping
                if (searchval <= 0) {
                    // Keeping the indices one apart should do better in vectorisation
                    i[j][L] = 0;
                    i[j][H] = 1;
                    w[j][L] = 1;
                    w[j][H] = 0;
                } else if (searchval >= MAX[j]) {
                    i[j][L] = MAX[j]-1;
                    i[j][H] = MAX[j];
                    w[j][L] = 0;
                    w[j][H] = 1;
                } else {
                    i[j][L] = static_cast<index_t>(searchval);
                    if constexpr (false) {
                        // Sinus curve based fitting. TODO: Find an elegant way to toggle this
                        i[j][H] = i[j][L] + ((search_set{{search...}}[j] - i[j][L]) > 0);
                        w[j][H] = std::sin((i[j][L] + 0.5 - searchval) * M_PI) / -2.0 + 0.5;
                    } else {
                        // Linear
                        w[j][H] = searchval - static_cast<index_t>(searchval);
                        i[j][H] = i[j][L] + 1;
                    }
                    w[j][L] = 1 - w[j][H];
                }
            }
            // Accumulate 2^D weighed results
            for (index_t j = 0; j < (0b1 << D); ++j) {
                fp_t weight { w[0][j & 0b1] };
                index_t offset { 1 };
                index_t idx { i[0][j & 0b1] };
                for (index_t k = 1; k < D; ++k) {
                    const bool n = (j >> k) & 0b1;
                    weight *= w[k][n];
                    offset *= SIZES[k-1];
                    idx += i[k][n] * offset;
                }
                result += weight * data[idx];
            }
            return result;
        } else {
            const auto searchval = search_set{{search...}}[0];
            if (searchval <= 0) {
                return static_cast<result_t>(data[0]);
            } else if (searchval >= N) {
                return static_cast<result_t>(data[N-1]);
            } else {
                const auto i = static_cast<index_t>(searchval);
                const auto w = searchval - i;
                if (w == 0) {
                    return static_cast<result_t>(data[i]);
                } else {
                    return static_cast<result_t>(data[i] * (1 - w) + data[i + 1] * w);
                }
            }
        }
    }
}
// volatile
constexpr
float
// unsigned
data[] {
    //          0      1      2      3
    /* 0 */  1000,  2000,  3000,  4000,
    /* 1 */  5000,  6000,  7000,  8000,
    /* 2 */  9000, 10000, 11000, 12000,
    /* 3 */  1000,  2000,  3000,  4000,
    /* 4 */  5000,  6000,  7000,  8000,
    /* 5 */  9000, 10000, 11000, 12000,
    /* 6 */  1000,  2000,  3000,  4000,
    /* 7 */  5000,  6000,  7000,  8000,
    /* 8 */  9000, 10000, 11000, 12000
};
int main() {
    volatile unsigned interpolated1 __attribute__((unused)) = interpolate::linear(data, 2.5);
    volatile unsigned interpolated2 __attribute__((unused)) = interpolate::linear<4>(data, 2.5, 1.5);
    volatile unsigned interpolated3 __attribute__((unused)) = interpolate::linear<4,3>(data, 2.25, 1.25, 0.5);
}

I'd love to hear comments & critiques!

asked Mar 11, 2021 at 11:09

1 Answer


First, the input type const Td(&data)[N] is rather strange: normally one would want a std::span or equivalent, and I don't think that fixing all of the dimension sizes as template parameters is helpful. The number of dimensions, however, should indeed be a template parameter, as otherwise the implementation would be somewhat more problematic and would require dynamic memory.

Second, instead of taking the search values as an arbitrary parameter sequence const Ts& ...search, you should wrap them in a std::array<floating_type, dims> or something of the sort.
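
To illustrate the packaging, a thin wrapper along these lines could accept the coordinates as one std::array and unpack them into the existing variadic call. The name linear_at and the fixed double coordinate type are invented for this sketch, and it assumes the question's interpolate::linear is in scope:

#include <array>
#include <cstddef>
#include <tuple> // std::apply

// Hypothetical wrapper (name made up): takes all search coordinates as a
// single std::array and forwards them to the variadic interpolate::linear
// from the question.
template <std::size_t... DIMS, typename Td, std::size_t N, std::size_t D>
auto linear_at(const Td (&data)[N], const std::array<double, D>& coords) {
    return std::apply(
        [&](auto... c) { return interpolate::linear<DIMS...>(data, c...); },
        coords);
}

// Usage with the question's data table:
//   auto v = linear_at<4, 3>(data, std::array{2.25, 1.25, 0.5});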

Perhaps this is an oddity of programming for a microcontroller, but generally I'd advise wrapping it all in a class and storing important parameters like SIZES[D], MAX[D] and, more importantly, the step sizes as members of the class, and hopefully making the whole class constexpr. I am not certain which C++ version you use, so I am not entirely certain that this is feasible.
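
A very rough outline of that idea, with every identifier invented for the illustration and the actual lookup left out:

#include <array>
#include <cstddef>

// Sketch only: a table wrapper that computes the sizes, maxima and strides
// once at construction instead of re-deriving them inside every call.
template <typename Td, std::size_t D>
class InterpolatedTable {
public:
    constexpr InterpolatedTable(const Td* data, std::array<std::size_t, D> sizes)
        : data_{data}, sizes_{sizes}
    {
        std::size_t stride = 1;
        for (std::size_t d = 0; d < D; ++d) {
            strides_[d] = stride;       // step size between neighbours along axis d
            max_[d] = sizes_[d] - 1;
            stride *= sizes_[d];
        }
    }

    // lookup(coords) would clamp, weigh and accumulate using the precomputed
    // members; omitted because only the stored-state idea is shown here.

private:
    const Td* data_;
    std::array<std::size_t, D> sizes_;
    std::array<std::size_t, D> max_{};
    std::array<std::size_t, D> strides_{};
};

// e.g. constexpr InterpolatedTable<float, 2> table{data, {4, 9}};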

Lastly, the interpolation algorithm is inefficient. What you do is, for each vertex of the n-dimensional cube, compute the product of the weights and add everything together. That is inefficient because you perform O(n) products per vertex, with a bunch of rather random memory accesses.

A faster approach is to interpolate along a single dimension, obtain an (n-1)-dimensional cube of values, and repeat. This way you perform O(1) products per vertex. I suppose what I mean is confusing without an example.

Let me give you one: you have the 2x2x2 block {{{1,2},{3,4}},{{5,6},{7,8}}} and you want to compute the interpolation at the point (0.3, 0.4, 0.8). First you interpolate along the first dimension and get a 2x2 block:

{{1.3, 3.3}, {5.3, 7.3}} // here 1*(1-0.3) + 2*0.3 = 1.3, 3*(1-0.3) + 4*0.3 = 3.3 etc...

Then you interpolate along the second dimension (interpolation value = 0.4) to get the block of size 2 {2.1, 6.1}, and finally along the last dimension (interpolation value = 0.8) to get 2.1*(1-0.8) + 6.1*0.8 = 5.3.

As you can see, this is a rather simple double loop that just applies a[k] = a[2k]*(1-p) + a[2k+1]*p for the majority of the operations.
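
A minimal sketch of that double loop, with made-up names and the corner values assumed to be flattened so that the dimension interpolated first varies fastest:

#include <cassert>
#include <cstddef>
#include <vector>

// Successive reduction: each pass interpolates along one dimension, halving
// the number of corner values, until a single result remains.
double lerp_cube(std::vector<double> corners, const std::vector<double>& t)
{
    for (double p : t) {
        const std::size_t half = corners.size() / 2;
        for (std::size_t k = 0; k < half; ++k)
            corners[k] = corners[2 * k] * (1 - p) + corners[2 * k + 1] * p;
        corners.resize(half); // the cube has lost one dimension
    }
    return corners[0];
}

int main() {
    // The 2x2x2 example from above, interpolated at (0.3, 0.4, 0.8).
    const double v = lerp_cube({1, 2, 3, 4, 5, 6, 7, 8}, {0.3, 0.4, 0.8});
    assert(v > 5.299 && v < 5.301); // expected 5.3
}

In total this performs 2^D - 1 single-dimension lerps, each a fixed pair of multiplications, instead of a D-term weight product per corner.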

answered Mar 12, 2021 at 11:06
