I'm writing a C++ library which does some computation on vectors of audio data.

The library supports both GPUs (using Thrust, a C++ STL-like library for GPUs) and CPUs (using the STL). I'm using CUDA Toolkit 10.2, which is limited to GCC 8 (and thus limits me to C++14). All of this is on an amd64 desktop computer running Fedora 32.

The library contains different classes, and each class has a CPU and a GPU version. I'm looking for a neat way to define CPU/GPU variants without duplicating code. Sometimes when I fix a bug in the GPU algorithm, I forget to go and fix it in the CPU algorithm, and vice versa. Also, it would be nice if it could be something at the library level, so that if I instantiate "AlgorithmA-CPU", it internally uses "AlgorithmB-CPU", and similarly for GPU.

Here's a simple example:

struct WindowCPU {
 std::vector<float> window{1.0, 2.0, 3.0};
};
struct WindowGPU {
 thrust::device_vector<float> window{1.0, 2.0, 3.0};
};
class AlgorithmCPU {
public:
 std::vector<float> scratch_buf;
 WindowCPU window;
 AlgorithmCPU(size_t size) : scratch_buf(size, 0.0F) {}
 void process_input(std::vector<float>& input) {
 // using thrust, the code ends up looking identical
 thrust::transform(input.begin(), input.end(), scratch_buf.begin(), some_functor());
 }
};
class AlgorithmGPU {
public:
 thrust::device_vector<float> scratch_buf;
 WindowGPU window;
 AlgorithmGPU(size_t size) : scratch_buf(size, 0.0F) {}
 void process_input(thrust::device_vector<float>& input) {
 // using thrust, the code ends up looking identical
 thrust::transform(input.begin(), input.end(), scratch_buf.begin(), some_functor());
 }
};

The example is overly simplified, but it shares the problem with all of my algorithms - the code is the same, except with different data types - CPU uses "std::vector", while GPU uses "thrust::device_vector". Also, there is a sort of "cascading" specialization - "AlgorithmCPU" uses "WindowCPU", and similar for GPU.

Here's one real example I have in my code currently, applied to the above fake algorithm, to reduce code duplication:

template <typename A>
static void _execute_algorithm_priv(A input, A output) {
 thrust::transform(input.begin(), input.end(), output.begin(), some_functor());
}
// GPU specialization
void AlgorithmGPU::process_input(thrust::device_vector<float>& input)
{
 _execute_algorithm_priv<thrust::device_vector<float>&>(
 input, scratch_buf);
}
// CPU specialization
void AlgorithmCPU::process_input(std::vector<float>& input)
{
 _execute_algorithm_priv<std::vector<float>&>(
 input, scratch_buf);
}

Now in the real code, I have many algorithms, some are huge. My imagination can't stretch to a global library-wide solution. I thought of something using an enum:

enum ComputeBackend {
 GPU,
 CPU
};

Afterwards, I would create templates of classes based on the enum - but I'd need to map the enum to different data types:

template <ComputeBackend b> class Algorithm {
// somehow define other types based on the compute backend
if (ComputeBackend b == ComputeBackend::CPU) {
 vector_type = std::vector<float>;
 other_type = Ipp32f;
} else {
 vector_type = thrust::device_vector<float>;
 other_type = Npp32f;
}
}

I read about "if constexpr", but it is a C++17 feature, so I don't believe I can use it in C++14.
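For what it's worth, picking a type from an enum value does not need "if constexpr"; in C++14, std::conditional_t from <type_traits> can do it. The following is only a minimal CPU-only sketch of the idea: FakeDeviceVector is a hypothetical stand-in for thrust::device_vector<float> so that it builds without CUDA, and the scoped enum and alias names are assumptions for illustration.

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for thrust::device_vector<float>, so the sketch
// builds without the CUDA toolkit.
struct FakeDeviceVector {
    std::vector<float> storage;
    FakeDeviceVector(std::size_t n, float v) : storage(n, v) {}
    std::size_t size() const { return storage.size(); }
};

enum class ComputeBackend { GPU, CPU };

template <ComputeBackend B>
class Algorithm {
public:
    // Pick the container type at compile time from the enum value.
    using vector_type = std::conditional_t<B == ComputeBackend::CPU,
                                           std::vector<float>,
                                           FakeDeviceVector>;

    explicit Algorithm(std::size_t size) : scratch_buf(size, 0.0F) {}

    vector_type scratch_buf;
};

using AlgorithmCPU = Algorithm<ComputeBackend::CPU>;
using AlgorithmGPU = Algorithm<ComputeBackend::GPU>;
```

This works well for two backends; with more backends or many types per backend, a specialized traits struct tends to scale better than nested conditionals.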

edit

Here's my solution based on the replies so far:

enum Backend {
 GPU,
 CPU
};
template<Backend T>
struct TypeTraits {};
template<>
struct TypeTraits<Backend::GPU> {
 typedef thrust::device_ptr<float> InputPointer;
 typedef thrust::device_vector<float> RealVector;
 typedef thrust::device_vector<thrust::complex<float>> ComplexVector;
};
template<>
struct TypeTraits<Backend::CPU> {
 typedef float* InputPointer;
 typedef std::vector<float> RealVector;
 typedef std::vector<thrust::complex<float>> ComplexVector;
};
template<Backend B> class Algorithm {
 typedef typename TypeTraits<B>::InputPointer InputPointer;
 typedef typename TypeTraits<B>::RealVector RealVector;
 typedef typename TypeTraits<B>::ComplexVector ComplexVector;
 public:
 RealVector scratch_buf;
 
 void process_input(InputPointer input);
};
asked Sep 9, 2020 at 17:37
  • Sounds like a perfect use case for template metaprogramming. Not sure if a single answer here can tell you everything you need to know about it to solve your problem, but it may be a good idea to get a copy of Modern C++ Design and learn policy-based class design. Commented Sep 9, 2020 at 20:14
  • Yep, the easiest way is just to write both specializations of template <ComputeBackend> struct BackendTraits, each containing the appropriate types. Commented Sep 9, 2020 at 20:56
  • Thanks! Based on these comments, the answer, and this other SO question (stackoverflow.com/questions/9984491/…), I edited with my solution. It's working nicely. Commented Sep 10, 2020 at 15:06

1 Answer

One possible solution is to use templates and move the CPU / GPU specific stuff into a traits class:

struct CPUBackendTraits {
 template <typename T>
 using vector_type = std::vector<T>;
};
struct GPUBackendTraits {
 template <typename T>
 using vector_type = thrust::device_vector<T>;
};
template <typename BackendTraits>
struct Window {
 // "template" is required because vector_type is a member template of a dependent type
 typename BackendTraits::template vector_type<float> window{1.0, 2.0, 3.0};
};
template <typename BackendTraits>
class Algorithm {
public:
 typename BackendTraits::template vector_type<float> scratch_buf;
 Window<BackendTraits> window;
 Algorithm(std::size_t size) : scratch_buf(size, 0.f) {}
 void process_input(typename BackendTraits::template vector_type<float>& input) {
  thrust::transform(input.begin(), input.end(), scratch_buf.begin(), some_functor());
 }
};

The typename BackendTraits:: prefix can be annoying, but it can be omitted by adding a corresponding typedef or using declaration to the class.
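As a rough sketch of that alias idea (CPU backend only, so it compiles without Thrust; the RealVector alias and the scratch() accessor are names made up for this illustration):

```cpp
#include <cstddef>
#include <vector>

struct CPUBackendTraits {
    template <typename T>
    using vector_type = std::vector<T>;
};

template <typename BackendTraits>
class Algorithm {
    // Name the dependent type once; `template` marks vector_type as a
    // member template of the dependent type BackendTraits.
    using RealVector = typename BackendTraits::template vector_type<float>;

public:
    explicit Algorithm(std::size_t size) : scratch_buf(size, 0.f) {}

    // Copies the input into the scratch buffer (placeholder for real work).
    void process_input(const RealVector& input) { scratch_buf = input; }

    const RealVector& scratch() const { return scratch_buf; }

private:
    RealVector scratch_buf;
};
```

After the alias is in place, the rest of the class reads almost like non-template code.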

In some cases, you may not only want to use different types, but also call different code. This can be done, for example, by adding the code as a function to the traits class. However, using function overloads can sometimes be less confusing:

void do_something(std::vector<float>& input) {
 // do something std::vector specific
}
void do_something(thrust::device_vector<float>& input) {
 // do something thrust::device_vector specific
}
template <typename BackendTraits>
class Algorithm {
 void do_something_backend_specific() {
  typename BackendTraits::template vector_type<float> buf = ...;
  // Call either the std::vector or the thrust::device_vector overload:
  do_something(buf);
 }
};
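The other option mentioned above, putting the backend-specific code into the traits class as a static function, might look roughly like this. It is a CPU-only sketch: fill_ramp, prepare, data and ReversedBackendTraits are hypothetical names, and the second traits class merely stands in for a GPU backend so that the example compiles without Thrust.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

struct CPUBackendTraits {
    template <typename T>
    using vector_type = std::vector<T>;

    // Hypothetical backend-specific step: fill with 0, 1, 2, ...
    static void fill_ramp(std::vector<float>& v) {
        std::iota(v.begin(), v.end(), 0.0f);
    }
};

// Hypothetical second backend standing in for a GPU one; a real GPU traits
// class would hold device-specific code here instead.
struct ReversedBackendTraits {
    template <typename T>
    using vector_type = std::vector<T>;

    // Same interface, different behavior: fill with ..., 2, 1, 0
    static void fill_ramp(std::vector<float>& v) {
        std::iota(v.rbegin(), v.rend(), 0.0f);
    }
};

template <typename BackendTraits>
class Algorithm {
    using RealVector = typename BackendTraits::template vector_type<float>;

public:
    explicit Algorithm(std::size_t size) : buf(size) {}

    // Dispatches to whichever backend the traits class provides.
    void prepare() { BackendTraits::fill_ramp(buf); }

    const RealVector& data() const { return buf; }

private:
    RealVector buf;
};
```

Compared with free overloads, this keeps everything backend-specific in one place, at the cost of a larger traits class.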

There are some advantages over an enum with conditionals:

  • Template programming allows you to use different types.
  • No need for if-else statements that choose the backend. Just pass the traits class around and everything automatically uses the same backend.
  • Adding new backends is simple: just add a new traits class.

Of course, there are also some disadvantages:

  • Reading and writing template code can be harder than reading and writing code with concrete types.
  • All code is in templates, so if you want to use .cpp files for that code, you must add explicit template instantiations.
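To illustrate the last point, here is a minimal single-file sketch of explicit instantiation, with comments marking what would go into a header and what into a .cpp file. The file names and the scratch_size() member are assumptions for the example, and only the CPU backend is shown so that it builds without Thrust.

```cpp
#include <cstddef>
#include <vector>

struct CPUBackendTraits {
    template <typename T>
    using vector_type = std::vector<T>;
};

// --- would live in algorithm.h: declarations only ---
template <typename BackendTraits>
class Algorithm {
    using RealVector = typename BackendTraits::template vector_type<float>;

public:
    explicit Algorithm(std::size_t size);
    std::size_t scratch_size() const;

private:
    RealVector scratch_buf;
};

// --- would live in algorithm.cpp: member definitions ---
template <typename BackendTraits>
Algorithm<BackendTraits>::Algorithm(std::size_t size) : scratch_buf(size, 0.f) {}

template <typename BackendTraits>
std::size_t Algorithm<BackendTraits>::scratch_size() const {
    return scratch_buf.size();
}

// Explicit instantiation: emits the code for this backend in this
// translation unit, so other files can link against it.
template class Algorithm<CPUBackendTraits>;
```

A GPU build would add a matching line for the GPU traits class in a file compiled by nvcc.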
answered Sep 10, 2020 at 8:14
  • Excellent, this is close to what I came up with after following the comments posted to my question. In fact I combined the enum and traits, by having a templated traits class based on an enum value. So I end up with "using AlgorithmGPU = Algorithm<Backend::GPU>", and inside the class, I use "typedef typename BackendTypeTraits<Backend>::RealVector RealVector". Something like that. Commented Sep 10, 2020 at 12:53
