4
\$\begingroup\$

I recently had to convert some string literals and wondered whether that could be done at compile time using template metaprogramming. I couldn't find many examples online, so I started experimenting to see whether I could manage it (I'm not a template expert, so it seemed like a good opportunity to learn some things).

I ended up with a two-step process, first storing the converted string into a char array, and then wrapping the array inside a string view. Here is an example applied to a "CamelCase to snake_case" conversion:

#include <array>
#include <cstddef>
#include <stdexcept>
#include <string_view>
#include <tuple>
#include <type_traits>
#include <utility>

// Get the size of a string_view once converted to snake_case
constexpr std::size_t GetSnakeCaseSize(std::string_view str) {
    std::size_t ret = 0;
    for (std::size_t i = 0; i < str.length(); ++i) {
        // An upper-case letter after the first position needs an extra '_'
        if (i > 0 && str[i] >= 'A' && str[i] <= 'Z') {
            ret += 2;
        }
        else {
            ret += 1;
        }
    }
    return ret;
}

// Get an array of snake_case sizes from an array of string_view
template <std::size_t N, std::size_t... I>
constexpr std::array<std::size_t, N> GetSnakeCaseSize(const std::array<std::string_view, N>& a, std::index_sequence<I...>) {
    return std::array{GetSnakeCaseSize(a[I])...};
}

template <std::size_t N>
constexpr std::array<std::size_t, N> GetSnakeCaseSize(const std::array<std::string_view, N>& a) {
    return GetSnakeCaseSize(a, std::make_index_sequence<N>{});
}

// Get a snake_case char array from a string_view
template <std::size_t N>
constexpr std::array<char, N> ToSnakeCase(std::string_view str) {
    // We can't static_assert based on str, so throw instead
    if (GetSnakeCaseSize(str) != N) {
        throw std::invalid_argument("ToSnakeCase called with wrong output size");
    }
    std::array<char, N> output{};
    std::size_t index = 0;
    for (std::size_t i = 0; i < str.length(); ++i) {
        if (str[i] >= 'A' && str[i] <= 'Z') {
            if (i > 0) {
                output[index++] = '_';
            }
            output[index++] = static_cast<char>(str[i] + 'a' - 'A');
        }
        else {
            output[index++] = str[i];
        }
    }
    return output;
}

// Convert an array of string_view to a tuple of snake_case char arrays
template <std::size_t N, const std::array<std::size_t, N>& lengths, std::size_t... Is>
constexpr auto ArrayToSnakeCaseTuple(const std::array<std::string_view, N>& str, std::index_sequence<Is...>) {
    return std::make_tuple(ToSnakeCase<lengths[Is]>(str[Is])...);
}

template <std::size_t N, const std::array<std::size_t, N>& lengths>
constexpr auto ArrayToSnakeCaseTuple(const std::array<std::string_view, N>& str) {
    return ArrayToSnakeCaseTuple<N, lengths>(str, std::make_index_sequence<N>{});
}

// Get a string_view from a char array
template <std::size_t N>
constexpr std::string_view ToStringView(const std::array<char, N>& a) {
    return std::string_view(a.data(), N);
}

// Create an array of string_view from a tuple of char arrays
template <typename Tuple, std::size_t... I>
constexpr std::array<std::string_view, std::tuple_size_v<Tuple>> TupleOfArraysToArrayOfStr(const Tuple& t, std::index_sequence<I...>) {
    return { ToStringView(std::get<I>(t)) ... };
}

template <typename Tuple>
constexpr auto TupleOfArraysToArrayOfStr(const Tuple& t) {
    return TupleOfArraysToArrayOfStr(t, std::make_index_sequence<std::tuple_size_v<Tuple>>{});
}

And a usage example:

static constexpr std::array<std::string_view, 3> camel_case_strings = { "Hello", "MyWorld", "!!" };
static constexpr auto converted_lengths = GetSnakeCaseSize(camel_case_strings);
static constexpr auto arrays = ArrayToSnakeCaseTuple<std::size(camel_case_strings), converted_lengths>(camel_case_strings);
static constexpr auto snake_case_strings = TupleOfArraysToArrayOfStr(arrays);
static_assert(std::is_same_v<
    const std::array<std::string_view, std::size(camel_case_strings)>,
    decltype(snake_case_strings)>
);
static_assert(snake_case_strings[0] == "hello");
static_assert(snake_case_strings[1] == "my_world");
static_assert(snake_case_strings[2] == "!!");

Live version

Is there a way to improve or simplify it? I don't think we can get around the necessity of storing the snake_case char arrays somewhere, as that is where the strings actually live (string_view being only a convenient wrapper around that memory, if I understand correctly). As for the lengths of the intermediate output arrays, I wish I could hide them instead of storing them permanently, but I couldn't find a way to make that work properly.

Also, I'm mainly interested in C++17 as that's what I'm most familiar with, but I'd also be curious if there is an easier way to do it with more modern versions of C++.
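One possible C++17 way to hide the intermediate length, offered as a sketch rather than a definitive answer: compute the size inside a class template, so that the size, the char array and the string_view all live as static members of one type. The names `SnakeCaseSize`, `MakeSnakeCase`, `SnakeCase`, `Hello` and `MyWorld` are hypothetical and not part of the code above, and the sketch assumes each source string is exposed as a `static constexpr std::string_view value` member of a namespace-scope struct.

```cpp
#include <array>
#include <cstddef>
#include <string_view>

// Size of `str` once converted to snake_case (same rule as in the question).
constexpr std::size_t SnakeCaseSize(std::string_view str) {
    std::size_t size = 0;
    for (std::size_t i = 0; i < str.size(); ++i) {
        const bool is_upper = str[i] >= 'A' && str[i] <= 'Z';
        size += (i > 0 && is_upper) ? 2 : 1;
    }
    return size;
}

// Fill an array of exactly N chars with the snake_case form of `str`.
// The caller is expected to pass N == SnakeCaseSize(str).
template <std::size_t N>
constexpr std::array<char, N> MakeSnakeCase(std::string_view str) {
    std::array<char, N> out{};
    std::size_t index = 0;
    for (std::size_t i = 0; i < str.size(); ++i) {
        const char c = str[i];
        if (c >= 'A' && c <= 'Z') {
            if (i > 0) {
                out[index++] = '_';
            }
            out[index++] = static_cast<char>(c - 'A' + 'a');
        } else {
            out[index++] = c;
        }
    }
    return out;
}

// Hypothetical wrapper: Source must provide
// `static constexpr std::string_view value`.
template <typename Source>
struct SnakeCase {
private:
    // The length is an implementation detail; callers never see it.
    static constexpr std::size_t size = SnakeCaseSize(Source::value);
    // The converted characters live here for the whole program.
    static constexpr std::array<char, size> storage =
        MakeSnakeCase<size>(Source::value);

public:
    static constexpr std::string_view value{storage.data(), size};
};

// Sources must be namespace-scope types: local classes cannot have
// static data members.
struct Hello { static constexpr std::string_view value = "Hello"; };
struct MyWorld { static constexpr std::string_view value = "MyWorld"; };

static_assert(SnakeCase<Hello>::value == "hello");
static_assert(SnakeCase<MyWorld>::value == "my_world");
```

The trade-off is one small struct per input string instead of one array of strings, so this mostly helps when the set of strings is short and fixed.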

toolic
asked May 22, 2024 at 12:04
\$\endgroup\$
0

2 Answers

4
\$\begingroup\$

This test can catch more than just upper-case characters, and it can also miss some:

i > 0 && str[i] >= 'A' && str[i] <= 'Z'

In an ISO 8859-1 (Latin-1) locale, this fails to catch these upper-case letters: À Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î Ï Ð Ñ Ò Ó Ô Õ Ö Ø Ù Ú Û Ü Ý.

And with EBCDIC encodings, it includes these non-letters: { } \ and other characters (varying with exact codepage, but typically including accented lower-case letters in the Latin ones).

Similarly, this is not a safe substitute for std::tolower():

str[i] + 'a' - 'A'
answered May 22, 2024 at 15:08
\$\endgroup\$
2
  • Yeah, but std::tolower is not constexpr, so we can't use it here. Commented May 23, 2024 at 16:21
  • No, it's not - you'll need to implement a correct constexpr function yourself for these operations. Commented May 23, 2024 at 16:48
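As a rough sketch of what such hand-written constexpr functions could look like in C++17: the helpers below look each character up in explicit alphabet strings instead of relying on 'A'..'Z' being a contiguous range, so they should not pick up punctuation that sits between letters in EBCDIC. The names `kUpperLetters`, `kLowerLetters`, `IsBasicUpper` and `ToBasicLower` are hypothetical, and the sketch deliberately covers only the 26 unaccented Latin letters; accented letters such as À are still not handled.

```cpp
#include <string_view>

// Hypothetical lookup tables: the 26 unaccented Latin letters only.
inline constexpr std::string_view kUpperLetters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
inline constexpr std::string_view kLowerLetters = "abcdefghijklmnopqrstuvwxyz";

// True if `c` is one of the 26 basic upper-case letters.
// Does not assume the letters are contiguous in the character set.
constexpr bool IsBasicUpper(char c) {
    return kUpperLetters.find(c) != std::string_view::npos;
}

// Lower-case counterpart of a basic upper-case letter;
// any other character is returned unchanged.
constexpr char ToBasicLower(char c) {
    const auto pos = kUpperLetters.find(c);
    return pos == std::string_view::npos ? c : kLowerLetters[pos];
}

static_assert(IsBasicUpper('Q'));
static_assert(!IsBasicUpper('q'));
static_assert(ToBasicLower('Q') == 'q');
static_assert(ToBasicLower('!') == '!');
```

A linear search over 26 characters is slower than a range comparison, but since the conversion runs at compile time that cost is paid by the compiler only.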
2
\$\begingroup\$

Unclear identifier:

size_t ret = 0;

What's ret? Return? Since GetSnakeCaseSize() returns "the size of a string_view once converted to snake_case", why not name it size or count?

Simplify:

if (i > 0 && str[i] >= 'A' && str[i] <= 'Z') {
    ret += 2;
}
else {
    ret += 1;
}

to:

ret += (i > 0 && std::isupper(static_cast<unsigned char> (str[i]))) + 1;

Or:

ret += (i > 0 && std::isupper(static_cast<unsigned char> (str[i]))) ? 2 : 1;
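Because std::isupper is not constexpr, the two one-liners above cannot be used as-is inside a constexpr function. A minimal sketch that applies the naming and ternary suggestions while keeping the question's original range test (and therefore its ASCII-only behaviour) might look like this; the local name `starts_new_word` is an assumption introduced for illustration.

```cpp
#include <cstddef>
#include <string_view>

// Size of `str` once converted to snake_case, usable at compile time.
constexpr std::size_t GetSnakeCaseSize(std::string_view str) {
    std::size_t size = 0;
    for (std::size_t i = 0; i < str.length(); ++i) {
        // Original range test kept so the function stays constexpr;
        // it only recognises the unaccented letters 'A'..'Z'.
        const bool starts_new_word = i > 0 && str[i] >= 'A' && str[i] <= 'Z';
        // An upper-case letter after position 0 costs two chars: '_' + letter.
        size += starts_new_word ? 2 : 1;
    }
    return size;
}

static_assert(GetSnakeCaseSize("Hello") == 5);
static_assert(GetSnakeCaseSize("MyWorld") == 8);
```

Naming the condition also addresses the readability concern about doing too much on one line.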
answered May 22, 2024 at 12:34
\$\endgroup\$
4
  • 2
    isupper doesn't seem to be constexpr from what I can see here: en.cppreference.com/w/cpp/string/byte/isupper Commented May 22, 2024 at 16:57
  • 1
    "Simplify": I don't think the one-liner is simpler. I'm not a fan of bool to int conversions in arithmetic and there's just too much happening on this line. Commented May 22, 2024 at 20:43
  • @adepierre You're right, I have removed that part from my answer. Commented May 22, 2024 at 21:20
  • @isanae Well, I added another alternative. Commented May 22, 2024 at 21:21
