3
\$\begingroup\$

I proposed several techniques to answer to https://stackoverflow.com/q/78672409/21691539.

A classical way to answer would be to use std::index_sequence-based solution but op had two constraints:

  • it should work with a non default-constructible type
  • it should work for large std::array sizes, larger than what common std::index_sequence implementations are supporting.

I thus proposed this solution:

// creates an std::array<T, N> whose element i is constructed with gen(i)
template <std::size_t N, typename T, typename Generator>
std::array<T, N> make_array(Generator gen) {
 // checking std::array layout
 static_assert(sizeof(std::array<T, N>) == N * sizeof(T));
 // auxiliary struct used to trigger the call of destructors on the stored objects
 // also managed alignement
 struct ArrayWrapper {
 unsigned char storage[sizeof(std::array<T, N>) + alignof(T)];
 std::size_t tmpOffset =
 std::size_t{reinterpret_cast<std::uintptr_t>(storage) % alignof(T)};
 std::size_t Offset = tmpOffset == 0 ? 0 : alignof(T) - tmpOffset;
 unsigned char *alignedStorage = storage+Offset;
 ~ArrayWrapper() {
 T* toDelete = reinterpret_cast<T*>(alignedStorage);
 for (std::size_t i = 0; i != N; ++i) {
 toDelete[i].~T();
 }
 }
 };
 // create storage
 ArrayWrapper storage;
 // construct objects in storage through placement new
 for (std::size_t i = 0; i != N; ++i) {
 new (storage.storage + storage.Offset + i * sizeof(T))
 T(gen(static_cast<int>(i)));
 }
 
 // forcing move semantic for moving the contained objects instead of copying
 // should be legal as std::array has the correct layout and is implicit lifetime
return std::move(
 *reinterpret_cast<std::array<T, N>*>(storage.storage + storage.Offset));
};

Live with sanitizer
Live without sanitizer
Live without sanitizer, nor string

The idea is to allocate a raw storage, construct the objects in place, alias the storage to an std::array<T,N> and "move-initialize" the created array from the aliased one (see full example below). The temporary resource is released on leaving the creator function (objects are deleted).

I have several concerns regarding this code.

The first one is: it is UB? I believe not (see discussion there but I'm unsure).

The second one is: what is it's footprint? Honestly I didn't profile it properly, neither with respect to execution time (as I couldn't compare to the std::index_sequence-based solution which does not compile at all), nor with respect to memory usage (I believe I've got an overhead of one array).

Eventually would there be ways to improve this design?

Here is a full example, similar to the compiler explorer links above:

#include <array>
#include <cstddef>
#include <cstdint>
#include <utility>
#define USESTRING
#ifdef USESTRING
#include <string>
#endif
struct Int {
 int i;
#ifdef USESTRING
 std::string s;
 Int(int val) : i(val), s{} {}
#else
 constexpr Int(int val) : i(val) {}
#endif
 Int() = delete;
 Int(const Int&) = default;
 Int(Int&&) = default;
 Int& operator=(const Int&) = default;
 Int& operator=(Int&&) = default;
 ~Int() = default;
};
template <std::size_t N, typename T, typename Generator>
std::array<T, N> make_array(Generator gen) {
 static_assert(sizeof(std::array<T, N>) == N * sizeof(T));
 struct ArrayWrapper {
 unsigned char storage[sizeof(std::array<T, N>) + alignof(T)];
 std::size_t tmpOffset =
 std::size_t{reinterpret_cast<std::uintptr_t>(storage) % alignof(T)};
 std::size_t Offset = tmpOffset == 0 ? 0 : alignof(T) - tmpOffset;
 unsigned char *alignedStorage = storage+Offset;
 ~ArrayWrapper() {
 T* toDelete = reinterpret_cast<T*>(alignedStorage);
 for (std::size_t i = 0; i != N; ++i) {
 toDelete[i].~T();
 }
 }
 };
 ArrayWrapper storage;
 for (std::size_t i = 0; i != N; ++i) {
 new (storage.alignedStorage + i * sizeof(T))
 T(gen(static_cast<int>(i)));
 }
 return std::move(
 *reinterpret_cast<std::array<T, N>*>(storage.alignedStorage));
};
int main() {
 constexpr std::size_t N = 50000;
 auto gen = [](int i) { return Int(i); };
 std::array<Int, N> arr = make_array<N, Int>(gen);
 // following line does not compile
 // constexpr std::array<Int, N> arr2 = make_array<N, Int>(gen);
 arr[10] = Int(2);
 return (arr[10].i) * (arr[3].i);
}

EDIT: as suggested in @G.Sliepen answer, I propose also a dynamic-temporary version for review: Presence of UB and memory usage of a std::array initialization: version with temporary array on heap

asked Jul 2, 2024 at 13:05
\$\endgroup\$

1 Answer 1

3
\$\begingroup\$

It's inefficient

The idea of constructing an array from a generator is really nice. However, it is not efficient at all. You are first constructing the whole array in a temporary storage location, which is then moved into the return value, but before returning from the function it first has to destroy the moved-from elements of the temporary storage. Only in very trivial cases will the compiler be able to optimize this away.

If T does have a default constructor, then the naive implementation:

template <std::size_t N, typename T, typename Generator>
std::array<T, N> make_array2(Generator gen) {
 std::array<T, N> storage;
 for (std::size_t i = 0; i != N; ++i) {
 storage[i] = gen(i);
 }
 return storage;
};

Is actually vastly superior, as it will assign values directly in the return value. You could make two overloads or use if constexpr to choose to use the naive version or your version depending on whether T is trivial or has a default constructor.

It always uses stack space

While std::array has the advantage that it can live completely on the stack, you don't have to. You can store a std::array on the heap as well. That is important for large arrays, as stack space can be quite limited. However, your make_array() always puts the temporary storage space on the stack, which could cause a stack overflow if N is very large.

Match the order of template parameters of std::array

I would switch the order of the template parameters N and T so that it matches that of std::array.

answered Jul 2, 2024 at 19:03
\$\endgroup\$
5
  • \$\begingroup\$ Thanks for this feedback. Regarding your first comment, the prerequisite is that T may be not default-constructible. Yet an improvement would be to "if constexpr" and use your implementation when T is default-constructible. Regarding the stack space usage, you can see in the linked question a first version where I used dynamic storage . I thought that using the stack was an improvement but it's not so obvious if I follow your reasoning. May you be so kind as having a look at the dynamic version? As for the last remark: I totally agree, thanks. \$\endgroup\$ Commented Jul 3, 2024 at 7:53
  • \$\begingroup\$ NB I had another version where I removed the explicit destructors call when Twas trivially destructible but it seemed to change nothing for the 3 tested compilers. \$\endgroup\$ Commented Jul 3, 2024 at 7:54
  • \$\begingroup\$ Sorry, I read too fast your answer and I see now that you suggested the "if constexpr". \$\endgroup\$ Commented Jul 3, 2024 at 8:46
  • \$\begingroup\$ Regarding the cost of deallocating, I heard of a technique using pmr to efficiently release memory but I'm unsure of its validity. See stackoverflow.com/questions/78667752/…. Do you think it would be applicable? \$\endgroup\$ Commented Jul 3, 2024 at 8:49
  • \$\begingroup\$ If you want us to review the version that uses dynamic storage, just create a new question here on Code Review. As for polymorphic memory resources: these will only optimize the memory (de)allocation, but do nothing for the construction and destruction of the objects that live in them. That still needs to happen, otherwise you will get UB. \$\endgroup\$ Commented Jul 3, 2024 at 16:02

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.