I proposed several techniques to answer to https://stackoverflow.com/q/78672409/21691539.
A classical way to answer would be to use std::index_sequence
-based solution but op had two constraints:
- it should work with a non default-constructible type
- it should work for large
std::array
sizes, larger than what commonstd::index_sequence
implementations are supporting.
I thus proposed this solution:
// creates an std::array<T, N> whose element i is constructed with gen(i)
template <std::size_t N, typename T, typename Generator>
std::array<T, N> make_array(Generator gen) {
// checking std::array layout
static_assert(sizeof(std::array<T, N>) == N * sizeof(T));
// auxiliary struct used to trigger the call of destructors on the stored objects
// also managed alignement
struct ArrayWrapper {
unsigned char storage[sizeof(std::array<T, N>) + alignof(T)];
std::size_t tmpOffset =
std::size_t{reinterpret_cast<std::uintptr_t>(storage) % alignof(T)};
std::size_t Offset = tmpOffset == 0 ? 0 : alignof(T) - tmpOffset;
unsigned char *alignedStorage = storage+Offset;
~ArrayWrapper() {
T* toDelete = reinterpret_cast<T*>(alignedStorage);
for (std::size_t i = 0; i != N; ++i) {
toDelete[i].~T();
}
}
};
// create storage
ArrayWrapper storage;
// construct objects in storage through placement new
for (std::size_t i = 0; i != N; ++i) {
new (storage.storage + storage.Offset + i * sizeof(T))
T(gen(static_cast<int>(i)));
}
// forcing move semantic for moving the contained objects instead of copying
// should be legal as std::array has the correct layout and is implicit lifetime
return std::move(
*reinterpret_cast<std::array<T, N>*>(storage.storage + storage.Offset));
};
Live with sanitizer
Live without sanitizer
Live without sanitizer, nor string
The idea is to allocate a raw storage, construct the objects in place, alias the storage to an std::array<T,N>
and "move-initialize" the created array from the aliased one (see full example below). The temporary resource is released on leaving the creator function (objects are deleted).
I have several concerns regarding this code.
The first one is: it is UB? I believe not (see discussion there but I'm unsure).
The second one is: what is it's footprint? Honestly I didn't profile it properly, neither with respect to execution time (as I couldn't compare to the std::index_sequence
-based solution which does not compile at all), nor with respect to memory usage (I believe I've got an overhead of one array).
Eventually would there be ways to improve this design?
Here is a full example, similar to the compiler explorer links above:
#include <array>
#include <cstddef>
#include <cstdint>
#include <utility>
#define USESTRING
#ifdef USESTRING
#include <string>
#endif
struct Int {
int i;
#ifdef USESTRING
std::string s;
Int(int val) : i(val), s{} {}
#else
constexpr Int(int val) : i(val) {}
#endif
Int() = delete;
Int(const Int&) = default;
Int(Int&&) = default;
Int& operator=(const Int&) = default;
Int& operator=(Int&&) = default;
~Int() = default;
};
template <std::size_t N, typename T, typename Generator>
std::array<T, N> make_array(Generator gen) {
static_assert(sizeof(std::array<T, N>) == N * sizeof(T));
struct ArrayWrapper {
unsigned char storage[sizeof(std::array<T, N>) + alignof(T)];
std::size_t tmpOffset =
std::size_t{reinterpret_cast<std::uintptr_t>(storage) % alignof(T)};
std::size_t Offset = tmpOffset == 0 ? 0 : alignof(T) - tmpOffset;
unsigned char *alignedStorage = storage+Offset;
~ArrayWrapper() {
T* toDelete = reinterpret_cast<T*>(alignedStorage);
for (std::size_t i = 0; i != N; ++i) {
toDelete[i].~T();
}
}
};
ArrayWrapper storage;
for (std::size_t i = 0; i != N; ++i) {
new (storage.alignedStorage + i * sizeof(T))
T(gen(static_cast<int>(i)));
}
return std::move(
*reinterpret_cast<std::array<T, N>*>(storage.alignedStorage));
};
int main() {
constexpr std::size_t N = 50000;
auto gen = [](int i) { return Int(i); };
std::array<Int, N> arr = make_array<N, Int>(gen);
// following line does not compile
// constexpr std::array<Int, N> arr2 = make_array<N, Int>(gen);
arr[10] = Int(2);
return (arr[10].i) * (arr[3].i);
}
EDIT: as suggested in @G.Sliepen answer, I propose also a dynamic-temporary version for review: Presence of UB and memory usage of a std::array initialization: version with temporary array on heap
1 Answer 1
It's inefficient
The idea of constructing an array from a generator is really nice. However, it is not efficient at all. You are first constructing the whole array in a temporary storage location, which is then moved into the return value, but before returning from the function it first has to destroy the moved-from elements of the temporary storage. Only in very trivial cases will the compiler be able to optimize this away.
If T
does have a default constructor, then the naive implementation:
template <std::size_t N, typename T, typename Generator>
std::array<T, N> make_array2(Generator gen) {
std::array<T, N> storage;
for (std::size_t i = 0; i != N; ++i) {
storage[i] = gen(i);
}
return storage;
};
Is actually vastly superior, as it will assign values directly in the return value. You could make two overloads or use if constexpr
to choose to use the naive version or your version depending on whether T
is trivial or has a default constructor.
It always uses stack space
While std::array
has the advantage that it can live completely on the stack, you don't have to. You can store a std::array
on the heap as well. That is important for large arrays, as stack space can be quite limited. However, your make_array()
always puts the temporary storage space on the stack, which could cause a stack overflow if N
is very large.
Match the order of template parameters of std::array
I would switch the order of the template parameters N
and T
so that it matches that of std::array
.
-
\$\begingroup\$ Thanks for this feedback. Regarding your first comment, the prerequisite is that
T
may be not default-constructible. Yet an improvement would be to "if constexpr
" and use your implementation when T is default-constructible. Regarding the stack space usage, you can see in the linked question a first version where I used dynamic storage . I thought that using the stack was an improvement but it's not so obvious if I follow your reasoning. May you be so kind as having a look at the dynamic version? As for the last remark: I totally agree, thanks. \$\endgroup\$Oersted– Oersted2024年07月03日 07:53:17 +00:00Commented Jul 3, 2024 at 7:53 -
\$\begingroup\$ NB I had another version where I removed the explicit destructors call when
T
was trivially destructible but it seemed to change nothing for the 3 tested compilers. \$\endgroup\$Oersted– Oersted2024年07月03日 07:54:44 +00:00Commented Jul 3, 2024 at 7:54 -
\$\begingroup\$ Sorry, I read too fast your answer and I see now that you suggested the "
if constexpr
". \$\endgroup\$Oersted– Oersted2024年07月03日 08:46:44 +00:00Commented Jul 3, 2024 at 8:46 -
\$\begingroup\$ Regarding the cost of deallocating, I heard of a technique using pmr to efficiently release memory but I'm unsure of its validity. See stackoverflow.com/questions/78667752/…. Do you think it would be applicable? \$\endgroup\$Oersted– Oersted2024年07月03日 08:49:50 +00:00Commented Jul 3, 2024 at 8:49
-
\$\begingroup\$ If you want us to review the version that uses dynamic storage, just create a new question here on Code Review. As for polymorphic memory resources: these will only optimize the memory (de)allocation, but do nothing for the construction and destruction of the objects that live in them. That still needs to happen, otherwise you will get UB. \$\endgroup\$G. Sliepen– G. Sliepen2024年07月03日 16:02:38 +00:00Commented Jul 3, 2024 at 16:02
Explore related questions
See similar questions with these tags.