I have implemented the following pool allocator in C++:
template <typename T>
struct pool {
private:
struct node {
node* next;
T element;
};
private:
std::vector<node*> m_Chunks;
node* m_Head = nullptr;
uint64 m_MaxElements = 0;
bool m_Resizable;
public:
pool(pool const&) = delete;
pool& operator=(pool const&) = delete;
pool(uint64 nElems, bool resiz = false)
: m_Resizable{ resiz } {
m_Head = alloc_chunk(nElems);
}
pool(pool&& o)
: m_Chunks{ std::move(o.m_Chunks) }, m_Head{ o.m_Head },
m_MaxElements{ o.m_MaxElements }, m_Resizable{ o.m_Resizable } {
}
pool& operator=(pool&& o) {
for (auto n : m_Chunks) {
std::free(n);
}
m_Chunks = std::move(o.m_Chunks);
m_Head = o.m_Head;
m_MaxElements = o.m_MaxElements;
m_Resizable = o.m_Resizable;
return *this;
}
~pool() {
for (auto n : m_Chunks) {
std::free(n);
}
}
operator bool() const {
return m_Chunks.size();
}
T* alloc() {
if (!m_Head) {
if (m_Resizable) {
m_Head = alloc_chunk(m_MaxElements);
if (!m_Head) {
return nullptr;
}
}
else {
return nullptr;
}
}
auto h = m_Head;
m_Head = m_Head->next;
return &h->element;
}
void free(T* ptr) {
if (!ptr) {
return;
}
uint8* mem_raw = reinterpret_cast<uint8*>(ptr);
mem_raw -= offsetof(node, element);
node* mem_head = reinterpret_cast<node*>(mem_raw);
mem_head->next = m_Head;
m_Head = mem_head;
}
private:
node* alloc_chunk(uint64 num) {
uint64 alloc_sz = sizeof(node) * num;
node* mem = reinterpret_cast<node*>(std::malloc(alloc_sz));
if (!mem) {
return nullptr;
}
m_Chunks.push_back(mem);
node* it = mem;
for (uint64 i = 1; i < num; ++i, ++it) {
it->next = it + 1;
}
it->next = nullptr;
m_MaxElements += num;
return mem;
}
};
Is the implementation good/correct? Can I make the code higher quality somehow? I have written a small test that stress-tests it and it seems OK, performance is about 5 to 10 times better than default operator new.
Are there modern C++ elements I could use? I've learnt C++11 and 14 this year in college and I've tried to use my knowledge here. I know this should mean that I should make it exception-safe, but games usually not use exceptions (that's why I've included operator bool for minimal error checking) so I've decided not to.
Edit: The test code I used
template <typename FN>
void measure_exec(const char* name, FN f) {
auto start = std::chrono::steady_clock::now();
f();
auto t = std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::steady_clock::now() - start);
std::cout << name << " took " << t.count() << "ms." << std::endl;
}
struct message {
int id;
double timestamp;
};
int main(void) {
std::srand(std::time(nullptr));
std::vector<message*> control;
std::vector<message*> test;
measure_exec("Pool", [&]{
mem::pool<message> pool{ 32, true };
for (uint64 i = 0; i < 200000; ++i) {
if (i % 15) {
// Allocate
int r_id = std::rand();
double r_time = double(std::rand()) / std::rand();
auto t = pool.alloc();
t->id = r_id;
t->timestamp = r_time;
test.push_back(t);
}
else if (control.size()) {
// Delete
uint64 idx = std::rand() % control.size();
test.erase(test.begin() + idx);
}
}
});
measure_exec("New", [&]{
for (uint64 i = 0; i < 200000; ++i) {
if (i % 15) {
// Allocate
int r_id = std::rand();
double r_time = double(std::rand()) / std::rand();
control.push_back(new message{ r_id, r_time });
}
else if (control.size()) {
// Delete
uint64 idx = std::rand() % control.size();
control.erase(control.begin() + idx);
}
}
});
std::cin.get();
return 0;
}
Note that I haven't really tested anything like this before, this testing method was a complete shot in the dark. I know because of randomness it can be unfair but repeating the test yielded similar results. Yes I know I don't free memory but I don't think it matters that much.
2 Answers 2
Comment on testing
If you are not going to bother with delete
then you may as well write a much simpler version.
template <typename T>
struct pool
{
private:
node* m_head = nullptr;
node* m_tail = nullptr;
std::size_t m_max = 0;
bool m_resize= false;
public:
pool(pool const&) = delete;
pool& operator=(pool const&) = delete;
pool(uint64 nElems, bool resiz = false)
: m_max(nElems)
, m_resize(resiz) {
allocBlock();
}
pool(pool&& o) nothrow {
swap(o);
}
pool& operator=(pool&& o) nothrow {
swap(o);
return *this;
}
void swap(pool& other) nothrow {
using std::swap;
swap(m_head, other.m_head);
swap(m_tail, other.m_tail);
swap(m_max, other.m_max);
swap(m_resize, other.m_resize);
}
operator bool() const {
return m_head != m_tail;
}
T* alloc() {
if ((m_head == m_tail) && m_resizable) {
allocBlock();
}
return m_head == m_tail ? nullptr : m_head++;
}
void free(T* ptr) {}
private:
void allocBlock() {
uint64 alloc_sz = sizeof(T) * m_max;
m_head = m_tail = reinterpret_cast<T*>(std::malloc(alloc_sz));
if (m_head == nullptr) {
throw std::bad_alloc;
}
m_tail += m_max;
}
};
Code Review
You should mark your move constructor as noexcept
.
pool(pool&& o) noexcept;
They should be noexcept
anyway; good to let the language police that for you (it terminates if it actually throws). Also it can help if you were to use your pool with standard libraries as it enables some optimizations.
Sure your object is invalid after a move. And will not cause any problems if used correctly.
pool(pool&& o)
: m_Chunks{ std::move(o.m_Chunks) }, m_Head{ o.m_Head },
m_MaxElements{ o.m_MaxElements }, m_Resizable{ o.m_Resizable } {
}
But I think it will cause problems if used incorrectly. The m_Head
pointer still points at the chain of elements. So you can still accidentally use this and cause real problems for the other allocator that just took ownership of the memory.
The standard way to implement this is using swap()
. See my implementation above. This handles situations like this in a safe way.
Same comments about the move assignment operator.
But this looks weird.
pool& operator=(pool&& o) {
for (auto n : m_Chunks) {
std::free(n);
}
You free all the memory. BUT your head still points into this chain and can be-reused. I think that is definitely a bug. A move should not be freeing the memory.
This is strange.
operator bool() const {
return m_Chunks.size();
}
I would have made that return true if there is available memory. Not if we ever successfully allocated memory. But I have not read the standard allocator documentation. So that may be what you want (though I doubt it).
Why are you using uint8
?
uint8* mem_raw = reinterpret_cast<uint8*>(ptr);
Should this not be char*
? The operator offsetof()
returns the offset in bytes. The char by definition fits into 1 byte, thus a char*
is a 1 byte addressable range. Also the standard has special properties for char*
that other integer types don't have.
Also the uint8
type is not standard, though std::uint8_t
is standard. But it is not required to be defined. If it is defined then it has exactly 8 bits but if that is not available the type is not available.
Comments on testing code
Don't allow the vector to expand.
std::vector<message*> control;
std::vector<message*> test;
Otherwise you are testing the memory manager code. Which is very complex and will give you different results (because memory has already been messed with in a previous test).
So pre-allocate the size of the vectors (so they don't reallocate during the test).
control.reserve(2000000);
test.reserve(2000000);
That is way too small a test.
for (uint64 i = 0; i < 200000; ++i) {
Both of these allocators will finish in the blink of an eye. You want a test that lasts a couple of seconds to get real information.
You are not running the constructor.
// Allocation but no construction
auto t = pool.alloc();
// Allocation and construction.
new message{ r_id, r_time };
You could use placement-new
to make them more similar. But personally I would override the new
and delete
operators so that they use your pool.
The test time will be dominated by this call.
uint64 idx = std::rand() % control.size();
test.erase(test.begin() + idx);
You are forcing a move of all the iterators in the vector. Rather than erasing it, just free the memory and set the pointer to nullptr
. That would be a much better test.
You should use exactly the same code for both tests. The only difference should be the allocator used. You should have one test that uses your custom allocator. While the second test uses an allocator that simply calls new
/delete
underneath the hood.
template<typename T>
class normal
{
public:
T* alloc() {return reinterpret_cast<T*>(new char[sizeof(T)]);}
void free(T* ptr) {delete [] reinterpret_cast<char*>(ptr);}
};
int main(void)
{
std::size_t testSize = 200'000'000;
std::vector<message*> control;
std::vector<message*> test;
control.reserve(testSize);
test.reserve(testSize);
measure_exec("Pool", [&]{mem::pool<message> pool{ 32, true }; runtTest(testSize, test, pool);});
measure_exec("New", [&]{mem::normal<message> pool;runtTest(testSize, control, pool);});
std::cin.get();
}
-
\$\begingroup\$ "They should be no-throw anyway good to let the language put that extra testing in for you." 1. I assume you mean
noexcept
, sincenothrow
is a facility used to request different behavior fromoperator new
and doesn't make sense in the context of a move constructor. 2. Even if you addnoexcept
, this will not do any extra checking from the language side for you. The only thing that will happen is that exceptions get converted into calls tostd::terminate
(unless that is what you mean by testing). \$\endgroup\$Ben Steffan– Ben Steffan2018年02月06日 23:38:00 +00:00Commented Feb 6, 2018 at 23:38 -
1\$\begingroup\$ @BenSteffan: Yes that's what I was trying to say. Updated answer. \$\endgroup\$Loki Astari– Loki Astari2018年02月06日 23:40:48 +00:00Commented Feb 6, 2018 at 23:40
-
\$\begingroup\$ Thank you! Actually I forgot to test freeing, that was really stupid of me. Why shouldn't the move assignment free? I don't get that. I set the head into the other pool which should mean I don't point into the old memory. \$\endgroup\$Peter Lenkefi– Peter Lenkefi2018年02月07日 06:21:10 +00:00Commented Feb 7, 2018 at 6:21
Don't abbreviate unnecessarily
Most of the names you've chosen for variables and functions look fine to me, but there are some weird, unnecessary abbreviations:
resiz
-> just writeresize
alloc_sz
-> I would writealloc_size
orsize
Some abbreviations are quite commonly used so I'd keep those (like Elems
and alloc
instead of Elements
and allocation
). However, be consistent: why is it nElems
but m_MaxElements
?
Create a class chunk
Your pool
consists of multiple chunks, and each chunk is an array of node
s. It makes sense to make a proper type for chunks as well. While you are at it, avoid doing manual memory allocations, and let std::vector
do it for you. The alloc_chunk()
function can become the constructor of chunk
.
struct chunk {
std::vector<node> nodes;
chunk(size_t nElems): nodes(nElems) {
for (auto &node: nodes)
node.next = &node + 1;
nodes.back().next = nullptr;
}
};
But this also leads me to:
Consider using a std::deque
to store node
s
Your allocator requires each node
, once allocated, to be at the same place in memory for the rest of its life. If you resize a std::vector
or use realloc()
, the storage might be moved. This is why you are using chunks. However, there is a STL container that gives you stable pointers and the ability to resize: std::deque
. This allows you to get rid of chunks.
template <typename T>
class pool {
struct node {/* same as before */};
std::deque<node> m_Nodes;
node *m_Head{};
public:
pool(pool const&) = delete;
pool& operator=(pool const&) = delete;
pool() = default;
pool(pool&& o) = default;
...
T *alloc() {
if (!m_Head) {
m_Nodes.emplace_back({});
m_Head = &m_Nodes.back();
}
...
}
...
};
Performance
Your allocator is very good when storing large objects and when the game doesn't need to allocate large batches of objects at a time. However, it has the overhead of one pointer per element, so if you are storing small objects, you might be wasting a lot of memory. Furthermore, there is no better way to allocate a batch of objects than to allocate them one by one.
If you want to store small objects (for example, struct message
is not that big), then instead of having a linked list of free nodes, it might be better to use (a tree of) bitmaps to store which elements of the chunk are used or not. Finding a region of consecutive free elements might also be much easier this way.
Use a testing framework
Consider using an existing (unit) testing framework, like CppUnit or Google Test, and a benchmarking framework like Google Benchmark. You still need to write the actual tests yourself, but they take away some of the burden and impose some order to what you are doing.
Explore related questions
See similar questions with these tags.
std::aligned_storage
. You are still hitting the memory system a lot more than you could/should. \$\endgroup\$