I'm reading "Modern C++ Design" (A. Alexandrescu, 2002). The author said "the standard memory allocator is awful for small objects", but it has been two decades after the book has been out, glibc's malloc was really hard to beat.
Anyway, inspired from that, I wrote a simple allocator for fixed size object, in particular, my B-Tree (C++ : B-Tree in C++20 (+Iterator support)) node, which have sizeof(Node) = 64
in 64-bit machine.
template <typename T>
class FixedAllocator {
struct Chunk {
static constexpr std::size_t blockSize_ = sizeof(T);
static constexpr unsigned char numBlocks_ = FixedAllocator::numBlocks_;
Chunk() {
unsigned char i = 0;
for (unsigned char * p = &data_[0]; i != numBlocks_; p += blockSize_) {
*p = ++i;
}
}
void* allocate() {
unsigned char* result = &data_[firstAvailableBlock_ * blockSize_];
firstAvailableBlock_ = *result;
--blocksAvailable_;
return result;
}
void deallocate(void* p) {
assert(p >= &data_[0]);
auto* toRelease = static_cast<unsigned char*>(p);
assert((toRelease - &data_[0]) % blockSize_ == 0);
auto index = static_cast<unsigned char>((toRelease - &data_[0]) / blockSize_);
*toRelease = firstAvailableBlock_;
firstAvailableBlock_ = index;
++blocksAvailable_;
}
bool hasBlock(void* p, std::size_t chunkLength) const {
auto* pc = static_cast<unsigned char*>(p);
return (&data_[0] <= pc) && (pc < &data_[chunkLength]);
}
[[nodiscard]] bool hasAvailable() const {
return (blocksAvailable_ == numBlocks_);
}
[[nodiscard]] bool isFilled() const {
return !blocksAvailable_;
}
unsigned char data_[blockSize_ * numBlocks_];
unsigned char firstAvailableBlock_ = 0;
unsigned char blocksAvailable_ = numBlocks_;
};
private:
static constexpr std::size_t blockSize_ = sizeof(T);
static constexpr unsigned char numBlocks_ = std::numeric_limits<unsigned char>::max();
std::vector<Chunk> chunks_;
Chunk* allocChunk_ = nullptr;
Chunk* deallocChunk_ = nullptr;
Chunk* emptyChunk_ = nullptr;
public:
using value_type = T;
FixedAllocator() {
chunks_.reserve(std::numeric_limits<unsigned char>::max());
}
void* doAllocate() {
assert(!emptyChunk_ || emptyChunk_->hasAvailable());
if (!allocChunk_ || allocChunk_->isFilled()) {
if (emptyChunk_) {
allocChunk_ = emptyChunk_;
emptyChunk_ = nullptr;
} else {
auto it = chunks_.begin();
for (;; ++it) {
if (it == chunks_.end()) {
chunks_.emplace_back();
allocChunk_ = &chunks_.back();
deallocChunk_ = &chunks_.front();
break;
}
if (!it->isFilled()) {
allocChunk_ = &*it;
break;
}
}
}
} else if (allocChunk_ == emptyChunk_) {
emptyChunk_ = nullptr;
}
assert(allocChunk_ && !allocChunk_->isFilled());
assert(!emptyChunk_ || emptyChunk_->hasAvailable());
return allocChunk_->allocate();
}
value_type* allocate(std::size_t n) {
assert(n == 1);
auto* p = static_cast<value_type*>(doAllocate());
return p;
}
Chunk* findVicinity(void* p) const {
assert(!chunks_.empty() && deallocChunk_);
const std::size_t chunkLength = numBlocks_ * blockSize_;
// bidirectional search
Chunk* lo = deallocChunk_;
Chunk* hi = deallocChunk_ + 1;
const Chunk* lbound = &chunks_.front();
const Chunk* hbound = &chunks_.back() + 1;
if (hi == hbound) {
hi = nullptr;
}
for (;;) {
if (lo) {
if (lo->hasBlock(p, chunkLength)) {
return lo;
}
if (lo == lbound) {
lo = nullptr;
if (!hi) {
break;
}
} else {
--lo;
}
}
if (hi) {
if (hi->hasBlock(p, chunkLength)) {
return hi;
}
if (++hi == hbound) {
hi = nullptr;
if (!lo) {
break;
}
}
}
}
return nullptr;
}
void deallocate(void* p, std::size_t n) noexcept {
assert(n == 1);
assert(!chunks_.empty());
assert(&chunks_.front() <= deallocChunk_);
assert(&chunks_.back() >= deallocChunk_);
assert(&chunks_.front() <= allocChunk_);
assert(&chunks_.back() >= allocChunk_);
Chunk* foundChunk = nullptr;
const std::size_t chunkLength = numBlocks_ * blockSize_;
if (deallocChunk_->hasBlock(p, chunkLength)) {
foundChunk = deallocChunk_;
} else if (allocChunk_->hasBlock(p, chunkLength)) {
foundChunk = allocChunk_;
} else {
foundChunk = findVicinity(p);
}
assert(foundChunk && foundChunk->hasBlock(p, chunkLength));
deallocChunk_ = foundChunk;
// double free check
assert(emptyChunk_ != deallocChunk_);
assert(!deallocChunk_->hasAvailable());
assert(!emptyChunk_ || emptyChunk_->hasAvailable());
deallocChunk_->deallocate(p);
if (deallocChunk_->hasAvailable()) {
// only release chunk if there are 2 empty chunks.
if (emptyChunk_) {
// if last chunk is empty, just let deallocChunk_ points
// to empty chunk, and release the last.
// otherwise, swap two and release an empty chunk
Chunk* lastChunk = &chunks_.back();
if (lastChunk == deallocChunk_) {
deallocChunk_ = emptyChunk_;
} else if (lastChunk != emptyChunk_) {
std::swap(*emptyChunk_, *lastChunk);
}
assert(lastChunk->hasAvailable());
chunks_.pop_back();
if ((allocChunk_ == lastChunk) || (allocChunk_->isFilled())) {
allocChunk_ = deallocChunk_;
}
}
emptyChunk_ = deallocChunk_;
}
}
template <typename... Args>
void construct (value_type* p, Args&&... args) {
std::construct_at(p, std::forward<Args>(args)...);
}
void destroy(value_type* p) {
std::destroy_at(p);
}
};
I changed my B-Tree implementation correspondingly, ditching std::unique_ptr
with using these functions:
FixedAllocator<Node> alloc;
Node* make_node() {
Node* ptr = alloc.allocate(1);
alloc.construct(ptr);
return ptr;
}
void erase_node(Node* ptr) {
alloc.destroy(ptr);
alloc.deallocate(ptr, 1);
}
The benchmark gives: (inserting and removing keys from 1~10000 in totally random order, 10000 time averaged)
My allocator:
Inserting: 2783us
Removing: 2435us
Standard allocator:
Inserting: 2700us
Removing: 2724us
Feel free to comment anything!
1 Answer 1
Missing alignment restrictions
A big issue with your allocator is that you don't ensure the storage is correctly aligned for type T
. There are various ways to ensure storage is correctly aligned, but a simple way to do this is by using std::aligned_storage
:
struct Chunk {
...
typename std::aligned_storage<sizeof(T), alignof(T)>::type data_[numBlocks_];
...
};
As a bonus you no longer have to multiple things by blockSize_
yourself.
A Chunk
knows its own length
It's weird to see the parameter chunkLength
for hasBlock()
, and member functions of FixedAllocator
calculating the size of Chunk
's data_[]
and then passing it to hasBlock()
. A Chunk
knows exactly how large its data_[]
is.
Also, if you can make hasBlock()
check that p
is correctly aligned to the start of a block, you can have a single assert(hasBlock(p))
inside deallocate()
.
Naming things
I recommend you try following the naming conventions of the standard C++ library, so users of your library don't have to deal with multiple conventions. For example:
hasBlock()
->contains()
hasAvailable()
->empty()
doAllocate()
->allocate()
(it can coexist with the one that takes an argument)
Don't use std::vector
to store your chunks
A std::vector
might move its contents in memory when it needs to resize.
Your solution to reserve the maximum size up front will potentially waste a lot of memory, and at the same time it now puts a hard limit to the number of objects an application can store using your allocator. And if you are going to give it a fixed capacity anyway, why not use a std::array
?
There are many data structures that are more suitable, the most important thing though is to think about how fast it will be to find the Chunk
that stores an object at a given address when calling FixedAllocator::deallocate()
.
-
1\$\begingroup\$ I'm tweaking my
FixedAllocator
to combine it withMemoryMappedFile
to use it with myBTree
. Anyway,std::aligned_storage
cannot be used without UB, so it is deprecated in C++23 (open-std.org/jtc1/sc22/wg21/docs/papers/2021/p1413r3.pdf) (en.cppreference.com/w/cpp/types/aligned_storage) \$\endgroup\$frozenca– frozenca2022年07月19日 13:45:07 +00:00Commented Jul 19, 2022 at 13:45 -
\$\begingroup\$ Too bad they didn't fix
std::aligned_storage
, but you could indeed also make it work withalignas()
. To work around the issue with types that have extended alignment (which is what would cause issues, if it's not extended then I don't see any UB), you could roundblocksize
up to a multiple ofalignof(T)
. \$\endgroup\$G. Sliepen– G. Sliepen2022年07月19日 18:36:39 +00:00Commented Jul 19, 2022 at 18:36
memory_resource
interface? \$\endgroup\$std::vector<Chunk>
withstd::pmr::vector<Chunk>
, but I don't understand pmr well yet, so deferred that homework to later. \$\endgroup\$