Are the final templated functions `allocate` and `deallocate` thread safe? The `Cache` object is `thread_local`, which also makes `FreeList` `thread_local`. The only parts I'm still in doubt about are the calls to `std::malloc` and `std::free`. Are they always thread safe, or does that depend on the implementation? Would it be better to put a lock around them?
I think the cached memory doesn't need locks, since all the objects here are `thread_local` and each thread has its own copy. The only real question is about `std::malloc` and `std::free`: are they already synchronized?
Here's a self-contained example of all of my code:
#include <cstdint>
#include <cstdlib>
#include <iostream>
#include <new>

class FreeList {
    struct Node {
        Node * next;
    };

    Node * this_head;

public:
    static constexpr auto const NODE_SIZE = sizeof(Node);

    FreeList() noexcept
        : this_head() {
    }

    FreeList(FreeList &&) = delete;
    FreeList(FreeList const &) = delete;
    FreeList & operator =(FreeList &&) = delete;
    FreeList & operator =(FreeList const &) = delete;

    void push(void * const address) noexcept {
        auto const head = static_cast<Node *>(address);
        head->next = this_head;
        this_head = head;
    }

    auto pop() noexcept {
        void * const head = this_head;
        if (head) {
            this_head = this_head->next;
        }
        return head;
    }
};

template <uint64_t const SIZE>
class Cache {
    static_assert(SIZE >= FreeList::NODE_SIZE);

    FreeList this_free_list;

public:
    Cache() noexcept
        : this_free_list() {
    }

    Cache(Cache &&) = delete;
    Cache(Cache const &) = delete;

    ~Cache() noexcept {
        while (auto const address = this_free_list.pop()) {
            // Do I need a lock here?
            std::free(address);
        }
    }

    Cache & operator =(Cache &&) = delete;
    Cache & operator =(Cache const &) = delete;

    auto allocate() {
        if (auto const address = this_free_list.pop()) {
            return address;
        }
        // Do I need a lock here?
        if (auto const address = std::malloc(SIZE)) {
            return address;
        }
        throw std::bad_alloc();
    }

    void deallocate(void * const address) noexcept {
        if (address) {
            this_free_list.push(address);
        }
    }
};

template <typename TYPE>
auto & get() noexcept {
    thread_local TYPE local;
    return local;
}

template <typename TYPE>
auto & get_cache() noexcept {
    return get<Cache<sizeof(TYPE)>>();
}

// Are these thread safe?
template <typename TYPE>
auto allocate() {
    auto const address = get_cache<TYPE>().allocate();
    return static_cast<TYPE *>(address);
}

template <typename TYPE>
void deallocate(void * const address) noexcept {
    get_cache<TYPE>().deallocate(address);
}

int main() {
    auto x = allocate<int64_t>();
    *x = 5;
    std::cout << *x << '\n';
    deallocate<int64_t>(x);
}
- \$\begingroup\$ Which C++ version is this targeting? C++11? C++14? C++17? \$\endgroup\$ — hoffmale, Jul 23, 2018 at 23:44
- \$\begingroup\$ I'm targeting mainly C++17, but the more versions, the better. \$\endgroup\$ — João Pires, Jul 23, 2018 at 23:49
2 Answers
Are the final templated functions `allocate` and `deallocate` thread safe?

As every other state used is thread-local, the answer hinges on this: is the `malloc()`/`free()` memory-management system thread-safe?
The C++ Standard just defers to the C standard:
Effects: These functions have the semantics specified in the C standard library.
And the C Standard says something like (from C11 final draft n1570):
- For purposes of determining the existence of a data race, memory allocation functions behave as though they accessed only memory locations accessible through their arguments and not other static duration storage. These functions may, however, visibly modify the storage that they allocate or deallocate. A call to `free` or `realloc` that deallocates a region p of memory synchronizes with any allocation call that allocates all or part of the region p. This synchronization occurs after any access of p by the deallocating function, and before any such access by the allocating function.
So, in the end, the answer is:
Yes, it's thread-safe.
Now, let's critique your code:
- You are only using `FreeList` for `Cache`, and don't call any of its methods more than once. Abstraction and encapsulation are tools for managing complexity, but all that useless boilerplate increases it instead.
- There is no valid reason `Cache` and `FreeList` aren't literal types, which don't depend on dynamic initialization. Use `constexpr`, or better yet move to an in-class initializer and `= default` the default constructor.
- To make a type non-copy- and non-move-constructible and -assignable, it suffices to declare an explicitly deleted move constructor or move-assignment operator.
- While it's a good thing `Cache` `static_assert`s that the block size is big enough, make `get_cache()` fix the request if needed. The caller should not have to care.
- Consider writing an Allocator using your caching system. That way, it can be used by standard containers, instead of just manually. Though admittedly, as the block size is a compile-time constant, it can only be used in containers allocating single nodes instead of whole arrays.
- 1 \$\begingroup\$ Yes, it's thread-safe. Whether it's a good idea depends on your use-profile. And changing to use the interface an Allocator should provide is also recommended, so you can use it for standard containers. \$\endgroup\$ — Deduplicator, Jul 24, 2018 at 0:14
- 1 \$\begingroup\$ @hoffmale: If you look at the code, `std::malloc()` and `std::free()` are the only places where shared data structures are used. Everything else is `thread_local` (whether directly or indirectly), and thus already thread-safe. \$\endgroup\$ — Deduplicator, Jul 24, 2018 at 0:16
- 1 \$\begingroup\$ @JoãoPires: The problem is that the "unsafe middle" is `public`ly accessible. If I want a non-static `Cache`, I can simply create one - and that one isn't thread-safe (other than the constructor and destructor). \$\endgroup\$ — hoffmale, Jul 24, 2018 at 0:24
- 1 \$\begingroup\$ @hoffmale Sorry, but if you follow that logic, just about nothing is thread-safe, because it's built out of non-thread-safe building blocks, most of which are publicly available. \$\endgroup\$ — Deduplicator, Jul 24, 2018 at 0:26
- 1 \$\begingroup\$ @JoãoPires: That's some context not obvious from the question. // @Deduplicator: My problem is that a blanket statement "this is thread-safe" doesn't hold if there are non-thread-safe access paths. This doesn't mean building blocks all have to be thread-safe, just that the non-thread-safe parts must not be `public`ly accessible and have to be properly regulated. (Like having the safest door doesn't make a house burglar-proof if the ground-level windows are wide open.) \$\endgroup\$ — hoffmale, Jul 24, 2018 at 1:08
Thread safety

`std::malloc` and `std::free` are thread-safe by themselves... but `FreeList` and `Cache` aren't, unless exclusively accessed through `allocate` or `deallocate`.

In all other cases some synchronization is needed in `FreeList::push` and `FreeList::pop` (or alternatively, in `Cache::allocate` and `Cache::deallocate`).
There are some options to make those two classes thread-safe for all access paths:

- Move `get`, `get_cache`, `Cache` and `FreeList` into a `class` as `private` nested classes / member functions (so they aren't publicly accessible anymore) and make `allocate` and `deallocate` a `friend` of that class. (A private/anonymous `namespace` won't work, since it has to be just as accessible as `allocate`/`deallocate` because it has to reside in a header.)

  Why move `get` and `get_cache`? Because references and pointers to `thread_local` objects can be shared with other threads - so a thread might get access to another thread's `thread_local Cache` inside `get`.

  `thread_local` objects are only thread-safe if no reference or pointer to them gets shared with other threads (which might happen accidentally, e.g. by a mismanaged lambda capture). This means `allocate` and `deallocate` are not thread-safe as long as references to the underlying `Cache` (via `get` or `get_cache`) could be shared by someone else. And since `allocate` and `deallocate` have exactly the same level of access as `get` or `get_cache`, the posted implementation is not thread-safe in the general case.

- Add some locking inside `FreeList`, or make `FreeList` lock-free by making `this_head` a `std::atomic<Node*>`.

  This would make the critical sections inside `Cache::allocate` and `Cache::deallocate` thread-safe as well.
Memory management

- `Cache::allocate` won't provide correctly aligned memory for alignments greater than `alignof(std::max_align_t)` (`std::malloc` is only specified to support alignments up to that value). This might cause problems if higher alignments are needed, e.g. for SSE or AVX instructions. `std::aligned_alloc` could be used as a replacement. (Beware: `std::aligned_alloc` isn't available for MSVC yet.)
- It isn't possible to create objects whose size is less than `sizeof(void*)` - so exchanging `int64_t` for `int32_t` inside `main` will fail to compile on 64-bit systems. This could be fixed by always allocating chunks of memory that are at least `sizeof(void*)` (or `FreeList::NODE_SIZE`, if you prefer).
Naming

- `Cache` doesn't actually cache objects - only chunks of memory. `FreeListAllocator` might be a better name.
- `get` is a very generic name. `(get_)thread_local_instance` or `singleton` might be more descriptive.
- Similarly, `get_cache` might better be `get_allocator` or `get_thread_local_allocator`.
General stuff

- With some slight modifications `Cache` could be changed into a standard-library-compatible `std::pmr::memory_resource` - which would allow it to be used with standard containers.
- If I had to design this, I'd likely make `FreeList` an adaptor over another allocator (taken as a template parameter) which would be asked for memory if the list was empty. This design would allow easy composition of different allocation strategies. There's an excellent talk by Andrei Alexandrescu about this topic.
- \$\begingroup\$ Adding synchronisation to code which is never used with data shared across threads is quite wasteful. \$\endgroup\$ — Deduplicator, Jul 24, 2018 at 16:17
- \$\begingroup\$ @Deduplicator: "never used with data shared across threads" is not a quality that OP's implementation provides. `get` returns a reference to a `thread_local` object - but that reference can be passed to other threads, which then would still reference the original object. As long as `get` and `get_cache` are as accessible as `allocate` and `deallocate`, none of those functions are thread-safe. So to make it thread-safe, either the access via `get`/`get_cache` needs to be made `private`, or the critical operations (at least `FreeList::push`/`FreeList::pop`) need to be synchronized. \$\endgroup\$ — hoffmale, Jul 24, 2018 at 16:22
- \$\begingroup\$ Just using `thread_local` doesn't mean "inaccessible from other threads". Yes, using the corresponding name refers to a different instance on each thread, but pointers or references to those instances can still be shared. \$\endgroup\$ — hoffmale, Jul 24, 2018 at 16:23
- \$\begingroup\$ It's about as good as it reasonably gets (meaning without encumbering usability, maintainability, or efficiency unreasonably), but C++ isn't really designed to enforce such things. If the user wants, he can circumvent any obstacle, reasonable or not. \$\endgroup\$ — Deduplicator, Jul 24, 2018 at 16:43
- \$\begingroup\$ @Deduplicator: I don't think it's an obstacle if the "circumvention" is as simple as passing the result from `get_cache` to a function or lambda which in turn then gets called by a thread pool. The caller gets a reference, passes a reference, and suddenly he has data races - by only using an API that your answer advertised as thread-safe. \$\endgroup\$ — hoffmale, Jul 24, 2018 at 16:50