Are the final templated functions `allocate` and `deallocate` thread safe? The `Cache` object is `thread_local`, which also makes the `FreeList` inside it `thread_local`; the only parts I'm still in doubt about are the calls to `std::malloc` and `std::free`. Are they always thread safe, or does that depend on the implementation? Would it be better to put a lock around them?
I think the cached memory doesn't need locks, since all the objects here are `thread_local` and each thread has its own copy. The only real problem is `std::malloc` and `std::free` - are they already synchronized?
Here's a self-contained example of all of my code:
    #include <cstdint>   // uint64_t, int64_t
    #include <cstdlib>   // std::malloc, std::free
    #include <iostream>
    #include <new>       // std::bad_alloc

    class FreeList {
        struct Node {
            Node * next;
        };

        Node * this_head;

    public:
        static constexpr auto const NODE_SIZE = sizeof(Node);

        FreeList() noexcept
            : this_head() {
        }

        FreeList(FreeList &&) = delete;
        FreeList(FreeList const &) = delete;
        FreeList & operator =(FreeList &&) = delete;
        FreeList & operator =(FreeList const &) = delete;

        void push(void * const address) noexcept {
            auto const head = static_cast<Node *>(address);
            head->next = this_head;
            this_head = head;
        }

        auto pop() noexcept {
            void * const head = this_head;
            if (head) {
                this_head = this_head->next;
            }
            return head;
        }
    };

    template <uint64_t const SIZE>
    class Cache {
        static_assert(SIZE >= FreeList::NODE_SIZE);

        FreeList this_free_list;

    public:
        Cache() noexcept
            : this_free_list() {
        }

        Cache(Cache &&) = delete;
        Cache(Cache const &) = delete;

        ~Cache() noexcept {
            while (auto const address = this_free_list.pop()) {
                // Do I need a lock here?
                std::free(address);
            }
        }

        Cache & operator =(Cache &&) = delete;
        Cache & operator =(Cache const &) = delete;

        auto allocate() {
            if (auto const address = this_free_list.pop()) {
                return address;
            }
            // Do I need a lock here?
            if (auto const address = std::malloc(SIZE)) {
                return address;
            }
            throw std::bad_alloc();
        }

        void deallocate(void * const address) noexcept {
            if (address) {
                this_free_list.push(address);
            }
        }
    };

    template <typename TYPE>
    auto & get() noexcept {
        thread_local TYPE local;
        return local;
    }

    template <typename TYPE>
    auto & get_cache() noexcept {
        return get<Cache<sizeof(TYPE)>>();
    }

    // Are these thread safe?
    template <typename TYPE>
    auto allocate() {
        auto const address = get_cache<TYPE>().allocate();
        return static_cast<TYPE *>(address);
    }

    template <typename TYPE>
    void deallocate(void * const address) noexcept {
        get_cache<TYPE>().deallocate(address);
    }

    int main() {
        auto x = allocate<int64_t>();
        *x = 5;
        std::cout << *x << '\n';
        deallocate<int64_t>(x);
    }
- Which C++ version is this targeting? C++11? C++14? C++17? – hoffmale, Jul 23, 2018 at 23:44
- I'm targeting mainly C++17, but the more versions, the better. – João Pires, Jul 23, 2018 at 23:49
2 Answers
> Are the final templated functions `allocate` and `deallocate` thread safe?

As every other state used is thread-local, the answer hinges on this: is the `malloc()`/`free()` memory-management system thread-safe?

The C++ standard just defers to the C standard:

> Effects: These functions have the semantics specified in the C standard library.

And the C standard says (from C11 final draft n1570):

> For purposes of determining the existence of a data race, memory allocation functions behave as though they accessed only memory locations accessible through their arguments and not other static duration storage. These functions may, however, visibly modify the storage that they allocate or deallocate. A call to `free` or `realloc` that deallocates a region p of memory synchronizes with any allocation call that allocates all or part of the region p. This synchronization occurs after any access of p by the deallocating function, and before any such access by the allocating function.

So, in the end, the answer is: yes, it's thread-safe.
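To make the quoted guarantee concrete, here is a small illustration of my own (not part of the original answer): memory `malloc`'d on one thread can safely be `free`'d on another without any extra locking, because the deallocating call synchronizes with the allocating one. The helper names are mine.

```cpp
#include <cstdlib>
#include <thread>

// Allocate on a helper thread; join() establishes the happens-before
// relationship needed to read p on the calling thread.
void * malloc_on_other_thread(std::size_t const size) {
    void * p = nullptr;
    std::thread t([&p, size] { p = std::malloc(size); });
    t.join();
    return p;
}

// Free on yet another thread - per the C standard quote above, this needs
// no lock of our own: free() synchronizes with the allocation call.
void free_on_other_thread(void * const p) {
    std::thread t([p] { std::free(p); });
    t.join();
}
```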
Now, let's critique your code:
- You are only using `FreeList` for `Cache`, and don't call any of its methods more than once. Abstraction and encapsulation are tools for managing complexity, but all that useless boilerplate increases it instead.
- There is no valid reason `Cache` and `FreeList` aren't literal types, which don't depend on dynamic initialization. Use `constexpr`, or better yet move to an in-class initializer and `= default` the default-ctor.
- To make a type non-copy- and non-move-constructible and -assignable, it suffices to declare an explicitly deleted move-ctor or move-assignment-operator.
- While it's a good thing `Cache` `static_assert`s the block-size is big enough, make `get_cache()` fix the request if needed. The caller should not have to care.
- Consider writing an Allocator using your caching-system. That way, it can be used by standard containers, instead of just manually. Though admittedly, as the block-size is a compile-time constant, it can only be used in containers allocating single nodes instead of whole arrays.
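Taken together, those points might shrink the code to something like the sketch below (my reading of the suggestions, not the answerer's code): `FreeList` folded into `Cache`, an in-class initializer instead of a hand-written constructor, and only the move constructor explicitly deleted - which suppresses the copy operations and both assignments as well.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

template <std::size_t SIZE>
class Cache {
    struct Node {
        Node * next;
    };
    static_assert(SIZE >= sizeof(Node));

    Node * head = nullptr; // in-class initializer: no hand-written ctor needed

public:
    Cache() = default;
    Cache(Cache &&) = delete; // also deletes copy-ctor and both assignments

    ~Cache() noexcept {
        while (head) { // release every cached block
            auto const next = head->next;
            std::free(head);
            head = next;
        }
    }

    void * allocate() {
        if (head) { // reuse a cached block if we have one
            auto const block = head;
            head = head->next;
            return block;
        }
        if (auto const block = std::malloc(SIZE)) {
            return block;
        }
        throw std::bad_alloc();
    }

    void deallocate(void * const address) noexcept {
        if (address) { // cache the block instead of freeing it
            auto const node = static_cast<Node *>(address);
            node->next = head;
            head = node;
        }
    }
};
```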
- Yes, it's thread-safe. Whether it's a good idea depends on your use-profile. And changing to use the interface an Allocator should provide is also recommended, so you can use it for standard containers. – Deduplicator, Jul 24, 2018 at 0:14
- @hoffmale: If you look at the code, `std::malloc()` and `std::free()` are the only places where shared data-structures are used. Everything else is `thread_local` (whether directly or indirectly), and thus already thread-safe. – Deduplicator, Jul 24, 2018 at 0:16
- @JoãoPires: The problem is that the "unsafe middle" is publicly accessible. If I want a non-static `Cache`, I can simply create one - and that one isn't thread-safe (other than the constructor and destructor). – hoffmale, Jul 24, 2018 at 0:24
- @hoffmale Sorry, but if you follow that logic, just about nothing is thread-safe, because it's built out of non-thread-safe building-blocks, most of which are publicly available. – Deduplicator, Jul 24, 2018 at 0:26
- @JoãoPires: That's some context not obvious from the question. // @Deduplicator: My problem is that a blanket statement "this is thread-safe" doesn't hold if there are non-thread-safe access paths. This doesn't mean building blocks all have to be thread-safe, just that the non-thread-safe parts must not be publicly accessible and have to be properly regulated. (Like having the safest door doesn't make a house burglar-proof if the ground-level windows are wide open.) – hoffmale, Jul 24, 2018 at 1:08
Thread safety
`std::malloc` and `std::free` are thread-safe by themselves... But `FreeList` and `Cache` aren't, unless exclusively accessed through `allocate` or `deallocate`.
In all other cases some synchronization is needed in `FreeList::push` and `FreeList::pop` (or alternatively, `Cache::allocate` and `Cache::deallocate`).
There are some options to make those two classes thread-safe for all access paths:

- Move `get`, `get_cache`, `Cache` and `FreeList` into a `class` as `private` nested classes / member functions (so they aren't publicly accessible anymore) and make `allocate` and `deallocate` a `friend` of that class. (A `private`/anonymous `namespace` won't work, since it has to be just as accessible as `allocate`/`deallocate` because it has to reside in a header.)

  Why move `get` and `get_cache`? Because references and pointers to `thread_local` objects can be shared with other threads - so a thread might get access to another thread's `thread_local Cache` inside `get`. `thread_local` objects are only thread-safe if no reference or pointer to them gets shared with other threads (which might happen accidentally, e.g. by a mismanaged lambda capture).

  This means `allocate` and `deallocate` are not thread-safe as long as references to the underlying `Cache` (via `get` or `get_cache`) could be shared by someone else. And since `allocate` and `deallocate` have exactly the same level of access as `get` or `get_cache`, the posted implementation is not thread-safe in the general case.

- Add some locking inside `FreeList`, or make `FreeList` lock-free by making `this_head` a `std::atomic<Node *>`. This would make the critical section inside `Cache::allocate` and `Cache::deallocate` thread-safe as well.
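For the second option, a lock-free `this_head` could look roughly like the sketch below (my illustration, not code from the answer). Beware: the naive `pop` shown here has the classic ABA / use-after-free hazard when several threads pop concurrently; production implementations use tagged pointers or hazard pointers. It is shown only to illustrate the suggested direction.

```cpp
#include <atomic>

class AtomicFreeList {
    struct Node {
        Node * next;
    };

    std::atomic<Node *> head_{nullptr};

public:
    void push(void * const address) noexcept {
        auto const node = static_cast<Node *>(address);
        node->next = head_.load(std::memory_order_relaxed);
        // On failure, compare_exchange_weak reloads the current head into
        // node->next, so we just retry until the CAS succeeds.
        while (!head_.compare_exchange_weak(node->next, node,
                                            std::memory_order_release,
                                            std::memory_order_relaxed)) {
        }
    }

    void * pop() noexcept {
        auto node = head_.load(std::memory_order_acquire);
        // ABA hazard: between the load and the CAS another thread may pop
        // this node, free it, and push a block with the same address.
        while (node && !head_.compare_exchange_weak(node, node->next,
                                                    std::memory_order_acquire,
                                                    std::memory_order_relaxed)) {
        }
        return node;
    }
};
```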
Memory management
- `Cache::allocate` won't provide correctly aligned memory for alignments greater than `alignof(std::max_align_t)` (`std::malloc` is only specified to support alignments up to that value). This might cause problems if higher alignments are needed, e.g. for SSE or AVX instructions. `std::aligned_alloc` could be used as a replacement. (Beware: `std::aligned_alloc` isn't available for MSVC yet.)
- It isn't possible to create objects whose size is less than `sizeof(void *)` - so exchanging `int64_t` for `int32_t` inside `main` will fail to compile on 64-bit systems. This could be fixed by always allocating chunks of memory that are at least `sizeof(void *)` big (or `FreeList::NODE_SIZE`, if you prefer).
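The minimum-size fix from the second bullet can be sketched as a small `constexpr` helper (the name `block_size` and the standalone `NODE_SIZE` constant are mine, standing in for `FreeList::NODE_SIZE`): round every request up to the free list's node size before instantiating `Cache`.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Stand-in for FreeList::NODE_SIZE from the question: a free-list node is
// just one pointer, so this is the smallest block the list can track.
constexpr std::size_t NODE_SIZE = sizeof(void *);

// Round the requested size up, so Cache<block_size<TYPE>()> compiles even
// for types smaller than a pointer (e.g. int32_t on 64-bit systems).
template <typename TYPE>
constexpr std::size_t block_size() noexcept {
    return std::max(sizeof(TYPE), NODE_SIZE);
}
```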
Naming
- `Cache` doesn't actually cache objects - only chunks of memory. `FreeListAllocator` might be a better name.
- `get` is a very generic name. `(get_)thread_local_instance` or `singleton` might be more descriptive.
- Similarly, `get_cache` might better be `get_allocator` or `get_thread_local_allocator`.
General stuff
- With some slight modifications `Cache` could be changed into a standard-library-compatible `std::pmr::memory_resource` - which would allow it to be used with standard containers.
- If I had to design this, I'd likely make `FreeList` an adaptor over another allocator (taken as a template parameter) which would be asked for memory whenever the list is empty. This design would allow easy composition of different allocation strategies. There's an excellent talk by Andrei Alexandrescu about this topic.
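A rough sketch of the `std::pmr::memory_resource` direction (the class name and all details are illustrative, not from the answer): the free list becomes the backing of `do_allocate`/`do_deallocate`, so any `std::pmr` container can be pointed at it. This single-threaded sketch recycles only blocks of the configured size and falls back to `operator new` otherwise.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory_resource>
#include <new>

class FreeListResource : public std::pmr::memory_resource {
    struct Node {
        Node * next;
    };

    Node * head_ = nullptr;
    std::size_t block_size_;

public:
    explicit FreeListResource(std::size_t const block_size) noexcept
        : block_size_(std::max(block_size, sizeof(Node))) {
    }

    ~FreeListResource() override {
        while (head_) { // release all cached blocks
            auto const next = head_->next;
            ::operator delete(head_);
            head_ = next;
        }
    }

private:
    void * do_allocate(std::size_t const bytes, std::size_t const alignment) override {
        // Reuse a cached block when the request fits it.
        if (head_ && bytes <= block_size_ && alignment <= alignof(std::max_align_t)) {
            auto const block = head_;
            head_ = head_->next;
            return block;
        }
        return ::operator new(std::max(bytes, block_size_));
    }

    void do_deallocate(void * const p, std::size_t const bytes, std::size_t) override {
        if (bytes <= block_size_) { // cache the block for reuse
            auto const node = static_cast<Node *>(p);
            node->next = head_;
            head_ = node;
        } else {
            ::operator delete(p);
        }
    }

    bool do_is_equal(std::pmr::memory_resource const & other) const noexcept override {
        return this == &other;
    }
};
```

A container such as `std::pmr::list<int>` could then be constructed with a pointer to a `FreeListResource` instance, so every node allocation goes through the free list.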
- Adding synchronisation to code which is never used with data shared across threads is quite wasteful. – Deduplicator, Jul 24, 2018 at 16:17
- @Deduplicator: "never used with data shared across threads" is not a quality that OP's implementation provides. `get` returns a reference to a `thread_local` object - but that reference can be passed to other threads, which then would still reference the original object. As long as `get` and `get_cache` are as accessible as `allocate` and `deallocate`, none of those functions are thread-safe. So to make it thread-safe either the access via `get`/`get_cache` needs to be made `private` or the critical operations (at least `FreeList::push`/`FreeList::pop`) need to be synchronized. – hoffmale, Jul 24, 2018 at 16:22
- Just using `thread_local` doesn't mean "inaccessible from other threads". Yes, using the corresponding name refers to a different instance on each thread, but pointers or references to those instances can still be shared. – hoffmale, Jul 24, 2018 at 16:23
- It's about as good as it reasonably gets (meaning without encumbering usability, maintainability, or efficiency unreasonably), but C++ isn't really designed to enforce such things. If the user wants, he can circumvent any obstacle, reasonable or not. – Deduplicator, Jul 24, 2018 at 16:43
- @Deduplicator: I don't think it's an obstacle if the "circumvention" is as simple as passing the result from `get_cache` to a function or lambda which in turn then gets called by a thread pool. The caller gets a reference, passes a reference, and suddenly he has data races - by only using an API that your answer advertised as thread-safe. – hoffmale, Jul 24, 2018 at 16:50