The background is the question at https://stackoverflow.com/questions/36634394/nested-shared-ptr-destruction-causes-stack-overflow. To summarize, for a data structure that looks like:
```cpp
// A node in a singly linked list.
struct Node {
    int head;
    std::shared_ptr<Node> tail;
};
```
If the list grows too long, a stack overflow might occur in the destructor of `Node` due to recursive destruction of the `shared_ptr`s. I'm implementing a thread-safe version of the delete engine proposed in https://stackoverflow.com/a/36635668/3234803. There are three simple classes involved: `SpinLock`, `ConcurrentQueue`, and `DeleteEngine`.
The `SpinLock` class implements the C++ Lockable interface with a `std::atomic_flag`:
```cpp
class SpinLock {
public:
    void lock()
    { while (lock_.test_and_set(std::memory_order_acquire)) {} }

    bool try_lock()  // Not camelCase, to be compatible with the C++ Lockable interface
    { return !lock_.test_and_set(std::memory_order_acquire); }

    void unlock()
    { lock_.clear(std::memory_order_release); }

private:
    std::atomic_flag lock_ = ATOMIC_FLAG_INIT;
};
```
The `ConcurrentQueue` class implements a multiple-producer, single-consumer queue that supports thread-safe `enqueue` and `tryDequeue`:
```cpp
template<typename T>
class ConcurrentQueue {
public:
    void enqueue(T item)
    {
        std::lock_guard<SpinLock> lk(lock_);
        queue_.emplace_back(std::move(item));
    }

    bool tryDequeue(T &item)
    {
        std::lock_guard<SpinLock> lk(lock_);
        if (queue_.empty()) {
            return false;
        }
        item = std::move(queue_.front());
        queue_.pop_front();
        return true;
    }

private:
    std::deque<T> queue_;
    SpinLock lock_;
};
```
The `DeleteEngine` class implements the delete engine proposed in the aforementioned answer:
```cpp
template<typename T>
class DeleteEngine {
public:
    ~DeleteEngine()
    {
        std::lock_guard<SpinLock> lk(deleting_);
        deleteAll();
    }

    void enqueue(T *p)
    {
        queue_.enqueue(p);
        if (deleting_.try_lock()) {
            std::lock_guard<SpinLock> lk(deleting_, std::adopt_lock);
            deleteAll();
        }
    }

private:
    void deleteAll()
    {
        T *p = nullptr;
        while (queue_.tryDequeue(p)) {
            delete p;
        }
    }

    ConcurrentQueue<T *> queue_;
    SpinLock deleting_;
};
```
Now in the deleter of a `shared_ptr<Node>`, we hand the raw pointer to a `DeleteEngine<Node>` instead of `delete`ing it directly. A recursive `delete` can handle only up to about 10000 nodes, while this method can handle an arbitrarily large number of nodes.
The above code is only a prototype, but I'm particularly concerned about the performance of this implementation: (1) most of the time the `Node` class would be used in a single-threaded environment; (2) occasionally it might be used in highly concurrent applications, e.g. a web server that constantly creates and destroys objects.
1 Answer
To address the locking overhead in the single-threaded case, my first advice is (as with all performance questions) to measure before optimising! It's likely that the locking overhead is tiny compared to the queue's memory management.
If you really find you need to streamline the locking, then you'll probably want to have two versions of the engine. The simplest way to do that is probably to add a boolean parameter to the template, and then use `if constexpr` at the points where the code differs. Or perhaps make the lock type a template parameter, and supply a no-op lock when creating the single-threaded version.
Comments on the answer point out a race in `DeleteEngine::enqueue`:

> In `DeleteEngine::enqueue`, there's no guarantee that a queued pointer will be seen by `deleteAll` before the lock is released (or even before `deleteAll` returns).

> Thread A calls `DeleteEngine::enqueue` and takes the lock, then starts deleting. Thread B now calls `enqueue` as well. Before it enqueues, though, `deleteAll` on thread A returns (but the lock is not released yet). Thread B then enqueues and sees the lock is still taken, so it returns. Thread A then releases the lock. The result: there is still an item in the queue that will not be deleted until the next call to `deleteAll`.