Assumptions for use:
push() will only ever be called by a single thread.
pop() will only ever be called by a single thread.
Requirements:
push() needs to be as fast as possible. Requirements are stringent enough that I've determined that I can't use std::mutex (at least not in any way I've attempted to) as it is far too slow. (some initial stress tests took 6-8 ms without and 26-60 ms with std::mutex)
Code:
template<typename T>
class PCQueue {
const size_t size;
std::unique_ptr<T[]> contents;
std::atomic<size_t> head;
std::atomic<size_t> tail;
public:
explicit PCQueue(size_t s) : size(s), contents(new T[s]), head(0), tail(0) {}
//num of elements in ringbuffer (size - room())
size_t count() {
return (tail + size - head) % size;
}
//num of free spaces in ringbuffer (size - count())
size_t room() {
return (head + size - tail) % size;
}
void push(T t) {
size_t newTail = (tail + 1) % size;
if (newTail == head) {
throw std::runtime_error("Pushing Full PCQueue");
}
contents[tail] = t;
tail = newTail;//std::atomic implicitly memfences, so this op must occur after the "contents[tail] = t;"
}
T pop() {
if (tail == head) {
throw std::runtime_error("Popping Empty PCQueue");
}
T ret = contents[head];
size_t newHead = (head + 1) % size;
head = newHead;
return ret;
}
};
My testing has implied that this performs correctly (but I'd like to squeeze more speed out of it if possible), but I'm not entirely certain my assumption about the push method is correct (the comment regarding memory barriers), as I'm still getting a handle on those, and I'm not sure I understand how loads and stores can be reordered regarding atomic and non-atomic variables, with the various memory orders. sequential consistency (the default) should enforce the desired behavior I think?
I'm worried that it's just working by coincidence (as concurrent code has a tendency to do), so a verification that my code is valid would be appreciated, as well as any general tips to improve it both in readability and performance.
3 Answers 3
Design
The problem with your implementation of pop()
is that it can not be implemented in an exception safe manner (for any type of T). This is why the standard implementation implements this as two different methods top()
and pop()
. The top()
function simply returns the value while pop()
does not return the value but simply removes the value from head
.
So usage would become:
T val = queue.top();
queue.pop();
Code Review
Your code assumes that T
has a default constructor.
std::unique_ptr<T[]> contents;
When you create the content
with contents(new T[s])
you are allocating and initializing all the elements in that array using the default constructor.
In addition to this if T
is very expensive you have just used a lot of effort to create the items that may never be used (or you wasted resources creating the objects that are just going to be destroyed when overwritten).
I would consider changing this to std::vector<T>
. That way you don't create any un-required objects when first created.
This is fine:
explicit PCQueue(size_t s) : size(s), contents(new T[s]), head(0), tail(0) {}
But you can make it more readable. Remember the point of code is to make the code maintainable.
explicit PCQueue(size_t s)
: size(s)
, contents(new T[s])
, head(0)
, tail(0)
{}
Is this the real definition of the function?
//num of elements in ringbuffer (size - room())
size_t count() {
return (tail + size - head) % size;
}
I would rename this function to reflect what the function is actually returning. I would also change the formula so it is easy to read. I would call this availableSpace()
. The formula is: size - (head - tail)
.
This would be better named freeSpace()
.
//num of free spaces in ringbuffer (size - count())
size_t room() {
return (head + size - tail) % size;
}
You pass the parameter by value.
void push(T t) {
This causes a copy to be made to pass it to the parameter. You should pass this by const reference (so there is no copy). If you want to get more advanced you could also pass by r-value reference which would allow you to move the object into your buffer rather than copying it.
void push(T const& t) { // Pass by const reference
void push(T&& t) { // Pass by r-value reference
The problem with the pop()
as written you can not guarantee a clean removal. This is because you need to make several copies and if those copies fail you can leave the container in a bad state.
T pop() {
T ret = contents[head]; // Here is a copy.
size_t newHead = (head + 1) % size;
head = newHead;
return ret; // Here is another copy.
}
You did not define T
so you don't have control over the copy assignment constructor. That is why it is usually split into two functions.
// Return by reference
// That way we can avoid any un-needed copies.
T const& top() const {
return contents[head];
}
// Simply remove the head item in the pop.
void pop() {
size_t newHead = (head + 1) % size;
head = newHead;
}
-
\$\begingroup\$ Thank you! All of this is quite useful, except I think you misunderstood the count() method. it is intended to return the number of elements currently in the queue (i.e. how many times pop() can be called at most) (and testing has verified that it does, or at least seems to, even if head and or tail have wrapped around back to the beginning). having an "available space" and a "free space" method seems redundant and confusing, could I get clarification? \$\endgroup\$Phi– Phi2019年05月01日 14:32:37 +00:00Commented May 1, 2019 at 14:32
-
\$\begingroup\$ @Phi I think that confirms my point that the name of the function is not clear. :-) Function naming so that the intent is clear is a major part of the "Self Documenting Code" philosophy. \$\endgroup\$Loki Astari– Loki Astari2019年05月01日 16:52:24 +00:00Commented May 1, 2019 at 16:52
-
\$\begingroup\$ instead of doing top and then pop, used lambda as pop param.
void pop(func<void(T& val)> consmr);
\$\endgroup\$KRoy– KRoy2024年01月20日 06:14:42 +00:00Commented Jan 20, 2024 at 6:14
This is sort of an extended comment on Martin York's reply.
When you're doing any sort of parallel processing, I advise against re-designing pop
so it requires two operations to actually remove an item from the queue, like:
T val = queue.top();
queue.pop();
With sufficient care, this can work for a single-producer/single-consumer situation, but has the potential to lead to problems even in that limited case.
The minute you get multiple consumers involved, it's completely and irrevocably broken.
My personal advice is that even if you're only currently planning for a single consumer, it's better to design the interface so you could support multiple consumers anyway. Then when/if you use more than one consumer, you don't have to rewrite all the existing code to do it.
After several iterations through it, an interface I've found to work well is:
bool pop(T &);
The typical implementation is something like this:
bool pop(T &dest) {
try {
dest = data.top();
data.pop();
return true;
}
catch(...) {
return false;
}
}
With this design, we retain exception safety: if dest = data.top();
throws an exception, then we just return false, and the content of the queue remains exactly as it was before pop
was called. If it succeeds, we call data.pop();
.
It's also (with appropriate use of a mutex or similar) safe for multiple consumers to use in parallel. In particular, we would normally plan on executing the dest = data.top(); data.pop();
as an atomic operation, so if we get an item from the queue, we know we'll remove that item. We can't get two threads in parallel that read the same top of queue item, then each remove an item from the queue (one of which hasn't been read and can never be processed).
Depending on the situation, it's often useful to add a timeout to reading, so if you attempt to read but there's nothing in the queue, you don't get stuck there forever.
In addition to what has already been said I would add that
what happens if
s
is zero? You might want to throw an exception in the constructor if that happens probably because any call tocount
ofroom
would fail otherwise.you can definitely improve
const
correctness at least in a couple of places:
in
explicit PCQueue(size_t s) :
you can make s
const
.
explicit PCQueue(const size_t s) :
You can do the same thing in push
for the newTail
variable.
Explore related questions
See similar questions with these tags.
void pop(func<void(T& val)> consmr);
also when creating internally keep thesize+1
. \$\endgroup\$