I have two extremely simple toy implementations of an event loop, and would like to understand the performance differences between them.
First impl - events with a virtual 'handle' method - dynamic dispatch:
#include <memory>
#include <vector>
class Event {
public:
virtual bool handle() = 0;
};
class Event1 : public Event {
public:
bool handle() override {
return ++n < 1000000000;
}
private:
int n = 0;
};
class Event2 : public Event {
public:
bool handle() override {
--n;
return true;
}
private:
int n = 0;
};
class EventLoop {
public:
void addEvent(Event& event) {
events.push_back(&event);
}
void run() {
while (true) {
for (auto event : events) {
if (!event->handle()) {
return;
}
}
}
}
private:
std::vector<Event*> events;
};
int main() {
EventLoop eventLoop;
Event1 event1;
Event2 event2;
eventLoop.addEvent(event1);
eventLoop.addEvent(event2);
eventLoop.run();
return 0;
}
Second impl - wrap events in a lambda - static dispatch:
#include <vector>
#include <functional>
class Event1 {
public:
bool handle() {
return ++n < 1000000000;
}
private:
int n = 0;
};
class Event2 {
public:
bool handle() {
--n;
return true;
}
private:
int n = 0;
};
class EventLoop {
public:
template<typename T>
void addEvent(T& event) {
events.push_back([&event]() { return event.handle(); });
}
void run() {
while (true) {
for (auto& event : events) {
if (!event()) {
return;
}
}
}
}
private:
std::vector<std::function<bool()>> events;
};
int main() {
EventLoop eventLoop;
Event1 event1;
Event2 event2;
eventLoop.addEvent(event1);
eventLoop.addEvent(event2);
eventLoop.run();
return 0;
}
My naive expectation was that the version using lambdas would be faster (ie 'dynamic dispatch is slow'), but compiling with g++, it was twice as fast to run - 17s compared to 35s on my machine, according to time
.
I'd like to understand whether this is likely because with my toy event methods, the compiler was able to optimise the former better (and so in a more realistic scenario with more complex event handlers this would indeed typically be slower), or if there's some overhead associated with the lambdas that I don't understand that will likely always make this the case.
1 Answer 1
Enable compiler optimizations
The runtimes you posted suggest you compiled your code without optimizations enabled. That makes them very useless. Indeed as Rish commented, if you enable them, the runtimes will be approximately the same (and also much smaller).
Toy code won't give realistic results
The risk with such simple code though is that the compiler might be able to see that your code doesn't do anything at all, and could legally replace it with:
int main() {
return 0;
}
So for benchmarking you really should ideally use something more realistic. And if in that realisitc case every event would take a long time to process, the performance of the event loop doesn't matter at all. In that case, you should consider optimizing for code simplicity instead of performance.
Lambdas will not make things faster in this case
You heard that dynamic dispatch is slow. The reason is that in the first case, events
stores pointers to objects that have a virtual method table. When calling a virtual member function, it first has to load the pointer to the member function from the concrete class from that table. This double indirection is what is "slow", as it can prevent the CPU from doing prefetching and branch prediction. In contrast, std::function<>
will hold a pointer directly to the actual function you want to call, so it doesn't suffer from this issue.
However, contemporary CPUs have gotten much better at this, and furthermore, since you only have two entries in events
and you never change it while the event loop is running, everything will be in the L1 cache and the branch predictor can easily predict what is going to happen.
What to choose for event loops
Pointers to base classes sometimes are better than using std::function<>
, however for the specific case of an event loop, you are usually only interested in calling one function. So using std::function<>
is then probably a better choice, as it avoids the double indirection when calling the function, and it avoids needing objects to be derived from a particular base class, so it's a bit more flexible.
That said, if you want to be able to remove events from the event loop, this is going to be tricky if you are just storing std::function<>
s, whereas with pointers to objects it is trivial to do.
Explore related questions
See similar questions with these tags.
-O3
, I find the run time more or less the same for both (which is kinda expected). \$\endgroup\$