Disclaimer:
I am not a fan of the then concept, specifically as a mutating non-static member function. As of now, it has only been introduced as part of the experimental namespace in the C++ Standard Library.
Issue 1: Lack of Implicit Executors in C++
C++ does not have implicit executors, which leads to unclear control flow. When using std::promise and std::future, it is not explicitly defined which thread will execute the continuation. For example:
- Will it execute in the thread calling std::promise::set_value?
- Will it execute in the thread calling std::future::get?
- Or will it launch a completely new thread, impacting performance in unpredictable ways?
This uncertainty often forces the user to schedule an operation explicitly, such as using a custom thread pool or calling another std::async within the lambda passed to then. While the then concept has some legitimate use cases (e.g., using the future returned by then as a synchronization barrier), in most other scenarios, I suggest using ASIO instead.
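To make the "which thread?" question concrete, here is a minimal sketch (not part of the reviewed code) that uses only std::async. The helper name executing_thread is made up for illustration. It shows that the executing thread is only pinned down once a launch policy is given explicitly; with the default policy the implementation is free to choose.

```cpp
#include <future>
#include <thread>

// Hypothetical helper: runs a trivial task under the given launch policy
// and reports the id of the thread that actually executed it.
inline std::thread::id executing_thread(std::launch policy) {
    auto f = std::async(policy, [] { return std::this_thread::get_id(); });
    return f.get();  // for std::launch::deferred the task runs right here
}
```

With std::launch::deferred the task runs in the thread that calls get(); with std::launch::async it runs on a different thread.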
Issue 2: Mutating Calls to Non-Static Member Functions
I strongly dislike interfaces with mutating member functions that render objects unusable after the call. For instance, consider std::future::get:
- You can only call get once.
- Calling it again is undefined behavior (implementations are encouraged to throw std::future_error). This is entirely avoidable by writing code that ensures get is only called once.
With explicit std::move, static analyzers can detect and warn about "use after move" scenarios. However, with std::future::get, the state is invalidated without any general way to detect it.
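As a small illustration of that silent invalidation, the sketch below (the names validity and validity_around_get are invented for this example) records std::future::valid() before and after a single call to get(). Nothing in the type of f changes, so only a runtime check can tell the two states apart.

```cpp
#include <future>

// Hypothetical result type: validity of the future before and after get().
struct validity {
    bool before;
    bool after;
};

inline validity validity_around_get() {
    std::promise<int> p;
    std::future<int> f = p.get_future();
    p.set_value(42);
    const bool before = f.valid();  // true: shared state is attached
    (void)f.get();                  // consumes the shared state
    const bool after = f.valid();   // false: f looks the same but is now unusable
    return {before, after};
}
```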
The same issue applies to then. Once then is called, there are two possible approaches:
- Prevent further use of the object after then is called.
- Allow further use, but this introduces a range of new complications.
I won't elaborate on the second approach; it’s self-evidently problematic.
Proposed Adapted Interface
To address these issues, I designed an adapted interface that enforces safer semantics. The main principle is to make it impossible to use an object after its state has been invalidated, using "use after move" detection as a safeguard.
Key Features:
- get, set_value, and then are free functions that only accept rvalue-references.
  - This makes it impossible to call these functions on lvalues without an explicit std::move.
  - Example:
    auto f = async_func();
    // then(f, op_b);          // Compilation error: `f` must be moved
    then(std::move(f), op_b);  // OK
    // get(f);                 // Compilation error
    get(std::move(f));         // Triggers "use after move" warning
  - This also allows easy usage with temporaries:
    then(async_func(), op_a);
  - Furthermore, then as a variadic function (not yet implemented) enables optimizations such as allocating all continuations in a single place:
    then(async_func(), op_a, op_b, op_c);
- Less "unspecified" threading behavior: Continuations can only execute in two possible threads:
  - The thread where set_value(std::move(promise), v) is called.
  - The thread where get(std::move(future)) is called (blocking), provided the continuation was scheduled after set_value.
- Relying on existing std features: The implementation relies on std::future and other standard library functions, utilizing the standard library as much as makes sense, without reinventing existing functionality.
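The rvalue-only property of the free functions can also be checked at compile time. The following is a minimal sketch rather than the actual interface: take is a hypothetical stand-in for the proposed get, operating directly on std::future.

```cpp
#include <future>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for the proposed get(): accepts rvalues only.
// std::future<T>&& is a plain rvalue reference here, not a forwarding reference.
template <typename T>
T take(std::future<T>&& f) { return f.get(); }

using take_int_t = decltype(&take<int>);
static_assert(std::is_invocable_v<take_int_t, std::future<int>&&>,
              "moved-from lvalues and temporaries are accepted");
static_assert(!std::is_invocable_v<take_int_t, std::future<int>&>,
              "plain lvalues are rejected");

inline int take_example() {
    std::promise<int> p;
    auto f = p.get_future();
    p.set_value(7);
    return take(std::move(f));  // take(f) would not compile
}
```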
Implementation
https://godbolt.org/z/PrbGa8vYP
Here’s a naive implementation of the proposed get, set_value, and then functions:
#include <future>
#include <memory>
#include <mutex>
#include <optional>
#include <type_traits>
#include <utility>
#include <variant>
namespace impl {
template <typename>
class move_only_func;
template <typename Ret, typename ... Args>
class move_only_func<Ret(Args...)> {
public:
    move_only_func() = delete;
    move_only_func(const move_only_func&) = delete;
    move_only_func(move_only_func&&) noexcept = default;
    template <typename Callable>
    move_only_func(Callable &&c)
        : move_only_func(std::forward<Callable>(c), std::is_convertible<decltype(c), Ret(*)(Args...)>{})
    {}
    // todo: customizable allocator
    Ret operator()(Args... args) const {
        return std::visit([&](const auto& s) { return _invoke(s, std::forward<Args>(args)...); }, _impl);
    }
private:
    move_only_func(Ret(*f)(Args...), std::true_type /*is_convertible*/) noexcept
        : _impl(_stateless(f))
    {}
    template <typename Callable>
    explicit move_only_func(Callable &&c, std::false_type /*is_convertible*/)
        : _impl(_stateful{
            [](void *callable, Args... args){ return (*static_cast<Callable*>(callable))(std::forward<Args>(args)...); },
            {new Callable(std::forward<Callable>(c)), [](void *callable){ delete static_cast<Callable*>(callable); }},
        })
    {}
    struct _stateful {
        Ret(*_f)(void*, Args...);
        std::unique_ptr<void, void(*)(void*)> _state;
    };
    struct _stateless {
        Ret(*_f)(Args...);
    };
    static Ret _invoke(const _stateful &s, Args... args) { return s._f(s._state.get(), std::forward<Args>(args)...); }
    static Ret _invoke(const _stateless &s, Args... args) { return s._f(std::forward<Args>(args)...); }
    std::variant<_stateful, _stateless> _impl;
};
template <typename>
struct is_future : std::false_type {};
template <typename T>
struct is_future<std::future<T>> : std::true_type {};
struct promise_context : std::enable_shared_from_this<promise_context> {
    std::optional<move_only_func<void()>> _call;
    std::mutex _mutex;
    static std::shared_ptr<promise_context> create() { return std::shared_ptr<promise_context>(new promise_context{}); }
private:
    promise_context() = default;
};
template <typename T>
struct fulfilled {
    std::future<T> _future;
};
template <typename T>
struct unfulfilled {
    std::future<T> _future;
    std::weak_ptr<promise_context> _promise_context;
};
// todo: support variadic Next
template <typename T, typename Next>
auto then(fulfilled<T> &&state, Next &&next) noexcept
    -> fulfilled<std::invoke_result_t<Next, std::future<T>>> {
    return {
        ._future = std::async(std::launch::deferred,
            [f = std::move(state._future), n = std::forward<Next>(next)]() mutable {
                f.wait();
                auto result = std::forward<Next>(n)(std::move(f));
                if constexpr (is_future<decltype(result)>())
                    return result.get();
                else
                    return result;
            }),
    };
}
template <typename T, typename Next>
auto then(unfulfilled<T> &&state, std::false_type /*shared_state_alive*/, std::false_type, Next &&next) noexcept
    -> fulfilled<std::invoke_result_t<Next, std::future<T>>> {
    return then(fulfilled<T>{._future = std::move(state._future)}, std::forward<Next>(next));
}
template <typename T, typename Next>
auto then(unfulfilled<T> &&state, promise_context &, std::false_type, Next &&next) noexcept
    -> fulfilled<std::invoke_result_t<Next, std::future<T>>> {
    return then(fulfilled<T>{._future = std::move(state._future)}, std::forward<Next>(next));
}
// todo: support variadic Next
template <typename T, typename Mutex, typename Next>
auto then(unfulfilled<T> &&state, promise_context &ctx, std::lock_guard<Mutex> lg, Next &&next) noexcept
    -> unfulfilled<std::invoke_result_t<Next, std::future<T>>> {
    std::promise<std::invoke_result_t<Next, std::future<T>>> next_promise;
    auto next_future = next_promise.get_future();
    ctx._call.emplace(
        [current_future = std::move(state._future),
         next_promise = std::move(next_promise),
         next_call = std::forward<Next>(next)]() mutable {
            current_future.wait();
            next_promise.set_value(std::forward<Next>(next_call)(std::move(current_future)));
        });
    return {
        ._future = std::move(next_future),
        ._promise_context = ctx.weak_from_this(),
    };
}
template <typename T, typename Next>
auto then(std::variant<unfulfilled<T>, fulfilled<T>> &&state, Next &&next) noexcept {
    using next_state_t = std::variant<
        unfulfilled<std::invoke_result_t<Next, std::future<T>>>,
        fulfilled<std::invoke_result_t<Next, std::future<T>>>
    >;
    return std::visit(
        [&next](auto &&state) -> next_state_t {
            using cur_state_t = std::decay_t<decltype(state)>;
            if constexpr (std::is_same_v<cur_state_t, fulfilled<T>>)
                return then(std::move(state), std::forward<Next>(next));
            else {
                static_assert(std::is_same_v<cur_state_t, unfulfilled<T>>);
                auto ctx = state._promise_context.lock();
                if (!ctx)
                    return then(std::move(state), std::false_type{}, std::false_type{}, std::forward<Next>(next));
                if (!ctx->_mutex.try_lock())
                    return then(std::move(state), *ctx, std::false_type{}, std::forward<Next>(next));
                return then(std::move(state), *ctx, std::lock_guard{ctx->_mutex, std::adopt_lock}, std::forward<Next>(next));
            }
        }, std::move(state));
}
} // namespace impl
template <typename T>
struct promise {
    std::promise<T> _promise;
    std::shared_ptr<impl::promise_context> _context = impl::promise_context::create();
    // todo: private constructor & friend
};
template <typename T>
struct future {
    std::variant<
        impl::unfulfilled<T>,
        impl::fulfilled<T>
    > _state;
    // todo: private constructor & friend
};
template <typename T>
std::pair<future<T>, promise<T>> make_future_promise() noexcept {
    promise<T> p{};
    future<T> f{
        ._state = impl::unfulfilled<T>{
            ._future = p._promise.get_future(),
            ._promise_context = p._context,
        },
    };
    return {std::move(f), std::move(p)};
}
template <typename T, typename Next>
auto then(future<T> &&f, Next &&next) noexcept
    -> future<std::invoke_result_t<Next, std::future<T>>> {
    return {impl::then(std::move(f._state), std::forward<Next>(next))};
}
template <typename T, typename U>
void set_value(promise<T> &&p, U &&value) {
    auto ctx = std::move(p._context);
    auto promise = std::move(p._promise);
    promise.set_value(std::forward<U>(value));
    std::lock_guard lg{ctx->_mutex};
    if (ctx->_call)
        (*ctx->_call)();
}
template <typename T>
T get(future<T> &&f) {
    return std::visit([](auto &&state){ return state._future.get(); }, std::move(f._state));
}
Example + Testing
Here’s an example and a test case for the implementation:
#include <chrono>
#include <iostream>
#include <string>
#include <thread>
std::size_t this_thread_id() noexcept {
    return std::hash<std::thread::id>{}(std::this_thread::get_id());
}
template <typename T>
void print_get(future<T> &&future){
    std::cout << this_thread_id() << ": before get\n";
    std::cout << this_thread_id() << ": " << get(std::move(future)) << '\n';
    std::cout << this_thread_id() << ": after get\n";
}
int main() {
    // testing
    {
        std::cout << "basic case:\n";
        auto [f, p] = make_future_promise<int>();
        set_value(std::move(p), 4);
        print_get(std::move(f));
    }
    {
        std::cout << "\nbasic case parallel:\n";
        auto [f, p] = make_future_promise<int>();
        std::thread{[p = std::move(p)]() mutable {
            std::cout << this_thread_id() << ": setting promise...\n";
            set_value(std::move(p), 8);
        }}.detach();
        print_get(std::move(f));
    }
    {
        std::cout << "\nbasic then:\n";
        auto [f, p] = make_future_promise<int>();
        set_value(std::move(p), 815);
        print_get(then(std::move(f),
            [](std::future<int> i) {
                std::cout << this_thread_id() << ": next callback...\n";
                return "oceanic " + std::to_string(i.get());
            }));
    }
    {
        std::cout << "\nbasic then parallel:\n";
        auto [f, p] = make_future_promise<int>();
        std::thread{[p = std::move(p)]() mutable {
            std::cout << this_thread_id() << ": setting promise...\n";
            set_value(std::move(p), 8);
        }}.detach();
        print_get(then(std::move(f),
            [](std::future<int> i) {
                std::cout << this_thread_id() << ": next callback...\n";
                return "oceanic " + std::to_string(i.get());
            }));
    }
    {
        std::cout << "\nparallel, second then called when first is executed:\n";
        auto [f, p] = make_future_promise<int>();
        std::promise<void> promise_first_barrier;
        auto first_barrier = promise_first_barrier.get_future();
        auto first_then_future =
            then(std::move(f),
                [p = std::move(promise_first_barrier)](std::future<int> i) mutable {
                    std::cout << this_thread_id() << ": first then invoked\n";
                    p.set_value();
                    std::this_thread::sleep_for(std::chrono::milliseconds(200));
                    return i.get() + 4;
                });
        std::thread{[p = std::move(p)]() mutable {
            std::cout << this_thread_id() << ": setting promise...\n";
            set_value(std::move(p), 811);
        }}.detach();
        first_barrier.wait();
        print_get(then(std::move(first_then_future),
            [](std::future<int> i) {
                std::cout << this_thread_id() << ": second then invoked...\n";
                return "oceanic " + std::to_string(i.get());
            }));
    }
    return 0;
}
1 Answer
Lack of implicit executors
C++ does not have implicit executors
Well, you could argue it has implicit executors, but ones that are woefully underspecified.
When futures and promises use real OS threads under the hood, I see little point in .then() anyway. Instead of writing:
auto result = std::async(foo).then(bar);
One could just as well write:
auto result = std::async([]{return bar(foo());});
This is slightly more verbose for simple cases, but actually much more flexible and probably more efficient as well. There are situations where .then() might still be useful, though (see below).
[...] in most other scenarios, I suggest using ASIO instead.
That's a different style of programming which might also not be right for every situation.
Mutating calls to non-static member functions
I strongly dislike interfaces with mutating member functions that render objects unusable after the call.
Indeed, looking at the interface of std::experimental::future<T>::then(), it sucks. However, you don't need a non-member function to enforce that futures are explicitly moved from; you can do this with a member function as well: just qualify it as working on rvalues. Here is the fixed version:
template<class F>
future<...> then(F&& func) &&;
Now you have to write something like:
auto fut1 = std::async(foo);
auto fut2 = std::move(fut1).then(bar);
If you omit the std::move(), it will result in a compile error. The member function form allows for more readable chaining of functions; compare:
auto result = std::async(foo).then(bar).then(baz).then(quux);
// vs:
auto result = then(then(then(std::async(foo), bar), baz), quux);
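To show that the &&-qualified member form works, here is a minimal, self-contained sketch. The wrapper chained_future and the helper chaining_example are invented for this illustration; neither is a standard type nor the interface from the question. Continuations are simply deferred until get() is called, and void results are not handled.

```cpp
#include <future>
#include <type_traits>
#include <utility>

// Hypothetical wrapper around std::future with rvalue-qualified members.
template <typename T>
class chained_future {
public:
    explicit chained_future(std::future<T> f) : _f(std::move(f)) {}

    // Only callable on rvalues: `fut.then(g)` fails, `std::move(fut).then(g)` works.
    template <typename F>
    auto then(F&& func) && -> chained_future<std::invoke_result_t<std::decay_t<F>&, T>> {
        using R = std::invoke_result_t<std::decay_t<F>&, T>;
        // Assumption for this sketch: the continuation runs lazily in get().
        return chained_future<R>{std::async(
            std::launch::deferred,
            [f = std::move(_f), fn = std::forward<F>(func)]() mutable { return fn(f.get()); })};
    }

    T get() && { return _f.get(); }

private:
    std::future<T> _f;
};

inline int chaining_example() {
    auto start = chained_future<int>{std::async(std::launch::deferred, [] { return 1; })};
    // start.then(...) would not compile because then() is &&-qualified.
    return std::move(start)
        .then([](int x) { return x + 1; })
        .then([](int x) { return x * 10; })
        .get();
}
```

Each then() returns a temporary, so the chain reads left to right without any further std::move.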
Less unspecified but also less useful?
Continuations can only execute in two possible threads:
- The thread where set_value(std::move(promise), v) is called.
- The thread where get(std::move(future)) is called (blocking), provided the continuation was scheduled after set_value.
While in a way that does make sense and is more specified, consider that in an actual program, it still might not be predictable which thread the continuation will run on, and in fact having it run on the thread calling get() is often the thing you don't want to happen. Consider this caller:
auto fut1 = do_something_async();
bool needs_more_work = do_some_other_work();
if (needs_more_work) {
    fut1 = then(std::move(fut1), do_more_work_async);
}
do_even_more_work();
auto result = get(std::move(fut1));
Ideally, do_more_work_async() never runs on this caller's thread. However, with your implementation it can happen that it gets scheduled on the caller's thread after do_even_more_work() finishes, when it could have run in parallel.
I am guessing this is why std::experimental::future<T>::then() doesn't specify what thread is used; it could start a new one or it could run on the old one depending on when it was called, but the old thread could even be kept around if it finished early, so .then() can reuse it.
Neither your then() nor the std::experimental one takes a std::launch option or something similar to indicate how to schedule the work. Maybe that would be an idea to add?
- std::move(smth).foo() doesn't look very good and is often not recognized as common practice by many C++ developers. I am aware of it, but I like free functions more. Chaining via several calls to .then is considerably harder to optimize than a variadic call to then(std::move(f), op1, op2, op3) (not then(then(then... as you put it). – Sergey Kolesnik, Dec 30, 2024 at 21:10
- Regarding the unspecified thread: one of the points is a very minimal and non-intrusive implementation using only std:: features. When fut1 is completed without a scheduled optimization, the only way to launch it in parallel is to run std::async, which (I highly suspect) is very inefficient. Otherwise I'd need to use a thread pool of my own, because std doesn't expose one. – Sergey Kolesnik, Dec 30, 2024 at 21:20
- std::execution::then() is basically your then(), std::execution::sync_wait() is basically your get(), and std::execution::set_value() (no docs yet) is basically your set_value(). The difference is they work on senders rather than just std::future, which makes them more flexible, and more efficient. (Also a nicer interface, thanks to piping.)
- [...] move_only_func.)
- [...] future::then() was not standardized for good reasons). 3) You reap the benefit of back-ported libraries (like the NVIDIA reference library mentioned above). You can pay the cost now, or pay a hell of a lot more later.