
Disclaimer:

I am not a fan of the then concept, specifically as a mutating non-static member function. As of now, it has only been introduced as part of the experimental namespace in the C++ Standard Library.


Issue 1: Lack of Implicit Executors in C++

C++ does not have implicit executors, which leads to unclear control flow. When using std::promise and std::future, it is not explicitly defined which thread will execute the continuation. For example:

  • Will it execute in the thread calling std::promise::set_value?
  • Will it execute in the thread calling std::future::get?
  • Or will it launch a completely new thread, impacting performance in unpredictable ways?

This uncertainty often forces the user to schedule an operation explicitly, such as using a custom thread pool or calling another std::async within the lambda passed to then. While the then concept has some legitimate use cases (e.g., using the future returned by then as a synchronization barrier), in most other scenarios, I suggest using ASIO instead.
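For illustration, here is a minimal sketch of that explicit-scheduling workaround (produce and consume are placeholder names of mine, not from any library): instead of relying on an unspecified then, the continuation is invoked inside the task handed to std::async, so the executing thread is known:

#include <future>
#include <iostream>

int produce() { return 42; }          // placeholder async work
int consume(int v) { return v * 2; }  // placeholder continuation

int main() {
    auto f = std::async(std::launch::async, [] {
        int v = produce();   // runs on the std::async worker thread
        return consume(v);   // "continuation" runs on the same, known thread
    });
    std::cout << f.get() << '\n';  // 84
}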


Issue 2: Mutating Calls to Non-Static Member Functions

I strongly dislike interfaces with mutating member functions that render objects unusable after the call. For instance, consider std::future::get:

  • You can only call get once.
  • Calling it again is undefined behavior (in practice, most implementations throw std::future_error). This is entirely avoidable by writing code that ensures get is only called once.

With explicit std::move, static analyzers can detect and warn about "use after move" scenarios. However, with std::future::get, the state is invalidated without any general way to detect it.
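As a small illustration (a toy example of my own, not from the interface proposed below): nothing in the type system prevents the second get() here, and a static analyzer has no general way to flag it. An rvalue-reference free function turns the same mistake into an explicit std::move, which checks such as clang-tidy's bugprone-use-after-move can catch:

#include <future>
#include <iostream>

int main() {
    std::future<int> f = std::async(std::launch::async, [] { return 1; });
    std::cout << f.get() << '\n';    // fine: first (and only valid) call
    std::cout << f.valid() << '\n';  // 0: the shared state is gone, yet f is still syntactically usable
    // f.get();                      // compiles, but is invalid at run time
    // With a free function taking an rvalue reference, the misuse would read
    //   get(std::move(f)); get(std::move(f));
    // and the second call is a textbook "use after move" for static analyzers.
}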

The same issue applies to then. Once then is called, there are two possible approaches:

  1. Prevent further use of the object after then is called.
  2. Allow further use, but this introduces a range of new complications.

I won't elaborate on the second approach—it’s self-evidently problematic.


Proposed Adapted Interface

To address these issues, I designed an adapted interface that enforces safer semantics. The main principle is to make it impossible to use an object after its state has been invalidated, using "use after move" detection as a safeguard.

Key Features:

  1. get, set_value, and then are free functions that only accept rvalue-references.

    • This makes it impossible to call these functions on lvalues without an explicit std::move.
    • Example:
      auto f = async_func();
      // then(f, op_b); // Compilation error: `f` must be moved
      then(std::move(f), op_b); // OK
      // get(f); // Compilation error
      get(std::move(f)); // Triggers "use after move" warning
      
    • This also allows easy usage with temporaries:
      then(async_func(), op_a);
      
    • Furthermore, then as a variadic function (not yet implemented) enables optimizations such as allocating all continuations in a single place:
      then(async_func(), op_a, op_b, op_c);
      
  2. Less "unspecified" threading behavior:
    Continuations can only execute in two possible threads:

    • The thread where set_value(std::move(promise), v) is called.
    • The thread where get(std::move(future)) is called (blocking), provided the continuation was scheduled after set_value. (A short usage sketch of both cases follows this list.)
  3. Relying on existing std features:
    The implementation builds on std::future and other standard library facilities as much as makes sense, without reinventing existing functionality.
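To make item 2 concrete, here is a small usage sketch (a fragment that relies on the make_future_promise/then/set_value/get interface defined in the implementation below; values are placeholders):

// Single-threaded sketch of the two places a continuation can run.
auto [f, p] = make_future_promise<int>();

// Case A: the continuation is registered first...
auto g = then(std::move(f), [](std::future<int> v) { return v.get() + 1; });
// ...so this set_value() call runs it, i.e. on the thread calling set_value:
set_value(std::move(p), 41);

// Case B: had set_value() completed before then() was called, the continuation
// would instead have been deferred and would run inside this blocking get():
int result = get(std::move(g));   // 42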


Implementation

Here's a naive implementation of the proposed get, set_value, and then functions (live on Compiler Explorer: https://godbolt.org/z/PrbGa8vYP):

#include <future>
#include <memory>
#include <mutex>
#include <optional>
#include <utility>
#include <variant>

namespace impl {

// Minimal move-only type-erased callable (std::function requires copyability;
// std::move_only_function is C++23).
template <typename>
class move_only_func;

template <typename Ret, typename ... Args>
class move_only_func<Ret(Args...)> {
public:
    move_only_func() = delete;
    move_only_func(const move_only_func&) = delete;
    move_only_func(move_only_func&&) noexcept = default;

    template <typename Callable>
    move_only_func(Callable &&c)
        : move_only_func(std::forward<Callable>(c), std::is_convertible<decltype(c), Ret(*)(Args...)>{})
    {}

    // todo: customizable allocator
    Ret operator()(Args... args) const {
        return std::visit([&](const auto& s) { return _invoke(s, std::forward<Args>(args)...); }, _impl);
    }

private:
    // Function pointers (and capture-less lambdas) are stored without allocation.
    move_only_func(Ret(*f)(Args...), std::true_type /*is_convertible*/) noexcept
        : _impl(_stateless(f))
    {}

    // Stateful callables are heap-allocated behind a type-erased deleter.
    template <typename Callable>
    explicit move_only_func(Callable &&c, std::false_type /*is_convertible*/)
        : _impl(_stateful{
              [](void *callable, Args... args){ return (*static_cast<Callable*>(callable))(std::forward<Args>(args)...); },
              {new Callable(std::forward<Callable>(c)), [](void *callable){ delete static_cast<Callable*>(callable); }},
          })
    {}

    struct _stateful {
        Ret(*_f)(void*, Args...);
        std::unique_ptr<void, void(*)(void*)> _state;
    };
    struct _stateless {
        Ret(*_f)(Args...);
    };

    static Ret _invoke(const _stateful &s, Args... args) { return s._f(s._state.get(), std::forward<Args>(args)...); }
    static Ret _invoke(const _stateless &s, Args... args) { return s._f(std::forward<Args>(args)...); }

    std::variant<_stateful, _stateless> _impl;
};

template <typename>
struct is_future : std::false_type {};
template <typename T>
struct is_future<std::future<T>> : std::true_type {};

// Shared between promise and future: holds the (optional) continuation and the
// mutex that synchronizes then() against set_value().
struct promise_context : std::enable_shared_from_this<promise_context> {
    std::optional<move_only_func<void()>> _call;
    std::mutex _mutex;
    static std::shared_ptr<promise_context> create() { return std::shared_ptr<promise_context>(new promise_context{}); }
private:
    promise_context() = default;
};

// The promise side is already done; no continuation can be registered there anymore.
template <typename T>
struct fulfilled {
    std::future<T> _future;
};

// The promise may still be pending; keep a weak reference to its context so a
// continuation can be registered before set_value() runs.
template <typename T>
struct unfulfilled {
    std::future<T> _future;
    std::weak_ptr<promise_context> _promise_context;
};

// Fulfilled case: defer the continuation; it runs on the thread that ends up calling get().
// todo: support variadic Next
template <typename T, typename Next>
auto then(fulfilled<T> &&state, Next &&next) noexcept
    -> fulfilled<std::invoke_result_t<Next, std::future<T>>> {
    return {
        ._future = std::async(std::launch::deferred,
            [f = std::move(state._future), n = std::forward<Next>(next)]() mutable {
                f.wait();
                auto result = std::forward<Next>(n)(std::move(f));
                if constexpr (is_future<decltype(result)>())
                    return result.get();
                else
                    return result;
            }),
    };
}

// The promise context has expired (set_value() already completed): fall back to the fulfilled path.
template <typename T, typename Next>
auto then(unfulfilled<T> &&state, std::false_type /*shared_state_alive*/, std::false_type, Next &&next) noexcept
    -> fulfilled<std::invoke_result_t<Next, std::future<T>>> {
    return then(fulfilled<T>{._future = std::move(state._future)}, std::forward<Next>(next));
}

// The lock could not be taken (set_value() is running right now): fall back to the fulfilled path.
template <typename T, typename Next>
auto then(unfulfilled<T> &&state, promise_context &, std::false_type, Next &&next) noexcept
    -> fulfilled<std::invoke_result_t<Next, std::future<T>>> {
    return then(fulfilled<T>{._future = std::move(state._future)}, std::forward<Next>(next));
}

// Lock held and promise still pending: register the continuation so that
// set_value() will invoke it on the setting thread.
// todo: support variadic Next
template <typename T, typename Mutex, typename Next>
auto then(unfulfilled<T> &&state, promise_context &ctx, std::lock_guard<Mutex> lg, Next &&next) noexcept
    -> unfulfilled<std::invoke_result_t<Next, std::future<T>>> {
    std::promise<std::invoke_result_t<Next, std::future<T>>> next_promise;
    auto next_future = next_promise.get_future();
    ctx._call.emplace(
        [current_future = std::move(state._future),
         next_promise = std::move(next_promise),
         next_call = std::forward<Next>(next)]() mutable {
            current_future.wait();
            next_promise.set_value(std::forward<Next>(next_call)(std::move(current_future)));
        });
    return {
        ._future = std::move(next_future),
        ._promise_context = ctx.weak_from_this(),
    };
}

// Dispatch on the current state and pick one of the overloads above.
template <typename T, typename Next>
auto then(std::variant<unfulfilled<T>, fulfilled<T>> &&state, Next &&next) noexcept {
    using next_state_t = std::variant<
        unfulfilled<std::invoke_result_t<Next, std::future<T>>>,
        fulfilled<std::invoke_result_t<Next, std::future<T>>>
    >;
    return std::visit(
        [&next](auto &&state) -> next_state_t {
            using cur_state_t = std::decay_t<decltype(state)>;
            if constexpr (std::is_same_v<cur_state_t, fulfilled<T>>)
                return then(std::move(state), std::forward<Next>(next));
            else {
                static_assert(std::is_same_v<cur_state_t, unfulfilled<T>>);
                auto ctx = state._promise_context.lock();
                if (!ctx)
                    return then(std::move(state), std::false_type{}, std::false_type{}, std::forward<Next>(next));
                if (!ctx->_mutex.try_lock())
                    return then(std::move(state), *ctx, std::false_type{}, std::forward<Next>(next));
                return then(std::move(state), *ctx, std::lock_guard{ctx->_mutex, std::adopt_lock}, std::forward<Next>(next));
            }
        }, std::move(state));
}

} // namespace impl

template <typename T>
struct promise {
    std::promise<T> _promise;
    std::shared_ptr<impl::promise_context> _context = impl::promise_context::create();
    // todo: private constructor & friend
};

template <typename T>
struct future {
    std::variant<
        impl::unfulfilled<T>,
        impl::fulfilled<T>
    > _state;
    // todo: private constructor & friend
};

template <typename T>
std::pair<future<T>, promise<T>> make_future_promise() noexcept {
    promise<T> p{};
    future<T> f{
        ._state = impl::unfulfilled<T>{
            ._future = p._promise.get_future(),
            ._promise_context = p._context,
        },
    };
    return {std::move(f), std::move(p)};
}

template <typename T, typename Next>
auto then(future<T> &&f, Next &&next) noexcept
    -> future<std::invoke_result_t<Next, std::future<T>>> {
    return {impl::then(std::move(f._state), std::forward<Next>(next))};
}

template <typename T, typename U>
void set_value(promise<T> &&p, U &&value) {
    auto ctx = std::move(p._context);
    auto promise = std::move(p._promise);
    promise.set_value(std::forward<U>(value));
    // Run the continuation, if one was registered, on this thread.
    std::lock_guard lg{ctx->_mutex};
    if (ctx->_call)
        (*ctx->_call)();
}

template <typename T>
T get(future<T> &&f) {
    // Blocks; a deferred continuation (fulfilled path) runs here.
    return std::visit([](auto &&state){ return state._future.get(); }, std::move(f._state));
}

Example + Testing

Here’s an example and a test case for the implementation:

// Headers used by the examples, in addition to those included above:
#include <chrono>
#include <iostream>
#include <string>
#include <thread>

std::size_t this_thread_id() noexcept {
    return std::hash<std::thread::id>{}(std::this_thread::get_id());
}

template <typename T>
void print_get(future<T> &&future) {
    std::cout << this_thread_id() << ": before get\n";
    std::cout << this_thread_id() << ": " << get(std::move(future)) << '\n';
    std::cout << this_thread_id() << ": after get\n";
}

int main() {
    // testing
    {
        std::cout << "basic case:\n";
        auto [f, p] = make_future_promise<int>();
        set_value(std::move(p), 4);
        print_get(std::move(f));
    }
    {
        std::cout << "\nbasic case parallel:\n";
        auto [f, p] = make_future_promise<int>();
        std::thread{[p = std::move(p)]() mutable {
            std::cout << this_thread_id() << ": setting promise...\n";
            set_value(std::move(p), 8);
        }}.detach();
        print_get(std::move(f));
    }
    {
        std::cout << "\nbasic then:\n";
        auto [f, p] = make_future_promise<int>();
        set_value(std::move(p), 815);
        print_get(then(std::move(f),
            [](std::future<int> i) {
                std::cout << this_thread_id() << ": next callback...\n";
                return "oceanic " + std::to_string(i.get());
            }));
    }
    {
        std::cout << "\nbasic then parallel:\n";
        auto [f, p] = make_future_promise<int>();
        std::thread{[p = std::move(p)]() mutable {
            std::cout << this_thread_id() << ": setting promise...\n";
            set_value(std::move(p), 8);
        }}.detach();
        print_get(then(std::move(f),
            [](std::future<int> i) {
                std::cout << this_thread_id() << ": next callback...\n";
                return "oceanic " + std::to_string(i.get());
            }));
    }
    {
        std::cout << "\nparallel, second then called when first is executed:\n";
        auto [f, p] = make_future_promise<int>();
        std::promise<void> promise_first_barrier;
        auto first_barrier = promise_first_barrier.get_future();
        auto first_then_future =
            then(std::move(f),
                [p = std::move(promise_first_barrier)](std::future<int> i) mutable {
                    std::cout << this_thread_id() << ": first then invoked\n";
                    p.set_value();
                    std::this_thread::sleep_for(std::chrono::milliseconds(200));
                    return i.get() + 4;
                });
        std::thread{[p = std::move(p)]() mutable {
            std::cout << this_thread_id() << ": setting promise...\n";
            set_value(std::move(p), 811);
        }}.detach();
        first_barrier.wait();
        print_get(then(std::move(first_then_future),
            [](std::future<int> i) {
                std::cout << this_thread_id() << ": second then invoked...\n";
                return "oceanic " + std::to_string(i.get());
            }));
    }
    return 0;
}

asked Dec 23, 2024 at 14:53
  • Just FYI, C++26 pretty much already solves all of these problems, and more. std::execution::then() is basically your then(), std::execution::sync_wait() is basically your get(), and std::execution::set_value() (no docs yet) is basically your set_value(). The difference is that they work on senders rather than just std::future, which makes them more flexible and more efficient. (Also a nicer interface, thanks to piping.) Commented Dec 23, 2024 at 17:32
  • (Also, C++23 already has std::move_only_function.) Commented Dec 23, 2024 at 17:32
  • @indi Nice to know, although this question is marked for C++20, and the fact is that many companies can't afford to migrate even to C++17... This code could be adapted for C++14, but in my case C++20 is available. I personally don't see widespread adoption of C++26 happening in the next 5-7 years, unless starting a completely fresh project. Otherwise, nice touch. Commented Dec 23, 2024 at 17:39
  • Yes, there will be more cost to change if you've already committed to a non-standard interface, but even then the costs are dwarfed by enormous payoffs. 1) You reduce the cost of moving to a more modern standard, if the chance ever arises. 2) You would be coding to a better interface, vetted by experts (the Concurrency TS with future::then() was not standardized for good reasons). 3) You reap the benefit of back-ported libraries (like the NVIDIA reference library mentioned above). You can pay the cost now, or pay a hell of a lot more later. Commented Dec 24, 2024 at 23:56
  • Look, you're free to write your library however you want, but to put it bluntly, ignoring the current standard APIs is foolish. Even if you can't directly use the latest standard now, the future will come. If you code to the standard, your library will age well, be interoperable with other modern libraries, and make for an easy transition if you ever do need to move away from it. If you don't, you're just creating headaches for your library's users five years down the line when they're locked in to your bespoke API. But, like I say: it's your library, do as you please. Commented Dec 24, 2024 at 23:56
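For reference, a minimal sketch of the C++26 std::execution style mentioned in the first comment (compiler support is still scarce; the NVIDIA stdexec reference implementation offers essentially the same API under its own namespace):

#include <execution>  // C++26 senders/receivers (P2300)
#include <iostream>

int main() {
    namespace ex = std::execution;
    // Build a sender pipeline; then() here is the standard adaptor, not the one from the question.
    auto work = ex::just(41) | ex::then([](int v) { return v + 1; });
    // sync_wait() blocks and yields an optional<tuple<...>> holding the result.
    auto [result] = std::this_thread::sync_wait(std::move(work)).value();
    std::cout << result << '\n';  // 42
}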

1 Answer


Lack of implicit executors

C++ does not have implicit executors

Well, you could argue it has implicit executors, but that they are woefully underspecified.

When futures and promises use real OS threads under the hood, I see little point in .then() anyway; instead of writing:

auto result = std::async(foo).then(bar);

One could just as well write:

auto result = std::async([]{return bar(foo());});

Slightly more verbose for simple cases, but actually much more flexible and probably more efficient as well. Although there are situations where it might be useful (see below).

[...] in most other scenarios, I suggest using ASIO instead.

That's a different style of programming which might also not be right for every situation.

Mutating calls to non-static member functions

I strongly dislike interfaces with mutating member functions that render objects unusable after the call.

Indeed, looking at the interface of std::experimental::future<T>::then(), it sucks. However, you don't need a non-member function to enforce that futures are explicitly moved from; you can do this with a member function as well: just qualify it as working on rvalues. Here is the fixed version:

template<class F>
future<...> then(F&& func) &&;

Now you have to write something like:

auto fut1 = std::async(foo);
auto fut2 = std::move(fut1).then(bar);

If you omit the std::move(), it will result in a compile error. The member function form allows for more readable chaining of functions; compare:

auto result = std::async(foo).then(bar).then(baz).then(quux);
// vs:
auto result = then(then(then(std::async(foo), bar), baz), quux);

Less unspecified but also less useful?

Continuations can only execute in two possible threads:

  • The thread where set_value(std::move(promise), v) is called.
  • The thread where get(std::move(future)) is called (blocking), provided the continuation was scheduled after set_value.

While that does make sense in a way and is more specified, consider that in an actual program it still might not be predictable which thread the continuation runs on, and in fact having it run on the thread calling get() is often the thing you don't want to happen. Consider this caller:

auto fut1 = do_something_async();
bool needs_more_work = do_some_other_work();
if (needs_more_work) {
    fut1 = then(std::move(fut1), do_more_work_async);
}
do_even_more_work();
auto result = get(std::move(fut1));

Ideally, do_more_work_async() never runs on this caller's thread. However, with your implementation it can happen that it gets scheduled on the caller's thread after do_even_more_work() finishes, when it could have run in parallel.

I am guessing this is why std::experimental::future<T>::then() doesn't specify what thread is used; it could start a new one or it could run on the old one depending on when it was called, but the old thread could even be kept around if it finished early, so .then() can reuse it.

Neither your then() nor std::experimental's one takes a std::launch option or something similar to indicate how to schedule the work. Maybe that would be an idea to add?
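For example, a hypothetical free-standing variant over plain std::future (then_with_policy is a made-up name, not the author's interface) could forward a std::launch policy to std::async, so the caller decides whether the continuation gets its own thread or runs deferred on the thread that calls get():

#include <future>
#include <type_traits>
#include <utility>

template <typename T, typename Next>
auto then_with_policy(std::launch policy, std::future<T> fut, Next next)
    -> std::future<std::invoke_result_t<Next, std::future<T>>> {
    // std::async accepts move-only callables, so capturing the future by move is fine.
    return std::async(policy,
        [fut = std::move(fut), next = std::move(next)]() mutable {
            fut.wait();                  // block on the chosen executor, not the caller
            return next(std::move(fut)); // hand the ready future to the continuation
        });
}

// usage sketch:
// auto fut2 = then_with_policy(std::launch::async, std::move(fut1), do_more_work_async);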

answered Dec 30, 2024 at 19:14
  • std::move(smth).foo() doesn't look very good and is often not recognized as common practice by many C++ developers. I am aware of it, but I like free functions more. Chaining via several calls to .then is considerably harder to optimize than a variadic call to then(std::move(f), op1, op2, op3) (not then(then(then..., as you put it). Commented Dec 30, 2024 at 21:10
  • Regarding the unspecified thread: one of the points is a very minimal and non-intrusive implementation using only std:: features. When fut1 is completed without a scheduled continuation, the only way to launch it in parallel is to run std::async, which (I highly suspect) is very inefficient. Otherwise I'd need to use a thread pool of my own, because std doesn't expose one. Commented Dec 30, 2024 at 21:20
