Sanity check needed, as I'm still very unfamiliar with the nuances behind async/await and how they relate to more traditional Task / TPL code.
I have a high-level repository layer which is assembling a business object from various lower-level repositories. As a business decision, we are doing keyed lookups at the data layer and assembling the composite object in code. In the following code, "Account" is the business object, and the various _tAccountXXXRepo objects are all patterned similarly to this (we are using NPoco):
public class T_AccountMapperRepository : BaseNPocoRepository<T_AccountMapper>
, IRetrieveMany<T_AccountMapper>
, IRetrieve<AccountId, T_AccountMapper>
{
public T_AccountMapperRepository(IDatabase database) : base(database)
{
}
public async Task<T_AccountMapper> Retrieve(AccountId input)
{
return await Database.QueryAsync<T_AccountMapper>()
.First(x => x.AccountId == input)
.ConfigureAwait(false);
}
}
The code to fetch the different values can be logically executed in parallel, and I am wondering if this implementation is the correct pattern to do so: (the various methods being called on the Account object are thread-safe as well)
public async Task<Account> Retrieve(AccountId input1, DateTime input2)
{
var account = new Account();
await Task.WhenAll(
_tAccountMapperRepo.Retrieve(input1)
.ContinueWith(async result => account.SetAccountMap(await TranslateAccountMap(result))),
_tPaymentRedirectionRepo.Retrieve(input1, input2)
.ContinueWith(async result => account.ChangePayerToAccount(await TranslatePaymentRedirection(result))),
_tAccountAncestorRepo.Retrieve(input1, input2)
.ContinueWith(async result => await _tAccountMapperRepo.Retrieve((await result).AncestorId))
.ContinueWith(async result => account.MoveAccountToNewParentAccountMap(await TranslateAccountMap(await result))),
_tAccountRepo.Retrieve(input1)
.ContinueWith(async result => await _tAccountTypeRepo.Retrieve((await result).TypeId))
.ContinueWith(async result => account.SetAccountType(await TranslateAccountType(await result)))
);
return account;
}
Any method that is labelled with TranslateXXX all look similar to this:
private static async Task<AccountMap> TranslateAccountMap(Task<T_AccountMapper> mapTask)
{
if (!mapTask.IsCompletedSuccessfully)
throw new InvalidOperationException(nameof(mapTask), mapTask.Exception);
var map = await mapTask;
return new AccountMap(map.AccountId, map.Login, map.Namespace);
}
My main concerns are mixing Task and async/await, and whether or not my async & awaiting is re-introducing an element of synchronicity in what I'm hoping to make a very asynchronous process. My end goal is that as many various properties as possible are fetched in parallel and assembled
3 Answers 3
As for me it looks fine except one thing. You have already completed Task
in .ContinueWith
block, thus you may use .Result
then instead of await
and avoid redundant async
State Machine.
By the way, if you're using async/await
, probably there's no continuations needed.
Consider this continuations-free implementation of 1st Task.WhenAll()
job.
public async Task<Account> Retrieve(AccountId input1, DateTime input2)
{
var account = new Account();
// probably try-catch wrapper maybe useful here as replacement of !mapTask.IsCompletedSuccessfully
await Task.WhenAll((Func<Task>)(async () =>
{
T_AccountMapper map = await _tAccountMapperRepo.Retrieve(input1);
// continuation:
AccountMap accountMap = new AccountMap(map.AccountId, map.Login, map.Namespace);
account.SetAccountMap(accountMap);
})
//, (Func<Task>)(async() => {...})
);
return account;
}
As you can see, TranslateAccountMap
method isn't needed anymore.
Finally, it's up to you.
P.S. Here's a useful link.
-
1\$\begingroup\$ The "useful link" was already purple :) That's what gave me something like the pattern \$\endgroup\$Bryan B– Bryan B2020年09月29日 19:04:36 +00:00Commented Sep 29, 2020 at 19:04
Based on feedback from aepot, and lessons from AspNetCore Async Guidance, this is what I refactored the code into:
public async Task<Account> RetrieveAsync(AccountId input1, DateTime input2)
{
Account account = new Account();
await Task.WhenAll(
SetAccountMap(account, input1, input2),
ChangePayer(account, input1, input2),
SetParentMap(account, input1, input2),
SetAccountType(account, input1, input2)
);
return account;
}
private async Task SetAccountType(Account account, AccountId accountId, DateTime dateTime)
{
var baseAccount = await _tAccountRepo.RetrieveAsync(accountId);
var accountType = await _tAccountTypeRepo.RetrieveAsync(baseAccount.TypeId);
var convertedAccountType = _accountTypeConverter.Convert(accountType);
account.SetAccountType(convertedAccountType);
}
private async Task SetParentMap(Account account, AccountId accountId, DateTime dateTime)
{
var accountAncestor = await _tAccountAncestorRepo.RetrieveAsync(accountId, dateTime);
var accountMap = await _tAccountMapperRepo.RetrieveAsync(accountAncestor.DescendentId);
var convertedAccountMap = _accountMapConverter.Convert(accountMap);
account.MoveAccountToNewParentAccountMap(convertedAccountMap);
}
private async Task ChangePayer(Account account, AccountId accountId, DateTime dateTime)
{
var paymentRedirection = await _tPaymentRedirectionRepo.RetrieveAsync(accountId, dateTime);
var convertedPaymentRedirection = _paymentRedirectionConverter.Convert(paymentRedirection);
account.ChangePayerToAccount(convertedPaymentRedirection);
}
private async Task SetAccountMap(Account account, AccountId accountId, DateTime dateTime)
{
var accountMap = await _tAccountMapperRepo.RetrieveAsync(accountId);
var convertedAccountMap = _accountMapConverter.Convert(accountMap);
account.SetAccountMap(convertedAccountMap);
}
-
\$\begingroup\$ I suggest to rename
input1
andinput2
toid
anddate
accordingly.accountId
anddateTime
arguments can be shorten in the same way. Btw, I like this version of the code, looks more friendly. \$\endgroup\$aepot– aepot2020年09月30日 22:42:22 +00:00Commented Sep 30, 2020 at 22:42 -
\$\begingroup\$ @BryanBoettcher With this code you are running 4 different functions concurrently that are try to modify the same object. It can work (and will not cause race condition) if these functions are working on independent set of properties of the Account. What if you would do the necessary data retrievals concurrently and then you could construct the Account object on a single thread? With that your ET(L) would be nicely separated. Parallel Extraction and Single threaded Transformation. \$\endgroup\$Peter Csala– Peter Csala2020年10月01日 12:23:13 +00:00Commented Oct 1, 2020 at 12:23
-
\$\begingroup\$ @PeterCsala: in this case the object is thread-aware and there will not be race conditions. Good catch though, because that could definitely be problematic! Can you post an answer that would illustrate the parallel-fetch & single-threaded set? I feel like it may bring a bunch of noise back to the main method. \$\endgroup\$Bryan B– Bryan B2020年10月01日 14:51:19 +00:00Commented Oct 1, 2020 at 14:51
-
\$\begingroup\$ @BryanBoettcher I've just posted an answer, please check it. \$\endgroup\$Peter Csala– Peter Csala2020年10月02日 07:53:44 +00:00Commented Oct 2, 2020 at 7:53
-
1\$\begingroup\$ @aepot: I had to wait before accepting -- I upvoted yours, but will be accepting mine since ultimately that's what I went with. \$\endgroup\$Bryan B– Bryan B2020年10月05日 15:23:55 +00:00Commented Oct 5, 2020 at 15:23
Here I've two alternative versions of Bryan Boettcher revised code.
In his version there are 4 ETL functions and they are running concurrently.
In my solutions only the Extract part is running concurrently. The Transformation and Load parts are executed sequentially.
Alternative #1
Please bear in mind that I had to use some made up type names (like: AccountType
, PaymentRedirection
) because I could not tell the exact types based on the provided code.
public async Task<Account> RetrieveAsync(AccountId accountId, DateTime dateTime)
{
//Extract
var accountTypeJob = RetrieveAccountType(accountId);
var paymentRedirectionJob = RetrievePaymentRedirection(accountId, dateTime);
var accountMapJob = RetrieveAccountMap(accountId);
var parentAccountMapJob = RetrieveAccountMapForParent(accountId, dateTime);
//Transform
var accountType = await accountTypeJob;
var convertedAccountType = _accountTypeConverter.Convert(accountType);
var paymentRedirection = await paymentRedirectionJob;
var convertedPaymentRedirection = _paymentRedirectionConverter.Convert(paymentRedirection);
var accountMap = await accountMapJob;
var convertedAccountMap = _accountMapConverter.Convert(accountMap);
var parentAccountMap = await parentAccountMapJob;
var convertedParentAccountMap = _accountMapConverter.Convert(parentAccountMap);
//Load
var account = new Account();
account.SetAccountType(convertedAccountType);
account.ChangePayerToAccount(convertedPaymentRedirection);
account.SetAccountMap(convertedAccountMap);
account.MoveAccountToNewParentAccountMap(convertedParentAccountMap);
return account;
}
private async Task<AccountType> RetrieveAccountType(AccountId accountId)
{
var baseAccount = await _tAccountRepo.RetrieveAsync(accountId);
return await _tAccountTypeRepo.RetrieveAsync(baseAccount.TypeId);
}
private async Task<AccountMap> RetrieveAccountMapForParent(AccountId accountId, DateTime dateTime)
{
var accountAncestor = await _tAccountAncestorRepo.RetrieveAsync(accountId, dateTime);
return await _tAccountMapperRepo.RetrieveAsync(accountAncestor.DescendentId);
}
private async Task<PaymentRedirection> RetrievePaymentRedirection(AccountId accountId, DateTime dateTime)
{
return await _tPaymentRedirectionRepo.RetrieveAsync(accountId, dateTime);
}
private async Task<AccountMap> RetrieveAccountMap(AccountId accountId)
{
return await _tAccountMapperRepo.RetrieveAsync(accountId);
}
As you can see I have 4 RetrieveXYZ
functions and these are responsible for loading data. I suppose these are I/O operations, so they are not CPU-bound (concurrency is not limited by the available cores).
During the Extract phase I start all of the Task
s, so there is no need to explicitly call the Task.WhenAll
In the Transform phase I access the retrieved data via the await
keywords.
Alternative #2
With a simple helper function we can make the RetrieveAsync
function neater and more concise.
public static class TaskExtensions
{
public static async Task<(T1, T2, T3, T4)> WhenAll<T1, T2, T3, T4>(Task<T1> t1, Task<T2> t2, Task<T3> t3, Task<T4> t4)
{
return (await t1, await t2, await t3, await t4);
}
}
By taking advantage of this helper method the new version of the RetrieveAsync
would look like this:
public async Task<Account> RetrieveAsync(AccountId accountId, DateTime dateTime)
{
//Extract
var (accountType, paymentRedirection, accountMap, parentAccountMap) = await TaskExtensions.WhenAll(
RetrieveAccountType(accountId), RetrievePaymentRedirection(accountId, dateTime),
RetrieveAccountMap(accountId), RetrieveAccountMapForParent(accountId, dateTime));
//Transform
var convertedAccountType = _accountTypeConverter.Convert(accountType);
var convertedPaymentRedirection = _paymentRedirectionConverter.Convert(paymentRedirection);
var convertedAccountMap = _accountMapConverter.Convert(accountMap);
var convertedParentAccountMap = _accountMapConverter.Convert(parentAccountMap);
//Load
var account = new Account();
account.SetAccountType(convertedAccountType);
account.ChangePayerToAccount(convertedPaymentRedirection);
account.SetAccountMap(convertedAccountMap);
account.MoveAccountToNewParentAccountMap(convertedParentAccountMap);
return account;
}
Advantages
- Only the I/O bound operations are running concurrently
- Each helper function has single responsibility
- The top-level function acts as a coordinator
- There is no shared resource, so we don't have to worry that much about thread-safety
Disadvantages
- If the Transform and/or Load parts are I/O bound as well then this approach would be slower
- It is not relying on the built-in
Task.WhenAll
rather than it relies on theValueTuple
and C# 7's Deconstruction capabilities, so it can't be used in older .NET Framework versions.
result
in yourContinueWith
delegates. They are really misleading. Because you haven't specifiedTaskContinuationOptions.OnlyOnRanToCompletion
that's why your ancestor / antecedent task could have failed. IMO, calling a failed task asresult
is misleading. \$\endgroup\$