The Problem
I am developing a Visual Studio extension for importing code from a server.
Because of the way Visual Studio works, multiple threads could try to perform the same operation at the same time.
Thus, I wanted a class that would ensure that a) each Task
would only be performed when needed and b) each Task
would only be performed once.
The only exception to b) is if the requesters cancel and then later another request is made in which case if the operation was not finished, it should be retried.
Intended functionality:
- A
TaskManager
object is created. - At some point,
TaskManager.Value
is called on the object and theTask
is set to run. - If
TaskManger.Value
is called multiple times (possibly on other threads), the other executions are set waiting on theTask
that is already in progress. - If the
CancellationToken
indicates a cancellation request, only if all the requestors agree should theTask
be cancelled. - If the
TaskManager
object is cancelled before completion and later is requested again, theTask
should be reinitialized.
The Code
public class TaskManager<T>
{
readonly Func<CancellationToken, Task<T>> _initializer;
public TaskManager(Func<CancellationToken, Task<T>> initializer) => _initializer = initializer;
Task<T>? _task;
CancellationTokenSource _cancellationToken = new CancellationTokenSource();
readonly object _initializationLock = new();
readonly ConcurrentBag<CancellationToken> _cancellationTokens = new();
public Task<T> Value (CancellationToken cancellationToken)
{
if (cancellationToken != default)
{
if (cancellationToken.IsCancellationRequested) return Task.FromResult<T>(default);
_cancellationTokens.Add(cancellationToken);
cancellationToken.Register(() =>
{
if (_task is not { IsCompleted: true } && _cancellationTokens.All(c => c.IsCancellationRequested))
_cancellationToken.Cancel();
});
}
if (_task != null && !_cancellationToken.IsCancellationRequested) return _task;
lock (_initializationLock)
{
if (_task != null) return _task;
_cancellationToken = new CancellationTokenSource();
_task = _initializer(_cancellationToken.Token);
}
return _task;
}
}
Example Usage
public static class CodepediaApi
{
private static readonly RestClient client = new RestClient("https://localhost:5001"); // Test Server
private static readonly Dictionary<string, TaskManager<SearchResult[]>> responses = new Dictionary<string, TaskManager<SearchResult[]>>();
public static Task<SearchResult[]> Search(string query, CancellationToken cancellationToken) =>
responses.TryGetOrAdd(query,
() => new TaskManager<SearchResult[]>(async c =>
(await client.GetAsync<SearchResult[]>(new RestRequest("/api/search").AddQueryParameter("q", query), c)) ?? new SearchResult[0]
)
).Value(cancellationToken);
}
where TryGetOrAdd
is defined as follows:
public static class Extensions
{
public static TValue TryGetOrAdd<TKey, TValue>(this IDictionary<TKey, TValue> d, TKey key, Func<TValue> toAdd)
{
if (d.TryGetValue(key, out TValue value))
return value;
TValue adding = toAdd();
lock (d)
{
if (d.TryGetValue(key, out value))
return value;
d.Add(key, adding);
}
return adding;
}
}
Another Example Usage
Here is one of my classes:
public class CodepediaUtil
{
public readonly Workspace Workspace;
public readonly SnapshotSpan Range;
public readonly TaskManager<(Document document, SemanticModel semanticModel, SyntaxNode outerNode, UnresolvedMethodCallInfo unresolvedInvocation)?> CodeAnalysisInfo;
public readonly TaskManager<SearchResult[]> SearchResults;
public readonly TaskManager<Dictionary<SearchResult, WikiEntry>> SearchResultsInfo;
public async Task<UnresolvedMethodCallInfo?> UnresolvedInvocation (CancellationToken cancellationToken)
{
var unresolvedInvocationInfo = await CodeAnalysisInfo.Value(cancellationToken);
if (unresolvedInvocationInfo is not (_, _, _, UnresolvedMethodCallInfo unresolvedInvocation)) return null;
return unresolvedInvocation;
}
public CodepediaUtil (Workspace workspace, SnapshotSpan range)
{
Workspace = workspace;
Range = range;
CodeAnalysisInfo = new TaskManager<(Document document, SemanticModel semanticModel, SyntaxNode outerNode, UnresolvedMethodCallInfo unresolvedInvocation)?>
(
async (CancellationToken cancellationToken) =>
{
Document? document = Range.Snapshot.TextBuffer.GetRelatedDocuments().FirstOrDefault();
if (document == null) return null;
SemanticModel? semanticModel = await document.GetSemanticModelAsync(cancellationToken);
if (semanticModel == null) return null;
SyntaxNode? outerNode = await CodeAnalyzer.GetOuterNodeAsync(semanticModel, Range.Start, cancellationToken);
if (outerNode == null) return null;
UnresolvedMethodCallInfo? unresolvedInvocation = outerNode.FindUnresolvedInvocation(semanticModel);
if (unresolvedInvocation == null) return null;
return (document, semanticModel, outerNode, unresolvedInvocation);
}
);
SearchResults = new TaskManager<SearchResult[]>
(
async (CancellationToken cancellationToken) =>
{
UnresolvedMethodCallInfo? unresolvedInvocation = await UnresolvedInvocation(cancellationToken);
if (unresolvedInvocation == null) return new SearchResult[0];
return await CodepediaApi.Search(unresolvedInvocation.Name, cancellationToken);
}
);
SearchResultsInfo = new TaskManager<Dictionary<SearchResult, WikiEntry>>
(
async (CancellationToken cancellationToken) =>
{
// UnresolvedMethodCallInfo? unresolvedInvocation = await UnresolvedInvocation(cancellationToken);
// if (unresolvedInvocation == null) return new Dictionary<SearchResult, WikiEntry>();
// string invocationName = unresolvedInvocation.Name;
return await CodepediaApi.InterpretSearchResults(await SearchResults.Value(cancellationToken), cancellationToken);
}
);
}
}
This coding pattern should allow other classes to request exactly what they need on a per-need basis.
What I want reviewed
- Is my code useful? (Is there a better way to do this?)
- Could the performance/elegance of my code improved? (Any comments to make it better would be appreciated.)
- Did I make a mistake? (Is my code thread-safe? Are there any race conditions or ways it could get stuck?)
1 Answer 1
Here are my observations:
TryGetOrAdd
of Extensions
- Naming a parameter to
d
is not really a good choice - I guess you have performed the
TryGetValue
twice due to the double-checked lock- I would like to warn you with Jon Skeet's words
Double-checking locking now works in Java as well as C# (the Java memory model changed and this is one of the effects). However, you have to get it exactly right. If you mess things up even slightly, you may well end up losing the thread safety.
- Locking on the
d
is not a good idea since it is provided by the caller (so, it can be considered aspublic)
- Instead of having three
return
statements you can achieve the same with two
private static readonly object syncObject = new();
public static TValue TryGetOrAdd<TKey, TValue>(this IDictionary<TKey, TValue> dict, TKey key, Func<TValue> toAdd)
{
if (dict.TryGetValue(key, out TValue value))
return value;
lock (syncObject)
{
if (!dict.TryGetValue(key, out value))
{
value = toAdd();
dict.Add(key, value);
}
return value;
}
}
TaskManager
- Is there any particular reason why did you not use target-typed new expression for
_cancellationToken
?- By the way
_cancellationToken
is a really bad name for this member since it is a CancellationTokenSource not a CancellationToken
- By the way
- It is not absolutely clear for me why did you pass the
initializer
via the ctor and thecancellationToken
via theValue
method- Can't you pass the the two either via ctor or via the method?
- In case of C# a method usually starts with verb, so
GetValue
orRetrieveValue
might be better - If the
IsCancellationRequested
is true then would it make more sense to callTask.FromCanceled<T>(cancellationToken)
instead? - Here yet again you can simplify your
lock
block to have a singlereturn
statement
lock (_initializationLock)
{
if (_task is null)
{
_cancellationTokenSource = new();
_task = _initializer(_cancellationTokenSource.Token);
}
return _task;
}
CodepediaApi
's Search
- Here I've only two tiny suggestions
c
is not the best name forCancellationToken
- Instead of allocating empty array (
new SearchResult[0]
rather preferArray.Empty<SearchResult>()
UPDATE #1: Reflect to the first comment
The method of creating a static variable called syncObject would mean that only one Dictionary could be accessed at a time by the TryGetOrAdd function. In my code's case that work fine as the only Dictionary is the search dictionary. The issue with using another object as a sync object is that since dict is not locked, the first call to dict.TryGetValue(key, out TValue value) could be accessing the Dictionary while it's an invalid state. So, the only solution would be to surround the whole function with lock (syncObject) which would decrease performance
The core problem here is ownership. Since the dictionary is not owned by the method or the class of the method anyone can access it (and potentially modify it) since the lock
scope is the method, not the object. Let me demonstrate with a simple example:
public static void Main()
{
var dict = new Dictionary<string, int>();
var lockAcquired = new TaskCompletionSource<object>();
_ = Task.Run(async () => { await lockAcquired.Task; dict.Add("1", 1);});
Add(dict, lockAcquired);
}
public static void Add(Dictionary<string, int> dict, TaskCompletionSource<object> signal)
{
lock(dict)
{
signal.SetResult(null);
Thread.Sleep(100);
dict.Add("1", 1);
}
}
This code will throw an exception inside the Add
method:
System.ArgumentException: An item with the same key has already been added. Key: 1
So, as you can see neither mine nor your solution is fine.
-
\$\begingroup\$ The method of creating a static variable called
syncObject
would mean that only oneDictionary
could be accessed at a time by theTryGetOrAdd
function. In my code's case that work fine as the onlyDictionary
is the search dictionary. The issue with using another object as a sync object is that sincedict
is not locked, the first call todict.TryGetValue(key, out TValue value)
could be accessing theDictionary
while it's an invalid state. So, the only solution would be to surround the whole function withlock (syncObject)
which would decrease performance. \$\endgroup\$trinalbadger587– trinalbadger5872022年01月31日 23:04:56 +00:00Commented Jan 31, 2022 at 23:04 -
\$\begingroup\$ The reason
cancellationToken
is provided in theValue
function not the constructor is that each requester of theTaskManager
has their owncancellationToken
. Providing the initializer in the constructor allowsTaskManager
objects to be passed around without actually starting the task. \$\endgroup\$trinalbadger587– trinalbadger5872022年01月31日 23:25:57 +00:00Commented Jan 31, 2022 at 23:25 -
\$\begingroup\$ @trinalbadger587 I've updated my post to reflect your first comment, please check it. \$\endgroup\$Peter Csala– Peter Csala2022年02月01日 07:47:07 +00:00Commented Feb 1, 2022 at 7:47
-
1\$\begingroup\$ As I understand it, the only failure mode for my existing code is if someone adds a key value pair to the
Dictionary
without locking it while the firstTryGetValue
function is running as once theDictionary
is lockedTryGetValue
is called again. I was thinking about what you said about code ownership and I thought that to help improve my code style I could create another class something likepublic class MyDictionary<K, V> : Dictionary<K, V> {}
. Then, I could have theTryGetOrAdd
function only acceptMyDictionary
to prevent misuse with objects managed by other code. \$\endgroup\$trinalbadger587– trinalbadger5872022年02月02日 00:06:14 +00:00Commented Feb 2, 2022 at 0:06 -
1\$\begingroup\$ @trinalbadger587 Well it depends. Lets suppose you have 3 threads (2 readers, 1 writer) then it might happen that the two reads receive different results. If you have multiple readers and the above behavior is unacceptable then you might need to consider to use a read-write lock. Anyway I don't want to have a too lengthy conversation. If you have any question I'm happy to answer otherwise have a great week! \$\endgroup\$Peter Csala– Peter Csala2022年02月02日 06:57:19 +00:00Commented Feb 2, 2022 at 6:57
Explore related questions
See similar questions with these tags.
await
a Task as many times as you want but that will not trigger a new execution. each Task would only be performed when needed << what do mean by that? Do want to have a cold start on the Task? \$\endgroup\$Task
is that it is not started until theValue
function is called on it but if multiple calls are made to theValue
function, theTask
is only started once. \$\endgroup\$Task
if it is cancelled. Eg. a scenario where a user searches for something, cancels and then searches for the same thing again should retry. \$\endgroup\$