Task manager object for lazy initialization

Question 1

The Problem

I am developing a Visual Studio extension for importing code from a server.

Because of the way Visual Studio works, multiple threads could try to perform the same operation at the same time.

Thus, I wanted a class that would ensure that a) each Task would only be performed when needed and b) each Task would only be performed once.

The only exception to b) is if the requesters cancel and then later another request is made in which case if the operation was not finished, it should be retried.

Intended functionality:

A TaskManager object is created.
At some point, TaskManager.Value is called on the object and the Task is set to run.
If TaskManger.Value is called multiple times (possibly on other threads), the other executions are set waiting on the Task that is already in progress.
If the CancellationToken indicates a cancellation request, only if all the requestors agree should the Task be cancelled.
If the TaskManager object is cancelled before completion and later is requested again, the Task should be reinitialized.

The Code

public class TaskManager<T>
{
 readonly Func<CancellationToken, Task<T>> _initializer;
 public TaskManager(Func<CancellationToken, Task<T>> initializer) => _initializer = initializer;
 Task<T>? _task;
 CancellationTokenSource _cancellationToken = new CancellationTokenSource();
 readonly object _initializationLock = new();
 readonly ConcurrentBag<CancellationToken> _cancellationTokens = new();
 public Task<T> Value (CancellationToken cancellationToken)
 {
 if (cancellationToken != default)
 {
 if (cancellationToken.IsCancellationRequested) return Task.FromResult<T>(default);
 _cancellationTokens.Add(cancellationToken);
 cancellationToken.Register(() =>
 {
 if (_task is not { IsCompleted: true } && _cancellationTokens.All(c => c.IsCancellationRequested))
 _cancellationToken.Cancel();
 });
 }
 if (_task != null && !_cancellationToken.IsCancellationRequested) return _task;
 lock (_initializationLock)
 {
 if (_task != null) return _task;
 _cancellationToken = new CancellationTokenSource();
 _task = _initializer(_cancellationToken.Token);
 }
 return _task;
 }
}

Example Usage

public static class CodepediaApi
{
 private static readonly RestClient client = new RestClient("https://localhost:5001"); // Test Server
 private static readonly Dictionary<string, TaskManager<SearchResult[]>> responses = new Dictionary<string, TaskManager<SearchResult[]>>();
 public static Task<SearchResult[]> Search(string query, CancellationToken cancellationToken) =>
 responses.TryGetOrAdd(query,
 () => new TaskManager<SearchResult[]>(async c =>
 (await client.GetAsync<SearchResult[]>(new RestRequest("/api/search").AddQueryParameter("q", query), c)) ?? new SearchResult[0]
 )
 ).Value(cancellationToken);
}

where TryGetOrAdd is defined as follows:

public static class Extensions
{
 public static TValue TryGetOrAdd<TKey, TValue>(this IDictionary<TKey, TValue> d, TKey key, Func<TValue> toAdd)
 {
 if (d.TryGetValue(key, out TValue value))
 return value;
 TValue adding = toAdd();
 lock (d)
 {
 if (d.TryGetValue(key, out value))
 return value;
 d.Add(key, adding);
 }
 return adding;
 }
}

Another Example Usage

Here is one of my classes:

public class CodepediaUtil
{
 public readonly Workspace Workspace;
 public readonly SnapshotSpan Range;
 public readonly TaskManager<(Document document, SemanticModel semanticModel, SyntaxNode outerNode, UnresolvedMethodCallInfo unresolvedInvocation)?> CodeAnalysisInfo;
 public readonly TaskManager<SearchResult[]> SearchResults;
 public readonly TaskManager<Dictionary<SearchResult, WikiEntry>> SearchResultsInfo;
 public async Task<UnresolvedMethodCallInfo?> UnresolvedInvocation (CancellationToken cancellationToken)
 {
 var unresolvedInvocationInfo = await CodeAnalysisInfo.Value(cancellationToken);
 if (unresolvedInvocationInfo is not (_, _, _, UnresolvedMethodCallInfo unresolvedInvocation)) return null;
 return unresolvedInvocation;
 }
 public CodepediaUtil (Workspace workspace, SnapshotSpan range)
 {
 Workspace = workspace;
 Range = range;
 CodeAnalysisInfo = new TaskManager<(Document document, SemanticModel semanticModel, SyntaxNode outerNode, UnresolvedMethodCallInfo unresolvedInvocation)?>
 (
 async (CancellationToken cancellationToken) =>
 {
 Document? document = Range.Snapshot.TextBuffer.GetRelatedDocuments().FirstOrDefault();
 if (document == null) return null;
 SemanticModel? semanticModel = await document.GetSemanticModelAsync(cancellationToken);
 if (semanticModel == null) return null;
 SyntaxNode? outerNode = await CodeAnalyzer.GetOuterNodeAsync(semanticModel, Range.Start, cancellationToken);
 if (outerNode == null) return null;
 UnresolvedMethodCallInfo? unresolvedInvocation = outerNode.FindUnresolvedInvocation(semanticModel);
 if (unresolvedInvocation == null) return null;
 return (document, semanticModel, outerNode, unresolvedInvocation);
 }
 );
 SearchResults = new TaskManager<SearchResult[]>
 (
 async (CancellationToken cancellationToken) =>
 {
 UnresolvedMethodCallInfo? unresolvedInvocation = await UnresolvedInvocation(cancellationToken);
 if (unresolvedInvocation == null) return new SearchResult[0];
 return await CodepediaApi.Search(unresolvedInvocation.Name, cancellationToken);
 }
 );
 SearchResultsInfo = new TaskManager<Dictionary<SearchResult, WikiEntry>>
 (
 async (CancellationToken cancellationToken) =>
 {
 // UnresolvedMethodCallInfo? unresolvedInvocation = await UnresolvedInvocation(cancellationToken);
 // if (unresolvedInvocation == null) return new Dictionary<SearchResult, WikiEntry>();
 // string invocationName = unresolvedInvocation.Name;
 return await CodepediaApi.InterpretSearchResults(await SearchResults.Value(cancellationToken), cancellationToken);
 }
 );
 }
}

This coding pattern should allow other classes to request exactly what they need on a per-need basis.

What I want reviewed

Is my code useful? (Is there a better way to do this?)
Could the performance/elegance of my code improved? (Any comments to make it better would be appreciated.)
Did I make a mistake? (Is my code thread-safe? Are there any race conditions or ways it could get stuck?)

Question 2

Maybe it is too Friday to me, but I don't understand the basic problem which you want to solve. each Task would only be performed once << A Task can be completed either with failure or with success only once. You can await a Task as many times as you want but that will not trigger a new execution. each Task would only be performed when needed << what do mean by that? Do want to have a cold start on the Task?

Question 3

@PeterCsala, the way it differs from Task is that it is not started until the Value function is called on it but if multiple calls are made to the Value function, the Task is only started once.

Question 4

Can't you achieve that with a Lazy<Task>?

Question 5

@PeterCsala, yeah, it's the same idea but I don't think that it will reinitialize the Task if it is cancelled. Eg. a scenario where a user searches for something, cancels and then searches for the same thing again should retry.

Question 6

Well that's a valid point. I will review it next week.

Question 7

Here are my observations:

`TryGetOrAdd` of `Extensions`

Naming a parameter to d is not really a good choice
I guess you have performed the TryGetValue twice due to the double-checked lock
- I would like to warn you with Jon Skeet's words

Double-checking locking now works in Java as well as C# (the Java memory model changed and this is one of the effects). However, you have to get it exactly right. If you mess things up even slightly, you may well end up losing the thread safety.

Locking on the d is not a good idea since it is provided by the caller (so, it can be considered as public)
- Related SO topic
- Pluralsight Best practices
Instead of having three return statements you can achieve the same with two

private static readonly object syncObject = new();
public static TValue TryGetOrAdd<TKey, TValue>(this IDictionary<TKey, TValue> dict, TKey key, Func<TValue> toAdd)
{
 if (dict.TryGetValue(key, out TValue value))
 return value;
 
 lock (syncObject)
 {
 if (!dict.TryGetValue(key, out value))
 {
 value = toAdd();
 dict.Add(key, value);
 }
 return value;
 }
}

`TaskManager`

Is there any particular reason why did you not use target-typed new expression for _cancellationToken?
- By the way _cancellationToken is a really bad name for this member since it is a CancellationTokenSource not a CancellationToken
It is not absolutely clear for me why did you pass the initializer via the ctor and the cancellationToken via the Value method
- Can't you pass the the two either via ctor or via the method?
In case of C# a method usually starts with verb, so GetValue or RetrieveValue might be better
If the IsCancellationRequested is true then would it make more sense to call Task.FromCanceled<T>(cancellationToken) instead?
Here yet again you can simplify your lock block to have a single return statement

lock (_initializationLock)
{
 if (_task is null)
 {
 _cancellationTokenSource = new();
 _task = _initializer(_cancellationTokenSource.Token);
 }
 return _task;
}

`CodepediaApi`'s `Search`

Here I've only two tiny suggestions
- c is not the best name for CancellationToken
- Instead of allocating empty array (new SearchResult[0] rather prefer Array.Empty<SearchResult>()

UPDATE #1: Reflect to the first comment

The method of creating a static variable called syncObject would mean that only one Dictionary could be accessed at a time by the TryGetOrAdd function. In my code's case that work fine as the only Dictionary is the search dictionary. The issue with using another object as a sync object is that since dict is not locked, the first call to dict.TryGetValue(key, out TValue value) could be accessing the Dictionary while it's an invalid state. So, the only solution would be to surround the whole function with lock (syncObject) which would decrease performance

The core problem here is ownership. Since the dictionary is not owned by the method or the class of the method anyone can access it (and potentially modify it) since the lock scope is the method, not the object. Let me demonstrate with a simple example:

public static void Main()
{
 var dict = new Dictionary<string, int>();
 var lockAcquired = new TaskCompletionSource<object>();
 _ = Task.Run(async () => { await lockAcquired.Task; dict.Add("1", 1);});
 Add(dict, lockAcquired);
}
public static void Add(Dictionary<string, int> dict, TaskCompletionSource<object> signal)
{
 lock(dict)
 {
 signal.SetResult(null);
 Thread.Sleep(100);
 dict.Add("1", 1);
 }
}

This code will throw an exception inside the Add method:

System.ArgumentException: An item with the same key has already been added. Key: 1

So, as you can see neither mine nor your solution is fine.

Question 8

The method of creating a static variable called syncObject would mean that only one Dictionary could be accessed at a time by the TryGetOrAdd function. In my code's case that work fine as the only Dictionary is the search dictionary. The issue with using another object as a sync object is that since dict is not locked, the first call to dict.TryGetValue(key, out TValue value) could be accessing the Dictionary while it's an invalid state. So, the only solution would be to surround the whole function with lock (syncObject) which would decrease performance.

Question 9

The reason cancellationToken is provided in the Value function not the constructor is that each requester of the TaskManager has their own cancellationToken. Providing the initializer in the constructor allows TaskManager objects to be passed around without actually starting the task.

Question 10

@trinalbadger587 I've updated my post to reflect your first comment, please check it.

Question 11

As I understand it, the only failure mode for my existing code is if someone adds a key value pair to the Dictionary without locking it while the first TryGetValue function is running as once the Dictionary is locked TryGetValue is called again. I was thinking about what you said about code ownership and I thought that to help improve my code style I could create another class something like public class MyDictionary<K, V> : Dictionary<K, V> {}. Then, I could have the TryGetOrAdd function only accept MyDictionary to prevent misuse with objects managed by other code.

Question 12

@trinalbadger587 Well it depends. Lets suppose you have 3 threads (2 readers, 1 writer) then it might happen that the two reads receive different results. If you have multiple readers and the above behavior is unacceptable then you might need to consider to use a read-write lock. Anyway I don't want to have a too lengthy conversation. If you have any question I'm happy to answer otherwise have a great week!

Peter Csala Peter Csala 10.7k1 gold badge16 silver badges36 bronze badges · Accepted Answer · 2022-01-31 15:09:37Z

Here are my observations:

`TryGetOrAdd` of `Extensions`

Naming a parameter to d is not really a good choice
I guess you have performed the TryGetValue twice due to the double-checked lock
- I would like to warn you with Jon Skeet's words

Double-checking locking now works in Java as well as C# (the Java memory model changed and this is one of the effects). However, you have to get it exactly right. If you mess things up even slightly, you may well end up losing the thread safety.

Locking on the d is not a good idea since it is provided by the caller (so, it can be considered as public)
- Related SO topic
- Pluralsight Best practices
Instead of having three return statements you can achieve the same with two

private static readonly object syncObject = new();
public static TValue TryGetOrAdd<TKey, TValue>(this IDictionary<TKey, TValue> dict, TKey key, Func<TValue> toAdd)
{
 if (dict.TryGetValue(key, out TValue value))
 return value;
 
 lock (syncObject)
 {
 if (!dict.TryGetValue(key, out value))
 {
 value = toAdd();
 dict.Add(key, value);
 }
 return value;
 }
}

`TaskManager`

Is there any particular reason why did you not use target-typed new expression for _cancellationToken?
- By the way _cancellationToken is a really bad name for this member since it is a CancellationTokenSource not a CancellationToken
It is not absolutely clear for me why did you pass the initializer via the ctor and the cancellationToken via the Value method
- Can't you pass the the two either via ctor or via the method?
In case of C# a method usually starts with verb, so GetValue or RetrieveValue might be better
If the IsCancellationRequested is true then would it make more sense to call Task.FromCanceled<T>(cancellationToken) instead?
Here yet again you can simplify your lock block to have a single return statement

lock (_initializationLock)
{
 if (_task is null)
 {
 _cancellationTokenSource = new();
 _task = _initializer(_cancellationTokenSource.Token);
 }
 return _task;
}

`CodepediaApi`'s `Search`

Here I've only two tiny suggestions
- c is not the best name for CancellationToken
- Instead of allocating empty array (new SearchResult[0] rather prefer Array.Empty<SearchResult>()

UPDATE #1: Reflect to the first comment

The method of creating a static variable called syncObject would mean that only one Dictionary could be accessed at a time by the TryGetOrAdd function. In my code's case that work fine as the only Dictionary is the search dictionary. The issue with using another object as a sync object is that since dict is not locked, the first call to dict.TryGetValue(key, out TValue value) could be accessing the Dictionary while it's an invalid state. So, the only solution would be to surround the whole function with lock (syncObject) which would decrease performance

The core problem here is ownership. Since the dictionary is not owned by the method or the class of the method anyone can access it (and potentially modify it) since the lock scope is the method, not the object. Let me demonstrate with a simple example:

public static void Main()
{
 var dict = new Dictionary<string, int>();
 var lockAcquired = new TaskCompletionSource<object>();
 _ = Task.Run(async () => { await lockAcquired.Task; dict.Add("1", 1);});
 Add(dict, lockAcquired);
}
public static void Add(Dictionary<string, int> dict, TaskCompletionSource<object> signal)
{
 lock(dict)
 {
 signal.SetResult(null);
 Thread.Sleep(100);
 dict.Add("1", 1);
 }
}

This code will throw an exception inside the Add method:

System.ArgumentException: An item with the same key has already been added. Key: 1

So, as you can see neither mine nor your solution is fine.

The method of creating a static variable called syncObject would mean that only one Dictionary could be accessed at a time by the TryGetOrAdd function. In my code's case that work fine as the only Dictionary is the search dictionary. The issue with using another object as a sync object is that since dict is not locked, the first call to dict.TryGetValue(key, out TValue value) could be accessing the Dictionary while it's an invalid state. So, the only solution would be to surround the whole function with lock (syncObject) which would decrease performance.
The reason cancellationToken is provided in the Value function not the constructor is that each requester of the TaskManager has their own cancellationToken. Providing the initializer in the constructor allows TaskManager objects to be passed around without actually starting the task.
@trinalbadger587 I've updated my post to reflect your first comment, please check it.
As I understand it, the only failure mode for my existing code is if someone adds a key value pair to the Dictionary without locking it while the first TryGetValue function is running as once the Dictionary is locked TryGetValue is called again. I was thinking about what you said about code ownership and I thought that to help improve my code style I could create another class something like public class MyDictionary<K, V> : Dictionary<K, V> {}. Then, I could have the TryGetOrAdd function only accept MyDictionary to prevent misuse with objects managed by other code.
@trinalbadger587 Well it depends. Lets suppose you have 3 threads (2 readers, 1 writer) then it might happen that the two reads receive different results. If you have multiple readers and the above behavior is unacceptable then you might need to consider to use a read-write lock. Anyway I don't want to have a too lengthy conversation. If you have any question I'm happy to answer otherwise have a great week!

Stack Exchange Network

Task manager object for lazy initialization

The Problem

The Code

Example Usage

Another Example Usage

What I want reviewed

1 Answer 1

`TryGetOrAdd` of `Extensions`

`TaskManager`

`CodepediaApi`'s `Search`

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Task manager object for lazy initialization

The Problem

The Code

Example Usage

Another Example Usage

What I want reviewed

1 Answer 1

TryGetOrAdd of Extensions

TaskManager

CodepediaApi's Search

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions

`TryGetOrAdd` of `Extensions`

`TaskManager`

`CodepediaApi`'s `Search`