I am new to the .NET framework's Task Parallel Library (and multitasking and multithreading in general). From the literature I have read, I should just be able to create a bunch of tasks, run them and the framework should take care of spawning the necessary number of threads according to the resources available on the system. The problem I have is making thousands of HTTP requests in a timely manner. So this is the code I have.
var taskList = new List<Task>();
foreach (var request in requests)
{
    taskList.Add(client.SendAsync(request));
}
Task.WaitAll(taskList.ToArray());
client is a System.Net.Http.HttpClient object.
I am using Task.WaitAll() because this code is inside a method that is not async.
To test this code, I am making the requests to another server on the same LAN.
The requests collection holds over 15,000 items, so a task should be created for each one.
But it only manages to run about 7,000 before throwing an aggregate exception. The inner exception doesn't seem to be very helpful, stating only "A task was cancelled", though the cancellation token reports that no cancellation was requested.
The stack trace isn't very helpful either with the most recent calls shown being:
at System.Threading.Tasks.Task.WaitAll(Task[] tasks, Int32 millisecondsTimeout, CancellationToken cancellationToken)
at System.Threading.Tasks.Task.WaitAll(Task[] tasks, Int32 millisecondsTimeout)
at System.Threading.Tasks.Task.WaitAll(Task[] tasks)
I also played around with Parallel.Invoke() but that proved much worse.
var taskActionList = new List<Action>();
foreach (var request in requests)
{
    taskActionList.Add(() => client.SendAsync(request));
}
Parallel.Invoke(taskActionList.ToArray());
This does not throw any exceptions but it only runs about 1,300 tasks and the code runs to completion.
My question is, how do you use the Task Parallel Library to efficiently make a large number of HTTP requests? Is there something I am missing?
1 Answer
The TPL has no idea how to best schedule your HTTP calls. It does not even know that you are performing IO. Its heuristics are inadequate.
Usually, the optimal degree of parallelism (DOP) for IO needs to be determined experimentally. You need to write the code so that this optimal DOP is being used. None of the built-in constructs can provide you with an exact DOP; the value you specify is always a maximum.
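As a rough illustration of capping the DOP by hand, here is a minimal sketch that uses a SemaphoreSlim as a gate. The class name Throttle, the method RunAsync and the parameter maxDop are hypothetical names chosen for this example, and the value you pass for maxDop would still have to be found by measuring.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class Throttle
{
    // Hypothetical helper: runs `body` once per item, with at most `maxDop`
    // invocations in flight at any moment.
    public static async Task RunAsync<T>(IEnumerable<T> source, int maxDop, Func<T, Task> body)
    {
        using (var gate = new SemaphoreSlim(maxDop))
        {
            var tasks = source.Select(async item =>
            {
                // Wait for a free slot before starting the work.
                await gate.WaitAsync().ConfigureAwait(false);
                try
                {
                    await body(item).ConfigureAwait(false);
                }
                finally
                {
                    // Free the slot even when body throws.
                    gate.Release();
                }
            }).ToList(); // ToList forces the query, so every item gets its task now

            // Completes only after every task has finished, so disposing the
            // semaphore afterwards is safe.
            await Task.WhenAll(tasks).ConfigureAwait(false);
        }
    }
}
```

With the question's variables this might be called as Throttle.RunAsync(requests, 50, request => client.SendAsync(request)).Wait(); where 50 is only a placeholder. Timing the same batch with a few different values (for example with a Stopwatch) is one way to do the experiment described above.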
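Here, something was overloaded, causing timeouts. A cancellation exception is often a sign of a timeout (yes, this is questionable API design): HttpClient reports an elapsed Timeout as a TaskCanceledException, which is why the message says "A task was cancelled" even though nobody requested cancellation.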
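To make the timeout reading easier to check, here is a small sketch that separates "cancelled because the caller asked for it" from "cancelled for some other reason, most likely a timeout". TimeoutCheck and LooksLikeTimeoutAsync are hypothetical names, and the test is only a heuristic: it assumes the caller's own token is the only legitimate source of cancellation.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TimeoutCheck
{
    // Returns true when `work` ended in cancellation although `callerToken`
    // never requested it. For HttpClient calls that usually means the
    // client's Timeout (100 seconds by default) elapsed.
    public static async Task<bool> LooksLikeTimeoutAsync(Task work, CancellationToken callerToken)
    {
        try
        {
            await work.ConfigureAwait(false);
            return false; // completed normally
        }
        catch (OperationCanceledException) when (!callerToken.IsCancellationRequested)
        {
            // TaskCanceledException derives from OperationCanceledException,
            // so this also catches what HttpClient throws on a timeout.
            return true;
        }
        // A cancellation the caller did request, and any other exception,
        // propagates unchanged.
    }
}
```

If most of the failed requests come back as probable timeouts, lowering the number of concurrent requests is likely a better fix than raising client.Timeout.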
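You can use a ForEachAsync extension method for this (it is not built into the framework's IEnumerable<T>, so you have to supply one yourself).
requests.ForEachAsync(async request => await ProcessAsync(request)).Wait();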
async - await approach which you tried.
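The answer does not show where ForEachAsync comes from. As a sketch of one common way to write such an extension method, the version below splits the source into partitions and drains each partition on its own worker task. The signature, including the explicit dop parameter and the class name AsyncEnumerableExtensions, is an assumption made for this example.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class AsyncEnumerableExtensions
{
    // Assumed signature: `dop` is the maximum number of bodies running at once.
    public static Task ForEachAsync<T>(this IEnumerable<T> source, int dop, Func<T, Task> body)
    {
        // Create `dop` partitions over the shared source. Each worker task
        // processes its partition one item at a time, so no more than `dop`
        // bodies can be in flight together.
        var workers = Partitioner.Create(source)
            .GetPartitions(dop)
            .Select(partition => Task.Run(async () =>
            {
                using (partition)
                {
                    while (partition.MoveNext())
                    {
                        await body(partition.Current);
                    }
                }
            }));

        // Task.WhenAll enumerates the query, which starts all workers.
        return Task.WhenAll(workers);
    }
}
```

With the question's variables, a call could look like requests.ForEachAsync(50, request => client.SendAsync(request)).Wait(); where 50 is again only a starting point to be tuned by measurement.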