I've googled this plenty, but I'm afraid I don't fully understand the consequences of concurrency and parallelism.
I have about 3000 rows of database objects, each with an average of 2-4 logical data items attached that need to be validated as part of a search query, meaning the validation service needs to execute approximately 3 * 3000 times. For example, if the user has filtered on color, each row needs to validate the color and return the result. The loop cannot break when a match has been found, so all logical objects always need to be evaluated (this is because relevance is calculated, not just whether there is a match).
This is done on demand when the user selects various properties, so performance is key here.
I'm currently doing this with Parallel.ForEach, but I wonder whether it would be smarter to use async behavior instead.
Current way
    var validatorService = new LogicalGroupValidatorService();
    // ConcurrentBag is thread-safe, so parallel iterations can add to it directly.
    ConcurrentBag<StandardSearchResult> results = new ConcurrentBag<StandardSearchResult>();
    Parallel.ForEach(searchGroups, (group) =>
    {
        var searchGroupResult = validatorService.ValidateLogicGroupRecursivly(
            propertySearchQuery, group.StandardPropertyLogicalGroup);
        results.Add(new StandardSearchResult(searchGroupResult));
    });
Async example code
    var validatorService = new LogicalGroupValidatorService();
    List<StandardSearchResult> results = new List<StandardSearchResult>();
    var tasks = new List<Task<StandardPropertyLogicalGroupSearchResult>>();
    foreach (var group in searchGroups)
    {
        tasks.Add(validatorService.ValidateLogicGroupRecursivlyAsync(
            propertySearchQuery, group.StandardPropertyLogicalGroup));
    }
    await Task.WhenAll(tasks);
    results = tasks.Select(logicalGroupResultTask =>
        new StandardSearchResult(logicalGroupResultTask.Result)).ToList();
- Since you seem to have both versions implemented, what does it look like if you measure the execution times of both? – Risto M, Feb 27, 2018 at 9:22
- Did you measure the difference in performance? That is the only way to be sure. That said, my guess is that parallel should perform better in this case, as async is mainly "do not block the main thread while waiting on other systems". – Hans Kesting, Feb 27, 2018 at 9:23
- So does ValidateLogicGroupRecursivly work with the database, or is everything done in memory? – Evk, Feb 27, 2018 at 9:32
- For 3000 or 9000 rows it doesn't matter: unless you are doing round trips to the database or CPU-heavy work per row, it's not going to matter. 9000 iterations of a loop to calculate some simple formula or value is nothing for today's computers. Unless you have measured real-world bottleneck issues with it, just do it on a single thread. Starting/queuing too many threads in ASP.NET Core may actually lower your overall performance instead of increasing it (ASP.NET Core starts rejecting connections in a high-traffic situation when it runs out of (queued) threads). – Tseng, Feb 27, 2018 at 9:33
- And if you have to do any round trips to the DB, there may be better ways to solve it (find out the values you need in code and fetch them all in a single query, then perform the calculation locally). (Heavy CPU means something like 1-2 ms per row, so that the whole calculation would take 9 to 18 seconds. Typical calculations on today's CPUs are in the range of ns, so 9000 records make no big difference if it takes 1 or 2 ms.) – Tseng, Feb 27, 2018 at 9:36
1 Answer
The difference between parallel and async is this:
- Parallel: Spin up multiple threads and divide the work over each thread
- Async: Do the work in a non-blocking manner.
Whether this makes a difference depends on what is blocking in the async case. If the work is CPU-bound, the CPU is the bottleneck, so you will still end up using multiple threads. If it is I/O (or anything else besides the CPU), you will reuse the same thread.
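To make the CPU-bound versus I/O-bound distinction concrete, here is a minimal sketch. Everything in it is a hypothetical stand-in, not code from the question: `SimulatedIoAsync` uses `Task.Delay` to imitate waiting on an external system, and `SimulatedCpuWork` is a busy loop that occupies a thread. The timings it prints depend on the machine and the thread pool, so treat them as indicative only.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class AsyncVsCpuDemo
{
    // Hypothetical I/O stand-in: Task.Delay is timer-based and holds no thread while waiting.
    public static async Task<int> SimulatedIoAsync(int id)
    {
        await Task.Delay(200);
        return id;
    }

    // Hypothetical CPU-bound stand-in: a busy loop that keeps one thread occupied for ~200 ms.
    public static int SimulatedCpuWork(int id)
    {
        var sw = Stopwatch.StartNew();
        while (sw.ElapsedMilliseconds < 200) { }
        return id;
    }

    // Starts `count` I/O-style operations and awaits them all; returns elapsed milliseconds.
    public static async Task<long> TimeIoAsync(int count)
    {
        var sw = Stopwatch.StartNew();
        await Task.WhenAll(Enumerable.Range(0, count).Select(i => SimulatedIoAsync(i)));
        return sw.ElapsedMilliseconds;
    }

    // Queues `count` CPU-bound items to the thread pool via Task.Run; returns elapsed milliseconds.
    public static async Task<long> TimeCpuAsync(int count)
    {
        var sw = Stopwatch.StartNew();
        await Task.WhenAll(Enumerable.Range(0, count).Select(i => Task.Run(() => SimulatedCpuWork(i))));
        return sw.ElapsedMilliseconds;
    }

    public static async Task Main()
    {
        Console.WriteLine($"100 I/O-style waits: {await TimeIoAsync(100)} ms");
        Console.WriteLine($"100 CPU-bound items: {await TimeCpuAsync(100)} ms on {Environment.ProcessorCount} cores");
    }
}
```

The I/O-style waits can all overlap because none of them needs a thread while pending, whereas the CPU-bound items are limited by how many threads can actually run at once.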
For your particular example that means the following:
Parallel.ForEach
=> Partition the items in the list across multiple thread-pool threads (the number of threads is managed by the CLR) and execute the items concurrently on those threads.
async/await
=> Do this bit of work, but let me continue execution. Since you have many items, that means saying this multiple times. What happens next depends on where the work is done:
- If this bit of work is on the CPU, the effect is the same.
- Otherwise, you'll just use a single thread while the work is being done somewhere else.
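Since the right choice ultimately depends on measurement, a rough way to compare a plain sequential loop with Parallel.ForEach is sketched below. The `Validate` method is a hypothetical in-memory stand-in for `ValidateLogicGroupRecursivly`, and the 9000-item input mirrors the approximate count from the question; real numbers will differ with the actual validation logic.

```csharp
using System;
using System.Collections.Concurrent;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class ValidationBenchmark
{
    // Hypothetical stand-in for the real validator: a small amount of pure CPU work.
    public static int Validate(int group)
    {
        int score = 0;
        for (int i = 0; i < 1000; i++) score += (group ^ i) & 1;
        return score;
    }

    // Single-threaded baseline; results keep the input order.
    public static int[] RunSequential(int[] groups) =>
        groups.Select(g => Validate(g)).ToArray();

    // Parallel version; ConcurrentBag is thread-safe but does not preserve order.
    public static int[] RunParallel(int[] groups)
    {
        var results = new ConcurrentBag<int>();
        Parallel.ForEach(groups, g => results.Add(Validate(g)));
        return results.ToArray();
    }

    public static void Main()
    {
        int[] groups = Enumerable.Range(0, 9000).ToArray();

        // Warm-up so JIT compilation does not distort the first measurement.
        RunSequential(groups);
        RunParallel(groups);

        var sw = Stopwatch.StartNew();
        int[] sequential = RunSequential(groups);
        Console.WriteLine($"Sequential:       {sw.Elapsed.TotalMilliseconds:F2} ms");

        sw.Restart();
        int[] parallel = RunParallel(groups);
        Console.WriteLine($"Parallel.ForEach: {sw.Elapsed.TotalMilliseconds:F2} ms");

        Console.WriteLine($"Same total score: {sequential.Sum() == parallel.Sum()}");
    }
}
```

For anything beyond a quick check, a dedicated benchmarking harness gives more trustworthy numbers than a single Stopwatch run.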
5 Comments
- var results = searchGroups.Select(group => new StandardSearchResult(validatorService.ValidateLogicGroupRecursivly(propertySearchQuery, group.StandardPropertyLogicalGroup))).ToList(); and just be happy with it?
- 1) Parallel.ForEach does not spin up new threads, it uses tasks from the thread pool, and 2) calling async/await on Task.WhenAll cannot possibly use a single thread; on the contrary, Task.WhenAll(tasks) will always create tasks.Count tasks.