I'm playing around with switching some code to using Parallel.ForEach
and I have a question about whether I need to be concerned about the effects of concurrency here. Here's a simplified version of what I'm doing:
public IList<Item> RetrieveAllItems(int paramValue)
{
var itemFuncs = new List<Func<int, IEnumerable<Item>>>
{
RetrieveItems1,//one method to retrieve items, taking an int param
RetrieveItems2//another one that takes an int param
};
List<Item> allItems = new List<Item>();
Parallel.ForEach(itemFuncs, (f) =>
{
//is this a problem?
allItems.AddRange(f(paramValue));
});
return allItems;
}
For this application, I do not care about the order that my Item
objects are added to allItems
(in fact, I do explicit ordering later on). My question is whether I could get into trouble here with both functions attempting to add Item
objects at the same time.
My motivation for playing around with this is simply to try to speed up this sequence of events, and when I tried using a BlockingCollection
instead, my timings, obviously slowed down to the point that there was hardly any difference between doing these two operations in parallel versus the old, serial way. When I use a simple List<Item>
then I do see impressive speed-ups.
So, considering I am just adding objects to a collection in a parallel fashion, is using a non-thread safe collection asking for trouble here? My real-world experience with multi-threaded/parallel programming is very sparse, hence the uncertainty here.
2 Answers 2
Yes, what you are doing is a problem. List
is not thread-safe and adding items from multiple threads can lead to data corruption (for example individual elements overwriting each other).
You said that you don't care about the order of items inserted quickest fix is to use ConcurrentBag
Another option is to the PLINQ extensions and use the Select
extension of the ParallelQuery
. The code would look like this:
return itemFuncs.AsParallel().SelectMany(f => f(paramValue)).ToList();
-
1\$\begingroup\$ Thanks for the tips and answer. Another way of asking about this would have been for me to ask if adding items to a
List
is an atomic operation. Apparently, it isn't so that is good to know. \$\endgroup\$Sven Grosen– Sven Grosen2013年12月27日 22:48:34 +00:00Commented Dec 27, 2013 at 22:48 -
\$\begingroup\$ One correction, I'd need to use
itemFuncs.AsParallel().SelectMany(f => f(paramValue)).ToList();
otherwise I have a nested collection. \$\endgroup\$Sven Grosen– Sven Grosen2013年12月27日 22:53:04 +00:00Commented Dec 27, 2013 at 22:53 -
\$\begingroup\$ @ledbutter: Ah yes, I missed that, corrected now. \$\endgroup\$ChrisWue– ChrisWue2013年12月28日 01:35:40 +00:00Commented Dec 28, 2013 at 1:35
If you only need to return an IEnumerable<Item>
you could simply enumerate to List
s and then Concat
the results.
Otherwise you could consider using a linked list, enumerating in parallel and then again Concat
enating for O(1).
About the original question: yes, that code is dangerous and prone to race conditions