Multi-threaded sort

Question 1

I wanted to write a multi-threaded sort, but unfortunately I don't know much about threads, especially in C++11. I managed to make something work, but I would be highly surprised if it was the best way to do it.

template<class T>
void ___sort(T *data, int len, int nbThreads){
 if(nbThreads<2){
 std::sort(data, (&(data[len])), std::less<T>());
 }
 else{
 std::future<void> b = std::async(std::launch::async,___sort<T>, data, len/2, nbThreads-2);
 std::future<void> a = std::async(std::launch::async,___sort<T>, &data[len/2], len/2, nbThreads-2);
 a.wait();
 b.wait();
 std::inplace_merge (data,&data[len/2],&data[len],std::less<T>());
 }
}

Some of the tests I did sorting integers:

size is the number of ints sorted, and the time is in seconds.

size : 2097152
time with 1 thread : 4.961
time with 2 thread : 3.191
time with 4 thread : 2.377
size : 4194304
time with 1 thread : 10.002
time with 2 thread : 6.214
time with 4 thread : 4.689
size : 8388608
time with 1 thread : 19.975
time with 2 thread : 12.332
time with 4 thread : 9.29
size : 16777216
time with 1 thread : 38.712
time with 2 thread : 25.257
time with 4 thread : 18.706

Also, I tried using std::qsort instead of std::sort, and the results were at least twice as long as that. Any reasons why?

Question 2

Don't use '__' in an identifer name.stackoverflow.com/a/228797/14065

Question 3

I think that one additional optimisation could be to avoid "false sharing" of cache line between threads. The way the buffer is divided this could happen with the current code..

Question 4

Something like this is probably more efficient:

template<class T>
void parallel_sort(T* data, int len, int grainsize)
{
 // Use grainsize instead of thread count so that we don't e.g.
 // spawn 4 threads just to sort 8 elements.
 if(len < grainsize)
 {
 std::sort(data, data + len, std::less<T>());
 }
 else
 {
 auto future = std::async(parallel_sort<T>, data, len/2, grainsize);
 // No need to spawn another thread just to block the calling 
 // thread which would do nothing.
 parallel_sort(data + len/2, len - len/2, grainsize);
 future.wait();
 std::inplace_merge(data, data + len/2, data + len, std::less<T>());
 }
}

You could set grain size to something like, std::max(256, len/nb_threads).

Question 5

Yes I don't want 4 threads to sort 8 items. But then again I don't want to spawn more threads than I have parallelism available (not more ((1.0 -> 1.5) * <available cores>)). So I would use a combination of these two techniques.

Question 6

@LokiAstari: See my suggestion below the code, set grainsize to e.g. std::max(256, len/nb_threads):

Question 7

parallel_sort(data + len/2, len/2, grainsize); should be: parallel_sort(data + len/2, len - len/2, grainsize);

ronag 5965 silver badges13 bronze badges · Accepted Answer · 2013-02-15 13:14:46Z

Something like this is probably more efficient:

template<class T>
void parallel_sort(T* data, int len, int grainsize)
{
 // Use grainsize instead of thread count so that we don't e.g.
 // spawn 4 threads just to sort 8 elements.
 if(len < grainsize)
 {
 std::sort(data, data + len, std::less<T>());
 }
 else
 {
 auto future = std::async(parallel_sort<T>, data, len/2, grainsize);
 // No need to spawn another thread just to block the calling 
 // thread which would do nothing.
 parallel_sort(data + len/2, len - len/2, grainsize);
 future.wait();
 std::inplace_merge(data, data + len/2, data + len, std::less<T>());
 }
}

You could set grain size to something like, std::max(256, len/nb_threads).

Yes I don't want 4 threads to sort 8 items. But then again I don't want to spawn more threads than I have parallelism available (not more ((1.0 -> 1.5) * <available cores>)). So I would use a combination of these two techniques.
@LokiAstari: See my suggestion below the code, set grainsize to e.g. std::max(256, len/nb_threads):
parallel_sort(data + len/2, len/2, grainsize); should be: parallel_sort(data + len/2, len - len/2, grainsize);

Stack Exchange Network

Multi-threaded sort

1 Answer 1

You must log in to answer this question.

Hot Network Questions

Multi-threaded sort

1 Answer 1

You must log in to answer this question.

Related

Hot Network Questions