Quicksort - follow-up

Question 1

This is a follow-up to

Like the previous code, this does not optimize for containers with large numbers of duplicate elements. How can I improve this code, particularly with iterators?

template<typename C>
void insertion_sort(C low, C high) {
 for (auto i = low; i < high; ++i) {
 for (auto j = i; j != low && *j < *(j-1); --j) {
 std::swap(*j, *(j-1));
 }
 }
}
template<typename C, typename I>
I partition(C& c, I low, I high) {
 auto p = *low;
 I i = low;
 I j = high;
 while (true) {
 while (i+1 != std::end(c) && *(++i) < p) {
 if (i == high) {
 break;
 }
 }
 while (p < *(--j)) { }
 if (i >= j) {
 break;
 }
 std::swap(*i, *j);
 }
 std::swap(*low, *j);
 return j;
}
template<typename C, typename I>
void sort(C& c, I low, I high) {
 if (high <= low+15) {
 insertion_sort(low, high);
 return;
 }
 I p = partition(c, low,high);
 sort(c, low, p);
 sort(c, p+1,high);
}
template<typename C>
void sort(C& c) {
 if (std::begin(c) == std::end(c)) {
 throw std::out_of_range("no size");
 }
 if (c.size() == 1) {
 return;
 }
 std::random_shuffle(std::begin(c), std::end(c));
 sort(c, std::begin(c), std::end(c));
}

Question 2

You could fix most pathological cases with a Dual Pivot Quicksort.

Question 3

Sorting an empty array is perfectly valid.

if (c.size() == 0) {
 throw std::out_of_range("no size");
}

The result id an empty array. An exception is not the correct way to go.

OK optimize. An array of size 0 or 1 is already sorted.

if (c.size() == 1) { // I would make this c.size() <= 1 (as I would remove the exception)
 return;
}

Special casing the std::end(c) just looks wrong.

 while (i+1 != std::end(c) && *(++i) < p) {

I would say that if i+1 == high would be a better test. Also adding the ++ and * operators into the expression makes it really hard to read. I would go with:

 while ((i < high) && (*i < p)) { ++i;}

The decrementing the j pointer should look exactly the same as the increment.

This is not tested (so may have bugs).
But this is how I would write partition.

// Note: this function require std::distance(low, high) >= 3
// Note: it is called from sort when std::distance(low, high) > 15
template<typename I>
std::pair<I,I> partition(I low, I high) {
 --high;
 I pivot = high;
 I highBot = low;
 I lowTop = high;
 --lowTop;
 while (highBot < lowTop) {
 while ((highBot < high) && (*highBot <= *pivot)) { ++highBot;}
 while ((lowTop >= low) && (*pivot < *lowTop)) { --lowTop;}
 if (highBot < lowTop) {
 std::swap(*highBot, *lowTop);
 }
 }
 // At this point highBot and lowTop have just passed each other.
 // Thus highBot points to the first element greater than *pivot
 // and lowTop points to the last element smaller or equal to *pivot
 swap(*highBot, *pivot);
 // Now: highBot points to the pivot value and the value that was
 // at highBot is in the top end of the high range.
 // We want to to remove pivot value from the range of elements to be sorted.
 // so we want to sort the two ranges likes this:
 //
 // [low..lowTop+1)
 // highBot The pivot point removed from further sorting.
 // Note: lopTop+1 == highBot
 // [highBot+1..high)
 //
 // Small optimization
 // If the value at lowTop is also equal to the pivot value
 // Then we can remove it from further sorting.
 for(; (lowTop >= low) && (*p == *lowTop); --lowTop) {}
 return std::make_pair(lowTop+1, highBot+1);
}

Usage:

std::pair<I,I> part = partition(low, high);
sort(c, low, part.first);
sort(c, part.second, high);

Question 4

@Loki already covered the most important points, but there are still a few things that you could change to improve the algorithm further (provided you're still interested in this question one year later):

A well-designed sorting algorithm should be able to take advantage of user-defined swap functions instead of simply using std::swap everywhere; these user-defined overloads might be optimized. We can use argument-dependent lookup to do the job:
```
using std::swap;
swap(*it1, *it2);
```
The first line tells the compiler to take std::swap into account for unqualified calls to swap. The second line makes an unqualified call to swap: the compiler will check the types of *it1 and *it2 and search for a swap function in their associated namespace. If the lookup finds such a function, it will use it, otherwise it will use std::swap instead. Note that you can also use std::iter_swap which does exactly that:
```
std::iter_swap(it1, it2);
```
Algorithms that care about comparison generally allow to pass an additional compare parameter to use instead of operator<. It allows to sort types that do not define operator< and to sort things differently (for example, it allows to use std::greater<> to sort a collection in reverse order. Here is a comparison-enhanced insertion sort:
```
template<typename C, typename Compare=std::less<>>
void insertion_sort(C low, C high, Compare compare={}) {
 for (auto i = low; i < high; ++i) {
 for (auto j = i; j != low && compare(*j, *(j-1)); --j) {
 std::iter_swap(j, j - 1);
 }
 }
}
```
std::sort is required to work with move-only types, and quicksort generally requires to copy the pivot before the partitioning operation to avoid tricky problems. To implement a quicksort without compying the pivot, the trick either to choose std::prev(high) as the pivot or to swap the pivot with std::prev(high) before the partitioning step.
It's a mere detail, but the condition if (high <= low+15) is not the easiest to reason about. I would use if (std::distance(low, high) <= 15), which tells more about the reasoning in my opinion.

Loki Astari Loki Astari 97.7k5 gold badges126 silver badges341 bronze badges · Accepted Answer · 2015-01-03 16:24:27Z

Sorting an empty array is perfectly valid.

if (c.size() == 0) {
 throw std::out_of_range("no size");
}

The result id an empty array. An exception is not the correct way to go.

OK optimize. An array of size 0 or 1 is already sorted.

if (c.size() == 1) { // I would make this c.size() <= 1 (as I would remove the exception)
 return;
}

Special casing the std::end(c) just looks wrong.

 while (i+1 != std::end(c) && *(++i) < p) {

I would say that if i+1 == high would be a better test. Also adding the ++ and * operators into the expression makes it really hard to read. I would go with:

 while ((i < high) && (*i < p)) { ++i;}

The decrementing the j pointer should look exactly the same as the increment.

This is not tested (so may have bugs).
But this is how I would write partition.

// Note: this function require std::distance(low, high) >= 3
// Note: it is called from sort when std::distance(low, high) > 15
template<typename I>
std::pair<I,I> partition(I low, I high) {
 --high;
 I pivot = high;
 I highBot = low;
 I lowTop = high;
 --lowTop;
 while (highBot < lowTop) {
 while ((highBot < high) && (*highBot <= *pivot)) { ++highBot;}
 while ((lowTop >= low) && (*pivot < *lowTop)) { --lowTop;}
 if (highBot < lowTop) {
 std::swap(*highBot, *lowTop);
 }
 }
 // At this point highBot and lowTop have just passed each other.
 // Thus highBot points to the first element greater than *pivot
 // and lowTop points to the last element smaller or equal to *pivot
 swap(*highBot, *pivot);
 // Now: highBot points to the pivot value and the value that was
 // at highBot is in the top end of the high range.
 // We want to to remove pivot value from the range of elements to be sorted.
 // so we want to sort the two ranges likes this:
 //
 // [low..lowTop+1)
 // highBot The pivot point removed from further sorting.
 // Note: lopTop+1 == highBot
 // [highBot+1..high)
 //
 // Small optimization
 // If the value at lowTop is also equal to the pivot value
 // Then we can remove it from further sorting.
 for(; (lowTop >= low) && (*p == *lowTop); --lowTop) {}
 return std::make_pair(lowTop+1, highBot+1);
}

Usage:

std::pair<I,I> part = partition(low, high);
sort(c, low, part.first);
sort(c, part.second, high);

Stack Exchange Network

Quicksort - follow-up

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Hot Network Questions

Quicksort - follow-up

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Related

Hot Network Questions