My version of C++ quicksort

Question 1

I built my code according to the algorithm provided in this YouTube video. Please review it and help me find any problems. It definitely takes a lot longer than the qsort() function in C++. But the algorithm I use should be just as fast. Can anyone find problems?

#include <iostream>
#include <time.h>
using namespace std;
void randomize( int* array, int size );
void qsort( int* array, int leftbound , int rightbound );
void swap( int &a, int &b );
void output( int* array, int size );
int main()
{ 
 //declare array
 int array[100];
 //get size of the array
 int size = sizeof(array)/sizeof(int);
 //randomize value
 randomize(array, size);
 //start the quicksort algorithm
 qsort(array, 0, size-1 );
 //output the array
 output(array, size);
 return 0;
}
void randomize( int* array, int size )
{
 srand((int)(time(0)));
 for ( int i = 0 ; i < size ; i++ ) {
 array[i] = rand();
 }
}
void output( int* array, int size )
{
 for ( int i = 0 ; i < size ; i++ ) {
 cout << array[i] << endl;
 }
}
void qsort( int* array, int leftbound , int rightbound )
{
 if (rightbound > leftbound) {
 int pivot = array[leftbound + (rand() % (rightbound - leftbound + 1))];
 int leftposition = leftbound;
 int rightposition = rightbound;
 while ( leftposition != rightposition ) {
 while ( array[leftposition] < pivot ) {
 leftposition++;
 }
 while ( array[rightposition] > pivot ) {
 rightposition--;
 }
 swap ( array[leftposition], array[rightposition] );
 }
 qsort( array, 0, leftposition - 1 );
 qsort( array, leftposition + 1, rightbound );
 }
}
void swap(int &a, int &b)
{
 int tmp = a;
 a = b;
 b = tmp;
}

Question 2

Please stop using:

using namespace std;

It may seem like a time saver. But for anything that isn't a toy program, it causes problems with collisions. So, it is a bad habit to get into. Stop before it becomes a habit.

You have alternatives.
You can selectively bring stuff into the current scope:

using std::cout;
using std::vector;

Alternatively (my preference) you prefix stuff in the the standard library with std::. The name std was chosen so it was short enough that people would not be hampered with typing it.

Prefer to use the C++ version of the headers:

#include <time.h>
// Prefer
#include <ctime>

Note: This also puts the functions in the standard namespace.

This is correct:

int size = sizeof(array)/sizeof(int);

But it makes the code brittle. If you change the type of the array. You also need to remember to change type here in the size expression. It is best to let the compiler deduce this so use:

int size = sizeof(array)/sizeof(array[0]);

Usage:

qsort(array, 0, size-1 );

Here you are making right bound point at the rightmost item. In C++ it is so common that the end points one past the end of the data it may make the code slightly harder to grok. I would stick with the standard convention of pointing one past the end. This means there will be some adjustment in your qsort but not that much and it makes it easier for C++ users to read.

Prefer pre-increment:

++leftposition; // rather than leftposition++;

Though it makes no difference in this situation. There are situations where it can. So it one of those habits you should get into. Then when the types of objects are changed you do not need to search your code to make sure you are using the efficient version you know you used the efficient version automatically.

Your test for a sortable array is not aggressive enough:

if (rightbound > leftbound) {

You only don't sort if the array is zero length. Which may potentially lead to long recursion as you randomly choose a pivot point that always dives stuff to one side (luckily your non standard code below saved you by always leaving the pivot off subsequent sorts).

The real answer is you only need to sort arrays that have a size>= 2. (size zero and one are already sorted).

if ((rightbound - leftbound /* +1 depending on if you rightbound is one past or not*/ ) >= 2)

Small difference from a standard implementation here:
Which can potentially give you an infinite loop.

 while ( array[leftposition] < pivot ) {
 leftposition++;
 }
 while ( array[rightposition] > pivot ) {
 rightposition--;
 }

What happens if the value is equal to the pivot. In that case it will go to a random side. Also if both left and right side hit a value that is equal to pivot then you will swap them and restart the loop which will do nothing swap them and restart the loop (etc)

 while ( array[leftposition] <= pivot ) {
 leftposition++;
 }
 while ( array[rightposition] > pivot ) {
 rightposition--;
 }

Your left bound of the first recursive call is wrong.

 qsort( array, 0, leftposition - 1 );
 qsort( array, leftposition + 1, rightbound );

Should be leftbound.

 qsort( array, leftbound, leftposition - 1 );
 qsort( array, leftposition + 1, rightbound );

Since there can potentially be more than one value equal to pivot. Maybe you can remove the all from the side you put them in.

You need to look at the standard libs.

We have a std::swap()

void swap(int &a, int &b)
// use 
std::swap(val1, val2)

We have a way to print stuff:

void output( int* array, int size )
// use
std::copy(std::begin(array), std::end(array), std::ostream_iterator<int>(std::cout, "\n"));

We have a way to generate values:

void randomize( int* array, int size )
// use
generate(std::begin(array), std::end(array), ::rand);

This is what I got:

#include <iostream>
#include <iterator>
#include <algorithm>
#include <ctime>
void qsort( int* array, int leftbound , int rightbound );
int main()
{ 
 //declare array
 int array[100];
 //get size of the array
 int size = sizeof(array)/sizeof(array[0]);
 //randomize value
 std::srand(std::time(NULL));
 std::generate(std::begin(array), std::end(array), std::rand);
 //start the quicksort algorithm
 qsort(array, 0, size);
 //output the array
 std::copy(std::begin(array), std::end(array), std::ostream_iterator<int>(std::cout, "\n"));
}
void qsort( int* array, int leftbound , int rightbound )
{
 if ((rightbound - leftbound) >= 2)
 { 
 int pivot = array[leftbound + (rand() % (rightbound - leftbound))];
 int leftposition = leftbound;
 int rightposition = rightbound - 1;
 while ( leftposition < rightposition )
 { 
 while ((array[leftposition] <= pivot) && (leftposition < rightposition))
 { ++leftposition;
 } 
 while ((array[rightposition] > pivot ) && (leftposition < rightposition))
 { --rightposition;
 } 
 std::swap(array[leftposition], array[rightposition]);
 }
 // At least the pivot point will be on the left hand side.
 // This will also be the largest value. So move the leftposition back
 // to remove all the pivot points.
 while(((leftposition-1) > leftbound) && (array[leftposition-1] == pivot))
 { --leftposition;
 }
 qsort(array, leftbound, leftposition-1); // leftposition is one past the end of the left
 qsort(array, rightposition+1, rightbound); // Thus at the start of the right.
 } 
}

Question 3

+1 but omitting elements equal to the pivot is not only common but actually required for strong runtime guarantees, if I recall correctly from the Bentley & McIlroy paper.

Question 4

@KonradRudolph: Yep you are correct. With a badly chosen pivot each time it could potentially recurce forever if you don't take the pivot values. Have changed the above to correct.

Question 5

Why prefer <ctime> over <time.h>? The first puts the symbols into namespace std and maybe also in global scope, the latter the other way around. I'm missing any advantage at all.

Question 6

Also, we have neither std::begin nor std::end for pointers.

Question 7

@Deduplicator: Putting things in a namespace is an advantage unto itself. So making sure they are in std is useful. Also consistency across C++ programs. Why not use the C++ version of the library.

Question 8

similar to the last one below is a list of differences

Accounted for the condition where both left and right point to the pivot in this case you have the code keep swapping endlessly, this would happen in the case of an array 100 100 5 4 3 100 where say value 100 is the pivot at slot 1, then slot zero and slot 5 would never move since you are checking just operator < so you need to move one of the pointers
Also removed the additional 2 swaps to move the pivot out and back into the list as this is not needed.
Additionally, I am not using c++11 so I did this using c++98 removing the c++11 references in the other version
Kept the same syntax of allowing the end of the function to be one more than the end of the data to be consistent with other stl like calls.
Finally just added some logging so that someone can follow what is happening when running the code.

Hope you like this version, enjoy splitting lists, moving pointers, and recursing.

#include <iostream>
#include <iterator>
#include <algorithm>
#include <ctime>
void quicksort ( int array[], int start, int end);
void printArray ( int array[], size_t N, const int p = -1);
void randomize (int array[], size_t N);
int main ( int argc, char *argv[])
{
 const int SIZE = 6;
 int x[SIZE];
 std::cout << "Starting Quick Sort" <<std::endl;
 randomize(x, SIZE);
 std::cout << "Array before quick sort" << std::endl;
 printArray(x,SIZE);
 quicksort(x,0,SIZE);
 std::cout << "Array after quick sort" << std::endl;
 printArray(x,SIZE);
}
void quicksort (int array[], int start, int end )
{
 static unsigned int calls = 0;
 std::cout << "QuickSort Call #: " << ++calls << std::endl;
 //function allows one past the end to be consistent with most function calls
 // but we normalize to left and rightbounds that point to the data
 int leftbound = start;
 int rightbound = end - 1;
 if (rightbound <= leftbound )
 return;
 int pivotIndex = leftbound + (rand() % (end - leftbound));
 int pivot = array[pivotIndex];
 std::cout << " Pivot: " << "[" << pivotIndex << "] " << pivot << std::endl;
 printArray (array,end,pivot);
 int leftposition = leftbound;
 int rightposition = rightbound; // accounting for pivot that was moved out
 while ( leftposition < rightposition )
 {
 while ( leftposition < rightposition && array[leftposition] < pivot )
 ++leftposition;
 while ( rightposition > leftposition && array[rightposition] > pivot )
 --rightposition;
 if(leftposition < rightposition)
 {
 if (array[leftposition] != array[rightposition])
 {
 std::swap(array[leftposition],array[rightposition]);
 std::cout << " Swapping RightPosition: " << rightposition << " and LeftPosition: " << leftposition << std::endl;
 printArray (array,end,pivot);
 }
 else
 ++leftposition;
 }
 }
 std::cout << "Array at the end of QuickSort Call #: " << calls << std::endl;
 printArray (array,end,pivot);
 // sort leaving the pivot out
 quicksort (array,leftbound, leftposition); // leftposition is at the pivot which is one past the data
 quicksort (array,leftposition + 1,end); // leftposition + 1 is past the pivot till the end
}
void printArray ( int array[], size_t N, const int p)
{
//output the array
 for ( unsigned int i = 0; i < N; ++i)
 {
 if (array[i] == p)
 {
 std::cout << " [" << i << "] *" << array[i] << "*";
 }
 else
 {
 std::cout << " [" << i << "] " << array[i];
 }
 }
 std::cout << std::endl;
}
void randomize (int array[], size_t N)
{
 static const unsigned int RANDSIZE = 100;
 srand(time(0));
 for ( unsigned int i = 0; i < N; ++i)
 {
 array[i]=rand() % RANDSIZE + 1;
 }
}

Question 9

The first major problem I see is that you're not considering rightposition when incrementing leftposition and vice versa. This can cause L and R to cross, which will cause additional unnecessary swaps:

pivot value 7
4,7,3,9,6,2,8,1
 L R 
4,7,3,9,6,2,8,1
 L R //<--this is the last swap that should happen, but L < R
4,1,3,9,6,2,8,7
 L //...so L moves to the next value greater than the pivot
4,1,3,2,6,9,8,7
 R L //...and R moves to the next value less than the pivot
4,1,3,2,6,9,8,7
 R L //...which are swapped incorrectly
4,1,3,2,9,6,8,7

I'm going to assume that the != in the main pivoting loop is a typo, because combined with the lack of checking that the positions don't cross, the positions would be incremented beyond the ends of the array and the whole thing would error out.

leftposition and rightposition should NEVER cross. In addition, even if they don't and you miraculously end up with both leftpos and rightpos on the same element (which would have to be the pivot), you swap that element with itself unnecessarily; don't do that.

Finally, in your code the "left half" call always starts at index zero; that means that the left-half call that branches from all right-half calls will scan through the entire left-hand side of the array, not just of the half it should have been given. Because the left half is sorted first, there should be no swaps, but it still has to scan through all those elements, which makes the right-hand half of the algorithm approach O(N²) complexity.

Try this:

void qsort( int* array, int leftbound, int rightbound )
{
 if (rightbound <= leftbound) return; //reduce nesting
 int pivotIndex = leftbound + (rand() % (rightbound - leftbound + 1);
 int pivot = array[pivotIndex];
 //move the pivot value out of the way; we will discover where it goes
 swap(array[pivotIndex], array[rightbound]);
 int leftposition = leftbound;
 int rightposition = rightbound-1;
 //you could use the != here if you really wanted, WITH the changes inside the loop
 while ( leftposition < rightposition ) 
 {
 //don't move leftpos past rightpos
 while ( array[leftposition] < pivot && leftposition < rightposition) 
 leftposition++;
 //ditto
 while ( array[rightposition] > pivot && rightposition > leftposition )
 rightposition--;
 //don't swap if the two positions have merged
 if(leftposition < rightposition)
 swap ( array[leftposition], array[rightposition] );
 }
 //now, swap the pivot element back in; leftposition is its proper location.
 swap(array[rightbound], array[leftposition]);
 //No matter what, leftposition (and rightposition) will be at an element 
 //that should be to the right of the pivot, so the swap works. 
 //Trace it out yourself if you don't believe me.
 //here's your main inefficiency; you were always starting from index 0 
 //on the left side, meaning that the call for the "left half" of all 
 //"right half" calls scans most of the array redundantly.
 qsort( array, leftbound, leftposition - 1 );
 qsort( array, leftposition + 1, rightbound); 
}

I'm sure there are some additional micro-optimizations that could be made here:

since we're making the leftpos vs rightpos check in all the right places, we could use != instead of < everywhere which would probably be faster.
The prefix increment operation is performed assuming that the existing value isn't needed for any further operation, and so it just happens, making it faster than the postfix increment.

However, the algorithm above will perform correctly which is the primary goal.

score 10 · Accepted Answer · 2012-07-13 18:20:40Z

Please stop using:

using namespace std;

It may seem like a time saver. But for anything that isn't a toy program, it causes problems with collisions. So, it is a bad habit to get into. Stop before it becomes a habit.

You have alternatives.
You can selectively bring stuff into the current scope:

using std::cout;
using std::vector;

Alternatively (my preference) you prefix stuff in the the standard library with std::. The name std was chosen so it was short enough that people would not be hampered with typing it.

Prefer to use the C++ version of the headers:

#include <time.h>
// Prefer
#include <ctime>

Note: This also puts the functions in the standard namespace.

This is correct:

int size = sizeof(array)/sizeof(int);

But it makes the code brittle. If you change the type of the array. You also need to remember to change type here in the size expression. It is best to let the compiler deduce this so use:

int size = sizeof(array)/sizeof(array[0]);

Usage:

qsort(array, 0, size-1 );

Here you are making right bound point at the rightmost item. In C++ it is so common that the end points one past the end of the data it may make the code slightly harder to grok. I would stick with the standard convention of pointing one past the end. This means there will be some adjustment in your qsort but not that much and it makes it easier for C++ users to read.

Prefer pre-increment:

++leftposition; // rather than leftposition++;

Though it makes no difference in this situation. There are situations where it can. So it one of those habits you should get into. Then when the types of objects are changed you do not need to search your code to make sure you are using the efficient version you know you used the efficient version automatically.

Your test for a sortable array is not aggressive enough:

if (rightbound > leftbound) {

You only don't sort if the array is zero length. Which may potentially lead to long recursion as you randomly choose a pivot point that always dives stuff to one side (luckily your non standard code below saved you by always leaving the pivot off subsequent sorts).

The real answer is you only need to sort arrays that have a size>= 2. (size zero and one are already sorted).

if ((rightbound - leftbound /* +1 depending on if you rightbound is one past or not*/ ) >= 2)

Small difference from a standard implementation here:
Which can potentially give you an infinite loop.

 while ( array[leftposition] < pivot ) {
 leftposition++;
 }
 while ( array[rightposition] > pivot ) {
 rightposition--;
 }

What happens if the value is equal to the pivot. In that case it will go to a random side. Also if both left and right side hit a value that is equal to pivot then you will swap them and restart the loop which will do nothing swap them and restart the loop (etc)

 while ( array[leftposition] <= pivot ) {
 leftposition++;
 }
 while ( array[rightposition] > pivot ) {
 rightposition--;
 }

Your left bound of the first recursive call is wrong.

 qsort( array, 0, leftposition - 1 );
 qsort( array, leftposition + 1, rightbound );

Should be leftbound.

 qsort( array, leftbound, leftposition - 1 );
 qsort( array, leftposition + 1, rightbound );

Since there can potentially be more than one value equal to pivot. Maybe you can remove the all from the side you put them in.

You need to look at the standard libs.

We have a std::swap()

void swap(int &a, int &b)
// use 
std::swap(val1, val2)

We have a way to print stuff:

void output( int* array, int size )
// use
std::copy(std::begin(array), std::end(array), std::ostream_iterator<int>(std::cout, "\n"));

We have a way to generate values:

void randomize( int* array, int size )
// use
generate(std::begin(array), std::end(array), ::rand);

This is what I got:

#include <iostream>
#include <iterator>
#include <algorithm>
#include <ctime>
void qsort( int* array, int leftbound , int rightbound );
int main()
{ 
 //declare array
 int array[100];
 //get size of the array
 int size = sizeof(array)/sizeof(array[0]);
 //randomize value
 std::srand(std::time(NULL));
 std::generate(std::begin(array), std::end(array), std::rand);
 //start the quicksort algorithm
 qsort(array, 0, size);
 //output the array
 std::copy(std::begin(array), std::end(array), std::ostream_iterator<int>(std::cout, "\n"));
}
void qsort( int* array, int leftbound , int rightbound )
{
 if ((rightbound - leftbound) >= 2)
 { 
 int pivot = array[leftbound + (rand() % (rightbound - leftbound))];
 int leftposition = leftbound;
 int rightposition = rightbound - 1;
 while ( leftposition < rightposition )
 { 
 while ((array[leftposition] <= pivot) && (leftposition < rightposition))
 { ++leftposition;
 } 
 while ((array[rightposition] > pivot ) && (leftposition < rightposition))
 { --rightposition;
 } 
 std::swap(array[leftposition], array[rightposition]);
 }
 // At least the pivot point will be on the left hand side.
 // This will also be the largest value. So move the leftposition back
 // to remove all the pivot points.
 while(((leftposition-1) > leftbound) && (array[leftposition-1] == pivot))
 { --leftposition;
 }
 qsort(array, leftbound, leftposition-1); // leftposition is one past the end of the left
 qsort(array, rightposition+1, rightbound); // Thus at the start of the right.
 } 
}

+1 but omitting elements equal to the pivot is not only common but actually required for strong runtime guarantees, if I recall correctly from the Bentley & McIlroy paper.
@KonradRudolph: Yep you are correct. With a badly chosen pivot each time it could potentially recurce forever if you don't take the pivot values. Have changed the above to correct.
Why prefer <ctime> over <time.h>? The first puts the symbols into namespace std and maybe also in global scope, the latter the other way around. I'm missing any advantage at all.
Also, we have neither std::begin nor std::end for pointers.
@Deduplicator: Putting things in a namespace is an advantage unto itself. So making sure they are in std is useful. Also consistency across C++ programs. Why not use the C++ version of the library.

Stack Exchange Network

My version of C++ quicksort

3 Answers 3

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

My version of C++ quicksort

3 Answers 3

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions