I built my code according to the algorithm provided in this YouTube video. Please review it and help me find any problems. It definitely takes a lot longer than the qsort()
function in C++. But the algorithm I use should be just as fast. Can anyone find problems?
#include <iostream>
#include <time.h>
using namespace std;
void randomize( int* array, int size );
void qsort( int* array, int leftbound , int rightbound );
void swap( int &a, int &b );
void output( int* array, int size );
int main()
{
//declare array
int array[100];
//get size of the array
int size = sizeof(array)/sizeof(int);
//randomize value
randomize(array, size);
//start the quicksort algorithm
qsort(array, 0, size-1 );
//output the array
output(array, size);
return 0;
}
void randomize( int* array, int size )
{
srand((int)(time(0)));
for ( int i = 0 ; i < size ; i++ ) {
array[i] = rand();
}
}
void output( int* array, int size )
{
for ( int i = 0 ; i < size ; i++ ) {
cout << array[i] << endl;
}
}
void qsort( int* array, int leftbound , int rightbound )
{
if (rightbound > leftbound) {
int pivot = array[leftbound + (rand() % (rightbound - leftbound + 1))];
int leftposition = leftbound;
int rightposition = rightbound;
while ( leftposition != rightposition ) {
while ( array[leftposition] < pivot ) {
leftposition++;
}
while ( array[rightposition] > pivot ) {
rightposition--;
}
swap ( array[leftposition], array[rightposition] );
}
qsort( array, 0, leftposition - 1 );
qsort( array, leftposition + 1, rightbound );
}
}
void swap(int &a, int &b)
{
int tmp = a;
a = b;
b = tmp;
}
3 Answers 3
Please stop using:
using namespace std;
It may seem like a time saver. But for anything that isn't a toy program, it causes problems with collisions. So, it is a bad habit to get into. Stop before it becomes a habit.
You have alternatives.
You can selectively bring stuff into the current scope:
using std::cout;
using std::vector;
Alternatively (my preference) you prefix stuff in the the standard library with std::
. The name std was chosen so it was short enough that people would not be hampered with typing it.
Prefer to use the C++ version of the headers:
#include <time.h>
// Prefer
#include <ctime>
Note: This also puts the functions in the standard namespace.
This is correct:
int size = sizeof(array)/sizeof(int);
But it makes the code brittle. If you change the type of the array. You also need to remember to change type here in the size expression. It is best to let the compiler deduce this so use:
int size = sizeof(array)/sizeof(array[0]);
Usage:
qsort(array, 0, size-1 );
Here you are making right bound point at the rightmost item. In C++ it is so common that the end points one past the end of the data it may make the code slightly harder to grok. I would stick with the standard convention of pointing one past the end. This means there will be some adjustment in your qsort but not that much and it makes it easier for C++ users to read.
Prefer pre-increment:
++leftposition; // rather than leftposition++;
Though it makes no difference in this situation. There are situations where it can. So it one of those habits you should get into. Then when the types of objects are changed you do not need to search your code to make sure you are using the efficient version you know you used the efficient version automatically.
Your test for a sortable array is not aggressive enough:
if (rightbound > leftbound) {
You only don't sort if the array is zero length. Which may potentially lead to long recursion as you randomly choose a pivot point that always dives stuff to one side (luckily your non standard code below saved you by always leaving the pivot off subsequent sorts).
The real answer is you only need to sort arrays that have a size>= 2. (size zero and one are already sorted).
if ((rightbound - leftbound /* +1 depending on if you rightbound is one past or not*/ ) >= 2)
Small difference from a standard implementation here:
Which can potentially give you an infinite loop.
while ( array[leftposition] < pivot ) {
leftposition++;
}
while ( array[rightposition] > pivot ) {
rightposition--;
}
What happens if the value is equal to the pivot. In that case it will go to a random side. Also if both left and right side hit a value that is equal to pivot then you will swap them and restart the loop which will do nothing swap them and restart the loop (etc)
while ( array[leftposition] <= pivot ) {
leftposition++;
}
while ( array[rightposition] > pivot ) {
rightposition--;
}
Your left bound of the first recursive call is wrong.
qsort( array, 0, leftposition - 1 );
qsort( array, leftposition + 1, rightbound );
Should be leftbound.
qsort( array, leftbound, leftposition - 1 );
qsort( array, leftposition + 1, rightbound );
Since there can potentially be more than one value equal to pivot. Maybe you can remove the all from the side you put them in.
You need to look at the standard libs.
We have a std::swap()
void swap(int &a, int &b)
// use
std::swap(val1, val2)
We have a way to print stuff:
void output( int* array, int size )
// use
std::copy(std::begin(array), std::end(array), std::ostream_iterator<int>(std::cout, "\n"));
We have a way to generate values:
void randomize( int* array, int size )
// use
generate(std::begin(array), std::end(array), ::rand);
This is what I got:
#include <iostream>
#include <iterator>
#include <algorithm>
#include <ctime>
void qsort( int* array, int leftbound , int rightbound );
int main()
{
//declare array
int array[100];
//get size of the array
int size = sizeof(array)/sizeof(array[0]);
//randomize value
std::srand(std::time(NULL));
std::generate(std::begin(array), std::end(array), std::rand);
//start the quicksort algorithm
qsort(array, 0, size);
//output the array
std::copy(std::begin(array), std::end(array), std::ostream_iterator<int>(std::cout, "\n"));
}
void qsort( int* array, int leftbound , int rightbound )
{
if ((rightbound - leftbound) >= 2)
{
int pivot = array[leftbound + (rand() % (rightbound - leftbound))];
int leftposition = leftbound;
int rightposition = rightbound - 1;
while ( leftposition < rightposition )
{
while ((array[leftposition] <= pivot) && (leftposition < rightposition))
{ ++leftposition;
}
while ((array[rightposition] > pivot ) && (leftposition < rightposition))
{ --rightposition;
}
std::swap(array[leftposition], array[rightposition]);
}
// At least the pivot point will be on the left hand side.
// This will also be the largest value. So move the leftposition back
// to remove all the pivot points.
while(((leftposition-1) > leftbound) && (array[leftposition-1] == pivot))
{ --leftposition;
}
qsort(array, leftbound, leftposition-1); // leftposition is one past the end of the left
qsort(array, rightposition+1, rightbound); // Thus at the start of the right.
}
}
-
\$\begingroup\$ +1 but omitting elements equal to the pivot is not only common but actually required for strong runtime guarantees, if I recall correctly from the Bentley & McIlroy paper. \$\endgroup\$Konrad Rudolph– Konrad Rudolph2012年07月13日 21:23:25 +00:00Commented Jul 13, 2012 at 21:23
-
\$\begingroup\$ @KonradRudolph: Yep you are correct. With a badly chosen pivot each time it could potentially recurce forever if you don't take the pivot values. Have changed the above to correct. \$\endgroup\$Loki Astari– Loki Astari2012年07月14日 00:30:38 +00:00Commented Jul 14, 2012 at 0:30
-
\$\begingroup\$ Why prefer
<ctime>
over<time.h>
? The first puts the symbols into namespace std and maybe also in global scope, the latter the other way around. I'm missing any advantage at all. \$\endgroup\$Deduplicator– Deduplicator2015年10月06日 10:06:14 +00:00Commented Oct 6, 2015 at 10:06 -
\$\begingroup\$ Also, we have neither
std::begin
norstd::end
for pointers. \$\endgroup\$Deduplicator– Deduplicator2015年10月06日 10:16:18 +00:00Commented Oct 6, 2015 at 10:16 -
\$\begingroup\$ @Deduplicator: Putting things in a namespace is an advantage unto itself. So making sure they are in
std
is useful. Also consistency across C++ programs. Why not use the C++ version of the library. \$\endgroup\$Loki Astari– Loki Astari2015年10月06日 15:08:41 +00:00Commented Oct 6, 2015 at 15:08
similar to the last one below is a list of differences
Accounted for the condition where both left and right point to the pivot in this case you have the code keep swapping endlessly, this would happen in the case of an array 100 100 5 4 3 100 where say value 100 is the pivot at slot 1, then slot zero and slot 5 would never move since you are checking just operator < so you need to move one of the pointers
Also removed the additional 2 swaps to move the pivot out and back into the list as this is not needed.
Additionally, I am not using c++11 so I did this using c++98 removing the c++11 references in the other version
Kept the same syntax of allowing the end of the function to be one more than the end of the data to be consistent with other stl like calls.
Finally just added some logging so that someone can follow what is happening when running the code.
Hope you like this version, enjoy splitting lists, moving pointers, and recursing.
#include <iostream>
#include <iterator>
#include <algorithm>
#include <ctime>
void quicksort ( int array[], int start, int end);
void printArray ( int array[], size_t N, const int p = -1);
void randomize (int array[], size_t N);
int main ( int argc, char *argv[])
{
const int SIZE = 6;
int x[SIZE];
std::cout << "Starting Quick Sort" <<std::endl;
randomize(x, SIZE);
std::cout << "Array before quick sort" << std::endl;
printArray(x,SIZE);
quicksort(x,0,SIZE);
std::cout << "Array after quick sort" << std::endl;
printArray(x,SIZE);
}
void quicksort (int array[], int start, int end )
{
static unsigned int calls = 0;
std::cout << "QuickSort Call #: " << ++calls << std::endl;
//function allows one past the end to be consistent with most function calls
// but we normalize to left and rightbounds that point to the data
int leftbound = start;
int rightbound = end - 1;
if (rightbound <= leftbound )
return;
int pivotIndex = leftbound + (rand() % (end - leftbound));
int pivot = array[pivotIndex];
std::cout << " Pivot: " << "[" << pivotIndex << "] " << pivot << std::endl;
printArray (array,end,pivot);
int leftposition = leftbound;
int rightposition = rightbound; // accounting for pivot that was moved out
while ( leftposition < rightposition )
{
while ( leftposition < rightposition && array[leftposition] < pivot )
++leftposition;
while ( rightposition > leftposition && array[rightposition] > pivot )
--rightposition;
if(leftposition < rightposition)
{
if (array[leftposition] != array[rightposition])
{
std::swap(array[leftposition],array[rightposition]);
std::cout << " Swapping RightPosition: " << rightposition << " and LeftPosition: " << leftposition << std::endl;
printArray (array,end,pivot);
}
else
++leftposition;
}
}
std::cout << "Array at the end of QuickSort Call #: " << calls << std::endl;
printArray (array,end,pivot);
// sort leaving the pivot out
quicksort (array,leftbound, leftposition); // leftposition is at the pivot which is one past the data
quicksort (array,leftposition + 1,end); // leftposition + 1 is past the pivot till the end
}
void printArray ( int array[], size_t N, const int p)
{
//output the array
for ( unsigned int i = 0; i < N; ++i)
{
if (array[i] == p)
{
std::cout << " [" << i << "] *" << array[i] << "*";
}
else
{
std::cout << " [" << i << "] " << array[i];
}
}
std::cout << std::endl;
}
void randomize (int array[], size_t N)
{
static const unsigned int RANDSIZE = 100;
srand(time(0));
for ( unsigned int i = 0; i < N; ++i)
{
array[i]=rand() % RANDSIZE + 1;
}
}
The first major problem I see is that you're not considering rightposition
when incrementing leftposition
and vice versa. This can cause L and R to cross, which will cause additional unnecessary swaps:
pivot value 7
4,7,3,9,6,2,8,1
L R
4,7,3,9,6,2,8,1
L R //<--this is the last swap that should happen, but L < R
4,1,3,9,6,2,8,7
L //...so L moves to the next value greater than the pivot
4,1,3,2,6,9,8,7
R L //...and R moves to the next value less than the pivot
4,1,3,2,6,9,8,7
R L //...which are swapped incorrectly
4,1,3,2,9,6,8,7
I'm going to assume that the != in the main pivoting loop is a typo, because combined with the lack of checking that the positions don't cross, the positions would be incremented beyond the ends of the array and the whole thing would error out.
leftposition
and rightposition
should NEVER cross. In addition, even if they don't and you miraculously end up with both leftpos and rightpos on the same element (which would have to be the pivot), you swap that element with itself unnecessarily; don't do that.
Finally, in your code the "left half" call always starts at index zero; that means that the left-half call that branches from all right-half calls will scan through the entire left-hand side of the array, not just of the half it should have been given. Because the left half is sorted first, there should be no swaps, but it still has to scan through all those elements, which makes the right-hand half of the algorithm approach O(N2) complexity.
Try this:
void qsort( int* array, int leftbound, int rightbound )
{
if (rightbound <= leftbound) return; //reduce nesting
int pivotIndex = leftbound + (rand() % (rightbound - leftbound + 1);
int pivot = array[pivotIndex];
//move the pivot value out of the way; we will discover where it goes
swap(array[pivotIndex], array[rightbound]);
int leftposition = leftbound;
int rightposition = rightbound-1;
//you could use the != here if you really wanted, WITH the changes inside the loop
while ( leftposition < rightposition )
{
//don't move leftpos past rightpos
while ( array[leftposition] < pivot && leftposition < rightposition)
leftposition++;
//ditto
while ( array[rightposition] > pivot && rightposition > leftposition )
rightposition--;
//don't swap if the two positions have merged
if(leftposition < rightposition)
swap ( array[leftposition], array[rightposition] );
}
//now, swap the pivot element back in; leftposition is its proper location.
swap(array[rightbound], array[leftposition]);
//No matter what, leftposition (and rightposition) will be at an element
//that should be to the right of the pivot, so the swap works.
//Trace it out yourself if you don't believe me.
//here's your main inefficiency; you were always starting from index 0
//on the left side, meaning that the call for the "left half" of all
//"right half" calls scans most of the array redundantly.
qsort( array, leftbound, leftposition - 1 );
qsort( array, leftposition + 1, rightbound);
}
I'm sure there are some additional micro-optimizations that could be made here:
- since we're making the leftpos vs rightpos check in all the right places, we could use != instead of < everywhere which would probably be faster.
- The prefix increment operation is performed assuming that the existing value isn't needed for any further operation, and so it just happens, making it faster than the postfix increment.
However, the algorithm above will perform correctly which is the primary goal.