I've written this code for a merge sort, which is meant to implement the pseudo-code from Cormen's Introduction to Algorithms:
#include <iostream>
#include <cstdlib>
using namespace std;
const unsigned long long infinity = -1ULL;
void merge(int* A,int p,const int q, const int r)
{
const int n_1=q-p+1;
const int n_2=r-q;
int* L = new int [n_1+1];
int* R = new int [n_2+1];
L[n_1]=infinity;
R[n_2]=infinity;
for(int i = 0; i < n_1; i++)
L[i] = A[p+i];
for (int j = 0; j < n_2; j++)
R[j] = A[q+j+1];
int i=0;
int j=0;
// for(int k = p; k <= r; p++) broken code
int k;
for(k=p; k <= r && i < n_1 && j < n_2; ++k)
{
if(L[i] <= R[j])
{
A[k] = L[i];
i++;
}
else
{
A[k] = R[j];
j++;
}
}
// Added the following two loop.
// Note only zero or one loop will actually activate.
while (i < n_1) {A[k++] = L[i++];}
while (j < n_2) {A[k++] = R[j++];}
}
void merge_sort(int* A, const int p, const int r)
{
if (p < r)
{
int q = (p+r)/2;
merge_sort(A, p,q);
merge_sort(A,q+1,r);
merge(A,p,q,r);
}
}
int main()
{
int length;
cout << "Specify array length" << endl;
cin >> length;
cout << "\n";
int A [length];
//Populate and print the Array
for(int i = 0; i < length; i++)
{
A[i] = rand()%99-1;
cout << A[i] << " ";
}
cout << "\n";
merge_sort(A,0,length-1);
cout << "Your array has been merge_sorted and is now this: " << endl;
for(int i = 0; i < length; i++) cout << A[i] << " ";
cout << "\n";
//cout << infinity << endl;
return 0;
}
1 Answer 1
Don't do this:
using namespace std;
For a detailed explanation on why not to use the usign clause.
This is not used anywhere (also its not infinity so baddy named).
const unsigned long long infinity = -1ULL;
Passing pointers is a not a good idea.
void merge(int* A,int p,const int q, const int r)
Try and avoid it because you get into the realm of ownership symantics. Here I would be OK with using it but I would definitely rename the variable to explain what it is. You have a tendency to use excessively short variable names; this make the code hard to read.
If you use the standard practive of begin and end (used in the STL (so begin points at the first and end points one past the last)) then you will find that your code becomes a lot simpler to write (especially in this case).
const int n_1=q-p+1;
const int n_2=r-q;
This is stupendously bad for C++ code. This looks like C code.
int* L = new int [n_1+1];
int* R = new int [n_2+1];
You should practically never use new
. When you do you should always match it with delete (I see no delete anywhere so your code leaks). The reason you never use new
is the problem with matching it with delete (especially when exceptions can be flying around). What you want to do is use a container:
std::vector<int> left(n_1+1);
std::vector<int> right(n_2+1);
You only use new
and delete
when building your own container types or in very low level code. Normally you will use existing containers std::vector
or smart pointers (for these use make_{smartpointertype}
.
Has no affect:
L[n_1]=infinity;
R[n_2]=infinity;
Especially since you never check for infinity. Also because you always know the exect bounds and should not fall off the end.
Also this is a particularly bad implementation of the algorithm.
Most version I have seen use a split/merge in place algorithm. Thus you don't need to copy data around into new arrays all over the place.
for(int i = 0; i < n_1; i++)
L[i] = A[p+i];
for (int j = 0; j < n_2; j++)
R[j] = A[q+j+1];
for(k=p; k <= r && i < n_1 && j < n_2; ++k)
{
if(L[i] <= R[j])
{
A[k] = L[i];
i++;
}
else
{
A[k] = R[j];
j++;
}
}
-
1\$\begingroup\$ "This is not used anywhere (also its not infinity so baddy named)." -- Well, I followed the logic of Cormen as best I could, and understood this particular point as "use a really large number". But yes, it isn't actually used. Admittedly several pages later an exercise is proposed to modify the algorithm so that it wouldn't need to be used. "This is not used anywhere (also its not infinity so baddy named)." --true, but his is generally a pseudo-code to real C++ code exercise, so all the names were clear for me. But I see how your advice would be relevant in the real world. \$\endgroup\$Chiffa– Chiffa2013年09月27日 01:50:26 +00:00Commented Sep 27, 2013 at 1:50
-
\$\begingroup\$ "This looks like C code." -- it is. I wanted to re-write pseudo-code into a simple and fast code. Perhaps I should re-write the C code now into something more like C++. "I see no delete anywhere so your code leaks " --true. I've added them to my original code some time after posting the question, though. So, all things considered, thanks a lot for your answer, now I see many points I have to work on. \$\endgroup\$Chiffa– Chiffa2013年09月27日 01:56:01 +00:00Commented Sep 27, 2013 at 1:56
-
\$\begingroup\$ Read about
using
, but my response is similar to the third one there: as long as I do it in my private code sources, it wouldn't matter much. But perhaps a habit of addingstd::
where needed is worth being formed early. \$\endgroup\$Chiffa– Chiffa2013年09月27日 02:07:47 +00:00Commented Sep 27, 2013 at 2:07 -
\$\begingroup\$ I agree with forming a habit of using
std::
. But, if you really must use it, then keep it local (inside a function) instead of global. \$\endgroup\$Jamal– Jamal2013年09月27日 02:55:10 +00:00Commented Sep 27, 2013 at 2:55 -
3\$\begingroup\$ @Chiffa: <quote>as long as I do it in my private code sources, it wouldn't matter much</quote>. You already nearly broke your own code with this tiny simple example.
std::merge()
is now in the same namespace as your::merge()
. You are one parameter off hittingstd::merge()
. If your code is any larger than this you are going to start hitting issues that then become hard to debug. Unless your code is shorter than 20 lines you should never useusing namespace XX;
. It will hit you and debugging it will be a nightmare. \$\endgroup\$Loki Astari– Loki Astari2013年09月27日 15:22:32 +00:00Commented Sep 27, 2013 at 15:22
gdb
or another debugger and check whether the indeces used are correct? \$\endgroup\$for(int k = p; k <= r; p++)
. It should be:for(int k = p; k <= r && i < n_1 && j < n_2; k++)
. Notice TWO Mistakes. 1) Ifi
orj
get to the end of their section then comparing them against the other section is pointless and can lead to reading beyond the end of the array. 2) You are incrementingp
when you should be incrementingk
. \$\endgroup\$