Implementation of Radix Sort algorithm for sorting integers in c++

Question 1

I wrote the following code for Radix Sort algorithm for number sorting. Any review is highly appreciated.

You can also view and run/test the code here Radix Sort

#include<iostream>
using namespace std;
int getNumberAtPosition(int num,int position){
 return (num/position)%10;
}
void radixSort(int array[],int length){
int sizeOfEachBucket = length;
int numberOfBuckets = 10;
int buckets[10][sizeOfEachBucket];
int large = 0;
int maxPasses = 0;
//finding largest number from array
for(int i=0; i<length; i++){
 if(array[i]>large){
 large = array[i];
 }
}
//finding the number of passes
while(large != 0){
 maxPasses++;
 large = large/10;
}
cout<<"Max passes ="<<maxPasses<<endl;
int position = 1;
int bucketIndex = 0;
int newListIndex = 0;
int arrayLengths[10];
for(int i=0; i<maxPasses; i++){
 //cout<<"i ="<<i<<endl;
 for(int k=0; k<=9; k++){
 //cout<<"k ="<<k<<endl;
 bucketIndex = 0;
 for(int j=0; j<length; j++){
 if(k==getNumberAtPosition(array[j],position)){
 buckets[k][bucketIndex] = array[j];
 bucketIndex++;
 }
 }
 arrayLengths[k] = bucketIndex;
 }
 position = position*10;
 int newArrayIndex = 0;
 for(int k=0; k<=9; k++){
 //cout<<"k ="<<k<<endl;
 bucketIndex = 0;
 for(int x=0; x<arrayLengths[k];x++){
 array[newArrayIndex] = buckets[k][x];
 newArrayIndex++;
 }
 }
}
for(int i=0; i<length; i++){
 cout<<array[i]<<"\t";
}
}

Question 2

First things first, don't use using namespace std in header files, where your radix sort implementation should be. Besides, convenience isn't a valid excuse when the only thing you import from std is cout: just type std::cout and you're done.
One liner functions are often useless and noisy: you need to refer to the implementation to know exactly what they do, so they don't make the code more readable, and you have to come up with a name, which is not always easy. In the case of getNumberAtPosition, for instance, it's impossible to tell from the name if the position is meant for the most or least significant digit, and both are equally likely in a radix sort algorithm.
everything that isn't meant to change should be const. The only place where it isn't idiomatic is in function signature, where you often don't tag built-in type arguments passed by value as const.
Also, don't alias variables: length is only used to define sizeOfEachBucket, that's two names to track down instead of one.
Use the standard library: there may not be many things inside compared to other languages, but what's inside is very well implemented. It will also make your code more concise and expressive. For instance, the largest element in a [first, last) sequence is the result of std::max_element(first, last) (std::max_element resides in <algorithm>). Using standard containers is also a statement: a constant size array will be a std::array, whereas a variable-size one will be a std::vector.
Avoid naked loops: by that I mean loops like for (int i = 0; i < z; ++i). The nested ones are particularly difficult to read, with all those meaningless one-letter variable names. Use either range based for loops when you iterate over the whole container (e.g: for (auto item : vector)) or named algorithm when your loop has a standardly implemented purpose (std::for_each, std::copy, etc.).
When implementing your own algorithms, try to use an stl-like interface: iterators are preferable because they abstract the concrete container-type. Your algorithm won't work on an std::list although it very well could be (radix sort doesn't rely on random access like quick sort does).

Here's an example of better-looking (though not thoroughly tested) code:

#include <algorithm>
#include <vector>
#include <array>
#include <cmath>
template <typename Iterator>
void radix_sort(Iterator first, Iterator last) {
 const int max_divisor = std::pow(10, std::log10(*std::max_element(first, last)));
 for (int divisor = 1; divisor < max_divisor; divisor *= 10) {
 std::array<std::vector<int>, 10> buckets;
 std::for_each(first, last, [&buckets, divisor](auto i) {
 buckets[(i / divisor) % 10].push_back(i);
 });
 auto out = first;
 for (const auto& bucket : buckets) {
 out = std::copy(bucket.begin(), bucket.end(), out);
 }
 }
}

EDIT: since the algorithm exposed in the question relies on decimal digits, I also formulated a base-10 based algorithm. But thinking back about this, I feel like my answer isn't complete if I don't precise that a base-two approach is more optimal (and more generally used as far as I know). Why is that?

because binary arithmetic is easier for a computer (not a very strong reason since decimal arithmetic is often optimized into binary arithmetic by the compiler);
because -and that's a much stronger reason- you can the rely on a very well-known algorithm to distribute your number between buckets, an algorithm that moreover does it in place, thus without any memory allocation;
by the way, that algorithm is std::stable_partition

And here is a sample:

template <typename Iterator>
void binary_radix_sort(Iterator first, Iterator last) {
 using integer_type = std::decay_t<decltype(*first)>;
 bool finished = false;
 for (integer_type mask = 1; !finished; mask <<= 1) {
 finished = true;
 std::stable_partition(first, last, [mask, &finished](auto i) {
 if (mask < i) finished = false;
 return !(mask & i);
 });
 }
}

Question 3

Thanks for such a valuable advice!

Question 4

I have to strongly disagree with #2. One line functions often provide important abstractions that are invaluable and significantly reduce cognitive load

papagaga papagaga 5,80712 silver badges25 bronze badges · Accepted Answer · 2019-03-29 09:27:59Z

First things first, don't use using namespace std in header files, where your radix sort implementation should be. Besides, convenience isn't a valid excuse when the only thing you import from std is cout: just type std::cout and you're done.
One liner functions are often useless and noisy: you need to refer to the implementation to know exactly what they do, so they don't make the code more readable, and you have to come up with a name, which is not always easy. In the case of getNumberAtPosition, for instance, it's impossible to tell from the name if the position is meant for the most or least significant digit, and both are equally likely in a radix sort algorithm.
everything that isn't meant to change should be const. The only place where it isn't idiomatic is in function signature, where you often don't tag built-in type arguments passed by value as const.
Also, don't alias variables: length is only used to define sizeOfEachBucket, that's two names to track down instead of one.
Use the standard library: there may not be many things inside compared to other languages, but what's inside is very well implemented. It will also make your code more concise and expressive. For instance, the largest element in a [first, last) sequence is the result of std::max_element(first, last) (std::max_element resides in <algorithm>). Using standard containers is also a statement: a constant size array will be a std::array, whereas a variable-size one will be a std::vector.
Avoid naked loops: by that I mean loops like for (int i = 0; i < z; ++i). The nested ones are particularly difficult to read, with all those meaningless one-letter variable names. Use either range based for loops when you iterate over the whole container (e.g: for (auto item : vector)) or named algorithm when your loop has a standardly implemented purpose (std::for_each, std::copy, etc.).
When implementing your own algorithms, try to use an stl-like interface: iterators are preferable because they abstract the concrete container-type. Your algorithm won't work on an std::list although it very well could be (radix sort doesn't rely on random access like quick sort does).

Here's an example of better-looking (though not thoroughly tested) code:

#include <algorithm>
#include <vector>
#include <array>
#include <cmath>
template <typename Iterator>
void radix_sort(Iterator first, Iterator last) {
 const int max_divisor = std::pow(10, std::log10(*std::max_element(first, last)));
 for (int divisor = 1; divisor < max_divisor; divisor *= 10) {
 std::array<std::vector<int>, 10> buckets;
 std::for_each(first, last, [&buckets, divisor](auto i) {
 buckets[(i / divisor) % 10].push_back(i);
 });
 auto out = first;
 for (const auto& bucket : buckets) {
 out = std::copy(bucket.begin(), bucket.end(), out);
 }
 }
}

EDIT: since the algorithm exposed in the question relies on decimal digits, I also formulated a base-10 based algorithm. But thinking back about this, I feel like my answer isn't complete if I don't precise that a base-two approach is more optimal (and more generally used as far as I know). Why is that?

because binary arithmetic is easier for a computer (not a very strong reason since decimal arithmetic is often optimized into binary arithmetic by the compiler);
because -and that's a much stronger reason- you can the rely on a very well-known algorithm to distribute your number between buckets, an algorithm that moreover does it in place, thus without any memory allocation;
by the way, that algorithm is std::stable_partition

And here is a sample:

template <typename Iterator>
void binary_radix_sort(Iterator first, Iterator last) {
 using integer_type = std::decay_t<decltype(*first)>;
 bool finished = false;
 for (integer_type mask = 1; !finished; mask <<= 1) {
 finished = true;
 std::stable_partition(first, last, [mask, &finished](auto i) {
 if (mask < i) finished = false;
 return !(mask & i);
 });
 }
}

I have to strongly disagree with #2. One line functions often provide important abstractions that are invaluable and significantly reduce cognitive load

Stack Exchange Network

Implementation of Radix Sort algorithm for sorting integers in c++

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Hot Network Questions

Implementation of Radix Sort algorithm for sorting integers in c++

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Related

Hot Network Questions