I wrote the following code for Radix Sort algorithm for number sorting. Any review is highly appreciated.
You can also view and run/test the code here Radix Sort
#include<iostream>
using namespace std;
int getNumberAtPosition(int num,int position){
return (num/position)%10;
}
void radixSort(int array[],int length){
int sizeOfEachBucket = length;
int numberOfBuckets = 10;
int buckets[10][sizeOfEachBucket];
int large = 0;
int maxPasses = 0;
//finding largest number from array
for(int i=0; i<length; i++){
if(array[i]>large){
large = array[i];
}
}
//finding the number of passes
while(large != 0){
maxPasses++;
large = large/10;
}
cout<<"Max passes ="<<maxPasses<<endl;
int position = 1;
int bucketIndex = 0;
int newListIndex = 0;
int arrayLengths[10];
for(int i=0; i<maxPasses; i++){
//cout<<"i ="<<i<<endl;
for(int k=0; k<=9; k++){
//cout<<"k ="<<k<<endl;
bucketIndex = 0;
for(int j=0; j<length; j++){
if(k==getNumberAtPosition(array[j],position)){
buckets[k][bucketIndex] = array[j];
bucketIndex++;
}
}
arrayLengths[k] = bucketIndex;
}
position = position*10;
int newArrayIndex = 0;
for(int k=0; k<=9; k++){
//cout<<"k ="<<k<<endl;
bucketIndex = 0;
for(int x=0; x<arrayLengths[k];x++){
array[newArrayIndex] = buckets[k][x];
newArrayIndex++;
}
}
}
for(int i=0; i<length; i++){
cout<<array[i]<<"\t";
}
}
1 Answer 1
First things first, don't use
using namespace std
in header files, where your radix sort implementation should be. Besides, convenience isn't a valid excuse when the only thing you import fromstd
iscout
: just typestd::cout
and you're done.One liner functions are often useless and noisy: you need to refer to the implementation to know exactly what they do, so they don't make the code more readable, and you have to come up with a name, which is not always easy. In the case of
getNumberAtPosition
, for instance, it's impossible to tell from the name if the position is meant for the most or least significant digit, and both are equally likely in a radix sort algorithm.everything that isn't meant to change should be
const
. The only place where it isn't idiomatic is in function signature, where you often don't tag built-in type arguments passed by value as const.Also, don't alias variables:
length
is only used to definesizeOfEachBucket
, that's two names to track down instead of one.Use the standard library: there may not be many things inside compared to other languages, but what's inside is very well implemented. It will also make your code more concise and expressive. For instance, the largest element in a
[first, last)
sequence is the result ofstd::max_element(first, last)
(std::max_element
resides in<algorithm>
). Using standard containers is also a statement: a constant size array will be astd::array
, whereas a variable-size one will be astd::vector
.Avoid naked loops: by that I mean loops like
for (int i = 0; i < z; ++i)
. The nested ones are particularly difficult to read, with all those meaningless one-letter variable names. Use either range based for loops when you iterate over the whole container (e.g:for (auto item : vector)
) or named algorithm when your loop has a standardly implemented purpose (std::for_each
,std::copy
, etc.).When implementing your own algorithms, try to use an stl-like interface: iterators are preferable because they abstract the concrete container-type. Your algorithm won't work on an
std::list
although it very well could be (radix sort doesn't rely on random access like quick sort does).
Here's an example of better-looking (though not thoroughly tested) code:
#include <algorithm>
#include <vector>
#include <array>
#include <cmath>
template <typename Iterator>
void radix_sort(Iterator first, Iterator last) {
const int max_divisor = std::pow(10, std::log10(*std::max_element(first, last)));
for (int divisor = 1; divisor < max_divisor; divisor *= 10) {
std::array<std::vector<int>, 10> buckets;
std::for_each(first, last, [&buckets, divisor](auto i) {
buckets[(i / divisor) % 10].push_back(i);
});
auto out = first;
for (const auto& bucket : buckets) {
out = std::copy(bucket.begin(), bucket.end(), out);
}
}
}
EDIT: since the algorithm exposed in the question relies on decimal digits, I also formulated a base-10 based algorithm. But thinking back about this, I feel like my answer isn't complete if I don't precise that a base-two approach is more optimal (and more generally used as far as I know). Why is that?
because binary arithmetic is easier for a computer (not a very strong reason since decimal arithmetic is often optimized into binary arithmetic by the compiler);
because -and that's a much stronger reason- you can the rely on a very well-known algorithm to distribute your number between buckets, an algorithm that moreover does it in place, thus without any memory allocation;
by the way, that algorithm is
std::stable_partition
And here is a sample:
template <typename Iterator>
void binary_radix_sort(Iterator first, Iterator last) {
using integer_type = std::decay_t<decltype(*first)>;
bool finished = false;
for (integer_type mask = 1; !finished; mask <<= 1) {
finished = true;
std::stable_partition(first, last, [mask, &finished](auto i) {
if (mask < i) finished = false;
return !(mask & i);
});
}
}
-
\$\begingroup\$ Thanks for such a valuable advice! \$\endgroup\$Deepeshkumar– Deepeshkumar2019年03月29日 17:34:08 +00:00Commented Mar 29, 2019 at 17:34
-
4\$\begingroup\$ I have to strongly disagree with #2. One line functions often provide important abstractions that are invaluable and significantly reduce cognitive load \$\endgroup\$miscco– miscco2019年03月29日 18:12:46 +00:00Commented Mar 29, 2019 at 18:12