Radix sort (LSD) and counting sort

Question 1

I'm reading CLRS, and to practice, I rolled my version of radix sort and counting sort. I was looking at some reference implementations, particularly the LSD one from Rosetta Code, and mine performs significantly better (~10 times), especially on 64 bit inputs and if the maximum input range is known.

I think one place that could be improved is the creation of res inside counting sort. It'd be faster if I could somehow use reserve as right now I default initialize everything and then assign over them. However, I can't grow it using push_back since in the following loop, res is grown randomly, rather than linearly as required by push_back. I don't know of any data structure that can have a runtime set size with unintialized values that also gets optimized nicely in the copy line after the loop.

I experimented with variable length arrays such as T res[n] and it works up to the stack limit. However, limiting input size to the size of my stack is not acceptable. Anyway, this change did not have a noticeable impact on performance.

Suggestions for improvements would be appreciated.

Repository of algorithms

Counting sort

// counting sort, assumes each input is integral between 0 to k
// O(n) if k = O(n)
template <typename Iter, typename Op>
void cnt_sort(Iter begin, Iter end, size_t k, Op op) {
 vector<int> counts(k); // init to 0
 for (auto i = begin; i != end; ++i) // count # elems == i
 ++counts[op(*i)]; 
 for (size_t i = 1; i < k; ++i)
 counts[i] += counts[i-1]; // turn into # elems <= i
 vector<typename Iter::value_type> res(distance(begin, end)); // doing useless initialization here
 for (auto j = end;;) {
 --j;
 res[--counts[op(*j)]] = *j;
 if (j == begin) break;
 }
 copy(res.begin(), res.end(), begin); // compiler optimizes this out
}

Radix sort

// radix sort, more practical than counting sort
// O(d(n + k)) running time where d is # digits, k is size of digit
class Digit_cmp { // functor for comparing a "digit" (particular bits)
 const long long mask; // 0..63 bitfield to test against
 const size_t to_shift;
public:
 Digit_cmp(long long m, size_t ts) : mask(m), to_shift(ts) {}
 template <typename T>
 T operator()(T n) const {
 return (n & mask) >> to_shift; // mask then shift to unit digit ex. 0xfab20000 >> 16
 }
};
template <typename Iter> 
void rdx_sort(Iter begin, Iter end, int bits) { 
 // bits is # bits to consider up to if a max val is known ahead of time
 // most efficent when digits are base n, having lg(n) bits
 size_t r {static_cast<size_t>(log2(end - begin))}; // # bits in digit
 size_t k {static_cast<size_t>(pow(2, r))}; // range of digit
 size_t d {0}; // current digit num
 for (long long mask = ~(~0 << r);; // ex. 0x000000ff for setting lower 8 bits on 32 bit num
 mask <<= r) {
 cnt_sort(begin, end, k, Digit_cmp(mask, r*d));
 ++d;
 if (mask & (1 << (bits-1))) break; // finished masking most significant digit
 }
}
template <typename Iter> // range of input not known, just use max ex. 32 bits for ints
void rdx_sort(Iter begin, Iter end) {
 int bits {sizeof(typename Iter::value_type)*CHAR_BIT};
 rdx_sort(begin, end, bits);
}

Question 2

I found a list a few things that you could improve, but nothing that would significantly change the way your sorts work:

At the end of the counting sort, you could use std::move instead of std::copy to move the elements to the original collection. For simple integers, it shouldn't change anything but it could make a significant difference if you try to sort big integers.
There is a more idiomatic way to compute sizeof(typename Iter::value_type)*CHAR_BIT in the standard library thanks to std::numeric_limits::digits, so you can actually rewrite rdx_sort as follows:
```
template <typename Iter> // range of input not known, just use max ex. 32 bits for ints
void rdx_sort(Iter begin, Iter end) {
 int bits {std::numeric_limits<typename Iter::value_type>::digits};
 rdx_sort(begin, end, bits);
}
```
Any number library (like a big number library) is allowed to specialize std::numeric_limits and every decent numbers library does so, so I shouldn't make your code less portable. By the way, the returned value is not always int, so the best thing you could do is to use a type alias to make things clearer:
```
template <typename Iter> // range of input not known, just use max ex. 32 bits for ints
void rdx_sort(Iter begin, Iter end) {
 using value_type = typename Iter::value_type
 value_type bits {std::numeric_limits<value_type>::digits};
 rdx_sort(begin, end, bits);
}
```
Be careful of wild assumptions:
```
const long long mask; // 0..63 bitfield to test against
```
long long is not guaranteed to be a 64-bit integer, even if it is more often the case than not. If you want to avoid problems, you should use std::int64_t from <cstdint>. This type only exists if the architecture actually has a 64-bit integer, bug I guess that you will have more problems anyway if your implementation does not.
```
const std::int64_t mask; // 0..63 bitfield to test against
```
It seems that you are using using namespace std;, which is something you should really avoid in a header-only library like yours. Using this directive will import every name from std:: into the global namespace for anyone using your library and you will for sure end with name clashes at some point. Consider std::-qualifying everything from the C and C++ standard libraries (except macros of course), even std::size_t: some implementations don't import the names from the C standard library into the global namespace.
Dropping random vowels from function names isn't going to help anymore. Consider using full names like counting_sort and radix_sort instead.

Morwenn Morwenn 20.2k3 gold badges69 silver badges132 bronze badges · Answer 1 · 2015-12-18 20:25:58Z

I found a list a few things that you could improve, but nothing that would significantly change the way your sorts work:

At the end of the counting sort, you could use std::move instead of std::copy to move the elements to the original collection. For simple integers, it shouldn't change anything but it could make a significant difference if you try to sort big integers.
There is a more idiomatic way to compute sizeof(typename Iter::value_type)*CHAR_BIT in the standard library thanks to std::numeric_limits::digits, so you can actually rewrite rdx_sort as follows:
```
template <typename Iter> // range of input not known, just use max ex. 32 bits for ints
void rdx_sort(Iter begin, Iter end) {
 int bits {std::numeric_limits<typename Iter::value_type>::digits};
 rdx_sort(begin, end, bits);
}
```
Any number library (like a big number library) is allowed to specialize std::numeric_limits and every decent numbers library does so, so I shouldn't make your code less portable. By the way, the returned value is not always int, so the best thing you could do is to use a type alias to make things clearer:
```
template <typename Iter> // range of input not known, just use max ex. 32 bits for ints
void rdx_sort(Iter begin, Iter end) {
 using value_type = typename Iter::value_type
 value_type bits {std::numeric_limits<value_type>::digits};
 rdx_sort(begin, end, bits);
}
```
Be careful of wild assumptions:
```
const long long mask; // 0..63 bitfield to test against
```
long long is not guaranteed to be a 64-bit integer, even if it is more often the case than not. If you want to avoid problems, you should use std::int64_t from <cstdint>. This type only exists if the architecture actually has a 64-bit integer, bug I guess that you will have more problems anyway if your implementation does not.
```
const std::int64_t mask; // 0..63 bitfield to test against
```
It seems that you are using using namespace std;, which is something you should really avoid in a header-only library like yours. Using this directive will import every name from std:: into the global namespace for anyone using your library and you will for sure end with name clashes at some point. Consider std::-qualifying everything from the C and C++ standard libraries (except macros of course), even std::size_t: some implementations don't import the names from the C standard library into the global namespace.
Dropping random vowels from function names isn't going to help anymore. Consider using full names like counting_sort and radix_sort instead.

Stack Exchange Network

Radix sort (LSD) and counting sort

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Radix sort (LSD) and counting sort

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions