I wrote a little C++ sorting routine for my Algorithm studies, but I feel that a lot can be skipped out.
I am aware that modern sort takes Iterators and comparator functions, let's disregard that for now. I just want to clean up this code. Specifically, I don't like the if
and value variables.
I have looked into other solutions, but they didn't clear up much for me.
void SelectionSorter::sort(std::vector<int> &v){
int currentIndex = 0;
while (currentIndex < v.size()) {
int indexOfSmallestItem = -1;
int value = v[currentIndex];
bool shouldSwap = false;
for (int i = currentIndex; i < v.size(); i++) {
if (v[i] < value) {
value = v[i];
indexOfSmallestItem = i;
shouldSwap = true;
}
}
if (shouldSwap) {
int temp = v[currentIndex];
v[currentIndex] = v[indexOfSmallestItem];
v[indexOfSmallestItem] = temp;
}
currentIndex++;
}
}
2 Answers 2
Some points:
The
shouldSwap
variable is not needed. Instead, setindexOfSmallestItem
tocurrentIndex
at the beginning of the loop (this is necessary for the next step to work) and check is theindexOfSmallestItem
iscurrentIndex
:if (indexOfSmallestItem != currentIndex) { int temp = v[currentIndex]; v[currentIndex] = v[indexOfSmallestItem]; v[indexOfSmallestItem] = temp; }
value
is also not needed, as you can access the vector directly:if (v[i] < v[indexOfSmallestItem]) { value = v[i]; indexOfSmallestItem = i; }
The outside
while
loop can be afor
loop, because:int currentIndex = 0; // initialization while (currentIndex < v.size()) { // condition // ... currentIndex++; // increment }
So it would be:
for (int currentIndex = 0; currentIndex < v.size(); currentIndex++) { // ... }
You actually can start on
currentIndex + 1
because the first one is the current value.The naming of the vector is not good, as one-letter names are hard to read. Change it to
vect
or something similar.
Now all you need is the indexOfSmallestItem
and the vector:
void SelectionSorter::sort(std::vector<int> &vect) {
for (int currentIndex = 0; currentIndex < vect.size(); currentIndex++) {
int indexOfSmallestItem = currentIndex;
for (int i = currentIndex + 1; i < vect.size(); i++) {
if (vect[i] < vect[indexOfSmallestItem]) {
indexOfSmallestItem = i;
}
}
if (indexOfSmallestItem != currentIndex) {
int temp = vect[currentIndex];
vect[currentIndex] = vect[indexOfSmallestItem];
vect[indexOfSmallestItem] = temp;
}
}
}
Optionally, you can remove the if (indexOfSmallestItem != currentIndex)
completely, as it is not much of an improvement.
Also, you can extract the swapping part into a method. In fact, you should. This makes code (even this size) easier to maintain.
void SelectionSorter::sort(std::vector<int> &vect) {
for (int currentIndex = 0; currentIndex < vect.size(); currentIndex++) {
int indexOfSmallestItem = currentIndex;
for (int i = currentIndex + 1; i < vect.size(); i++) {
if (vect[i] < vect[indexOfSmallestItem]) {
indexOfSmallestItem = i;
}
}
swap(vect, currentIndex, indexOfSmallestItem);
}
}
void swap(std::vector<int> &vect, int i, int j) {
int temp = vect[i];
vect[i] = vect[j];
vect[j] = temp;
}
-
\$\begingroup\$ Hypothetically speaking, nothing bad happens if we swap first value with itself, is there anything to gain if we remove the if statement alltogether and keep the inside code? Can we get improvements due to not branching and not cache missing? \$\endgroup\$Dvole– Dvole2015年11月28日 17:50:16 +00:00Commented Nov 28, 2015 at 17:50
-
\$\begingroup\$ @Dvole Yes, nothing bad happens if swap was done on the same element, but it is more costly than the quick inequality check. It is a micro-optimization, so it can be removed. \$\endgroup\$TheCoffeeCup– TheCoffeeCup2015年11月28日 17:53:31 +00:00Commented Nov 28, 2015 at 17:53
Prefer high-level abstractions over low-level handcrafted code
Functions that try to do too much often become hard to read, extend, test, and debug. Functions should be focused and short. An easy way to tell whether or not you should refactor your code into smaller functions is the "And"-test. Describe what your function does. If your descriptions uses and, then you should consider splitting the function into its parts. For your selection sort, you would describe it by:
Iterate over each element. Find the index of the minimum element from the remaining elements and swap if that index is not our current index.
Write the small and focused min_element_index()
and swap()
functions.
#include <vector>
std::size_t min_element_index(const std::vector<int>& vec,
std::size_t startingIndex = 0) {
for (auto candidate = startingIndex + 1; candidate < vec.size();
++candidate) {
if (vec[candidate] < vec[startingIndex]) {
startingIndex = candidate;
}
}
return startingIndex;
}
void swap_by_index(std::vector<int>& vec, int first, int second) {
const auto temp = vec[first];
vec[first] = vec[second];
vec[second] = temp;
}
Let sort()
be a composed function of those two small, focused, and tested functions.
void SelectionSorter::sort(std::vector<int>& vec) {
for (auto currentIndex = 0; currentIndex < vec.size(); ++currentIndex) {
const auto indexOfSmallestItem = min_element_index(vec, currentIndex);
swap_by_index(vec, currentIndex, indexOfSmallestItem);
}
}
We gain reusable functions, avoid magic numbers (-1
), and can const
-qualify our immutable objects.
Prefer to treat warnings as errors
For these small learning programs and new code bases, turn up your warning level and treat warnings as errors. Older code bases should continue to use the same rules that are already being used.
You have mismatched sign comparisons for each std::vector<int>::size()
comparison.
Use the C++ Style Declarator Layout
void SelectionSorter::sort(std::vector<int>& v) {
-------------------------------------------^
Also wasn't sure if the missing space before the {
was intended. Utilize extensions/tools (Clang-Format, Astyle, etc) to maintain the style layout.