I wanted to solve a problem where I sort an array of any size and return the minimum number of swaps.
Each swap exchanges two elements at distance 2: I choose 3 neighboring elements ABC and then swap them to CBA.
My code partially works as long as the array isn't too big, but it is pretty inefficient in my opinion.
Does anyone have a suggestion for improvement?
```cpp
#include <algorithm>
#include <cstdlib>
#include <vector>
#include <unordered_map>
using namespace std;
int index_of_largest_number(int n, int arr[]) {
    int ret = 0;
    for (int i = 0; i < n; i++) {
        if (abs(arr[i]) > abs(arr[ret])) {
            ret = i;
        }
    }
    return ret;
}

long long my_sort(int n, int arr[]) {
    vector<int> sorted(arr, arr + n);
    sort(sorted.begin(), sorted.end());
    unordered_map<int, int> arr_indices; // this map is to find the index in the original array
    unordered_map<int, int> sorted_indices; // this map is to find the index in the sorted vector
    int distance[n]; // this array stores the distance from the current position to the sorted position; index is the same as in sorted, not from the original that I don't have to change it every time
    for (int i = 0; i < n; i++) {
        sorted_indices[sorted[i]] = i;
    }
    for (int i = 0; i < n; i++) {
        int d = sorted_indices[arr[i]] - i;
        distance[sorted_indices[arr[i]]] = d;
        arr_indices[i + d] = i;
        if (d % 2 != 0) // if the distance is not a multiple of 2, it is not possible to sort it with this method, and it returns -1
            return -1;
    }
    long long swaps = 0;
    while (true) {
        int idx = index_of_largest_number(n, distance); // search the index of the element with the largest distance to its sorted position
        if (distance[idx] == 0)
            break;
        swaps++;
        // and here comes the tricky part. It works kind of but it is a bit messy
        if (distance[idx] < 0) {
            distance[sorted_indices[arr[arr_indices[idx]]]] += 2;
            distance[sorted_indices[arr[arr_indices[idx] - 2]]] -= 2;
            int tmp = arr[arr_indices[idx]];
            arr[arr_indices[idx]] = arr[arr_indices[idx] - 2];
            arr[arr_indices[idx] - 2] = tmp;
            arr_indices[idx] -= 2;
            arr_indices[idx - 2] += 2;
        } else {
            distance[sorted_indices[arr[arr_indices[idx]]]] -= 2;
            distance[sorted_indices[arr[arr_indices[idx] + 2]]] += 2;
            int tmp = arr[arr_indices[idx]];
            arr[arr_indices[idx]] = arr[arr_indices[idx] + 2];
            arr[arr_indices[idx] + 2] = tmp;
            arr_indices[idx] += 2;
            arr_indices[idx + 2] -= 2;
        }
    }
    return swaps;
}
```
1 Answer
General Observations
Because the code only partially works, the question is actually off-topic on the Code Review Community. Please read "How do I ask a good question?" before you post another question. One question: are you only compiling in debug mode, or have you compiled with the -O3 optimization and it is still slow?
The name `my_sort` doesn't really represent what the function actually does, which is count the number of swaps.
Prefer C++ Container Classes Over Old C Style Arrays
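If the `distance` array were declared as either a C++ `std::vector` or a C++ `std::array`, you would automatically be able to use iterators and other features that the C++ containers provide. Its size `n` is only known at run time, so `int distance[n]` is a variable-length array, which is a compiler extension rather than standard C++. Personally, I would make the `distance` array a `std::vector`, which offers a lot more flexibility.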
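A minimal sketch of that suggestion, assuming the distance table becomes a `std::vector<int>`. The name `index_of_largest_distance` is hypothetical; it mirrors the existing `index_of_largest_number()` but no longer needs a separate length parameter.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical replacement for index_of_largest_number(int n, int arr[]):
// the vector carries its own size, so no separate length is passed.
// Returns the first index whose entry has the largest absolute value.
std::size_t index_of_largest_distance(const std::vector<int>& distance) {
    assert(!distance.empty());  // assumption: callers pass at least one entry
    std::size_t ret = 0;
    for (std::size_t i = 1; i < distance.size(); ++i) {
        if (std::abs(distance[i]) > std::abs(distance[ret])) {
            ret = i;
        }
    }
    return ret;
}
```

Inside `my_sort()`, the declaration `int distance[n];` would then become `std::vector<int> distance(n);`, which is standard C++ and zero-initializes every entry.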
Avoid using namespace std;
If you are coding professionally, you should probably get out of the habit of using the `using namespace std;` statement. The code will more clearly define where `cout` and other identifiers are coming from (`std::cin`, `std::cout`). As you start using namespaces in your code, it is better to identify where each function comes from, because there may be function name collisions between different namespaces. You may override the identifier `cout` within your own classes, and you may override the operator `<<` in your own classes as well. This Stack Overflow question discusses this in more detail.
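As a small illustration of why qualified names help, here is a sketch with a hypothetical namespace `review_demo` that defines its own `distance()` next to the standard library's `std::distance`. None of these names come from the posted code; they only show that each call site states which function it means.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

namespace review_demo {
// Hypothetical user-defined function that shares its name with std::distance.
int distance(int from, int to) { return to - from; }
}  // namespace review_demo

// The qualifier makes it obvious that the user-defined function is meant.
int slots_to_travel(int current, int target) {
    return review_demo::distance(current, target);
}

// The qualifier makes it obvious that the standard library function is meant.
std::ptrdiff_t element_count(const std::vector<int>& values) {
    return std::distance(values.begin(), values.end());
}
```

With `using namespace std;` plus a `using namespace review_demo;`, an unqualified call to `distance` would force the reader (and the compiler's overload resolution) to work out which one is intended.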
DRY Code
There is a programming principle called the Don't Repeat Yourself principle, sometimes referred to as DRY code. If you find yourself repeating the same code multiple times, it is better to encapsulate it in a function. If it is possible to loop through the code, that can reduce repetition as well.
The then and else compound statements in the following code repeat themselves, with the exception of the value of -2 or 2. The repetitive code could be written as a function or lambda expression to reduce the amount of code in the `my_sort()` function.
```cpp
// and here comes the tricky part. It works kind of but it is a bit messy
if (distance[idx] < 0) {
    distance[sorted_indices[arr[arr_indices[idx]]]] += 2;
    distance[sorted_indices[arr[arr_indices[idx] - 2]]] -= 2;
    int tmp = arr[arr_indices[idx]];
    arr[arr_indices[idx]] = arr[arr_indices[idx] - 2];
    arr[arr_indices[idx] - 2] = tmp;
    arr_indices[idx] -= 2;
    arr_indices[idx - 2] += 2;
}
else {
    distance[sorted_indices[arr[arr_indices[idx]]]] -= 2;
    distance[sorted_indices[arr[arr_indices[idx] + 2]]] += 2;
    int tmp = arr[arr_indices[idx]];
    arr[arr_indices[idx]] = arr[arr_indices[idx] + 2];
    arr[arr_indices[idx] + 2] = tmp;
    arr_indices[idx] += 2;
    arr_indices[idx + 2] -= 2;
}
```
To get around the need to pass the arrays into the function, you might create a class that contains the `distance` array and the two unordered maps.
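One way the two branches might collapse into a single helper is sketched below. `SwapState` and `shift_by_two` are hypothetical names, and the sketch assumes all array values are distinct. Note one deliberate difference from the posted code: where the original adjusts `arr_indices[idx - 2]` or `arr_indices[idx + 2]`, this sketch updates the entry of the element that is actually displaced, which is my assumption about the intended bookkeeping rather than something the question confirms.

```cpp
#include <cassert>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical bundle of the state that my_sort() currently keeps in locals.
struct SwapState {
    std::vector<int> arr;                         // current arrangement
    std::vector<int> distance;                    // indexed by sorted position
    std::unordered_map<int, int> arr_indices;     // sorted position -> current position
    std::unordered_map<int, int> sorted_indices;  // value -> sorted position
};

// Moves the element that belongs at sorted position idx by `step` (+2 or -2)
// and swaps it with the element currently sitting at the destination.
void shift_by_two(SwapState& s, int idx, int step) {
    const int from = s.arr_indices.at(idx);
    const int to = from + step;
    assert(to >= 0 && to < static_cast<int>(s.arr.size()));
    // Sorted position of the element that gets displaced (assumed bookkeeping).
    const int other = s.sorted_indices.at(s.arr[to]);
    s.distance[idx] -= step;    // moved element is now `step` closer to home
    s.distance[other] += step;  // displaced element moved the opposite way
    std::swap(s.arr[from], s.arr[to]);
    s.arr_indices[idx] = to;
    s.arr_indices[other] = from;
}
```

The if/else in the loop would then shrink to a single call such as `shift_by_two(state, idx, state.distance[idx] < 0 ? -2 : 2);`.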
Complexity
The function `my_sort()` is too complex (does too much). While it barely fits on a page, it could be simplified by making the 2 for loops functions, as well as by applying the recommendations in the DRY Code section above.
There is also a programming principle called the Single Responsibility Principle that applies here. The Single Responsibility Principle states:
that every module, class, or function should have responsibility over a single part of the functionality provided by the software, and that responsibility should be entirely encapsulated by that module, class or function.
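As a rough sketch of how the two for loops might become functions with one responsibility each, assuming distinct values and `std::vector` inputs. The names `build_sorted_indices` and `compute_distances` are hypothetical, not taken from the posted code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// First loop: map each value to its position in sorted order.
// Takes the vector by value so the caller's data is left untouched.
std::unordered_map<int, int> build_sorted_indices(std::vector<int> values) {
    std::sort(values.begin(), values.end());
    std::unordered_map<int, int> sorted_indices;
    for (std::size_t i = 0; i < values.size(); ++i) {
        sorted_indices[values[i]] = static_cast<int>(i);
    }
    return sorted_indices;
}

// Second loop: fill `distance` (indexed by sorted position) with
// "sorted position minus current position" for every element.
// Returns false as soon as an odd distance shows that the array
// cannot be sorted with distance-2 swaps.
bool compute_distances(const std::vector<int>& values,
                       const std::unordered_map<int, int>& sorted_indices,
                       std::vector<int>& distance) {
    assert(sorted_indices.size() == values.size());  // assumes distinct values
    distance.assign(values.size(), 0);
    for (std::size_t i = 0; i < values.size(); ++i) {
        const int target = sorted_indices.at(values[i]);
        const int d = target - static_cast<int>(i);
        if (d % 2 != 0) {
            return false;
        }
        distance[target] = d;
    }
    return true;
}
```

With helpers like these, `my_sort()` would mostly be left with the swap-counting loop, which is closer to a single responsibility.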
Line Length
Most modern IDEs are flexible as far as line length goes, but having to scroll right or left makes it more difficult to read the code. The comment on the `distance` array makes the line 200 characters wide. This can be problematic: try to keep lines under 80 characters, and definitely limit them to 120 characters to prevent scrolling.
Performance
The code would probably perform better if you used iterators rather than indexes in the loops. All of the loops can probably be sped up if you use iterators.
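For example, the search for the largest distance could be written with iterators and a standard algorithm, as in the sketch below. The function name `largest_distance_position` is hypothetical, and the sketch assumes the distances live in a `std::vector<int>`. Whether this is measurably faster than the index loop depends on the compiler and optimization level, so it is worth timing both.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <iterator>
#include <vector>

// Iterator-based search for the entry with the largest absolute value.
// std::max_element returns the first such entry when there are ties,
// which matches the strict '>' comparison in the posted index loop.
std::size_t largest_distance_position(const std::vector<int>& offsets) {
    assert(!offsets.empty());  // assumption: callers pass at least one entry
    const auto it = std::max_element(
        offsets.begin(), offsets.end(),
        [](int lhs, int rhs) { return std::abs(lhs) < std::abs(rhs); });
    return static_cast<std::size_t>(std::distance(offsets.begin(), it));
}
```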