I have a college assignment to implement a class called sorter, whose object is a sorted vector (the user is allowed to sort it by the method he/she desires). I have been able to implement it as following. Please review it and suggest me the changes I should make to make it efficient.
template <typename T>
class Sorter {
std::vector<T> vector;
int count(T* array) {
register int i = 0;
while (*(array + i)) {
i++;
}
i--;
return i;
}
void swap(T& t1, T& t2) {
T temp = t1;
t1 = t2;
t2 = temp;
}
void bubbleSort() {
for (int i = 0; i < vector.size(); i++) {
for (int j = 0; j < vector.size() - 1; j++) {
if (vector[j] > vector[j + 1]) {
swap(vector[j], vector[j + 1]);
}
}
}
}
std::vector<T> mergeSort(std::vector<T> vec) {
int begin = 0;
int end = vec.size() - 1;
auto combine = [=](std::vector<T> lv, std::vector<T> rv) {
std::vector<T> retVec;
int i = 0;
int j = 0;
while (i < lv.capacity() && j < rv.capacity()) {
if (lv[i] < rv[j]) {
retVec.push_back(lv[i]);
i++;
}
else {
retVec.push_back(rv[j]);
j++;
}
}
while (i < lv.capacity()) {
retVec.push_back(lv[i]);
i++;
}
while (j < rv.capacity()) {
retVec.push_back(rv[j]);
j++;
}
return retVec;
};
if (vec.size() == 1) {
return vec;
}
int mid = (begin + end) / 2;
auto it = vec.begin();
std::vector<T> left = mergeSort(std::vector<T>(it, it + mid + 1));
std::vector<T> right = mergeSort(std::vector<T>(it + mid + 1, vec.end()));
return combine(left, right);
}
void sort() {
std::cout << "Choose your option:" << "\n";
std::cout << "1. bubblesort" << "\n";
std::cout << "2. mergesort" << "\n";
int op = 2;
//std::cin >> op;
switch (op) {
case 1:
bubbleSort();
break;
case 2:
vector = mergeSort(vector);
break;
}
}
public:
Sorter() = delete;
Sorter(std::vector<T> vec) : vector(vec) {
sort();
}
Sorter(T* array) {
int num = count(array);
for (int i = 0; i < num; i++) {
this->vector.push_back(*(array + i));
}
sort();
}
std::vector<T> getSorted() {
return vector;
}
};
The updated version is always available at this link
2 Answers 2
In the sort() function its currently hardcoded to merge sort which I assume is not intentional. However, if you do allow the user to give an input you should do some error checking. You'll currently get unexpected behaviour if the user enters anything other than 1 or 2.
You should use std::swap rather than implementing your own swap function. Some types can make optimisations over what you've done.
From a design point of view I would argue that c++ is not an inherently object oriented language and that bubblesort and mergesort should be free functions taking a pair of iterators, to allow them to be more easily reused. You've pretty much done this for merge sort already, it uses none of the members of the Sorter class so could just be made a free function.
Your bubble sort implementation is inefficient. It should check if the vector is sorted after each pass, rather than doing the outer loops you could do
while(!isSorted(vector))
or similar.Merge sort has a very big lambda in it for combine. From a readability point of view I would pull that out into a separate function.
Both mergesort and bubblesort use indices with operator[] to get the elements from the vector. The preferred way to do this in c++ is using iterators, see https://stackoverflow.com/questions/131241/why-use-iterators-instead-of-array-indices
In
count()
register is deprecated (and strictly is removed since c++17). You should probably avoid using it to improve compatibility.I would need to think about it more, but I question if
while (*(array + i))
in count will always behave as expected. You assume that the array has an end sentinel value that will dereference to null. I don't that's guaranteed. As far as I know there is not an easy way to get the size from a c style array. The normal way to handle this is to require the user to pass in the size as well.Update on capacity(). Your use of capacity() in the merge sort is not correct. You should use size() instead. size() tells you the number of elements currently in the vector whereas capacity() tells you how many elements the vector reserved space for internally. i.e. if you insert more than this number of things into the vector it will internally reserve more space and do reallocate all the elements. It's important to note that the capacity can be greater than the number of elements in the vector. In your case it works ok as the constructor with a pair of iterators initialises the capacity equal to the size. However, I don't think this is guaranteed and is complier dependant. To see the difference try the following code:
std::vector<int> v;
std::cout << 0 << " " << v.size() << " " << v.capacity() <<"\n";
for( int i = 0; i < 10; ++i)
{
v.push_back(i);
std::cout << i << " " << v.size() << " " << v.capacity() <<"\n";
}
0 0 0
0 1 1
1 2 2
2 3 4
3 4 4
4 5 8
5 6 8
6 7 8
7 8 8
8 9 16
9 10 16
-
\$\begingroup\$
while (*(array + i))
is definitely wrong, it assumes there's a sentinel value that can be converted to afalse
. Definitely not guaranteed, and usually doesn't even compile. \$\endgroup\$Mooing Duck– Mooing Duck2020年03月27日 00:49:13 +00:00Commented Mar 27, 2020 at 0:49 -
1\$\begingroup\$
(begin + end) / 2
risks overflow, but that's more of an edge case. \$\endgroup\$Mooing Duck– Mooing Duck2020年03月27日 00:51:44 +00:00Commented Mar 27, 2020 at 0:51 -
1\$\begingroup\$ @nivag, I agree with your opinion completely. However, I always wondered about point 6 you have mentioned, from like when I learned c++. I know that std::vector::begin will return me an iterator pointing to the first element and then, to point to the, umm.. lets say third element (any element in-between actually), I will have to increment the iterator that many times. This is not as simple as saying
std::vector<int> v = { 1,2,3,4,5 }; std::cout << v[3]
. So, is there any equivalent function in c++ which will return me an iterator directly pointing to nth element? If there is, how do I use it? \$\endgroup\$cpplover– cpplover2020年03月27日 06:18:50 +00:00Commented Mar 27, 2020 at 6:18 -
1\$\begingroup\$ If you really want a particular element v[3] can be fine, but in your case you are mainly just looping through all the elements. Also vector has a random access iterator so supports e.g.
v.begin()+3
if needed. I did mean to say something about capacity but ran out of time yesterday. I'll try and add something later if no one else does. \$\endgroup\$nivag– nivag2020年03月27日 09:09:24 +00:00Commented Mar 27, 2020 at 9:09 -
1\$\begingroup\$ @cpplover: Well, if the array has 1.5 million entries, and
begin
is 1,499,999,999, andend
if 1,500,000,000, thenbegin+end
is 2,999,999,999, but the maximum that fits in an int is 2,147,483,648, sobegin+end
will result in an unspecified number, usually -852,516,351. Then -852,516,351/2 is -426,258,175, so your code will try to access index -426,258,175 of the array.int mid = begin/2 + end/2;
doesn't have this overflow problem. \$\endgroup\$Mooing Duck– Mooing Duck2020年03月27日 17:11:08 +00:00Commented Mar 27, 2020 at 17:11
READABILITY
Function: int count(T* array)
I don't prefer this way of acessing a (T) while(*(array+i))
, because of readability reasons. Better way: while(array[i])
PERFORMANCE
Function: void swap(T& t1, T& t2)
Faster way is to use XOR (^) swap algorithm. it is faster and small amout of memory is used.
if (t1 != t2)
{
*t1 ^= *t2;
*t2 ^= *t1;
*t1 ^= *t2;
}
and function declaration should be inline void swap(T& t1, T& t2)
Function: std::vector<T> getSorted()
Should be inline std::vector<T> getSorted()
-
5\$\begingroup\$ I wouldn't recommend an xor swap not least because it won't for lots of types and this code looks like it expects to sort a vector of any type. See stackoverflow.com/questions/10549155/… for more reasons. \$\endgroup\$nivag– nivag2020年03月26日 17:56:00 +00:00Commented Mar 26, 2020 at 17:56
-
2\$\begingroup\$ Also there are bigger problems, namely that
count
usually reads past the end of the array. \$\endgroup\$Mooing Duck– Mooing Duck2020年03月27日 00:46:56 +00:00Commented Mar 27, 2020 at 0:46 -
4\$\begingroup\$ The XOR algorithm is much slower and uses no less memory than the normal swap. But a bigger problem is that
swap(a, a)
will result ina = 0
. And marking functions defined in the class has zero use. \$\endgroup\$L. F.– L. F.2020年03月27日 02:11:42 +00:00Commented Mar 27, 2020 at 2:11 -
1\$\begingroup\$ Also, I feel, it is better to call Pointer declared arrays as *(array+i) instead of array[i]. This helps in emphasizing that the array has been declared as a pointer and allocated dynamically, instead of static allocation. \$\endgroup\$kesarling– kesarling2020年03月27日 06:31:09 +00:00Commented Mar 27, 2020 at 6:31
-
\$\begingroup\$ Oops, I meant "marking functions defined in the class
inline
" in my last comment ... Sorry \$\endgroup\$L. F.– L. F.2020年03月28日 04:13:49 +00:00Commented Mar 28, 2020 at 4:13
Explore related questions
See similar questions with these tags.
quicksort
function. Am I supposed to post it as a new question with this one linking to that (and that one linking to this) or do I update the code here itself? \$\endgroup\$