#include <iterator>
#include <memory>
#include <algorithm>
template<typename RandomIt>
void sub_merge(RandomIt begin, RandomIt mid, RandomIt end) {
// create copy of input array;
const auto left_size = std::distance(begin, mid);
const auto right_size = std::distance(mid, end);
auto left = std::make_unique<typename std::iterator_traits<RandomIt>::value_type[]>(left_size);
auto right = std::make_unique<typename std::iterator_traits<RandomIt>::value_type[]>(right_size);
const auto left_begin = left.get();
const auto left_end = left_begin + left_size;
const auto right_begin = right.get();
const auto right_end = right_begin + right_size;
std::copy(begin, mid, left_begin);
std::copy(mid, end, right_begin);
// merge until one array is completed
auto left_iter = left_begin;
auto right_iter = right_begin;
auto arr_iter = begin;
while (left_iter < left_end && right_iter < right_end) {
if (*left_iter < *right_iter) {
*arr_iter = *left_iter;
left_iter++;
} else {
*arr_iter = *right_iter;
right_iter++;
}
arr_iter++;
}
// copy remaining array over
std::copy(right_iter, right_end, arr_iter);
std::copy(left_iter, left_end, arr_iter);
}
template<typename RandomIt>
void merge_sort(RandomIt begin, RandomIt end) {
if (begin < end - 1) {
const auto mid = begin + std::distance(begin, end) / 2;
merge_sort(begin, mid);
merge_sort(mid, end);
sub_merge(begin, mid, end);
}
}
My testing shows this to be correctly implemented, however I want to know if there is anywhere I could be more efficient. I also have no way to prove if my templates are used correctly. I also do not know which Iterator type is appropriate for this. STL uses RandomIterator so I went with that but I think InputIterator or possibly ForwardIterator would be suitable.
3 Answers 3
Naming
sub_merge
is weird, as it doesn't match up at all with the meaning of the english word "submerge". I get its origin ("subroutine merge"), but why not simply call it merge
instead?
Algorithm
This seems like a straightforward top-down implementation of mergesort. However, there are some issues:
Stability
sub_merge
doesn't preserve the relative order of elements that compare equal. However, this can easily be fixed by changing the comparison operator inif(*left_iter < *right_iter)
from<
to<=
.Allocations + copies
Since each
sub_merge
call creates its own scratch space, there are \2ドルn\$ allocations in total. This could be replaced by creating one scratch buffer at the highest level instead.A benefit of doing so is the reduction in the number of necessary copies. Currently,
sub_merge
has to copy each element into the scratch space and then back into the original range. One of those copies could be skipped using a preallocated scratch buffer and alternating between it and the original range on different recursion levels, at the cost of one fixed copy when creating the scratch buffer.
Implementation
- Why not use
std::vector<std::iterator_traits<RandIter>::value_type>
instead of manually managing the memory via thosestd::unique_ptr
s?
Iterator categories
You are correct that a ForwardIterator
would technically work for merge sort. However, using those would mean that you'd need 1.5 extra passes over the range for each midpoint calculation (1 pass for the std::distance
call, and half a pass to find the mid
iterator).
This wouldn't raise the runtime complexity into a worse category, but there are more efficient sorting strategies for those cases that don't require that many passes (especially since those passes usually aren't that cheap in data structures that only have ForwardIterator
s or BidirectionalIterator
s).
That's why RandomAccessIterator
s are preferred for merge sort.
However, a lot of code would need to be changed for that to work, as
ForwardIterator
s only need to support a small subset of the operations aRandomAccessIterator
has to provide (i.e. they only have operators*
,->
,++
,==
and!=
, so no comparisons using<
or iteration using-
).
So, why won't InputIterator
s work? Because merge sort requires multiple passes over the input/output range, and InputIterator
s don't guarantee that they can do so.
Improved version
With some improvements, the implementation could look something like this:
#include <vector>
#include <type_traits>
template<typename Iter>
constexpr bool is_forward_iterator_v = std::is_base_of_v<std::forward_iterator_tag, typename std::iterator_traits<Iter>::iterator_category>;
template<typename ForwardIter>
ForwardIter middle_iterator(ForwardIter first, ForwardIter last) {
static_assert(is_forward_iterator_v<ForwardIter>, "middle_iterator requires at least a ForwardIterator to work properly");
auto dist = std::distance(first, last);
if(dist == 0) return first;
return std::next(first, (dist + 1) / 2);
}
template<typename InIter1, typename InIter2, typename OutIter>
OutIter merge(InIter1 left, InIter1 left_end, InIter2 right, InIter2 right_end, OutIter out) {
while(left != left_end && right != right_end) {
if(*left <= *right) {
*out++ = *left++;
} else {
*out++ = *right++;
}
}
if(left != left_end) return std::copy(left, left_end, out);
else return std::copy(right, right_end, out);
}
template<typename Iter, typename Iter2>
void merge_sort(Iter first, Iter last, Iter2 scratch, Iter2 scratch_end) {
const auto mid = middle_iterator(first, last);
if(mid != last) {
const auto scratch_mid = middle_iterator(scratch, scratch_end);
merge_sort(scratch, scratch_mid, first, mid);
merge_sort(scratch_mid, scratch_end, mid, last);
merge(scratch, scratch_mid, scratch_mid, scratch_end, first);
}
}
template<typename Iter>
void merge_sort(Iter first, Iter last) {
auto scratch_buffer = std::vector<typename std::iterator_traits<Iter>::value_type>(first, last);
merge_sort(first, last, scratch_buffer.begin(), scratch_buffer.end());
}
To Do:
Better assert iterator requirements
Add
noexcept
specifications where possibleMaybe add support for move-only types?
-
\$\begingroup\$ I really like your scratch space idea. \$\endgroup\$Brady Dean– Brady Dean2018年08月13日 02:48:00 +00:00Commented Aug 13, 2018 at 2:48
-
\$\begingroup\$ @BradyDean: It's not really my idea, more a kind of general knowledge of efficient merge sort implementations. Even the pseudocode top-down implementation on the wikipedia site for merge sort does a copy first and reuses that for further recursion. \$\endgroup\$hoffmale– hoffmale2018年08月13日 02:51:30 +00:00Commented Aug 13, 2018 at 2:51
-
\$\begingroup\$ Your code gives me a Segmentation fault and I believe it's related to
if (first != last)
. It works correctly when implemented asfirst + 1 != last
. \$\endgroup\$Brady Dean– Brady Dean2018年08月13日 03:09:13 +00:00Commented Aug 13, 2018 at 3:09 -
2\$\begingroup\$ You know the great quote "I have only proved it right, not run it."? \$\endgroup\$Deduplicator– Deduplicator2018年08月13日 11:05:16 +00:00Commented Aug 13, 2018 at 11:05
-
1\$\begingroup\$ @BradyDean: That fix will only work for
RandomAccessIterator
s. I tried to keep my solution as generic as possible (so it will work withForwardIterator
s), so using-
for distance calculation wouldn't work. I could use another call tostd::distance
, but that will worsen the performance even more for those cases, so I preferred to reuse the distance calculated insidemiddle_iterator
. // Another route would be rewriting the check toif(first != last && std::next(first) != last)
\$\endgroup\$hoffmale– hoffmale2018年08月13日 19:05:59 +00:00Commented Aug 13, 2018 at 19:05
Well, interesting.
There's a bug in
merge_sort()
: Subtracting1
fromend
, without knowing whether the range is empty, is Undefined Behavior. Subtract the iterators and compare the difference instead. While you are at it, usestd::distance
for that.There's no need to move the second half out of the way in
sub_merge()
: You are merging from the front, and the first half fits in the first half of the range.Allocating memory is expensive. Consider whether you can't do one allocation in the public function, and pass it through to the helpers as needed.
Also consider whether you wouldn't prefer your ordering stable. It doesn't cost anything but being careful with the merge. Currently, you prefer the second range where you should prefer the first.
Use prefix operators over postfix operators, unless you really need the copy of the old value. The clarity of intent and occasional increased efficiency shouldn't be rejected.
-
\$\begingroup\$ Do you have a reference for number 5? I've heard the exact opposite before, and now I'm confused as to which it is. \$\endgroup\$user1118321– user11183212018年08月13日 02:18:33 +00:00Commented Aug 13, 2018 at 2:18
-
\$\begingroup\$ I believe I fixed number one by writing it as
if (begin + 1 < end)
. Also I am not sure what you mean by number 2 and 4. What do you mean by I prefer the second range over the first? \$\endgroup\$Brady Dean– Brady Dean2018年08月13日 02:22:15 +00:00Commented Aug 13, 2018 at 2:22 -
\$\begingroup\$ @user1118321 The reasonable way to implement
operator++(int)
isauto r = *this; ++*this; return r;
. Depending on how complex and expensive copying is, the compiler might not be able to lower that to justoperator++()
. Still, most Iterators, especially randomaccess ones, are trivially copied. \$\endgroup\$Deduplicator– Deduplicator2018年08月13日 02:24:37 +00:00Commented Aug 13, 2018 at 2:24 -
\$\begingroup\$ @user1118321: The postfix operator has to create a copy and then advance the original iterator, whereas the prefix operator just directly increases the original iterator and returns that. Not much of a difference, unless the copy creation isn't cheap (or the compiler isn't smart enough to elide unnecessary copies). // @BradyDean:
if(begin + 1 < end)
causes undefined behavior ifbegin == end
andend
is an off-the-end iterator. You wantbegin != end
for the comparison. \$\endgroup\$hoffmale– hoffmale2018年08月13日 02:28:48 +00:00Commented Aug 13, 2018 at 2:28 -
1\$\begingroup\$ @BradyDean 1. No, that might go beyond Bounds the other direction, which isn't any better. 2. The important part is moving mid-2-end out of the way is superfluous. 4. Look at what happens if the elements are equal, which is preferred? \$\endgroup\$Deduplicator– Deduplicator2018年08月13日 02:28:56 +00:00Commented Aug 13, 2018 at 2:28
This assumes that the type being copied has a default constructor.
auto left = std::make_unique<typename std::iterator_traits<RandomIt>::value_type[]>(left_size);
auto right = std::make_unique<typename std::iterator_traits<RandomIt>::value_type[]>(right_size);
If you use std::vector
you can use the constructor that takes iterators and create and copy the elements in one go:
// Replace all this with:
auto left = std::make_unique<typename std::iterator_traits<RandomIt>::value_type[]>(left_size);
auto right = std::make_unique<typename std::iterator_traits<RandomIt>::value_type[]>(right_size);
std::copy(begin, mid, left_begin);
std::copy(mid, end, right_begin);
// Replace all the above with.
using ValueType = std::iterator_traits<RandomIt>::value_type;
std::vector<ValueType> leftSide(begin, mid);
std::vector<ValueType> rightSide(mid, end);
But we are still using a copy.
So lets improve the efficiency with moving the objects.
using ValueType = std::iterator_traits<RandomIt>::value_type;
std::vector<ValueType> leftSide(std::make_move_iterator(begin), std::make_move_iterator(mid));
std::vector<ValueType> rightSide(std::make_move_iterator(mid), std::make_move_iterator(end));
We can also improve the merge part by again using move:
*arr_iter = std::move(*left_iter);
But lets reduce the whole expression to make it easier to read:
ValueType& lObject = *left_iter;
ValueType& rObject = *right_iter;
auto& mIter = (lObject < rObject) ? left_iter : right_iter;
ValueType& mObject = *mIter++;
*arr_iter = std::move(mObject);
At the end lets make sure we move rather than copy
// This can be replaced with:
std::copy(right_iter, right_end, arr_iter);
std::copy(left_iter, left_end, arr_iter);
// Move version of copy; called move (no need for move iterators):
std::move(right_iter, right_end, arr_iter);
std::move(left_iter, left_end, arr_iter);
You are making an assumption here (that begin != end)
if (begin < end - 1) {
Easier to use:
if (std::distance(begin, end) > 1) {
Note:
Your types don't need to be move able to use this. If there is no move semantics associated with the object then copy will be used as a backup automatically. But this way if they are moveable then you will use the move operation,
std::merge
? Otherwise, I don't see the point in writing your own loop like that. \$\endgroup\$