Avoid a gratuitous albeit small pessimization when sorting an std::deque
The following line will likely perform a few extra operations if we try to sort an std::deque
with quickmergesort
:
auto pivot = first + 2 * (size / 3) - 2;
Since operator+
is left-associative, the statement above is equivalent to the following one:
auto pivot = (first + 2 * (size / 3)) - 2;
When first
is an std::deque
iterator, it is first incremented by first + 2 * (size / 3)
, then decremented by 2
. Considering that operator+
and operator-
are not trivial for std::deque
iterators, computing 2 * (size / 3) - 2
then incrementing first
is likely more efficient. We can thus avoid this small albeit gratuitous pessimization by transforming the statement as follows:
auto pivot = first + (2 * (size / 3) - 2);
In this specific algorithm, it should not make a noticeable difference, but some algorithms can become dramatically faster with std::deque
(and other random-access containers with not-trivial logic) when making sure that we are avoiding such pessimizations.
Avoid a gratuitous albeit small pessimization when sorting an std::deque
The following line will likely perform a few extra operations if we try to sort an std::deque
with quickmergesort
:
auto pivot = first + 2 * (size / 3) - 2;
Since operator+
is left-associative, the statement above is equivalent to the following one:
auto pivot = (first + 2 * (size / 3)) - 2;
When first
is an std::deque
iterator, it is first incremented by first + 2 * (size / 3)
, then decremented by 2
. Considering that operator+
and operator-
are not trivial for std::deque
iterators, computing 2 * (size / 3) - 2
then incrementing first
is likely more efficient. We can thus avoid this small albeit gratuitous pessimization by transforming the statement as follows:
auto pivot = first + (2 * (size / 3) - 2);
In this specific algorithm, it should not make a noticeable difference, but some algorithms can become dramatically faster with std::deque
(and other random-access containers with not-trivial logic) when making sure that we are avoiding such pessimizations.
The left partition is always smaller
The left partition is always smaller
The left partition is always smaller
The left partition is always smaller
I realized that at least two things could be simplified since asking the question.
half_inplace_merge
takes only one kind of iterators
The function half_inplace_merge
was borrowed from libc++'s implementation of std::inplace_merge
. However, despite its name, std::inplace_merge
might allocate additional memory to perform an out-of-place merge when possible, so when half_inplace_merge
is called, it's sometimes with different kinds of iterators.
However, in our case, the merge is always internal, and swaps elements in the same collection, so the iterators are always the same. We can thus simplify the signature of half_inplace_merge
as follows:
template<
typename InputIterator,
typename OutputIterator,
typename Compare = std::less<>
>
auto half_inplace_merge(InputIterator first1, InputIterator last1,
InputIterator first2, InputIterator last2,
OutputIterator result, Compare compare={})
-> void
{
// ...
}
It doesn't seem to incur any additional performance cost or benefit, but it makes the code simpler and cleaner.
The left partition is always smaller
In libc++'s implementation of std::inplace_merge
, the left and right partitions to merge in a sequence could have any size, so the check to know which of the partitions was smaller was necessary to always make the best of the algorithm.
However, in our case, the sequence to mergesort is always split in two parts, the first one always being the smaller one. Moreover, an additional trick is used to shrink the left partition, making it even smaller, without having any effect on the size of the right partition. In the end, when buffered_inplace_merge
is called, the left partition is always smaller than the right one, and makes the check useless. The function can be simplified as follows:
template<
typename ForwardIterator,
typename Compare = std::less<>
>
auto buffered_inplace_merge(ForwardIterator first, ForwardIterator middle,
ForwardIterator last, ForwardIterator buffer,
Compare compare={})
-> void
{
auto buffer_end = std::swap_ranges(first, middle, buffer);
half_inplace_merge(buffer, buffer_end,
middle, last,
first, compare);
}
Note that dropping the reverse iterators lowered the iterator requirements of the function to forward iterators, even though the overall quick_merge_sort
still requires random-access iterators because of std::nth_element
.
Dropping std::not_fn
also means that the code doesn't use any C++17 feature anymore and works out-of-the-box with a C++14 compiler. Sure, it could be made to work with even C++03, but that's not the point :)