This code implements a simple wrapper, intended to be used with range-for to manipulate items of a container as overlapping pairs, ie. 3 items will give 2 pairs.
Code review focus: Is pair_range
good modern C++? Any easily avoidable performance issues, like unnecessary copies?
#include <iostream>
#include <typeinfo>
#include <iterator>
#include <optional>
#include <utility>
template<class V, template<class...> class C>
class pair_range {
public:
struct pair_iterator {
using iterator_category = std::forward_iterator_tag;
using difference_type = ssize_t;
using value_type = std::pair<V, V>;
using pointer = ssize_t;
using reference = std::pair<V, V>; // no actual reference available
pair_iterator(const pair_range & sequence, pointer index = 0)
: m_sequence(sequence)
, m_index(index) {
};
const reference operator*() const {
return value_type{m_sequence.at(m_index), m_sequence.at(m_index+1)};
}
//operator->() not possible because pairs are not lvalues
// prefix increment
auto &operator++() {
m_index++;
return *this;
}
// postfix increment
auto operator++(int) {
auto tmp = *this;
++(*this);
return tmp;
}
friend bool operator== (const pair_iterator &a, const pair_iterator &b) {
return a.m_index == b.m_index && a.m_sequence.has_same_container(b.m_sequence);
};
friend bool operator!= (const pair_iterator &a, const pair_iterator &b) {
return a.m_index != b.m_index || !a.m_sequence.has_same_container(b.m_sequence);
};
private:
const pair_range &m_sequence;
pointer m_index = 0;
};
pair_range(const C<V> &source) : m_source_ptr(&source) {
const std::type_info &ti = typeid(*this);
std::cout << "&-constructor, type=" << ti.name() << std::endl;
}
pair_range(const C<V> &&source) : m_source_copy(std::move(source)), m_source_ptr(&m_source_copy.value()) {
const std::type_info &ti = typeid(*this);
std::cout << "&&-constructor, type=" << ti.name() << std::endl;
}
pair_iterator begin() const {
return pair_iterator{*this};
}
pair_iterator end() const {
auto size = m_source_ptr->size();
auto end = static_cast<typename pair_iterator::pointer>(size == 0 ? 0 : size - 1);
return pair_iterator{*this, end};
}
bool has_same_container(const pair_range &other) const {
return m_source_ptr == other.m_source_ptr;
}
private:
const V &at(typename pair_iterator::pointer index) const {
return (*m_source_ptr)[index];
}
const std::optional<const C<V> > m_source_copy;
const C<V> * const m_source_ptr = nullptr;
};
// test code
#include <algorithm>
#include <map>
#include <string>
#include <vector>
#include <variant>
template <class T>
void printPair(const T &pair)
{
const std::type_info &ti = typeid(pair);
std::cout << std::get<0>(pair) << ' ' << std::get<1>(pair)
<< ", type=" << ti.name() << std::endl;
}
int main()
{
std::cout << "--- empty list ---\n";
const std::vector<int> list1{};
for(auto pair : pair_range(list1)) {
printPair(pair);
}
std::cout << "--- 1 elements ---\n";
for(auto &pair : pair_range( std::vector{1} )) {
printPair(pair);
}
std::cout << "--- 2 elements ---\n";
std::vector xy{ std::string("x"), std::string("y") };
for(auto &pair : pair_range(xy)) {
printPair(pair);
}
std::cout << "--- 4 elements ---\n";
const std::string abcd = {"abcd"};
for(const std::tuple<char, char> &pair : pair_range(abcd)) {
printPair(pair);
}
}
Here's the output of that code on my machine:
--- empty list ---
&-constructor, type=10pair_rangeIiSt6vectorE
--- 1 elements ---
&&-constructor, type=10pair_rangeIiSt6vectorE
--- 2 elements ---
&-constructor, type=10pair_rangeINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESt6vectorE
x y, type=St4pairINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES5_E
--- 4 elements ---
&-constructor, type=10pair_rangeIcNSt7__cxx1112basic_stringEE
a b, type=St5tupleIJccEE
b c, type=St5tupleIJccEE
c d, type=St5tupleIJccEE
2 Answers 2
std::cout << "&-constructor, type=" << ti.name() << std::endl;
For this debugging output, consider using __PRETTY_FUNCTION__
(MSVC calls it __FUNCSIG__
) instead of std::type_info
. On GCC, your way prints
&-constructor, type=10pair_rangeIiSt6vectorE
&&-constructor, type=10pair_rangeIiSt6vectorE
whereas a simple std::cout << __PRETTY_FUNCTION__ << '\n';
prints
pair_range<V, C>::pair_range(const C<V>&) [with V = int; C = std::vector]
pair_range<V, C>::pair_range(const C<V>&&) [with V = int; C = std::vector]
Consider naming it pair_range::iterator
instead of pair_range::pair_iterator
.
Consider naming it adjacent_pairs_range
instead of just pair_range
; the operation here is pretty similar to what happens inside std::adjacent_find
and std::adjacent_difference
. In fact, a great test of this code would be for you to reimplement std::adjacent_find
in terms of std::find
-over-an-adjacent_pairs_range
!
Your pair_range
is trying to do something weird where if you pass its constructor an lvalue it captures a pointer to the lvalue, but if you pass it an rvalue it captures a copy of the rvalue and a pointer to its own copy. There are at least three things wrong here:
Value category is not lifetime. It's extremely easy to create a dangling
pair_range
, and worse, in some common cases where the user might think they were creating a danglingpair_range
, the code happens to work anyway! It's hard to form a correct mental model of what's going on with this type.Capturing a pointer into yourself is never a great idea. It generally means that you need to follow the Rule of Three (or Five), instead of the Rule of Zero, because copying or moving such an object won't update its internal pointer-to-self.
for (auto&& p : pair_range(std::vector{1,2,3})) // OK for (auto&& p : std::identity()(pair_range(std::vector{1,2,3}))) // UB, prints garbage
For a simpler mental model, and a simpler implementation (more efficient and twice as easy to unit-test), I recommend giving your type only a constructor from const R&
and always capturing a pointer to that lvalue. Let the caller worry about making sure the lvalue lives long enough to be used; and let the caller forget about any value-category-related gotchas built into pair_range
. (This is the Unix philosophy, btw: simple, composable tools that do one thing, instead of being able to do two things and always having to guess which of them the user wants.)
Typo or misunderstanding (or both) here:
pair_range(const C<V> &&source) : m_source_copy(std::move(source)), m_source_ptr(&m_source_copy.value()) { }
Since source
here is a reference to a const-qualified C<V>
, there's really no point in trying to move from it. std::move
is for when you have an rvalue that you can pilfer the guts of, i.e., source
should be non-const-qualified.
Also, all of these constructors should be explicit
as a matter of course. The only reason to create a non-explicit ("implicit") constructor in C++ is when you want implicit conversion, either from a single argument (e.g. myBignum = 42
instead of myBignum = Bignum(42)
) or from a braced sequence (e.g. myPair = {'x', 42}
instead of myPair = Pair('x', 42)
, or myWidget = {}
instead of myWidget = Widget()
).
//operator->() not possible because pairs are not lvalues
It's possible; you need the arrow proxy idiom. And I think it's worth doing, because if I've got an iterator it
that points into a range of pairs, you know I'm going to want to access it->first
and it->second
. If I can't do that, it's going to be annoying.
As JDługosz mentioned, you can omit operator!=
in C++20; and even in C++17, it's probably more readable in this case to implement its body as return !(a == b);
.
Your pair_iterator
has a data member of reference type. Never use data members of const or reference type! In this particular case, it bites you by making pair_iterator
's assignment operator implicitly deleted (because references are not rebindable). So pair_iterator
does not satisfy C++20's input_or_output_iterator
concept.
Here's some good C++20 advice: If you're trying to implement a type that satisfies some library concept, such as "This type is an iterator" or "This type is a contiguous range" or "This type is regular" or "This type is three-way comparable" — static_assert
is your friend!
static_assert(std::input_or_output_iterator<adjacent_pair_range<int[10]>::iterator>);
If that assertion fails, you know you have work to do.
...And waitaminute, what the heck is going on with these template parameters?!
template<class V, template<class...> class C>
class pair_range {
No no no. It should look more like
template<class R>
class adjacent_pair_range {
where R
is the type of the range to which you're capturing a pointer.
Your pointer
type shouldn't be ssize_t
; that's an integer type. Use std::pair<V,V>*
or perhaps just void
(does that work? I haven't tested). Alternatively, in C++20 and later, just omit pointer
because nobody cares anymore.
Your operator*
returns const reference
, which is wrong: returning "by const value" is never meaningful, and if reference
actually were a reference type (which it's not), then const reference
would be exactly the same type as reference
(because all references are const themselves; what you might have meant here was "reference to const," but that's not the same thing).
Once you implement the arrow proxy idiom, your reference
type will end up being std::pair<V&, V&>
. This is good because it allows people to iterate over and mutate their ranges, like this:
template<class R>
void inclusive_scan(R&& r) {
for (auto&& pair : adjacent_pairs_range(r)) {
pair->second += pair->first;
}
}
std::vector<int> v = {3,1,4,1,5};
inclusive_scan(v);
assert(( v == std::vector<int>{3,4,8,9,14} ));
I recommend fixing all this stuff — and whatever more you need to fix in order to make your type model forward_range
; remember to use static_assert
liberally in your test cases — and then ask a new question to get more reviews. :)
I thought this exact feature was in Eric's Ranges v3 library, but I don't see it in the C++20 docs. You might look at the original library to find it. (It might be adjecent_view
noted for C++23, but cppreference doesn't explain it)
I always use Boost's iterator facade and iterator adaptor helpers to do anything like this. Then you don't have to write all the details, but just fill in the interesting function.
In C++20, operator!=
will be autogenerated from ==
.
⧺SL.io.50 Don't use endl
.
-
\$\begingroup\$ Those
std::endl
are used with purpose to flush the stream. Should really usestd::cerr
; instead, actually. \$\endgroup\$hyde– hyde2021年12月16日 21:43:42 +00:00Commented Dec 16, 2021 at 21:43 -
2\$\begingroup\$ @hyde: Because misuse of
std::endl
has become endemic, I usually recommend<< '\n' << std::flush
instead, to convey to readers that you really intend to flush the stream. Slightly more typing, but saves time handling the review feedback! \$\endgroup\$Toby Speight– Toby Speight2021年12月18日 09:34:57 +00:00Commented Dec 18, 2021 at 9:34