In the old days when C was the predominant language under the hood and STL/templates were Alex Stepanov's dream, in order for programmers to achieve generality for functions and data-containers, the void*
was used as input argument or underlying container type respectively. A typical example is qsort
which is located in <cstdlib>
.
Nowadays, when dealing with legacy code-bases or code written by dinosaurs, it's highly probable that you might stumble on such kind of data structures that most probably keep their elements in void**
buffers. The primary goal of-course would be to gradually move the code base towards use of modern STL containers and algorithms until these old data-structures become obsolete.
However, there are also the dinosaurs. Often, very large dinosaurs like a tyrannosaur that happens to be your manager. In order to convince them of the C++/STL superiority with out questioning the "usability" of the legacy data-structures that are being functioning all these years without a problem and most probable the author is one of them, I decided to engage the issue politically.
What I thought is to craft a template iterator that could deal with such void**
buffers and would act as a bridge with the STL algorithms (e.g., std::sort
, std::copy
etc.).
Down below lies in a very early stage such an iterator:
template<typename T>
class Iterator : public std::iterator<std::bidirectional_iterator_tag, T> {
using T_ptr = std::remove_pointer_t<T>*;
void **pos;
public:
Iterator(void **pos_) : pos(pos_) { }
bool operator==(Iterator const &other) const { return pos == other.pos; }
bool operator!=(Iterator const &other) const { return pos != other.pos; }
bool operator<( Iterator const &other) const { return pos < other.pos; }
bool operator>( Iterator const &other) const { return pos > other.pos; }
bool operator<=(Iterator const &other) const { return pos <= other.pos; }
bool operator>=(Iterator const &other) const { return pos >= other.pos; }
Iterator& operator++() {
++pos;
return *this;
}
Iterator operator++(int) {
Iterator out(*this);
++pos;
return out;
}
Iterator& operator--() {
--pos;
return *this;
}
Iterator operator--(int) {
Iterator out(*this);
--pos;
return out;
}
Iterator& operator+=(int const n) {
pos += n;
return *this;
}
Iterator& operator-=(int const n) {
pos -= n;
return *this;
}
T& operator[](int const n) { *static_cast<T_ptr>(*(pos + n)); }
T& operator*() { return *static_cast<T_ptr>(*pos); }
T_ptr operator->() { return static_cast<T_ptr>(*pos); }
friend Iterator operator+(Iterator const &lhs, int const n) {
Iterator out(lhs);
out.pos += n;
return out;
}
friend Iterator operator-(Iterator const &lhs, int const n) {
Iterator out(lhs);
out.pos -= n;
return out;
}
friend Iterator operator+(int const n, Iterator const &rhs) {
Iterator out(rhs);
out.pos += n;
return out;
}
friend Iterator& operator-(int const n, Iterator const &rhs) {
Iterator out(rhs);
out.pos -= n;
return out;
}
friend int operator-(Iterator const &A, Iterator const &B) { return B.pos - A.pos; }
};
My ambition is to use this iterator in the following manner. Supposedly, I had the following class
(e.g., Foo
):
struct Foo { int val = 0; explicit Foo(int val_) : val(val_) {} Foo() = default; Foo(Foo const&) = default; Foo(Foo &&) = default; Foo& operator=(Foo const&) = default; Foo& operator=(Foo &&) = default; bool operator< (Foo const& rhs) const { return val < rhs.val; } bool operator==(Foo const& rhs) const { return val == rhs.val; } };
And the following buffer of void*
:
Foo f1(1), f2(2), f3(3), f4(4); void* v[] = {&f4, &f2, &f1, &f3};
And then use for example std::sort
to sort v
with respect the Foo
objects that indirectionally contains:
std::sort(Iterator<Foo>(v), Iterator<Foo>(v + sizeof(v) / sizeof(void*)));
I've been careful to inherit my iterator from the std::iterator<
std::bidirectional_iterator_tag
, T>
and not std::iterator<
std::random_access_iterator_tag
, T>
in order to avoid treating the void**
buffer as contiguous buffer
of T
s with what ever implications that might have.
Is this iterator scheme safe? Or are there any quirks or oddities that I must be aware of?
2 Answers 2
Foo f1(1), f2(2), f3(3), f4(4); void* v[] = {&f4, &f2, &f1, &f3};
And then use for example
std::sort
to sortv
with respect theFoo
objects that indirectionally contains:std::sort(Iterator<Foo>(v), Iterator<Foo>(v + sizeof(v) / sizeof(void*)));
This is already a disaster waiting to happen. v
is an array of pointers, and you should only treat it as such. Try to imagine what happens one of the pointers in v
is actually the nullptr
?
That's perfectly legit with the original array, and also when using the old C functions such as qsort
to sort the array itself (given that the predicate is nullptr
aware).
As a matter of fact, using std::sort
on the referenced values as you just did isn't even semantically the same as applying original qsort
straight to the array. You are copying around values while the array remains unchanged, whereby qsort
would have resorted the array while keeping the values constant.
I've been careful to inherit my iterator from the
std::iterator<std::bidirectional_iterator_tag, T>
and notstd::iterator<std::random_access_iterator_tag, T>
in order to avoid treating thevoid**
buffer as contiguous buffer ofT
s with what ever implications that might have.
Except that is is a contiguous buffer of T*
s. Just mind again the difference between T
and T*
.
Given the possibility of nullptr
, it can't even be an iterator for T
in any form. See the is dereferenceable requirement.
Yes, not being able to bridge to stl
sucks. But to be honest, the problem started when someone decided to write this:
Foo f1(1), f2(2), f3(3), f4(4);
void* v[] = {&f4, &f2, &f1, &f3};
instead of this:
Foo v[] = {Foo(1), Foo(2), Foo(3), Foo(4)};
Which are both valid since C99.
Which could then have been a refactored to:
std::array<Foo, 4> v = {Foo(1), Foo(2), Foo(3), Foo(4)};
With v.data()
giving you access to the raw array, if you actually still needed it.
-
\$\begingroup\$ Nice solution using proper ownership semantics. +1 \$\endgroup\$Incomputable– Incomputable2016年10月19日 17:35:43 +00:00Commented Oct 19, 2016 at 17:35
-
\$\begingroup\$ Well spotted, but what if I can make sure (some how, if I tell you how then I will have to kill you :P) that my buffer won't contain
nullptr
s. Do you see any other danger apart from that? \$\endgroup\$Dimitrios Bouzas– Dimitrios Bouzas2016年10月19日 18:46:43 +00:00Commented Oct 19, 2016 at 18:46 -
\$\begingroup\$ @101010 Well, next up you are going to tell me you have polymorphism in that array, and you need a list of pointers for that reason. And then I will have to tell you, that the value-semantics will horribly break with that, as they essentially enforce a static cast on the very first call-by-value, disregarding the additional memory requirements of the inheriting type, worst case leaving you with an access violation / memory corruption as a virtual method (which still resolves correctly!) attempts to access attributes not present in the base class. There goes not only the foot but the whole leg. \$\endgroup\$Ext3h– Ext3h2016年10月19日 19:58:49 +00:00Commented Oct 19, 2016 at 19:58
-
\$\begingroup\$ @101010 Nevermind, it's not going to be that bad. It's "only" going to resolve the static cast, forcefully using the copy / move constructor of the corresponding base class. It's still going to result in unexpected behavior when using any
stl
algorithm internally calling tostd::move
orstd::swap
. \$\endgroup\$Ext3h– Ext3h2016年10月20日 01:31:55 +00:00Commented Oct 20, 2016 at 1:31
Is this iterator scheme safe? Or are there any quirks or oddities that I must be aware of?
I cannot provide an answer for this, as I'm not sure myself, but instead I leave a comment on two regarding your code:
Don't break encapsulation
friend Iterator operator+(Iterator const &lhs, int const n) {
Iterator out(lhs);
out.pos += n;
return out;
}
This is completely unnecessary. You already implemented Iterator::operator+=
, so just use it:
template<typename T>
Iterator<T> operator+(Iterator<T> const & lhs, int const n) {
Iterator<T> out(lhs);
out += n;
return out;
}
This is just a free function. No need to friend
anything.
Is this really what you want?
friend Iterator& operator-(int const n, Iterator const &rhs) {
Iterator out(rhs);
out.pos -= n;
return out;
}
This looks strange to me. First of all the Iterator &
(a typo? This shouldn't even compile.). But especially since your friend Iterator operator-(Iterator const &lhs, int const n)
has the same content. So ...
(iterator - 5) == (5 - iterator)
...?
DRY
All these operator+
, operator-
etc can be implemented using the corresponding "self-assign" operator. This could be automated with a macro. Just as an example:
#define OP_FROM_SELF_ASSIGN_OP(op, lhstype, rhstype) \
template<typename T> \
Iterator<T> operator op (Iterator<T> const & lhs, \
int const rhs) \
{ \
Iterator<T> copy = Iterator<T>(lhs); \
copy op ## = rhs; \
return copy; \
}
But if you're using such a macro, better keep it "behind closed doors", e.g. only in the implementation file. Since these operators are probably the only ones in this specific code base it's very debatable whether it's worth the effort, though.
-
\$\begingroup\$ Nice, very constructive comments. \$\endgroup\$Dimitrios Bouzas– Dimitrios Bouzas2016年10月19日 22:28:06 +00:00Commented Oct 19, 2016 at 22:28
-
2\$\begingroup\$ The macro for assignment operators. Never do that. Ever. Especially since the code is so short and can be simplified even further by using value parameters so that the copy is done automatically. \$\endgroup\$Loki Astari– Loki Astari2016年10月20日 18:01:17 +00:00Commented Oct 20, 2016 at 18:01
-
1\$\begingroup\$ See its shorter to write it out in full rather than use the macro: gist.github.com/Loki-Astari/7b61e217386ad91a0a5c79815ace196c \$\endgroup\$Loki Astari– Loki Astari2016年10月20日 18:06:27 +00:00Commented Oct 20, 2016 at 18:06
-
\$\begingroup\$ @LokiAstari Ah, using a value parameter instead of a const reference is clever. Regarding the macro in itself: I know that preprocessor macros have a bad reputation, but they can be really useful, e.g. with the macro you're able to change the implementation of all those operators at a single place. But as I said: In this specific case it's "very debatable" whether a macro provides reasonable gain. \$\endgroup\$Daniel Jour– Daniel Jour2016年10月20日 20:57:24 +00:00Commented Oct 20, 2016 at 20:57
-
\$\begingroup\$ Again you can achieve the same effect (with the functions) by adding a layer of indirection. But for such a short function I don't see the point. The
clever
part about passing by value was stolen from the standard copy and swap idiom. \$\endgroup\$Loki Astari– Loki Astari2016年10月20日 21:18:11 +00:00Commented Oct 20, 2016 at 21:18
std::iterator
seems to become deprecated there. \$\endgroup\$friend
functions here (because of auto conversions that can be done by the compiler). But I can't think of a situation where it would fail (so only a comment rather than a review). The rest seems fine. \$\endgroup\$sizeof(v) / sizeof(void*)
can be replaced bystd::size(v)
\$\endgroup\$