I am currently writing some generic functions to make working with vectors a little "cleaner". Working with the "erase-remove idiom" can be a little confusing sometimes, and make the code look a little dirty, so why not throw it into a function? I have four individual function I wish to have reviewed and made better. I know that there are tons of examples online (trust me, I've looked at most of them).
1. Simply Remove All Instances of a Value From a Vector
template<class T>
void Vec_RemoveAll(vector<T>& vec, T val)
{
vec.erase(remove(vec.begin(), vec.end(), val), vec.end());
}
This function is the simplest of all four.
2. Remove All Instances of a Container of Values From a Vector
template<class Container, class T>
void Vec_RemoveAll(vector<T>& vec, const Container& vals)
{
for (const T& val : vals) vec.erase(remove(vec.begin(), vec.end(), val), vec.end());
}
Simply loops through the container checking the vector for occurrences of each value.
3. Remove All Elements That Meet Unary Predicate's Condition
template<class T, class UnaryPredicate>
void Vec_RemoveAll_If(vector<T>& vec, UnaryPredicate* predicate)
{
vec.erase(remove_if(vec.begin(), vec.end(), predicate), vec.end());
}
I'm not familiar with remove_if, I'm worried I may have not implemented this correctly.
4. Remove Elements That Meet Any of the Predicates' Conditions
template<class T, class UnaryPredicate>
void Vec_RemoveAll_If(vector<T>& vec, const vector<UnaryPredicate> predicates)
{
for (const UnaryPredicate& predicate : predicates)
vec.erase(remove_if(vec.begin(), vec.end(), predicate), vec.end());
}
How would one generalize the type of container that the unary predicates are stored in?
4 Answers 4
Before getting into each function, your last question is important -- how do we generalize the type of container? By passing iterators instead of containers, of course! When we first encounter the C++ Standard Library, the ubiquity of iterators can seem confusing -- why not pass containers around instead? The reason is that different programs have many different needs, and passing a particular kind of container can be very restricting. For example, if my function takes a parameter of type const vector<const Foo>&
, I cannot pass a const vector<Foo>&
or a const vector<const Foo*>&
. If instead I take a pair of iterators with the requirement that *it
is of type const Foo&
, I can use a wider range of container types; vector<Foo>
's and vector<const Foo>
's iterators work directly, and vector<const Foo*>
can be made to work with iterator adaptors.
OK, function by function.
1. Simply Remove All Instances of a Value From a Vector
template<class T>
void Vec_RemoveAll(vector<T>& vec, T val)
{
vec.erase(remove(vec.begin(), vec.end(), val), vec.end());
}
First I'll include my standard disclaimer about using namespace std;
: Don't do it, it can lead to subtle bugs, and you'll get used to seeing std::
everywhere.
Second, pass const T&
, not T
. What if T
is a large type? std::remove
takes const T&
.
Third, I don't think this is a particularly wise function to write. It's a common C++ idiom; once you learn it it should become automatic.
2. Remove All Instances of a Container of Values From a Vector
template<class Container, class T>
void Vec_RemoveAll(vector<T>& vec, const Container& vals)
{
for (const T& val : vals) vec.erase(remove(vec.begin(), vec.end(), val), vec.end());
}
First point is a question: Is this the performance characteristic you want? Intuitively, I would think that vals
would tend to be smaller than vec
, so you probably want to iterate through vals
in the inner loop, not the outer loop. You could achieve this by using std::remove_if
instead and writing a function that determines whether an element is in vals
. You could get even better performance if you required vals
to be sorted.
Second: for (const T& val : vals)
is dangerous. Container
isn't required to contains T
s, just things that are comparable to T
. You would want to write this as for (const auto& val : vals)
. For example, consider that vec
is a vector<string>
and vals
is a vector<const char*>
. Each of the strings in vals
would be copied into a string
while iterating using your version.
3. Remove All Elements That Meet Unary Predicate's Condition
template<class T, class UnaryPredicate>
void Vec_RemoveAll_If(vector<T>& vec, UnaryPredicate* predicate)
{
vec.erase(remove_if(vec.begin(), vec.end(), predicate), vec.end());
}
Almost right, except that you should pass UnaryPredicate
, not UnaryPredicate*
. Your version works fine for function pointers but not function objects. But I raise the same objection as to function 1 -- this is a standard idiom, and hiding it behind a function call is more likely to confuse than enlighten experience C++ programmers who read your code.
4. Remove Elements That Meet Any of the Predicates' Conditions
template<class T, class UnaryPredicate>
void Vec_RemoveAll_If(vector<T>& vec, const vector<UnaryPredicate> predicates)
{
for (const UnaryPredicate& predicate : predicates)
vec.erase(remove_if(vec.begin(), vec.end(), predicate), vec.end());
}
I recommend against writing this function in the first place; it's probably best for any potential user of this function to compose their predicates manually into a single predicate, or in the alternative to write a library to help them do so. That is more generally applicable and removes the requirements that users of your library learn this idiosyncratic API. In the alternative, you can do as you did in function 2, and pass const Container&
instead of trying to specify a specific container.
-
\$\begingroup\$ In case 2, even if
vals
is larger thanvec
, your advice to iterate overvals
in the inner loop is still good, as it's read-only, and therefore much cheaper than writing tovec
in the the tight loop. \$\endgroup\$Toby Speight– Toby Speight2017年03月07日 08:35:41 +00:00Commented Mar 7, 2017 at 8:35 -
\$\begingroup\$ Thank you for your detailed answer. This is the second post of this nature I have created, and I am learning a lot (I'm a 16 year-old student to C++). I have one question. In your response to #2, If I required vals to be sorted, would I still use std::remove_if()? \$\endgroup\$Ethan– Ethan2017年03月07日 22:52:09 +00:00Commented Mar 7, 2017 at 22:52
-
\$\begingroup\$ Yes. Something like
vec.erase(std::remove_if(vec.begin(), vec.end(), [&vals] (const T& t) { return std::binary_search(vals.begin(), vals.end(), t); }));
\$\endgroup\$ruds– ruds2017年03月07日 23:04:03 +00:00Commented Mar 7, 2017 at 23:04 -
\$\begingroup\$ I had a feeling a lambda expression would be the way to go. You said in your first paragraph that passing container iterators would be better than specifying the container type. Wouldn't the iterator still have to be of a specific type since different containers erase elements different ways? If so, how does one specify the type of an iterator that is passed? \$\endgroup\$Ethan– Ethan2017年03月07日 23:33:10 +00:00Commented Mar 7, 2017 at 23:33
-
\$\begingroup\$ There I was referring to passing the container of predicates, as that was the question of generalization you were asking. As you point out, you can't pass arbitrary iterators to represent
vec
since the container type matters. \$\endgroup\$ruds– ruds2017年03月07日 23:35:11 +00:00Commented Mar 7, 2017 at 23:35
(This is an extension to the answer by ruds, which I wholly endorse.)
Instead of hiding the erase-remove idiom, you'll gain more by providing yourself with predicates for the tasks you want to do. Here are predicates that implement your cases 2 and 4:
#include <algorithm>
namespace predicate {
// 2: Match All Instances of a Container of Values
// return a predicate that will return true when its value
// is in the supplied container
template<typename Container>
auto contained_in(const Container& container)
{
using std::begin;
using std::end;
auto from = begin(container);
auto to = end(container);
return [from,to](const auto& val) { return std::find(from, to, val) != to; };
}
// 4: Match Elements That Meet Any of the Predicates' Conditions
// return a predicate that will return true when any of the
// supplied predicates does so
template<typename Container>
auto any_of(const Container& container)
{
return [&container](const auto& val) {
for (const auto& predicate: container)
if (predicate(val))
return true;
return false;
};
}
}
If you have a Lisp background, you might be able to tolerate this more functional-style alternative:
template<typename Container>
auto any_of(const Container& container)
{
using std::begin;
using std::end;
auto from = begin(container);
auto to = end(container);
return [from,to](const auto& val) { return std::any_of(from, to, [&val](auto &p) {return p(val);}); };
}
Now you have the predicates you need for std::remove_if()
:
#include <iostream>
template<typename Container>
std::ostream& print(std::ostream& out, const Container& container)
{
const char *sep="";
for (auto const v: container) {
out << sep << v;
sep = ", ";
}
return out;
}
#include <functional>
#include <numeric>
#include <vector>
int main()
{
std::vector<int> small_numbers(21);
std::iota(small_numbers.begin(), small_numbers.end(), -10);
// Example 2
auto v1 = small_numbers;
auto blacklist = { -4, -3, -2, -1, 0, 1, 2, 3, 4 };
v1.erase(std::remove_if(v1.begin(), v1.end(), predicate::contained_in(blacklist)), v1.end());
print(std::cout, v1) << std::endl;
// Example 4
auto v2 = small_numbers;
auto is_even = [](int i){ return i % 2 == 0; };
auto is_negative = [](int i){ return i < 0; };
std::initializer_list<std::function<bool(int)>> tests{is_even, is_negative};
v2.erase(std::remove_if(v2.begin(), v2.end(), predicate::any_of(tests)), v2.end());
print(std::cout, v2) << std::endl;
}
This produces the expected output (with values -4 to 4 removed from v1
, and all negative or even values removed from v2
):
-10, -9, -8, -7, -6, -5, 5, 6, 7, 8, 9, 10
1, 3, 5, 7, 9
Particularly for example 4, leaning into erase-remove with a predicate allows us to make a single pass over the container instead of repeatedly applying remove()
with different predicates. So each element is tested, and potentially moved, just once. Also, as soon as any predicate matches, we don't apply the others to the same element. So we get two different performance benefits compared to the original code.
-
\$\begingroup\$ My concern is that your predicates have lifetime issues. e.g.
auto pred = predicate::contained_in({1, 2, 3}); vec.erase(std::remove_if(vec.begin(), vec.end(), pred));
has undefined behavior because pred refers by reference to a temporary that has already been destroyed. \$\endgroup\$ruds– ruds2017年03月07日 14:35:53 +00:00Commented Mar 7, 2017 at 14:35 -
\$\begingroup\$ That's true, @ruds - I should have pointed that out. As used above, there's no problem, but one could foresee a user passing a temporary collection as argument. I'll leave it as an exercise for the reader to figure out how to make my starting suggestion bulletproof. ;-) \$\endgroup\$Toby Speight– Toby Speight2017年03月07日 14:39:21 +00:00Commented Mar 7, 2017 at 14:39
Working with the "erase-remove idiom" can be a little confusing sometimes, and make the code look a little dirty, so why not throw it into a function?
[...] I don't think this is a particularly wise function to write. It's a common C++ idiom; once you learn it it should become automatic.
I disagree: I think the erase API of the standard containers is incomplete with regards to removing elements, and extracting such basic functionality behind an external API is the way to go.
You can generalize the functionality of your APIs by:
- abstracting away the container type
- abstracting away the condition behind a generic predicate
- providing convenience predicates, according to your needs
Here's the resulting code:
template<class Sequence, class UnaryPredicate>
void remove_if(Sequence& sequence,
const UnaryPredicate& predicate)
{
sequence.erase(remove_if(std::begin(sequence),
std::end(sequence),
predicate),
std::end(sequence));
}
namespace predicates
{
template<typename T>
auto equals(T value) // example: predicate that is used often
{
return [x = std::move(value)](const T& element)
{
return x == element;
};
}
}
auto vec = std::vector<int>{1, 2, 3, 4, 5};
remove_all(vec, predicates::equals(4));
auto lst = std::list<std::string>{ ... };
remove_all(lst, predicates::equals<std::string>("test"));
auto str = std::string{ "12345" };
remove_all(str, [](char c){ return c % 2; }); // remove odd ASCII codes
Lets create a version that can accept different kinds of containers. Many containers have member functions for these things, lets use that if it's available:
1. Simply Remove All Instances of a Value From a Vector
#include <algorithm>
namespace detail
{
struct pick_3 {};
struct pick_2 : pick_3 {};
struct pick_1 : pick_2 {};
template<typename Cont, typename Item>
auto remove_all(Cont& cont, const Item& item, pick_1)
-> decltype(cont.remove(item), void())
{
cont.remove(item);
}
template<typename Cont, typename Item>
auto remove_all(Cont& cont, const Item& item, pick_2)
-> decltype(void())
{
cont.erase(std::remove(cont.begin(), cont.end(), item), cont.end());
}
}
template<typename Cont, typename Item>
void remove_all(Cont& cont, const Item& item)
{
detail::remove_all(cont, item, pick_1{});
}
This is done using SFINAE, but that is not as complicated as it sounds.
Just test the thing you are checking for in a decltype
in a
trailing return.
2. Remove All Instances of a Container of Values From a Vector
May I suggest not providing this one. It is best done with a custom
predicate. One could be synthesized using std::find
for example.
If speed is needed, a std::set
may be appropriate.
This could look something like this:
std::set<int> si = { 7, 8, 9 };
auto is_in_si = [&si](int i) -> bool { return si.count(i); };
remove_if(vi, is_in_si);
3. Remove All Elements That Meet Unary Predicate's Condition
namespace detail
{
template<typename Cont, typename Pred>
auto remove_if(Cont& cont, Pred&& pred, pick_1)
-> decltype(cont.remove_if(pred), void())
{
cont.remove_if(pred);
}
template<typename Cont, typename Pred>
auto remove_if(Cont& cont, Pred&& pred, pick_2)
-> decltype(void())
{
cont.erase(std::remove_if(cont.begin(), cont.end(), pred), cont.end());
}
}
template<typename Cont, typename Pred>
void remove_if(Cont& cont, Pred&& pred)
{
detail::remove_if(cont, pred, detail::pick_1{});
}
If just one version survives SFINAE, that one will be picked. If more than
one is available for overload-resolution, the first one will be chosen,
thanks to inheritance. This is the purpose of the pick_1
, pick_2
arguments.
4. Remove Elements That Meet Any of the Predicates' Conditions
Again, don't provide this one, but rather use a custom predicate.
Let's make a "multi or" :
template<typename Arg>
bool any_apply(const Arg&)
{
return false;
}
template<typename Arg, typename First, typename... Preds>
bool any_apply(const Arg& arg, First&& first, Preds&&... preds)
{
return
first(arg) ||
any_apply( arg, std::forward<Preds>(preds)... );
}
template<typename Arg, typename... Preds>
auto make_multi_or(Preds&&... preds)
-> auto
{
return [&](const Arg& arg)
{
return any_apply(arg, std::forward<Preds>(preds)... );
};
}
This uses variadic template parameters (the ...
part), something
that is very handy once you get a hang of it. The example code above
should get you started.
You can see it all in action here
Explore related questions
See similar questions with these tags.