I wrote this class to perform simple template-set operations on boost::unordered_set
. These operations are part of a larger code base. Calling them will be explicit with SetUtilities::set_intersection(setA, setB)
.
The main reason I wanted this was to optimize intersection operations (using constant time find in the larger set). After taking to heart the comments from here, I edited the class and provided the rest of the methods.
For now I want to use Boost instead of C++11. I can always do this in the future.
#ifndef SET_UTILITIES
#define SET_UTILITIES
#include <boost/unordered_set.hpp>
#include <boost/foreach.hpp>
class SetUtilities{
public:
//!set OPERATION intersection for boost::unordered_set
template<typename Type>
static boost::unordered_set<Type>
set_intersection(const boost::unordered_set<Type> &setA,
const boost::unordered_set<Type> &setB)
{
//iterate the smaller set O(n)
if(setA.size() <= setB.size()){
boost::unordered_set<Type> intersectionSet;
BOOST_FOREACH(const Type &element, setA){
//find element in O(1)
if(set_hasElement(setB,element)){
intersectionSet.insert(element);
}
}
return intersectionSet;
}else{
return set_intersection(setB,setA);
}
}
//!set OPERATION union for boost::unordered_set
template<typename Type>
static boost::unordered_set<Type>
set_union(const boost::unordered_set<Type> &setA,
const boost::unordered_set<Type> &setB)
{
if(setA.size() <= setB.size()){
boost::unordered_set<Type> unionSet = setB;
//iterate the smaller set O(n)
BOOST_FOREACH(const Type &element, setA){
unionSet.insert(element);
}
return unionSet;
}else{
return set_union(setB,setA);
}
}
//!set OPERATION difference A - B
template<typename Type>
static bool
set_difference(const boost::unordered_set<Type> &setA,
const boost::unordered_set<Type> &setB)
{
boost::unordered_set<Type> differenceSet;
//is size are not equal, sets are not equal
BOOST_FOREACH(const Type &element,setA){
if(set_hasElement(setB,element) == false){
differenceSet.insert(element);
}
}
return differenceSet;
}
//!set UTILITY funciton to determine if a set contains an element
template<typename Type>
static bool
set_hasElement(const boost::unordered_set<Type> &mySet,
Type element)
{
return mySet.find(element) != mySet.end();
}
//!set equality test for boost::unordered_set
template<typename Type>
static bool
are_equal(const boost::unordered_set<Type> &setA,
const boost::unordered_set<Type> &setB)
{
//is size are not equal, sets are not equal
if(setA.size() != setB.size()){
return false;
}else{
BOOST_FOREACH(const Type &element, setA){
if(set_hasElement(setB,element)){
return false;
}
}
return true;
}
}
};
#endif
I believe that this version is very readable and explicit. I think that adding typedef
s with stuff like this in a single class would create more confusion. Let me know what you think.
1 Answer 1
Fix compiler errors
Your set_difference
function does not compile. You have declared it returning bool
, but then return differenceSet;
in the function body. This is a type mismatch: your function should return a boost::unordered_set
.
Reinventing the wheel
boost::unordered_set
already supports operator==
for equality comparison. You don't need to implement it yourself -- your are_equal()
function is unnecessary.
Additionally, your set_union
can be simplified. You can replace your foreach-loop:
BOOST_FOREACH(const Type &element, setA){
unionSet.insert(element);
}
with using the two-iterator overload of insert()
:
unionSet.insert(setA.begin(), setA.end());
This should be no less efficient than what you have already written, and is slightly more readable.
Remove irrelevant comments
set_difference()
contains this comment: //is size are not equal, sets are not equal
. This has no relevance to what the function actually does. This, combined with the type error listed above, leads me to believe that this is a copy-paste error. When you do that, make sure things like this don't slip through.
Boolean comparison
You don't need to explicitly compare Boolean conditions against false
:
if(set_hasElement(setB,element) == false)
Can be more succinctly stated:
if(!set_hasElement(setB,element))
Naming
This is a minor nitpick, but set_hasElement
combines underscore_separated
and CamelCase
capitalization conventions. I would rename it to set_contains()
for name consistency with the rest of your functions.
Namespaces
Creating a class with only public static
functions is very Java-esque. C++ allows namespaces, which are a way of organizing functions and classes. Your set functions might be better as free functions placed in a namespace SetUtilities
instead of a class.
-
\$\begingroup\$ Good point about the last thing. With no data members, and especially everything
public
, aclass
is not needed (not even astruct
). \$\endgroup\$Jamal– Jamal2014年07月16日 16:57:37 +00:00Commented Jul 16, 2014 at 16:57 -
\$\begingroup\$ Good points. I just wrote this today to address comments in the other questions and it lead to errors with difference which I didn't really need. It should return a set not a bool. I put it as == false to be more explicit and appease the readability nazi's. I like ! better as well. Good point about namespace. I wanted a static class like java but namespace is a better fit. \$\endgroup\$pbible– pbible2014年07月16日 17:55:57 +00:00Commented Jul 16, 2014 at 17:55