I'm creating a cache system for an object (Mesh) that is expensive to create. A Mesh can be created using a small amount of information in a hashable key (MeshKey). There are multiple methods of creating meshes, and the key must identify which method is being used and supply the parameters to the creation function. The client code must be able to add creation methods. The cache looks like this:
class MeshCache {
public:
Mesh get(const MeshKey& key) const {
return _cache.count(key) ? _cache.at(key) : _cache.try_emplace(key, _create(key)).first->second;
}
private:
Mesh _create(const MeshKey& key) const {
return ...;
}
mutable std::unordered_map<MeshKey, Mesh> _cache;
};
Essentially, the cache is handed a MeshKey from which to identify and return a specific Mesh. If it doesn't yet exist, create it and cache it. This means the key represents both the method that should be used to create the mesh and the parameters used to create it (for example, one key could represent a sphere mesh with a specific radius, another a box with three side lengths).
The big problem is MeshCache::_create. If the set of creation methods were known, I could map an index to a generator function, or std::visit a std::variant of all the possible MeshKey types. As the user must be able to add their own creation modes, I decided polymorphism was the way to go:
struct _MeshKey {
virtual size_t hash() const = 0;
virtual bool operator==(const _MeshKey& rhs) const = 0;
virtual Mesh create() const = 0;
virtual std::unique_ptr<_MeshKey> clone() const = 0;
};
class MeshKey {
public:
MeshKey(std::unique_ptr<_MeshKey>&& x): key(std::move(x)) {}
MeshKey(const MeshKey& other): key(other.key->clone()) {}
size_t hash() const {
return key->hash();
}
bool operator==(const MeshKey& rhs) const {
return *key == *rhs.key;
}
Mesh create() const {
return key->create();
}
private:
std::unique_ptr<_MeshKey> key;
};
namespace std {
template <> struct hash<MeshKey> {
size_t operator()(const MeshKey& x) const noexcept { return x.hash(); }
};
}
MeshCache::_create could then simply return key.create().
This lets me create new types of keys like this:
struct SphereMeshKey: public _MeshKey {
SphereMeshKey(float radius): radius(radius) {}
float radius;
size_t hash() const override { return std::hash<float>()(radius); }
bool operator==(const _MeshKey& rhs) const override {
if (typeid(*this) != typeid(rhs)) {
return false;
}
auto obj = static_cast<const SphereMeshKey&>(rhs);
return radius == obj.radius;
}
Mesh create() const override {
return ...; // Generate sphere mesh
}
std::unique_ptr<_MeshKey> clone() const override {
return std::make_unique<SphereMeshKey>(radius);
}
};
Unfortunately, this resulted in some ugly coupling between the values in the key itself and the object creation -- if creating the object required the use of other objects, those must typically be passed into the derived key's constructor. My first several attempts at decoupling resulted in some unattractive use of RTTI and two sets of polymorphic classes (keys and generators) which had to be authored by the client and then registered with the cache object. This complicated the logic and was generally a messy solution.
The problems I see with this design are:
- Heavy coupling between the mesh generation code and the mesh identification (keying) code
- Relying on a polymorphic structure to do the job of a map key (which usually should behave like a POD) requires prototype pattern (
clonefunction) and heap allocation (dynamically sized key) - While I imagine some use of RTTI will likely make its way into the final code, my use of
typeidin the equality operator felt repetitive and likely to create bugs down the road
Is there a more elegant solution? I generally favor readability and maintainability over performance until profiling reveals issues.
2 Answers 2
I generally favor readability and maintainability over performance until profiling reveals issues.
That's a very good mindset to have. I think the lack of elegance in your code comes from the fact that the classes have too many or not the right responsibilities. For example, MeshKey looks like a base class but it's actually a wrapper around a std::unique_ptr to a concrete key object, and _MeshKey is the actual base. Derived classes should also not have to worry about being compared against different derived types. Keep each class as simple as possible.
You explored using std::variant and using polymorphism, but you missed another possibility: just make the cache itself templated on the key type. This simplifies the cache, as it then only has to deal with one specific type.
Consider:
class MeshCache {
public:
template <typename Key>
static const Mesh& get(const Key& key) {
auto& cache = _cache<Key>;
if (auto it = cache.find(key); it != cache.end()) {
return it->second;
} else {
return cache.try_emplace(key, key.create()).first->second;
}
}
private:
template <typename Key>
inline static std::unordered_map<Key, Mesh> _cache;
};
Now you no longer need derived classes nor std::unique_ptrs. So SphereMeshKey becomes:
struct SphereMeshKey {
float radius;
Mesh create() const {
return ...;
}
};
And the code that uses the cache will look like:
auto& earthMesh = MeshCache::get(SphereMeshKey{6.371e6});
The real magic is in the inline keyword used to declare _cache: it will ensure a _cache<Key> is instantiated for every type Key you use, and if multiple translation units cause the same _cache<Key> to be instantiated, the linker will merge them into one.
The above code has two other optimizations: get() returns a reference to a Mesh instead of making a copy of one, and it also uses find() instead of count()+at(), the latter would search the map twice instead of once.
-
\$\begingroup\$ That
inline statictemplate member is nutty. When did C++ add that feature? :o \$\endgroup\$user673679– user6736792022年11月24日 19:31:40 +00:00Commented Nov 24, 2022 at 19:31 -
\$\begingroup\$ @user673679 In C++17 :) \$\endgroup\$G. Sliepen– G. Sliepen2022年11月24日 19:35:51 +00:00Commented Nov 24, 2022 at 19:35
-
\$\begingroup\$ Interesting. This is essentially the singleton pattern, no? It's impossible to have multiple
Meshcaches for instance. Handling this with templates was something I'd considered, but couldn't quite get looking nice. I've never seeninline staticfrom your implementation, thanks for teaching me something today! \$\endgroup\$Joshua Hyatt– Joshua Hyatt2022年11月28日 21:57:06 +00:00Commented Nov 28, 2022 at 21:57 -
\$\begingroup\$ Yes, it effectively is a singleton, which probably is what you want anyway for a cache. You could make the whole
class MeshCachea template (on anenumtype for example) and make multiple caches that way. \$\endgroup\$G. Sliepen– G. Sliepen2022年11月28日 22:23:21 +00:00Commented Nov 28, 2022 at 22:23
Mesh get(const MeshKey& key) const { return _cache.count(key) ? _cache.at(key) : _cache.try_emplace(key, _create(key)).first->second; }
Some things:
countlooks up the key in the map, thenatdoes the same lookup a second time.- We should add some error handling, in case
try_emplacefails - even anassertwould do. - I don't think there's a good reason to hide the fact that
get()might change the cache. The function shouldn't beconst, and the_cachemember variable shouldn't bemutable. - Should we really be returning a
Meshby value (i.e. copying it) instead of returning aconst&or even a pointer?
struct _MeshKey {
...
class MeshKey {
...
struct SphereMeshKey: public _MeshKey {
...
Hmm. I would call this a MeshGenerator instead of a MeshKey, as I think that's its primary purpose. It seems like the user always has to pass the appropriate MeshKey (or MeshGenerator) into the get() function... so I'm not sure there's a reason to store the entire generator object inside the cache. We could probably use a simple std::unordered_map<std::size_t, Mesh> instead.
I think what's missing is to ensure that the generator class contributes to the resulting std::size_t hash value, like so:
struct MeshGenerator {
virtual ~MeshGenerator() { } // note: important!
virtual std::size_t hash() const = 0;
virtual Mesh create() const = 0;
};
struct SphereMeshGenerator : MeshGenerator {
SphereMeshGenerator(float radius): _radius(radius) { }
std::size_t hash() const override {
using std::literals;
auto const hashGenerator = std::hash<std::string_view>()("sphere"sv);
auto const hashRadius = std::hash<float>()(_radius);
return combineHashes(hashGenerator, hashRadius);
}
...
};
where combineHashes() is something like this:
inline std::size_t combineHashes(std::size_t a, std::size_t b) {
return a ^ (b + 0x9e3779b9 + (a << 6) + (a >> 2)); // note: algorithm pilfered from boost hash_combine
}
So to use it we'd do something like:
auto cache = MeshCache();
auto const generator = SphereMeshGenerator(1.f);
auto const mesh = cache.get(generator);
and get() might be:
Mesh const& get(MeshGenerator const& gen) {
auto const key = gen.hash();
if (auto const entry = _cache.find(key); entry != _cache.end())
return entry->second;
auto mesh = gen.create();
auto [entry, inserted] = _cache.insert({ key, std::move(mesh) });
assert(inserted);
return entry->second;
}
Note that inheritance might not be necessary here unless you need to store different MeshGenerator types in the same container. Otherwise we could make get() a template function instead.
(We could even split the key and the generator function into separate std::size_t and std::function<Mesh(void)> parameters, if we really wanted too).
(Or do what G Sleipen said. I did not know that was a thing.)
-
1\$\begingroup\$ The problem of hashes is that they are not guaranteed to be unique. You must account for the possibility of two different
MeshGenerators to return the same hash value. This is whystd::unordered_maprequires bothstd::hash()and anoperator==()for the key type it stores. \$\endgroup\$G. Sliepen– G. Sliepen2022年11月24日 19:34:49 +00:00Commented Nov 24, 2022 at 19:34 -
\$\begingroup\$ A lot of good catches, somehow a virtual destructor on the base class had completely slipped my mind.
findinstead ofcount/atis also a very good point. Not sure about the mutability, at some point down the line, the cache will have to be mutable, because the mutations are opaque to some of my higher-level classes where it doesn't make sense to dropconst-ness. \$\endgroup\$Joshua Hyatt– Joshua Hyatt2022年11月28日 21:55:07 +00:00Commented Nov 28, 2022 at 21:55