I am trying to implement a minimal symbolic algebra library (AKA C.A.S.) in C++ that allows me to utilize delayed evaluation similar to this question from SO.
My class hierarchy is getting a little complex, as it uses CRTP, abstract base classes, and smart pointers (which is where I'm running into issues).
I'm not confident using smart pointers yet, so I'm hoping that I can get feedback on how I'm using them and whether or not there are better ways to handle this. I'm also appreciative of any feedback on how to make this work without running into any unforeseen or classic pitfalls.
I've included background code to demonstrate a simplified version of what I have so far, and my actual question is down below. I appreciate any feedback I can get.
Background:
I have an abstract base class which looks like this:
class BaseSymbolic
: public std::enable_shared_from_this<BaseSymbolic> {
public:
virtual ~BaseSymbolic() {}
inline std::shared_ptr<BaseSymbolic> as_ptr();
virtual std::string as_str() const = 0; // For printing a symbolic.
virtual size_t hash() const = 0; // For comparing two symbolics.
};
inline std::shared_ptr<BaseSymbolic> BaseSymbolic::as_ptr() {
return shared_from_this();
}
With a CRTP base class that looks like this:
template<typename Derived>
class Symbolic
: public BaseSymbolic {
public:
inline Derived &operator~();
inline const Derived &operator~() const;
inline std::shared_ptr<Derived> as_ptr();
std::string as_str() const override;
inline operator std::string() const;
size_t hash() const override;
};
// Using the overloaded binary 'not' operator here is just a style choice. I'm
// sure this could easily be changed to 'derived' or 'const_derived', but I
// like the way it looks as an operator rather than a function call.
template<typename Derived>
inline Derived &Symbolic<Derived>::operator~() {
return static_cast<Derived &>(*this);
}
template<typename Derived>
inline const Derived &Symbolic<Derived>::operator~() const {
return static_cast<const Derived &>(*this);
}
// Uses static_pointer_cast to down cast the shared pointer to the derived type.
template<typename Derived>
inline std::shared_ptr<Derived> Symbolic<Derived>::as_ptr() {
return std::static_pointer_cast<Derived>(shared_from_this());
}
template<typename Derived>
std::string Symbolic<Derived>::as_str() const {
// I realize underscores before names can be bad style,
// I just don't know what to call these vvvvvvv yet.
return static_cast<const Derived &>(*this)._as_str();
}
template<typename Derived>
inline Symbolic<Derived>::operator std::string() const {
return static_cast<const Derived &>(*this)._as_str();
}
template<typename Derived>
size_t Symbolic<Derived>::hash() const {
return static_cast<const Derived &>(*this)._hash();
}
I chose not to use the Derived
/Base
CRTP pattern here for simplicity. Right now, nothing inherits from the derived classes.
I'm separating the virtual call from the CRTP call (as_str()
vs _as_str()
), just to minimize the amount of virtual function calls in my code. I'm pretty sure the vtable gets carried around regardless, but I think it reduces the size of the vtable.
I'm unsure about the as_ptr()
calls, which return a casted std::shared_ptr
. In my tests, I occasionally get away with calling the Symbolic
version, but mostly it defaults to the BaseSymbolic
version.
Now, I have a couple derived classes from Symbolic
, such as symbolic variables, numbers, expressions, etc. that I want to be able to use in a matrix, vector, etc. (which is why the abstract base class is needed). I chose to use CRTP to manage all of these so that I could use templates to handle the types.
An example declaration of a symbolic number:
template<typename T>
class Number
: public Symbolic<Number<T>> {
private:
T value_;
size_t hash_;
public:
explicit inline Number();
// Checks to make sure T is a number internally.
explicit inline Number(const T &m);
inline Number(const Number<T> &m);
inline Number<T> &operator=(const T &rhs);
inline Number<T> &operator=(const Number<T> &rhs);
inline std::string _as_str() const;
inline size_t _hash() const;
inline T value() const;
};
In order to implement delayed evaluation, I create a templated "expression" class for each operation (add, subtract, multiply, etc.) that stores a const reference to the Symbolic
arguments and is also derived from Symbolic
.
An example of a delayed evaluation operation:
template<typename T1, typename T2>
class ExprAdd
: public Symbolic<ExprAdd<T1, T2>> {
private:
const T1 &lhs_;
const T2 &rhs_;
public:
explicit inline ExprAdd(const T1 &lhs, const T2 &rhs);
inline std::string _as_str() const;
inline size_t _hash() const;
};
template<typename T1, typename T2>
inline ExprAdd<T1, T2>::ExprAdd(const T1 &lhs, const T2 &rhs)
: lhs_(lhs),
rhs_(rhs) {}
// The trailing return type is just a style choice. Sometimes as these get
// more complex, the return type can get pretty large, or I want to use
// decltype(...) to keep things simple.
// Also, the `const` modifier for the return type is just so that I can
// store the result in the wrapper class below.
template<typename T1, typename T2>
inline auto operator+(const Symbolic<T1> &lhs, const Symbolic<T2> &rhs)
-> const ExprAdd<T1, T2> {
return ExprAdd<T1, T2>(~lhs, ~rhs);
}
This leads the expression a + b
, where a
and b
are both variables, to return the type const ExprAdd<Variable, Variable>
My goal in the end is to check whether the two are both numbers and to "collapse" the expression if they are, meaning const ExprAdd<Number<...>, Number<...>>
gets replaced with a.value() + b.value()
.
But I also don't want to have the user keep track of the type, meaning I don't want them to use template arguments for creating variables.
I don't want this:
Number<int> a = 2;
Number<int> b = 3;
ExprAdd<Number<int>, Number<int>> c = a + b;
I do want this:
sym a = 2;
sym b = 3;
sym c = a + b;
Actual Question:
I created a derived class called sym
which holds a smart pointer to a BaseSymbolic
. I chose to use the 'rule of zero' for the class, because all it does is hold a smart pointer. Not sure if I explicitly need the copy and move functions for this.
Note that the sample below is simplified just for integers. There will be other constructors to handle other types in real life.
Also, the symbolic manipulations will be handled elsewhere, this is just to demonstrate one specific aspect of the code, namely the smart pointer class.
class sym
: public Symbolic<sym> {
private:
// Not sure if this needs to be a 'shared_ptr'. Might be a 'unique_ptr'.
std::shared_ptr<BaseSymbolic> ptr_;
public:
// Other ctors for other types. For example, there might be:
// explicit inline sym(std::string name) for a named variable.
inline sym(int m);
// These seem wrong. I want to be able to accept 'const' expressions,
// but don't mind copying. Right now, this accepts the rvalues
// from the addition operator.
template<typename Derived>
inline sym(const Symbolic<Derived> &&m);
template<typename Derived>
inline sym &operator=(const Symbolic<Derived> &&rhs);
inline std::string _as_str() const;
inline size_t _hash() const;
};
// Constructor
inline sym::sym(int m)
: ptr_(std::make_shared<Number<int>>(m)) {}
// This is most certainly wrong, but it works somehow.
template<typename Derived>
inline sym::sym(const Symbolic<Derived> &&m)
: ptr_(std::make_shared<Derived>(std::move(~m))) {}
// Assignment Operators
// This is also most certainly wrong.
template<typename Derived>
inline sym &sym::operator=(const Symbolic<Derived> &&m) {
ptr_ = std::make_shared<Derived>(std::move((~m)));
return *this;
}
// Member Functions
inline std::string sym::_as_str() const {
return ptr_->as_str();
}
inline size_t sym::_hash() const {
return ptr_->hash();
}
My question is whether or not I am implementing the smart pointer wrapper class correctly. I'm having trouble understanding how this might be used to chain expressions such as c = c + a
while maintaining the reference to c
. My first thought is that I'll need a temporary reference while I replace the smart pointer with a new value using reset
.
I would appreciate any feedback, especially about how to handle the smart pointer wrapper class. So far, my solution seems tenuous at best, and I would like to learn how to improve on it.
Here is a link to some working code.
1 Answer 1
Code:
The only real utility of the
inline
keyword nowadays is to allow function definitions to be placed header files. However, template functions, or functions in template classes, don't need to be markedinline
. There's also no need to mark both the declaration and the definition as inline.Use
std::size_t
, notsize_t
(the former is the C++ version, the latter is the C version).Don't overload random operators because it looks cool! Call it
as_derived
or something.The
Number
class stores thehash_
of the string (o.O), but thehash()
function returns a hash of the value? Seems weird.Use the default implementations of the copy and move constructors and assignment operators where appropriate, e.g.:
Number(Number const&) = default; Number& operator=(Number const&) = default; Number(Number&&) = default; Number& operator=(Number&&) = default;
If we allow assignment of a value to
Number
, it might be cleaner to implement const and non-const reference versions of thevalue()
member function instead:T& value() { return value_; } T const& value() const { return value_; }
Having different, non-virtual
as_ptr()
functions inBaseSymbolic
andSymbolic
is quite confusing. Do we actually need theBaseSymbolic
version?The
operator std::string() const
inSymbolic
should be made explicit. And then deleted, since it does exactly whatas_str()
does.There doesn't seem to be any reason to split off
_as_str()
fromas_str()
and_hash()
fromhash()
inSymbolic
. SinceNumber
andExprAdd
are ultimately derived fromBaseSymbolic
, we can avoid just override them there, and leave them as abstract inSymbolic
.operator+()
to make theExprAdd
will work fine without theoperator~
!Storing references will cause problems, e.g.:
sym a = sym(3) + sym(5); a.as_str();
will crash (or something). Both those temporaries are destroyed beforea.as_str()
is evaluated. We should probably store symbols by value and provide a way to store references when explicitly requested by the user (see below).sym a = 5; a->as_ptr();
will crash (or something) because thesym
isn't created as astd::shared_ptr
.Note that allocating symbols on the heap as shared pointers adds overhead, if we don't need to build a tree of arbitrary expressions at run-time, this is quite a steep price to pay to save some keystrokes (and seems quite contrary to trying to optimize the math expressions).
We can actually replace
std::shared_ptr
in thesym
class withstd::unique_ptr
. This further highlights that we don't actually needoperator~()
at all. The only time we need the derived type is when copying the object. So we can add avirtual BaseSymbolic* clone() const
function instead.
Suddenly, we don't need std::enable_shared_from_this
, or as_ptr()
, and we can delete a fair amount of code:
#include <iostream>
#include <string>
#include <memory>
class BaseSymbolic {
public:
virtual ~BaseSymbolic() {}
virtual BaseSymbolic* clone() const = 0;
virtual std::string as_str() const = 0;
virtual std::size_t hash() const = 0;
};
template<typename Derived>
class Symbolic : public BaseSymbolic {
public:
// ̄\_(ツ)_/ ̄
};
template<typename T>
class Number : public Symbolic<Number<T>> {
private:
T value_;
public:
explicit Number(T value): value_(value) { }
Number(Number const&) = default;
Number& operator=(Number const&) = default;
Number(Number&&) = default;
Number& operator=(Number&&) = default;
Number* clone() const override { return new Number(value_); }
std::string as_str() const { return std::to_string(value_); }
std::size_t hash() const { return std::hash<T>{}(value_); }
T& value() { return value_; }
T const& value() const { return value_; }
};
template<typename T1, typename T2>
class ExprAdd : public Symbolic<ExprAdd<T1, T2>> {
private:
T1 lhs_;
T2 rhs_;
public:
explicit ExprAdd(const T1 &lhs, const T2 &rhs): lhs_(lhs), rhs_(rhs) { }
ExprAdd(ExprAdd const&) = default;
ExprAdd& operator=(ExprAdd const&) = default;
ExprAdd(ExprAdd&&) = default;
ExprAdd& operator=(ExprAdd&&) = default;
ExprAdd* clone() const override { return new ExprAdd(lhs_, rhs_); }
std::string as_str() const override { return lhs_.as_str() + " + " + rhs_.as_str(); }
std::size_t hash() const override { return lhs_.hash() ^ rhs_.hash(); }
};
template<typename T1, typename T2>
auto operator+(const Symbolic<T1> &lhs, const Symbolic<T2> &rhs) -> const ExprAdd<T1, T2> {
return ExprAdd<T1, T2>(lhs, rhs);
}
class sym : public Symbolic<sym> {
private:
std::unique_ptr<BaseSymbolic> ptr_;
public:
sym(int m):
ptr_(std::make_unique<Number<int>>(m)) { }
template<class Derived>
sym(Symbolic<Derived> const& m):
ptr_(m.clone()) { }
sym(sym const& other):
ptr_(other.ptr_->clone()) { }
sym& operator=(sym const& other)
{
sym temp(other);
ptr_ = std::move(temp.ptr_);
return *this;
}
sym(sym&&) = default;
sym& operator=(sym&&) = default;
sym* clone() const override { return new sym(*this); }
std::string as_str() const override { return ptr_->as_str(); }
std::size_t hash() const override { return ptr_->hash(); }
};
int main() {
sym a = 5;
sym b = 10;
sym c = a + b;
std::cout << c.as_str() << std::endl;
}
We could probably get rid of the Symbolic
class too, but maybe there's a better way to organize things...
Design:
It's a little unclear whether you want to be able to construct an expression tree at run-time (e.g. parsing arbitrary string input from the user and calculating the result), or at compile time (e.g. improving C++ code efficiency with expression templates).
These are two different things. The latter (which is what that stackoverflow question seems to be about) shouldn't require virtual functions or run-time polymorphism at all.
Compile-time:
For static lazy evaluation we don't need the base-class and inheritance hierarchy. As long as we implement the same static interface with our types, we can use them in template functions without caring what type they are.
We can use the auto
keyword, and some utility functions to save the user bit of typing:
#include <memory>
#include <string>
#include <utility>
template<class T>
struct Number
{
T value;
std::string to_string() const { return std::to_string(value); }
std::size_t hash() const { return std::hash<T>()(value); }
};
template<class T>
Number<T> make_num(T value)
{
return{ value };
}
template<class T>
struct Ref
{
T* thing;
std::string to_string() const { return thing->to_string(); }
std::size_t hash() const { return thing->hash(); }
};
template<class T>
Ref<T> make_ref(T& value)
{
return{ &value };
}
template<class T>
Ref<const T> make_ref(T const& value)
{
return{ &value };
}
template<class LeftT, class RightT>
struct Add
{
LeftT left;
RightT right;
std::string to_string() const { return left.to_string() + " + " + right.to_string(); }
std::size_t hash() const { return left.hash() ^ right.hash(); }
};
template<class LeftT, class RightT>
Add<LeftT, RightT> operator+(LeftT a, RightT b) // copy
{
return{ std::move(a), std::move(b) }; // and move
}
#include <iostream>
int main()
{
auto a = make_num(5);
auto b = make_num(10);
auto sum = a + make_ref(b);
std::cout << sum.to_string() << std::endl;
auto test = make_num(3) + make_num(34);
std::cout << test.to_string() << std::endl;
}
We could even use free-functions instead of member functions for to_string()
and hash()
. If we supply appropriate overloads for standard numeric types (int
, float
, etc.), then we don't need the Number
class at all.
Since the type information is immediately available, it's also simple (in an awful C++ way) to overload various operators to do particular things, e.g.
namespace impl
{
// "simple"
template<class A, class B>
struct AddImpl
{
using return_type = Add<A, B>;
static return_type add(A a, B b) // general version
{
return{ std::move(a), std::move(b) };
}
};
template<class A, class B>
struct AddImpl<Number<A>, Number<B>>
{
using return_type = Number<decltype(A() + B())>;
static return_type add(Number<A> const& a, Number<B> const& b) // specific version
{
return return_type{ a.value + b.value }; // trivial types, so just do this now...
}
};
} // impl
template<class LeftT, class RightT>
typename impl::AddImpl<LeftT, RightT>::return_type operator+(LeftT const& a, RightT const& b)
{
return impl::AddImpl<LeftT, RightT>::add(a, b);
}
#include <iostream>
int main()
{
auto a = make_num(5);
auto b = make_num(10);
auto sum = a + make_ref(b);
std::cout << sum.to_string() << std::endl;
auto test = make_num(3) + make_num(34);
std::cout << test.to_string() << std::endl;
}
Run-time:
If you do need to construct the tree at run-time, I'd suggest watching one of Sean Parent's talks, which describe an alternative approach to run-time polymorphism:
- Going Native 2013 - Inheritance is the Base Class of Evil.
- NDC London 2017 - Better Code: Runtime Polymorphism.
In short, we make our expression / number types adhere to a static interface (concept) as above. Then we can use a type-hiding class to add the run-time polymorphism in a very self-contained fashion:
#include <memory>
#include <string>
#include <utility>
class Expression
{
public:
template<class T>
explicit Expression(T e):
model(std::make_unique<Model<T>>(std::move(e))) { }
Expression(Expression const& other):
model(other.model->clone()) { }
Expression& operator=(Expression const& other)
{
model.reset(other.model->clone());
return *this;
}
Expression(Expression&& other):
model(std::move(other.model)) { }
Expression& operator=(Expression&& other)
{
model = std::move(other.model);
return *this;
}
std::string to_string() const { return model->to_string(); }
std::size_t hash() const { return model->hash(); }
private:
struct Concept
{
virtual Concept* clone() const = 0;
virtual std::string to_string() const = 0;
virtual std::size_t hash() const = 0;
};
template<class T>
class Model : public Concept
{
public:
explicit Model(T const& e):
expression(e) { }
explicit Model(T&& e):
expression(e) { }
virtual Model* clone() const { return new Model(expression); }
virtual std::string to_string() const { return expression.to_string(); }
virtual std::size_t hash() const { return expression.hash(); }
private:
T expression;
};
std::unique_ptr<Concept> model;
};
This works fine together with compile-time code above so we can do things like this:
#include <iostream>
int main()
{
auto a = Expression(make_num(5));
auto b = make_num(10);
auto sum = a + Expression(make_ref(b));
std::cout << sum.to_string() << std::endl; // evaluate!
}
It's a little wordy when creating the expressions, but if we're parsing user input to make an expression tree, that's unlikely to be a big problem.
We could get more towards the desired syntax with a utility function, e.g.:
template<class T, typename = std::enable_if_t<std::is_integral<T>::value || std::is_floating_point<T>::value>>
Expression make_expr(T number)
{
return Expression(Number<T>{ number });
}
But note again that the main point of adding the run-time polymorphism is to allow us to construct arbitrary expression trees at run-time. We can probably make the syntax "nice-enough" with the compile-time version if we don't need this feature.
-
\$\begingroup\$ This is an amazing, well thought out, and comprehensive answer that addresses nearly every one of my concerns and problems. Bullets 1-3, 5, 6, 8-11 are great advice. For point 4, I'm hashing the string to avoid ambiguity in comparing the numbers and expressions. For point 7, I need the
BaseSymbolic
version in some other functions that do comparisons, etc. \$\endgroup\$Adam– Adam2019年01月27日 19:42:30 +00:00Commented Jan 27, 2019 at 19:42 -
\$\begingroup\$ The concepts are something I haven't played with yet, but I've read a few articles and seen some of the videos you linked to. I will definitely look into implementing that. The big reason for the abstract base class was to allow these to be used in a
std::vector
within aMatrix<sym>
class that I define elsewhere. Mostly, I'm looking for compile-time expression trees, but I want to enable these to be used from a shared library, which will require some amount of run-time code. \$\endgroup\$Adam– Adam2019年01月27日 19:48:26 +00:00Commented Jan 27, 2019 at 19:48
Explore related questions
See similar questions with these tags.