Implementing a binary tree in C++ using "std::unique_ptr"

Question 1

I have implemented a simple binary search tree class in C++ using std::unique_ptr objects to hold the pointers to each node. In doing this I have come across a situation that is somewhat questionable in the Delete method. Since I can't copy the unique pointers, but I need a way to traverse the tree with a "temporary" pointer, I had to resort to using a raw pointer to do the traversal.

This is the section I am talking about (In the Delete method):

// Need to use a raw pointer because I can't assign to a std::unique_ptr
Node* n = node.get();
while (n->right) {
 // Using a reference to a unique_ptr instead of a raw pointer will cause this line to fail to compile
 n = n->right.get();
}

Is this an acceptable thing to do?

It compiles and runs. And since the underlying pointer is still managed by a unique_ptr object it will be deleted properly. Everything about this works fine, but it just doesn't feel right for some reason.

The full source code is shown below (Feel free to comment on anything else that may be improved as well):

#include <iostream>
#include <memory>
using std::make_unique;
using std::unique_ptr;
template <typename T>
class Tree {
 struct Node {
 Node(const T& value)
 : value(value), left(nullptr), right(nullptr) {}
 T value;
 unique_ptr<Node> left;
 unique_ptr<Node> right;
 };
public:
 Tree() : root_(nullptr) {}
 // Insert a value into the tree
 void Insert(const T& value) {
 Insert(root_, value);
 }
 // Delete a value from the tree
 bool Delete(const T& value) {
 return Delete(root_, value);
 }
 // Search the tree for a node and return true if it is found
 bool Contains(const T& value) const {
 return Contains(root_, value);
 }
private:
 void Insert(unique_ptr<Node>& node, const T& value) {
 if (not node) {
 node = make_unique<Node>(value);
 }
 else {
 value < node->value
 ? Insert(node->left, value)
 : Insert(node->right, value);
 }
 }
 bool Delete(unique_ptr<Node>& node, const T& value) {
 if (not node) {
 return false;
 }
 else if (value == node->value) {
 if (node->left) {
 unique_ptr<Node>& right = node->right;
 node = move(node->left);
 if (right) {
 // Need to use a raw pointer because I can't assign to a std::unique_ptr
 Node* n = node.get();
 while (n->right) {
 // Using a reference to a unique_ptr instead of a raw pointer will cause this line to fail to compile
 n = n->right.get();
 }
 n->right = move(right);
 }
 }
 else {
 node = move(node->right);
 }
 return true;
 }
 else {
 return value < node->value
 ? Delete(node->left, value)
 : Delete(node->right, value);
 }
 }
 bool Contains(const unique_ptr<Node>& node, const T& value) const {
 if (not node) {
 return false;
 }
 else if (node->value == value) {
 return true;
 }
 else {
 return value < node->value
 ? Contains(node->left, value)
 : Contains(node->right, value);
 }
 }
 unique_ptr<Node> root_;
};

Question 2

As to your specific question: it might be a matter of opinion, but I'd say that a raw pointer is exactly correct for referencing something you don't own.

Question 3

Is there a reason you can't use std::shared_ptr?

Question 4

@tinstaafl, I could implement the whole thing using std::shared_ptr, but using std::unique_ptr is generally more efficient (theres no extra memory overhead and move operations are faster than copy). I guess using a raw pointer to refer to something owned by a std::unique_ptr is essentially the same as creating a std::weak_ptr from a std::shared_ptr?

Question 5

@tjwrona1992, IIRC having weak ptr locks object counter on shared ptr, thus forcing it to exist a little while longer.

Question 6

Please do not update the code in your question to incorporate feedback from answers, doing so goes against the Question + Answer style of Code Review. This is not a forum where you should keep the most updated version in your question. Please see what you may and may not do after receiving answers .

Question 7

On raw and smart pointers

First, I'll address your specific question, about using a raw pointer to refer to nodes in the tree. I believe you're doing exactly the right thing here, because smart pointers convey information about ownership of the resource.

We are looking at nodes that are owned by the tree (actually, each node is owned by its parent), and it's entirely appropriate that that ownership is encoded as a std::unique_ptr; that says that this is the only place that's responsible for deleting the node when it's no longer required.

Our Delete method walks nodes from the root downwards; it needs to refer to the nodes, but only during the execution of the method (it doesn't store pointers anywhere). A raw pointer is perfectly suitable here.

General review

Is <iostream> used anywhere? I think it can be omitted.

If this is a header file, avoid bringing std::make_unique and std::unique_ptr into the global namespace, as that will affect every file that includes the header. It's reasonable to do so inside the function bodies, but at that point it's likely better to simply use the qualified names. Including them within the scope of the type is a middle ground where opinions will differ.

We should include <utility> for std::move (and the same comments apply to importing that name into ::).

Consider turning tests around to test a positive first, which slightly reduces cognitive load. Instead of:

if (not node) {
 node = make_unique<Node>(value);
}
else {
 value < node->value
 ? Insert(node->left, value)
 : Insert(node->right, value);
}

This is easier to reason about:

if (node) {
 value < node->value
 ? Insert(node->left, value)
 : Insert(node->right, value);
}
else {
 node = make_unique<Node>(value);
}

Actually, that ternary can be narrowed to within the argument list (you can add more parens according to taste):

if (node) {
 Insert(value < node->value ? node->left : node->right, value);
}
else {
 node = make_unique<Node>(value);
}

On a related note, if we return early from within if, we can just flow into the next code without else:

if (not node) {
 return false;
}
else if (node->value == value) {
 return true;
}
else {
 return value < node->value
 ? Contains(node->left, value)
 : Contains(node->right, value);
}

That can become

if (not node) {
 return false;
}
if (node->value == value) {
 return true;
}
return Contains(value < node->value ? node->left : node->right, value);

We might be able to remove some repetition by adding a method to find the insertion point for a value - Contains() can return true if that insertion point has a value that compares equal, and Delete() can start from that point in removing the value.

If the tree could be large, then perhaps iteration may be better than recursion for walking the tree. That means we'll need pointers in some places where we currently have references, and some things may have to lose const qualification, but may give a performance improvement and save on stack usage (or may not, depending on how your compiler manages to optimise the recursive call - always check these things!).

Further directions

Try to evolve the public interface to supporting standard container operations. This probably begins with creating iterators and the member functions that return them.

Consider maintaining a balanced tree - you'll want to read up on the theory if you're not already familiar with the techniques to do this.

You haven't shown your unit tests here, but it's worthwhile occasionally running them with a memory checker such as Valgrind. As well as ensuring we don't leak memory, that will also help identify any uses of dangling pointers or uninitialized values, which an ordinary run might not.

Question 8

How would I go about turning the recursion into iteration? I've heard that it can be much more efficient to do things that way but I can't really see how it should be done. Are there any good examples on how to do this?

Question 9

And as for the other comments - iostream is used later in the file for some testing routines I wrote but you're right, it is not needed for the tree. Also the tree class will eventually be in a header so I will remove the using statements.

Question 10

Additionally it seems in Visual Studio I don't need #include <utility> or using std::move in order to call move... The code compiles and runs fine... that's a little scary.

Question 11

Worthwhile reading on SO: Way to go from recursion to iteration

Question 12

I looked into replacing the recursion with iteration, but for binary trees the code starts to get pretty messy (you need to manage a lot of pointers to pointers). For the sake of readability I feel recursion is the better approach in this case. I don't foresee myself using it in a situation with huge datasets that could cause stack overflows.

Question 13

Here are some ideas to help you improve your program.

Fix the bug

If we create a Tree<std::string> t and then insert "epsilon", "gamma", "beta", "delta", "alpha", "mu", and "kappa" in that order, we have a tree like this:

tree before deletion

If we then call t.Delete("beta"); we invoke undefined behavior and mess up the tree by effectively deleting two nodes (both "beta" and "delta" are now gone):

tree after deletion

Use raw pointers

The usual advice is to replace raw pointers with smart pointers, but in this particular case, I would advise going the other way for a number of reasons. First, the std::unique_ptr is not part of the public interface of the tree, so changing it to a raw pointer would not affect users of the tree at all. Second, the smart pointer is only getting in the way here, since it would be much simpler to write this code with plain pointers. Third, a plain pointer version would better allow you to have a single private concrete implementation class using a void * for the data. The template would then only have a very light wrapper to convert to and from void * to type T. This makes it certain that if you have multiple kinds of trees, there will only be one code instance that actually implements the tree manipulations and each templated version will only have a small bit of code overhead per type.

Avoid polluting the global namespace with `using`

The using std::make_unique and using std::unique_ptr in this code are at file scope. If this is intended to be an include file, then that effectively brings those to into the global scope which could cause name collisions. Instead, you could simply remove those and simply add std:: where appropriate or omit them entirely if you follow the previous suggestion.

Don't define a constructor that only initializes data

The only thing your Tree constructor does is to initialize root_ to a fixed value. It's generally better to instead use member initializers instead. See Cpp Guidelines C.45. I'd also simplify the Node constructor by using member initializers for left and right.

Provide a usable interface

The way it's currently defined, it's not possible to do anything at all with nodes once they are stored in the Tree except to determine whether they're stored in the tree. That's unlikely to be a useful interface in the real world.

Use only necessary `#include`s

The #include <iostream> line is not necessary and can be safely removed.

Consider using more idiomatic C++

While your use of not is not technically wrong, you should be aware that many experienced C++ programmers will be unfamiliar with alternative operators, and so if others read your code, it might be an impediment to their understanding of it.

Question 14

Could you expand on the idea of using void * in the implementation and then only having a template wrapper? From what I understand templates should always be preferred over using void * but maybe I have missed the point. Is void * fine to use as long as it doesn't appear in the public interface?

Question 15

I've fixed the bug. Thanks for pointing it out :)

Question 16

The use of void * as the data type is just one way to have a concrete implementation which lies beneath (and is hidden under) a templated class. Imagine creating the Tree with a concrete type such as int. It's usually easier to make sure everything works. Then once it works, change to void * and you can store pointers to anything instead. There is very little reason to use void * in modern C++, but IMHO, using them within a class is fine. Most implementations of operator new use it just that way, for example.

Question 17

unique_ptr<Node>& right = node->right;
node = move(node->left); // it will destroy the previous node and right nodes
if (right) // right will be nullptr as the right nodes would have been destructed in the previous line
{...}

In your Delete() function, this logic seems to be flawed. When you move your node->left into node, then all the existing right nodes will be destroyed because you just created a reference of unique_ptr which doesn't stop the following move operation from deleting these nodes.

if (right) {...}

So, in effect the above condition will never be true and also, you will lose the right nodes.

The correct way to do this will be to move the right nodes into a local variable something like this :

auto right = std::move(node->right);
node = move(node->left);
if (right) // Now, right will contain all the right nodes
{...}

Question 18

Missing include guard.

#ifndef TJWRONA_CONTAINER_BINARY_SEARCH_TREE_H
#define TJWRONA_CONTAINER_BINARY_SEARCH_TREE_H
// code
#endif // TJWRONA_CONTAINER_BINARY_SEARCH_TREE_H

#include what you use. Don't #include what you don't. You don't use iostream in your code. You do use memory (std::unique_ptr) and utility (std::move).
```
#include <memory>
#include <utility>
```
Don't pollute the global namespace. Wrap your code in a namespace.
```
namespace container {
// code
} // namespace container
```
std::make_unique doesn't exist in C++11 due to an oversight. It was introduced in C++14. You can implement it easily in C++11.
- Herb Sutter's Simple Implementation.
- Steven T. Lavajev's Complete Implementation that was standardized.

Use using-directives at narrower scopes.

namespace container {
using std::unique_ptr; 
// unique_ptr to all code within container namespace.
} // namespace container
// container::unique_ptr to anyone outside the container namespace.

You can also alias types into logical names for simplification.

namespace container {
template <typename T>
struct Tree {
private:
 struct Node;
 using link_type = std::unique_ptr<Node>;
 using value_type = T;
 struct Node { 
 link_type prev_;
 link_type next_;
 value_type value_;

Give entities names that better describes what they represent.
```
template <typename ElementType>
class BinarySearchTree {
```
When initializing data members to constant values, prefer in-class member initialization.
std::unique_ptr default-initializes its value to nullptr.

Provide a constructor to initialize the object into a valid state. If you define a constructor yourself, you will need to explicitly define a default constructor if you want one.

 struct Node {
 Node(value_type const& value)
 : value_(value)
 {}
 Node(value_type&& value, link_type&& left, link_type&& right)
 : value_(std::move(value)), left_(std::move(left)), right_(std::move(right))
 {}
 Node(value_type const& value, link_type&& left, link_type&& right)
 : value_(value), left_(std::move(left)), right_(std::move(right))
 {}
 value_type value_;
 link_type left_;
 link_type right_;
 };
 // No other constructors for BinarySearchTree().
 // The compiler will implicitly create one for you.
 // You can explicitly define a default constructor if you choose.
 // BinarySearchTree() = default;
 link_type root_;

Consider the rule of three/five/zero. Rule of Three/Five/Zero Since you haven't defined any of the five special member functions in your BinarySearchTree, this is what happens.
- Copy Constructor - Implicitly deleted by the compiler because std::unique_ptr has a deleted copy constructor. If you want copy construction, you need to define this yourself.
- Copy Assignment - Implicitly deleted by the compiler because std::unique_ptr has a deleted copy-assignment operator. If you want copy assignment, you need to define this yourself.
- Move Constructor - Automatically generated by the compiler. Behavior is correct for std::unique_ptr, transferring ownership upon move.
- Move Assignment - Automatically generated by the compiler. Behavior is correct for std::unique_ptr, transferring ownership upon move.
- Destructor - Automatically generated by the compiler. However, the behavior of std::unique_ptr's destructor is a recursive chain of destructor calls for its left and right children. Your BinarySearchTree isn't a balanced tree, so it's possible for the tree to take on the characteristic of a singly-linked list (elements inserted in either ascending or descending order). This tree of breadth 1 is going to have N recursive destruction calls. For a sufficiently large enough N, you are going to consume the available stack space and overflow. You need to declare the destructor and define the correct behavior.
By user declaring a destructor, the compiler will no longer implicitly declare/define the move operations. So you will need to declare/define those yourself as well. The copy operations would normally still remain implicitly defaulted, but std::unique_ptr being uncopyable leaves them as deleted.
```
BinarySearchTree(BinarySearchTree const&) = delete;
BinarySearchTree& operator=(BinarySearchTree const&) = delete;
BinarySearchTree(BinarySearchTree&&) = default;
BinarySearchTree& operator=(BinarySearchTree&&) = default;
~BinarySearchTree()
{
 // Iterative destruction implementation
}
```

Toby Speight Toby Speight 87.3k14 gold badges104 silver badges322 bronze badges · Accepted Answer · 2019-03-06 19:00:16Z

On raw and smart pointers

First, I'll address your specific question, about using a raw pointer to refer to nodes in the tree. I believe you're doing exactly the right thing here, because smart pointers convey information about ownership of the resource.

We are looking at nodes that are owned by the tree (actually, each node is owned by its parent), and it's entirely appropriate that that ownership is encoded as a std::unique_ptr; that says that this is the only place that's responsible for deleting the node when it's no longer required.

Our Delete method walks nodes from the root downwards; it needs to refer to the nodes, but only during the execution of the method (it doesn't store pointers anywhere). A raw pointer is perfectly suitable here.

General review

Is <iostream> used anywhere? I think it can be omitted.

If this is a header file, avoid bringing std::make_unique and std::unique_ptr into the global namespace, as that will affect every file that includes the header. It's reasonable to do so inside the function bodies, but at that point it's likely better to simply use the qualified names. Including them within the scope of the type is a middle ground where opinions will differ.

We should include <utility> for std::move (and the same comments apply to importing that name into ::).

Consider turning tests around to test a positive first, which slightly reduces cognitive load. Instead of:

if (not node) {
 node = make_unique<Node>(value);
}
else {
 value < node->value
 ? Insert(node->left, value)
 : Insert(node->right, value);
}

This is easier to reason about:

if (node) {
 value < node->value
 ? Insert(node->left, value)
 : Insert(node->right, value);
}
else {
 node = make_unique<Node>(value);
}

Actually, that ternary can be narrowed to within the argument list (you can add more parens according to taste):

if (node) {
 Insert(value < node->value ? node->left : node->right, value);
}
else {
 node = make_unique<Node>(value);
}

On a related note, if we return early from within if, we can just flow into the next code without else:

if (not node) {
 return false;
}
else if (node->value == value) {
 return true;
}
else {
 return value < node->value
 ? Contains(node->left, value)
 : Contains(node->right, value);
}

That can become

if (not node) {
 return false;
}
if (node->value == value) {
 return true;
}
return Contains(value < node->value ? node->left : node->right, value);

We might be able to remove some repetition by adding a method to find the insertion point for a value - Contains() can return true if that insertion point has a value that compares equal, and Delete() can start from that point in removing the value.

If the tree could be large, then perhaps iteration may be better than recursion for walking the tree. That means we'll need pointers in some places where we currently have references, and some things may have to lose const qualification, but may give a performance improvement and save on stack usage (or may not, depending on how your compiler manages to optimise the recursive call - always check these things!).

Further directions

Try to evolve the public interface to supporting standard container operations. This probably begins with creating iterators and the member functions that return them.

Consider maintaining a balanced tree - you'll want to read up on the theory if you're not already familiar with the techniques to do this.

You haven't shown your unit tests here, but it's worthwhile occasionally running them with a memory checker such as Valgrind. As well as ensuring we don't leak memory, that will also help identify any uses of dangling pointers or uninitialized values, which an ordinary run might not.

How would I go about turning the recursion into iteration? I've heard that it can be much more efficient to do things that way but I can't really see how it should be done. Are there any good examples on how to do this?
And as for the other comments - iostream is used later in the file for some testing routines I wrote but you're right, it is not needed for the tree. Also the tree class will eventually be in a header so I will remove the using statements.
Additionally it seems in Visual Studio I don't need #include <utility> or using std::move in order to call move... The code compiles and runs fine... that's a little scary.
Worthwhile reading on SO: Way to go from recursion to iteration
I looked into replacing the recursion with iteration, but for binary trees the code starts to get pretty messy (you need to manage a lot of pointers to pointers). For the sake of readability I feel recursion is the better approach in this case. I don't foresee myself using it in a situation with huge datasets that could cause stack overflows.

Stack Exchange Network

Implementing a binary tree in C++ using "std::unique_ptr"

4 Answers 4

On raw and smart pointers

General review

Further directions

Fix the bug

Use raw pointers

Avoid polluting the global namespace with `using`

Don't define a constructor that only initializes data

Provide a usable interface

Use only necessary `#include`s

Consider using more idiomatic C++

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Implementing a binary tree in C++ using "std::unique_ptr"

4 Answers 4

On raw and smart pointers

General review

Further directions

Fix the bug

Use raw pointers

Avoid polluting the global namespace with using

Don't define a constructor that only initializes data

Provide a usable interface

Use only necessary #includes

Consider using more idiomatic C++

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions

Avoid polluting the global namespace with `using`

Use only necessary `#include`s