Avoiding memory leaks and using pointers the right way in my binary search tree implementation - C++
I'm learning C++ and I wanted to test my knowledge by implementing a Binary Search Tree using struct
(I know, I’m using C++ I should use classes but I want to keep it simple).
As far as I know, all the functions work right, my problems are:
Can I improve how I managed raw pointers?
- (I know I should use
unique_ptr
but still...)
- (I know I should use
Did I use
new
anddelete
the right way?Should I refactor some functions?
In general, if I could improve something, just let me know!
Here's the complete code: https://gist.github.com/NeptuneLuke/f456c27a9c118d704c1734f7d4cb46c0
The main functions I'm concerned about are:
struct Node {
int data;
Node* left;
Node* right;
};
void insert(Node* &root, int data) {
// this implementation of a BST tree can not have duplicate nodes
if(exist(root,data))
return;
if(root == nullptr) {
root = new Node();
root->data = data;
root->left = nullptr;
root->right = nullptr;
}
else {
if(data < root->data) {
insert(root->left,data);
}
else {
insert(root->right,data);
}
}
}
void delete_tree(Node* &root) {
if(root != nullptr) {
delete_tree(root->left);
delete_tree(root->right);
delete root;
root = nullptr;
}
}
Node* delete_node(Node *root, int data) {
if(root == nullptr || !exist(root,data)) {
return nullptr;
}
else if(data < root->data) {
root->left = delete_node(root->left,data);
}
else if (data > root->data) {
root->right = delete_node(root->right,data);
}
else {
// case 1 - no child
if(root->left == nullptr && root->right == nullptr) {
delete root;
root = nullptr;
}
// case 2 - one child
else if(root->left == nullptr) {
Node *temp = root;
root = root->right;
delete temp;
}
// case 2 - one child
else if(root->right == nullptr) {
Node *temp = root;
root = root->left;
delete temp;
}
// case 3 - 2 children
else {
Node *temp = max_node(root->right);
root->data = temp->data;
root->right = delete_node(root->right,temp->data);
}
}
return root;
}
2 Answers 2
Given you are not using any actual C++ functionality, I’m going to think about this code as C code that uses new
and delete
instead of malloc
and free
.
I don’t see any obvious problems with memory allocation. If new
fails in insert
, the tree is still in a valid state.
About the API: insert
and delete_tree
take the root node pointer by reference, so they can modify it. You should do the same thing with delete_node
, instead of returning the new root pointer. Do note that the caller can have multiple copies of the root node pointer, if one copy is modified, the others won’t. This is dangerous, and needs to be communicated clearly to the caller, in the form of API documentation.
It is better to have a tree object that holds the root pointer. The caller can then reference this object (struct instance) any number of times, and you’re free to change the root pointer at any time. When you’re ready to use actual C++ constructs, you’ll be able to delete the copy constructor and copy assignment operator, to ensure that the user doesn’t copy the object. You can then also call delete_tree
in the destructor of the tree object, to prevent the user from forgetting to call this function.
About formatting: be consistent. You sometimes have an empty line after opening braces (in if or else statements), and sometimes don’t. I find this confusing, and try to look for a meaning...
This construct:
if(...) {
return ...;
}
else { ... }
doesn’t need the else
. You can often simplify code using early returns.
For C++, avoid using new/delete or c variants of memory allocation. Instead use
std::make_unique
.Fix formatting, for example, consider using
clang format
, but there are others.Consider setting the internal state of the object in the constructor:
root = new Node();
root->data = data;
root->left = nullptr;
root->right = nullptr;
=>
auto node = std::make_unique<Node>();
// all other things are set inside the Node ctor
Double-check your algorithm, use peer-review papers (for newer algorithms) or well-established books with the latest implementation. It took ~16 years to implement corectly a simple binary search algorithm.
Reading the comment and code
// this implementation of a BST tree can not have duplicate nodes
if(exist(root,data))
return;
Start a comment with a capital letter (my personal preference) as it is easier to read, especially when you have more text in the comment. More importantly, the caller will not know if the element has been inserted or not. It may matter for some cashing systems - consider throwing an exception in this situation.
- Logic in code can be simplified, avoid using
if-elseif
where possible:
if(root == nullptr || !exist(root,data)) {
return nullptr;
}
else if(data < root->data) {
root->left = delete_node(root->left,data);
}
=>
if(root == nullptr || !exist(root,data)) {
return nullptr;
}
if(data < root->data) {
root->left = delete_node(root->left,data);
}
You could consider using generics in c++, so that the algorithm can deal with other types of data, rather than
int
, but it is a bit opinionated and is subject to another question.In real code, you'd want to document the branching conditions more, why are you having an
if-else
and why did you choose this specific boundaries.
-
7\$\begingroup\$ I'd argue against leaving out
{
}
, in every project style guide I've encountered until now this is the default (expected to always use{
and}
, noif(a) expression;
\$\endgroup\$Raf– Raf2023年07月24日 07:57:15 +00:00Commented Jul 24, 2023 at 7:57 -
\$\begingroup\$ @Raf Seconded. Accidentally writing
if(a) expression1; expression2;
is a common enough bug and is really hard to find in review/debugging if the code is indented. Because it's so hard to find, all coding standards for safety-related code which I'm aware of mandate the use of braces after every conditional statement. MISRA at least considers this "required" rather than "advisory". \$\endgroup\$Graham– Graham2023年07月24日 11:31:36 +00:00Commented Jul 24, 2023 at 11:31 -
\$\begingroup\$ Types in C and C++ are read right to left,
Node*&
is a reference to a pointer toNode
. By taking the pointer by reference, the pointer can be updated to point to a different node. \$\endgroup\$Cris Luengo– Cris Luengo2023年07月24日 13:14:07 +00:00Commented Jul 24, 2023 at 13:14 -
1\$\begingroup\$ Updated the code with the feedback \$\endgroup\$oleksii– oleksii2023年07月24日 15:52:17 +00:00Commented Jul 24, 2023 at 15:52
unique_ptr
but still" - well maybe. Usually I prefer to have aTree
own all the nodes (which become an implementation detail of tree, not exposed), instead of the nodes owning each other and being out in the open. It depends what you want I suppose. \$\endgroup\$exists()
and thendelete()
the code is first searching the element to make sure it exists and afterwards searching again to delete it. I think this could be simplified to a single traversal of the tree (delete if exists and return null if it doesn't) \$\endgroup\$