#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
typedef struct Node NODE;
struct Node {
int value;
int is_leaf;
struct Node *right;
struct Node *left;
};
NODE *new_leaf(void) {
NODE *leaf = malloc(sizeof(NODE));
assert(leaf);
leaf->is_leaf = 1;
leaf->value = 0;
leaf->right = NULL;
leaf->left = NULL;
return leaf;
}
int insert(NODE *node, int value) {
assert(node);
if (node->is_leaf) {
node->is_leaf = 0;
node->value = value;
node->left = new_leaf();
node->right = new_leaf();
} else if (node->value < value) {
return insert(node->left, value);
} else if (node->value > value) {
return insert(node->right, value);
}
return -1;
}
int contains(NODE *node, int value) {
assert(node);
if (node->is_leaf) {
return 0;
} else if (node->value < value) {
return contains(node->left, value);
} else if (node->value > value) {
return contains(node->right, value);
}
return 1;
}
int max_height(NODE *node) {
assert(node);
if (node->is_leaf) {
return 0;
}
int left_height = max_height(node->left);
int right_height = max_height(node->right);
if (left_height > right_height) {
return 1 + left_height;
} else {
return 1 + right_height;
}
}
void destroy(NODE *tree) {
free(tree);
}
int main(void) {
NODE *tree = new_leaf();
insert(tree, 7);
printf("contains 7? : %d\n", contains(tree, 7));
insert(tree, 6);
printf("contains 6? : %d\n", contains(tree, 6));
insert(tree, 5);
printf("contains 5? : %d\n", contains(tree, 5));
printf("height should be 3, is : %d\n", max_height(tree));
insert(tree, 8);
printf("contains 8? : %d\n", contains(tree, 8));
printf("height should be 3, is : %d\n", max_height(tree));
printf("no contains 1? : %d\n", contains(tree, 9));
destroy(tree);
}
Looking for general style advice, and if I missed any memory leaks.
Specific question: does destroy
ing the tree at the end free up all of the memory allocated for every node in the tree? If so, how? If not, how can I ensure that occurs? Also, am I using asserts correctly? It's very different than the languages I have used previously.
3 Answers 3
Firstly, in C, ALL_CAPS
is generally used only for macros. Hence, I wouldn't use the name NODE
as a typedef
for your struct node
, but instead something like typedef struct Node node_type;
. Further, generally C is written using lowercase_with_underscores
, so I'd rename it to struct node
.
Using an assert
in your new_leaf
function on what you get back from malloc
is fine in my books, as this will terminate the process if the OS can't give you back any memory, in which case you're unlikely to be able to do anything anyway.
Your function int contains(NODE *node, int value)
doesn't really need an assert
. If the user gives you a NULL
node, then you can immediately just return 0
. Giving a NULL
to this function shouldn't terminate the program. A similar argument can be made for max_height
.
Of course, the biggest problem is that no, your destroy
function definitely does NOT clean up all the memory allocated by the program. Tools like valgrind can help to inform you if you have any memory leaks lurking around. In this case, however, it's very easy to see that the only memory that will be reclaimed is the memory for the root node (whatever you call destroy
with).
Fixing this is pretty simple: it's just using a recursive algorithm that you've basically already used in insert
and contains
:
void destroy(node_type root)
{
// If we have a non-NULL pointer, we need to
// recursively free its left and right children,
// and then free it.
if(root) {
destroy(root->left);
destroy(root->right);
free(root);
}
}
A few other comments:
- I'm not a big fan of 2 space indentation. It's not enough to (easily) visually scope blocks. I'd much rather 4 spaces.
- You might want to start looking into how to user header files to split up interface and implementation as a next step (if you haven't already).
- I'm not a big fan of explicitly having keeping track of what is a leaf with
is_leaf
. It means that any node that is down the bottom will have 2 extra dynamic allocations performed to have leaf values (seeinsert
). Instead, I'd just set theleft
andright
pointers to beNULL
, and leave it at that. You can easily write anis_leaf
function that simply checks that both pointers areNULL
. - The API is a bit deficient (no deletion of a value, for example), but I'm guessing this is done as a learning exercise, so that's ok.
Specific questions first:
No, your
destroy
function does not destroy the entire tree - it just frees the current node and you'll leak all other nodes. You will have to implement a recursive delete similar to all your other functions:void destroy(NODE *node) { if (!node->is_leaf) { destroy(node->left); destroy(node->right); } free(node); }
I think your usage of
assert
is quite sensible. Just be aware that asserts are typically removed when build withNDEBUG
defined - in that case your code will just segfault.
Some general remarks:
UPPERCASE
names are typically only used for macros and not for typedefs.I don't quite see the point of returning a value from insert. You always return -1 anyway.
Personally I'm not a fan of implementations where an element of the data structure is used to also represent the entry into the data structure. You are effectively leaking the internals if your implementation to the user. An alternative API would something like this:
In your header:
typedef struct BinarySearchTree BinarySearchTree; BinarySearchTree* bst_new_tree(); void bst_insert(BinarySearchTree* tree, int value); int bst_contains_value(BinarySearchTree* tree, int value); int bst_max_height(BinarySearchTree* tree);
In your implementation
struct BinarySearchTree { Node *root; } ....
This way any user of you data structure doesn't know anything about the internals which in turn makes it easy to change the implementation without affecting anything else.
Typedefing the struct in the header but only defining it in the implementation file is called an opaque type - a type where the consumer of the header does not know anything about the internals of it.
It doesn't use memory very efficiently:
- The first node is initially a 'leaf' node which contains no useful value
- Every leaf node contains no useful value
- Every non-leaf node (which contains a value) contains two empty leaf nodes
It's a bit hard to see what's happening: initially-empty leaf nodes later have a value inserted into them.
A cleaner implementation might be:
- Every Node contains a value; the value is inserted in the Node when the Node is created and never changed afterwards
- A tree which contains no values is represented by a null Node pointer (i.e. zero Node instances)
- Left and right nodes aren't added until they're needed
To implement this:
- Add a
int value
parameter to the new_leaf function - Remove the is_leaf member of the node
- Create left and right nodes just-in-time i.e. not before/until they are needed
For example:
struct Node {
int value;
struct Node *right;
struct Node *left;
};
NODE *new_leaf(int value) {
NODE *leaf = (NODE*)malloc(sizeof(NODE));
assert(leaf);
leaf->value = value;
leaf->right = NULL;
leaf->left = NULL;
return leaf;
}
int insert(NODE *node, int value) {
assert(node);
if (node->value < value) {
if (node->left)
return insert(node->left, value);
else
node->left = new_leaf(value);
} else { // if (node->value >= value)
if (node->right)
return insert(node->right, value);
else
node->right = new_leaf(value);
}
return -1;
}
int contains(NODE *node, int value) {
assert(node);
if (node->value == value) {
return 1;
} else if (node->value < value) {
if (node->left)
return contains(node->left, value);
} else { // if (node->value > value)
if (node->right)
return contains(node->right, value);
}
return 0;
}
int max_height(NODE *node) {
assert(node);
int left_height = (node->left) ? max_height(node->left) : 0;
int right_height = (node->right) ? max_height(node->right) : 0;
if (left_height > right_height) {
return 1 + left_height;
} else {
return 1 + right_height;
}
}
void destroy(NODE *tree) {
NODE *left = tree->left;
NODE *right = tree->right;
free(tree);
if (left)
destroy(left);
if (right)
destroy(right);
}
Explore related questions
See similar questions with these tags.