How can I optimize this Binary Search Tree?

Question 1

I've added some methods to a basic BST implementation for integers - if anyone can point out some ways that this could be written in a more efficient or clean manner, please let me know.

public class BST {
 
 /*
 * 1) Node inner class
 */
 
 private class Node {
 
 int data;
 Node leftChild, rightChild;
 
 public Node(int num, Node left, Node right) {
 this.data = num;
 this.leftChild = left;
 this.rightChild = right;
 }
 }
 
 
 // 2) Variables
 
 private Node root = null; // every binary search tree object will be assigned a root node = null
 private int nodeCount = 0;
 
 // 3) Methods: size
 public int size() {
 return nodeCount;
 }
 
 // 4) Methods: isEmpty
 public boolean isEmpty() {
 return (root == null); // if the root of the BST is null (we haven't added any nodes) then the BST is empty
 }
 
 // 5) Methods: contains (recursive), SEARCH
 public boolean treeContains(int info) {
 return treeContains(root, info);
 }
 
 public boolean treeContains(Node node, int info) {
 
 if (node == null) return false; // corner case: if tree is null, but also BASE CASE
 
 // SEARCHING, but looking for a number so we can use O log(n) time
 if (info > node.data) { // go to the right
 return treeContains(node.rightChild, info);
 }
 else if (info < node.data) {
 return treeContains(node.leftChild, info);
 }
 else { // either node has bigger or smaller data or this case: it is equal
 return true;
 }
 }
 
 // 6) Methods: add
 public boolean add(int newData) {
 if (treeContains(newData)) return false;
 
 else {
 root = add(root, newData); // NOTE: this works but why can't it just be add(root,newdata) without the "root = " ?
 nodeCount++;
 return true; 
 }
 }
 
 public Node add(Node node, int newData) {
 
 if (node == null) {
 node = new Node(newData, null, null); // NOTE: how does this add a node? it doesn't place it anywhere? it just exists somewhere? i mean i understand that by then we're pointing to a space that is null but still
 }
 
 else if (newData > node.data) {
 node.rightChild = add(node.rightChild, newData);
 }
 else { // else if newData is less or equal to node data
 node.leftChild = add(node.leftChild, newData);
 }
 return node;
 }
 
 // 7) Methods: remove
 // look up node to be removed, set it to null, recursive
 
 public boolean remove(int newData) {
 if (!treeContains(newData)) return false; // will null BST's fall under this ???
 
 else {
 root = remove(root,newData); // NOTE: why do we need to have root = remove(root, newData) ??? instead of just remove..
 nodeCount--;
 return true;
 }
 }
 
 public Node remove(Node node, int newData) {
 
 if (node.data == newData) { // if we find the node we want to delete 
 
 // 3 SCENARIOS - the node can be a leaf, can have 1 or 2 children
 
 // leaf node
 if (node.leftChild==null && node.rightChild== null) {
 node = null;
 }
 
 // node has left child only
 else if (node.leftChild !=null && node.rightChild == null) {
 node.data = node.leftChild.data; // left child is now node
 node.leftChild = null;
 }
 
 // node has right child only
 else if (node.leftChild==null && node.rightChild!=null) {
 Node temp = node.rightChild; // trying another way of switching nodes & deleting
 node.rightChild = null;
 node.data = temp.data; 
 }
 
 // node has both children
 else {
 Node temp = findMin(node.rightChild); 
 node.data = temp.data; // switch data
 node.rightChild = remove(node.rightChild, temp.data); // remove the findMin node, originally node.RC = remove...
 }
 }
 
 else if (newData < node.data) {
 node.leftChild = remove (node.leftChild, newData);
 }
 
 else { // if newData > node.data
 node.rightChild = remove(node.rightChild, newData);
 }
 
 return node;
 }
 
 // Helper method to find the leftmost node (which has the smallest value)
 private Node findMin(Node node) {
 while (node.leftChild != null) 
 node = node.leftChild;
 return node;
 }
 
 
 // 8) Methods: In order tree traversal
 public void traverseInOrder(Node node) {
 if (node == null) return;
 
 traverseInOrder(node.leftChild);
 System.out.print(node.data + " ");
 traverseInOrder(node.rightChild);
 }
 
 // 9) Methods: Pre order tree traversal
 public void traversePreOrder(Node node) {
 if (node == null) return;
 
 System.out.print(node.data + " ");
 traversePreOrder(node.leftChild);
 traversePreOrder(node.rightChild);
 }
 
 
 // 10) Methods: Post order tree traversal
 public void traversePostOrder(Node node) {
 if (node == null) return;
 
 traversePostOrder(node.leftChild);
 traversePostOrder(node.rightChild);
 System.out.print(node.data + " ");
 }
 
 // RUN METHODS
 
 public static void main(String[] args) {
 
 BST tree1 = new BST();
 
 tree1.add(10);
 tree1.add(7);
 tree1.add(5);
 tree1.add(8);
 tree1.add(20);
 tree1.add(15);
 tree1.add(25);
 
 tree1.traverseInOrder(tree1.root);
 
 System.out.println( "\nSize of tree is: " + tree1.size() );
 System.out.println( "Root of tree is: " + tree1.root.data );
 System.out.println( "Min num of tree is: " + tree1.findMin(tree1.root).data );
 System.out.println( "Does the tree contain the #5 ? " + tree1.treeContains(5) );
 
 tree1.remove(8);
 tree1.remove(20);
 
 tree1.traverseInOrder(tree1.root);
 System.out.println( "\nSize of tree is: " + tree1.size() );
 }
 
}

Thank you!!

Question 2

You're looking to optimize, is the code too slow right now? Have you done any profiling? It's a very common fallacy for new coders to want to optimize the ever living daylight out of everything but in general it's much better to make something that works and is complete first and then optimize the final product, this prevents you spending 90% of your time improving performance by 1% as opposed to spending 10% of your time improving performance by 90% once you know what to focus on.

Question 3

I am not a Java devloper, so take review with a grain of salt.

There are a lot of unncessary comments:

3) Methods: size

4) Methods: isEmpty

We can already see that it is a method and what it's name is by looking at the code. Adding a docstring is optional for such short methods.

public Node(int num, Node left, Node right) {
 this.data = num;
 ...
}

Why is the argument named num, but assigned to data? Later on in code, you call num as newData, which is more appropriate.

if (treeContains(newData)) return false;
 else {
 root = add(root, newData);
 nodeCount++;
 return true; 
 }

This looks really awkward to me and I would rewrite as:

if (treeContains(newData)) return false;
root = add(root, newData);
nodeCount++;
return true;

// NOTE: this works but why can't it just be add(root,newdata) without the "root = " ?

This is because root, defined as class instance private Node root = null is updated

else if (node.leftChild !=null && node.rightChild == null)

else if (node.leftChild==null && node.rightChild!=null)

Spacing is all over the place. Prefer spaces around the equality operator. For example, node.leftChild == null

System.out.println( "Root of tree is: " + tree1.root.data );

New public methods should be created (getRoot() and getValue()) for encapsulation.

public boolean add(int newData)

Method signature looks odd to me. What is the meaning of the boolean that is returned? (add to documentation)

Why is same-valued nodes not allowed?

nodeCount++;

nodeCount--;

In this case, it does not matter much if pre- or post-increment/decrement is used. Pre-increment should be used when old value does not matter

Question 4

Thank you!! This was super helpful! One quick question though, when you say that "This is because root, defined as class instance private Node root = null is updated" do you mean that the subtrees of the root node are updated? or that the root node itself is occasionally updated?

Question 5

In this case, it is both. I think the confusion comes from the odd method signatures. If I'm correct, the root = part of root = add(...) is only needed when root == null (and not in the general case). Try if (root == null) { root = add(root, newData); } else { add(root, newData); } and see if it makes sense

Question 6

This should also answer your question in the comments:

// NOTE: how does this add a node? it doesn't place it anywhere? it just exists somewhere? i mean i understand that by then we're pointing to a space that is null but still

. The node is assigned to root in the first case, and the subtree in the others

Question 7

 if (!treeContains(newData)) return false; // will null BST's fall under this ???
 
 else {

Please don't do this. It is a pain to read and can cause bugs as separating the else from the if makes it so that it is quite possible that the if is on one screen and the else is off the bottom. People know to check for an else on the next line. Making us continue to check past that is just cruel.

It's also unnecessary in this case. You are returning from the initial clause. So the rest of the method will only run if the check fails. Therefore, you don't need to use an else at all. Just the if is sufficient.

As a general rule, try to avoid mixing the statement and block forms in the same control structure. It adds cognitive load unnecessarily when someone reads the code. And in general, people read code more than they write it. Because most development work is changing existing code rather than writing new code.

There is a school of thought that you should never use the statement form of control structures, as it leads to a number of opportunities for weird, hard-to-find bugs. For example, compare

 if (!treeContains(newData)) //return false;
 doSomething();

to

 if (!treeContains(newData)) {
 //return false;
 }
 doSomething();

At first glance, these might seem to do the same thing. But in actuality, they are quite different. Particularly if doSomething has important side effects.

FromTheStackAndBack FromTheStackAndBack 1,1687 silver badges18 bronze badges · Accepted Answer · 2021-05-10 23:38:15Z

I am not a Java devloper, so take review with a grain of salt.

There are a lot of unncessary comments:

3) Methods: size

4) Methods: isEmpty

We can already see that it is a method and what it's name is by looking at the code. Adding a docstring is optional for such short methods.

public Node(int num, Node left, Node right) {
 this.data = num;
 ...
}

Why is the argument named num, but assigned to data? Later on in code, you call num as newData, which is more appropriate.

if (treeContains(newData)) return false;
 else {
 root = add(root, newData);
 nodeCount++;
 return true; 
 }

This looks really awkward to me and I would rewrite as:

if (treeContains(newData)) return false;
root = add(root, newData);
nodeCount++;
return true;

// NOTE: this works but why can't it just be add(root,newdata) without the "root = " ?

This is because root, defined as class instance private Node root = null is updated

else if (node.leftChild !=null && node.rightChild == null)

else if (node.leftChild==null && node.rightChild!=null)

Spacing is all over the place. Prefer spaces around the equality operator. For example, node.leftChild == null

System.out.println( "Root of tree is: " + tree1.root.data );

New public methods should be created (getRoot() and getValue()) for encapsulation.

public boolean add(int newData)

Method signature looks odd to me. What is the meaning of the boolean that is returned? (add to documentation)

Why is same-valued nodes not allowed?

nodeCount++;

nodeCount--;

In this case, it does not matter much if pre- or post-increment/decrement is used. Pre-increment should be used when old value does not matter

Thank you!! This was super helpful! One quick question though, when you say that "This is because root, defined as class instance private Node root = null is updated" do you mean that the subtrees of the root node are updated? or that the root node itself is occasionally updated?
In this case, it is both. I think the confusion comes from the odd method signatures. If I'm correct, the root = part of root = add(...) is only needed when root == null (and not in the general case). Try if (root == null) { root = add(root, newData); } else { add(root, newData); } and see if it makes sense
This should also answer your question in the comments: // NOTE: how does this add a node? it doesn't place it anywhere? it just exists somewhere? i mean i understand that by then we're pointing to a space that is null but still. The node is assigned to root in the first case, and the subtree in the others

Stack Exchange Network

How can I optimize this Binary Search Tree?

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

How can I optimize this Binary Search Tree?

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions