I was inspired by this answer and decided to implement an AVL tree with the methods equals
and hashCode
as if I was asked to do that at a job interview.
This is my code on GitHub.
In addition to the solution itself, I wrote tests for all the possible cases. If I missed something, or if my solution has any bugs, please let me know.
AvlTree
package com.bst;
import java.util.LinkedList;
public class AvlTree {
Node mRoot;
public AvlTree() {
}
public AvlTree(int root) {
mRoot = new Node(root);
}
private Node insert(Node root, int key) {
if (root == null) {
return new Node(key);
}
if (key < root.mValue) {
root.mLeft = insert(root.mLeft, key);
} else {
root.mRight = insert(root.mRight, key);
}
return balance(root);
}
private Node balance(Node p) {
fixHeight(p);
if (bfactor(p) == 2) {
if (bfactor(p.mRight) < 0) {
p.mRight = rotateRight(p.mRight);
}
return rotateLeft(p);
}
if (bfactor(p) == -2) {
if (bfactor(p.mLeft) > 0) {
p.mLeft = rotateLeft(p.mLeft);
}
return rotateRight(p);
}
return p;
}
private Node rotateRight(Node p) {
Node q = p.mLeft;
p.mLeft = q.mRight;
q.mRight = p;
fixHeight(p);
fixHeight(q);
return q;
}
private Node rotateLeft(Node q) {
Node p = q.mRight;
q.mRight = p.mLeft;
p.mLeft = q;
fixHeight(q);
fixHeight(p);
return p;
}
private int height(Node p) {
return p == null ? 0 : p.mHeight;
}
private int bfactor(Node p) {
return height(p.mRight) - height(p.mLeft);
}
private void fixHeight(Node p) {
int hl = height(p.mLeft);
int hr = height(p.mRight);
p.mHeight = (hl > hr ? hl : hr) + 1;
}
public void insert(int... keys) {
for (int value : keys) {
mRoot = insert(mRoot, value);
}
}
@Override
public boolean equals(Object arg0) {
if (this == arg0) {
return true;
}
if (!(arg0 instanceof AvlTree)) {
return false;
}
AvlTree another = (AvlTree) arg0;
return areTreesEqual(this.mRoot, another.mRoot);
}
private boolean areTreesEqual(Node root1, Node root2) {
if (root1 == root2) {
return true;
}
if (root1 == null || root2 == null) {
return false;
}
return root1.mValue == root2.mValue && areTreesEqual(root1.mLeft, root2.mLeft) && areTreesEqual(root1.mRight, root2.mRight);
}
@Override
public int hashCode() {
if (mRoot == null) {
return 0;
}
LinkedList<Node> nodes = new LinkedList<AvlTree.Node>();
nodes.add(mRoot);
int res = 17;
while (!nodes.isEmpty()) {
Node head = nodes.remove();
res = 31 * res + head.hashCode();
if (head.mLeft != null) {
nodes.addLast(head.mLeft);
}
if (head.mRight != null) {
nodes.addLast(head.mRight);
}
}
return res;
}
@Override
public String toString() {
if (mRoot == null) {
return "[]";
}
StringBuilder builder = new StringBuilder("[");
inOrderPrint(mRoot, builder);
builder.setLength(builder.length() - 2);
builder.append("]");
return builder.toString();
}
private void inOrderPrint(Node root, StringBuilder builder) {
if (root != null) {
inOrderPrint(root.mLeft, builder);
builder.append(root + ", ");
inOrderPrint(root.mRight, builder);
}
}
static class Node {
Node mLeft;
Node mRight;
final int mValue;
private int mHeight;
private Node(int value) {
mValue = value;
mHeight = 1;
}
@Override
public int hashCode() {
int res = 17;
res = 17 * res + mValue;
return res;
}
@Override
public String toString() {
return Integer.toString(mValue);
}
}
}
AvlTreeTest
package com.bst;
import org.junit.Assert;
import org.junit.Test;
public class AvlTreeTest {
@Test
public void testDefaultConstructor() {
AvlTree t1 = new AvlTree();
Assert.assertNull(t1.mRoot);
}
@Test
public void testIntegerConstructor() {
AvlTree t1 = new AvlTree(1);
Assert.assertNotNull(t1.mRoot);
}
@Test
public void testInsertToEmptyTree() {
AvlTree t1 = new AvlTree();
t1.insert(1);
Assert.assertEquals(1, t1.mRoot.mValue);
}
@Test
public void testEqualsItself() {
AvlTree t1 = new AvlTree();
Assert.assertEquals(t1, t1);
}
@Test
public void testNotEqualNotAvlInstance() {
AvlTree t1 = new AvlTree();
Object object = new Object();
Assert.assertNotEquals(t1, object);
}
@Test
public void testEmptyEqual() {
AvlTree t1 = new AvlTree();
AvlTree t2 = new AvlTree();
Assert.assertEquals(t1, t2);
}
@Test
public void testFirstEmpty() {
AvlTree t1 = new AvlTree();
AvlTree t2 = new AvlTree(1);
Assert.assertNotEquals(t1, t2);
}
@Test
public void testSecondEmpty() {
AvlTree t1 = new AvlTree(1);
AvlTree t2 = new AvlTree();
Assert.assertNotEquals(t1, t2);
}
@Test
public void testRootsEqual() {
AvlTree t1 = new AvlTree(1);
AvlTree t2 = new AvlTree(1);
Assert.assertEquals(t1, t2);
}
@Test
public void testRootAndLeftEqual() {
AvlTree t1 = new AvlTree(10);
t1.insert(2);
AvlTree t2 = new AvlTree(10);
t2.insert(2);
Assert.assertEquals(t1, t2);
}
@Test
public void testRootAndRightEqual() {
AvlTree t1 = new AvlTree(1);
t1.insert(2);
AvlTree t2 = new AvlTree(1);
t2.insert(2);
Assert.assertEquals(t1, t2);
}
@Test
public void testRootsEqual_LeftsNotEqual() {
AvlTree t1 = new AvlTree(10);
t1.insert(2);
AvlTree t2 = new AvlTree(10);
t2.insert(1);
Assert.assertNotEquals(t1, t2);
}
@Test
public void testRootsEqual_RightsNotEqual() {
AvlTree t1 = new AvlTree(1);
t1.insert(2);
AvlTree t2 = new AvlTree(1);
t2.insert(4);
Assert.assertNotEquals(t1, t2);
}
@Test
public void testEmptyTreeHashCode() {
AvlTree t1 = new AvlTree();
Assert.assertEquals(0, t1.hashCode());
}
@Test
public void testEqualTreesEqualHashCodes() {
AvlTree t1 = new AvlTree(10);
t1.insert(2, 12);
AvlTree t2 = new AvlTree(10);
t2.insert(2, 12);
Assert.assertEquals(t1.hashCode(), t2.hashCode());
}
@Test
public void testToStringEmpty() {
AvlTree t1 = new AvlTree();
Assert.assertEquals("[]", t1.toString());
}
@Test
public void testToStringSingleNode() {
AvlTree t1 = new AvlTree(1);
Assert.assertEquals("[1]", t1.toString());
}
@Test
public void testToStringManyNodes() {
AvlTree t1 = new AvlTree(1);
t1.insert(12, 56, 7, 2, 1);
Assert.assertEquals("[1, 1, 2, 7, 12, 56]", t1.toString());
}
@Test
public void testSingleRotateLeft() {
AvlTree t1 = new AvlTree(10);
t1.insert(14, 56);
Assert.assertEquals(t1.mRoot.mValue, 14);
Assert.assertEquals(t1.mRoot.mLeft.mValue, 10);
Assert.assertEquals(t1.mRoot.mRight.mValue, 56);
}
@Test
public void testSingleRotateRight() {
AvlTree t1 = new AvlTree(10);
t1.insert(2, 1);
Assert.assertEquals(t1.mRoot.mValue, 2);
Assert.assertEquals(t1.mRoot.mLeft.mValue, 1);
Assert.assertEquals(t1.mRoot.mRight.mValue, 10);
}
@Test
public void testDoubleRotateLeftRight() {
AvlTree t1 = new AvlTree(10);
t1.insert(4, 9);
Assert.assertEquals(t1.mRoot.mValue, 9);
Assert.assertEquals(t1.mRoot.mLeft.mValue, 4);
Assert.assertEquals(t1.mRoot.mRight.mValue, 10);
}
@Test
public void testDoubleRotateRightLeft() {
AvlTree t1 = new AvlTree(10);
t1.insert(14, 12);
Assert.assertEquals(t1.mRoot.mValue, 12);
Assert.assertEquals(t1.mRoot.mLeft.mValue, 10);
Assert.assertEquals(t1.mRoot.mRight.mValue, 14);
}
}
1 Answer 1
Testing all possible cases
In addition to the solution itself, I wrote tests for all the possible cases.
It seems you have verified all execution paths are covered. That's great, but it doesn't mean you have tested "all the possible cases", far from it.
Code coverage (also known as structured basis testing) is but one of the dimensions of testing. Another dimension is data-flow testing, which is just as important, and can be very tricky.
Testing all possible cases is in general not feasible: you'd have to enumerate all conceivable inputs, which is impossible. There are a couple of things you can do:
- Test with "interesting" data sets
- Test boundaries, corner cases
- Test combinations of the above
Consider this, for example:
AvlTree tree = new AvlTree();
tree.insert(1, 1, 1);
What will happen:
- The first 1 is set as root
- The second 1 is set as root -> right
- The second 1 is set as root -> right -> right
- The tree is rebalanced, rotating to the left, with 1 as root, and 1 as left node and 1 as right node
From this we gain some interesting insights:
- The tree implementation allows duplicates: an unusual, and potentially troublesome feature
- The outcome of the rotation is smelly: the implementation of
insert
uses strictly<
to arrange elements to the left sub-tree, which the rotation breaks
I suggest to add more test cases with larger test trees.
Think of node sequences that will trigger interesting rotations.
Testing hashCode
with empty tree and trees with 2 nodes is hardly sufficient.
For more details about data-flow testing, check out Chapter 22 in Code Complete.
To allow duplicate values or not
Allowing duplicate nodes in a binary search tree is unusual,
and invites all kinds of problems.
In addition to the code smell I highlighted earlier,
you'll probably have some headaches when implementing deletion.
For example tree.insert(3, 3, 3, 3, 1, 1, 1)
results in this tree:
3 / \ 1 3 / \ \ 1 3 3 / 1
Given such tree, what should happen on delete(3)
? Remove the first?
How to delete all nodes with 3? Keep removing until nothing is removed?
You could save yourself a lot of headaches by simply not allowing duplicates.
Adopting Android coding practices in plain Java code
I do understand your sentiment for linking the "m" prefix convention used in Android development. If you want to stick with it, that's up to you, but let me remind you of some concerns:
In expressions like
root.mLeft
, you don't need the "m" to clarify it's a member variable:root.left
is perfectly clear tooIn the few places where it might not be obvious, as in
mRoot = insert(mRoot, value)
, you can always usethis.root = ...
to clarify
Looking over the code, most uses cases fall in the first category, and the second is so rare than despite "this." being longer than "m". By using the this.
prefix where ambiguous and dropping the m
prefix, you will get shorter, perfectly unambiguous code that won't put off any Java developers.
Food for thought ;-)
Lastly, it seems that the "m"-prefix is required for code to be contributed to the Android Open Source Project, and not really a style guide for the code of individual Android apps.
LinkedList
as a Queue
In the implementation of hashCode
,
you're using LinkedList<Node> nodes
as a queue.
So I suggest to declare it as a Queue
,
name the variable "queue",
and instead of the remove()
method,
use the more natural poll()
.
Naming
I find it a bit confusing that in insert(Node root, int key) { ... }
the parameter variable is named "root":
it may mislead that the function manipulates the tree root,
when in fact it's most often an intermediary node.
I suggest to rename the variable to "node" instead.
In other methods you name a Node
parameter p
.
I'd suggest the more intuitive node
everywhere.
And where you use q
I'd suggest other
.
In the implementation of equals
,
you call the other node "another".
I suggest using "other", which is just more common.
-
\$\begingroup\$ > I'd suggest the more intuitive node everywhere. I'd prefer
parent
because finally an inserted node a child of a higher node. \$\endgroup\$Maksim Dmitriev– Maksim Dmitriev2015年08月26日 17:05:45 +00:00Commented Aug 26, 2015 at 17:05 -
\$\begingroup\$ Queue is an interface. Would you use LinkedList as an actual type? And why would you prefer poll() over remove()? \$\endgroup\$Maksim Dmitriev– Maksim Dmitriev2015年08月30日 18:22:49 +00:00Commented Aug 30, 2015 at 18:22
-
1\$\begingroup\$ Yes, I suggest to use
Queue<Node> nodes = new LinkedList<AvlTree.Node>();
, and rename "nodes" to "queue". I thought "poll" is a more natural name for removing the first element of a Queue. But it seems I was wrong about that. Wikipedia doesn't even mention the term. And looking at the JavaDoc, there's a difference in the behavior as well, ofpoll()
andremove()
. In your example, it doesn't matter which you choose. \$\endgroup\$janos– janos2015年08月30日 18:40:43 +00:00Commented Aug 30, 2015 at 18:40 -
\$\begingroup\$ Why do you think it's better to store the reference to the linked list in the queue? I can call it
queue
but use LinkedList as a reference type and an actual type.LinkedList<Node> queue = new LinkedList<AvlTree.Node>();
\$\endgroup\$Maksim Dmitriev– Maksim Dmitriev2015年08月31日 08:31:28 +00:00Commented Aug 31, 2015 at 8:31 -
1\$\begingroup\$ "store the reference to the linked list in the queue" means something else, means putting a linked list inside a queue. What I'm suggesting is to refer to
nodes
by the interface typeQueue
instead of the implementation typeLinkedList
. Because the logic of your usage is the concept of a queue. So I'm suggesting to call a queue what it is: a queue. This is also related to the general recommendation of referring to objects by their interfaces, as explained in Joshua Bloch: Effective Java, Item 52 \$\endgroup\$janos– janos2015年08月31日 08:39:36 +00:00Commented Aug 31, 2015 at 8:39
Explore related questions
See similar questions with these tags.
branchFactor()
instead ofbFactor()
. \$\endgroup\$bFactor
is a balance factor \$\endgroup\$