3

I've just built a self-balancing tree (red-black) in Java (language should be irrelevant for this question though), and I'm trying to come up with a good means of testing that it's properly balanced.

I've tested all the basic tree operations, but I can't think of a way to test that it is indeed well and truly balanced. I've tried inserting a large dictionary of words, both pre-sorted and un-sorted. With a balanced tree, those should take roughly the same amount of time, but an unbalanced tree would take significantly longer on the already-sorted list. But I don't know how to go about testing for that in any reasonable, reproducible way. (I've tried doing millisecond tests on these, but there's no noticeable difference - probably because my source data is too small.)

Is there a better way to be sure that the tree is really balanced? Say, by looking at the tree after it's created and seeing how deep it goes? (That is, without modifying the tree itself by adding a depth field to each node, which is just wasteful if you don't need it for anything other than testing.)

Kilian Foth
111k45 gold badges301 silver badges323 bronze badges
asked Nov 11, 2013 at 0:08
2
  • Your source data is too small? Try inserting all 32-bit integers, sorted and unsorted. Commented Nov 11, 2013 at 2:26
  • Linux rbtree test might be of help. github.com/torvalds/linux/blob/master/lib/rbtree_test.c Commented Aug 8, 2016 at 12:54

2 Answers 2

10

One way to do this would be to create a method on your tree that measures the depth of the tree at a given node. You don't have to store the value, and if you use such a getDepth() method only for testing, then there's no extra overhead for normal tree operations. The getDepth() method would recursively traverse its child nodes and return the maximum depth found.

Once you have that, you can then check that your whole tree is balanced by recursing over each node of the tree, and verifying a condition something like:

Math.abs(getDepth(left) - getDepth(right)) <= 1
answered Nov 11, 2013 at 0:27
6
  • 1
    As an addition to this answer: if you prevent direct access to the implementation (eg, by using a factory and an interface), you should feel free to add any useful method to the implementation class. Including methods designed solely for testing. Commented Nov 11, 2013 at 1:36
  • Is the depth-difference of no greater than 1 at every node guaranteed for red-black? I know this would work for AVL trees, but I remember reading somewhere that other balanced trees don't necessarily have this property despite being relatively balanced. I'd think something like comparing the maximum depth to the log(n) would a more general solution, but I think there's some wiggle-room there as well... Commented Nov 11, 2013 at 2:38
  • Well if you have a different criteria for what you consider "balanced", then use that. My answer is showing another way to test that your tree meets your requirements, other than measuring performance like you suggest (measuring performance only determines the "balanced-ness" indirectly and is not definitive). Commented Nov 11, 2013 at 2:39
  • Yeah, this probably is the way to go. I just wasn't sure if it would work for red-black. (I've done AVL trees a couple times in various languages, but this is my first time experimenting with other types of balanced trees.) Commented Nov 11, 2013 at 2:45
  • 1
    The appropriate invariant of red-black tree is "Every path from a given node to any of its descendant leaves contains the same number of black nodes." And only that many red nodes, because red node may only have black children. Commented Nov 11, 2013 at 8:36
0

If you want to measure time complexity for reads one idea is that you could have the tree issuing callbacks for traversals. This way you could measure the number of operations for a read, which would be a more reliable measurement of time complexity.

answered Dec 10, 2013 at 17:31
4
  • And also add complexity to every step of the operation. If it were just being used to test (and compare to something unbalanced, say), that might be okay, but you'd have to then disable that code before putting into production. Commented Dec 10, 2013 at 21:58
  • Yes, it would be costly. A compromise is to use an internal counter, that should be cheaper. Direct statistics are as I said IMO the only really reliable way of measuring time complexity. But if you want to get fancy you can try ThreadMXBean.getThreadCPUTime() to measure it. Commented Dec 11, 2013 at 6:41
  • Well, as long as you can prove the tree is balanced at any given point, you know that any basic operations will operate in log time. So a solution which analyzes the current state of the tree and proves it's balanced should be good enough, and doesn't needlessly add time or space complexity when it's not being used for testing. (And is in fact what I've already done - this was a month ago.) Commented Dec 11, 2013 at 16:10
  • Yes well I just wanted to add some pointers as to how you could go ahead with your original approach if you wanted to. That's why I started with "If you want to measure time complexity". For reference. Commented Dec 11, 2013 at 19:40

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.