Finding common ancestor in a binary tree

Question 1

Following is a Python program I wrote to find the common ancestor in a binary tree (not BST) with no link to parent node. Please review.

def common_ancestor(N1, N2, head):
 if not head:
 return [None, 0, 0]
 left_status = common_ancestor(N1, N2, head.left)
 if left_status[0]:
 return left_status
 this_status = [None, 0, 0]
 if head == N1:
 this_status[1] = 1
 if head == N2:
 this_status[2] = 1
 right_status = common_ancestor(N1, N2, head.right)
 if right_status[0]:
 return right_status
 combined_status = [None, 0, 0]
 combined_status[1] = left_status[1] or this_status[1] or right_status[1]
 combined_status[2] = left_status[2] or this_status[2] or right_status[2]
 if combined_status[1] and combined_status[2]:
 print "Common Ancestor is %s" % head
 combined_status[0] = head
 return combined_status
if __name__ == "__main__":
 head = some_node
 print common_ancestor(5, 32, head)[0]

Question 2

Your code gives the error some_node is not defined, posting the full code would make the job easier for reviewers, so that they can test it.

Question 3

Sorry. It seemed like a duplicate. Felt it will not helpful to anyone.

Question 4

1. Review

You didn't give us the definition of the node objects, so I'm guessing you used something like this:
```
from collections import namedtuple
Node = namedtuple('Node', 'left right'.split())
```
There's no docstring. How do I call common_ancestor? What is the meaning of the arguments N1, N2, and head? (I'm guessing that N1 and N2 are the nodes to find the nearest common ancestor of, and that head had to be the root node of the tree.)
The function common_ancestor does two things: it finds a node and prints a message about it. It would be better to separate these responsibilities.
The print statement means that the code is not portable to Python 3.
The function allocates a new list of three status elements for each node in the tree. But as we'll see below, we only need a total of two status elements.
The test head == N2 is not quite right. Here we need the actual node N2, not just any node that happens to be equal to N2, so the test should use is rather than ==.

2. Simpler implementation

The function common_ancestor traverses the tree in depth-first order, keeping track of which of N1 and N2 have been visited so far. So let's start with a simple depth-first traversal implementation:

def traverse(node):
 """Visit all nodes in the tree starting at node, in depth order."""
 if node is None: return
 traverse(node.left)
 traverse(node.right)

Here's a binary tree with two shaded nodes that we want to find the common ancestor of. The thin line shows the order in which nodes are visited by a depth-first traversal.

Depth-first traversal of a binary tree

Suppose that we augment this traversal function so that it keeps track of how many of n1 and n2 have been visited so far:

count = 0
def traverse(node):
 """Visit all nodes in the tree starting at node, in depth order."""
 if node is None: return
 print("Entering {}, count={}".format(node, count))
 if node is n1: count += 1
 if node is n2: count += 1
 traverse(node.left)
 traverse(node.right)
 print("Exiting {}, count={}".format(node, count))

This figure shows the value of count on entry to each node:

Number of target nodes visited on entry to each node

And this figure shows the value of count on exit from each node:

Number of target nodes visited on exit from each node

You'll see that all the common ancestors of the two nodes have count = 0 on entry and count = 2 on exit (and only common ancestors have this property). So that means that we can implement common_ancestor like this:

def common_ancestor(n1, n2, head):
 """Return the nearest common ancestor of nodes n1 and n2 in the tree
 rooted at head.
 """
 count = 0 # How many nodes in {n1, n2} have been visited so far?
 ancestor = None
 def traverse(node):
 nonlocal count, ancestor
 if node is None or ancestor is not None: return
 count_at_entry = count
 if node is n1: count += 1
 if node is n2: count += 1
 traverse(node.left)
 traverse(node.right)
 if count_at_entry == 0 and count == 2 and ancestor is None:
 ancestor = node
 traverse(head)
 return ancestor

The nonlocal statement was new in Python 3, so if you're stuck on Python 2 then you'll have to find a workaround, for example using a status object:

class Status(object): pass
status = Status()
status.count = 0
status.ancestor = None
# etc.

3. Efficient implementation?

The implementation discussed here has to traverse the whole tree and so takes time \$ O(n) \$. Whether this is important depends on how many common-ancestor queries you have. If there are many, then you'll probably want to change your tree data structure so that you can make efficient common-ancestor queries.

Adding parent pointers allows you to bring the query cost down to \$ O(\log n) \$ on balanced trees but if you have many common-ancester queries then you may prefer to preprocess the tree into a data structure that suppose \$O(1)\$ queries. See Bender and Colton (2000) for an approach based on range-minimum queries.

Question 5

namedtuple also accepts the field names separated with spaces or commas from a single string, there is no need to split it: Node = namedtuple('Node', 'left right')

Question 6

@mkrieger1: Thanks, that's useful to know.

Question 7

Hey thank you for the detailed review. However I feel the above code will break when either of the node to be found is absent and instead, two or more of the other node is present, since count doesn't differentiate between nodes.

Question 8

On another note, how did you create those images? They look good!

Question 9

(i) If "two or more of the other node" are found, then the data structure is not a tree. In a tree there is exactly one path to each node from the root. (ii) I used Inkscape.

Gareth Rees Gareth Rees 50.1k3 gold badges130 silver badges210 bronze badges · Accepted Answer · 2015-03-22 21:29:14Z

1. Review

You didn't give us the definition of the node objects, so I'm guessing you used something like this:
```
from collections import namedtuple
Node = namedtuple('Node', 'left right'.split())
```
There's no docstring. How do I call common_ancestor? What is the meaning of the arguments N1, N2, and head? (I'm guessing that N1 and N2 are the nodes to find the nearest common ancestor of, and that head had to be the root node of the tree.)
The function common_ancestor does two things: it finds a node and prints a message about it. It would be better to separate these responsibilities.
The print statement means that the code is not portable to Python 3.
The function allocates a new list of three status elements for each node in the tree. But as we'll see below, we only need a total of two status elements.
The test head == N2 is not quite right. Here we need the actual node N2, not just any node that happens to be equal to N2, so the test should use is rather than ==.

2. Simpler implementation

The function common_ancestor traverses the tree in depth-first order, keeping track of which of N1 and N2 have been visited so far. So let's start with a simple depth-first traversal implementation:

def traverse(node):
 """Visit all nodes in the tree starting at node, in depth order."""
 if node is None: return
 traverse(node.left)
 traverse(node.right)

Here's a binary tree with two shaded nodes that we want to find the common ancestor of. The thin line shows the order in which nodes are visited by a depth-first traversal.

Depth-first traversal of a binary tree

Suppose that we augment this traversal function so that it keeps track of how many of n1 and n2 have been visited so far:

count = 0
def traverse(node):
 """Visit all nodes in the tree starting at node, in depth order."""
 if node is None: return
 print("Entering {}, count={}".format(node, count))
 if node is n1: count += 1
 if node is n2: count += 1
 traverse(node.left)
 traverse(node.right)
 print("Exiting {}, count={}".format(node, count))

This figure shows the value of count on entry to each node:

Number of target nodes visited on entry to each node

And this figure shows the value of count on exit from each node:

Number of target nodes visited on exit from each node

You'll see that all the common ancestors of the two nodes have count = 0 on entry and count = 2 on exit (and only common ancestors have this property). So that means that we can implement common_ancestor like this:

def common_ancestor(n1, n2, head):
 """Return the nearest common ancestor of nodes n1 and n2 in the tree
 rooted at head.
 """
 count = 0 # How many nodes in {n1, n2} have been visited so far?
 ancestor = None
 def traverse(node):
 nonlocal count, ancestor
 if node is None or ancestor is not None: return
 count_at_entry = count
 if node is n1: count += 1
 if node is n2: count += 1
 traverse(node.left)
 traverse(node.right)
 if count_at_entry == 0 and count == 2 and ancestor is None:
 ancestor = node
 traverse(head)
 return ancestor

The nonlocal statement was new in Python 3, so if you're stuck on Python 2 then you'll have to find a workaround, for example using a status object:

class Status(object): pass
status = Status()
status.count = 0
status.ancestor = None
# etc.

3. Efficient implementation?

The implementation discussed here has to traverse the whole tree and so takes time \$ O(n) \$. Whether this is important depends on how many common-ancestor queries you have. If there are many, then you'll probably want to change your tree data structure so that you can make efficient common-ancestor queries.

Adding parent pointers allows you to bring the query cost down to \$ O(\log n) \$ on balanced trees but if you have many common-ancester queries then you may prefer to preprocess the tree into a data structure that suppose \$O(1)\$ queries. See Bender and Colton (2000) for an approach based on range-minimum queries.

namedtuple also accepts the field names separated with spaces or commas from a single string, there is no need to split it: Node = namedtuple('Node', 'left right')
Hey thank you for the detailed review. However I feel the above code will break when either of the node to be found is absent and instead, two or more of the other node is present, since count doesn't differentiate between nodes.
On another note, how did you create those images? They look good!
(i) If "two or more of the other node" are found, then the data structure is not a tree. In a tree there is exactly one path to each node from the root. (ii) I used Inkscape.

Stack Exchange Network

Finding common ancestor in a binary tree

1 Answer 1

1. Review

2. Simpler implementation

3. Efficient implementation?

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Hot Network Questions

Finding common ancestor in a binary tree

1 Answer 1

1. Review

2. Simpler implementation

3. Efficient implementation?

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Related

Hot Network Questions