Following is a Python program I wrote to find the common ancestor in a binary tree (not BST) with no link to parent node. Please review.
def common_ancestor(N1, N2, head):
if not head:
return [None, 0, 0]
left_status = common_ancestor(N1, N2, head.left)
if left_status[0]:
return left_status
this_status = [None, 0, 0]
if head == N1:
this_status[1] = 1
if head == N2:
this_status[2] = 1
right_status = common_ancestor(N1, N2, head.right)
if right_status[0]:
return right_status
combined_status = [None, 0, 0]
combined_status[1] = left_status[1] or this_status[1] or right_status[1]
combined_status[2] = left_status[2] or this_status[2] or right_status[2]
if combined_status[1] and combined_status[2]:
print "Common Ancestor is %s" % head
combined_status[0] = head
return combined_status
if __name__ == "__main__":
head = some_node
print common_ancestor(5, 32, head)[0]
1 Answer 1
1. Review
You didn't give us the definition of the node objects, so I'm guessing you used something like this:
from collections import namedtuple Node = namedtuple('Node', 'left right'.split())
There's no docstring. How do I call
common_ancestor
? What is the meaning of the argumentsN1
,N2
, andhead
? (I'm guessing thatN1
andN2
are the nodes to find the nearest common ancestor of, and thathead
had to be the root node of the tree.)The function
common_ancestor
does two things: it finds a node and prints a message about it. It would be better to separate these responsibilities.The
print
statement means that the code is not portable to Python 3.The function allocates a new list of three status elements for each node in the tree. But as we'll see below, we only need a total of two status elements.
The test
head == N2
is not quite right. Here we need the actual nodeN2
, not just any node that happens to be equal toN2
, so the test should useis
rather than==
.
2. Simpler implementation
The function common_ancestor
traverses the tree in depth-first order, keeping track of which of N1
and N2
have been visited so far. So let's start with a simple depth-first traversal implementation:
def traverse(node):
"""Visit all nodes in the tree starting at node, in depth order."""
if node is None: return
traverse(node.left)
traverse(node.right)
Here's a binary tree with two shaded nodes that we want to find the common ancestor of. The thin line shows the order in which nodes are visited by a depth-first traversal.
Depth-first traversal of a binary tree
Suppose that we augment this traversal function so that it keeps track of how many of n1
and n2
have been visited so far:
count = 0
def traverse(node):
"""Visit all nodes in the tree starting at node, in depth order."""
if node is None: return
print("Entering {}, count={}".format(node, count))
if node is n1: count += 1
if node is n2: count += 1
traverse(node.left)
traverse(node.right)
print("Exiting {}, count={}".format(node, count))
This figure shows the value of count
on entry to each node:
Number of target nodes visited on entry to each node
And this figure shows the value of count
on exit from each node:
Number of target nodes visited on exit from each node
You'll see that all the common ancestors of the two nodes have count = 0
on entry and count = 2
on exit (and only common ancestors have this property). So that means that we can implement common_ancestor
like this:
def common_ancestor(n1, n2, head):
"""Return the nearest common ancestor of nodes n1 and n2 in the tree
rooted at head.
"""
count = 0 # How many nodes in {n1, n2} have been visited so far?
ancestor = None
def traverse(node):
nonlocal count, ancestor
if node is None or ancestor is not None: return
count_at_entry = count
if node is n1: count += 1
if node is n2: count += 1
traverse(node.left)
traverse(node.right)
if count_at_entry == 0 and count == 2 and ancestor is None:
ancestor = node
traverse(head)
return ancestor
The nonlocal
statement was new in Python 3, so if you're stuck on Python 2 then you'll have to find a workaround, for example using a status object:
class Status(object): pass
status = Status()
status.count = 0
status.ancestor = None
# etc.
3. Efficient implementation?
The implementation discussed here has to traverse the whole tree and so takes time \$ O(n) \$. Whether this is important depends on how many common-ancestor queries you have. If there are many, then you'll probably want to change your tree data structure so that you can make efficient common-ancestor queries.
Adding parent pointers allows you to bring the query cost down to \$ O(\log n) \$ on balanced trees but if you have many common-ancester queries then you may prefer to preprocess the tree into a data structure that suppose \$O(1)\$ queries. See Bender and Colton (2000) for an approach based on range-minimum queries.
-
\$\begingroup\$
namedtuple
also accepts the field names separated with spaces or commas from a single string, there is no need tosplit
it:Node = namedtuple('Node', 'left right')
\$\endgroup\$mkrieger1– mkrieger12015年03月24日 12:47:48 +00:00Commented Mar 24, 2015 at 12:47 -
\$\begingroup\$ @mkrieger1: Thanks, that's useful to know. \$\endgroup\$Gareth Rees– Gareth Rees2015年03月24日 13:12:08 +00:00Commented Mar 24, 2015 at 13:12
-
\$\begingroup\$ Hey thank you for the detailed review. However I feel the above code will break when either of the node to be found is absent and instead, two or more of the other node is present, since count doesn't differentiate between nodes. \$\endgroup\$Anirudh– Anirudh2015年03月24日 16:58:43 +00:00Commented Mar 24, 2015 at 16:58
-
1\$\begingroup\$ On another note, how did you create those images? They look good! \$\endgroup\$Anirudh– Anirudh2015年03月24日 16:59:23 +00:00Commented Mar 24, 2015 at 16:59
-
\$\begingroup\$ (i) If "two or more of the other node" are found, then the data structure is not a tree. In a tree there is exactly one path to each node from the root. (ii) I used Inkscape. \$\endgroup\$Gareth Rees– Gareth Rees2015年03月24日 17:08:12 +00:00Commented Mar 24, 2015 at 17:08
some_node is not defined
, posting the full code would make the job easier for reviewers, so that they can test it. \$\endgroup\$