Adding parent pointers allows you to bring the query cost down to \$ O(\log n) \$, on balanced trees but there are more complex data structures that support constant-timeif you have many common-ancestorancester queries then you may prefer to preprocess the tree into a data structure that suppose \$O(1)\$ queries. See Bender and Colton (2000) for an approach based on range-minimum queries.
Adding parent pointers allows you to bring the query cost down to \$ O(\log n) \$, but there are more complex data structures that support constant-time common-ancestor queries.
Adding parent pointers allows you to bring the query cost down to \$ O(\log n) \$ on balanced trees but if you have many common-ancester queries then you may prefer to preprocess the tree into a data structure that suppose \$O(1)\$ queries. See Bender and Colton (2000) for an approach based on range-minimum queries.
You didn't give us the definition of the node objects, so I'm guessing you used something like this:
from collections import namedtuple Node = namedtuple('Node', 'left right'.split())
There's no docstring. How do I call
common_ancestor
? What is the meaning of the argumentsN1
,N2
, andhead
? (I'm guessing thatN1
andN2
are the nodes to find the nearest common ancestor of, and thathead
had to be the root node of the tree.)The function
common_ancestor
does two things: it finds a node and prints a message about it. It would be better to separate these responsibilities.The
print
statement means that the code is not portable to Python 3.The function allocates a new list of three status elements for each node in the tree. But as we'll see below, we only need a totaltotal of two status elements.
The test
head == N2
is not quite right. Here we need the actual nodeN2
, not just any node that happens to be equal toN2
, so the test should useis
rather than==
.
You didn't give us the definition of the node objects, so I'm guessing you used something like this:
from collections import namedtuple Node = namedtuple('Node', 'left right'.split())
There's no docstring. How do I call
common_ancestor
? What is the meaning of the argumentsN1
,N2
, andhead
? (I'm guessing thatN1
andN2
are the nodes to find the nearest common ancestor of, and thathead
had to be the root node of the tree.)The function
common_ancestor
does two things: it finds a node and prints a message about it. It would be better to separate these responsibilities.The
print
statement means that the code is not portable to Python 3.The function allocates a new list of three status elements for each node in the tree. But as we'll see below, we only need a total of two status elements.
You didn't give us the definition of the node objects, so I'm guessing you used something like this:
from collections import namedtuple Node = namedtuple('Node', 'left right'.split())
There's no docstring. How do I call
common_ancestor
? What is the meaning of the argumentsN1
,N2
, andhead
? (I'm guessing thatN1
andN2
are the nodes to find the nearest common ancestor of, and thathead
had to be the root node of the tree.)The function
common_ancestor
does two things: it finds a node and prints a message about it. It would be better to separate these responsibilities.The
print
statement means that the code is not portable to Python 3.The function allocates a new list of three status elements for each node in the tree. But as we'll see below, we only need a total of two status elements.
The test
head == N2
is not quite right. Here we need the actual nodeN2
, not just any node that happens to be equal toN2
, so the test should useis
rather than==
.
1. Review
You didn't give us the definition of the node objects, so I'm guessing you used something like this:
from collections import namedtuple Node = namedtuple('Node', 'left right'.split())
There's no docstring. How do I call
common_ancestor
? What is the meaning of the argumentsN1
,N2
, andhead
? (I'm guessing thatN1
andN2
are the nodes to find the nearest common ancestor of, and thathead
had to be the root node of the tree.)The function
common_ancestor
does two things: it finds a node and prints a message about it. It would be better to separate these responsibilities.The
print
statement means that the code is not portable to Python 3.The function allocates a new list of three status elements for each node in the tree. But as we'll see below, we only need a total of two status elements.
2. Simpler implementation
The function common_ancestor
traverses the tree in depth-first order, keeping track of which of N1
and N2
have been visited so far. So let's start with a simple depth-first traversal implementation:
def traverse(node):
"""Visit all nodes in the tree starting at node, in depth order."""
if node is None: return
traverse(node.left)
traverse(node.right)
Here's a binary tree with two shaded nodes that we want to find the common ancestor of. The thin line shows the order in which nodes are visited by a depth-first traversal.
Depth-first traversal of a binary tree
Suppose that we augment this traversal function so that it keeps track of how many of n1
and n2
have been visited so far:
count = 0
def traverse(node):
"""Visit all nodes in the tree starting at node, in depth order."""
if node is None: return
print("Entering {}, count={}".format(node, count))
if node is n1: count += 1
if node is n2: count += 1
traverse(node.left)
traverse(node.right)
print("Exiting {}, count={}".format(node, count))
This figure shows the value of count
on entry to each node:
Number of target nodes visited on entry to each node
And this figure shows the value of count
on exit from each node:
Number of target nodes visited on exit from each node
You'll see that all the common ancestors of the two nodes have count = 0
on entry and count = 2
on exit (and only common ancestors have this property). So that means that we can implement common_ancestor
like this:
def common_ancestor(n1, n2, head):
"""Return the nearest common ancestor of nodes n1 and n2 in the tree
rooted at head.
"""
count = 0 # How many nodes in {n1, n2} have been visited so far?
ancestor = None
def traverse(node):
nonlocal count, ancestor
if node is None or ancestor is not None: return
count_at_entry = count
if node is n1: count += 1
if node is n2: count += 1
traverse(node.left)
traverse(node.right)
if count_at_entry == 0 and count == 2 and ancestor is None:
ancestor = node
traverse(head)
return ancestor
The nonlocal
statement was new in Python 3, so if you're stuck on Python 2 then you'll have to find a workaround, for example using a status object:
class Status(object): pass
status = Status()
status.count = 0
status.ancestor = None
# etc.
3. Efficient implementation?
The implementation discussed here has to traverse the whole tree and so takes time \$ O(n) \$. Whether this is important depends on how many common-ancestor queries you have. If there are many, then you'll probably want to change your tree data structure so that you can make efficient common-ancestor queries.
Adding parent pointers allows you to bring the query cost down to \$ O(\log n) \$, but there are more complex data structures that support constant-time common-ancestor queries.