#BST
class Node:
def __init__(self, val, left = None, right = None):
self.val = val
self.left = left
self.right = right
def search(root, k):
if root is None or root.val == k:
return root
elif root.val < k:
return search(root.right, k)
else:
return search(root.left, k)
def minimum(root):
if root is None or root.left is None:
return root
return minimum(root.left)
def maximum(root):
if root is None or root.right is None:
return root
return maximum(root.right)
def successor(root, node):
if node.right is not None:
return minimum(node.right)
succ = None
while root is not None and root.val != node.val:
if root.val > node.val:
succ = root
root = root.left
else:
root = root.right
return succ
def predecessor(root, node):
if node.left is not None:
return maximum(node.left)
else:
pred = None
while root is not None and root.val != node.val:
if root.val > node.val:
root = root.left
else:
pred = root
root = root.right
return pred
def insert(root, key):
if root is None:
return Node(key)
elif root.val < key:
return Node(root.val, root.left, insert(root.right, key))
else:
return Node(root.val, insert(root.left, key), root.right)
def parent(root, node):
if root is None:
return root
elif (root.left is not None and root.left.val == node.val) or \
(root.right is not None and root.right.val == node.val):
return root
else:
if root.val < node.val:
return parent(root.right, node)
else:
return parent(root.left, node)
def delete(root, node):
p = parent(root, node)
if node.left is None or node.right is None:
replacement = node.left if node.left is None else node.right
if p is None:
root = replacement
elif p.val < node.val:
p.right = replacement
else:
p.left = replacement
else:
replacement = successor(root, node)
succ_parent = parent(root, replacement)
succ_parent.left = replacement.right
node.val = replacement.val
return root
Require code review in terms of correctness, style consistency, efficiency, code readability, and "pythonic"ness. (I, of course, welcome any other comments on the code as well)
I know that this a bit big to review completely. Reviewing only one or more methods will be appreciated.
Also, the WikiPage talks about designing persistent data structures. When should one consider such design and what are the advantages?
Github: https://github.com/abhisheksp/algorithms (the repo has the test cases)
3 Answers 3
docstrings
Docstrings matter. Delete the cryptic #BST
comment and spell it out in a PEP8 class docstring:
class Node:
"""Implements a Binary Search Tree."""
Either search
or __init__
needs a docstring that mentions the equality case (I'm looking at elif root.val < k:
, as opposed to <= k
).
correctness
I can't imagine successor
is correct. You didn't offer any unit tests like this, but I'm thinking of passing in a node with no right subtree, a node having the same value as root. Then we return None ?? This comes back to the <= k
detail I mentioned above. Anyway, predecessor
is nicely parallel, simple is good. And insert
is beautiful and concise.
public API design
For some methods, rather than passing in root
, it would be more natural to be working with an instance of your BST class, with self.root
implicitly available.
style
As a trivial style detail, in parent
you might use parentheses a level deeper to remove the backslash.
I didn't run flake8
myself, but overall it looks good. I have a weak preference for if node.left:
instead of if node.left is not None:
.
tests
I'm looking at this:
class BSTTest(unittest.TestCase):
def test_recursive_search(self):
root = build_bst()
actual = search(root, 35).val
expected = 35
self.assertEqual(expected, actual)
I would much rather see the method expressed concisely:
def test_recursive_search(self):
root = build_bst()
self.assertEqual(35, search(root, 35).val)
It might even come out as:
def test_recursive_search(self):
self.assertEqual(35, search(self.root, 35).val)
where build_bst()
is invoked by a setup method.
There are other tests, e.g. test_insert_on_empty_tree
, where of course hanging on to an actual
temp variable is the natural way to express it.
I am very glad to see these tests. They appear to give reasonable code coverage.
Another example tree that stores multiple copies of a given value (e.g. 30) would be of interest.
summary
Overall it looks good! Ship it.
Thanks for sharing your code,
Instead of dealing with nodes directly, I think it would be easier, and make a lot more sense to have code that looked like this.
bst = BST([1,6,3,3,5])
bst.insert(5)
min = bst.minimum()
max = bst.max()
etc.
the BST
class would be in charge of managing the Node
objects. This way I'm not passing around a root Node
to all my functions. For someone who wants to use this tree, I shouldn't need to know about how all these Node
s work. A Node
is an implementation detail of your BST
Make use of __eq__
, __hash__
and __str__
. Implementing these methods would allow you to compare and display your Node
objects very easily. You could also implement __lt__
and __gt__
to compare nodes via node1 < node2
rather than node1.val < node2.val
(although __gt__
may not actually be required here)
very small stylistic thing here. The code left = None, right = None
should be left=None, right=None
(no spaces) according to PEP8 guidelines.
Some additional methods you could add
post_order_traversal
pre_order_traversal
in_order_traversal
these could maybe just print the elements in the given order, or maybe return a list of the elements (not the nodes) in that order.
Overall I think it's looking very nice!
Keep it up
As you asked for persistence.
You can temporarily instrument your code like
def __init__(self, val, left = None, right = None):
print("new", self.val, self)
...
def __del__(self):
print("del", self.val, self)
to get an idea what is going on whe you are inserting nodes. Your tree is not persistent as nodes are deleted and recreated with the same value.
r = Node(0)
for i in range(4):
r=insert(r,i+1)
gives
new 0 <Node object at 0x7f6dc2d1b240>
new 1 <Node object at 0x7f6dc2d2e2e8>
new 0 <Node object at 0x7f6dc2d2e390>
del 0 <Node object at 0x7f6dc2d1b240>
new 2 <Node object at 0x7f6dc2d1b240>
new 1 <Node object at 0x7f6dc2d2e3c8>
new 0 <Node object at 0x7f6dc2d2e358>
del 0 <Node object at 0x7f6dc2d2e390>
del 1 <Node object at 0x7f6dc2d2e2e8>
new 3 <Node object at 0x7f6dc2d2e390>
new 2 <Node object at 0x7f6dc2d2e320>
new 1 <Node object at 0x7f6dc2d2e470>
new 0 <Node object at 0x7f6dc2d2e2b0>
del 0 <Node object at 0x7f6dc2d2e358>
del 1 <Node object at 0x7f6dc2d2e3c8>
del 2 <Node object at 0x7f6dc2d1b240>
so any other variable referencing a node or even the root node will possibly reference something old.
Explore related questions
See similar questions with these tags.
if not None
is very redundant. None type in python has a truthy value of false. You can jut sayif root.right:
and the code will then execute if it exists, and will not if it doesn't. This alone will make your code much more readable. \$\endgroup\$