I have been studying intermediate or more advanced data structures in Python, one of which I've decided to try out is the Tree Data Structure. I've been studying the theories as well as the implementations from Youtube courses and I might say that I've grasped quite a bit of it.
I decided to make my own original algorithm to append and access a data in a Tree without copying from the videos:
class TreeNode:
def __init__(self,data = None):
self.data = data
self.left = None
self.right = None
class Tree:
def __init__(self):
self.head = TreeNode()
def append(self, data):
cur = self.head
new_node = TreeNode(data)
while True:
if cur.data == None:
cur.data = data
return
if data < cur.data:
if cur.left == None:
cur.left = new_node
return
if cur.left != None:
cur = cur.right
if data > cur.data:
if cur.right == None:
cur.right = new_node
return
if cur.right != None:
cur = cur.right
def find_data(self, data):
cur = self.head
while cur.right != None and cur.left != None:
if data > cur.data:
cur = cur.right
if data < cur.data:
cur = cur.left
if data == cur.data:
return print(f'Data {data} has been found.')
return print("Im sorry, your data is not found!")
my_tree = Tree()
my_tree.append(50)
my_tree.append(100)
my_tree.append(30)
my_tree.find_data(30)
my_tree.find_data(20)
I would like to know if this is considered an optimal and efficient algorithm since I've been writing it only using the theories and logics from the videos. I would like to know if my understanding's clear enough.
If there are any, in what ways could the algorithms be improved?
3 Answers 3
A few suggestions:
- There is a typo in
Tree.append
where we move tocur.right
whencur.left
is notNone
. This leads to incorrect organization of the data. - It is rather hard to visualize the tree. You can add a
__str__
method toTreeNode
. - Testing more complex cases is usually necessary to establish correctness.
- The
append
method gets stuck in thewhile
loop when duplicate data is inserted. - The
find_data
method is not quite correct; it's perfectly fine to have the children beNone
. The concern is moving to a child that isNone
or when the current data isNone
. Notably, the current code will not work when we try to usefind_data
on aTree
with 1 insertion. new_node
is declared earlier than it needs to be.- Using
if
,elif
,else
can improve simplicity. - Try to use
is
instead of==
when comparing to None. - There is an inconsistency in style between
append
which checks the left branch first andfind_data
which checks the right branch first.
Here is the code with the suggestions implemented:
class TreeNode:
def __init__(self, data=None):
self.data = data
self.left = None
self.right = None
def __str__(self):
return "({}) <- {} -> ({})".format(
self.left, self.data, self.right)
class Tree:
def __init__(self):
self.head = TreeNode()
def append(self, data):
cur = self.head
while True:
if cur.data is None:
cur.data = data
return
if data < cur.data:
if cur.left is None:
cur.left = TreeNode(data)
return
cur = cur.left
elif data > cur.data:
if cur.right is None:
cur.right = TreeNode(data)
return
cur = cur.right
else:
return
def find_data(self, data):
cur = self.head
while True:
if cur.data is None:
return print("Im sorry, your data is not found!")
if data < cur.data:
if cur.left is None:
return print("Im sorry, your data is not found!")
cur = cur.left
elif data > cur.data:
if cur.right is None:
return print("Im sorry, your data is not found!")
cur = cur.right
else:
return print(f'Data {data} has been found.')
my_tree = Tree()
my_tree.find_data(50)
my_tree.append(50)
my_tree.find_data(50)
my_tree.append(50)
my_tree.append(100)
my_tree.append(30)
my_tree.append(20)
my_tree.append(10)
print(my_tree.head)
my_tree.find_data(30)
my_tree.find_data(10)
my_tree.find_data(5)
Hope this helps! Good work on the MVP.
3 quick minor suggestions:
- consider upgrading your class to a dataclass, then you can ditch
__init__
and possibly__repr__
as well - the match statement can be used to replace if statements
- In terms of behavior, seems to me a function like
find_data
should simply return whatever value was found, orNone
by convention. Let the calling function handle the result accordingly, rather than embed hardcoded prints which are less flexible
Also, I had a quick look at similar code on the Internet. Naming conventions vary from one author to another, thus there is one implementation that has an add_child
method to append "data", and there is a corresponding remove_child
method as well, which you haven't implemented but could be useful at some point.
address vs contents
if cur.data == None:
Prefer ... is None:
It's kind of a weird python culture thing, related to None being a singleton. Sorry. Your linter probably told you about it but then you ignored it. (E711)
Also, it's a bit odd that you accept arbitrary data
yet you prohibit a None
value.
Minimally, the append()
"""docstring""" should spell out that restriction.
else
if cur.left == None:
...
if cur.left != None:
This is just annoying.
It leaves the reader wondering how many cases there could possibly be.
Use an else
clause, please.
if data < cur.data:
...
if data > cur.data:
In the case of data == cur.data
we implicitly leave it unchanged,
which is maybe just fine.
I note in passing that it is possible for the ==
operator
between objects to return True
, even though an object
is noticeably different, e.g. an audit trail .modified
attribute has a more recent timestamp.
I would have been happier with a simple else
clause here.