I'm trying to write a better code in python. I don't know where to start checking. I want to make sure that I do write a best practice code.
This is the second problem in Hackerrank trie data structure:
Given N strings. Each string contains only lowercase letters from (both inclusive). The set of strings is said to be GOOD SET if no string is prefix of another string else, it is BAD SET. (If two strings are identical, they are considered prefixes of each other.)
For example, aab, abcde, aabcd is BAD SET because aab is prefix of aabcd.
Print GOOD SET if it satisfies the problem requirement. Else, print BAD SET and the first string for which the condition fails.
Input Format
First line contains , the number of strings in the set. Then next lines follow, where line contains string.
Output Format
Output GOOD SET if the set is valid. Else, output BAD SET followed by the first string for which the condition fails.
Sample Input00
7 aab defgab abcde aabcde cedaaa bbbbbbbbbb jabjjjadSample Output00
BAD SET aabcde
Sample Input01
4 aab aac aacghgh aabghghSample Output01
BAD SET aacghgh
from sys import stdin
class Node:
def __init__(self,char):
self.character = char
self.children = {}
self.counter = 0
self.end_word = False
class Trie:
def __init__(self):
self.root = Node('*')
def add(self,word):
current = self.root
fail = False
for char in word:
if char not in current.children:
new_node = Node(char)
current.children[char] = new_node
current = new_node
new_node.counter += 1
else:
current = current.children[char]
current.counter += 1
if current.end_word:
fail = True
current.end_word = True
# first word > second word : second word is prefix of first word
if current.counter >=2:
fail = True
return fail
if __name__ == "__main__":
tree = Trie()
n = stdin.readlines()
for i in n[1:]:
i = i.strip()
add_check_string = tree.add(i)
if add_check_string:
print("BAD SET")
print(i)
break
if not add_check_string:
print("GOOD SET")
3 Answers 3
I think the only change I would make is removing node.counter, and instead detecting if a previous word is a prefix of the current word by checking if current.fail==True.
-
\$\begingroup\$ Thank you for your raply. May I know Why did you mention taht there are Performance Issues? \$\endgroup\$Mohamed Saad– Mohamed Saad2018年03月22日 20:44:19 +00:00Commented Mar 22, 2018 at 20:44
-
\$\begingroup\$ oops, I combined 2 separate (and unrelated) questions in my head, and pulled out that you thought this was a bottleneck somehow. Edited. \$\endgroup\$Oscar Smith– Oscar Smith2018年03月22日 21:01:37 +00:00Commented Mar 22, 2018 at 21:01
- You could run
autopep8for minor style corrections (e.g. whitespaces). Naming variables correctly is always extremely important, especially in a dynamic language, without indications about object type:
- Your
trieis aTrie, not just atree. nsounds like an integer. It could be calledlines.ialso sounds like an integer. It could belineorword.
- Your
Using
for/elsesyntax, you can refactor your main method without any flag.
if __name__ == "__main__":
trie = Trie()
_header, *lines = stdin.readlines()
for line in lines:
word = line.strip()
if trie.add(word):
print("BAD SET")
print(word)
break
else:
print("GOOD SET")
- If you wish, you could replace the 3 last lines of
addbyreturn fail or current.counter >= 2
There is a lot of repetition in your add method.
This can be simplified, ditching the fail flag to
def add_unless_prefix(self,word):
current = self.root
for char in word:
if char not in current.children:
current.children[char] = Node(char)
current = current.children[char]
current.counter += 1
if current.end_word:
return True
current.end_word = True
return current.counter >=2
This can be simplified a bit using dict.setdefault.
if char not in current.children:
current.children[char] = Node(char)
current = current.children[char]
can become current = current.children.setdefault(char, Node(char))
-
\$\begingroup\$ I also tried to remove the
failflag, but .... I failed. By returning early, you preventwordfrom being added to thetrie.return failwas just some extra piece of information, not the main purpose of theaddmethod. At the very least, your new method should be called differently, e.g.add_unless_prefix. \$\endgroup\$Eric Duminil– Eric Duminil2018年03月25日 19:52:05 +00:00Commented Mar 25, 2018 at 19:52 -
\$\begingroup\$ Oops, small bug. That was meant to be
return True, signalling that it matches a prefix \$\endgroup\$Maarten Fabré– Maarten Fabré2018年03月26日 06:51:40 +00:00Commented Mar 26, 2018 at 6:51 -
\$\begingroup\$ It still doesn't add the word to the trie, though. Given the name of the class and method, it still sounds like a bug. \$\endgroup\$Eric Duminil– Eric Duminil2018年03月26日 06:57:52 +00:00Commented Mar 26, 2018 at 6:57
-
\$\begingroup\$ this isn't mean to be a complete Trie. This is meant to detect whether there is a prefix, so in the context of this HackerRank issue, this suffices. \$\endgroup\$Maarten Fabré– Maarten Fabré2018年03月26日 07:08:41 +00:00Commented Mar 26, 2018 at 7:08
-
\$\begingroup\$ Fine, then you should change the method name to make it clear. \$\endgroup\$Eric Duminil– Eric Duminil2018年03月26日 07:20:33 +00:00Commented Mar 26, 2018 at 7:20