I wrote the following prefix tree data structure for this Leet code challenge.
Implement a trie with
insert
,search
, andstartsWith
methods.Note:
You may assume that all inputs are consist of lowercase letters a-z.
However it keeps timing out on the long input. As far as I know the insert
, search
and startsWith
functions have \$\mathcal{O}(n)\$ time complexity, where \$n\$ is the length of the input string. Can someone review my code and see where I could improve the speed of the code?
class TrieNode(object):
def __init__(self):
self.next = [['!',None],['!',None],['!',None],['!',None],['!',None],['!',None],['!',None],['!',None],['!',None],
['!',None],['!',None],['!',None],['!',None],['!',None],['!',None],['!',None],['!',None],['!',None],
['!',None],['!',None],['!',None],['!',None],['!',None],['!',None],['!',None],['!',None]]
self.isWord = False
class Trie(object):
def __init__(self):
self.root = TrieNode()
def insert(self, word):
"""
Inserts a word into the trie.
:type word: str
:rtype: void
"""
i = 0
wordLength = len(word)
node = self.root
while i < wordLength:
idx = ord(word[i]) - ord('a')
if node.next[idx][0] == '!':
node.next[idx][0] = word[i]
if i < wordLength-1 and node.next[idx][1] == None:
node.next[idx][1] = TrieNode()
elif i == wordLength-1:
node.isWord = True
return
i += 1
node = node.next[idx][1]
def search(self, word):
"""
Returns if the word is in the trie.
:type word: str
:rtype: bool
"""
i = 0
wordLength = len(word)
node = self.root
while i < wordLength and node != None:
idx = ord(word[i]) - ord('a')
if node.next[idx][0] == '!':
return False
if i == wordLength - 1 and node.next[idx][1] != None:
return node.isWord
node = node.next[idx][1]
i += 1
if node == None and i < wordLength:
return False
return True
def startsWith(self, prefix):
"""
Returns if there is any word in the trie
that starts with the given prefix.
:type prefix: str
:rtype: bool
"""
i = 0
wordLength = len(prefix)
node = self.root
while i < wordLength and node != None:
idx = ord(prefix[i]) - ord('a')
if node.next[idx][0] == '!':
return False
node = node.next[idx][1]
i+=1
if node == None and i < wordLength:
return False
return True
-
\$\begingroup\$ See this answer for some ideas on how to improve this code. \$\endgroup\$Gareth Rees– Gareth Rees2016年10月02日 12:51:49 +00:00Commented Oct 2, 2016 at 12:51
-
\$\begingroup\$ Looks good @Gareth Rees. I will meticulously look through that post and take some pointers. Nice find. \$\endgroup\$Rakib Hasan– Rakib Hasan2016年10月02日 17:51:53 +00:00Commented Oct 2, 2016 at 17:51
-
\$\begingroup\$ Thanks @GarethRees that post helped me solve the challenge. Apparently there was something logically wrong with my code. After adding the changes alexwichan suggested, It wasn't timing out, but it was returning true instead of false for one of the search calls. Then I took a look at the answer that Gareth sent and did some more changes and was finally able to pass the challenge. \$\endgroup\$Rakib Hasan– Rakib Hasan2016年10月03日 04:48:55 +00:00Commented Oct 3, 2016 at 4:48
2 Answers 2
Assuming the next
attribute of TrieNode
is just a lookup for letters of the alphabet, you might be better off replacing it with a dictionary that maps letters of the alphabet to their next value. Something like:
class TrieNode(object):
def __init__(self):
self.next = {letter: ['!', None] for letter in string.ascii_lowercase}
self.isWord = False
and now, rather than mucking around with ord()
in your insert()
and search()
methods, you can use the characters directly. That won’t fundamentally change the complexity, but will make a small performance saving.
I think insert and search of a trie is O(key_length), so this is probably the sort of saving you need. You can’t change the complexity.
A few other small suggestions:
- Comparisons to
None
should usefoo is None
andfoo is not None
, not direct equality. Once you’re using the characters as keys in the
next
attribute, you’d be better of replacing thosewhile i < wordLength
loops with something likefor i, char in enumerate(word)
.In general, PEP 8 convention is that Python variables are lowercase with underscores (e.g.
is_word
,starts_with
), and only classes are camel case. (Although as @Graipher notes in the comments, camel case is specified forstartsWith()
.)
-
2\$\begingroup\$
startsWith
is defined as the interface, though I agree onis_word
. \$\endgroup\$Graipher– Graipher2016年10月02日 09:16:07 +00:00Commented Oct 2, 2016 at 9:16 -
\$\begingroup\$ Thanks alexwichan I will attempt to implement these changes and get back to you. \$\endgroup\$Rakib Hasan– Rakib Hasan2016年10月02日 16:23:58 +00:00Commented Oct 2, 2016 at 16:23
After looking at the post that @Gareth sent, I was able to make a solution that works
from collections import defaultdict
I use a defaultdict which is a dictionary that adds and returns a default value for a key if a key is not in the dictionary. This is quite useful in the insert function.
class TrieNode(object):
def __init__(self):
self.next = defaultdict(TrieNode)
self.isEnd = False
class Trie(object):
def __init__(self):
self.root = TrieNode()
I use a simple loop in the insert function. For each letter in the word, I get the value for the letter key in the defaultdict. If the key is not present then by default it adds a letter and Trienode key value pair in the dictionary and returns the new Trienode. Also in the Trienode that is the value of the last letter in the word I set the isEnd variable to True. In the original code above I set the isEnd variable to true for the Trienode that contained the last letter.
def insert(self, word):
if not word:
return
node = self.root
for letter in word:
node = node.next[letter]
node.isEnd = True
In the search function I also loop through each letter in the word. I the letter is in the dictionary for that Trienode, then I set node to the node value of that letter key. If it is not then I return False. At the end of the loop I return the value of the isEnd variable of the Trienode value for the last letter of the word.
def search(self, word):
if not word:
return
node = self.root
for letter in word:
if letter in node.next:
node = node.next[letter]
else:
return False
return node.isEnd
In the startsWith function I loop through every letter in the prefix. If the letter is in the defaultdict then I set node to the Trienode value for that letter. If it is not then I return False. I keep track of the number of iterations of the loop. After the loop, if the count is the same as the length of the prefix then I return True otherwise I return False.
def startsWith(self, prefix):
node = self.root
i = 0
for letter in prefix:
if letter in node.next:
node = node.next[letter]
else:
return False
i += 1
return True if i == len(prefix) else False
Explore related questions
See similar questions with these tags.