Kata: https://www.codewars.com/kata/linked-lists-remove-duplicates/train/python
Description
Write a RemoveDuplicates() function which takes a list sorted in increasing order and deletes any duplicate nodes from the list. Ideally, the list should only be traversed once. The head of the resulting list should be returned.
var list = 1 -> 2 -> 3 -> 3 -> 4 -> 4 -> 5 -> null removeDuplicates(list) === 1 -> 2 -> 3 -> 4 -> 5 -> null
If the passed in list is null/None/nil, simply return null.
Note: Your solution is expected to work on long lists. Recursive solutions may fail due to stack size limitations.
My Solution
class Node(object):
def __init__(self, data, nxt=None):
self.data = data
self.next = nxt
def remove_duplicates(head):
node = Node(None, head)
st = set()
while node.next:
if node.next.data in st:
tmp = node.next
node.next = node.next.next
del tmp
else:
st.add(node.next.data)
node = node.next
return head
I am interested in performance and adhering to best practices.
1 Answer 1
The space complexity of your solution is \$O(n)\$. What if I told you that given the constraints in the question, you could solve this problem in \$O(1)\$ space?
Consider that you receive an already sorted list. This implies that all duplicates must be adjacent. This in turn means that for deduplicating these, you do not need to keep track of which values you already encountered.
Note that del
is not necessary here. You never use tmp
aside from assigning node.next
to it in any case. del
is not clearing the memory allocated to the variable (or the object it refers to), it only destroys the variable itself, the object is unaffected. As such it does not have any effect on the garbage-collection that python performs...
The following should already work just fine:
def remove_duplicates(head):
node = Node(None, head)
while node.next:
if node.next.data == node.data:
node.next = node.next.next
else:
node = node.next
return head
Small sidenote: I really like the hack you used to ensure that you handle the head is None
case, but you should be aware that it is just a hack. The cleaner solution to this would be:
def remove_duplicates(head):
if not head:
return head
node = head
while node.next:
# Rest of the code...
-
1\$\begingroup\$ Surely
if head is None
would be better thanif not head
, no? It may or may not be faster, depending on where the bottlenecks are (it's usually faster) and it more clearly conveys the intent of the code. \$\endgroup\$wizzwizz4– wizzwizz42019年03月10日 19:04:23 +00:00Commented Mar 10, 2019 at 19:04 -
\$\begingroup\$ Thanks for this, yours is really neat. It still looks like it's \$O(n)\$ to me as you have to iterate through the list? \$\endgroup\$Tobi Alafin– Tobi Alafin2019年03月10日 19:39:56 +00:00Commented Mar 10, 2019 at 19:39
-
\$\begingroup\$ I find special cases aesthetically unappealing which is why I used the "hack"; is that considered bad form? \$\endgroup\$Tobi Alafin– Tobi Alafin2019年03月10日 19:40:49 +00:00Commented Mar 10, 2019 at 19:40
-
2\$\begingroup\$ it is in fact O(n) in time, but O(1) in space. Special cases are something I prefer to handle "separately". YMMV, but as soon as special cases become harder to reconciliate with the core algorithm, it makes more and more sense to handle them in dedicated branches \$\endgroup\$Vogel612– Vogel6122019年03月10日 19:54:57 +00:00Commented Mar 10, 2019 at 19:54
Explore related questions
See similar questions with these tags.