3
\$\begingroup\$

Kata: https://www.codewars.com/kata/linked-lists-remove-duplicates/train/python

Description

Write a RemoveDuplicates() function which takes a list sorted in increasing order and deletes any duplicate nodes from the list. Ideally, the list should only be traversed once. The head of the resulting list should be returned.

var list = 1 -> 2 -> 3 -> 3 -> 4 -> 4 -> 5 -> null
removeDuplicates(list) === 1 -> 2 -> 3 -> 4 -> 5 -> null

If the passed in list is null/None/nil, simply return null.

Note: Your solution is expected to work on long lists. Recursive solutions may fail due to stack size limitations.

My Solution

class Node(object):
 def __init__(self, data, nxt=None):
 self.data = data
 self.next = nxt
def remove_duplicates(head):
 node = Node(None, head)
 st = set()
 while node.next:
 if node.next.data in st:
 tmp = node.next
 node.next = node.next.next
 del tmp
 else:
 st.add(node.next.data)
 node = node.next
 return head

I am interested in performance and adhering to best practices.

asked Mar 10, 2019 at 14:26
\$\endgroup\$

1 Answer 1

8
\$\begingroup\$

The space complexity of your solution is \$O(n)\$. What if I told you that given the constraints in the question, you could solve this problem in \$O(1)\$ space?

Consider that you receive an already sorted list. This implies that all duplicates must be adjacent. This in turn means that for deduplicating these, you do not need to keep track of which values you already encountered.

Note that del is not necessary here. You never use tmp aside from assigning node.next to it in any case. del is not clearing the memory allocated to the variable (or the object it refers to), it only destroys the variable itself, the object is unaffected. As such it does not have any effect on the garbage-collection that python performs...

The following should already work just fine:

def remove_duplicates(head):
 node = Node(None, head)
 while node.next:
 if node.next.data == node.data:
 node.next = node.next.next
 else:
 node = node.next
 return head

Small sidenote: I really like the hack you used to ensure that you handle the head is None case, but you should be aware that it is just a hack. The cleaner solution to this would be:

def remove_duplicates(head):
 if not head:
 return head
 node = head
 while node.next:
 # Rest of the code...
Tobi Alafin
1,7921 gold badge15 silver badges31 bronze badges
answered Mar 10, 2019 at 14:49
\$\endgroup\$
4
  • 1
    \$\begingroup\$ Surely if head is None would be better than if not head, no? It may or may not be faster, depending on where the bottlenecks are (it's usually faster) and it more clearly conveys the intent of the code. \$\endgroup\$ Commented Mar 10, 2019 at 19:04
  • \$\begingroup\$ Thanks for this, yours is really neat. It still looks like it's \$O(n)\$ to me as you have to iterate through the list? \$\endgroup\$ Commented Mar 10, 2019 at 19:39
  • \$\begingroup\$ I find special cases aesthetically unappealing which is why I used the "hack"; is that considered bad form? \$\endgroup\$ Commented Mar 10, 2019 at 19:40
  • 2
    \$\begingroup\$ it is in fact O(n) in time, but O(1) in space. Special cases are something I prefer to handle "separately". YMMV, but as soon as special cases become harder to reconciliate with the core algorithm, it makes more and more sense to handle them in dedicated branches \$\endgroup\$ Commented Mar 10, 2019 at 19:54

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.