Removing duplicates from a linked list without a buffer

Question 1

I am practicing a problem in "Cracking the Coding Interview". The problem is to remove duplicates in a linked list without the use of a buffer. I interpreted this as not using a list or any large data structure such as a hash to store unique nodes.

My algorithm is inefficient I think. It iterates an anchor across the linked list. From that anchor, a second sub-iteration occurs which removes any nodes that is the same as the anchor. Thus, there are a total of two loops. I think the time complexity is either O(n^2) or O(nlogn)

Any better algorithms are welcome. Since I am a novice in Python, please give me suggestions on my coding style.

class Node(object):
 def __init__(self, data):
 self.data = data
 self.next = None
 def getData(self):
 return self.data
 def setNext(self, node):
 self.next = node
 def getNext(self):
 return self.next
class LinkedList(object):
 def __init__(self, dataList):
 assert len(dataList) > 0
 self.head = Node(dataList[0])
 iterNode = self.head
 for i in range(1, len(dataList)):
 iterNode.setNext(Node(dataList[i]))
 iterNode = iterNode.getNext()
 iterNode.setNext(None)
 def printList(self):
 iterNode = self.head
 while iterNode is not None:
 print(iterNode.getData())
 iterNode = iterNode.getNext()
 def removeDuplicates(self):
 assert self.head is not None
 anchor = self.head
 while anchor is not None:
 iterator = anchor
 while iterator is not None:
 prev = iterator
 iterator = iterator.getNext()
 if iterator is None:
 break
 if iterator.getData() == anchor.getData():
 next = iterator.getNext()
 prev.setNext(next)
 anchor = anchor.getNext()
dataList = ["hello", "world", "people", "hello", "hi", "hi"]
linkedList = LinkedList(dataList)
linkedList.printList()
linkedList.removeDuplicates()
print("\n")
linkedList.printList()

Output:

hello
world
people
hello
hi
hi
hello
world
people
hi

Question 2

Sensible overloading

String representation

Think about a native list like [1, 2, 3] would think it would be user-friendly if the repl behaved something like this?

>>> [1, 2, 3]
<__main__.list object at 0x7f9b399475c0>

But your class behaves exactly like that:

>>> LinkedList([1,2,3])
<__main__.LinkedList object at 0x7f9b3a69e908>

So instead of that weird printList(self) implement a __repr__ method for usability sake.

Iteration

A list that you cannot iterate over makes no sense.

In Python each container must expose a way to be iterated over with a for loop, like:

>>> for i in LinkedList([1,2,3]):
 print(i)

That does not work for your implementation.

Manual looping like you do in printList is out of the question in Python. Such low level details should be written once in an __iter__ method and then forgotten.

Question 3

Problem statement

The problem is underdefined. In a real-life interview I'd ask immediately, do we care about the stability? If yes, your solution is good. If no, sort the list first.
Complexity

\$O(n^2)\$ and \$O(n\log n)\$ are quite different beasts. This algorithm is \$O(n^2)\$ obviously. If sorting is allowed, there is a \$O(n \log n)\$ solution.
getters and setters

Python doesn't have a concept of public and private. At the end of the day everything is public. getData, getNext and setNext do not serve any purpose but generating entropy.

Even in the language which encourages such approach it only makes sense when there is an invariant to maintain. Unrestricted getter and setter are as good as a public member.

Caridorc Caridorc 28.1k7 gold badges54 silver badges137 bronze badges · Accepted Answer · 2016-01-16 10:28:07Z

Sensible overloading

String representation

Think about a native list like [1, 2, 3] would think it would be user-friendly if the repl behaved something like this?

>>> [1, 2, 3]
<__main__.list object at 0x7f9b399475c0>

But your class behaves exactly like that:

>>> LinkedList([1,2,3])
<__main__.LinkedList object at 0x7f9b3a69e908>

So instead of that weird printList(self) implement a __repr__ method for usability sake.

Iteration

A list that you cannot iterate over makes no sense.

In Python each container must expose a way to be iterated over with a for loop, like:

>>> for i in LinkedList([1,2,3]):
 print(i)

That does not work for your implementation.

Manual looping like you do in printList is out of the question in Python. Such low level details should be written once in an __iter__ method and then forgotten.

Stack Exchange Network

Removing duplicates from a linked list without a buffer

2 Answers 2

Sensible overloading

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Removing duplicates from a linked list without a buffer

2 Answers 2

Sensible overloading

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions