Remove duplicate elements from linked list.
#include<iostream>
using namespace std;
class LinkedList{
private:
int data;
LinkedList *next;
public:
void insert(LinkedList **start, int data){
LinkedList *p = new LinkedList;
if (*start == NULL){
p->data = data;
p->next = NULL;
*start = p;
}
else{
LinkedList *temp = *start;
while (temp->next != NULL){
temp = temp->next;
}
temp->next = p;
p->data = data;
p->next = NULL;
}
}
void removeDuplicates(LinkedList **start){//This function removes the duplicates using the standard runner method taking 2 pointers
LinkedList *temp = *start;
LinkedList *temp1 = (*start);
while (temp != NULL){
while (temp1->next!=NULL){
if (temp->data == temp1->next->data){
LinkedList *p;
p = temp1->next;
temp1->next = temp1->next->next;
delete(p);
}
else{
temp1 = temp1->next;
}
}
temp = temp->next;
temp1 = temp;
}
}
void traverse(LinkedList **start){
LinkedList *temp = *start;
while (temp != NULL){
cout << temp->data;
temp = temp->next;
}
}
};
int main(){
LinkedList *start = NULL;
LinkedList p1;
p1.insert(&start, 9);
p1.insert(&start, 8);
p1.insert(&start, 7);
p1.insert(&start, 9);
p1.insert(&start, 8);
p1.traverse(&start);
p1.removeDuplicates(&start);
cout << "\n";
p1.traverse(&start);
getchar();
return 0;
}
This takes \$O(N^2)\$ time. Can we do the same in \$O(N)\$ without the use of hash table? Or if mandatory how can we implement it in C++?
How is the overall code quality of the above code? Is there any scope of improvement?
5 Answers 5
I see a number of things that may help you improve your code.
Make sure you have all required #include
s
The code uses getchar
but doesn't #include <cstdio>
. Also, carefully consider which #include
s are part of the interface (and belong in the .h
file) and which are part of the implementation.
Avoid raw pointers
In modern C++, we tend not to use raw pointers very often. In this case, It would probably be better to have two different classes, one would be a LinkedList
class and the other would be a Node
class. That way, instead of starting with a pointer, you can start with an object.
Use nullptr
rather than NULL
Modern C++ uses nullptr
rather than NULL
. See this answer for why and how it's useful.
Match new
with delete
If you allocate memory using new
, you must also free it using delete
or your program will leak memory. Since you use new
in insert()
, you should use delete
in the LinkedList
destructor which you have not yet written.
Don't abuse using namespace std
Putting using namespace std
within your program is generally a bad habit that you'd do well to avoid.
Use more descriptive names
The code's traverse
function actually prints the nodes. For that reason it should probably be named something like print()
. Even better would be to have such a function take a std::ostream&
argument so it would be possible to print to something other than std::cout
.
Omit return 0
If your program completes successfully, the return 0
at the end of main()
will be generated automatically, so it's not needed in C++ programs.
-
\$\begingroup\$ Thank you very much on your comments. I couldn't exactly get the destructor part. Do I need to delete all the pointers which I created using new? If yes can you help me with doing that? \$\endgroup\$Dhaval Doshi– Dhaval Doshi2015年12月28日 02:43:39 +00:00Commented Dec 28, 2015 at 2:43
-
\$\begingroup\$ Yes, you need to delete all the pointers created with
new
in your destructor. However, in your current implementation there is an odd mix of pointers and objects. Look at this C++ linked list and especially the comments on it to see how it might be done. \$\endgroup\$Edward– Edward2015年12月28日 03:15:26 +00:00Commented Dec 28, 2015 at 3:15 -
\$\begingroup\$ @DhavalDoshi You might want to read about the Stack and the Heap (just search for it). The keyword new reserves memory on the Heap. Until delete is called, this part stays reserved, since the program does not know if the data is to be needed in the future or not. This is called a memory leak, another term you might want to search for. Lastly, if you keep on working with raw pointers, consider to learn how to use the tool valgrind, which is quite adept at detecting memory leaks. \$\endgroup\$Aziuth– Aziuth2017年07月28日 09:22:38 +00:00Commented Jul 28, 2017 at 9:22
In both pathes of if .. else you have the identical Code. Take it out of the condition, DRY:
void insert(LinkedList **start, int data)
{
LinkedList *p = new LinkedList;
p->data = data;
p->next = NULL;
if (*start == NULL)
{
*start = p;
}
else
{
LinkedList *temp = *start;
while (temp->next != NULL)
{
temp = temp->next;
}
temp->next = p;
}
}
I don't know how to identify duplicate values in an unsorted container allowing duplicates in \$O(N)\$ time without hashing.
Nits:
- state upfront your goal coding what you present
- you tagged the question oop: move non-List members to a class
LinkedNode
- in
insert()
, both branches of theif
-statement share code - Use references (there should be a trivial way "in CR" to refer to
Scott Meyers "Effective C++"/"Effective Modern C++" or equiv.) - don't use different notations for the same "thing"
(initialisers fortemp
andtemp1
) - with loops having a "while"-condition, an advance to next iteration, and possibly an initialise, use
for
shot:
void removeDuplicates(LinkedNode *node) {
for ( ; nullptr != node ; node = node->next) {
LinkedNode *prev = node, *other;
while (nullptr != (other = prev->next))
if (node->data == other->data) {
prev->next = other->next;
delete(other);
} else
prev = other;
}
}
Step 1: Sort the list using merge sort - O(nlog n). Step 2: Remove duplicate elements iterating from second position by comparing current node against the previous one - O(n).
Overall complexity will be O(nlog n) + O(n).
I'd just like to mention the naming in your code. First, the word data
is meaningless. Every byte in the computer is some sort of data. I realize this is a contrived example since the data doesn't actually represent anything. But in real code, you should avoid generic terms like "data," "value," "info," etc.
Secondly, naming variables temp
and temp1
is even less meaningful. temp
implies a temporary variable. If these were temporary variables you'd want to know what they were temporaries of. For example, tempPhoneNumber
or tempAddress
. But in this case, they aren't temporaries. They are the thing you are operating on. They represent the node you are trying to match, and the node which you are testing for a match. So you should name them something like nodeToMatch
and nodeToTest
or matchNode
and testNode
, or something like that. It's much clearer. In traverse()
we see the same problem. You could at least call it nextNode
or something like at least slightly more meaningful than temp
.
Explore related questions
See similar questions with these tags.