Remove duplicates from unsorted linked list

Question 1

Remove duplicate elements from an unsorted linked list.

    #include<iostream>
    using namespace std;
    class LinkedList{
    private:
        int data;
        LinkedList *next;
    public:
        void insert(LinkedList **start, int data){
            LinkedList *p = new LinkedList;
            if (*start == NULL){
                p->data = data;
                p->next = NULL;
                *start = p;
            }
            else{
                LinkedList *temp = *start;
                while (temp->next != NULL){
                    temp = temp->next;
                }
                temp->next = p;
                p->data = data;
                p->next = NULL;
            }
        }
        void removeDuplicates(LinkedList **start){//This function removes the duplicates using the standard runner method taking 2 pointers
            LinkedList *temp = *start;
            LinkedList *temp1 = (*start);
            while (temp != NULL){
                while (temp1->next!=NULL){
                    if (temp->data == temp1->next->data){
                        LinkedList *p;
                        p = temp1->next;
                        temp1->next = temp1->next->next;
                        delete(p);
                    }
                    else{
                        temp1 = temp1->next;
                    }
                }
                temp = temp->next;
                temp1 = temp;
            }
        }
        void traverse(LinkedList **start){
            LinkedList *temp = *start;
            while (temp != NULL){
                cout << temp->data;
                temp = temp->next;
            }
        }
    };
    int main(){
        LinkedList *start = NULL;
        LinkedList p1;
        p1.insert(&start, 9);
        p1.insert(&start, 8);
        p1.insert(&start, 7);
        p1.insert(&start, 9);
        p1.insert(&start, 8);
        p1.traverse(&start);
        p1.removeDuplicates(&start);
        cout << "\n";
        p1.traverse(&start);
        getchar();
        return 0;
    }

This takes \$O(N^2)\$ time. Can we do the same in \$O(N)\$ without using a hash table? Or, if a hash table is mandatory, how can we implement it in C++?

How is the overall quality of the code above? Is there any scope for improvement?

Answer 1

I see a number of things that may help you improve your code.

Make sure you have all required `#include`s

The code uses `getchar` but doesn't `#include <cstdio>`. Also, carefully consider which `#include`s are part of the interface (and belong in the .h file) and which are part of the implementation.

Avoid raw pointers

In modern C++, we tend not to use raw pointers very often. In this case, it would probably be better to have two different classes: a `LinkedList` class and a `Node` class. That way, instead of starting with a pointer, you can start with an object.
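A minimal sketch of that split, assuming a hypothetical `Node` struct and a `LinkedList` class that owns its nodes through `std::unique_ptr` (C++14 for `std::make_unique`). The member names and the `front()` and `size()` helpers are illustrative and not taken from the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical node type: holds one value and owns the node after it.
struct Node {
    int value;
    std::unique_ptr<Node> next;
    explicit Node(int v) : value(v) {}
};

// Hypothetical list type: owns the head node, so callers start with an
// object rather than with a raw pointer.
class LinkedList {
public:
    // Append a value at the end of the list.
    void insert(int value) {
        std::unique_ptr<Node>* slot = &head_;
        while (*slot) {
            slot = &(*slot)->next;  // advance to the next owning pointer
        }
        *slot = std::make_unique<Node>(value);
        ++size_;
    }

    std::size_t size() const { return size_; }

    // Value stored in the first node; the list must not be empty.
    int front() const {
        assert(head_ != nullptr);
        return head_->value;
    }

private:
    std::unique_ptr<Node> head_;
    std::size_t size_ = 0;
};
```

With this layout no explicit `delete` is needed, because each `unique_ptr` releases its node when the list goes out of scope. One caveat of the design: the nodes are destroyed recursively, so a very long list may need a hand-written loop in the destructor instead.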

Use `nullptr` rather than `NULL`

Modern C++ uses `nullptr` rather than `NULL`. See this answer for why and how it's useful.
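One commonly cited reason is overload resolution. The sketch below uses two hypothetical `describe` overloads, invented purely for illustration, to show that `nullptr` always selects the pointer version while a literal `0` selects the `int` version.

```cpp
#include <cassert>
#include <string>

// Two hypothetical overloads, used only to show which one gets picked.
std::string describe(int)  { return "int"; }
std::string describe(int*) { return "pointer"; }

// 0 is an int literal first, so the int overload wins.
std::string withZero() { return describe(0); }

// nullptr has its own type (std::nullptr_t) and only converts to pointers.
std::string withNullptr() { return describe(nullptr); }

// describe(NULL) is left out on purpose: NULL is an implementation-defined
// null pointer constant, so the call may select the int overload or be
// rejected as ambiguous, depending on the compiler.

// Small usage check; call it from main() to run the comparison.
void checkOverloads() {
    assert(withZero() == "int");
    assert(withNullptr() == "pointer");
}
```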

Match `new` with `delete`

If you allocate memory using `new`, you must also free it using `delete` or your program will leak memory. Since you use `new` in `insert()`, you should use `delete` in the `LinkedList` destructor, which you have not yet written.

Don't abuse `using namespace std`

Putting `using namespace std` within your program is generally a bad habit that you'd do well to avoid.

Use more descriptive names

The code's `traverse` function actually prints the nodes. For that reason it should probably be named something like `print()`. Even better would be to have such a function take a `std::ostream&` argument so it would be possible to print to something other than `std::cout`.
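A sketch of what such a `print()` could look like, assuming a hypothetical plain `Node` struct in place of the question's class. The `toString` helper is an illustrative addition, and the space separator is a choice made here; the original `traverse` prints the values back to back.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical node type standing in for the question's class.
struct Node {
    int value;
    Node* next;
};

// Writes every value, separated by spaces, to whichever stream is passed in.
void print(const Node* head, std::ostream& out) {
    for (const Node* node = head; node != nullptr; node = node->next) {
        out << node->value;
        if (node->next != nullptr) {
            out << ' ';
        }
    }
}

// Convenience wrapper: render the list into a string instead of a stream.
std::string toString(const Node* head) {
    std::ostringstream buffer;
    print(head, buffer);
    return buffer.str();
}
```

Calling `print(head, std::cout)` (with `<iostream>` included) reproduces the console output, while the same function can also write to a file stream or, as in `toString`, to a string buffer for testing.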

Omit `return 0`

If your program completes successfully, reaching the end of `main()` has the same effect as `return 0;`, so the statement is not needed in C++ programs.

Comment on Answer 1

Thank you very much for your comments. I couldn't quite follow the destructor part. Do I need to delete all the pointers which I created using new? If yes, can you help me with doing that?

Comment on Answer 1

Yes, you need to delete all the pointers created with new in your destructor. However, in your current implementation there is an odd mix of pointers and objects. Look at this C++ linked list and especially the comments on it to see how it might be done.
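As a rough illustration of such a destructor, here is a sketch that assumes a hypothetical `Node` struct and a `LinkedList` class holding a raw `head_` pointer. The `liveNodes` counter exists only to make the cleanup observable and would not be part of real code.

```cpp
#include <cassert>

// Hypothetical node type; liveNodes is only here so the sketch can show
// that every allocation is released again.
struct Node {
    static int liveNodes;
    int value;
    Node* next;
    explicit Node(int v) : value(v), next(nullptr) { ++liveNodes; }
    ~Node() { --liveNodes; }
};
int Node::liveNodes = 0;

class LinkedList {
public:
    LinkedList() : head_(nullptr) {}

    // Copying is disabled so two lists never try to delete the same nodes.
    LinkedList(const LinkedList&) = delete;
    LinkedList& operator=(const LinkedList&) = delete;

    // Walk the list and delete each node that insert() allocated.
    ~LinkedList() {
        Node* node = head_;
        while (node != nullptr) {
            Node* next = node->next;  // save the link before freeing the node
            delete node;
            node = next;
        }
    }

    // Append a value at the end of the list.
    void insert(int value) {
        Node* fresh = new Node(value);
        if (head_ == nullptr) {
            head_ = fresh;
            return;
        }
        Node* last = head_;
        while (last->next != nullptr) {
            last = last->next;
        }
        last->next = fresh;
    }

private:
    Node* head_;
};
```

The key detail is reading `node->next` before `delete node`, since the node's members must not be touched once it has been freed.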

Comment on Answer 1

@DhavalDoshi You might want to read about the stack and the heap (just search for it). The keyword `new` reserves memory on the heap. Until `delete` is called, that memory stays reserved, since the program cannot know whether the data will be needed again. If `delete` is never called, the memory is never released; this is called a memory leak, another term you might want to search for. Lastly, if you keep working with raw pointers, consider learning to use the tool valgrind, which is quite adept at detecting memory leaks.

Answer 2

Both paths of the if ... else contain identical code. Move it out of the condition (DRY):

    void insert(LinkedList **start, int data)
    {
        LinkedList *p = new LinkedList;
        p->data = data;
        p->next = NULL;
        if (*start == NULL)
        {
            *start = p;
        }
        else
        {
            LinkedList *temp = *start;
            while (temp->next != NULL)
            {
                temp = temp->next;
            }
            temp->next = p;
        }
    }

Answer 3

I don't know how to identify duplicate values in an unsorted container allowing duplicates in \$O(N)\$ time without hashing.

Nits:

- State up front the goal of the code you present.
- You tagged the question oop: move non-list members to a class `LinkedNode`.
- In `insert()`, both branches of the if-statement share code.
- Use references (there should be a trivial way "in CR" to refer to Scott Meyers' "Effective C++"/"Effective Modern C++" or equivalent).
- Don't use different notations for the same "thing" (the initialisers for `temp` and `temp1`).
- For loops that have a "while" condition, an advance to the next iteration, and possibly an initialisation, use `for`.

My shot:

    void removeDuplicates(LinkedNode *node) {
        for ( ; nullptr != node ; node = node->next) {
            LinkedNode *prev = node, *other;
            while (nullptr != (other = prev->next))
                if (node->data == other->data) {
                    prev->next = other->next;
                    delete(other);
                } else
                    prev = other;
        }
    }
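For the hashing route that the question asks about, a minimal sketch using `std::unordered_set` could look like the following. It assumes a hypothetical `LinkedNode` struct with public `data` and `next` members and nodes allocated with `new`; `removeDuplicatesHashed` is an illustrative name, not something from the post.

```cpp
#include <cassert>
#include <unordered_set>

// Hypothetical node type matching the member names used in the snippet above.
struct LinkedNode {
    int data;
    LinkedNode* next;
};

// Keeps the first occurrence of each value and deletes later repeats.
// Nodes are assumed to have been allocated with new.
void removeDuplicatesHashed(LinkedNode* head) {
    std::unordered_set<int> seen;
    LinkedNode* prev = nullptr;
    LinkedNode* node = head;
    while (node != nullptr) {
        if (seen.insert(node->data).second) {
            // First time this value appears: keep the node.
            prev = node;
            node = node->next;
        } else {
            // Value already seen: unlink and free the node. prev is never
            // null here because the head is always a first occurrence.
            prev->next = node->next;
            delete node;
            node = prev->next;
        }
    }
}
```

This visits each node once, so the expected running time is \$O(N)\$ on average, at the cost of \$O(N)\$ extra memory for the set.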

Answer 4

Step 1: Sort the list using merge sort, which takes O(n log n). Step 2: Remove duplicate elements by iterating from the second node and comparing each node against the previous one, which takes O(n).

Overall complexity will be O(n log n) + O(n), which simplifies to O(n log n).
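The two steps can be sketched roughly as follows, assuming a hypothetical `Node` struct with nodes allocated with `new`. All function names are illustrative, and the merge sort shown is a plain top-down variant rather than anything prescribed by the answer.

```cpp
#include <cassert>

// Hypothetical node type for this sketch.
struct Node {
    int value;
    Node* next;
};

// Merge two already sorted lists into one sorted list.
Node* mergeSorted(Node* a, Node* b) {
    Node dummy{0, nullptr};
    Node* tail = &dummy;
    while (a != nullptr && b != nullptr) {
        if (b->value < a->value) {
            tail->next = b;
            b = b->next;
        } else {
            tail->next = a;
            a = a->next;
        }
        tail = tail->next;
    }
    tail->next = (a != nullptr) ? a : b;
    return dummy.next;
}

// Step 1: top-down merge sort, O(n log n).
Node* mergeSort(Node* head) {
    if (head == nullptr || head->next == nullptr) {
        return head;
    }
    // Find the middle with slow/fast pointers and cut the list in two.
    Node* slow = head;
    Node* fast = head->next;
    while (fast != nullptr && fast->next != nullptr) {
        slow = slow->next;
        fast = fast->next->next;
    }
    Node* second = slow->next;
    slow->next = nullptr;
    return mergeSorted(mergeSort(head), mergeSort(second));
}

// Step 2: in a sorted list duplicates are adjacent, so one pass is enough.
void removeAdjacentDuplicates(Node* head) {
    Node* node = head;
    while (node != nullptr && node->next != nullptr) {
        if (node->value == node->next->value) {
            Node* duplicate = node->next;
            node->next = duplicate->next;
            delete duplicate;
        } else {
            node = node->next;
        }
    }
}

// Both steps together; returns the new head of the list.
Node* sortAndRemoveDuplicates(Node* head) {
    head = mergeSort(head);
    removeAdjacentDuplicates(head);
    return head;
}
```

Note that this approach changes the order of the elements: the question's sample list 9, 8, 7, 9, 8 comes out as 7, 8, 9, so it only fits when the original order does not need to be preserved.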

Answer 5

I'd just like to mention the naming in your code. First, the word data is meaningless. Every byte in the computer is some sort of data. I realize this is a contrived example since the data doesn't actually represent anything. But in real code, you should avoid generic terms like "data," "value," "info," etc.

Secondly, naming variables `temp` and `temp1` is even less meaningful. `temp` implies a temporary variable. If these were temporary variables, you'd want to know what they were temporaries of, for example `tempPhoneNumber` or `tempAddress`. But in this case, they aren't temporaries. They are the things you are operating on: the node you are trying to match, and the node which you are testing for a match. So you should name them something like `nodeToMatch` and `nodeToTest`, or `matchNode` and `testNode`. It's much clearer. In `traverse()` we see the same problem. You could at least call it `nextNode`, or something at least slightly more meaningful than `temp`.

Answer 1 · Edward · 2015-12-28 00:25:14Z

Answer 2 · milbrandt · 2017-07-28 18:52:28Z

Answer 3 · greybeard · 2016-01-11 17:17:21Z

Answer 4 · Dignesh P R · 2017-07-28 04:40:40Z

Answer 5 · user1118321 · 2017-07-28 04:51:31Z
