Remove duplicates from a sorted array

Question 1

I have solved the LeetCode Remove Duplicates From Sorted Array problem:

Given a sorted array, remove the duplicates in place such that each element appears only once, and return the new length. It doesn't matter what you leave beyond the new length.

I was able to come up with an approach, but after I submitted this solution and got it accepted, I found that it ranked somewhere around 60% by run time, i.e. my solution was only faster than 40% of the other contestants' solutions in the same language.

    int removeDuplicates(vector<int>& nums) {
        if (nums.size() == 0)
        {
            return 0;
        }
        auto start = nums.begin(), end = nums.end();
        int last_value = *(nums.end() - 1);
        for (auto it = nums.begin(); it != end;)
        {
            int find_value = *it;
            auto upper = upper_bound(start, end, find_value); // Find the position just past the last occurrence of this number
            iter_swap(start, it);
            ++start;
            if ((upper != end) || ((upper == end) && (find_value == last_value)))
            {
                it = upper;
            }
            else
            {
                ++it;
            }
        }
        return distance(nums.begin(), start);
    }

Could you please suggest any performance improvements to this code? Also, please go easy on the naming convention and code formatting. This was written for a programming challenge :)

Edit.

I believe the worst-case time complexity is O(n log n), which happens when every element of the array is unique. The best-case run time should be O(log n), when the entire array consists of duplicates of a single value.

Question 2

If it ran faster than others, why are you looking for further improvements?

Question 3

@CEGRD Because some other solutions were faster, which indicates that there is room for improvement.

Question 4

What do you think is the time complexity of your solution? Can you think of a different algorithm with better time complexity?
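
One possible answer to this question, as a minimal sketch rather than anything posted in the thread: a single linear pass with a read index and a write index runs in O(n) regardless of how many duplicates there are. The name removeDuplicatesLinear and the variables read and write are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical linear-time variant: one pass, no binary search.
int removeDuplicatesLinear(std::vector<int>& nums) {
    if (nums.empty()) {
        return 0;
    }
    std::size_t write = 1;  // nums[0 .. write) holds the unique values found so far
    for (std::size_t read = 1; read < nums.size(); ++read) {
        // In a sorted array, a new value always differs from the last one kept.
        if (nums[read] != nums[write - 1]) {
            nums[write++] = nums[read];
        }
    }
    return static_cast<int>(write);
}
```

The binary-search approach only wins when runs of equal values are long; with mostly unique input it does O(log n) work per element, while this loop does constant work per element.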

Question 5

You have too many special cases.

Why do you care about what the last_value is? Just checking it != end should be sufficient to terminate the loop.

Then, when you no longer need last_value = *(nums.end() - 1), you can also get rid of the nums.size() == 0 special case.

You have iterators it and start. start is a rather confusing name, since it refers to an iterator that moves along. You could call them src and dest, or in and out, or fast and slow. I'm going with rabbit and turtle.

Swapping with iter_swap() isn't necessary; just a simple assignment will do.

    int removeDuplicates(std::vector<int>& nums) {
        std::vector<int>::iterator rabbit, turtle, end = nums.end();
        for ( rabbit = turtle = nums.begin();
              rabbit != end;
              rabbit = std::upper_bound(rabbit, end, *turtle), turtle++ ) {
            *turtle = *rabbit;
        }
        return std::distance(nums.begin(), turtle);
    }

I have a suspicion that much of the time, the very next element is already going to be different. It might be worth it to do a quick peek at the next element before going through the trouble of launching a binary search.
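
The "quick peek" idea could look roughly like the sketch below. It is an illustration under assumptions, not measured code: the function name removeDuplicatesPeek and the local next are hypothetical, and whether it is actually faster depends on the input distribution.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical variant: check the next element before paying for a binary search.
int removeDuplicatesPeek(std::vector<int>& nums) {
    auto end = nums.end();
    auto turtle = nums.begin();
    auto rabbit = nums.begin();
    while (rabbit != end) {
        *turtle = *rabbit;
        auto next = std::next(rabbit);
        if (next == end || *next != *turtle) {
            // Common case: the neighbour already differs (or we are done).
            rabbit = next;
        } else {
            // A run of duplicates: skip it with a binary search.
            rabbit = std::upper_bound(next, end, *turtle);
        }
        ++turtle;
    }
    return static_cast<int>(std::distance(nums.begin(), turtle));
}
```

The range [rabbit, end) is never written to, so it stays sorted and std::upper_bound remains valid on it.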

Question 6

rabbit = std::upper_bound(rabbit, end, *turtle), turtle++ Are you serious?

Question 7

@vnp Sorry, was that too long? Would you have preferred rabbit = std::upper_bound(rabbit, end, *turtle++)?

Question 8

No. , is a comma operator. It evaluates left to right and produces a rightmost expression. Your continuation amounts to rabbit = turtle++, and I am sure it is not what you've meant.

Question 9

@vnp The comma has lowest precedence. It's rabbit = std::upper_bound(...), followed by turtle++.

Question 10

Lower precedence than what? rabbit = x, y; it is. Try a debugger if you don't trust the Standard.
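
The precedence question in this exchange can be checked with a small stand-alone test. In C++ the comma operator binds more loosely than assignment, so a = x, y parses as (a = x), y. The sketch below uses plain ints in place of the iterators; commaDemo, a and b are hypothetical names.

```cpp
#include <utility>

// Hypothetical helper: evaluates `a = 10, b++` and returns the resulting (a, b).
std::pair<int, int> commaDemo() {
    int a = 0;
    int b = 5;
    a = 10, b++;    // parsed as (a = 10), (b++): assignment first, then the increment
    return {a, b};  // a is 10, b is 6
}
```

Applied to the loop in the answer, this means rabbit receives the result of std::upper_bound(...), and turtle++ is evaluated afterwards as a separate expression.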

Question 11

So the simplest solution is to use the available STL function. However, there is obviously less to learn there.

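    #include <algorithm>
    #include <vector>

    int removeDuplicates(std::vector<int>& nums) {
        // std::unique compacts the unique values to the front; erase drops the tail.
        nums.erase(std::unique(nums.begin(), nums.end()), nums.end());
        // size() returns an unsigned type, so convert explicitly to int.
        return static_cast<int>(nums.size());
    }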

Question 12

This is definitely the most C++ way to do this. It also makes use of the erase-remove idiom. std::unique shifts the unique elements to the front and returns a new "end" iterator; the elements past that iterator are left with unspecified values. Erase cleans those out. Note that std::unique by itself leaves the size of the container unchanged.

Question 13

Note that .erase() is not strictly required in this problem statement. std::unique() would do a linear scan.
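
A minimal sketch of that variant, assuming the caller only needs the new length and does not care about the tail of the vector. The function name removeDuplicatesUnique is hypothetical.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical variant: std::unique only, no erase.
int removeDuplicatesUnique(std::vector<int>& nums) {
    // One linear pass; returns an iterator just past the last unique element.
    auto newEnd = std::unique(nums.begin(), nums.end());
    // nums.size() is unchanged; elements from newEnd onward have unspecified values.
    return static_cast<int>(std::distance(nums.begin(), newEnd));
}
```

This fits the problem statement, which says it does not matter what is left beyond the new length.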

Question 14

@200_success This is correct; however, it seems bad practice to keep memory around whose contents are possibly undefined.

Question 15

A single-line solution that runs in just 25 ms. Brilliant :)

200_success · Accepted Answer · 2016-10-28 07:23:17Z

