1

My homework is remove duplicates in a random string. My idea is use 2 loops to solve the problem.

1st one will scan every character in the string. 2nd one will check that character is duplicated or not. If so, remove the character.

string content = "Blah blah..."
 for (int i = 0; i < content.size(); ++i) {
 char check = content.at(i);
 for (int j = i + 1; j < content.size() - 1; ++j) {
 if (check == content.at(j)) {
 content.erase(content.begin()+j);
 }
 }
 }

The problem is it doesn't work. It always removes the wrong character. Seems an indices problem but I don't understand why.

A temporary fix is change content.erase(content.begin()+j); to content.erase( remove(content.begin() + i+1, content.end(), check),content.end());

But I think trigger a "remove by value" scan isn't a nice way. I want to do it with 2 loops or fewer.

Any ideas will be appreciated :)

melpomene
86.1k8 gold badges95 silver badges154 bronze badges
asked Jun 30, 2014 at 17:54
5
  • 1
    One simple way is to use the double-loop brute-force variant you have, but copy non-duplicates to a new string. Commented Jun 30, 2014 at 17:58
  • 1
    Potential duplicate: stackoverflow.com/questions/2286860/… Commented Jun 30, 2014 at 18:00
  • @JoachimPileborg: I like your idea, but can you provide some pseudo code? I can't implement it right now, no idea :( Commented Jun 30, 2014 at 18:09
  • @djikay:thank for the link, but can you show me what's wrong in my code? I can't understand why the indices are wrong. Commented Jun 30, 2014 at 18:10
  • 1
    @Tiana987642 See my answer that to understand what is wrong in your code. Commented Jun 30, 2014 at 20:14

6 Answers 6

5

Your loops could look the following way

#include <iostream>
#include <string>
int main() 
{
 std::string s = "Blah blah...";
 std::cout << '\"' << s << '\"' << std::endl;
 for ( std::string::size_type i = 0; i < s.size(); i++ )
 {
 std::string::size_type j = i + 1;
 while ( j < s.size() )
 {
 if ( s[i] == s[j] )
 {
 s.erase( j, 1 );
 }
 else
 {
 ++j;
 }
 }
 }
 std::cout << '\"' << s << '\"' << std::endl;
 return 0;
}

The output is

"Blah blah..."
"Blah b."

There are many other approaches using standard algorithms. For example

#include <iostream>
#include <string>
#include <algorithm>
#include <iterator>
int main() 
{
 std::string s = "Blah blah...";
 std::cout << '\"' << s << '\"' << std::endl;
 auto last = s.end();
 for ( auto first = s.begin(); first != last; ++first )
 {
 last = std::remove( std::next( first ), last, *first );
 }
 s.erase( last, s.end() );
 std::cout << '\"' << s << '\"' << std::endl;
 return 0;
}

The output is the same as for the previous code example

"Blah blah..."
"Blah b."
answered Jun 30, 2014 at 19:53

3 Comments

Thank you for your suggestion :) I learnt a few thing from your code :) If you have free time, please tell me my what's wrong in my code and workaround for it, like you mentioned in your comment :)
@Tiana987642 I think that it is enough to compare your loops with my loops in my first code example. It is obvious that this loop for (int j = i + 1; j < content.size() - 1; ++j) is invalid. It is not clear why condition content.size() - 1 is used and secondly when a character was deleted you must not increment i.
My idea is the first loop take a character (let's call it A) , the second will compare A with the rest and delete. So the 2nd will start after the position of A and is ended at the last of the string. At the time I wrote the code I didn't think carefully about this. Just don't want to out of bound. Thank you for point out my mistake :)
5

If use of STL is a possible option, you could use an std::unordered_set to keep the characters seen so far and the erase-remove idiom with std::remove_if, like in the following example:

#include <iostream>
#include <string>
#include <unordered_set>
#include <algorithm>
int main() {
 std::string str("Hello World!");
 std::unordered_set<char> log;
 std::cout << "Before: " << str << std::endl;
 str.erase(std::remove_if(str.begin(), str.end(), [&] (char const c) { return !(log.insert(c).second); }), str.end());
 std::cout << "After: " << str << std::endl;
}

LIVE DEMO

answered Jun 30, 2014 at 18:14

3 Comments

Nice idea :) Of course use STL is an option, but I want to improve my skill a bit so I try to think an algorithm first, use <algorithm> later :) Just my opinion
@Tiana987642 if it's for educational purposes I'm with you 100%. But when you really want to do serious programming, STL is not only highly recommended, but IMHO its use is mandatory.
Thank, I appreciate your advice :)
3

I recommend a two pass approach. The first pass identifies the positions of the duplicated characters; the second pass removes them.

I recommend using a std::set and a std::vector<unsigned int>. The vector contains letters that are in the string. The vector contains the positions of the duplicated letters.

The first pass detects if a letter is present in the set. If the letter exists, the position is appended to the vector. Otherwise the letter is inserted into the set.

For the second pass, sort the vector in descending order.
Erase the character at the position in the vector, then remove the position from the vector.

By erasing characters from the end of the string towards the front, the positions of the remaining duplicates won't change when the character is erased from the string.

answered Jun 30, 2014 at 19:56

2 Comments

If the answer is useful, please click on the Check Mark.
I ain't forget who helped me :) I already voted up your answer :) I just want to be fair with people, who also answer this question :D
1

I am not sure that this is what is causing your problem, but another problem that I see with your code is in your second for loop. Your j < content.size() - 1 statement should just be

j < content.size().

The reasoning for this is a little tricky to see at first, but in this case you are not just getting the size of your vector to act as the size, but to act as the ending indices of your string. You are shortening the last indices by one which means you wont hit the last char in your string. I don't know if this will help your initial problem, but who knows?

answered Jun 30, 2014 at 18:11

Comments

1

Note: Your actual problem is maintaining a proper index to the next element in question:

  • If you do not erase a character, the next element is at the next position.
  • If you erase a character, the next element will move into the place of the current position (the position stays the same).

Also: There are more efficient solutions (eg.: utilizing a set)

answered Jun 30, 2014 at 18:21

1 Comment

Now I understand. So can you give me a workaround to avoid situation like this?
1

sort first

then unique move all unique chars to the beginning and return past-the-end iterator.

then erase the useless chars

string digits("1a2b3c3c2b1a");
sort(digits.begin(), digits.end());
digits.erase( unique( digits.begin(), digits.end() ), digits.end() );
cout << digits << endl;

output

123abc
answered Apr 9, 2023 at 4:13

Comments

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.