My homework is remove duplicates in a random string. My idea is use 2 loops to solve the problem.
1st one will scan every character in the string. 2nd one will check that character is duplicated or not. If so, remove the character.
string content = "Blah blah..."
for (int i = 0; i < content.size(); ++i) {
char check = content.at(i);
for (int j = i + 1; j < content.size() - 1; ++j) {
if (check == content.at(j)) {
content.erase(content.begin()+j);
}
}
}
The problem is it doesn't work. It always removes the wrong character. Seems an indices problem but I don't understand why.
A temporary fix is change content.erase(content.begin()+j);
to content.erase( remove(content.begin() + i+1, content.end(), check),content.end());
But I think trigger a "remove by value" scan isn't a nice way. I want to do it with 2 loops or fewer.
Any ideas will be appreciated :)
-
1One simple way is to use the double-loop brute-force variant you have, but copy non-duplicates to a new string.Some programmer dude– Some programmer dude2014年06月30日 17:58:35 +00:00Commented Jun 30, 2014 at 17:58
-
1Potential duplicate: stackoverflow.com/questions/2286860/…djikay– djikay2014年06月30日 18:00:26 +00:00Commented Jun 30, 2014 at 18:00
-
@JoachimPileborg: I like your idea, but can you provide some pseudo code? I can't implement it right now, no idea :(Tiana987642– Tiana9876422014年06月30日 18:09:14 +00:00Commented Jun 30, 2014 at 18:09
-
@djikay:thank for the link, but can you show me what's wrong in my code? I can't understand why the indices are wrong.Tiana987642– Tiana9876422014年06月30日 18:10:42 +00:00Commented Jun 30, 2014 at 18:10
-
1@Tiana987642 See my answer that to understand what is wrong in your code.Vlad from Moscow– Vlad from Moscow2014年06月30日 20:14:37 +00:00Commented Jun 30, 2014 at 20:14
6 Answers 6
Your loops could look the following way
#include <iostream>
#include <string>
int main()
{
std::string s = "Blah blah...";
std::cout << '\"' << s << '\"' << std::endl;
for ( std::string::size_type i = 0; i < s.size(); i++ )
{
std::string::size_type j = i + 1;
while ( j < s.size() )
{
if ( s[i] == s[j] )
{
s.erase( j, 1 );
}
else
{
++j;
}
}
}
std::cout << '\"' << s << '\"' << std::endl;
return 0;
}
The output is
"Blah blah..."
"Blah b."
There are many other approaches using standard algorithms. For example
#include <iostream>
#include <string>
#include <algorithm>
#include <iterator>
int main()
{
std::string s = "Blah blah...";
std::cout << '\"' << s << '\"' << std::endl;
auto last = s.end();
for ( auto first = s.begin(); first != last; ++first )
{
last = std::remove( std::next( first ), last, *first );
}
s.erase( last, s.end() );
std::cout << '\"' << s << '\"' << std::endl;
return 0;
}
The output is the same as for the previous code example
"Blah blah..."
"Blah b."
3 Comments
If use of STL is a possible option, you could use an std::unordered_set
to keep the characters seen so far and the erase-remove idiom with std::remove_if
, like in the following example:
#include <iostream>
#include <string>
#include <unordered_set>
#include <algorithm>
int main() {
std::string str("Hello World!");
std::unordered_set<char> log;
std::cout << "Before: " << str << std::endl;
str.erase(std::remove_if(str.begin(), str.end(), [&] (char const c) { return !(log.insert(c).second); }), str.end());
std::cout << "After: " << str << std::endl;
}
LIVE DEMO
3 Comments
<algorithm>
later :) Just my opinionI recommend a two pass approach. The first pass identifies the positions of the duplicated characters; the second pass removes them.
I recommend using a std::set
and a std::vector<unsigned int>
. The vector contains letters that are in the string. The vector contains the positions of the duplicated letters.
The first pass detects if a letter is present in the set. If the letter exists, the position is appended to the vector. Otherwise the letter is inserted into the set.
For the second pass, sort the vector in descending order.
Erase the character at the position in the vector, then remove the position from the vector.
By erasing characters from the end of the string towards the front, the positions of the remaining duplicates won't change when the character is erased from the string.
2 Comments
I am not sure that this is what is causing your problem, but another problem that I see with your code is in your second for loop. Your j < content.size() - 1
statement should just be
j < content.size()
.
The reasoning for this is a little tricky to see at first, but in this case you are not just getting the size of your vector to act as the size, but to act as the ending indices of your string. You are shortening the last indices by one which means you wont hit the last char in your string. I don't know if this will help your initial problem, but who knows?
Comments
Note: Your actual problem is maintaining a proper index to the next element in question:
- If you do not erase a character, the next element is at the next position.
- If you erase a character, the next element will move into the place of the current position (the position stays the same).
Also: There are more efficient solutions (eg.: utilizing a set)
1 Comment
sort first
then unique
move all unique chars to the beginning and return past-the-end iterator.
then erase
the useless chars
string digits("1a2b3c3c2b1a");
sort(digits.begin(), digits.end());
digits.erase( unique( digits.begin(), digits.end() ), digits.end() );
cout << digits << endl;
output
123abc