I saw this interview question in the book Cracking the Coding Interview:
Implement an algorithm to determine if a string has all unique characters
The authors solution uses bit-shifting for a pretty efficient solution.
I wanted to give it a shot to discover some new things and practice some. I'd like for this to be a discussion of how I can improve my programming and also, what the correct answer may be. Did I leave out any edge-cases? Is there a more efficient way of doing this?
#include <iostream>
#include <string>
#include <map>
using namespace std;
// Uses an iterator to display the map.
void displayMap(map<char, int> displayCharacters)
{
for (map<char, int>::iterator itr = displayCharacters.begin(), end = displayCharacters.end(); itr != end; ++itr)
{
cout << itr->first << " --> " << itr->second << endl;
}
}
// Checks for uniqueness for a given string and returns a boolean.
bool isUnique(int stringLength, string alphabet, map<char, int> uniqueCharacters)
{
for (int i = 0; i < stringLength; i++)
{
if (uniqueCharacters[alphabet[i]] == 1)
{
return false;
}
uniqueCharacters[alphabet[i]]++;
}
return true;
}
// Maps individual characters found in stringRead and sets a value to them.
// Returns the mapped map.
map<char, int> readString(string stringRead, map<char, int> uniqueCharacters)
{
for (int i = 0, strLen = stringRead.length(); i < strLen; i++)
{
uniqueCharacters[stringRead[i]] = 0;
}
return uniqueCharacters;
}
int main()
{
// Create a map to store a key character and value int.
// Create a string to hold a variable string to test.
map<char, int> uniqueCharacters;
string testString = "aaabbbccc";
int stringLength = testString.length();
// uniqueCharacters is mapped to the above string and the displayed using displayMap.
uniqueCharacters = readString(testString, uniqueCharacters);
displayMap(uniqueCharacters);
if (isUnique(stringLength, testString, uniqueCharacters))
{
cout << "Unique string!\n";
}
else
{
cout << "Not unique!\n";
}
cin.ignore();
cin.get();
}
-
\$\begingroup\$ @rafa you can use a bool array of uniqueCharacters instead of integer since you are not keeping any record of frequency of letters. \$\endgroup\$aakansha– aakansha2015年07月25日 16:25:59 +00:00Commented Jul 25, 2015 at 16:25
-
\$\begingroup\$ @aa1992 I was thinking of doing that as well. In any case, in an interview setting, if asked, 'how would you improve this?', that would be a good answer :) \$\endgroup\$rafa– rafa2015年07月25日 16:27:51 +00:00Commented Jul 25, 2015 at 16:27
-
\$\begingroup\$ Knowing whether the requirements need to handle otherwise-trivial character sets makes a big difference. For a simple ascii string, a lookup table like this would probably be sufficient and near-optimal. \$\endgroup\$WhozCraig– WhozCraig2015年07月25日 16:33:40 +00:00Commented Jul 25, 2015 at 16:33
1 Answer 1
This solution is inefficient:
- There is no point counting the characters in the entire string: once you found a duplicate, you can stop counting, because you know the result is false
- This also implies that the map of counts is wasted space
- Iterating over the alphabet to check character counts is inefficient: it would be better to iterate over the map values
Instead of a map
, it would be more efficient to use a set
.
Instead of a set
, you could also consider a boolean array of the size of the alphabet, values initialized to false
,
and use that to track characters seen so far.
The disadvantage of this approach over a set
is that it might use more space.
You are also violating some good practices:
using namespace std
is considered bad practicereadString
is poorly named: it doesn't "read a string", it initializes a map of character counts to all zero values- No need to pass
stringLength
together withtestString
. You can derive that value fromtestString