I'm studying with the Cracking the Code Interview book. I'm just looking for some feedback on this method:
C++
bool areAnagrams(const char*s1, const char*s2){
unordered_map <char,int> m1,m2;
for(size_t i = 0; i < strlen(s1); i++){
m1[s1[i]] += 1;
}
for(size_t j = 0; j < strlen(s2); j++){
m2[s2[j]] += 1;
}
for(auto i = m1.begin(); i != m1.end(); i++){
if(m2[(*i).first] != (*i).second)
return false;
}
return true;
}
2 Answers 2
Bug
Your algorithm is not quite correct. For example:
areAnagrams("abc", "abc") -> true (correct)
areAnagrams("abc", "abcd") -> true (wrong)
areAnagrams("abcd", "abc") -> false (correct)
The problem is that you are checking that the characters in s1
exist with the same frequency in s2
. But there can be characters in s2
that do not exist in s1
and you don't check for those.
An easy way to remedy this would be to check that both strings have the same length. This plus the frequency check you are already doing should ensure that the strings are anagrams.
In addition to the bug JS1 discovered, namely that you only check whether all elements of the first string have equal frequency in the second string, there's the question why you are building a complete histogram of both strings. Doing so is inefficient. Instead, increment for one string, and decrement for the other one. The result should be zero.
Your compiler hopefully at least memoizes the result of strlen
for you, or eliminates even that superfluous extra traversal. If it doesn't, that's a factor of n
for your algorithms complexity right there. Shlemiel The Painter greets you.
bool areAnagrams(const char* s1, const char* s2) {
std::unordered_map map<char, int>;
while(*s1)
++map[*s1++];
while(*s2)
--map[*s2++];
for(auto&& e : map)
if(e.second)
return false;
return true;
}
It's probably far more efficient to use a plain array of appropriate size on the stack than a complicated std::unordered_map
with dynamic allocation. Just beware of plain char
s signedness.
bool areAnagrams(const char* s1, const char* s2) {
int map[UCHAR_MAX + 1] = {};
while(*s1)
++map[(unsigned char)*s1++];
while(*s2)
--map[(unsigned char)*s2++];
for(auto e : map)
if(e)
return false;
return true;
}
Also, avoid using namespace std;
. I don't have conclusive evidence you used it, but to be safe, here it is: Why is "using namespace std;" considered bad practice?
return m1==m2;
? \$\endgroup\$std::is_permutation()
is \$O(n^2)\$. \$\endgroup\$m1==m2
have anything to do with permutations?!?! Anyways, the C++ standard guarantees the complexity of equality testing containers is linear. \$\endgroup\$