Parameters:
first - the first string.
second - the second string.
similarCharacters - outputs the number of similar characters.
tolerance - the number of mistakes allowed to consider the strings "similar."
Function:
bool Str_Similar(std::string first, std::string second, unsigned int* similarCharacters = nullptr, int tolerance = INT_MIN)
{
    // Don't even check if either strings are empty.
    if(first.empty() || second.empty()) return false;
    // Determine if the first is greater than or equal to the second.
    const bool firstGreaterOrEqualToSecond = first.length() >= second.length();
    // By default, set the tolerance to half the length of the smaller string.
    if(tolerance == INT_MIN) tolerance = (firstGreaterOrEqualToSecond ? (second.length() / 2) : (first.length() / 2));
    if(tolerance < 0) tolerance = 0;
    // Start off with any length difference, which are considered mistakes.
    unsigned int mistakes = (unsigned int)abs(first.length() - second.length());
    // Search only the length of the smaller string.
    const size_t searchLength = (firstGreaterOrEqualToSecond ? second.length() : first.length());
    // Do the search.
    for(size_t i = 0, max = searchLength; i < max; i++)
    {
        if(first.at(i) != second.at(i)) mistakes++;
    }
    // Output the similar characters.
    if(similarCharacters != nullptr) *similarCharacters = (unsigned int)abs(searchLength - mistakes);
    // Compare the mistakes to the tolerance.
    return (mistakes <= tolerance);
}
- @OliverYasuna I also provided a compiling version. Maybe you should use that one to be reviewed. (πάντα ῥεῖ, Jan 20, 2017 at 10:27)
- @πάνταῥεῖ My bad, I misunderstood you. Thank you for the suggestions. Anything else you could suggest? Please post that here so I can consider it solved, unless another, more detailed response is posted. (Oliver Yasuna, Jan 20, 2017 at 10:34)
1 Answer
1. Prefer passing parameters by const reference
The std::string parameters should be passed by const reference rather than by value.
Passing by value would work, but a const reference makes the signature clearer for the caller, since it shows that the function does not modify its arguments, and it may be more efficient because the strings are not copied.
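As a minimal sketch of what that looks like, here is the comparison loop pulled into a small helper that takes both strings by const reference. CountMismatches is a hypothetical name used only for illustration; it is not part of the original function.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not from the original post): counts the positions at
// which the two strings differ, looking only at the length of the shorter one.
// Both parameters are const references, so the strings are neither copied
// nor modified.
std::size_t CountMismatches(const std::string& first, const std::string& second)
{
    const std::size_t searchLength =
        (first.length() < second.length()) ? first.length() : second.length();

    std::size_t mismatches = 0;
    for (std::size_t i = 0; i < searchLength; ++i) {
        if (first[i] != second[i]) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

If the function really needs its own modifiable copies, passing by value remains a reasonable choice; the const reference form fits the read-only case shown here.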
2. Fix all warnings
The line
return (mistakes <= tolerance);
results in a compiler warning, because mistakes is an unsigned int while tolerance is an int:
warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
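One possible way to silence the warning is to clamp the tolerance first and then convert it explicitly, so both sides of the comparison are unsigned. This is a sketch under that assumption, and WithinTolerance is a hypothetical helper name, not something from the original code.

```cpp
#include <cstddef>

// Hypothetical helper (not from the original post): compares an unsigned
// mistake count against a signed tolerance without mixing signedness.
bool WithinTolerance(std::size_t mistakes, int tolerance)
{
    // Same clamping as the original function: negative tolerances become 0.
    if (tolerance < 0) {
        tolerance = 0;
    }
    // tolerance is known to be non-negative here, so the cast is safe.
    return mistakes <= static_cast<std::size_t>(tolerance);
}
```

The explicit cast documents the intent and only happens after the negative case has been ruled out.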
3. Prefer to use numeric_limits over the C-style INT_MIN
For C++ code you should prefer to use std::numeric_limits<int>::min() instead of the INT_MIN macro (I couldn't even get that to compile, though stdint.h was included).
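A rough sketch of the default-tolerance logic using std::numeric_limits could look like the following. The names kDefaultTolerance and ResolveTolerance are assumptions made for this example; the original function does this inline.

```cpp
#include <cstddef>
#include <limits>
#include <string>

// Sentinel meaning "the caller did not choose a tolerance" (assumed name).
constexpr int kDefaultTolerance = std::numeric_limits<int>::min();

// Hypothetical helper (not from the original post): works out the effective
// tolerance the same way the original function does.
int ResolveTolerance(const std::string& first, const std::string& second,
                     int tolerance = kDefaultTolerance)
{
    if (tolerance == kDefaultTolerance) {
        // Default: half the length of the shorter string.
        const std::size_t shorter =
            (first.length() < second.length()) ? first.length() : second.length();
        tolerance = static_cast<int>(shorter / 2);
    }
    if (tolerance < 0) {
        tolerance = 0;
    }
    return tolerance;
}
```

std::numeric_limits lives in the <limits> header, so no C macro header is needed for this version.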
4. Always use {} braces for conditional code sections
You should always use braces to enclose conditional code sections:
if(similarCharacters != nullptr) {
    *similarCharacters = (unsigned int)abs(searchLength - mistakes);
}
Not only does this improve the readability of the code; omitting the braces also makes the code error-prone when it is changed later.
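To illustrate the maintenance argument, here is a small sketch in which a second statement was added to the conditional later on. StoreIfRequested is a hypothetical helper made up for this example. Without the braces, the added line would run unconditionally.

```cpp
// Hypothetical helper (not from the original post): writes a value through an
// optional output pointer and reports whether anything was written.
bool StoreIfRequested(unsigned int* similarCharacters, unsigned int value)
{
    bool stored = false;
    if (similarCharacters != nullptr) {
        *similarCharacters = value;
        // Imagine this line was added in a later change. The braces keep it
        // inside the condition; without them it would always execute.
        stored = true;
    }
    return stored;
}
```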
My compiling version can be found here.
- 1) I passed them by value because I added code to the function which modifies these values. 2) Good point. 3) Thank you, my mind is still in C mode. (Oliver Yasuna, Jan 20, 2017 at 10:44)
- @OliverYasuna Sorry, I overlooked that this was intended in your case. (πάντα ῥεῖ, Jan 20, 2017 at 10:46)
- Not your fault. I originally had posted the full function, which used the value strings, and forgot to change them after editing. (Oliver Yasuna, Jan 20, 2017 at 10:47)
- @OliverYasuna As CodyGray already mentioned, what you had there won't work properly. (πάντα ῥεῖ, Jan 20, 2017 at 11:01)
- INT_MIN is of course in <climits>, not <cstdint>. (Toby Speight, Jul 24, 2019 at 9:14)