A small, simple, yet frequently needed function to compare two strings for equality.
int strequal (const char *str1, const char *str2)
{
    while(1)
    {
        if((*str1) != (*str2))
        {
            return 1; /* Not Equal*/
        }
        else if((*str1) == '\0')
        {
            return 0; /* Equal */
        }
        str1++;
        str2++;
    }
    return 0; /* Equal*/
}
int main ()
{
    const char *str1 = "Hello";
    const char *str2 = "Hi";
    printf("Strings are equal? %s\n", strequal(str1, str2)? "NO":"YES");
    return 0;
}
Performance is the main focus.
1 Answer
If you want better performance, you should be using the standard library function strcmp():
#include <string.h>

int str_unequal(const char *str1, const char *str2)
{
    return strcmp(str1, str2) != 0;
}
The library function can take advantage of the target processor, possibly comparing multiple characters per iteration, which you can't easily do in a portable C program.
Other issues in the function:
- strequal() is a name reserved for future library extension, as is any identifier beginning with str followed immediately by a lowercase letter.
- The word "equal" in the name is misleading, as the function returns true only when the strings are unequal.
- while (1) is dubious practice, especially given that there's a natural terminating condition (end of one of the strings).
- The parentheses around the result of the dereference operator * are unnecessary: it has higher precedence than the comparison operators.
- The final return 0; is unreachable.
Problems with the test program:
- Uses printf without including <stdio.h>.
- Should explicitly state that main() accepts no arguments (i.e. int main(void)).
- Only tests a small portion of the functionality (no tests of two equal strings, or one that's a prefix of the other).
- Always returns a success status, even when the function is wrong.
Modified function and tests:
int str_equal(const char *s1, const char *s2)
{
    while (*s1) {
        if (*s1++ != *s2++) {
            return 0;
        }
    }
    return !*s2;
}

#include <stdio.h>

int test_str_equal(int expected, const char *a, const char *b)
{
    int actual = str_equal(a, b);
    if (actual == expected) { return 0; }
    fprintf(stderr, "\"%s\"==\"%s\" should return %d\n", a, b, expected);
    return 1;
}

int main(void)
{
    return test_str_equal(1, "", "")
        + test_str_equal(0, "", "x")
        + test_str_equal(0, "x", "")
        + test_str_equal(1, "x", "x")
        + test_str_equal(0, "x", "y")
        + test_str_equal(0, "x", "xy")
        + test_str_equal(0, "xy", "x")
        + test_str_equal(0, "xx", "xy")
        + test_str_equal(1, "xy", "xy");
}
- The result of main can also be a bitmask representation of the test comparisons. Each return value of test_str_equal can be bit-shifted by its enumerator and OR'd onto a value. strcmp will of course always be faster and in normal practice should always be used. That applies to all standard functions where optimized asm branching is present. I am not sure if strcmp would usually compare two bytes at a time, but I think it would probably use inline memcmp calls. The last return, even if unreachable, is not an anti-pattern in my opinion. – Edenia, Oct 22, 2022 at 2:13
- Yes, bitmask is another option (for a small number of tests; observe that we already exceed the number of bits we can portably report in POSIX) - my main point was to not return zero if the tests fail. I haven't looked closely at any strcmp() implementation, but I wouldn't be surprised to see it working in larger units than two bytes - more like the platform's int or long at a time, and perhaps more when targeting ISAs such as AVX2. – Toby Speight, Oct 22, 2022 at 7:11
- To me, unreachable code is a problem because it represents untested code. And the code quality tools I use complain about it; the easiest way to shut them up is to remove the problem. :-) – Toby Speight, Oct 22, 2022 at 7:13
- I also haven't, but my guess is that so long as the strings are not aligned, it will compare them byte-by-byte and memcmp them otherwise. It should probably also inline the entire function if the compared string is less than 4 characters or something (we all know the overhead of a call is significant). – Edenia, Oct 22, 2022 at 15:52
- A singular weakness of test_str_equal(1, "x", "x") is that both "x" may point to the same string. Test code needs to ensure that different string addresses compare as equal. Overall, good answer. – chux, Nov 9, 2022 at 6:12