This function checks whether a substring needle exists in another string haystack and returns the position if it does or 0 if it doesn't, unless the position is 0, in which case it won't be located.
Looking for ways to improve this code, specifically better error handling.
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    size_t contains(const char * needle, const char *haystack);

    int main(void)
    {
        char *needle = "test";
        char *haystack = "This is a dinosaurtest.";
        printf("Position: %lu", contains(needle, haystack));
        return EXIT_SUCCESS;
    }

    size_t contains(const char * needle, const char *haystack)
    {
        if(needle == NULL || haystack == NULL)
        {
            return 0;
        }
        long int first_char_pos = -1;
        size_t len_h = strlen(haystack);
        size_t len_n = strlen(needle);
        size_t i, j;
        size_t exist_count = 0;
        // Find the first character. If it doesn't exist, we're done.
        for(i = 0; i < len_h; i++)
        {
            if((haystack[i] == needle[0]) && (first_char_pos == -1))
            {
                first_char_pos = i;
                exist_count++;
            }
        }
        if(first_char_pos == -1)
        {
            return 0;
        }
        printf("First char match index: %li\n", first_char_pos);
        printf("Char: %c\n", haystack[first_char_pos]);
        size_t current_index = (size_t) first_char_pos;
        for(i = first_char_pos; i < len_h; i++)
        {
            if(haystack[i] == needle[exist_count] && (i == (current_index + 1)))
            {
                current_index = i;
                exist_count++;
            }
            printf("Exist count: %lu\n", exist_count); //<--Debugging
            if(exist_count == len_n)
            {
                return first_char_pos;
            }
        }
        return 0;
    }
4 Answers
Just a couple of remarks:
You should print a newline after the last line of output; without it, the shell prompt ends up on the same line:

    $ ./nh
    First char match index: 18
    Char: t
    Exist count: 1
    Exist count: 2
    Exist count: 3
    Exist count: 4
    Position: 18 $
I don't know which compiler you use, but when compiled with gcc and -Wall -Wextra -pedantic you get:

    gcc -O2 nh.c -lm -o nh -Wall -Wextra -pedantic
    nh.c: In function ‘contains’:
    nh.c:25:15: warning: unused variable ‘j’ [-Wunused-variable]
         size_t i, j;
                   ^
Code formatting should be more consistent. For example, in this line you put a space before needle but not before haystack:

    size_t contains(const char * needle, const char *haystack);

%lu is not a portable conversion specifier for size_t; you should use %zu, introduced in C99.

You said:
returns the position if it does or 0 if it doesn't, unless the position is 0 in which case, it won't be located.
This is really not good. For example, with this it returns 0:

    char *needle = "This";
    char *haystack = "This is a dinosaurtest.";

With this, it also returns zero:

    char *needle = "non-existent";
    char *haystack = "This is a dinosaurtest.";

You can't tell the difference between success and failure in these two examples. Actually, atoi() has the same problem. I don't know which operating system you use, but maybe you could use ssize_t as the return type if it's available and return -1 in case of failure.
Adding on to the previous answer by @Arkadiusz Drabczyk:
A simple, trivial implementation of contains could be done like this:

    ssize_t contains(const char * needle, const char *haystack)
    {
        char *needle_in_haystack;
        if(!needle || !haystack) return -1;
        /* strstr returns a pointer to the first match, or NULL if there is none */
        needle_in_haystack = strstr(haystack, needle);
        return needle_in_haystack ? needle_in_haystack - haystack : -1;
    }
Then, this program (with a few changes as mentioned above) should work:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/types.h>

    ssize_t contains(const char * needle, const char *haystack)
    {
        char *needle_in_haystack;
        if(!needle || !haystack) return -1;
        needle_in_haystack = strstr(haystack, needle);
        return needle_in_haystack ? needle_in_haystack - haystack : -1;
    }

    int main(void)
    {
        char *needle = "test";
        char *haystack = "This is a dinosaurtest.";
        char *haystack2 = "This does not contain the string.";
        printf("Position: %zd\n", contains(needle, haystack));
        printf("Position: %zd\n", contains(needle, haystack2));
        return EXIT_SUCCESS;
    }

Output:

    Position: 18
    Position: -1
- I would remove the check that the input is not NULL, and just use a language extension for that (__attribute__((nonnull)) in GCC). NULL is something that you would never expect as input for this function, and it adds one or two unnecessary lines of code. I would prefer to write in the documentation of the function something like: "If the input to this function is a NULL pointer, the behaviour is undefined." – alx - recommends codidact, Aug 11, 2019 at 23:50
- @CacahueteFrito The original code did it, and I want to strive for compatibility (who knows how the OP was using it?). – S.S. Anne, Aug 11, 2019 at 23:51
- Missing include for ssize_t: #include <sys/types.h>. Another option would be to use ptrdiff_t instead, from #include <stddef.h>; you are actually returning a pointer difference: ? needle_in_haystack - haystack : – alx - recommends codidact, Aug 12, 2019 at 11:26
Your code doesn't work. It returns 0 for haystack "abbc" and needle "bc", even though haystack contains needle.

You don't need the first loop and all the length calculations. By the way, the function also fails whenever the first character of needle is found, but only a later occurrence of that character actually starts a match.
The task can be reduced to a few lines:
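    int contains(char *buf, char *needle) {
        char *src, *srch, *srcp;
        /* try every start position in buf */
        for(src=buf; *src; src++) {
            /* advance both pointers while the characters match */
            for(srch = needle, srcp = src; *srch && *srcp && *srch == *srcp; srch++, srcp++);
            /* reached the end of needle: full match starting at src */
            if(!*srch) return src - buf;
        }
        return -1;
    }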
- What is a good way to get better at writing more compact, efficient C code like this? This sorta reminds me of K&R C. – the_endian, Aug 14, 2019 at 17:58
- … strstr(). There's a safer version called strnstr(). You can find an implementation here: github.com/lattera/freebsd/blob/master/lib/libc/string/…
- … contains("tt", "test") return true?