I was recently porting an application that uses memmem
from Linux to Windows.
While memmem
is available as a library function on Linux, MSDN / Windows does not offer it. I could have used strstr
but then it has a very different prototype and expects only null-terminated strings as arguments. Googling around, I discovered a few implementations of memmem
but none appeared to truly mimic memmem
.
I therefore wrote one of my own:
void *memmem(const void *haystack_start, size_t haystack_len, const void *needle_start, size_t needle_len)
{
const unsigned char *haystack = (const unsigned char *) haystack_start;
const unsigned char *needle = (const unsigned char *) needle_start;
const unsigned char *h = NULL;
const unsigned char *n = NULL;
size_t x = needle_len;
/* The first occurrence of the empty string is deemed to occur at
the beginning of the string. */
if (needle_len == 0)
return (void *) haystack_start;
/* Sanity check, otherwise the loop might search through the whole
memory. */
if (haystack_len < needle_len)
return NULL;
for (; *haystack && haystack_len--; haystack++) {
x = needle_len;
n = needle;
h = haystack;
if (haystack_len < needle_len)
break;
if ((*haystack != *needle) || ( *haystack + needle_len != *needle + needle_len))
continue;
for (; x ; h++ , n++) {
x--;
if (*h != *n)
break;
if (x == 0)
return (void *)haystack;
}
}
return NULL;
}
The code does work, but my question is: could it be done more efficiently, or faster?
I hope one of the awesome hacks here will also be kind enough to point out if this implementation has a vulnerability that I may have not noticed.
2 Answers 2
You wrote:
for (; *haystack && haystack_len--; haystack++) { x = needle_len; n = needle; h = haystack; if (haystack_len < needle_len) break;
A better way would be:
for (; --haystack_len >= needle_len; haystack++) {
x = needle_len;
n = needle;
h = haystack;
The *haystack
in the condition expression is actually a bug as it prevents searching if there are any zero bytes in the range.
I think it is a bad idea to have the second clause in:
if ((*haystack != *needle) || ( *haystack + needle_len != *needle + needle_len))
because you might bust the cache. Keep data access local. Checking the last character is an unnecessary complexity.
Declare the Variables When They are Needed
The first five lines of the function memmem()
are:
const unsigned char *haystack = (const unsigned char *) haystack_start;
const unsigned char *needle = (const unsigned char *) needle_start;
const unsigned char *h = NULL;
const unsigned char *n = NULL;
size_t x = needle_len;
Since x, h and n are loop variants, it would be better to delcare them where their values are assigned:
for (; *haystack && haystack_len--; haystack++) {
size_t x = needle_len;
unsigned char *n = needle;
unsigned char *h = haystack;
The pointers n and h are variable, they will change inside the loop, the characters won't change.
The variables haystack and needle start aren't needed utill just before the loop, because they also are only used in the loop..
unsigned char *haystack = (unsigned char *) haystack_start;
unsigned char *needle = (unsigned char *) needle_start;
for (; *haystack && haystack_len--; haystack++) {
size_t x = needle_len;
unsigned char *n = needle;
unsigned char *h = haystack;
Simplify the Code
There is no reason to stay in the function or the outer loop once haystack_len
is less than needle_len
.
if (haystack_len < needle_len)
{
return NULL;
}
Make Maintenance Easier While the following is perfectly legitimate, it's not the best programming practice, if someone needs to add a second statement to the THEN clause, it might be better to add braces around all THEN and ELSE clauses:
if (haystack_len < needle_len)
return NULL;
would be better as:
if (haystack_len < needle_len)
{
return NULL;
}
One Possible Optimization
Most computers have a decrement and test opcode, the for loop
for (; x ; h++ , n++) {
x--;
if (*h != *n)
break;
if (x == 0)
return (void *)haystack;
}
would probably be faster this way:
for (x = needle_len + 1; --x ; )
{
if (*h++ != *n++)
break;
}
if (x == 0)
{
return (void *)haystack;
}
The variable x needs to be one greater than the length because --x is tested at the top of the loop.
Θ(nm)
performance of the naive algorithm you're using. \$\endgroup\$