I am trying to write my own implementation of indexOf
for learning purposes. I have an implementation of as follows. I'm sure a more efficient and shorter implementation exists.
function _indexOf(needle, haystack) {
var n = needle.split("");
var h = haystack.split("");
var n_len = n.length;
var h_len = h.length;
for (var i = 0; i < h_len; i++) {
var j = 0;
var flag = false;
while (h[i] == n[j] || flag) {
if (flag)
j = 0;
flag = false;
i++;
j++;
if (h[i] == n[0])
flag = true;
if (j >= n_len)
return i - n_len;
}
}
return -1;
}
Here's a test: assert(_indexOf("ced","ccceccecedeeced")==7) // true
-
\$\begingroup\$ Told you so! Codereview does give good, high quality answers! \$\endgroup\$Ethan Bierlein– Ethan Bierlein2015年06月11日 05:08:02 +00:00Commented Jun 11, 2015 at 5:08
3 Answers 3
Your code is pretty neat, and well structured. I like the variable names. There are a number of problems to point out, though...
- splitting the string is a huge "cheat".... if you're going to use regex, you may as well just match for the substring.
- splitting the strings is also slow. Just use an index in to the string and the
string[indx]
operator. You can remove theh
andn
andh_len
andn_len
variables as well then. - your outer loop can terminate at
h_len - n_len
- there's no need to loop to the end when a match would be impossible
You have some bugs.... like:
- your needle cannot contain the same letter as the start letter in a another place... For example, your code will fail to find
cec
. This is because you do weird logic here:if (h[i] == n[0]) ...
I am not sure why that's needed.... other than to reset the search? - related to the above, you cannot backtrack and reset a search in the event that the search pattern starts part way through a partial match....
Even when you resolve the above issues, your algorithm is considered 'naive'. There are two commonly used algorithms that are considered better. The Rabin–Karp algorithm is simpler to implement because you just create a hash of the needle, and then a matching rolling-hash in the haystack. If you find the hash in the haystack you can confirm a match, and return, in what is effectively \$O(n+m )\$. A more complicated, but also more efficient algorithm is the Boyer-Moore algorithm which creates a "jump table" from the needle, which allows you to 'skip' values in the haystack based on the details in the needle. It is efficient, and, the longer the needle, the more you can skip.
- I don't like the variable names
needle
andhaystack
. I'd would uselist
anditem
. This makes them a little clearer. Put braces around your
if
statements. Even if you're only ever running one statement, you should still put braces around them. Read this and this. The Apple bug happened because curly braces weren't used. Here's where the bug happened:if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0) goto fail; goto fail; /* MISTAKE! THIS LINE SHOULD NOT BE HERE */
While in this small scenario, it's not a huge deal, it's a good practice to get into.
- Add some comments into here to describe what the code is doing. There are no comments here, so some parts of the code could be unclear to some.
- You have a lot of extra blank lines in here. Find the ones can be removed to increase clarity.
- I'm not sure why you're incrementing
i
. The loop is already incrementing it. This may be a bug.
Anyways, if there's anything else you want me to cover, just ask in the comments, and I'll cover it. Hope this helps!
-
6\$\begingroup\$ I like
needle
andhaystack
very much. It makes it very clear which parameter play which role. \$\endgroup\$200_success– 200_success2015年06月10日 19:35:03 +00:00Commented Jun 10, 2015 at 19:35 -
\$\begingroup\$ Don't take for granted that the reader will be familiar with the 'needle in a haystack' idiom. A non-native English reader might have a hard time making sense out of those words if they translated them literally. \$\endgroup\$Ben Collier– Ben Collier2015年06月12日 19:31:34 +00:00Commented Jun 12, 2015 at 19:31
1) I don't think that splitting a string to an array and accessing its elements is faster than using String's native method charAt
.
2) You don't need to iterate through all characters of haystack
, unless needle
's length is 1
.
3) Since indexOf
is within String
prototype scope you don't need to prefix your function name with underscore, just use indexOf
as is.
function indexOf(needle, haystack) {
var len = needle.length, lim = haystack.length - len;
for (var i = 0; i < lim; i++) {
for (var j = 0; j < len && haystack.charAt(i+j) == needle.charAt(j); j++);
if (j >= len) return i;
}
return -1;
}
Explore related questions
See similar questions with these tags.