0
\$\begingroup\$

I have invented the fastest string search algorithm. Can someone prove that it is not fastest?

I am not considering worst case scenarios.

The algorithm can be found here: https://sourceforge.net/projects/fastest-string-search-algo/files/


 /*
 * Choudhary algorithm
 */
 public static int choudharyStringSearchAlgorithm(char[] text, char[] pattern) {
 int i = 0;
 int index = 0;
 int end_index = 0;
 boolean not_found = false;
 int text_len = text.length;
 int pattern_len = pattern.length;
 int pi_44 = pattern_len - 1;
 int pi_34 = (3 * pattern_len) / 4;
 int pi_24 = pattern_len / 2;
 int pi_14 = pattern_len / 4;
 int[] skip_table = new int[ASIZE];
 // preprocess pattern and fill skip_table
 for (i = 0; i < pattern_len; i++) {
 skip_table[pattern[i]] = pattern_len - 1 - i;
 }
 // now search
 for (i = 0; i < text_len; i++) {
 if ((text_len - i) < pattern_len) {
 return -1;
 }
 if (pattern[pi_44] != text[i + pi_44]) {
 if (skip_table[(int) (text[i + pi_44])] > 0) {
 i = i + skip_table[(int) (text[i + pi_44])] - 1;
 }
 continue;
 }
 if (pattern[0] != text[i]) {
 continue;
 }
 if (pattern[pi_24] != text[i + pi_24]) {
 continue;
 }
 if (pattern[pi_34] != text[i + pi_34]) {
 continue;
 }
 if (pattern[pi_14] != text[i + pi_14]) {
 continue;
 }
 end_index = i + pi_44;
 not_found = false;
 for (index = i; index <= end_index; index++) {
 if (text[index] != pattern[index - i]) {
 not_found = true;
 break;
 }
 } // end of inner for loop
 if (not_found == false) { // match is found
 return i;
 }
 } // end of outer for loop
 return -1;
 } // end of choudharyStringSearchAlgorithm
Toby Speight
87.7k14 gold badges104 silver badges325 bronze badges
asked Jul 17, 2021 at 5:59
\$\endgroup\$
3
  • 4
    \$\begingroup\$ Looks like you've re-invented Boyer-Moore-Horspool string searching. At least in theory, Boyer-Moore string searching is a bit faster (but honestly, the difference is quite small). Even though it's been invented before, independently re-inventing BMH is no mean feat. \$\endgroup\$ Commented Jul 17, 2021 at 7:30
  • \$\begingroup\$ This algorithm is not Boyer-Moore-Horspool. Please see the algorithm carefully. It compares characters at indices pattern_len - 1, 0, (3 * pattern_len) / 4, pattern_len / 2, and pattern_len / 4 before going for full pattern search. \$\endgroup\$ Commented Jul 17, 2021 at 9:23
  • 1
    \$\begingroup\$ This [does not implement] Boyer-Moore-Horspool I'd call it Raita². \$\endgroup\$ Commented Jan 20, 2022 at 12:05

1 Answer 1

2
\$\begingroup\$

If you are searching for long patterns, your algorithm wins most of the time (long strings as far to the end of the text as possible are favored), but if you compare the algorithms on patterns up to 100 characters long (human search), Boyer-Moore smokes everyone and it is not even close.

Edit: After some testing I have found out that your algorithm is worse on long patterns than if you just implemented the skip table.

Edit 2: Actually here is the implementation that is consistently faster than every other algorithm in your collection more than 95% of the time:

public class MyTextSearch implements TextSearch
{
 @Override
 public int search(
 char[] text,
 char[] pattern
 )
 {
 int[] skip_table = new int[256];
 int patternLength = pattern.length - 1;
 char lastPatternChar = pattern[patternLength];
 Arrays.fill(skip_table, patternLength);
 for (int i = 0; i < pattern.length; i++)
 {
 skip_table[pattern[i]] = patternLength - i - 1;
 }
 for (int i = 0; i < text.length; i++)
 {
 if (i + patternLength >= text.length)
 {
 return -1;
 }
 char lastTextChar = text[i + patternLength];
 if (lastPatternChar != lastTextChar)
 {
 i += skip_table[lastTextChar];
 continue;
 }
 if (isEqual(text, i, pattern))
 {
 return i;
 }
 }
 return -1;
 }
 private boolean isEqual(
 char[] text,
 int at,
 char[] pattern
 )
 {
 for (int j = 0; j < pattern.length; j++)
 {
 if (text[at + j] != pattern[j])
 {
 return false;
 }
 }
 return true;
 }
 @Override
 public String name()
 {
 return "My Text Search";
 }
}

I have refactored your code a bit and it is on https://github.com/blazmrak/fastest-string-search if you are interested.

answered Jul 17, 2021 at 19:50
\$\endgroup\$
3
  • \$\begingroup\$ I have modified my algorithm. I tested and it comes out to be faster than my previous version and beats all other algorithms (Boyer Moore, etc.). The pattern for searching is taken from the end of string. You can find my new algorithm here: codereview.stackexchange.com/questions/270565/… \$\endgroup\$ Commented Dec 7, 2021 at 8:56
  • \$\begingroup\$ Is it faster than my implementation? Because I don't see my implementation in the list there. \$\endgroup\$ Commented Dec 7, 2021 at 9:57
  • \$\begingroup\$ I actually didn't include your implementation. I will do it when I get time. \$\endgroup\$ Commented Dec 7, 2021 at 11:48

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.