2
\$\begingroup\$

Given a suffix array for a word, check if a pattern (consecutive chars) exists.

Example:

A suffix array constructed for "ameya" should return true for "ame" but false for "foo" or false for "ameyp".

This algoritm is case sensitive.

I'm looking for code-review, best practices and optimizations.

public final class SuffixArrayPatternMatch {
 private final String[] suffixArray;
 public SuffixArrayPatternMatch(String[] suffixArray) {
 this.suffixArray = suffixArray;
 }
 /**
 * Returns 0, if the string starts with the pattern.
 * Returns 1, if the string is greater than 1.
 * Returns -1, if the string is smaller than pattern. 
 * 
 * 
 * @param str the string to be matched with the pattern.
 * @param pattern the pattern to be searched in the string
 * @return -1, 0, or 1.
 */
 private int compare(String str, String pattern) {
 int length = str.length() < pattern.length() ? str.length() : pattern.length();
 for (int i = 0; i < length; i++) {
 if (str.charAt(i) < pattern.charAt(i)) return -1;
 if (str.charAt(i) > pattern.charAt(i)) return 1;
 }
 // string was greater than myself. thus return 0.
 return str.length() < pattern.length() ? -1 : 0;
 }
 private boolean binarySearch(int lb, int hb, String pattern) {
 if (lb > hb) {
 return false;
 }
 int mid = (lb + hb) / 2;
 int match = compare(suffixArray[mid], pattern);
 if (match == 0) return true;
 if (match > 0) { 
 return binarySearch(lb, mid -1, pattern);
 } else {
 return binarySearch(mid + 1, hb, pattern);
 }
 }
 /**
 * Checks the pattern if present in the string
 * 
 * @param pattern the pattern to be searched in the string
 * @return true if pattern is found else false.
 */
 public boolean search (String pattern) {
 return binarySearch(0, suffixArray.length - 1, pattern);
 }
}
public class SuffixArrayPatternArrayTest {
 @Test
 public void testTrue() {
 String[] suffix = {"a", "ameya", "eya", "meya", "ya"};
 SuffixArrayPatternMatch sapm = new SuffixArrayPatternMatch(suffix);
 for (int i = 0; i < suffix.length; i++) {
 assertTrue(sapm.search(suffix[i]));
 }
 assertTrue(sapm.search("mey"));
 assertTrue(sapm.search("ame"));
 }
 @Test
 public void testFalse() {
 String[] suffix = {"a", "ameya", "eya", "meya", "ya"};
 SuffixArrayPatternMatch sapm = new SuffixArrayPatternMatch(suffix);
 assertFalse(sapm.search("foo"));
 assertFalse(sapm.search("yap"));
 assertFalse(sapm.search("ameyap"));
 }
}
Jamal
35.2k13 gold badges134 silver badges238 bronze badges
asked Jun 17, 2014 at 20:49
\$\endgroup\$

2 Answers 2

1
\$\begingroup\$

Since you have a hidden implementation requirement that the suffix be sorted (i.e. binary search), you should sort the array before assigning it to the member variable.

Your compare implementation differs from that of the String class in that it returns 0 if str starts with pattern. Why not code it that way?

private int compare(String str, String pattern) {
 return str.startsWith(pattern) ? 0 : str.compareTo(pattern);
}
answered Jun 19, 2014 at 4:46
\$\endgroup\$
2
  • \$\begingroup\$ A suffix array is sorted by definition. Instead of sorting, I would throw an exception if the suffix array is not sorted. \$\endgroup\$ Commented Jul 3, 2014 at 6:54
  • \$\begingroup\$ @mjolka in the same boat as @rolfl, didn't realize this was an actual data structure. In light of such, an exception would be appropriate. \$\endgroup\$ Commented Jul 3, 2014 at 18:23
1
\$\begingroup\$

Right, talk about confusing. This code is written about 'suffix search', but all it does is prefix-search. Suffix happens at the end of the word, prefix happens at the beginning.

answered Jul 3, 2014 at 5:10
\$\endgroup\$
2
  • \$\begingroup\$ It's checking if the pattern is a prefix of any element in the suffix array. p is a substring of s if and only if p is a prefix of a suffix of s. \$\endgroup\$ Commented Jul 3, 2014 at 7:13
  • \$\begingroup\$ @mjolka - Oh, you're right. I have not heard of the concept of a "suffix array" before. I see it is a well-defined structure. This is a clear indication that the question does not contain enough detail about the problem being solved. I don't think I will delete my answer, it makes it clear that naming is a problem in the code. \$\endgroup\$ Commented Jul 3, 2014 at 11:19

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.