Determining the existence of string Intersections

Question 1

Problem:

Given two strings, consider all the substrings within them of length len, where len is 1 or more. Return true if any such substring appears in both strings. Compute this in linear time using a HashSet.

Solution:

import java.util.HashSet;

public class Standford3 {

    public static void main(String[] args) {
        System.out.println("Test 1: " +
            (false == stringIntersect("blahblah", "foralheh", 3))
        );
        System.out.println("Test 2: " +
            (true == stringIntersect("checking", "deck", 2))
        );
        System.out.println("Test 3: " +
            (false == stringIntersect("derping", "slurp", 3))
        );
        System.out.println("Test 4: " +
            (false == stringIntersect("foo", "bar", 1))
        );
        System.out.println("Test 5: " +
            (true == stringIntersect("nowai", "55&dcsnow", 3))
        );
    }

    public static boolean stringIntersect(String a, String b, int len) {
        if (a.length() == 0 || b.length() == 0) { return false; }
        HashSet<String> alpha = permutateString(a, len);
        HashSet<String> beta = permutateString(b, len);
        for (String s : alpha) {
            if (beta.contains(s)) { return true; }
        }
        return false;
    }

    public static HashSet<String> permutateString(String str, int i) {
        if (i > str.length()) {
            throw new IllegalArgumentException(
                "Substring length cannot be larger than provided string"
            );
        }
        HashSet<String> set = new HashSet<>();
        int count = i;
        for (int j = 0; j < str.length(); j++ ) {
            if (count > str.length()) { break; }
            set.add(str.substring(j, count));
            count++;
        }
        return set;
    }
}

Are these tests sufficient? Unlike the two challenges that preceded it, the tests here are my own; is there anything I missed that I should be testing for?
I hope this isn't outside Code Review territory, but I'm wondering whether something more was meant by "compute in linear time", or does this solution adequately meet that requirement?
Although the use of HashSet was explicitly requested, I'm wondering whether there are performance advantages to using another built-in class such as LinkedHashSet.

Comment

It is difficult to say exactly what was meant by "linear time". If len is treated as a constant, your code has linear time complexity. If it is not, each substring costs O(len) to create and hash, so your code takes O(n * len) time, which is O(n^2) in the worst case.
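To make the dependence on len concrete, here is a rough sketch that counts the characters copied when every window of length len is extracted with substring. It assumes substring copies its characters, as it does in recent JDKs; SubstringCost and charsCopied are hypothetical names, not part of the original code.

```java
// Hypothetical helper for illustration only; not part of the reviewed code.
class SubstringCost {

    // Characters copied when every window of length len in a string of
    // length n is extracted with substring(), assuming each call copies.
    static long charsCopied(int n, int len) {
        if (len > n) {
            return 0; // no window fits
        }
        long windows = (long) n - len + 1; // number of start positions
        return windows * len;              // each window copies len chars
    }

    public static void main(String[] args) {
        System.out.println(charsCopied(1000, 3));   // 998 windows * 3   = 2994
        System.out.println(charsCopied(1000, 500)); // 501 windows * 500 = 250500
    }
}
```

With len fixed at 3 the work grows in step with n; with len near n/2 it grows roughly as n^2/4, which is where the quadratic bound comes from.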

Answer

Some simple suggestions for refactoring your current code (without going into the implementation):

Use Set over HashSet

Your original question says "Compute this in linear time using a HashSet", but that usually doesn't mean you need to use HashSet as the return type. Declaring the return type as Set<String> and writing Set<String> result = new HashSet<>() works better, in the sense that callers of your method will not need to know the underlying implementation. You'll also be free to change it to LinkedHashSet without changing the method signature.
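A minimal sketch of that idea, assuming hypothetical names (SetReturnSketch, substrings, orderedSubstrings, fill) that are not in your code: both public methods expose only Set<String>, so swapping HashSet for LinkedHashSet changes nothing for callers.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

// Illustrative names only; not taken from the reviewed code.
class SetReturnSketch {

    // The return type is the interface; callers never see the concrete class.
    static Set<String> substrings(String str, int length) {
        return fill(new HashSet<String>(), str, length);
    }

    // Same signature shape, different implementation: keeps first-seen order.
    static Set<String> orderedSubstrings(String str, int length) {
        return fill(new LinkedHashSet<String>(), str, length);
    }

    // Adds every window of the given length to the supplied set.
    private static Set<String> fill(Set<String> result, String str, int length) {
        for (int j = 0; j + length <= str.length(); j++) {
            result.add(str.substring(j, j + length));
        }
        return result;
    }
}
```

A caller would write Set<String> alpha = SetReturnSketch.substrings(a, len); in either case and would not need to change if the implementation did.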

Variable names

Two points here.

i is usually used in for loops, so do consider using a better name in the method signature, e.g. permutateString(String str, int length).

Your count isn't actually counting anything; it keeps track of the exclusive end index for substring-ing (I think I just made a new word up!). For people who don't use String.substring() often, that line of code will instead read as though you are creating sub-strings of increasing lengths.

Looping

for (int j = 0; j < str.length(); j++ ) {
 if (count > str.length()) { break; }
 ...
}

This can be better expressed as:

for (int j = 0; j <= str.length() - length; j++ ) {
 result.add(str.substring(j, j + length));
}

Illustration:

So, to extract 3-character sub-strings from a 10-character string, we need to loop from j = 0 to j = 7, i.e. while j <= str.length() - length.
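The bounds can be checked with a small sketch. LoopBoundsSketch and windows are hypothetical names used only for this illustration; a List is returned here (rather than a Set) purely so the start positions stay visible in order.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: shows which start indices the refactored loop visits.
class LoopBoundsSketch {

    // Returns every window of the given length, in start-index order.
    static List<String> windows(String str, int length) {
        List<String> result = new ArrayList<String>();
        for (int j = 0; j <= str.length() - length; j++) {
            // j + length never exceeds str.length(), so this stays in bounds
            result.add(str.substring(j, j + length));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> w = windows("0123456789", 3); // 10 characters, length 3
        System.out.println(w.size());              // 8 windows: j = 0 .. 7
        System.out.println(w.get(0) + " " + w.get(7)); // 012 789
    }
}
```

If length is larger than the string, str.length() - length is negative and the loop body never runs, so no separate break is needed.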

Unit testing

As mentioned elsewhere, please consider using unit testing frameworks like JUnit or TestNG. :)
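As a rough idea of cases worth covering, here is a dependency-free sketch. It assumes a compact copy of stringIntersect that keeps your original behaviour (empty input returns false, an oversized len throws); IntersectChecks, substrings, check and runChecks are hypothetical names, and in JUnit or TestNG each check would become its own test method.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative harness; check() stands in for a framework assertion.
class IntersectChecks {

    // Compact copy of the method under review, using the refactored loop.
    static boolean stringIntersect(String a, String b, int len) {
        if (a.length() == 0 || b.length() == 0) {
            return false;
        }
        Set<String> alpha = substrings(a, len);
        for (String s : substrings(b, len)) {
            if (alpha.contains(s)) {
                return true;
            }
        }
        return false;
    }

    static Set<String> substrings(String str, int length) {
        if (length > str.length()) {
            throw new IllegalArgumentException(
                "Substring length cannot be larger than provided string");
        }
        Set<String> result = new HashSet<String>();
        for (int j = 0; j <= str.length() - length; j++) {
            result.add(str.substring(j, j + length));
        }
        return result;
    }

    static void check(String name, boolean condition) {
        if (!condition) {
            throw new AssertionError(name);
        }
    }

    static void runChecks() {
        check("match at the very end", stringIntersect("xxab", "yyab", 2));
        check("len equals both lengths", stringIntersect("abc", "abc", 3));
        check("empty input", !stringIntersect("", "abc", 1));
        check("same letters, different order", !stringIntersect("abc", "cba", 2));
        boolean threw = false;
        try {
            stringIntersect("ab", "abcd", 3); // len longer than the first input
        } catch (IllegalArgumentException expected) {
            threw = true;
        }
        check("oversized len throws", threw);
    }

    public static void main(String[] args) {
        runChecks();
        System.out.println("all checks passed");
    }
}
```

The boundary cases (a match in the final window, len equal to the string length, len larger than one input) are the ones your five tests don't exercise.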

Comment

Maybe I'm not understanding this correctly but wouldn't your refactoring suggestion for point 3 result in being out of bounds in some cases? I used a tracker that increments to prevent that.

Comment

@Legato please see my updated answer.

Comment

@Legato, sorry, please see my second edit. I forgot to add j to the second argument just now. Thanks for pointing this out.

h.j.k. · Accepted Answer · 2015-01-02 00:49:59Z

