Problem:
Given 2 strings, consider all the substrings within them of length len. Len will be 1 or more. Returns true if there are any such substrings which appear in both strings. Compute this in linear time using a HashSet.
Solution:
import java.util.HashSet;
public class Standford3 {
public static void main(String[] args) {
System.out.println("Test 1: " +
(false == stringIntersect("blahblah", "foralheh", 3))
);
System.out.println("Test 2: " +
(true == stringIntersect("checking", "deck", 2))
);
System.out.println("Test 3: " +
(false == stringIntersect("derping", "slurp", 3))
);
System.out.println("Test 4: " +
(false == stringIntersect("foo", "bar", 1))
);
System.out.println("Test 5: " +
(true == stringIntersect("nowai", "55&dcsnow", 3))
);
}
public static boolean stringIntersect(String a, String b, int len) {
if (a.length() == 0 || b.length() == 0) { return false; }
HashSet<String> alpha = permutateString(a, len);
HashSet<String> beta = permutateString(b, len);
for (String s : alpha) {
if (beta.contains(s)) { return true; }
}
return false;
}
public static HashSet<String> permutateString(String str, int i) {
if (i > str.length()) {
throw new IllegalArgumentException(
"Substring length cannot be larger than provided string"
);
}
HashSet<String> set = new HashSet<>();
int count = i;
for (int j = 0; j < str.length(); j++ ) {
if (count > str.length()) { break; }
set.add(str.substring(j, count));
count++;
}
return set;
}
}
Are these tests sufficient? Unlike the two challenges that preceded it, the tests are mine; is there anything I should be testing for that I missed?
I hope this isn't outside of codereview territory, but I'm wondering if something more was meant by "compute in linear time" or does this adequately encompass that requirement?
Although the use of HashSet was explicitly cited, I'm wondering if there are performance advantages in using another built in class like
LinkedHashSet
?
-
1\$\begingroup\$ It is difficult to say what was exactly meant by "linear time". If len is treated as a constant, your code has linear time complexity. If it is not, then your code has O(n^2) time complexity. \$\endgroup\$kraskevich– kraskevich2015年01月01日 23:55:49 +00:00Commented Jan 1, 2015 at 23:55
1 Answer 1
Some simple suggestions for refactoring on your current code (without going into the implementation)
- Use
Set
overHashSet
Your original question says "Compute this in linear time using a HashSet
.", but that usually doesn't mean you need to use HashSet
as the return type. Set<String> result = new HashSet<>()
works better in the sense that callers of your method will not need to know the underlying implementation. Also, you'll be free to change it to LinkedHashSet
without changing the method signature.
- Variable names
Two points here.
i
is usually used in for
loops, so do consider using a better name in the method signature, e.g. permutateString(String str, int length)
.
Your count
isn't actually counting anything, but to keep track of the last exclusive index for substring
-ing (I think I just made a new word up!). For people who don't use String.substring()
often, that line of code will instead read like you are creating sub-strings of increasing lengths.
Looping
for (int j = 0; j < str.length(); j++ ) { if (count > str.length()) { break; } ... }
This can be better expressed as:
for (int j = 0; j <= str.length() - length; j++ ) { result.add(str.substring(j, j + length)); }
Illustration:
1234567890 123 234 345 456 567 678 789 890
So, to extract 3-character sub-strings from a 10-character string, we need to loop from
i = 0
toi = 7
, ori <= string.length() - length
for the latter.Unit testing
As mentioned elsewhere, please consider using unit testing frameworks like JUnit or TestNG. :)
-
\$\begingroup\$ Maybe I'm not understanding this correctly but wouldn't your refactoring suggestion for point 3 result in being out of bounds in some cases? I used a tracker that increments to prevent that. \$\endgroup\$Legato– Legato2015年01月02日 01:21:36 +00:00Commented Jan 2, 2015 at 1:21
-
\$\begingroup\$ @Legato please see my updated answer. \$\endgroup\$h.j.k.– h.j.k.2015年01月02日 01:32:45 +00:00Commented Jan 2, 2015 at 1:32
-
1\$\begingroup\$ @Legato, sorry, please see my second edit. I forgot to add
j
to the second argument just now. Thanks for pointing this out. \$\endgroup\$h.j.k.– h.j.k.2015年01月02日 02:11:12 +00:00Commented Jan 2, 2015 at 2:11
Explore related questions
See similar questions with these tags.