The following method receives a string and finds all occurrences of repeated substrings with a length of three characters, including spaces. Is this a good approach?
public Map<String, Integer> findOccurences(String str) {
Map<String, Integer> map = new HashMap<String, Integer>();
int counter = 0;
String sub;
int stringLen = str.length() - 2;
for (int i = 0; i < stringLen ; i++) {
sub = str.substring(i, i + 3);
if (map.containsKey(sub)) {
counter = map.get(sub);
counter++;
map.put(sub, counter);
counter = 0;
} else {
map.put(sub, 1);
}
}
return map;
}
Input
This This
Output
Th 1 <<< includes space s T 1 his 2 Thi 2 is 1 <<< includes space
1 Answer 1
It's quite fine. The logic is simple and clear. There might be a more clever solution (I don't know), but probably it will be less clear.
A few improvements would be good though:
counter
is set to 0 at two places, both unnecessary, as the value is overwritten later- It's a good practice to declare variables in the smallest scope where they are used. For example
counter
andsub
would be better to declare inside the loop. stringLen
is not a great name, because it's not really the "length" of anything. Evenlimit
would be better- Instead of the magic number 3, and
str.length() - 2
, it would be better to make the target length a method parameter. - Instead of calling
.containsKey
followed byget
andput
, I prefer toget
, checknull
and thenput
, whenever possible. This is not always possible, for example whennull
is a meaningful entry, but that's not the case here.
Based on the above suggestions, the implementation could become a bit simpler:
public Map<String, Integer> findOccurences(String str, int length) {
Map<String, Integer> map = new HashMap<>();
int limit = str.length() - length + 1;
for (int i = 0; i < limit; i++) {
String sub = str.substring(i, i + length);
Integer counter = map.get(sub);
if (counter == null) {
counter = 0;
}
map.put(sub, ++counter);
}
return map;
}
And it would be good to add a unit test for it:
@Test
public void test_This_This_() {
Map<String, Integer> map = new HashMap<>();
map.put("his", 2);
map.put("Thi", 2);
map.put(" Th", 1);
map.put("s T", 1);
map.put("is ", 2);
assertEquals(map, findOccurences("This This ", 3));
}