I want to check if a string contains only a single occurrence of the given substring (so the result of containsOnce("foo-and-boo", "oo")
should be false
). In Java, a simple and straightforward implementation would be either
boolean containsOnce(final String s, final CharSequence substring) {
final String substring0 = substring.toString();
final int i = s.indexOf(substring0);
return i != -1 && i == s.lastIndexOf(substring0);
}
or
boolean containsOnce(final String s, final CharSequence substring) {
final String substring0 = substring.toString();
final int i = s.indexOf(substring0);
if (i == -1) {
return false;
}
final int nextIndexOf = s.indexOf(substring0, i + 1);
return nextIndexOf == 0 || nextIndexOf == -1; // nextIndexOf is 0 if both arguments are empty strings.
}
Can you suggest a simpler/more efficient implementation?
3 Answers 3
Perhaps using pattern matching. It looks more readable to me, but I have no idea about which is more efficient.
public static boolean containsOnce(final String s, final CharSequence substring) {
Pattern pattern = Pattern.compile(substring.toString());
Matcher matcher = pattern.matcher(s);
if(matcher.find()){
return !matcher.find();
}
return false;
}
public static void main(String[] args) throws Exception {
System.out.println(containsOnce("aba","a")); //false
System.out.println(containsOnce("abab", "ab")); //false
System.out.println(containsOnce("aba", "b")); //true
System.out.println(containsOnce("aaa", "aa")); //true
System.out.println(containsOnce("","")); //true
System.out.println(containsOnce("ab","")); //false
}
Also note that containsOnce("aaa","aa")
returns true
. Not sure if you would count this as correct or not.
Note: you can also write it as
Pattern pattern = Pattern.compile(substring.toString());
Matcher matcher = pattern.matcher(s);
return matcher.find() && !matcher.find();
I don't know which of the two is more readable.
EDIT:
After looking up whichever is faster: indexof vs matcher
It seems to be the general consensus that indexof is a bit faster, but that we're talking about such low times that it really doesn't matter much.
Go for whichever solution you prefer I guess ...
-
\$\begingroup\$
containsOnce("ab", "*")
throws an exception. \$\endgroup\$Roland Illig– Roland Illig2021年12月19日 10:50:42 +00:00Commented Dec 19, 2021 at 10:50
I don't see anything wrong with the original post. It's concise and readable:
boolean containsOnce(String s, String sub) {
int firstIndex = s.indexOf(sub)
return firstIndex >=0 && firstIndex == s.lastIndexOf(sub)
}
I was writing a test recently that needed to verify that adding a string would refuse if it was already present (I used a set in the black box method) so I wrote something like this:
public void testNonDuplicateAdd() {
String actual = methodUnderTest(stringToAdd, addingTo);
assertThat(actual, containsString(stringToAdd));
int firstIndex = actual.indexOf(stringToAdd);
int lastIndex = actual.lastIndexOf(stringToAdd);
assertThat(firstIndex, equals(lastIndex));
}
In terms of efficiency, the substring.toString()
is the showstopper as soon as substring
is not a java.lang.String
. The rest of the code looks fine.
The documentation of the method should include containsOnce("fooo", "oo")
as an example for returning false
, since that might be a non-obvious corner case.
oo
in the stringooaooxxxxxxxxxxxxx....(huge string here)
the second solution will only scan 2 + 3 letters. The first will scan the full huge string. I do agree that the second solution is efficient and readable enough though. \$\endgroup\$ooxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxoo
the first solution will be faster aslastIndexOf()
scans backwards? \$\endgroup\$