Checking whether a string contains a substring only once

Question 1

I want to check if a string contains only a single occurrence of the given substring (so the result of containsOnce("foo-and-boo", "oo") should be false). In Java, a simple and straightforward implementation would be either

boolean containsOnce(final String s, final CharSequence substring) {
 final String substring0 = substring.toString();
 final int i = s.indexOf(substring0);
 return i != -1 && i == s.lastIndexOf(substring0);
}

or

boolean containsOnce(final String s, final CharSequence substring) {
 final String substring0 = substring.toString();
 final int i = s.indexOf(substring0);
 if (i == -1) {
 return false;
 }
 final int nextIndexOf = s.indexOf(substring0, i + 1);
 return nextIndexOf == 0 || nextIndexOf == -1; // nextIndexOf is 0 if both arguments are empty strings.
}

Can you suggest a simpler/more efficient implementation?

Question 2

Why do you assume the solution is not efficient? In your first solution you only scan the characters that are needed, which seems very efficient already?

Question 3

@RobAu In the first implementation you scan the string up till the end to search for a second occurence. If you search for oo in the string ooaooxxxxxxxxxxxxx....(huge string here) the second solution will only scan 2 + 3 letters. The first will scan the full huge string. I do agree that the second solution is efficient and readable enough though.

Question 4

@Imus maybe I am missing something, but if the huge string is ooxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxoo the first solution will be faster as lastIndexOf() scans backwards?

Question 5

Aha, you are correct. ignore my previous comment :). Then it also comes down to preference on which is more readable, the first solution or my answer.

Question 6

Perhaps using pattern matching. It looks more readable to me, but I have no idea about which is more efficient.

public static boolean containsOnce(final String s, final CharSequence substring) {
 Pattern pattern = Pattern.compile(substring.toString());
 Matcher matcher = pattern.matcher(s);
 if(matcher.find()){
 return !matcher.find();
 }
 return false;
}
public static void main(String[] args) throws Exception {
 System.out.println(containsOnce("aba","a")); //false
 System.out.println(containsOnce("abab", "ab")); //false
 System.out.println(containsOnce("aba", "b")); //true
 System.out.println(containsOnce("aaa", "aa")); //true
 System.out.println(containsOnce("","")); //true
 System.out.println(containsOnce("ab","")); //false
}

Also note that containsOnce("aaa","aa") returns true. Not sure if you would count this as correct or not.

Note: you can also write it as

Pattern pattern = Pattern.compile(substring.toString());
Matcher matcher = pattern.matcher(s);
return matcher.find() && !matcher.find();

I don't know which of the two is more readable.

EDIT:

After looking up whichever is faster: indexof vs matcher
It seems to be the general consensus that indexof is a bit faster, but that we're talking about such low times that it really doesn't matter much.

Go for whichever solution you prefer I guess ...

Question 7

containsOnce("ab", "*") throws an exception.

Question 8

I don't see anything wrong with the original post. It's concise and readable:

boolean containsOnce(String s, String sub) {
 int firstIndex = s.indexOf(sub)
 return firstIndex >=0 && firstIndex == s.lastIndexOf(sub)
}

I was writing a test recently that needed to verify that adding a string would refuse if it was already present (I used a set in the black box method) so I wrote something like this:

public void testNonDuplicateAdd() {
 String actual = methodUnderTest(stringToAdd, addingTo);
 assertThat(actual, containsString(stringToAdd));
 int firstIndex = actual.indexOf(stringToAdd);
 int lastIndex = actual.lastIndexOf(stringToAdd);
 assertThat(firstIndex, equals(lastIndex));
}

Question 9

In terms of efficiency, the substring.toString() is the showstopper as soon as substring is not a java.lang.String. The rest of the code looks fine.

The documentation of the method should include containsOnce("fooo", "oo") as an example for returning false, since that might be a non-obvious corner case.

Imus Imus 4,38710 silver badges25 bronze badges · Answer 1 · 2017-04-21 11:48:19Z

Perhaps using pattern matching. It looks more readable to me, but I have no idea about which is more efficient.

public static boolean containsOnce(final String s, final CharSequence substring) {
 Pattern pattern = Pattern.compile(substring.toString());
 Matcher matcher = pattern.matcher(s);
 if(matcher.find()){
 return !matcher.find();
 }
 return false;
}
public static void main(String[] args) throws Exception {
 System.out.println(containsOnce("aba","a")); //false
 System.out.println(containsOnce("abab", "ab")); //false
 System.out.println(containsOnce("aba", "b")); //true
 System.out.println(containsOnce("aaa", "aa")); //true
 System.out.println(containsOnce("","")); //true
 System.out.println(containsOnce("ab","")); //false
}

Also note that containsOnce("aaa","aa") returns true. Not sure if you would count this as correct or not.

Note: you can also write it as

Pattern pattern = Pattern.compile(substring.toString());
Matcher matcher = pattern.matcher(s);
return matcher.find() && !matcher.find();

I don't know which of the two is more readable.

EDIT:

After looking up whichever is faster: indexof vs matcher
It seems to be the general consensus that indexof is a bit faster, but that we're talking about such low times that it really doesn't matter much.

Go for whichever solution you prefer I guess ...

\$\begingroup\$ containsOnce("ab", "*") throws an exception. \$\endgroup\$

Roland Illig
– Roland Illig

2021年12月19日 10:50:42 +00:00
Commented Dec 19, 2021 at 10:50

Loran Loran 1112 bronze badges · Answer 2 · 2018-05-17 18:42:42Z

I don't see anything wrong with the original post. It's concise and readable:

boolean containsOnce(String s, String sub) {
 int firstIndex = s.indexOf(sub)
 return firstIndex >=0 && firstIndex == s.lastIndexOf(sub)
}

I was writing a test recently that needed to verify that adding a string would refuse if it was already present (I used a set in the black box method) so I wrote something like this:

public void testNonDuplicateAdd() {
 String actual = methodUnderTest(stringToAdd, addingTo);
 assertThat(actual, containsString(stringToAdd));
 int firstIndex = actual.indexOf(stringToAdd);
 int lastIndex = actual.lastIndexOf(stringToAdd);
 assertThat(firstIndex, equals(lastIndex));
}

Roland Illig Roland Illig 21.8k2 gold badges36 silver badges83 bronze badges · Answer 3 · 2017-04-21 20:08:47Z

In terms of efficiency, the substring.toString() is the showstopper as soon as substring is not a java.lang.String. The rest of the code looks fine.

The documentation of the method should include containsOnce("fooo", "oo") as an example for returning false, since that might be a non-obvious corner case.

Stack Exchange Network

Checking whether a string contains a substring only once

3 Answers 3

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Checking whether a string contains a substring only once

3 Answers 3

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions