I am working through the Coding Bat exercises for Java. Here is one I have just completed:
Given a string and a non-empty word string, return a version of the original String where all chars have been replaced by pluses ("+"), except for appearances of the word string which are preserved unchanged.
And here is my solution:
public String plusOut(String str, String word){
String s = "";
String n = str + " ";
int i = 0;
while(i < str.length()) {
if (!(n.substring(i, i + word.length()).equals(word))) {
s += "+";
i++;
} else {
s += word;
i += word.length();
}
}
return s;
}
As you can see, I concatenated two spaces on the original string and used that to test the substrings. This was to avoid out of bounds exceptions.
What I want to do is optimise my solution, because I don't like how I added two spaces just to avoid errors. I want to change the if statement so that it tests the substrings of str rather than n, and get rid of n entirely. I have tried but I can't seem to find a way of doing it. How can I do this? What else needs optimising?
-
1\$\begingroup\$ Not a solution, but a hint is to check the word length before doing the substring. Your issue seems to be when i approaches the end of str, then you need to ensure that you don't attempt to substring past the end. Post back any attempts or success, and we can help push in the right direction. \$\endgroup\$Matt Klinker– Matt Klinker2015年04月08日 19:03:17 +00:00Commented Apr 8, 2015 at 19:03
4 Answers 4
Manipulating Strings directly is a bad idea, especially when you are doing it repeatedly. Java strings are immutable, so every time you do s += ..., you are making a copy of the entire provisional result. Anytime you need to do a substantial amount of editing on a string, use a StringBuilder.
To look for the next occurrence of word, use String.indexOf() or StringBuilder.indexOf(). It's a lot easier than extracting substrings and checking for equality. When extracting the substring, your code crashes if word is longer than three characters. Using .indexOf() is one way to eliminate that problem.
Here is an example of a more efficient solution:
public String plusOut(String str, String word) {
StringBuilder s = new StringBuilder(str);
// Find substrings to redact, from index i (inclusive)
// to index j (exclusive)
for (int i = 0; i < s.length(); i += word.length()) {
int j = s.indexOf(word, i);
if (j < 0) {
// word not found; redact the rest of the string
j = s.length();
}
while (i < j) {
s.setCharAt(i++, '+');
}
}
return s.toString();
}
-
\$\begingroup\$ Duly noted. Need to spend a lot of time with StringBuilder. Thanks for pointing me in the right direction. \$\endgroup\$alanbuchanan– alanbuchanan2015年04月09日 01:19:57 +00:00Commented Apr 9, 2015 at 1:19
As @mklinker proposed, you could try to use the word lengths. I tried to come up with an example:
public String plusOut(String str, String word){
String result = "";
int s = str.length();
int w = word.length();
for (int i = 0; i < s; i++) {
if (i <= s - w) {
if (str.substring(i,i+w).equals(word)) {
result += word;
i += w-1;
}
else {
result += "+";
}
}
else {
result += "+";
}
}
return result;
}
-
\$\begingroup\$ I suggest rewriting the loop as
for (int i = 0; i < s; ) { if (i <= s - w && str.substring(i, i+w).equals(word)) { result += word; i += w; } else { result += "+"; i++; } }. It's like the original code, except that thei <= s - wprerequisite acts to prevent.substring()from crashing. Movingi++out of the loop header makes it clearer thatiincreases by the length of the newly appended suffix. \$\endgroup\$200_success– 200_success2015年04月09日 06:16:08 +00:00Commented Apr 9, 2015 at 6:16 -
\$\begingroup\$ Instead of incrementing
iyourself, you could write the loop aswhile ((i = result.length()) < s) { ... }, which makes it clear thatiis always the length of theresultso far. \$\endgroup\$200_success– 200_success2015年04月09日 06:20:49 +00:00Commented Apr 9, 2015 at 6:20
At its core, this is a conditional string replacement problem, so it might make sense to use string replacement techniques.
public String plusOut(String str, String word) {
StringBuffer result = new StringBuffer(str.length());
Matcher m = Pattern.compile(Pattern.quote(word) + "|.").matcher(str);
while (m.find()) {
// Replace word with itself, and any other character with "+"
m.appendReplacement(result, word.equals(m.group()) ? "0ドル" : "+");
}
return m.appendTail(result).toString();
}
At every point in str, it looks for either the word (which, if found, is replaced by itself) or a single character (which is replaced by a "+").
Pros
- This solution is based on a standard recipe.
- The thinking is at a higher level of abstraction, without headache-inducing indexes like
ior the possibility ofStringIndexOutOfBoundsExceptions. All of those details are taken care of by code withinMatcher.
Cons
- This solution is appealing to those who know regular expressions (which is really a separate skill from knowing Java), and probably befuddling to those who don't.
There are subtle pitfalls that could cause one to write this solution wrong:
- The alternation in the pattern has to be
(?>word)|., not.|(?>word), due to the leftmost-longest rule of pattern matching. - The word in the pattern has to be quoted in case
wordcontains characters with special significance in regexes. - The replacement text has to be
"0ドル", notword, in casewordcontains a$character.
- The alternation in the pattern has to be
- It is less efficient than a well-implemented low-level
StringBuilder-based solution. Still, it scales better than your original code (which is O(n2) due to inefficient string concatenation).
200_success and DJanssens both make good points.
I want to add that, again, Regular Expressions are often useful for text manipulation. This work requires a complicated pattern if you want to keep the groups, but if you can ignore the groups, it is easier.
Note also, that StringBuilders are also more efficient than String concatenation.
Two traditional Java versions. One using a simple split, the other using a more complex one:
Simple split
public static String plusOut(String str, String word) {
String[] parts = str.split(Pattern.quote(word));
StringBuilder sb = new StringBuilder(str.length());
for (int i = 0; i < parts.length; i++) {
if (i > 0) {
sb.append(word);
}
sb.append(parts[i].replaceAll(".", "+"));
}
return sb.toString();
}
Note the use of Pattern.quote(...) in case the input word is something like Boo(hoo)? ... we don't want the word to be interpreted as a regex pattern itself.
Complex split:
public static String plusOut(String str, String word) {
String[] parts = str.split("(?=\\Q" + word + "\\E)|(?<=\\Q" + word + "\\E)");
StringBuilder sb = new StringBuilder(str.length());
for (int i = 0; i < parts.length; i++) {
sb.append(i % 2 == 0 ? parts[i].replaceAll(".", "+") : parts[i]);
}
return sb.toString();
}
Java 8
In Java 8 I would consider the following:
public static String plusOut8(String str, String word) {
return Stream.of(str.split("\\Q" + word + "\\E"))
.map(fill -> fill.replaceAll(".", "+"))
.collect(Collectors.joining(word));
}
See these all working in Ideone
-
1\$\begingroup\$ Solutions based on
split()are fraught with peril. In fact, none of these three implementations is correct forplusOut("monkey-monkeymonkey", "monkey"), and they even break in different ways in Java 7 and 8, due to a compatibility-breaking change. Even without that change, the boundary conditions are difficult to get right when splitting. \$\endgroup\$200_success– 200_success2015年04月09日 05:55:11 +00:00Commented Apr 9, 2015 at 5:55
You must log in to answer this question.
Explore related questions
See similar questions with these tags.