I wrote this function to format a given sentence in Java, and I'm wondering whether there is a better way of doing this without external libraries, for example with regex.
    public class ChatFormat {
        public static void main(String[] args) {
            String input = "hello, it's a monday today and i think i'll go to the store. and get? some! tea.";
            System.out.println(optimizeText(input));
        }

        public static String optimizeText(String text) {
            char buf[] = text.toLowerCase().toCharArray();
            boolean endMarker = true;
            for (int i = 0; i < buf.length; i++) {
                char c = buf[i];
                if (endMarker && c >= 'a' && c <= 'z') {
                    buf[i] = Character.toUpperCase(c);
                    endMarker = false;
                }
                if (c == '.' || c == '!' || c == '?')
                    endMarker = true;
                if (c == 'i') {
                    char next = 0;
                    if (i + 1 < buf.length) {
                        next = buf[i + 1];
                    }
                    char last = 0;
                    if (i - 1 > 0)
                        last = buf[i - 1];
                    if (last == ' ' && (next == ' ' || next == '\'' || next == 0))
                        buf[i] = Character.toUpperCase(c);
                }
            }
            return new String(buf, 0, buf.length);
        }
    }
1 Answer
- Inconsistent braces usage
Some of your `if` statements have enclosing `{ }` but some do not; please standardize and use them throughout. :)
- Character comparison
`c >= 'a' && c <= 'z'` can be better represented as `Character.isLetter(c)`. IMHO it's easier to comprehend.
Similarly, `last == ' '` and `next == ' '` can be replaced with `Character.isWhitespace(last)` and `Character.isWhitespace(next)` respectively.
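One caveat worth checking before swapping these in: the `Character` methods are broader than the literal comparisons, so they are not exact drop-in replacements. The sketch below is only an illustration (the class name `CharChecks` and the helper `isAsciiLower` are made up for this example); it prints how each check classifies a few sample characters so you can decide which behaviour you want.

```java
class CharChecks {
    // The ASCII-only range test from the question, wrapped in a hypothetical helper.
    static boolean isAsciiLower(char c) {
        return c >= 'a' && c <= 'z';
    }

    public static void main(String[] args) {
        // 'a', 'Z', e-acute, a digit, a space and a tab
        char[] samples = {'a', 'Z', '\u00e9', '1', ' ', '\t'};
        for (char c : samples) {
            System.out.printf(
                    "U+%04X asciiLower=%b isLetter=%b isLowerCase=%b isWhitespace=%b%n",
                    (int) c,
                    isAsciiLower(c),
                    Character.isLetter(c),
                    Character.isLowerCase(c),
                    Character.isWhitespace(c));
        }
    }
}
```

For instance, `Character.isLetter` accepts upper-case and accented letters that the range test rejects, and `Character.isWhitespace` also accepts tabs and newlines. For sentence formatting that is probably what you want, but it is a behaviour change.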
- Derivation of `next` and `last` characters when encountering the letter 'i'
I'd prefer to use the ternary operator to set `next` and `last`; it's a little more compact:
    char next = i < buf.length - 1 ? buf[i + 1] : 0;
    char last = i > 0 ? buf[i - 1] : 0;
Note that your original `i - 1 > 0` check never reads `buf[0]`; `i > 0` is probably what was intended.
- If-else-if ladder
Since at most one of your three `if` conditions needs to act on any given character (an 'i' that starts a sentence is already capitalized by the first one), you can also consider putting them together in one if-else-if ladder, e.g.
    if (endMarker && Character.isLetter(c)) {
        ...
    } else if (c == '.' || c == '!' || c == '?') {
        ...
    } else if (c == 'i') {
        ...
    }
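- Use of `toLowerCase()`
Since you are already operating on a character array, perhaps you can consider converting to lower case as the final `else` branch, i.e.
    char buf[] = text.toCharArray();
    ...
    // inside the for loop
    if (...) {
        ...
    } else if (c == 'i') {
        ...
    } else {
        buf[i] = Character.toLowerCase(c);
    }
This does save one extra copy of the `String` in lower case, if anything. Be aware that the input is then no longer lower-cased up front, so the `c == 'i'` test also has to accept 'I', and that branch must still lower-case the letter when it is not a standalone "I".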
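Putting the ladder and the lower-casing `else` branch together, a minimal sketch could look like the following. It is one possible arrangement rather than the definitive one: the class name `SinglePassFormat` and the helper `isStandaloneI` are hypothetical, the 'i' test accepts both cases because the text is not lower-cased beforehand, and the look-behind uses `i > 0` so that index 0 is considered.

```java
class SinglePassFormat {
    static String optimizeText(String text) {
        char[] buf = text.toCharArray();
        boolean endMarker = true; // true while waiting for the first letter of a sentence
        for (int i = 0; i < buf.length; i++) {
            char c = buf[i];
            if (endMarker && Character.isLetter(c)) {
                // First letter of a sentence.
                buf[i] = Character.toUpperCase(c);
                endMarker = false;
            } else if (c == '.' || c == '!' || c == '?') {
                // Sentence terminator: capitalize the next letter we see.
                endMarker = true;
            } else if ((c == 'i' || c == 'I') && isStandaloneI(buf, i)) {
                // The pronoun "I", also in contractions such as "I'll".
                buf[i] = 'I';
            } else {
                // Everything else is lower-cased in place.
                buf[i] = Character.toLowerCase(c);
            }
        }
        return new String(buf);
    }

    // Hypothetical helper: true when buf[i] is preceded by whitespace and
    // followed by whitespace, an apostrophe, or the end of the text.
    static boolean isStandaloneI(char[] buf, int i) {
        char last = i > 0 ? buf[i - 1] : 0;
        char next = i < buf.length - 1 ? buf[i + 1] : 0;
        return Character.isWhitespace(last)
                && (Character.isWhitespace(next) || next == '\'' || next == 0);
    }

    public static void main(String[] args) {
        String input = "hello, it's a monday today and i think i'll go to the store. and get? some! tea.";
        // prints: Hello, it's a monday today and I think I'll go to the store. And get? Some! Tea.
        System.out.println(optimizeText(input));
    }
}
```

Folding the standalone check into the `else if` condition means an 'I' in the middle of a word (as in "BIG") falls through to the final branch and is lower-cased like any other character.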
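- Testing
It's fine so far to showcase one example using `public static void main`, but do consider building out some unit tests using a unit testing framework (e.g. TestNG, JUnit):
    import static org.hamcrest.MatcherAssert.assertThat;
    import static org.hamcrest.Matchers.equalTo;
    import org.testng.annotations.Test;

    public class ChatFormatTest {
        @Test // TestNG
        public void doTest() {
            // Hamcrest matchers
            assertThat(ChatFormat.optimizeText("that is its"), equalTo("That is its"));
            assertThat(ChatFormat.optimizeText("i'm"), equalTo("I'm"));
        }
    }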
- Other notes
Your simplified logic treats every '.', '!' or '?' as the end of a sentence, so it won't handle all sentence structures accurately (abbreviations, for instance). But since you seem to be more concerned with the formatting (capitalization), I guess it's OK.
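To make that limitation concrete, here is a small sketch that reuses the logic from the question unchanged (only braces were added, and the class name `LimitationDemo` is made up for this example) and feeds it two inputs where a '.' does not end a sentence.

```java
class LimitationDemo {
    // Same logic as the question's optimizeText, with braces added.
    static String optimizeText(String text) {
        char[] buf = text.toLowerCase().toCharArray();
        boolean endMarker = true;
        for (int i = 0; i < buf.length; i++) {
            char c = buf[i];
            if (endMarker && c >= 'a' && c <= 'z') {
                buf[i] = Character.toUpperCase(c);
                endMarker = false;
            }
            if (c == '.' || c == '!' || c == '?') {
                endMarker = true;
            }
            if (c == 'i') {
                char next = 0;
                if (i + 1 < buf.length) {
                    next = buf[i + 1];
                }
                char last = 0;
                if (i - 1 > 0) {
                    last = buf[i - 1];
                }
                if (last == ' ' && (next == ' ' || next == '\'' || next == 0)) {
                    buf[i] = Character.toUpperCase(c);
                }
            }
        }
        return new String(buf, 0, buf.length);
    }

    public static void main(String[] args) {
        // The dots in an abbreviation are taken for sentence ends.
        // prints: We need tea, e.G. Green tea.
        System.out.println(optimizeText("we need tea, e.g. green tea."));
        // So is the dot in a decimal number.
        // prints: Version 1.5 Is out
        System.out.println(optimizeText("version 1.5 is out"));
    }
}
```

Cases like these make good additions to the unit tests, even if you decide the current behaviour is acceptable for chat messages.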
- Very well explained answer. Thanks for this! (Jonathan Beaudoin, Dec 30, 2014 at 3:53)
- @JonathanBeaudoin glad to help! (h.j.k., Dec 30, 2014 at 3:53)