\$\begingroup\$

I wrote this function in Java to format a given sentence, and I'm wondering whether there is a better way to do it without external libraries, for example with regex.

public class ChatFormat {
    public static void main(String[] args) {
        String input = "hello, it's a monday today and i think i'll go to the store. and get? some! tea.";
        System.out.println(optimizeText(input));
    }

    public static String optimizeText(String text) {
        char buf[] = text.toLowerCase().toCharArray();
        boolean endMarker = true;
        for (int i = 0; i < buf.length; i++) {
            char c = buf[i];
            if (endMarker && c >= 'a' && c <= 'z') {
                buf[i] = Character.toUpperCase(c);
                endMarker = false;
            }
            if (c == '.' || c == '!' || c == '?')
                endMarker = true;
            if (c == 'i') {
                char next = 0;
                if (i + 1 < buf.length) {
                    next = buf[i + 1];
                }
                char last = 0;
                if (i - 1 > 0)
                    last = buf[i - 1];
                if (last == ' ' && (next == ' ' || next == '\'' || next == 0))
                    buf[i] = Character.toUpperCase(c);
            }
        }
        return new String(buf, 0, buf.length);
    }
}
Jamal
asked Dec 30, 2014 at 1:49
\$\endgroup\$

1 Answer

\$\begingroup\$
  1. Inconsistent braces usage

Some of your if statements are enclosed in { } and some are not. Please standardize and use braces throughout. :)

  1. Character comparison

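c >= 'a' && c <= 'z' can be better represented as Character.isLetter(c). IMHO it's easier to comprehend. Note that it is not an exact equivalent: Character.isLetter(c) is also true for uppercase and non-ASCII letters, which is harmless here because the text has already been lower-cased and Character.toUpperCase(c) handles those letters too.

Similarly, last == ' ' and next == ' ' can be replaced with Character.isWhitespace(last) and Character.isWhitespace(next) respectively, which additionally accept tabs and line breaks.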
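
To make the difference between the hand-written range check and the Character methods concrete, here is a small sketch. The class name CharChecks and the helper isAsciiLower are hypothetical and only exist for this illustration.

```java
// Hypothetical demo class: compares the range check from the question
// with the java.lang.Character methods suggested above.
final class CharChecks {
    // The range test from the question: ASCII lowercase letters only.
    static boolean isAsciiLower(char c) {
        return c >= 'a' && c <= 'z';
    }

    public static void main(String[] args) {
        // '\u00e9' is a lowercase e with an acute accent (a non-ASCII letter).
        char[] samples = {'a', 'Z', '\u00e9', '1', ' ', '\t'};
        for (char c : samples) {
            System.out.printf("U+%04X asciiLower=%b isLetter=%b isWhitespace=%b%n",
                    (int) c,
                    isAsciiLower(c),
                    Character.isLetter(c),
                    Character.isWhitespace(c));
        }
    }
}
```

The two checks agree on plain lowercase ASCII input; they only diverge for uppercase letters, accented letters, and whitespace other than a plain space.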

  1. Derivation of next and last characters when encountering the letter 'i'

I'd prefer to use the ternary operator to set next and last, as it's a little more compact:

char next = i < buf.length - 1 ? buf[i + 1] : 0;
char last = i > 0 ? buf[i - 1] : 0; // i > 0, so buf[0] is read when i == 1 (i - 1 > 0 skips it)
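
As a rough sketch of how the neighbour lookup could be pulled into small helpers, assuming the same rule as the question (a space before the 'i', and a space, apostrophe, or end of input after it): the class Neighbours and its method names are hypothetical. The lower bound used here is i > 0, so that index 0 is read when i is 1.

```java
// Hypothetical helper class illustrating bounds-safe neighbour lookup.
final class Neighbours {
    // Returns the character before index i, or 0 when i is the first index.
    static char previous(char[] buf, int i) {
        return i > 0 ? buf[i - 1] : 0;
    }

    // Returns the character after index i, or 0 when i is the last index.
    static char next(char[] buf, int i) {
        return i < buf.length - 1 ? buf[i + 1] : 0;
    }

    // True when buf[i] is a lowercase 'i' standing alone: a space before it,
    // and a space, an apostrophe, or the end of the input after it.
    static boolean isStandaloneI(char[] buf, int i) {
        char before = previous(buf, i);
        char after = next(buf, i);
        return buf[i] == 'i'
                && before == ' '
                && (after == ' ' || after == '\'' || after == 0);
    }
}
```

With this in place the loop body only needs a single call such as Neighbours.isStandaloneI(buf, i), which keeps the index arithmetic in one spot.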
  1. If-else-if ladder

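Since at most one of your three if blocks needs to act on any given character, you can also consider putting them together in one if-else-if ladder. (The first and third conditions can both hold for an 'i' that starts a sentence, but the first branch already capitalizes it.) E.g.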

if (endMarker && Character.isLetter(c)) {
 ...
} else if (c == '.' || c == '!' || c == '?') {
 ...
} else if (c == 'i') {
 ...
}
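
Filled in, the ladder might look like the sketch below. It is one possible reading of the suggestions so far, not a definitive rewrite: the class name ChatFormatLadder is hypothetical, and the use of Locale.ROOT for lower-casing is my own assumption to keep the result independent of the default locale.

```java
import java.util.Locale;

// Hypothetical class name; a sketch of the question's method using an if-else-if ladder.
final class ChatFormatLadder {
    static String optimizeText(String text) {
        // Locale.ROOT is an assumption, chosen so lower-casing does not depend on the default locale.
        char[] buf = text.toLowerCase(Locale.ROOT).toCharArray();
        boolean endMarker = true; // true while waiting for the first letter of a sentence
        for (int i = 0; i < buf.length; i++) {
            char c = buf[i];
            if (endMarker && Character.isLetter(c)) {
                // First letter of a sentence (this also covers a leading 'i').
                buf[i] = Character.toUpperCase(c);
                endMarker = false;
            } else if (c == '.' || c == '!' || c == '?') {
                endMarker = true;
            } else if (c == 'i') {
                char next = i < buf.length - 1 ? buf[i + 1] : 0;
                char last = i > 0 ? buf[i - 1] : 0;
                if (Character.isWhitespace(last)
                        && (Character.isWhitespace(next) || next == '\'' || next == 0)) {
                    buf[i] = 'I';
                }
            }
        }
        return new String(buf);
    }

    public static void main(String[] args) {
        System.out.println(optimizeText(
                "hello, it's a monday today and i think i'll go to the store. and get? some! tea."));
        // prints: Hello, it's a monday today and I think I'll go to the store. And get? Some! Tea.
    }
}
```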
  1. Use of toLowerCase()

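Since you are already operating on a character array, perhaps you can consider converting to lower case in the final else branch instead of calling toLowerCase() up front. The 'i' branch then has to match both cases and lower-case a non-standalone 'I' itself, because the input is no longer normalized, i.e.

char buf[] = text.toCharArray();
...
(for loop)
    if (...) {
        ...
    } else if (c == 'i' || c == 'I') {
        ...
    } else {
        buf[i] = Character.toLowerCase(c);
    }

This avoids creating the intermediate lower-cased String, if anything.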
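
A minimal sketch of that single-pass variant follows, assuming the input may contain uppercase letters anywhere. The class name ChatFormatSinglePass is hypothetical. The 'i' branch here accepts both 'i' and 'I' and writes the final case itself, since nothing has been lower-cased beforehand.

```java
// Hypothetical class name; a sketch that lower-cases inside the loop
// instead of calling toLowerCase() on the whole String first.
final class ChatFormatSinglePass {
    static String optimizeText(String text) {
        char[] buf = text.toCharArray(); // no up-front lower-cased copy
        boolean endMarker = true;        // true while waiting for the first letter of a sentence
        for (int i = 0; i < buf.length; i++) {
            char c = buf[i];
            if (endMarker && Character.isLetter(c)) {
                buf[i] = Character.toUpperCase(c);
                endMarker = false;
            } else if (c == '.' || c == '!' || c == '?') {
                endMarker = true;
            } else if (c == 'i' || c == 'I') {
                // Neighbours are only compared with ' ' and '\'', so their case is irrelevant.
                char next = i < buf.length - 1 ? buf[i + 1] : 0;
                char last = i > 0 ? buf[i - 1] : 0;
                boolean standalone = last == ' ' && (next == ' ' || next == '\'' || next == 0);
                buf[i] = standalone ? 'I' : 'i';
            } else {
                // Everything else is lower-cased; non-letters are returned unchanged.
                buf[i] = Character.toLowerCase(c);
            }
        }
        return new String(buf);
    }
}
```

For example, ChatFormatSinglePass.optimizeText("HELLO. i THINK I'LL GO") returns "Hello. I think I'll go". Whether the saved allocation matters in practice would need measuring; for chat-sized strings it is likely negligible.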

  1. Testing

It's fine so far to showcase one example using public static void main, but do consider building out some unit tests using a unit testing framework (e.g. TestNG, JUnit):

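import static org.hamcrest.MatcherAssert.assertThat;
import static org.hamcrest.Matchers.equalTo;
import org.testng.annotations.Test;

public class ChatFormatTest {
    @Test // TestNG
    public void doTest() {
        // Hamcrest matchers
        assertThat(ChatFormat.optimizeText("that is its"), equalTo("That is its"));
        assertThat(ChatFormat.optimizeText("i'm"), equalTo("I'm"));
    }
}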
  1. Other notes

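Your simplified logic treats every '.', '!' or '?' as the end of a sentence, so abbreviations and decimals are mis-capitalized (for example, "e.g. tea" becomes "E.G. Tea"). Since you seem to be more concerned with the formatting (capitalization) than with parsing sentences, I guess that's OK.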
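
Since the question asked about regex: a rough sketch using only java.util.regex could look like the following. The class name ChatFormatRegex and both patterns are my own assumptions, not something from the original code. It deliberately differs from the loop version in one respect: a '.', '!' or '?' only ends a sentence when whitespace follows it, so a dot inside a token does not trigger capitalization. A trailing abbreviation dot followed by a space still does.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class name; a regex-based sketch of the same formatting rules.
final class ChatFormatRegex {
    // The first letter of the input (after optional leading whitespace), or a letter
    // that follows '.', '!' or '?' plus at least one whitespace character.
    private static final Pattern SENTENCE_START =
            Pattern.compile("(^\\s*|[.!?]\\s+)(\\p{L})");

    // A lone 'i': not preceded by a non-whitespace character, and followed by
    // whitespace, an apostrophe, or the end of the input.
    private static final Pattern STANDALONE_I =
            Pattern.compile("(?<!\\S)i(?=[\\s']|$)");

    static String optimizeText(String text) {
        String lower = text.toLowerCase(Locale.ROOT);
        Matcher m = SENTENCE_START.matcher(lower);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // Keep the terminator and whitespace, upper-case only the letter.
            String replacement = m.group(1) + m.group(2).toUpperCase(Locale.ROOT);
            m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(sb);
        return STANDALONE_I.matcher(sb).replaceAll("I");
    }
}
```

With this sketch, "e.g. tea" becomes "E.g. Tea" rather than "E.G. Tea". It is arguably harder to read than the loop, so whether it counts as "better" is a matter of taste.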

answered Dec 30, 2014 at 2:41
\$\endgroup\$
  • Very well explained answer. Thanks for this! Commented Dec 30, 2014 at 3:53
  • @JonathanBeaudoin glad to help! Commented Dec 30, 2014 at 3:53
