Reverse word by word efficiency

Question 1

I have a method that reverse each word in a string. I want to know how I can increase it's performance and memory efficiency. Ideas I had in mind were simply using char[] instead of strings being that strings are immutable and each time I concatenate the Strings, I am creating a new String object which is not efficient at all.

Example:

hi there cow --> ih ereht woc

Code:

public static String reverseWord(String str) {
 int len = str.length();
 String strResult = "";
 String strBuffer = "";
 for (int i = 0; i < len; i++) {
 if (str.charAt(i) != ' ') {
 strBuffer = str.charAt(i) + strBuffer;
 }
 else if (str.charAt(i) == ' ') {
 strResult += strBuffer + " ";
 strBuffer = "";
 }
 }
 strResult += strBuffer;
 return strResult;
}

Question 2

possible duplicate of Reversing words in a string

Question 3

@JamesKhoury This question seeks to reverse the characters within each word. The other question seeks to reverse the words within a sentence.

Question 4

@200_success You're right. I had missed that distinction.

Question 5

I literally fell asleep with an answer mostly built.... and now that 200_success and palacsint have covered pretty much my complete answer.

I have to reiterate though that the use of strBuffer = str.charAt(i) + strBuffer; will be your worst performance culprit.

Also, (as I learned recently), your code will not handle surrogate Unicode characters well (it will reverse and thus break them). So, I had put together some code which I believe will be much faster than yours, it will handle space and punctuation in a very different way to yours, and it will correctly not reverse surrogate pairs in the text.

The reason this code will be faster is because it converts the input String to a single char array, and it does not create any other objects until the end, when it converts the array back to the output String.

public class WordRev {
 public static void main(String[] args) {
 String[] phrases = { "Hello World!!", " Bilbo Bagginses",
 "Pair\uD800\uDC00Pair" };
 for (String p : phrases) {
 System.out.print(p + " -> ");
 System.out.println(wordReverse(p));
 }
 }
 private static final boolean isWordChar(final char ch) {
 return Character.isLetterOrDigit(ch) || Character.isSurrogate(ch);
 }
 private static final String wordReverse(final String p) {
 char[] chars = p.toCharArray();
 int wordstart = 0;
 while (wordstart < chars.length) {
 while (wordstart < chars.length && !isWordChar(chars[wordstart])) {
 wordstart++;
 }
 int wordend = wordstart + 1;
 boolean hasSurrogate = false;
 while (wordend < chars.length && isWordChar(chars[wordend])) {
 if (Character.isSurrogate(chars[wordend])) {
 hasSurrogate = true;
 }
 wordend++;
 }
 if (wordend - 1 > wordstart) {
 // reverse between.
 int mid = wordstart + (wordend - wordstart - 1) / 2;
 int midoff = (wordend - wordstart - 1) & 1;
 for (int i = mid - wordstart; i >= 0; i--) {
 char tmp = chars[mid - i];
 chars[mid - i] = chars[mid + midoff + i];
 chars[mid + midoff + i] = tmp;
 }
 if (hasSurrogate) {
 for (int i = wordstart + 1; i < wordend; i++) {
 if (Character.isLowSurrogate(chars[i - 1])
 && Character.isHighSurrogate(chars[i])) {
 char tmp = chars[i];
 chars[i] = chars[i - 1];
 chars[i - 1] = tmp;
 }
 }
 }
 }
 wordstart = wordend + 1;
 }
 return new String(chars);
 }
}

Question 6

wow you really went over the top with this and i appreciate it! i thought about surrogate pairs but decided to ignore these cases due to my lack of understanding in concepts.

Question 7

Ideas I had in mind were simply using char[] instead of strings being that strings are immutable and each time I concatenate the Strings, I am creating a new String object which is not efficient at all.

Yes, that will be faster. Go ahead and write it :-)
Instead of Strings like strResult and strBuffer you could use StringBuilders (although using char is probably faster). StringBuilder is mutable and would be faster than using Strings. StringBuilder also supports insert(offset, char) but also has a reverse method.
I'd rename strResult to result (to avoid Hungarian notation) and strBuffer to the more descriptive currentWord.

if (str.charAt(i) != ' ') {
 ...
}
else if (str.charAt(i) == ' ') {
 ...
}

The following is the same, you don't need the second, inverted condition:

if (str.charAt(i) != ' ') {
 ...
} else {
 ...
}

You might want to use a more another check for spaces. Currently it handles tabs (\t) as normal characters and
```
"hi\tthere\tcow"
```
will be
```
"woc\tereht\tih"
```

Here is a slightly improved version with StringBuilders:

public static String reverseWord(String str) {
 int len = str.length();
 StringBuilder result = new StringBuilder();
 StringBuilder currentWord = new StringBuilder();
 for (int i = 0; i < len; i++) {
 if (str.charAt(i) != ' ') {
 currentWord.insert(0, str.charAt(i));
 } else {
 result.append(currentWord).append(" ");
 currentWord.setLength(0);
 }
 }
 result.append(currentWord);
 return result.toString();
}

Question 8

Your instincts are correct: the current solution is terribly inefficient. You'll do approximately one string concatenation for each character in the input string. Each such concatenation would involve allocating a new string and copying the previous characters. The running time is O(m n), where m is the average word length and n is the total number of characters.

That said, your algorithm has the advantage of being easy to understand. For a short string such as "hi there cow", efficiency doesn't really matter. Sometimes a simple but inefficient algorithm is preferable.

strBuffer is an unfortunately confusing name for a String, as it suggests that it would be a StringBuffer.

Converting your algorithm to use a StringBuilder or StringBuffer would be a slight improvement, but it would still be inefficient, since stringBuilder.insert(0, ...) would result in a lot of characters having to be shifted over. It would still be O(m n).

For efficiency, swapping characters within a char[] array is definitely the way to go. To implement that, I suggest defining a helper function

/**
 * Reverses the characters of buf, resulting in the characters between
 * start (inclusive) and end (exclusive) get swapped.
 */
private static void reverseChar(char[] buf, int start, int end) {
 ...
}

Question 9

Quick note for OP: StringBuffer is for multithread accessing and StringBuilder is single thread operation. Synchronization have a good impact on performance, so StringBuilder is normally preferred in single thread environment.

rolfl rolfl 98.1k17 gold badges219 silver badges419 bronze badges · Accepted Answer · 2014-03-14 11:56:59Z

I literally fell asleep with an answer mostly built.... and now that 200_success and palacsint have covered pretty much my complete answer.

I have to reiterate though that the use of strBuffer = str.charAt(i) + strBuffer; will be your worst performance culprit.

Also, (as I learned recently), your code will not handle surrogate Unicode characters well (it will reverse and thus break them). So, I had put together some code which I believe will be much faster than yours, it will handle space and punctuation in a very different way to yours, and it will correctly not reverse surrogate pairs in the text.

The reason this code will be faster is because it converts the input String to a single char array, and it does not create any other objects until the end, when it converts the array back to the output String.

public class WordRev {
 public static void main(String[] args) {
 String[] phrases = { "Hello World!!", " Bilbo Bagginses",
 "Pair\uD800\uDC00Pair" };
 for (String p : phrases) {
 System.out.print(p + " -> ");
 System.out.println(wordReverse(p));
 }
 }
 private static final boolean isWordChar(final char ch) {
 return Character.isLetterOrDigit(ch) || Character.isSurrogate(ch);
 }
 private static final String wordReverse(final String p) {
 char[] chars = p.toCharArray();
 int wordstart = 0;
 while (wordstart < chars.length) {
 while (wordstart < chars.length && !isWordChar(chars[wordstart])) {
 wordstart++;
 }
 int wordend = wordstart + 1;
 boolean hasSurrogate = false;
 while (wordend < chars.length && isWordChar(chars[wordend])) {
 if (Character.isSurrogate(chars[wordend])) {
 hasSurrogate = true;
 }
 wordend++;
 }
 if (wordend - 1 > wordstart) {
 // reverse between.
 int mid = wordstart + (wordend - wordstart - 1) / 2;
 int midoff = (wordend - wordstart - 1) & 1;
 for (int i = mid - wordstart; i >= 0; i--) {
 char tmp = chars[mid - i];
 chars[mid - i] = chars[mid + midoff + i];
 chars[mid + midoff + i] = tmp;
 }
 if (hasSurrogate) {
 for (int i = wordstart + 1; i < wordend; i++) {
 if (Character.isLowSurrogate(chars[i - 1])
 && Character.isHighSurrogate(chars[i])) {
 char tmp = chars[i];
 chars[i] = chars[i - 1];
 chars[i - 1] = tmp;
 }
 }
 }
 }
 wordstart = wordend + 1;
 }
 return new String(chars);
 }
}

wow you really went over the top with this and i appreciate it! i thought about surrogate pairs but decided to ignore these cases due to my lack of understanding in concepts.

Stack Exchange Network

Reverse word by word efficiency

3 Answers 3

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Hot Network Questions

Reverse word by word efficiency

3 Answers 3

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Related

Hot Network Questions