I have a method that reverse each word in a string. I want to know how I can increase it's performance and memory efficiency. Ideas I had in mind were simply using char[]
instead of strings being that strings are immutable and each time I concatenate the Strings, I am creating a new String object which is not efficient at all.
Example:
hi there cow --> ih ereht woc
Code:
public static String reverseWord(String str) {
int len = str.length();
String strResult = "";
String strBuffer = "";
for (int i = 0; i < len; i++) {
if (str.charAt(i) != ' ') {
strBuffer = str.charAt(i) + strBuffer;
}
else if (str.charAt(i) == ' ') {
strResult += strBuffer + " ";
strBuffer = "";
}
}
strResult += strBuffer;
return strResult;
}
-
\$\begingroup\$ possible duplicate of Reversing words in a string \$\endgroup\$James Khoury– James Khoury2014年03月14日 06:05:24 +00:00Commented Mar 14, 2014 at 6:05
-
\$\begingroup\$ @JamesKhoury This question seeks to reverse the characters within each word. The other question seeks to reverse the words within a sentence. \$\endgroup\$200_success– 200_success2014年03月14日 06:12:09 +00:00Commented Mar 14, 2014 at 6:12
-
\$\begingroup\$ @200_success You're right. I had missed that distinction. \$\endgroup\$James Khoury– James Khoury2014年03月14日 06:14:54 +00:00Commented Mar 14, 2014 at 6:14
3 Answers 3
I literally fell asleep with an answer mostly built.... and now that 200_success and palacsint have covered pretty much my complete answer.
I have to reiterate though that the use of strBuffer = str.charAt(i) + strBuffer;
will be your worst performance culprit.
Also, (as I learned recently), your code will not handle surrogate Unicode characters well (it will reverse and thus break them). So, I had put together some code which I believe will be much faster than yours, it will handle space and punctuation in a very different way to yours, and it will correctly not reverse surrogate pairs in the text.
The reason this code will be faster is because it converts the input String to a single char array, and it does not create any other objects until the end, when it converts the array back to the output String.
public class WordRev {
public static void main(String[] args) {
String[] phrases = { "Hello World!!", " Bilbo Bagginses",
"Pair\uD800\uDC00Pair" };
for (String p : phrases) {
System.out.print(p + " -> ");
System.out.println(wordReverse(p));
}
}
private static final boolean isWordChar(final char ch) {
return Character.isLetterOrDigit(ch) || Character.isSurrogate(ch);
}
private static final String wordReverse(final String p) {
char[] chars = p.toCharArray();
int wordstart = 0;
while (wordstart < chars.length) {
while (wordstart < chars.length && !isWordChar(chars[wordstart])) {
wordstart++;
}
int wordend = wordstart + 1;
boolean hasSurrogate = false;
while (wordend < chars.length && isWordChar(chars[wordend])) {
if (Character.isSurrogate(chars[wordend])) {
hasSurrogate = true;
}
wordend++;
}
if (wordend - 1 > wordstart) {
// reverse between.
int mid = wordstart + (wordend - wordstart - 1) / 2;
int midoff = (wordend - wordstart - 1) & 1;
for (int i = mid - wordstart; i >= 0; i--) {
char tmp = chars[mid - i];
chars[mid - i] = chars[mid + midoff + i];
chars[mid + midoff + i] = tmp;
}
if (hasSurrogate) {
for (int i = wordstart + 1; i < wordend; i++) {
if (Character.isLowSurrogate(chars[i - 1])
&& Character.isHighSurrogate(chars[i])) {
char tmp = chars[i];
chars[i] = chars[i - 1];
chars[i - 1] = tmp;
}
}
}
}
wordstart = wordend + 1;
}
return new String(chars);
}
}
-
\$\begingroup\$ wow you really went over the top with this and i appreciate it! i thought about surrogate pairs but decided to ignore these cases due to my lack of understanding in concepts. \$\endgroup\$Liondancer– Liondancer2014年03月14日 15:56:39 +00:00Commented Mar 14, 2014 at 15:56
Ideas I had in mind were simply using char[] instead of strings being that strings are immutable and each time I concatenate the Strings, I am creating a new String object which is not efficient at all.
Yes, that will be faster. Go ahead and write it :-)
Instead of
String
s likestrResult
andstrBuffer
you could useStringBuilder
s (although using char is probably faster).StringBuilder
is mutable and would be faster than usingString
s.StringBuilder
also supportsinsert(offset, char)
but also has areverse
method.I'd rename
strResult
toresult
(to avoid Hungarian notation) andstrBuffer
to the more descriptivecurrentWord
.if (str.charAt(i) != ' ') { ... } else if (str.charAt(i) == ' ') { ... }
The following is the same, you don't need the second, inverted condition:
if (str.charAt(i) != ' ') { ... } else { ... }
You might want to use a more another check for spaces. Currently it handles tabs (
\t
) as normal characters and"hi\tthere\tcow"
will be
"woc\tereht\tih"
Here is a slightly improved version with StringBuilder
s:
public static String reverseWord(String str) {
int len = str.length();
StringBuilder result = new StringBuilder();
StringBuilder currentWord = new StringBuilder();
for (int i = 0; i < len; i++) {
if (str.charAt(i) != ' ') {
currentWord.insert(0, str.charAt(i));
} else {
result.append(currentWord).append(" ");
currentWord.setLength(0);
}
}
result.append(currentWord);
return result.toString();
}
Your instincts are correct: the current solution is terribly inefficient. You'll do approximately one string concatenation for each character in the input string. Each such concatenation would involve allocating a new string and copying the previous characters. The running time is O(m n), where m is the average word length and n is the total number of characters.
That said, your algorithm has the advantage of being easy to understand. For a short string such as "hi there cow", efficiency doesn't really matter. Sometimes a simple but inefficient algorithm is preferable.
strBuffer
is an unfortunately confusing name for a String
, as it suggests that it would be a StringBuffer
.
Converting your algorithm to use a StringBuilder
or StringBuffer
would be a slight improvement, but it would still be inefficient, since stringBuilder.insert(0, ...)
would result in a lot of characters having to be shifted over. It would still be O(m n).
For efficiency, swapping characters within a char[]
array is definitely the way to go. To implement that, I suggest defining a helper function
/**
* Reverses the characters of buf, resulting in the characters between
* start (inclusive) and end (exclusive) get swapped.
*/
private static void reverseChar(char[] buf, int start, int end) {
...
}
-
\$\begingroup\$ Quick note for OP:
StringBuffer
is for multithread accessing andStringBuilder
is single thread operation. Synchronization have a good impact on performance, soStringBuilder
is normally preferred in single thread environment. \$\endgroup\$Marc-Andre– Marc-Andre2014年03月14日 14:37:54 +00:00Commented Mar 14, 2014 at 14:37