I wrote this function as part of interview practice. This method removes a character from a given string.
I was wondering how I could make this code more efficient when it comes to runtime/space. I think my code is O(n) and I'm not sure if I can increase the efficiency. However, perhaps using things such as StringBuffer
or StringBuilder
would increase it a bit? I'm not sure as I'm still a bit new to Java.
public static String takeOut(String str, char c) {
int len = str.length();
String newstr = "";
for (int i = 0; i < len; i++) {
if (str.charAt(i) != c) {
newstr = newstr + str.charAt(i);
}
}
return newstr;
}
3 Answers 3
Learn from existing implementations, they usually have solutions to corner cases, common pitfalls and performance bottlenecks. For example, Apache Commons Lang StringUtils
also has a remove
function. It's implementation is quite simple and similar to yours:
public static String remove(final String str, final char remove) {
if (isEmpty(str) || str.indexOf(remove) == INDEX_NOT_FOUND) {
return str;
}
final char[] chars = str.toCharArray();
int pos = 0;
for (int i = 0; i < chars.length; i++) {
if (chars[i] != remove) {
chars[pos++] = chars[i];
}
}
return new String(chars, 0, pos);
}
Anyway, there are some differences:
It uses
char
array.String
s are immutable, the following concatenation creates a newString
object:newstr = newstr + str.charAt(i);
I guess it's slower than changing a value in an array.
Using the string concatenation operator repeatedly to concatenate n strings requires time quadratic in n.
Source: Effective Java, 2nd edition, Item 51: Beware the performance of string concatenation
At the first line it checks whether the
String
contains the removed character at all. If it doesn't contain itremove
saves the creation the char array. It might help but it depends on your input.The code in the question calls
String.charAt()
in the comparison which checks the given index:public char charAt(int index) { if ((index < 0) || (index >= value.length)) { throw new StringIndexOutOfBoundsException(index); } return value[index]; }
The
StringUtils
implementation skips this test since it knows that the index is valid.
(See also: Effective Java, 2nd edition, Item 47: Know and use the libraries)
-
3\$\begingroup\$ 1 additional micro optimization would be to cache the result of the
indexOf
call and use it as the starting value in he for loop \$\endgroup\$ratchet freak– ratchet freak2014年02月28日 11:43:01 +00:00Commented Feb 28, 2014 at 11:43 -
\$\begingroup\$ @ratchetfreak I don't get why indexOf is there at all? If the char is not found, looping through the char array should find that out just as fast as indexOf would right? indexOf can't be any faster than looping through the char array right? \$\endgroup\$Cruncher– Cruncher2014年02月28日 14:25:54 +00:00Commented Feb 28, 2014 at 14:25
-
\$\begingroup\$ @Cruncher:
String.toCharArray()
copies the wholeString
to a new array. I guess it could be slow (but it should be measured). \$\endgroup\$palacsint– palacsint2014年02月28日 14:31:33 +00:00Commented Feb 28, 2014 at 14:31
First indeed your code has a complexity of O(n) (n the number of characters of the String), it would precisely depend on the implementation of the method String.charAt(index), but we could most likely consider it has direct access.
O(n) is the lowest one can do. Indeed, think as the following: how could you remove all instances of a character of the String if you do not look at all the characters of this String. So this is hardly possible to do better.
There is a question on SO about performance of String concatenation in Java. For your question I would say then the StringBuilder might be a better option. I would also consider creating an array of char from scratch, but I don't know how the String constructor from an array behaves in terms of performance.
In the end, if the goal is just to remove a character from a String (out of an interview question), usually call String.replaceAll(myChar, "");
-
1\$\begingroup\$ Isn't this code
O(n^2)
? The concatenation operation is alreadyO(newstring.length())
and this can happenn
times for a total cost ofO(n^2)
. (Assuming a naive compiler that doesn't silently replaceString
withStringBuilder
) \$\endgroup\$CodesInChaos– CodesInChaos2014年02月28日 16:09:20 +00:00Commented Feb 28, 2014 at 16:09 -
\$\begingroup\$ @CodesInChaos the code is O(n^2) you are correct because of the way I am concatenating the String objects \$\endgroup\$Liondancer– Liondancer2014年02月28日 16:16:51 +00:00Commented Feb 28, 2014 at 16:16
-
\$\begingroup\$ OK so clearly using an array of characters would be faster (I suppose with direct access). \$\endgroup\$Vince– Vince2014年02月28日 16:18:40 +00:00Commented Feb 28, 2014 at 16:18
I would add the following:
unit test
Code some unit tests and it has to become an habit for you. If you publish some code then also add the unit tests. First, you can check yourself the correctness of your code and second it will help us to refactor your code (producing a better solution).
good names
Please use good names. I know, even the names used by Apache are not good. No 'c', no 'str'. A name that is self descriptive. No comments needed. I know that not everybody will agree because the code seems a little bit more 'heavy', but what a pleasure to read!
-
\$\begingroup\$ What could you use instead of
str
andc
in this library function? \$\endgroup\$ChrisW– ChrisW2014年02月28日 10:40:31 +00:00Commented Feb 28, 2014 at 10:40 -
\$\begingroup\$ I it my own responsibility but I prefer a lot more to read stringToBeProcessed and characterToBeRemoved... But I know that some people will disagree... \$\endgroup\$Rudy Vissers– Rudy Vissers2014年02月28日 10:46:08 +00:00Commented Feb 28, 2014 at 10:46
-
\$\begingroup\$ Btw because you will write lots of small methods (good method names) that do almost nothing and avoid code duplication, the self descriptive names will not be a big problem... \$\endgroup\$Rudy Vissers– Rudy Vissers2014年02月28日 10:51:26 +00:00Commented Feb 28, 2014 at 10:51
Explore related questions
See similar questions with these tags.