Implement a method to perform basic string compression using the counts of repeated characters. For example, the string
aabcccccaaa
would becomea2blc5a3
. If the "compressed" string would not become smaller than the original string, your method should return the original string.
Can this be implemented in a better way performance-wise?
package string;
public class CompressChar {
int[] seq = new int [256];
public String compressString(String str){
StringBuffer strComp = new StringBuffer();
for( char c : str.toCharArray()){
seq[c]++;
}
for (char c : str.toCharArray()){
if(seq[c]>0){
strComp.append(c).append(seq[c]);
seq[c]=0;//so that it does not enter , when char occurs again
}
}
if(str.length()<strComp.length()){
return str;
}
return strComp.toString();
}
public static void main(String[] args) {
CompressChar ch = new CompressChar();
System.out.println(ch.compressString("abbcdrfac"));
}
}
2 Answers 2
Your solution here is wrong, it produces incorrect results when it compresses chars that are in two different places in the string:
For the input: aaaaabbbbbcdrfaaaaaaccccc
your program produces:
a11b5c6d1r1f1
when it should produce:
a5b5c1d1r1f1a6c5
Because the a
and c
repeat in two different places, you include them all in the first count.
Since this is homework, writing out the full solution would be counter-productive, but, if you have:
StringBuilder sb = new StringBuilder();
and two variables to contain the previous char and the previous char count, you could just go through each character in the string, if it's the same as the previous char, you increment the count. If it's different, you add the previous char and it's count to the StringBuilder, and reset the previous and count to 1.
Then, when your loop is done, if the StringBuilder is smaller than the input, return the StringBuilder version.
This type of compression is called 'run length encoding' by the way.
-
\$\begingroup\$ I received this as an interview question, and internet searches show that it may be in a book about code interviews as well. posting the full solution would be helpful on code review. Should I post a separate question? \$\endgroup\$Hoppe– Hoppe2015年12月24日 14:41:16 +00:00Commented Dec 24, 2015 at 14:41
Use StringBuilder
instead of StringBuffer
because it's faster whereas synchronization
is not required.
There is no reason for the array seq
which is a temporary array to be a field
. Declare it local to the compressString
method instead.
When you call toCharArray()
it allocates a new char[]
initialized to the content of the string, and happens every time you call it. You can store into an array
variable to avoid that.
char[] charArray = str.toCharArray()
for(char c: charArray){
...
}
// reuse the same array again
Explore related questions
See similar questions with these tags.
temp
variables in theelse
block. Additionally, get rid of the wastedchar[]
initialization by assigningstr.toCharArray()
straight tostrArr
. \$\endgroup\$