Description:
Implement a method to perform basic compression using counts of repeated characters. If the compressed string is not smaller then return the original string.
Code:
class Main {
public static String compress(String in) {
if (in == null) {
throw new IllegalArgumentException("Input string cannot be null");
}
if (in.length() <= 1) {
return in;
}
StringBuffer out = new StringBuffer();
int count = 1;
for (int i = 1; i < in.length(); i++) {
char current = in.charAt(i);
char previous = in.charAt(i - 1);
if (current == previous) {
count++;
}
else {
out.append(previous);
out.append(count);
count = 1;
}
}
out.append(in.charAt(in.length() - 1));
out.append(count);
return out.toString().length() < in.length() ? out.toString() : in;
}
public static void main(String[] args) {
try {
System.out.println("Should not happen: " + compress(null));
} catch (IllegalArgumentException e) {
System.out.println("Got expected exception for null");
}
System.out.println(compress("").equals(""));
System.out.println(compress("a").equals("a"));
System.out.println(compress("ab").equals("ab"));
System.out.println(compress("aa").equals("aa"));
System.out.println(compress("aabcccccaaa").equals("a2b1c5a3"));
System.out.println(compress("aabb").equals("aabb"));
}
}
Question:
The solution was just based on intuition. The last append was by trial and error which I didn't like (I may be more nervous during the real interview). I would like to know if there is any way to avoid such mistakes, possibly using loop invariants?
2 Answers 2
The main problem IMO is, that you cram it all into a single method which is not helpful to organize your thoughts.
When I tried this for comparison, I came up with the following in pseudo code:
// outer logic
compress(s)
rle = doRunLengthEncoding(s)
if rle.length < s.length return rle
else return s
// base algorithm
doRunLengthEncoding(s)
res = empty String buffer
char[] characters = s.toCharArray
int startIndex = 0
while startIndex < characters.length
int endIndex = findEndOfCurrentSequence(startIndex, characters)
res.append(currentChar).append(endIndex - startIndex)
startIndex = endIndex
return res
// return the first index i, where c[i] != c[start], return
// c.length if there is no such index
findEndOfCurrentSequence(start, c)
... // simple enough
This way, you divide the problem into manageble pieces, give you thoughts a place to stop every once in a while, and should get rid of this nervousness.
BTW: from an interviewer's point of view (at least I you want into my team in my country ;-)) a well thought pseudocode algorithm is worth more than working code. If you apply for a programmer's job, I just assume that you have the basic skills to use an IDE and kick against some piece of code until it compiles and produces the correct result. What I am looking for is organized thoughts and problem solving skills.
One last thing: there is one and only one correct exception to throw for a null parameter that must not be null: NullPointerException. Ideally by simply using Objects.requireNonNull.
-
1\$\begingroup\$ Why is only a
NullPointerException
correct? Because you say so? The description ofIllegalArgumentException
would also apply in such contexts. This issue has already been debated countess times, so maybe it is not as clear-cut as you make it out to be, but also a matter of preference. \$\endgroup\$Stingy– Stingy2018年03月29日 08:48:13 +00:00Commented Mar 29, 2018 at 8:48
If your loop works by comparing each character to the character that precedes it, the only way to avoid the duplication of the last append would be to run the loop once more, i.e. making the stop condition
i <= in.length()
instead ofi < in.length()
, and assigningcurrent
a default value in casei == in.length()
. The problem with this approach is that it will not work if the last character of the string matches this default character.So you would probably have to redesign the loop so that it compares each character to the character that follows it. If the following character matches the current character, then
count
is increased, and if not, or if there is no following character, then the current character will be appended, followed by its count. For this, you would have to initializei
to0
instead of1
. This would also take care of the special case that the input string is only one character long.Also, you are using a
StringBuffer
even though synchronization is not relevant in your code. You might consider using aStringBuilder
instead, which has the same functionality as aStringBuffer
but is not synchronized and therefore faster.Finally, in the
return
statement ofcompress(String)
, you first convertout
to aString
before determining its length. This is not necessary, becauseStringBuffer
/StringBuilder
also has alength()
method which you can call directly.
Explore related questions
See similar questions with these tags.