Naive string compression

Question 1

Description:

Implement a method to perform basic compression using counts of repeated characters. If the compressed string is not smaller then return the original string.

Code:

class Main {
 public static String compress(String in) {
 if (in == null) {
 throw new IllegalArgumentException("Input string cannot be null");
 }
 if (in.length() <= 1) {
 return in;
 }
 StringBuffer out = new StringBuffer();
 int count = 1;
 for (int i = 1; i < in.length(); i++) {
 char current = in.charAt(i);
 char previous = in.charAt(i - 1);
 if (current == previous) {
 count++;
 }
 else {
 out.append(previous);
 out.append(count);
 count = 1;
 }
 }
 out.append(in.charAt(in.length() - 1));
 out.append(count);
 return out.toString().length() < in.length() ? out.toString() : in;
 }
 public static void main(String[] args) {
 try {
 System.out.println("Should not happen: " + compress(null));
 } catch (IllegalArgumentException e) {
 System.out.println("Got expected exception for null");
 }
 System.out.println(compress("").equals(""));
 System.out.println(compress("a").equals("a"));
 System.out.println(compress("ab").equals("ab"));
 System.out.println(compress("aa").equals("aa"));
 System.out.println(compress("aabcccccaaa").equals("a2b1c5a3"));
 System.out.println(compress("aabb").equals("aabb"));
 }
}

Question:

The solution was just based on intuition. The last append was by trial and error which I didn't like (I may be more nervous during the real interview). I would like to know if there is any way to avoid such mistakes, possibly using loop invariants?

Question 2

The main problem IMO is, that you cram it all into a single method which is not helpful to organize your thoughts.

When I tried this for comparison, I came up with the following in pseudo code:

// outer logic
compress(s)
 rle = doRunLengthEncoding(s)
 if rle.length < s.length return rle
 else return s
// base algorithm
doRunLengthEncoding(s)
 res = empty String buffer
 char[] characters = s.toCharArray
 int startIndex = 0
 while startIndex < characters.length
 int endIndex = findEndOfCurrentSequence(startIndex, characters)
 res.append(currentChar).append(endIndex - startIndex)
 startIndex = endIndex
 return res
// return the first index i, where c[i] != c[start], return
// c.length if there is no such index
findEndOfCurrentSequence(start, c)
 ... // simple enough

This way, you divide the problem into manageble pieces, give you thoughts a place to stop every once in a while, and should get rid of this nervousness.

BTW: from an interviewer's point of view (at least I you want into my team in my country ;-)) a well thought pseudocode algorithm is worth more than working code. If you apply for a programmer's job, I just assume that you have the basic skills to use an IDE and kick against some piece of code until it compiles and produces the correct result. What I am looking for is organized thoughts and problem solving skills.

One last thing: there is one and only one correct exception to throw for a null parameter that must not be null: NullPointerException. Ideally by simply using Objects.requireNonNull.

Question 3

Why is only a NullPointerException correct? Because you say so? The description of IllegalArgumentException would also apply in such contexts. This issue has already been debated countess times, so maybe it is not as clear-cut as you make it out to be, but also a matter of preference.

Question 4

If your loop works by comparing each character to the character that precedes it, the only way to avoid the duplication of the last append would be to run the loop once more, i.e. making the stop condition i <= in.length() instead of i < in.length(), and assigning current a default value in case i == in.length(). The problem with this approach is that it will not work if the last character of the string matches this default character.

So you would probably have to redesign the loop so that it compares each character to the character that follows it. If the following character matches the current character, then count is increased, and if not, or if there is no following character, then the current character will be appended, followed by its count. For this, you would have to initialize i to 0 instead of 1. This would also take care of the special case that the input string is only one character long.
Also, you are using a StringBuffer even though synchronization is not relevant in your code. You might consider using a StringBuilder instead, which has the same functionality as a StringBuffer but is not synchronized and therefore faster.
Finally, in the return statement of compress(String), you first convert out to a String before determining its length. This is not necessary, because StringBuffer/StringBuilder also has a length() method which you can call directly.

mtj mtj 4,96211 silver badges21 bronze badges · Accepted Answer · 2018-03-28 06:12:08Z

The main problem IMO is, that you cram it all into a single method which is not helpful to organize your thoughts.

When I tried this for comparison, I came up with the following in pseudo code:

// outer logic
compress(s)
 rle = doRunLengthEncoding(s)
 if rle.length < s.length return rle
 else return s
// base algorithm
doRunLengthEncoding(s)
 res = empty String buffer
 char[] characters = s.toCharArray
 int startIndex = 0
 while startIndex < characters.length
 int endIndex = findEndOfCurrentSequence(startIndex, characters)
 res.append(currentChar).append(endIndex - startIndex)
 startIndex = endIndex
 return res
// return the first index i, where c[i] != c[start], return
// c.length if there is no such index
findEndOfCurrentSequence(start, c)
 ... // simple enough

This way, you divide the problem into manageble pieces, give you thoughts a place to stop every once in a while, and should get rid of this nervousness.

BTW: from an interviewer's point of view (at least I you want into my team in my country ;-)) a well thought pseudocode algorithm is worth more than working code. If you apply for a programmer's job, I just assume that you have the basic skills to use an IDE and kick against some piece of code until it compiles and produces the correct result. What I am looking for is organized thoughts and problem solving skills.

One last thing: there is one and only one correct exception to throw for a null parameter that must not be null: NullPointerException. Ideally by simply using Objects.requireNonNull.

Why is only a NullPointerException correct? Because you say so? The description of IllegalArgumentException would also apply in such contexts. This issue has already been debated countess times, so maybe it is not as clear-cut as you make it out to be, but also a matter of preference.

Stack Exchange Network

Naive string compression

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Naive string compression

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions