I wrote an answer for the following question and wondering if there is any better approach to it.
Using the Java language, have the function
RunLength(str)
take thestr
parameter being passed and return a compressed version of the string using the Run-length encoding algorithm. This algorithm works by taking the occurrence of each repeating character and outputting that number along with a single character of the repeating sequence.For example: "wwwggopp" would return 3w2g1o2p. The string will not contain any numbers, punctuation, or symbols.
Code
public class App {
void runLength(String str) {
HashMap<Character, Integer> hash = new HashMap<Character, Integer>();
Character c;
for (int i = 0; i < str.length(); i++) {
c = str.toLowerCase().charAt(i);
if (hash.containsKey(c)) {
int value = hash.get(c);
value++;
hash.put(c, value);
} else {
hash.put(c, 1);
}
}
int value = 0;
StringBuffer buf = new StringBuffer();
String temp;
for (int j = 0; j < str.length(); j++) {
c = str.toLowerCase().charAt(j);
value = hash.get(c);
temp = str.substring(j, j + 1);
if (buf.indexOf(c.toString()) == -1) {
buf.append(value);
buf.append(temp);
}
}
System.err.println(buf.toString());
}
public static void main(String[] args) {
new App().runLength("wwwggopp");
}
}
Output
3w2g1o2p
2 Answers 2
Avoid printing in methods, unless the purpose of the method is speicifically to print. The main purpose of the runLength
method is certainly not to print. It should return the compressed value instead.
The description doesn't say that you should treat upper and lowercase characters equal. It's not correct to do that. And if you really wanted to do that, it would have been simpler to lowercase the entire input string once. This is a very wasteful operation:
for (int i = 0; i < str.length(); i++) { c = str.toLowerCase().charAt(i);
This would have been much better:
String lowered = str.toLowerCase();
for (int i = 0; i < lowered.length(); i++) {
c = lowered.charAt(i);
Declare variables with the interface type, when possible. Instead of:
HashMap<Character, Integer> hash = new HashMap<Character, Integer>();
This would have been better:
Map<Character, Integer> hash = new HashMap<Character, Integer>();
Btw are you still on Java6? You should definitely migrate to at least Java7, where the above declaration becomes simply:
Map<Character, Integer> hash = new HashMap<>();
Note that in most use cases, StringBuilder
is preferred over StringBuffer
.
StringBuffer
is synchronized, StringBuilder
is not, which makes it faster.
In this program you don't need synchronization when building the compressed string.
Suggested implementation
This is simpler, without a hashmap:
public String compress(String str) {
if (str.isEmpty()) {
return "";
}
char[] chars = str.toCharArray();
StringBuilder builder = new StringBuilder();
int count = 1;
char prev = chars[0];
for (int i = 1; i < chars.length; i++) {
char current = chars[i];
if (current == prev) {
count++;
} else {
builder.append(count).append(prev);
count = 1;
}
prev = current;
}
return builder.append(count).append(prev).toString();
}
Unit testing
It's always good to have unit tests to verify correctness:
@Test
public void test_aabcccccaaa() {
assertEquals("2a1b5c3a", compress("aabcccccaaa"));
}
@Test
public void test_a5() {
assertEquals("5a", compress("aaaaa"));
}
@Test
public void test_empty() {
assertEquals("", compress(""));
}
@Test
public void test_a() {
assertEquals("1a", compress("a"));
}
@Test
public void test_a3b4() {
assertEquals("3a4b", compress("aaabbbb"));
}
@Test
public void test_abc() {
assertEquals("1a1b1c", compress("abc"));
}
@Test
public void test_wwwggopp() {
assertEquals("3w2g1o2p", compress("wwwggopp"));
}
-
\$\begingroup\$ +1 for your answer . as testCase
test_aabcccccaaa
fails for the original code , But this scenario is covered by you ... As you suggested Map should not have been used here. I would say that beacause of Map this bug has happened. \$\endgroup\$Shirish Bari– Shirish Bari2014年10月12日 10:12:27 +00:00Commented Oct 12, 2014 at 10:12
Your program is an answer to a different question. It counts characters, as opposed to characters in runs.
Here is an example of the difference:
Input: "difference".
Output requested by the question: "1d 1i 2f 1e 1r 1e 1n 1c 1e"
Your output: "1d 1i 2f 3e 1r 1n 1c"
(I put the spaces in to make it more readable)
wwwgggooopppwww
? It appears that your code will not be able to handle this case. \$\endgroup\$