2
\$\begingroup\$

There have been already many similar questions to this one (e.g. this and this, just to point two), but I did not find any for clojure.

The goal is to compress a string in a RLE-like way: i.e. each character "run" is replaced by the character and the count of the run. Some examples:

aaaabbccc --> a4b2c3
b --> b1
aaaaaaaaaaaa --> a9a3

Some remarks:

  • For now, I do not care about the possible optimisation that runs with the length of 1 can omit the count. (E.g. the output could be b instead of b1.)
  • If there are more than n > 9 characters in a run, then it is split up into two or more runs, like 9 + 9 + 9 + ... + k = n, where k <= 9.
  • For now, I do not care if the "compressed" string will be longer than the original one (this is the case e.g. for an input containing only single characters, like: abcde --> a1b1c1d1e1).

That said, please see my implementation below:

Source

(defn countrepetitions [s]
 (if (empty? s) 0
 (loop [acts (vec s) ch (first acts) cnt 1]
 (if (= ch (first(rest acts)))
 (recur (rest acts) ch (inc cnt))
 cnt))))
(defn mycompress [s]
 (loop [acts s encoded "" cnt1 (countrepetitions acts)]
 (if (= 0 cnt1)
 encoded
 (let [cnt2 (if (< cnt1 10) cnt1 9)]
 (recur (subs acts cnt2) (str encoded (first acts) cnt2) (countrepetitions (subs acts cnt2)))))))

Tests

(deftest a-test
 (testing "compressor"
 (is (= "" (mycompress "")))
 (is (= "a2" (mycompress "aa")))
 (is (= "a1b8a1c7a3" (mycompress "abbbbbbbbacccccccaaa")))
 (is (= "b9b3" (mycompress "bbbbbbbbbbbb")))))

Questions

In particular, I would be interested in the following, but any other remark/suggestion is also welcome:

  • Is there a more efficient way (either in terms of readability or speed) to implement countrepetitions, or maybe leave it out completely?
  • Are there any other tests worth adding? Or can you find a "counter-example" (i.e. a test case, for which the implementation does not work correctly)?
asked Jan 17, 2016 at 18:50
\$\endgroup\$
2
  • 3
    \$\begingroup\$ You could use partition-by to count consecutive occurrences, instead of the implicit recursion, for example: (map (juxt first count) (partition-by identity "abbbbbbbbacccccccaaa")) \$\endgroup\$ Commented Jan 17, 2016 at 19:38
  • 1
    \$\begingroup\$ @Shlomi thank you, after some experimentation I came up with the following based on your suggestion: (apply str (flatten (map (juxt first count) (apply concat (map #(partition-all 9 %) (partition-by identity s))))))). Definitely shorter then my original attempt :) \$\endgroup\$ Commented Jan 19, 2016 at 19:10

1 Answer 1

3
\$\begingroup\$

Whenever you're splitting a list up into a number of chunks, it's worth looking into the partition function (and related functions).

A string is just a list of characters, and you just want to split it when the character changes, so (partition-by identity characters) should be much simpler than what you have above. You can pair that with another partition to make sure you have no chunks of> 9 characters:

(defn rle-chunks [string]
 (->> string
 (partition-by identity)
 (mapcat #(partition-all 9 %))))

Actually performing the compression is then just a case of calling first & count on each of these chunks, then converting back to a string:

(defn rle [input]
 (->> input
 rle-chunks
 (map (juxt first count))
 (apply concat)
 (apply str)))
answered Jan 17, 2016 at 20:48
\$\endgroup\$
2
  • \$\begingroup\$ Thank you, I think this much more readable than the solution I came up with based on the suggestion of @Shlomi: it is definitely more readable. Also, I was not aware of mapcat and did not remember the threading macro ->>. Just one question: I think (apply concat) can be replaced with (flatten) in this case, to make it even shorter. What do you think? \$\endgroup\$ Commented Jan 19, 2016 at 19:12
  • \$\begingroup\$ @Attilio yeah, you're probably right. \$\endgroup\$ Commented Jan 19, 2016 at 19:59

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.