Skip to main content
Code Review

Return to Question

replaced http://codereview.stackexchange.com/ with https://codereview.stackexchange.com/
Source Link

There have been already many similar questions to this one (e.g. this this and this this, just to point two), but I did not find any for clojure.

There have been already many similar questions to this one (e.g. this and this, just to point two), but I did not find any for clojure.

There have been already many similar questions to this one (e.g. this and this, just to point two), but I did not find any for clojure.

Source Link
Attilio
  • 1.7k
  • 1
  • 13
  • 19

RLE-like string compression in clojure

There have been already many similar questions to this one (e.g. this and this, just to point two), but I did not find any for clojure.

The goal is to compress a string in a RLE-like way: i.e. each character "run" is replaced by the character and the count of the run. Some examples:

aaaabbccc --> a4b2c3
b --> b1
aaaaaaaaaaaa --> a9a3

Some remarks:

  • For now, I do not care about the possible optimisation that runs with the length of 1 can omit the count. (E.g. the output could be b instead of b1.)
  • If there are more than n > 9 characters in a run, then it is split up into two or more runs, like 9 + 9 + 9 + ... + k = n, where k <= 9.
  • For now, I do not care if the "compressed" string will be longer than the original one (this is the case e.g. for an input containing only single characters, like: abcde --> a1b1c1d1e1).

That said, please see my implementation below:

Source

(defn countrepetitions [s]
 (if (empty? s) 0
 (loop [acts (vec s) ch (first acts) cnt 1]
 (if (= ch (first(rest acts)))
 (recur (rest acts) ch (inc cnt))
 cnt))))
(defn mycompress [s]
 (loop [acts s encoded "" cnt1 (countrepetitions acts)]
 (if (= 0 cnt1)
 encoded
 (let [cnt2 (if (< cnt1 10) cnt1 9)]
 (recur (subs acts cnt2) (str encoded (first acts) cnt2) (countrepetitions (subs acts cnt2)))))))

Tests

(deftest a-test
 (testing "compressor"
 (is (= "" (mycompress "")))
 (is (= "a2" (mycompress "aa")))
 (is (= "a1b8a1c7a3" (mycompress "abbbbbbbbacccccccaaa")))
 (is (= "b9b3" (mycompress "bbbbbbbbbbbb")))))

Questions

In particular, I would be interested in the following, but any other remark/suggestion is also welcome:

  • Is there a more efficient way (either in terms of readability or speed) to implement countrepetitions, or maybe leave it out completely?
  • Are there any other tests worth adding? Or can you find a "counter-example" (i.e. a test case, for which the implementation does not work correctly)?
lang-clj

AltStyle によって変換されたページ (->オリジナル) /