The following code solves this problem:
The 3072 characters below contain a sequence of 5 characters which is repeated. However, there is a twist: one of the sequences has a typo. To be specific, you're looking for two groups of 5 characters which share 4 of those characters. Enter the 4 shared characters below to proceed.
Any suggestions on how it can be improved using idiomatic Lisp/Clojure?
(ns slurp.slurp)
(defn get-five [in]
(if (> (.length in) 4)
(subs in 0 5)
nil))
(defn how-good [a b position goodness bad-position]
(if (= position 5)
(list goodness bad-position)
(if (= (subs a position (+ 1 position)) (subs b position (+ 1 position)))
(recur a b (+ 1 position) (+ 1 goodness) bad-position)
(recur a b (+ 1 position) goodness position))))
(defn matches [a b]
(let [x (how-good a b 0 0 -1)]
(list (> (first x) 3) (last x))))
(defn remove-bad-char [in-string bad-index]
(str (subs in-string 0 bad-index) (subs in-string (+ 1 bad-index))))
(defn sub-parse [left in]
(let [right (get-five in)]
(if (not (= right nil))
(let [match-result (matches left right)]
(if (first match-result)
(let [bad-index (last match-result)]
(println left "is a match. bad-position:" (last match-result) "ans: "(remove-bad-char right bad-index)))
(recur left (subs in 1)))))))
(defn parse [in]
(let [left (get-five in)] ;; as soon as we hit the ass end of the input string, get-five returns null and we stop trying to match.
(if (not (= left nil))
(do
(sub-parse left (subs in 1))
(recur (subs in 1))))))
(parse (slurp "/Users/clojure/challange.txt"))
;; https://www.readyforzero.com/challenge/491f97bc25574346aed237d43e7d0404
;; (answer is i2+r)
-
\$\begingroup\$ Actual URL of input data appears to be: readyforzero.com/challenge-data/… \$\endgroup\$overthink– overthink2011年07月19日 01:51:12 +00:00Commented Jul 19, 2011 at 1:51
-
\$\begingroup\$ this isn't common-lisp. tag is incorrect. \$\endgroup\$Paul Nathan– Paul Nathan2011年07月19日 20:39:07 +00:00Commented Jul 19, 2011 at 20:39
-
\$\begingroup\$ Yeah, I've done the ready for zero programming challenge too... :) \$\endgroup\$Julian Birch– Julian Birch2011年08月08日 08:36:41 +00:00Commented Aug 8, 2011 at 8:36
2 Answers 2
I'd start bottom-up on this one if you want to do it in a slightly more idiomatic / functional style. Also it's good to get the data in Clojure sequences rather than strings as it makes it a bit easier to manipulate.
You're trying to work on groups of five adjacent characters so I'd set up a sequence of such adjacent groups:
; define some source data, right answer is "1235"
(def char-seq (seq "dfyfgvbufi12345idvdfhopshbsopgeobvufpdhbugphp123a5ghauoyhewqojbvrbon"))
; now partition into adjacent overlapping groups of five (with a step of 1)
(def char-groups (partition 5 1 char-seq))
; prove it works
(take 3 char-groups)
=> ((\d \f \y \f \g) (\f \y \f \g \v) (\y \f \g \v \b))
Now we want to define what a valid match means: in this case 4 characters should be the same in two groups being compared.
; helper function to count the number of true values in a collection / sequence
(defn count-true [coll]
(reduce
#(+ %1 (if %2 1 0))
0
coll))
; function to determine if exactly four characters are equal in two groups
(defn matches? [group1 group2]
(= 4
(count-true (map = group1 group2))))
Finally you just need to run the matches? function over all possible groups:
(def results
(distinct
(keep identity
(for [group1 char-groups
group2 char-groups]
(if (matches? group1 group2)
(filter identity (map #(if (= %1 %2) %1 nil) group1 group2))
nil)))))
results
=> ((1円 2円 3円 5円))
-
\$\begingroup\$ thanks mikera. I got to learn some new stuff: partition reduce distinct keep indentity filter map for. btw, why did you need (keep identity ....) ? \$\endgroup\$Zordak– Zordak2011年07月16日 20:58:37 +00:00Commented Jul 16, 2011 at 20:58
-
\$\begingroup\$ I see that the keep identity is used to filter the nil's returned by: (for [group1 char-groups group2 char-groups] (println group1 group2)) But I can't understand where the nils come from when char-groups Does not return any nils. Why is the for putting them in? \$\endgroup\$Zordak– Zordak2011年07月16日 21:15:28 +00:00Commented Jul 16, 2011 at 21:15
-
\$\begingroup\$ the (for ....) construct returns nils when there is no match via the if statement. This is an artifact of the way the for construct works (it creates a lazy sequence over all possible combinations of char groups). Incidentally this is also why you need the "distinct" because it will match the char-groups twice (once in normal order, once in reverse order) \$\endgroup\$mikera– mikera2011年07月18日 17:44:16 +00:00Commented Jul 18, 2011 at 17:44
Here is my version. Works on any seq (not just strings) and is fully lazy.
(def data (slurp "data.txt"))
(defn same-pos
"Return all the elements of xs and ys that match, position-wise. e.g.
(same-pos [\\a \\x \\c] [\\a \\r \\c]) returns [\\a \\c]. Returns nil if xs
and ys have different lengths."
[xs ys]
(when (= (count xs) (count ys))
(filter (complement nil?) (map #(if (= %1 %2) %1) xs ys))))
(def answer
(let [ps (partition 5 1 data)] ; build seq of sliding 5-tuples
(->> (for [x ps y ps] (same-pos x y)) ; n^2 FTW
(filter (comp (partial = 4) count)) ; 4 of 5 positions must match
(first))))
(println (apply str "Answer: " answer)) ; will print "Answer: i2+r"
Update: I didn't read other answers beforehand (wanted to figure it out myself), but it looks like I've got basically the same thing as mikera. So maybe not a whole lot to learn here :)