P.S.: Looking at the code again, I spotted two minor things: I can use when-not instead of (when (not ...)), and (clojure.string/join coll) instead of (apply str coll).
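To make the two substitutions from the P.S. concrete, here is a minimal sketch. The names `value` and `coll` are placeholders for illustration only and do not appear in the parser.

```clojure
(require '[clojure.string :as string])

;; `value` and `coll` are placeholder names for illustration only.
(def value "some text")
(def coll [\a \b \c])

;; (when (not ...)) and when-not run the body under the same condition
(def with-when     (when (not (nil? value)) :present))
(def with-when-not (when-not (nil? value) :present))

;; without a separator, string/join concatenates like (apply str coll)
(def with-apply-str (apply str coll))
(def with-join      (string/join coll))
```

Both pairs should evaluate to the same value, so the change would be purely about readability.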

sloth

Idiomatic Clojure code in a Markdown parser

Some time ago I wrote a Markdown parser in Clojure, and since I'm still a Clojure beginner I would like some feedback on it: Is the code understandable? Is it idiomatic? Can some things be improved?

So I'm looking for feedback on best practices and design pattern usage (performance isn't my main concern).

The most relevant parts are:

blocks.clj

(ns mdclj.blocks
  (:use [clojure.string :only [blank? split]]
        [mdclj.spans :only [parse-spans]]
        [mdclj.misc]))

(defn- collect-prefixed-lines [lines prefix]
  (when-let [[prefixed remaining] (partition-while #(startswith % prefix) lines)]
    [(map #(to-string (drop (count prefix) %)) prefixed) remaining]))

(defn- line-seperated [lines]
  (when-let [[par [r & rrest :as remaining]] (partition-while (complement blank?) lines)]
    (list par rrest)))

(declare parse-blocks)

(defn- create-block-map [type content & extra]
  (into {:type type :content content} extra))

(defn- clean-heading-string [line]
  (-> line
      (to-string)
      (clojure.string/trim)
      (clojure.string/replace #" #*$" "") ;; match space followed by any number of #s
      (clojure.string/trim)))

(defn match-heading [[head & remaining :as text]]
  (let [;; try the longest marker first, otherwise "#" already matches "## x" as level 1:
        ;; ([6 "######"] [5 "#####"] ... [1 "#"])
        headings (reverse (map vector (range 1 7) (iterate #(str \# %) "#")))
        [size rest] (some (fn [[index pattern]]
                            (let [rest (startswith head pattern)]
                              (when (seq rest)
                                [index rest])))
                          headings)]
    (when (not (nil? rest))
      [(create-block-map ::heading (parse-spans (clean-heading-string rest)) {:size size}) remaining])))

(defn- match-underline-heading [[caption underline & remaining :as text]]
  (let [current (set underline)
        marker [\- \=]
        markers (mapcat #(list #{\space %} #{%}) marker)]
    (when (and (some #(= % current) markers)
               (some #(startswith underline [%]) marker)
               (< (count (partition-by identity underline)) 3))
      ;; only option maps go into the trailing args of create-block-map; also passing
      ;; `remaining` there makes `into` throw as soon as further lines follow the heading
      [(create-block-map ::heading (parse-spans caption) {:size 1}) remaining])))

(defn- match-horizontal-rule [[rule & remaining :as text]]
  (let [s (set rule)
        marker [\- \*]
        markers (mapcat #(list #{\space %} #{%}) marker)]
    (when (and (some #(= % s) markers)
               (> (some #(get (frequencies rule) %) marker) 2))
      [{:type ::hrule} remaining])))

(defn- match-codeblock [text]
  (when-let [[code remaining] (collect-prefixed-lines text " ")]
    [(create-block-map ::codeblock code) remaining]))

(defn- match-blockquote [text]
  (when-let [[quote remaining] (collect-prefixed-lines text "> ")]
    [(create-block-map ::blockquote (parse-blocks quote)) remaining]))

(defn- match-paragraph [text]
  (when-let [[lines remaining] (line-seperated text)]
    [(create-block-map ::paragraph (parse-spans (clojure.string/join "\n" lines))) remaining]))

(defn- match-empty [[head & remaining :as text]]
  (when (and (blank? head) (seq remaining))
    (parse-blocks remaining)))

(def ^:private block-matcher
  [match-heading
   match-underline-heading
   match-horizontal-rule
   match-codeblock
   match-blockquote
   match-paragraph
   match-empty])

(defn- parse-blocks [lines]
  (lazy-seq
    (when-let [[result remaining] (some #(% lines) block-matcher)]
      (cons result (parse-blocks remaining)))))

(defn parse-text [text]
  (parse-blocks (seq (clojure.string/split-lines text))))

spans.clj

(ns mdclj.spans
  (:require [clojure.string]) ;; make sure clojure.string is loaded before it is used below
  (:use [mdclj.misc]))

(def ^:private formatter
  [["`" ::inlinecode]
   ["**" ::strong]
   ["__" ::strong]
   ["*" ::emphasis]
   ["_" ::emphasis]])

;; docstrings go before the argument vector; placed after it they are
;; just a string expression in the body that is evaluated and discarded
(defn- apply-formatter
  "Checks if text starts with the given pattern. If so, returns the spantype, the text
  enclosed in the pattern, and the remaining text."
  [text [pattern spantype]]
  (when-let [[body remaining] (delimited text pattern)]
    [spantype body remaining]))

(defn- get-spantype [text]
  (let [[spantype body remaining :as match] (some #(apply-formatter text %) formatter)]
    (if (some-every-pred startswith [body remaining] ["*" "_"])
      [spantype (-> body (vec) (conj (first remaining))) (rest remaining)]
      match)))

(defn- make-literal
  "Creates a literal span from the acc."
  [acc]
  {:type ::literal :content (to-string (reverse acc))})

(declare parse-spans)

(defn- span-emit
  "Creates a vector containing a literal span created from literal-text and 'span'
  if literal-text is non-empty, else a vector containing only 'span'."
  [literal-text span]
  (if (seq literal-text)
    [(make-literal literal-text) span] ;; if non-empty literal before next span
    [span]))

(defn- concat-spans [acc span remaining]
  (concat (span-emit acc span) (parse-spans [] remaining)))

(defn- parse-span-body
  ([body]
   (parse-span-body nil body))
  ([spantype body]
   (if (in? [::inlinecode ::image] spantype)
     (to-string body)
     (parse-spans [] body)))) ;; all spans except inlinecode and image can be nested

(defn- match-span [acc text] ;; matches ::inlinecode ::strong ::emphasis
  (when-let [[spantype body remaining :as match] (get-spantype text)] ;; get the first matching span
    (let [span {:type spantype :content (parse-span-body spantype body)}]
      (concat-spans acc span remaining))))

(defn- extract-link-title [text]
  (reduce #(clojure.string/replace % %2 "") (to-string text) [#"\"$" #"'$" #"^\"" #"^'"]))

(defn- parse-link-text [linktext]
  (let [[link title] (clojure.string/split (to-string linktext) #" " 2)]
    (if (seq title)
      {:url link :title (extract-link-title title)}
      {:url link})))

(defn- match-link-impl [acc text type]
  (when-let [[linkbody remaining :as body] (bracketed text "[" "]")]
    (when-let [[linktext remaining :as link] (bracketed remaining "(" ")")]
      (concat-spans acc (into {:type type :content (parse-span-body type linkbody)} (parse-link-text linktext)) remaining))))

(defn- match-link [acc text]
  (match-link-impl acc text ::link))

(defn- match-inline-image [acc [exmark & remaining :as text]]
  (when (= exmark \!)
    (match-link-impl acc remaining ::image)))

(defn- match-break [acc text]
  (when-let [remaining (some #(startswith text %) [" \n\r" " \n" " \r"])] ;; match hard-breaks
    (concat-spans acc {:type ::hard-break} remaining)))

(defn- match-literal [acc [t & trest :as text]]
  (cond
    (seq trest)
    (parse-spans (cons t acc) trest) ;; accumulate literal body (unparsed text left)

    (seq text)
    (list (make-literal (cons t acc))))) ;; emit literal (at end of text: no trest left)

(def ^:private span-matcher
  [match-span
   match-link
   match-inline-image
   match-break
   match-literal])

(defn parse-spans
  ([text]
   (parse-spans [] text))
  ([acc text]
   (some #(% acc text) span-matcher)))

misc.clj

(ns mdclj.misc)

;; docstrings below sit before the argument vector so that `doc` can find them

(defn in?
  "true if seq contains elm"
  [seq elm]
  (some #(= elm %) seq))

(defn startswith
  "Checks if coll starts with prefix.
  If so, returns the rest of coll, otherwise nil."
  [coll prefix]
  (let [[t & trest] coll
        [p & prest] prefix]
    (cond
      (and (= p t) ((some-fn seq) trest prest)) (recur trest prest)
      (= p t) '()
      (nil? prefix) coll)))

(defn partition-while
  ([f coll]
   (partition-while f [] coll))
  ([f acc [head & tail :as coll]]
   (cond
     (f head)
     (recur f (cons head acc) tail)

     (seq acc)
     (list (reverse acc) coll))))

(defn- bracketed-body
  "Searches for the sequence 'closing' in text and returns a
  list containing the elements before and after it."
  [closing acc text]
  (let [[t & trest] text
        r (startswith text closing)]
    (cond
      (not (nil? r)) (list (reverse acc) r)
      (seq text) (recur closing (cons t acc) trest))))

(defn bracketed
  "Checks if coll starts with opening and contains closing after it.
  If so, returns a list of the elements between 'opening' and 'closing', and the
  remaining elements."
  [coll opening closing]
  (when-let [remaining (startswith coll opening)]
    (bracketed-body closing '() remaining)))

(defn delimited
  "Checks if coll starts with pattern and also contains pattern.
  If so, returns a list of the elements between the pattern and the remaining elements."
  [coll pattern]
  (bracketed coll pattern pattern))

(defn to-string
  "Takes a coll of chars and returns a string."
  [coll]
  (apply str coll))

(defn some-every-pred
  "Builds a list of partial function predicates with function f and
  all values in ands, and returns whether any argument in ors fulfills
  all those predicates."
  [f ands ors]
  (let [preds (map #(partial f %) ands)]
    (some true? (map #((apply every-pred preds) %) ors))))

Some "highlights":

(def ^:private block-matcher
  [match-heading
   match-underline-heading
   match-horizontal-rule
   match-codeblock
   match-blockquote
   match-paragraph
   match-empty])

(defn- parse-blocks [lines]
  (lazy-seq
    (when-let [[result remaining] (some #(% lines) block-matcher)]
      (cons result (parse-blocks remaining)))))

This piece has always seemed somewhat strange to me. Is using a vector of functions together with some and when-let idiomatic here? Are there alternatives?
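To show the dispatch pattern in isolation, here is a minimal self-contained sketch. The namespace `example.dispatch`, the toy matchers `match-number` and `match-word`, and `parse-tokens` are all hypothetical and not part of mdclj. They only mirror the contract used above: a matcher returns `[result remaining]` on success and nil otherwise.

```clojure
(ns example.dispatch)

;; Hypothetical toy matchers (not part of mdclj): each returns
;; [result remaining] on success and nil otherwise.
(defn- match-number [[head & remaining]]
  (when (number? head)
    [{:type ::number :content head} remaining]))

(defn- match-word [[head & remaining]]
  (when (string? head)
    [{:type ::word :content head} remaining]))

(def ^:private matchers [match-number match-word])

(defn parse-tokens
  "Lazily applies the first matcher that succeeds, then continues with the rest.
  Parsing stops at the first token that no matcher accepts."
  [tokens]
  (lazy-seq
    (when-let [[result remaining] (some #(% tokens) matchers)]
      (cons result (parse-tokens remaining)))))

(def parsed (parse-tokens [1 "a" 2]))
```

Here `(map :content parsed)` yields `(1 "a" 2)`. In this sketch, `some` returns the first non-nil matcher result, so the order of the vector decides precedence. That is presumably why match-paragraph comes near the end of block-matcher.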

(defn- create-block-map [type content & extra]
  (into {:type type :content content} extra))

I'm using this function to create hashmaps in a certain "format". Is this an idiomatic approach?
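For reference, a small sketch of how the variadic `extra` argument behaves. The function is copied here as a public `defn` so the snippet runs on its own, and `create-block-map-merge` is a hypothetical variant added only for comparison, not something taken from mdclj.

```clojure
;; Copy of the function above, public so the snippet is self-contained.
(defn create-block-map [type content & extra]
  (into {:type type :content content} extra))

;; Each extra argument is expected to be a map; `into` conj-es them one by one.
(def heading (create-block-map :heading "Title" {:size 2}))
;; => {:type :heading, :content "Title", :size 2}

;; Without extras, `extra` is nil and the base map is returned unchanged.
(def paragraph (create-block-map :paragraph "Text"))

;; Hypothetical variant: the same result expressed with merge.
(defn create-block-map-merge [type content & extra]
  (apply merge {:type type :content content} extra))
```

Both versions give the same map for map-valued extras. Whether `into` or `merge` states the intent better is a matter of taste.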
