I have implemented both some sorting algorithms (namely bubble sort
and merge sort
), as well as a command-line interface, in Clojure for fun and to test my Clojure abilities (I've tampered with Clojure for a while now, but I'm still quite a beginner).
The code looks alright, but I think the styling of some parts of the code could use improvement (mainly the merge sort
implementation, which looks awfully verbose). As well as that, I could use some other improvements in general, such as speed and performance, especially with larger lists.
core.clj:
(ns experiments.core
(:gen-class)
(:require [experiments.command-line :as cl]))
(defn bubble-sort [array]
(loop [ind 0
sort-arr array
swaps? 0]
(if (= ind (dec (count array)))
(if (not= swaps? 0)
(bubble-sort sort-arr))
(if (> (nth sort-arr ind) (nth sort-arr (inc ind)))
(let
[temp-arr
(vec
(concat
(subvec sort-arr 0 ind)
[(nth sort-arr (inc ind))]
[(nth sort-arr ind)]
(subvec sort-arr (+ ind 2))))]
(do
(println temp-arr)
(recur (inc ind) temp-arr (inc swaps?))))
(recur (inc ind) sort-arr swaps?)))))
(defn merge-sort [array]
(defn split-array [arr]
(loop [orig arr final []]
(if (< (count orig) 2)
(if (= (count orig) 1)
(conj final orig)
final)
(recur
(subvec orig 2)
(conj final (subvec orig 0 2))))))
(defn sort-two [[a b]]
(loop [arr-a a
arr-b b
final []]
(if
(or
(empty? arr-a)
(empty? arr-b))
(vec
(concat
final
(if (empty? arr-a) arr-b arr-a)))
(if (> (first arr-a) (first arr-b))
(recur
arr-a
(vec (rest arr-b))
(conj final (first arr-b)))
(recur
(vec (rest arr-a))
arr-b
(conj final (first arr-a)))))))
(loop
[sort-arr
(split-array
(vec
(for [a (range (count array))] [(nth array a)])))]
(if (= (count sort-arr) 1)
(println (sort-two sort-arr))
(recur
(split-array
(loop [ind 0
temp-arr sort-arr]
(println temp-arr)
(if (= ind (count temp-arr)) temp-arr
(recur
(inc ind)
(vec
(concat
(subvec temp-arr 0 ind)
[(if (= (count (nth temp-arr ind)) 1)
(nth (nth temp-arr ind) 0)
(sort-two (nth temp-arr ind)))]
(subvec temp-arr (inc ind))))))))))))
(defn gen-random-array [length]
(loop [arr []]
(if (= (count arr) length) arr
(let [rand-val (rand-int length)]
(if (some #{rand-val} arr)
(recur arr)
(recur (conj arr rand-val)))))))
(defn init-args []
(cl/add-arg "-b" "bubble"
["Print the procedure of bubble-sorting,"
"given an arbitrary amount of arguments."]
(fn
([]
(bubble-sort
(gen-random-array 10)))
([args]
(bubble-sort
(vec
(for [a (range (count args))]
(read-string (nth args a))))))))
(cl/add-arg "-m" "merge"
["Print the procedure of merge-sorting,"
"given an arbitrary amount of arguments."]
(fn
([]
(merge-sort
(gen-random-array 10)))
([args]
(merge-sort
(vec
(for [a (range (count args))]
(read-string (nth args a))))))))
(cl/add-arg "-r" "random"
["Generate a random array."]
(fn
([args]
(println
(gen-random-array
(read-string (nth args 0))))))))
(defn -main [& args]
(init-args)
(cl/parse-arg (vec args)))
command-line.clj:
(ns experiments.command-line
(:gen-class)
(:require [clojure.string :as string]))
(def names (atom [["-h" "help"]]))
(def docs (atom [[""]]))
(def funcs (atom [(fn [] ())]))
(defn add-arg [s-name l-name arg-doc func]
(swap! names conj [s-name l-name])
(swap! docs conj arg-doc)
(swap! funcs conj func))
(defn disp-help []
(println "\nCommands: ")
(loop [a 1]
(if (= a (count @names))
(print "")
(do
(println " "
(nth (nth @names a) 0) "\b,"
(nth (nth @names a) 1) "\b:\n "
(string/join "\n "
(nth @docs a)))
(println "")
(recur (+ a 1))))))
(defn parse-arg [args]
(let [main-arg (nth args 0)
other-args (subvec args 1)]
(loop [a 0]
(if (= a (count @names))
(println "No match!")
(if
(or
(= main-arg (nth (nth @names a) 0))
(= main-arg (nth (nth @names a) 1)))
(if (= a 0)
(disp-help)
((nth @funcs a) other-args))
(recur (+ a 1)))))))
1 Answer 1
Let's deal with bubble-sort
.
The if
form
(if (not= swaps? 0)
(bubble-sort sort-arr))
... has no expression for what happens if the condition fails, hence then returns nil
. The above needs to be ...
(if (not= swaps? 0)
(bubble-sort sort-arr)
sort-arr)
This works.
Next, every time you do a swap, you reconstruct an entire vector using concat
. It is faster and neater to use assoc
to exchange the two neighbouring elements:
(defn bubble-sort [array]
(loop [ind 0
sort-arr array
swaps? 0]
(if (= ind (dec (count array)))
(if (zero? swaps?)
array
(bubble-sort sort-arr))
(if (> (nth sort-arr ind) (nth sort-arr (inc ind)))
(let [temp-arr (assoc sort-arr
ind (sort-arr (inc ind))
(inc ind) (sort-arr ind))]
(recur (inc ind) temp-arr (inc swaps?)))
(recur (inc ind) sort-arr swaps?)))))
I've also used zero?
to tidy up the if
we looked at before.
Another couple of improvements:
The bubble-sort
function performs a recursive call for each pass through the vector. This limits it to sequences about 10K long, depending on how the JVM is configured. Otherwise you get stack overflow.
The solution is to replace the recursive call (bubble-sort sort-arr)
with the equivalent (recur 0 sort-arr 0)
. We also need to replace the reference to array
with one to sort-arr
, since array
is no longer renewed on every pass. So we replace
(if (zero? swaps?)
array
(bubble-sort sort-arr))
with ...
(if (zero? swaps?)
sort-arr
(recur 0 sort-arr 0))
This works. Try ...
(-> (range 10000 0 -1) vec bubble-sort)
before and after.
A fundamental mismatch is that bubble sort works by update-in-place, whereas clojure's core data structures are persistent - you can't change them. What you get back from assoc
and the like is a modified version of the original which shares most of its structure. This carries a substantial cost in performance.
Since you are not using the persistence of the vector, you can use transients. This reduces the cost of updates considerably - YMMV. To do so,
- Use
transient
to create the transient version. - Replace
assoc
withassoc!
- Use
persistent!
to recover a persistent version.
We get ...
(defn bubble-sort [array]
(loop [ind 0
sort-arr (transient array)
swaps? 0]
(if (= ind (dec (count array)))
(if (zero? swaps?)
(persistent! sort-arr)
(recur 0 sort-arr 0))
(if (> (nth sort-arr ind) (nth sort-arr (inc ind)))
(let [temp-arr (assoc! sort-arr
ind (sort-arr (inc ind))
(inc ind) (sort-arr ind))]
(recur (inc ind) temp-arr (inc swaps?)))
(recur (inc ind) sort-arr swaps?)))))
This seems to run a bit quicker, though it's still horribly slow compared to the core sort
.
-
\$\begingroup\$ Thanks! I would love to see more of the review. I know this might sound annoying, but... can I have more stuff please? \$\endgroup\$Qwerp-Derp– Qwerp-Derp2017年02月18日 03:28:53 +00:00Commented Feb 18, 2017 at 3:28
-
\$\begingroup\$ Thanks for the full bubble-sort review! I thought I couldn't turn the function into a recursive one using
recur
. I changedswaps?
to a boolean as well, so it's easier to follow. I'm not sure if it's faster overall, though. \$\endgroup\$Qwerp-Derp– Qwerp-Derp2017年03月04日 00:00:27 +00:00Commented Mar 4, 2017 at 0:00 -
\$\begingroup\$ @Qwerp-Derp
recur
takes the place of a recursive call, but is implemented by a loop - faster and uses no call stack. You can only use it when the returned value is that of the recursive call itself. This is called tail position. Lucklly, that prevails here. \$\endgroup\$Thumbnail– Thumbnail2017年03月04日 00:09:12 +00:00Commented Mar 4, 2017 at 0:09 -
\$\begingroup\$ I've revisited this review, and I kinda want reviews for the other sorts. If you have time, could you do it? \$\endgroup\$Qwerp-Derp– Qwerp-Derp2017年03月04日 00:11:41 +00:00Commented Mar 4, 2017 at 0:11
Explore related questions
See similar questions with these tags.
defn
inside of anotherdefn
. Instead, use(let [split-array (fn [x] ...)]...)
or(letfn [(split-array [x] ...)]...)
\$\endgroup\$