This is my Clojure code for finding prime numbers.
Note: this is an advanced version which starts eliminating from i*i with step i, instead of filtering all list against mod i == 0 (Sieve of Eratosthenes).
It has better asymptotic runtime: O(n log log n) instead of O(n log n) in typical examples of finding primes in Clojure.
What can be done better? Do I use some slow constructions? Can I make it more concise? Gain more performance? Format it better?
(defn primes [n]
(let [mark (fn [i di v]
(if (<= i (count v))
(recur (+ i di) di (assoc v (dec i) di))
v))
step (fn [i ps v]
(if (<= i (count v))
(if (= (v (dec i)) 1)
(recur (inc i) (conj ps i) (mark (* i i) i v))
(recur (inc i) ps v))
ps))]
(->> (repeat 1) (take n) vec (step 2 []))))
(defn -main
[& args]
(println (primes 50)))
2 Answers 2
(non-Clojure-specific) Gain performance, sure. Use packed array of odds, where index i represents value 2i+1. Then you don't have to deal with evens, which are all non-prime a priori (except the 2 of course). Then you can increment by 2*p for a prime p to find its odd multiples twice faster.
For a non-marked index i, the prime p is p = 2*i+1, its square is p*p = (2i+1)(2i+1) = 4i^2 + 4i + 1 and its index is (p*p-1)/2 = 2i^2 + 2i = 2i(i+1) = (p-1)(i+1). For the value increment of 2*p, the index increment on 2x-packed array is di = p = 2i+1.
I've played around with your code to try to understand it, frankly. After a long sequence of changes, I ended up with the following:
(defn primes [n]
(let [mark (fn [i di v]
(reduce
(fn [w i] (assoc w i di))
v
(range i (count v) di)))
[answer &_] (reduce
(fn [[ps v :as both] i]
(if (= (v i) 1)
[(conj ps i) (mark (* i i) i v)]
both))
(let [init-v (->> (repeat 1) (take n) (vec))]
[[] init-v])
(range 2 n))]
answer))
- I've got rid of the
decs in all the accesses to the vectorv. - I've captured the
recurs, in themarkandstepfunctions, inreduces. - Since the
stepfunction is no longer recursive, I've unwrapped it into its one call.
The new mark function is a little faster. But the step equivalent is going to be slower, since it generates a new pair-vector for every prime.
The main problem here is space - your vector v is the size of your candidate range of numbers. I've come across a cute algorithm that gets round this, though at some cost in speed, spent on laziness.
n log (log n)only if(assoc v (dec i) di)is O(1). \$\endgroup\$