For some reason, this is the first time I've ever written an implementation of the Sieve of Eratosthenes. I basically just stared at the Wikipedia walkthrough of the algorithm until it made sense, then tried making the algorithm in idiomatic Clojure.
I'm not using an boolean array representing every number. Instead, I have a marked
set, and a primes
vector that are maintained separately.
My main concern is performance. I was hoping for better performance than I'm getting. It's only about 2x faster than the naïve implementation (included below the Sieve code for reference). I'd mainly like recommendations for speeding this up.
I decided to go overboard with the use of
->>
here. It just fit so perfectly everywhere, but I think I may have taken it too far.Is there a way to avoid the
cond
dispatch innaïve-prime?
?Anything else notable.
(ns irrelevant.sieves.eratosthenes)
; ----- Sieve -----
(defn sieve-primes [n]
(loop [p 2 ; First prime
marked #{} ; Found composites
primes []]
(let [mults (->> n
(range p)
(map #(* p %))
(take-while #(< % n)))
next-p (->> p
(inc)
(iterate inc)
(remove marked)
(first))
new-primes (conj primes p)]
(if (>= next-p n)
new-primes
(recur next-p (into marked mults) new-primes)))))
; ----- Naive -----
(defn naive-prime? [n]
(cond
(< n 2) false
(= n 2) true
:else (->> n
(Math/sqrt)
(inc)
(range 2)
(some #(zero? (rem n %)))
(not))))
(defn naive-primes [n]
(->> n
(range 2)
(filterv naive-prime?)))
Runtime Complexity:
I wrote up a testing function to automate timing with the help of Criterium for benchmarking:
(defn test-times [start-n end-n step-factor]
(->> (iterate #(* step-factor %) start-n)
(take-while #(< % end-n))
(mapv (fn [n] [n (->> (c/quick-benchmark*
(fn [] (sieve-primes n))
nil)
(:mean)
(first))]))))
This tests how long it takes for various values of n
, then packages the results in a vector of [n time-taken]
pairs. Here are the results (times in seconds):
(test-times 1 1e7 2)
=>
[[1 2.3196602948749615E-7]
[2 2.3105128053865377E-7]
[4 1.42932799280724E-6]
[8 5.63559279071997E-6]
[16 1.2984918224299065E-5]
[32 3.289705735278571E-5]
[64 7.087895492662475E-5]
[128 1.4285673446327686E-4]
[256 2.923177758112095E-4]
[512 6.273025793650794E-4]
[1024 0.0012981272479166666]
[2048 0.0025652252416666667]
[4096 0.005141740791666667]
[8192 0.01068772265]
[16384 0.022486903166666666]
[32768 0.05367695991666667]
[65536 0.11212592816666668]
[131072 0.23180363016666666]
[262144 0.48056071016666674]
[524288 1.1464995601666668]
[1048576 2.8823068596666666]
[2097152 6.4231157065]
[4194304 15.808042165333335]
[8388608 56.98961213533334]]
Plotting it out, it appears to be roughly O(n^3)
. The time taken seems to roughly double every time n
is doubled (O(n)
) until n=524288, then it explodes. A custom Wolfram regression calculator gave a best fit with a cubic curve.
1 Answer 1
Is there a way to avoid the
cond
dispatch innaïve-prime?
?
You could do naive-prime?
this way:
(defn naive-prime? [n]
(and
(>= n 2)
(->> n
(inc)
(Math/sqrt)
(range 2)
(not-any? #(zero? (rem n %))))))
This gets rid of the special cases 1
and 2
. The trick is to increment before applying Math/sqrt
, so that the edge case 2
works correctly.
I've also elided the not
into the some
to produce not-any?
.
I'd mainly like recommendations for speeding this up.
Try
(defn sieve-primes [n]
(loop [p 2 ; First prime
marked #{} ; Found composites
primes []]
(let [mults (range (* p p) (inc n) p)
next-p (->> p
(inc)
(iterate inc)
(remove marked)
(first))
new-primes (conj primes p)]
(if (>= (* next-p next-p) n)
(into new-primes (remove marked (range next-p (inc n)))
There are a couple of changes here.
- Generate
mults
as arange
, starting at(* p p)
- the first number that needs to be tested with prime factorp
. - Stop testing when the prime you would use,
next-p
, squared is bigger than the limitn
.
I don't like the idea of building a massive set of composites. Better turn the algorithm inside out and test each number as it occurs against only the necessary prime factors. But that's a quite different algorithm.
Explore related questions
See similar questions with these tags.
(remove marked ps)
compares eachp
inps
against all elements inmarked
, out of order? \$\endgroup\$