Functional Sieve of Eratosthenes in Clojure

Question 1

For some reason, this is the first time I've ever written an implementation of the Sieve of Eratosthenes. I basically just stared at the Wikipedia walkthrough of the algorithm until it made sense, then tried making the algorithm in idiomatic Clojure.

I'm not using an boolean array representing every number. Instead, I have a marked set, and a primes vector that are maintained separately.

My main concern is performance. I was hoping for better performance than I'm getting. It's only about 2x faster than the naïve implementation (included below the Sieve code for reference). I'd mainly like recommendations for speeding this up.
I decided to go overboard with the use of ->> here. It just fit so perfectly everywhere, but I think I may have taken it too far.
Is there a way to avoid the cond dispatch in naïve-prime??
Anything else notable.

(ns irrelevant.sieves.eratosthenes)
; ----- Sieve -----
(defn sieve-primes [n]
 (loop [p 2 ; First prime
 marked #{} ; Found composites
 primes []]
 (let [mults (->> n
 (range p)
 (map #(* p %))
 (take-while #(< % n)))
 next-p (->> p
 (inc)
 (iterate inc)
 (remove marked)
 (first))
 new-primes (conj primes p)]
 (if (>= next-p n)
 new-primes
 (recur next-p (into marked mults) new-primes)))))
; ----- Naive -----
(defn naive-prime? [n]
 (cond
 (< n 2) false
 (= n 2) true
 :else (->> n
 (Math/sqrt)
 (inc)
 (range 2)
 (some #(zero? (rem n %)))
 (not))))
(defn naive-primes [n]
 (->> n
 (range 2)
 (filterv naive-prime?)))

Runtime Complexity:

I wrote up a testing function to automate timing with the help of Criterium for benchmarking:

(defn test-times [start-n end-n step-factor]
 (->> (iterate #(* step-factor %) start-n)
 (take-while #(< % end-n))
 (mapv (fn [n] [n (->> (c/quick-benchmark*
 (fn [] (sieve-primes n))
 nil)
 (:mean)
 (first))]))))

This tests how long it takes for various values of n, then packages the results in a vector of [n time-taken] pairs. Here are the results (times in seconds):

(test-times 1 1e7 2)
=>
[[1 2.3196602948749615E-7]
 [2 2.3105128053865377E-7]
 [4 1.42932799280724E-6]
 [8 5.63559279071997E-6]
 [16 1.2984918224299065E-5]
 [32 3.289705735278571E-5]
 [64 7.087895492662475E-5]
 [128 1.4285673446327686E-4]
 [256 2.923177758112095E-4]
 [512 6.273025793650794E-4]
 [1024 0.0012981272479166666]
 [2048 0.0025652252416666667]
 [4096 0.005141740791666667]
 [8192 0.01068772265]
 [16384 0.022486903166666666]
 [32768 0.05367695991666667]
 [65536 0.11212592816666668]
 [131072 0.23180363016666666]
 [262144 0.48056071016666674]
 [524288 1.1464995601666668]
 [1048576 2.8823068596666666]
 [2097152 6.4231157065]
 [4194304 15.808042165333335]
 [8388608 56.98961213533334]]

Plotting it out, it appears to be roughly O(n^3). The time taken seems to roughly double every time n is doubled (O(n)) until n=524288, then it explodes. A custom Wolfram regression calculator gave a best fit with a cubic curve.

Question 2

Empirical Orders of Growth, please! (i.e. compare the speeds at two size points at least, preferably three or more, to get better picture of what's going on there).

Question 3

can it be that (remove marked ps) compares each p in ps against all elements in marked, out of order?

Question 4

@WillNess A membership test of a set is near constant. This is also lazy, so it will only search enough to find the first result.

Question 5

ok, thanks. what about the empirical orders of growth? it would give a good hint about the goings on.

Question 6

@WillNess I haven't yet graphed the runtime of different n values. I can do that today. I did notice that it scaled better than the naive version though. It was only like 1.8x faster than the naive for n=1000, but ~2.2x faster for n=10000.

Question 7

Is there a way to avoid the cond dispatch in naïve-prime??

You could do naive-prime? this way:

(defn naive-prime? [n]
 (and
 (>= n 2)
 (->> n
 (inc)
 (Math/sqrt)
 (range 2)
 (not-any? #(zero? (rem n %))))))

This gets rid of the special cases 1 and 2. The trick is to increment before applying Math/sqrt, so that the edge case 2 works correctly.

I've also elided the not into the some to produce not-any?.

I'd mainly like recommendations for speeding this up.

Try

(defn sieve-primes [n]
 (loop [p 2 ; First prime
 marked #{} ; Found composites
 primes []]
 (let [mults (range (* p p) (inc n) p)
 next-p (->> p
 (inc)
 (iterate inc)
 (remove marked)
 (first))
 new-primes (conj primes p)]
 (if (>= (* next-p next-p) n)
 (into new-primes (remove marked (range next-p (inc n)))

There are a couple of changes here.

Generate mults as a range, starting at (* p p) - the first number that needs to be tested with prime factor p.
Stop testing when the prime you would use, next-p, squared is bigger than the limit n.

I don't like the idea of building a massive set of composites. Better turn the algorithm inside out and test each number as it occurs against only the necessary prime factors. But that's a quite different algorithm.

Thumbnail Thumbnail 1,6349 silver badges16 bronze badges · Accepted Answer · 2019-02-11 02:31:04Z

Is there a way to avoid the cond dispatch in naïve-prime??

You could do naive-prime? this way:

(defn naive-prime? [n]
 (and
 (>= n 2)
 (->> n
 (inc)
 (Math/sqrt)
 (range 2)
 (not-any? #(zero? (rem n %))))))

This gets rid of the special cases 1 and 2. The trick is to increment before applying Math/sqrt, so that the edge case 2 works correctly.

I've also elided the not into the some to produce not-any?.

I'd mainly like recommendations for speeding this up.

Try

(defn sieve-primes [n]
 (loop [p 2 ; First prime
 marked #{} ; Found composites
 primes []]
 (let [mults (range (* p p) (inc n) p)
 next-p (->> p
 (inc)
 (iterate inc)
 (remove marked)
 (first))
 new-primes (conj primes p)]
 (if (>= (* next-p next-p) n)
 (into new-primes (remove marked (range next-p (inc n)))

There are a couple of changes here.

Generate mults as a range, starting at (* p p) - the first number that needs to be tested with prime factor p.
Stop testing when the prime you would use, next-p, squared is bigger than the limit n.

I don't like the idea of building a massive set of composites. Better turn the algorithm inside out and test each number as it occurs against only the necessary prime factors. But that's a quite different algorithm.

Stack Exchange Network

Functional Sieve of Eratosthenes in Clojure

Runtime Complexity:

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Functional Sieve of Eratosthenes in Clojure

Runtime Complexity:

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions