3
\$\begingroup\$

I am learning streams and lambdas in Java 8.

I wanted to implement the classic Sieve of Eratosthenes using lambdas and streams. This is my implementation.

 /**
 * @author Sanjay
 */
class Sieve {
 /**
 * Implementation of Sieve of eratosthenes algorithm using streams and lambdas 
 * It's much slower than the version with traditional for loops
 * It consumes much memory than the version with traditional for loops
 * It computes all primes upto 100_000_000 in 2 secs (approx) 
 */
 public static List<Integer> sieveOfEratosthenes(int _n) {
 // if _n is empty return an empty list
 if (_n <= 1) return Arrays.asList();
 // create a boolean array each index representing index number itself
 boolean[] prime = new boolean[_n + 1];
 // set all the values in prime as true
 IntStream.rangeClosed(0, _n).forEach(x -> prime[x] = true);
 // make all composite numbers false in prime array
 IntStream.rangeClosed(2, (int) Math.sqrt(_n))
 .filter(x -> prime[x])
 .forEach(x -> unsetFactors(x, prime, _n));
 // create a list containing primes upto _n
 List<Integer> primeList = new ArrayList<>((_n < 20) ? _n : (_n < 100) ? 30 : _n / 3);
 // add all the indexes that represent true in prime array to primeList
 IntStream.rangeClosed(2, _n).filter(x -> prime[x]).forEach(primeList::add);
 // return prime list
 return primeList;
 }
 /*
 * makes all the factors of x in prime array to false
 * as primes don't have any factors
 * here x is a prime and this makes all factors of x as false
 */
 private static void unsetFactors(int x, boolean[] prime, int _n) {
 IntStream.iterate(x * 2, factor -> factor + x)
 .limit(_n / x - 1)
 .forEach(factor -> prime[factor] = false);
 }
}

What is its efficiency compared to normal for-loops? Are there any imporvements to be made?

200_success
145k22 gold badges190 silver badges478 bronze badges
asked Oct 1, 2017 at 8:56
\$\endgroup\$
1
  • \$\begingroup\$ @greybeard Please don't answer in comments. \$\endgroup\$ Commented Oct 1, 2017 at 14:36

1 Answer 1

1
\$\begingroup\$

Implementation

You can invert the values for prime and not prime to avoid the initial rangeClosed(0, _n).forEach(x -> prime[x] = true). The innermost loop can start at x*x instead of x*2:

// (1)
public static List<Integer> sieveOfEratosthenes(int n) {
 boolean[] notPrime = new boolean[n + 1];
 IntStream.rangeClosed(2, (int) Math.sqrt(n))
 .filter(x -> !notPrime[x])
 .forEach(x -> {
 IntStream.iterate(x * x, m -> m <= n, m -> m + x)
 .forEach(y -> notPrime[y] = true);
 });
 List<Integer> list = new ArrayList<>();
 IntStream.rangeClosed(2, n)
 .filter(x -> !notPrime[x])
 .forEach(x -> list.add(x));
 return list;
}

This implementation still feels like a literal translation of the for-loop approach. Streams offer options to improve readability, i.e. the inner loop can be flattened:

// (2)
public static List<Integer> sieveOfEratosthenes(int n) {
 boolean[] notPrime = new boolean[n + 1];
 rangeClosed(2, (int) Math.sqrt(n))
 .filter(x -> !notPrime[x])
 .flatMap(x -> iterate(x * x, m -> m <= n, m -> m + x))
 .forEach(x -> notPrime[x] = true);
 return rangeClosed(2, n)
 .filter(x -> !notPrime[x])
 .boxed().collect(toList());
}

Stream vs Loop

The main problem I see with a stream based approach is (besides the overhead of the streams), that it is more difficult to optimize compared to a loop approach as the abstraction level is higher. For example converting the implementation to a parallel approach is in my opinion way more complicated with a stream approach.

Algorithm improvements (not directly related to the question)

You can use a BitSet (or directly a long/int/byte array) to ensure that each entry in the sieve consumes only one bit (instead of currently (likely) 8). You are storing all values from 2 to n in the sieve, you can save memory by skipping multiples of 2 (and 3, 5, ...) at the cost of some additional calculations to convert between sieve position and value.

For larger values of n it might be advantageous to divide the sieving process in smaller steps and use a small array instead of a large array for all values and sieving the complete range at once.

The sieving process can be converted to a parallel implementation with nearly linear speedup as two or more values can be sieved simultaneously (sieving process of each value is independent from other values).

Approx. performance for n=100_000_000, included my implementation (which utilizes most of the improvements mentioned above) as reference value:

// 1 Thread
Initial 3406676978ns
Version (1) 2487619662ns
Version (2) 2459434320ns
For loop 2158261344ns
Nevay to List 1436579484ns
Nevay to Sieve 1040829243ns
// 8 Threads
Nevay to List 682085628ns
Nevay to Sieve 172952467ns
answered Oct 1, 2017 at 16:01
\$\endgroup\$

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.