I am learning streams and lambdas in Java 8.
I wanted to implement the classic Sieve of Eratosthenes using lambdas and streams. This is my implementation.
/**
* @author Sanjay
*/
class Sieve {
/**
* Implementation of Sieve of eratosthenes algorithm using streams and lambdas
* It's much slower than the version with traditional for loops
* It consumes much memory than the version with traditional for loops
* It computes all primes upto 100_000_000 in 2 secs (approx)
*/
public static List<Integer> sieveOfEratosthenes(int _n) {
// if _n is empty return an empty list
if (_n <= 1) return Arrays.asList();
// create a boolean array each index representing index number itself
boolean[] prime = new boolean[_n + 1];
// set all the values in prime as true
IntStream.rangeClosed(0, _n).forEach(x -> prime[x] = true);
// make all composite numbers false in prime array
IntStream.rangeClosed(2, (int) Math.sqrt(_n))
.filter(x -> prime[x])
.forEach(x -> unsetFactors(x, prime, _n));
// create a list containing primes upto _n
List<Integer> primeList = new ArrayList<>((_n < 20) ? _n : (_n < 100) ? 30 : _n / 3);
// add all the indexes that represent true in prime array to primeList
IntStream.rangeClosed(2, _n).filter(x -> prime[x]).forEach(primeList::add);
// return prime list
return primeList;
}
/*
* makes all the factors of x in prime array to false
* as primes don't have any factors
* here x is a prime and this makes all factors of x as false
*/
private static void unsetFactors(int x, boolean[] prime, int _n) {
IntStream.iterate(x * 2, factor -> factor + x)
.limit(_n / x - 1)
.forEach(factor -> prime[factor] = false);
}
}
What is its efficiency compared to normal for-loops? Are there any imporvements to be made?
-
\$\begingroup\$ @greybeard Please don't answer in comments. \$\endgroup\$200_success– 200_success2017年10月01日 14:36:33 +00:00Commented Oct 1, 2017 at 14:36
1 Answer 1
Implementation
You can invert the values for prime
and not prime
to avoid the initial rangeClosed(0, _n).forEach(x -> prime[x] = true)
. The innermost loop can start at x*x
instead of x*2
:
// (1)
public static List<Integer> sieveOfEratosthenes(int n) {
boolean[] notPrime = new boolean[n + 1];
IntStream.rangeClosed(2, (int) Math.sqrt(n))
.filter(x -> !notPrime[x])
.forEach(x -> {
IntStream.iterate(x * x, m -> m <= n, m -> m + x)
.forEach(y -> notPrime[y] = true);
});
List<Integer> list = new ArrayList<>();
IntStream.rangeClosed(2, n)
.filter(x -> !notPrime[x])
.forEach(x -> list.add(x));
return list;
}
This implementation still feels like a literal translation of the for-loop approach. Streams offer options to improve readability, i.e. the inner loop can be flattened:
// (2)
public static List<Integer> sieveOfEratosthenes(int n) {
boolean[] notPrime = new boolean[n + 1];
rangeClosed(2, (int) Math.sqrt(n))
.filter(x -> !notPrime[x])
.flatMap(x -> iterate(x * x, m -> m <= n, m -> m + x))
.forEach(x -> notPrime[x] = true);
return rangeClosed(2, n)
.filter(x -> !notPrime[x])
.boxed().collect(toList());
}
Stream vs Loop
The main problem I see with a stream based approach is (besides the overhead of the streams), that it is more difficult to optimize compared to a loop approach as the abstraction level is higher. For example converting the implementation to a parallel approach is in my opinion way more complicated with a stream approach.
Algorithm improvements (not directly related to the question)
You can use a BitSet
(or directly a long
/int
/byte
array) to ensure that each entry in the sieve consumes only one bit (instead of currently (likely) 8). You are storing all values from 2
to n
in the sieve, you can save memory by skipping multiples of 2
(and 3
, 5
, ...) at the cost of some additional calculations to convert between sieve position and value.
For larger values of n
it might be advantageous to divide the sieving process in smaller steps and use a small array instead of a large array for all values and sieving the complete range at once.
The sieving process can be converted to a parallel implementation with nearly linear speedup as two or more values can be sieved simultaneously (sieving process of each value is independent from other values).
Approx. performance for n=100_000_000
, included my implementation (which utilizes most of the improvements mentioned above) as reference value:
// 1 Thread
Initial 3406676978ns
Version (1) 2487619662ns
Version (2) 2459434320ns
For loop 2158261344ns
Nevay to List 1436579484ns
Nevay to Sieve 1040829243ns
// 8 Threads
Nevay to List 682085628ns
Nevay to Sieve 172952467ns
Explore related questions
See similar questions with these tags.