I'm making a prime number scanning algorithm in Java. It uses the concept of the Sieve of Eratosthenes to efficiently find the prime numbers.
It works well and can calculate all the prime numbers under 1,000,000 in less than a second on my laptop, but I was wondering how the algorithm could be further improved.
public static void eratoImproved(long max){
long start = System.currentTimeMillis();
ArrayList<Long> invalidated = new ArrayList<>(); // where invalidated numbers will be stored.
// prepare
double maxFac = Math.sqrt(max);
int f = 2;
while(f <= maxFac){
boolean isNew = true;
for(long l : invalidated){
if(f % l == 0){
isNew = false;
}
}
if(isNew) {
invalidated.add(Math.round(f + 0.0));
}
f++;
}
ArrayList<Long> primes = new ArrayList<>();
long i = 3; // current test
for(long l : invalidated){
primes.add(l);
}
while(i <= max){
boolean v = true;
for(long s : invalidated){
if(i % s == 0){
v = false;
break;
}
}
if(v){
primes.add(i);
long m = 2*i;
if(primes.size() >= 150){
String s = Math.round(((i + 0.0)/max)*100) + "%: ";
for(long l : primes){
s += l + "/";
}
System.out.println(s);
primes.clear();
}
}
i += 2;
}
System.out.println("Completed search!");
String s = "Remaining primes: ";
for(long l : primes){
s += l + "/";
}
System.out.println(s);
primes.clear();
System.out.println("Conducted search in " + (System.currentTimeMillis() - start)/1000 + " seconds.");
}
I've optimized everything I can think of, but I wanted to get a second opinion.
-
\$\begingroup\$ For algorithm, you can use the concept of segmented sieve. stackoverflow.com/questions/10249378/… \$\endgroup\$Gulshan Raj– Gulshan Raj2015年11月01日 02:10:54 +00:00Commented Nov 1, 2015 at 2:10
4 Answers 4
Answer by janos is good, but is more about style. Here are some functional/performance observations related to your stated goal of efficiency.
invalidated.add(Math.round(f + 0.0))
, huh???
f
coerced todouble
, add a zero, callround()
. Why?
Suggest changef
fromint
tolong
, and just useinvalidated.add(f)
.while(f <= maxFac){
is done by continually coercingf
todouble
and comparing thedouble
values. ChangemaxFac
tolong
using cast (truncation is ok here, no need for rounding).for(long l : invalidated){ primes.add(l); }
is a long way of sayingprimes.addAll(invalidated)
, except you're adding the overhead of unboxing and reboxing the values.You have two identical loops of
invalidated
to determine if prime. Move to reusable method (DRY).Never do
String += String
in a loop. Use aStringBuilder
.
And again, what's with thei + 0.0
?StringBuilder buf = new StringBuilder(); buf.append(Math.round(i * 100.0 / max)).append("%: "); for (Long prime : primes) buf.append(prime).append('/'); System.out.println(buf);
Not a sieve
You claimed that you were using a Sieve of Eratosthenes algorithm but your program actually uses a trial division algorithm. This trial division is fast enough for primes up to 1000000, but it becomes slower and slower as you try to find higher primes.
For instance, I modified your program to add up all the primes up to 50000000. I compared it my own program that did the same thing using a sieve algorithm. The trial division program took 13.4 seconds versus only 0.42 seconds for the sieve program, which is a 32x speed difference.
Example of sieve
Here is the sieve function I used to add up all the primes to 50000000:
public static long addPrimes(int n)
{
boolean [] isComposite = new boolean[n+1];
int sqrtn = (int) Math.sqrt(n);
// This is the sieve. It computes whether each number is prime.
for (int i = 3; i < sqrtn; i += 2) {
if (!isComposite[i]) {
int increment = i+i;
for (int j = i*i; j <= n; j += increment) {
isComposite[j] = true;
}
}
}
// Now add up all the primes.
long total = 0;
// Start with 2 as the first prime.
if (n >= 2)
total = 2;
// Add each prime from the sieve.
for (int i = 3; i < n; i += 2) {
if (!isComposite[i]) {
total += i;
}
}
return total;
}
-
\$\begingroup\$ You could probably get a small speed improvement, and certainly a memory usage improvement, by using
java.lang.BitSet
instead ofboolean[]
, tweaking the logic slightly so that each bit corresponds to an odd number, and usingnextClearBit
. \$\endgroup\$Peter Taylor– Peter Taylor2015年11月01日 08:32:40 +00:00Commented Nov 1, 2015 at 8:32 -
\$\begingroup\$ @PeterTaylor In fact I already did all that but I didn't want to confuse a beginner with too many advanced concepts. The bit per odd number version was faster than the boolean version by about 70% for the primes up to 50000000. \$\endgroup\$JS1– JS12015年11月01日 08:42:06 +00:00Commented Nov 1, 2015 at 8:42
In terms of performance, this seems fine.
In terms of coding practices, a couple of things can be improved.
Declare with interface type when it's enough
This list can be declared as a List<Long>
:
ArrayList<Long> invalidated = new ArrayList<>();
Like this:
List<Long> invalidated = new ArrayList<>();
Apply this everywhere possible.
Use a for
loop instead of while
where more appropriate
Both while
loops in your code can be rewritten as for
loops, for example:
for (int f = 2; f <= maxFac; ++f) {
boolean isNew = true;
for (long l : invalidated) {
if (f % l == 0) {
isNew = false;
}
}
if (isNew) {
invalidated.add(Math.round(f + 0.0));
}
}
Reasons to prefer this form:
- The key elements of the looping logic are all on one line, easy to see
- By declaring the loop variable in the
for
statement, you limit its scope to the loop body, making it impossible to misuse the variable outside the loop by mistake - It's a common pattern for a counting loop
Poor names
The code is full of poor names. For example:
for (long l : invalidated) { primes.add(l); }
Why not name that loop variable prime
? It would be all the more readable.
All the single-letter variables would be better to rename to something more descriptive to help reading the code.
Unnecessary statements
This variable is unused, so delete it:
long m = 2 * i;
-
-
\$\begingroup\$ I still see many single-letter variables there... \$\endgroup\$janos– janos2015年10月31日 20:20:34 +00:00Commented Oct 31, 2015 at 20:20
It's rather hard to review this code because the documentation is extremely misleading.
I'm making a prime number scanning algorithm in Java. It uses the concept of the Sieve of Eratosthenes to efficiently find the prime numbers.
It works well and can calculate all the prime numbers under 1,000,000 in less than a second on my laptop, but I was wondering how the algorithm could be further improved.
So you want to find primes. But actually it seems, from the way you use the invalidated
list, that you really want to find primes between sqrt(max)
and max
. If that's the case,
- Document it clearly, especially before asking other people to review the code!
- Having computed
invalidated
, starti
at1 | (int)Math.sqrt(max)
to avoid repeating all of the calculation you've done to generateinvalidated