I've read other solutions to Project Euler problem 10 (finding the sum of all prime numbers under 2,000,000), but I wanted to try it on my own using the Sieve of Eratosthenes. It runs quickly and seems to work for all of my smaller test cases (limits under 10, the first 1000 primes), but it doesn't produce the correct answer for the full 2,000,000 range. Could anyone help point out the problem in my code? I know it's not the prettiest or most efficient code, so sorry.
#include <iostream>
using namespace std;

int main(int argc, char const *argv[])
{
    // upper limit & sum
    int sum = 0;
    int upLimit = 2000000;
    // array of isPrime check for integers
    bool isPrime[upLimit+1];
    // initialize
    for (int i=2; i <= upLimit; i++) {
        isPrime[i] = true;
    }
    for (int i=2; i<=upLimit; i++) {
        if (isPrime[i]) {
            // add to sum
            sum += i;
            // mark all multiples
            int j = i;
            while (j <= upLimit) {
                isPrime[j] = false;
                j += i;
            }
        }
    }
    cout << sum << endl;
}
3 Answers
You need to check for overflow before adding.
// Note: std::numeric_limits requires #include <limits>.
for (int i=2; i<=upLimit; i++) {
    if (isPrime[i]) {
        // Check for overflow before adding
        if (std::numeric_limits<decltype(sum)>::max() - sum <= i) {
            std::cerr << "Overflow" << std::endl;
            return 1;
        }
        // add to sum
        sum += i;
        ...
As int is only guaranteed to hold values up to 32767, you should use a long long for sum.
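For illustration, here is a minimal self-contained sketch combining both suggestions: the overflow check plus a long long sum. This is my own rewrite, not the asker's code; the static array is only there to keep the example within standard C++ and off the stack.

#include <iostream>
#include <limits>

int main()
{
    const int upLimit = 2000000;
    long long sum = 0;                         // at least 64 bits, wide enough here
    static bool isPrime[upLimit + 1];          // static: zero-initialized, not on the stack

    for (int i = 2; i <= upLimit; i++) isPrime[i] = true;

    for (int i = 2; i <= upLimit; i++) {
        if (!isPrime[i]) continue;
        // Defensive check as suggested above: bail out if sum + i would overflow.
        if (std::numeric_limits<long long>::max() - sum < i) {
            std::cerr << "Overflow" << std::endl;
            return 1;
        }
        sum += i;
        for (int j = i; j <= upLimit; j += i) {
            isPrime[j] = false;                // mark every multiple of i as composite
        }
    }
    std::cout << sum << std::endl;
}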
- int is only guaranteed to hold as many values as short, which in turn is guaranteed to hold as many values as char. In any case, if you suspect that overflow is a problem, testing for overflow inside an int loop running up to upLimit is a dubious solution. – vnp, Jan 6, 2015
- Note that long long is only guaranteed to be at least 64 bits since C++11. – Daan, Jan 8, 2015
- (Or C99, if you're not averse to mixing the two.) – Daan, Jan 8, 2015
The problem is overflow, which can be solved by changing some of the types in your code (I'm going to pretend i is predeclared here):
int sum = 0;
int upLimit = 2000000;
int i = 2;
int j = i;
While int is commonly a 32-bit integer, this isn't required. And we really care about the size (and the range it implies) of our integer types. So let's start by switching to the fixed-width integer types defined in cstdint*. We should also switch to the unsigned variety: since our code only uses positive integers, this effectively doubles our overflow-safe range:
uint32_t sum = 0;
uint32_t upLimit = 2000000;
uint32_t i = 2;
uint32_t j = i;
This helps, but will still break eventually. So let's establish an upper bound on the maximum value that we'd like to store in sum. There's some really interesting math in this area, but for now, let's pretend we don't know anything about primes. We only know that we'll be summing distinct 32-bit numbers: at most 2^32 values, each of them less than 2^32. That means sum can never reach 2^32 * 2^32 = 2^64, and we have a type that is guaranteed to hold any value below that:
uint64_t sum = 0;
No more overflow.
If you're interested, cstdint also contains type definitions like uint_fast32_t, which allow the compiler to use a larger type if that would let it generate more efficient code. See for example cstdint on CppReference. The usual caveats regarding premature optimization apply, of course.
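As a quick illustration (my own, not part of the answer), the fast variants are drop-in replacements whose actual width you can inspect with sizeof:

#include <cstdint>
#include <iostream>

int main()
{
    std::uint_fast32_t n = 0;   // at least 32 bits; may be wider if that is faster on this target
    std::cout << "uint_fast32_t occupies " << sizeof n << " bytes here\n";
}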
One last tip, not related to overflow: a bool[] takes up 1 byte per value, while a std::vector<bool> is allowed to be more space-efficient (typically 1 bit per value). Space efficiency (and more generally cache friendliness) will be paramount if you want to take this beyond a few million.
*) These definitions require C++11. If that's not available, many compilers offer similar types in C++98 mode, although portability might suffer. Alternatively, if your compiler supports C99, you might be able to get working fixed-width definitions from stdint.h.
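Putting these suggestions together (uint64_t for the running sum, std::vector<bool> for the sieve), a minimal self-contained sketch might look like the following. This is my own consolidation of the advice above, not code taken from the answer:

#include <cstdint>
#include <iostream>
#include <vector>

int main()
{
    const std::uint32_t upLimit = 2000000;
    std::uint64_t sum = 0;                          // cannot overflow for this problem
    std::vector<bool> isPrime(upLimit + 1, true);   // bit-packed: roughly 250 KB instead of ~2 MB

    for (std::uint32_t i = 2; i <= upLimit; ++i) {
        if (!isPrime[i]) continue;
        sum += i;
        for (std::uint32_t j = i; j <= upLimit; j += i) {
            isPrime[j] = false;                      // mark every multiple of i as composite
        }
    }
    std::cout << sum << '\n';
}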
Other than the datatype limits, try this optimisation, which is based on the fact that any prime greater than 3 can be written as 6x-1 or 6x+1.
int sum = 5; // 2 + 3

// For a generic upLimit (i.e. upLimit < 4), add the appropriate early returns.
// Initialize the isPrime array with true.
// Loop with i starting at 5: check i (6x-1) and i+2 (6x+1), then advance i by 6
// (done in two steps: i += 2 inside the body and i += 4 in the loop header).
for (int i = 5; i <= upLimit; i += 4) {
    if (isPrime[i]) { // prime of the form 6x-1
        // add to sum
        sum += i;
        // mark all multiples
        int j = i;
        while (j <= upLimit) {
            isPrime[j] = false;
            j += i;
        }
    }
    i += 2; // candidate of the form 6x+1
    if (i <= upLimit && isPrime[i]) { // guard: i may now be just past upLimit
        // add to sum
        sum += i;
        // mark all multiples
        int j = i;
        while (j <= upLimit) {
            isPrime[j] = false;
            j += i;
        }
    }
}
- While this is not bad advice, in my experience space efficiency matters more to a sieve. For example, storing only odd candidates in your sieve array (such that isPrime[i] represents candidate 2i+1) will likely provide a larger speedup than the above optimization; a sketch of this layout follows these comments. IIRC, the paper available after correctly submitting problem #10 discusses the odd-only option in greater detail. – Daan, Jan 8, 2015
- Yes, that surely will add to it, but this keeps the code readable. Your array size could be reduced to upLimit/3 by storing only the 6x-1 and 6x+1 candidates. – thepace, Jan 8, 2015
- Why stop there? You can use the length-30 prime wheel to reduce the size requirements to 8 bits per 30 candidates :-). But yes, maintainability suffers. – Daan, Jan 8, 2015
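For reference, here is a minimal sketch of the odd-only layout mentioned in the first comment, where index i stands for the odd candidate 2i+1. This is my own illustration of the idea, not material from the referenced paper:

#include <cstdint>
#include <iostream>
#include <vector>

int main()
{
    const std::uint32_t upLimit = 2000000;
    // Index i represents the odd candidate 2i+1; index 0 (the number 1) is simply unused.
    std::vector<bool> isPrime((upLimit + 1) / 2, true);

    std::uint64_t sum = 2;                               // 2 is the only even prime
    for (std::uint32_t i = 1; 2 * i + 1 <= upLimit; ++i) {
        if (!isPrime[i]) continue;
        const std::uint64_t p = 2 * i + 1;
        sum += p;
        // Cross off odd multiples of p, starting at p*p (smaller ones were already handled).
        for (std::uint64_t m = p * p; m <= upLimit; m += 2 * p) {
            isPrime[m / 2] = false;                      // index of odd m is (m-1)/2 == m/2
        }
    }
    std::cout << sum << '\n';
}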