Object-oriented approach to sieve of Eratosthenes

Question 1

I was trying to implement the sieve of Eratosthenes entirely by myself, using only what I have learned so far. Now I'm wondering what I can do to improve my code's readability. Optimizing the algorithm is not my goal for now.

#include <iostream>
#include <cmath>
using namespace std;
class num {
private:
 static unsigned obj_count; //simplifying initialization of the array, performance is not my main concern
 unsigned n;
 bool flagged;
public:
 num();
 void setn(unsigned);
 void setflagged(bool);
 bool isflagged();
 unsigned getn();
};
void Eratosthenes_sieve(num*, unsigned);
//Initialization of static member
unsigned num::obj_count = 0;
//Implementation of member functions
num::num() {
 flagged = false;
 ++obj_count; //simply declaring an array of class num will result in initialized values for n
 n = obj_count;
}
void num::setn(unsigned n) {
 this->n = n;
}
void num::setflagged(bool flagged) {
 this->flagged = flagged;
}
bool num::isflagged() {
 return flagged;
}
unsigned num::getn() {
 return n;
}
int main()
{
 unsigned n = 100;
 num arr[n];
 Eratosthenes_sieve(arr, n);
 return 0;
}
void Eratosthenes_sieve(num *arr, unsigned n) {
 unsigned i;
 for (i = 1; i < sqrt(n); ++i) { //minor optimization, as there will be no multiples or sqrt(arr[i]) that are lower than n
 if ( !(arr[i].isflagged()) ) { //if the number is not flagged, proceed to flag it
 for (unsigned j = 1; j < ( n / (arr[i].getn()) ); ++j) {
 arr[i + j * arr[i].getn()].setflagged(true); //flags every arr[i]-th number in the array.
 }
 }
 }
 cout << "Prime numbers up to " << n << ": \n";
 for (i = 1; i < n; ++i) {
 if ( !(arr[i].isflagged()) ) {
 cout << arr[i].getn() << " ";
 }
 }
}

Question 2

Do calculations before loops

 unsigned i;
 for (i = 1; i < sqrt(n); ++i) { //minor optimization, as there will be no multiples or sqrt(arr[i]) that are lower than n

Another optimization that you can do is to save the result of the sqrt(n), as that will be constant through all iterations of the loop.

 unsigned i = 1;
 double sqrt_n = sqrt(n);
 //minor optimization, as there will be no multiples or sqrt(arr[i]) that are lower than n
 for ( ; i < sqrt_n; ++i) {

I also moved the comment to its own line for readability.

 for (unsigned j = 1; j < ( n / (arr[i].getn()) ); ++j) {

And again

 for (unsigned j = 1, m = n / (arr[i].getn()); j < m; ++j) {

But we can do even better

 for (unsigned j = i + arr[i].getn(); j < n; j += arr[i].getn()) {

Note that this allows us to simplify the array dereference in

 arr[i + j * arr[i].getn()].setflagged(true); //flags every arr[i]-th number in the array.

Now becomes just

 //flags every arr[i]-th number in the array.
 arr[j].setflagged(true);

This changes a complex calculation to a simple one. I also find it clearer as to what it is doing than the original.

Avoid using namespace declarations

 cout << "Prime numbers up to " << n << ": \n";
 for (i = 1; i < n; ++i) {
 if ( !(arr[i].isflagged()) ) {
 cout << arr[i].getn() << " ";
 }
 }

In this example it is both shorter and more readable to say

 std::cout << "Prime numbers up to " << n << ": \n";
 for (i = 1; i < n; ++i) {
 if ( !(arr[i].isflagged()) ) {
 std::cout << arr[i].getn() << " ";
 }
 }

and save the using namespace std; declaration.

This is also better in general, as it reduces the chance of name clashes. While it is unlikely that someone else will try to create a function called cout, it is quite possible that they might want to create ones called swap or move. If you say std::swap, std::move, etc. every time, then it is much easier to see which function is being used.

Naming

It's generally easier to use the same naming convention throughout. You use two here: nospaces and snake_case. Personally, I think nospaces is horrid, so I'd standardize on snake_case. So is_flagged, set_flagged, set_n, get_n, etc.

To understand why I dislike nospaces, consider a variable nos_paces. Under nospaces, that would becomes nospaces. That's hard to read. None of your variables have that problem, but it is still easier to read with something separating the words. We don't have to think about where to split up the name. It just tells us.

Try to avoid names like num, arr, or isflagged in favor of names that describe what the variable holds in more detail than just the type. arr is an array of numbers, so call it numbers. isflagged returns true when n is a composite (not prime) value, so call it is_composite. num exists to associate a number with its primality, so call it Number_Primality (another suggestion given later but requires changing its type).

Plan code for reuse

 num arr[n];

What happens if you add a second array?

 num arr[n];
 num numbers[n];

Now numbers[0].getn() will return n which seems unlikely to be what you want to happen.

It's not really clear why you need an object here. An array of Booleans would hold the same information more efficiently and without the extra code. This wastes the first element of the array but is otherwise simpler. You would have to change a few < symbols to <= but the algorithm is basically the same.

 bool is_composite[n+1] = { false };

If you're doing this purely as a coding exercise, that's fine. There are however other problems that better use the expressiveness of objects.

Question 3

Is "n" really a good idea?

I don't like arr[i].getn(). The whole algorithm relies on the fact that arr[i].getn() is exactly i+1. If this is ever not true, then the sieve will not work. Given that the array index is inextricably linked to the value of n, I wouldn't even bother with n at all.

Using constant expressions in loop conditions

In two places, you use constant expressions in your loop condition:

for (i = 1; i < sqrt(n); ++i) // sqrt() on every loop
for (unsigned j = 1; j < ( n / (arr[i].getn()) ); ++j) // division on every loop

What's bad about this is that you will be calling sqrt() or doing a division on every loop, even though the value of the expression will never change. It's possible the compiler could optimize this for you, but it's not a certainty. It's easy enough to just move the expression out of the loop yourself:

int sqrtN = sqrt(n);
for (i = 1; i < sqrtN; ++i)

Simplifying the loop

Even after getting rid of getn(), the second loop can still be simplified. The current loop is this:

for (unsigned j = 1; j < (n / (i+1)); ++j) {
 arr[i + j * (i+1)].setflagged(true);
}

It can be simplified to this:

for (unsigned j = i+(i+1); j < n; j += (i+1)) {
 arr[j].setflagged(true);
}

In other words, instead of computing the array index using a multiplication, you can just have j be the array index. This way avoids any multiplications and divisions and in my opinion is also simpler to read. It would be even simpler if arr[i] stood for the number i instead of i+1.

Question 4

If I wanted to make arr[i] to stand for the number i instead of i+1, does that mean I will have to start from i=1, instead of i=0?

Question 5

@StefanRendevski I think you would need to start at i=2, the first prime number. Right now you are starting from i=1, which is the number 2.

mdfst13 mdfst13 22.4k6 gold badges34 silver badges70 bronze badges · Accepted Answer · 2015-06-02 00:10:24Z

Do calculations before loops

 unsigned i;
 for (i = 1; i < sqrt(n); ++i) { //minor optimization, as there will be no multiples or sqrt(arr[i]) that are lower than n

Another optimization that you can do is to save the result of the sqrt(n), as that will be constant through all iterations of the loop.

 unsigned i = 1;
 double sqrt_n = sqrt(n);
 //minor optimization, as there will be no multiples or sqrt(arr[i]) that are lower than n
 for ( ; i < sqrt_n; ++i) {

I also moved the comment to its own line for readability.

 for (unsigned j = 1; j < ( n / (arr[i].getn()) ); ++j) {

And again

 for (unsigned j = 1, m = n / (arr[i].getn()); j < m; ++j) {

But we can do even better

 for (unsigned j = i + arr[i].getn(); j < n; j += arr[i].getn()) {

Note that this allows us to simplify the array dereference in

 arr[i + j * arr[i].getn()].setflagged(true); //flags every arr[i]-th number in the array.

Now becomes just

 //flags every arr[i]-th number in the array.
 arr[j].setflagged(true);

This changes a complex calculation to a simple one. I also find it clearer as to what it is doing than the original.

Avoid using namespace declarations

 cout << "Prime numbers up to " << n << ": \n";
 for (i = 1; i < n; ++i) {
 if ( !(arr[i].isflagged()) ) {
 cout << arr[i].getn() << " ";
 }
 }

In this example it is both shorter and more readable to say

 std::cout << "Prime numbers up to " << n << ": \n";
 for (i = 1; i < n; ++i) {
 if ( !(arr[i].isflagged()) ) {
 std::cout << arr[i].getn() << " ";
 }
 }

and save the using namespace std; declaration.

This is also better in general, as it reduces the chance of name clashes. While it is unlikely that someone else will try to create a function called cout, it is quite possible that they might want to create ones called swap or move. If you say std::swap, std::move, etc. every time, then it is much easier to see which function is being used.

Naming

It's generally easier to use the same naming convention throughout. You use two here: nospaces and snake_case. Personally, I think nospaces is horrid, so I'd standardize on snake_case. So is_flagged, set_flagged, set_n, get_n, etc.

To understand why I dislike nospaces, consider a variable nos_paces. Under nospaces, that would becomes nospaces. That's hard to read. None of your variables have that problem, but it is still easier to read with something separating the words. We don't have to think about where to split up the name. It just tells us.

Try to avoid names like num, arr, or isflagged in favor of names that describe what the variable holds in more detail than just the type. arr is an array of numbers, so call it numbers. isflagged returns true when n is a composite (not prime) value, so call it is_composite. num exists to associate a number with its primality, so call it Number_Primality (another suggestion given later but requires changing its type).

Plan code for reuse

 num arr[n];

What happens if you add a second array?

 num arr[n];
 num numbers[n];

Now numbers[0].getn() will return n which seems unlikely to be what you want to happen.

It's not really clear why you need an object here. An array of Booleans would hold the same information more efficiently and without the extra code. This wastes the first element of the array but is otherwise simpler. You would have to change a few < symbols to <= but the algorithm is basically the same.

 bool is_composite[n+1] = { false };

If you're doing this purely as a coding exercise, that's fine. There are however other problems that better use the expressiveness of objects.

Stack Exchange Network

Object-oriented approach to sieve of Eratosthenes

2 Answers 2

Do calculations before loops

Avoid using namespace declarations

Naming

Plan code for reuse

Is "n" really a good idea?

Using constant expressions in loop conditions

Simplifying the loop

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Object-oriented approach to sieve of Eratosthenes

2 Answers 2

Do calculations before loops

Avoid using namespace declarations

Naming

Plan code for reuse

Is "n" really a good idea?

Using constant expressions in loop conditions

Simplifying the loop

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions