Deviation and Variance

Question 1

This assignment requires obtaining the deviation and variance of n numbers stored in a single array. Is there any way to make this code more efficient?

#include <iostream>
#include <cmath>
using namespace std;
int main()
{
 const int arrSize = 14;
 double average , sum = 0 , dev = 0;
 double deviation[arrSize];
 int grades[arrSize] = { 89, 95, 72, 83, 99, 54, 86, 75, 92, 73, 79, 75, 82, 73 };
 // Calculating the average
 for ( int i = 0; i < arrSize; i++ )
 {
 sum += grades[i];
 average = ( sum / arrSize );
 }
 cout << "average is : " << average << endl;
 cout << endl;
 //Calculating the deviation and variance
 double sumVar = 0 , totalVar = 0;
 for ( int i = 0; i < arrSize; i++ )
 {
 deviation[i] = ( grades[i] - average );
 cout << "Grades = " << grades[i] << " and deviation is = " << deviation[i] << endl;
 double variance = pow( deviation[i] , 2 );
 sumVar = sumVar + variance;
 totalVar = ( sumVar / arrSize );
 }
 cout << endl;
 cout << "variance = " << totalVar << endl;
 system( "pause" );
 return 0;
}

Question 2

Efficiency

Yes, this can be improved.

One obvious problem is that you're recomputing your average and your variance on every iteration of a loop, even though only the last result (after the final iteration) is ever used.

For example:

for (int i = 0; i < arrSize; i++)
{
 sum += grades[i];
 average = (sum / arrSize);
}

You're computing average on every iteration, but only need or use the last value you compute. You can compute it once with code like:

for (int i = 0; i < arrSize; i++)
{
 sum += grades[i];
}
average = (sum / arrSize);

Your computation of the variance has the same problem: totalVar is recomputed on every iteration, but only the final value is ever used.
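As a rough sketch of the same fix applied to the second loop, the division can be moved out of the loop body. The function name populationVariance and its signature are made up for illustration, and the sketch assumes count is greater than zero and that the mean has already been computed:

```cpp
#include <cstddef>

// Population variance of `count` grades, given their precomputed mean.
// The name and signature are illustrative, not taken from the question.
// Assumes count > 0.
double populationVariance(const int* grades, std::size_t count, double average)
{
    double sumOfSquares = 0.0;
    for (std::size_t i = 0; i < count; ++i)
    {
        const double deviation = grades[i] - average;
        sumOfSquares += deviation * deviation;
    }
    // Divide once, after the loop, rather than on every iteration.
    return sumOfSquares / count;
}
```

In the original program this would be called as populationVariance(grades, arrSize, average) once the average is known.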

Use of pow

I'd avoid using pow to compute a square. It often imposes quite a bit of overhead, so pow(deviation[i], 2) will often be substantially slower than deviation[i]*deviation[i].
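If writing the expression twice feels clumsy, one option is a tiny helper. The name square below is hypothetical and is only meant to show the idea:

```cpp
// Squares a value with a single multiplication instead of calling std::pow.
// `square` is a hypothetical helper name, not part of the original code.
inline double square(double x)
{
    return x * x;
}
```

With that in place, the loop body could read sumVar += square(deviation[i]);. Whether this is measurably faster than pow depends on the compiler and its optimization settings.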

Formatting

Looking at the code more generally, you really need to fix your indentation.

`std::endl`

I would advise against using std::endl. Normally, you just want '\n', which also gives you a new line, but will nearly always be (much) faster, because std::endl also flushes the stream every time it is used. In the case above, it won't make much difference, but if you're writing a lot of data to a file (for example) the difference can get very large, very quickly (e.g., a slowdown of 8:1 or 10:1 is fairly typical).

Question 3

Although an answer has already been accepted, I would like to add that if this is indeed C++, you should definitely use containers (std::vector in this case).

That also lets you use range-based for loops, which condense the code a little. You can also use the standard algorithms, such as std::accumulate from <numeric>.

std::vector<int> grades = { 89, 95, 72, 83, 99, 54, 86, 75, 92, 73, 79, 75, 82, 73 };
// Calculating the average (std::accumulate is declared in <numeric>)
// If you omit the static_cast, the division is done in integers and the fractional part is discarded
double average = static_cast<double>(std::accumulate(grades.begin(), grades.end(), 0)) / grades.size();
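To show how the range-based loop fits in, here is one possible sketch of the whole calculation on a std::vector. The function names meanOf and varianceOf are made up for this example, and both assume a non-empty vector. Passing 0.0 as the initial value makes std::accumulate sum in double, so no cast is needed:

```cpp
#include <numeric>
#include <vector>

// Mean of the grades; the 0.0 initial value keeps the sum in double.
// Assumes `grades` is non-empty.
double meanOf(const std::vector<int>& grades)
{
    return std::accumulate(grades.begin(), grades.end(), 0.0) / grades.size();
}

// Population variance, computed with a range-based for loop.
// Assumes `grades` is non-empty.
double varianceOf(const std::vector<int>& grades)
{
    const double average = meanOf(grades);
    double sumOfSquares = 0.0;
    for (int grade : grades)
    {
        const double deviation = grade - average;
        sumOfSquares += deviation * deviation;
    }
    return sumOfSquares / grades.size();
}
```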

Question 4

using namespace std;

It is generally considered a poor idea to import all names from a namespace (with some exceptions, such as the namespaces for literals, or if you are actually implementing that namespace). It creates a risk of identifier clashes that could bite you when you compile against future C++ standards.

double dev = 0;

Unused variable.

system( "pause" );

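This is not portable. It relies on the shell having a pause command; where there isn't one, the call just prints:

sh: 1: pause: not found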

And you didn't include <cstdlib> for this, anyway.

Arithmetic: there's a subtle problem in accumulating int values into a double, which you won't see on such small input sets. As the double re-scales itself, less and less of each int is significant to the sum. For large input sets, you may need to accumulate into a long (or unsigned long if your inputs can't be negative) and only add that to sum when it's about to overflow:

long acc = 0;
for ( int i = 0; i < arrSize; i++ )
{
 if (grades[i] > LONG_MAX-acc) {
 sum += acc;
 acc = 0;
 }
 acc += grades[i];
}
sum += acc;

(You'll need to include <climits> to define LONG_MAX.)

General: it's a good idea to separate your printing from your algorithm. Create a function to do the computation; then your main() should call it appropriately and print the output. To get you started:

std::pair<double, double> mean_and_variance(const std::vector<int>& values);
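One way that declaration might be filled in is sketched below. It is not the only reasonable implementation: it assumes values is non-empty, returns the population variance (dividing by n rather than n - 1), and uses a plain double sum rather than the overflow-aware accumulation described above:

```cpp
#include <utility>
#include <vector>

// Returns {mean, population variance}. Assumes `values` is non-empty.
std::pair<double, double> mean_and_variance(const std::vector<int>& values)
{
    double sum = 0.0;
    for (int value : values)
    {
        sum += value;
    }
    const double mean = sum / values.size();

    double sumOfSquares = 0.0;
    for (int value : values)
    {
        const double deviation = value - mean;
        sumOfSquares += deviation * deviation;
    }
    return { mean, sumOfSquares / values.size() };
}
```

main() can then call the function once and print result.first and result.second, which keeps all the output in one place.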

Accepted answer: the efficiency answer above, by Jerry Coffin · 2016-12-15 02:43:08Z
