We were discussing Standard Deviations in chat, so I decided to write a standard deviation calculator:
static double PopulationStandardDeviation(List<double> numberSet)
{
    double mean = numberSet.Average();
    return Math.Sqrt(numberSet.Sum(x => Math.Pow(x - mean, 2)) / numberSet.Count);
}

static double SampleStandardDeviation(List<double> numberSet)
{
    double mean = numberSet.Sum() / numberSet.Count;
    return Math.Sqrt(numberSet.Sum(x => Math.Pow(x - mean, 2)) / (numberSet.Count - 1));
}
How could these be improved? Because they are both standard deviation calculators, should I combine them into one method with a header of static double StandardDeviation(List<double> numberSet, bool isSample)? Are they an optimal solution for the problem?
3 Answers
It's good to have 2 methods. For the reason that @britishtea said in a comment, which is so spot-on I'm just going to quote it verbatim:
A method that takes a boolean value as a parameter and behaves differently depending on that value is not one method, but two methods! You should keep them separated. You could, of course, implement them using shared functionality, but your public API should offer two methods. :) – @britishtea
And indeed, this is how the .NET Framework does it too (see StandardDeviation() and StandardDeviationP(), thanks @200_success).
At the same time, don't repeat yourself. Extract the common logic to a private helper method, and rewrite your public methods in terms of the private one:
// Shared logic: the two public methods differ only in the divisor.
private static double StandardDeviation(List<double> numberSet, double divisor)
{
    double mean = numberSet.Average();
    return Math.Sqrt(numberSet.Sum(x => Math.Pow(x - mean, 2)) / divisor);
}

static double PopulationStandardDeviation(List<double> numberSet)
{
    return StandardDeviation(numberSet, numberSet.Count);
}

static double SampleStandardDeviation(List<double> numberSet)
{
    return StandardDeviation(numberSet, numberSet.Count - 1);
}
Lastly, the parameter name "numberSet" is a misnomer, suggesting a Set, when it's really a List.
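As a quick check of the refactoring above, here is a small self-contained sketch that runs both public methods on a data set chosen purely for illustration (it is not from the original post). The values 2, 4, 4, 4, 5, 5, 7, 9 have a mean of 5 and squared deviations that sum to 32, so the population result is sqrt(32 / 8) = 2 and the sample result is sqrt(32 / 7), roughly 2.138. The class name StdDevDemo and the parameter name numbers are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class StdDevDemo
{
    // Shared helper; only the divisor differs between the two public methods.
    private static double StandardDeviation(List<double> numbers, double divisor)
    {
        double mean = numbers.Average();
        return Math.Sqrt(numbers.Sum(x => Math.Pow(x - mean, 2)) / divisor);
    }

    public static double PopulationStandardDeviation(List<double> numbers)
    {
        return StandardDeviation(numbers, numbers.Count);
    }

    public static double SampleStandardDeviation(List<double> numbers)
    {
        return StandardDeviation(numbers, numbers.Count - 1);
    }

    static void Main()
    {
        // Illustrative data: mean 5, squared deviations sum to 32.
        var data = new List<double> { 2, 4, 4, 4, 5, 5, 7, 9 };

        Console.WriteLine(PopulationStandardDeviation(data)); // sqrt(32 / 8) = 2
        Console.WriteLine(SampleStandardDeviation(data));     // sqrt(32 / 7), roughly 2.138
    }
}
```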
double mean = numberSet.Average();
double mean = numberSet.Sum() / numberSet.Count;
I would stick to using numberSet.Average() in both cases.
Performance-wise, use (x - mean) * (x - mean) instead of Math.Pow(x - mean, 2).
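A minimal sketch of that change, applied to a shared helper like the one in the first answer. The class name Deviation and the parameter name numbers are placeholders for the example. How much time this saves depends on the runtime and the data, so it is worth measuring before relying on it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class Deviation
{
    // Squares each difference by plain multiplication
    // rather than calling Math.Pow(x - mean, 2).
    private static double StandardDeviation(List<double> numbers, double divisor)
    {
        double mean = numbers.Average();
        return Math.Sqrt(numbers.Sum(x => (x - mean) * (x - mean)) / divisor);
    }

    public static double PopulationStandardDeviation(List<double> numbers)
    {
        return StandardDeviation(numbers, numbers.Count);
    }

    public static double SampleStandardDeviation(List<double> numbers)
    {
        return StandardDeviation(numbers, numbers.Count - 1);
    }
}
```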
If performance is important, also don't use LINQ. – Joel Mueller, Feb 16, 2022 at 3:28
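One way to read that suggestion is to replace Average() and Sum() with plain for loops over the list. The sketch below is an illustration under that assumption, not the commenter's own code; the class name LoopDeviation and the parameter name numbers are made up for the example.

```csharp
using System;
using System.Collections.Generic;

static class LoopDeviation
{
    private static double StandardDeviation(List<double> numbers, double divisor)
    {
        // First pass: accumulate the total to get the mean.
        double total = 0.0;
        for (int i = 0; i < numbers.Count; i++)
        {
            total += numbers[i];
        }
        double mean = total / numbers.Count;

        // Second pass: accumulate the squared differences from the mean.
        double sumOfSquares = 0.0;
        for (int i = 0; i < numbers.Count; i++)
        {
            double difference = numbers[i] - mean;
            sumOfSquares += difference * difference;
        }

        return Math.Sqrt(sumOfSquares / divisor);
    }

    public static double PopulationStandardDeviation(List<double> numbers)
    {
        return StandardDeviation(numbers, numbers.Count);
    }

    public static double SampleStandardDeviation(List<double> numbers)
    {
        return StandardDeviation(numbers, numbers.Count - 1);
    }
}
```

One behavioral difference to keep in mind: Average() throws an InvalidOperationException for an empty list, whereas this loop version divides zero by zero and returns NaN.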
double? stdDev = EntityFunctions.StandardDeviation(new List<double> { 1, 2, 3 }); will throw an exception.