Standard deviation in one pass in C++ - follow-up

Question 1

(See the previous and initial iteration.)

This time, I use tag dispatching in order to make sure that non-random access iterators are not put into a std::distance, which would take \$\Theta(n)\$ time to complete, but instead, computes the distance while making the only pass over the data. See what I have now:

coderodde_sd.h

#ifndef CODERODDE_SD_H
#define CODERODDE_SD_H
#include <cmath>
#include <iterator>
#include <sstream>
#include <stdexcept>
#define DEBUG
#ifdef DEBUG
#include <iostream>
using std::cout;
#endif
namespace net {
 namespace coderodde {
 namespace stat {
 template<typename RandomAccessIter>
 double sd(RandomAccessIter begin, 
 RandomAccessIter end, 
 std::random_access_iterator_tag)
 {
 #ifdef DEBUG
 cout << "[DEBUG] RandomAccessIter version of sd is "
 "running. ";
 #endif
 auto distance = std::distance(begin, end);
 if (distance < 2)
 {
 std::stringstream ss;
 ss << "The standard deviation cannot be computed for "
 "less than two elements. The input sequence has "
 << distance 
 << (distance == 1 ? " element." : " elements.");
 throw std::runtime_error(ss.str());
 }
 double x = 0.0;
 double x_squared = 0.0;
 for (RandomAccessIter it = begin; it != end; ++it) 
 {
 x += *it;
 x_squared += (*it) * (*it);
 }
 return std::sqrt((x_squared - (x * x) / distance) / 
 (distance - 1)
 );
 }
 template<typename ForwardIter>
 double sd(ForwardIter begin, 
 ForwardIter end, 
 std::forward_iterator_tag) 
 {
 #ifdef DEBUG
 cout << "[DEBUG] ForwardIter version of sd is running. "; 
 #endif 
 double x = 0.0;
 double x_squared = 0.0;
 size_t distance = 0;
 for (ForwardIter iter = begin; iter != end; ++iter) 
 {
 x += *iter;
 x_squared += (*iter) * (*iter);
 ++distance;
 }
 if (distance < 2) 
 {
 std::stringstream ss;
 ss << "The standard deviation cannot be computed for "
 "less than two elements. The input sequence has "
 << distance
 << (distance == 1 ? " element." : " elements.");
 throw std::runtime_error(ss.str());
 }
 return std::sqrt((x_squared - (x * x) / distance) /
 (distance - 1)
 );
 }
 template<typename Iter>
 double sd(Iter begin, Iter end) 
 {
 return sd(begin,
 end, 
 typename std::iterator_traits<Iter>
 ::iterator_category());
 }
 } /* net::coderodde::stat */
 } /* net::coderodde */
} /* net */
#endif /* CODERODDE_SD_H */

main.cpp:

#include <iostream>
#include <list>
#include <vector>
#include "coderodde_sd.h"
using net::coderodde::stat::sd;
using std::cout;
using std::list;
using std::vector;
int main(int argc, char** argv) {
 double bad_array[]{1.0};
 try 
 {
 sd(bad_array, bad_array);
 }
 catch (std::runtime_error& error)
 {
 cout << "ERROR: " << error.what() << "\n";
 }
 try 
 {
 sd(bad_array, bad_array + 1);
 }
 catch (std::runtime_error& error)
 {
 cout << "ERROR: " << error.what() << "\n";
 }
 list<int> my_list = { 1, 5, 2, 4, 3 };
 vector<int> my_vector { 3, 4, 2, 5, 1 };
 int my_array[] = { 2, 4, 5, 3, 1 };
 cout << "Standard deviation (list): " 
 << sd(my_list.begin(), my_list.end())
 << "\n";
 cout << "Standard deviation (vector): "
 << sd(my_vector.begin(), my_vector.end())
 << "\n";
 cout << "Standard deviation (array): "
 << sd(my_array, my_array + sizeof(my_array) / sizeof(*my_array))
 << "\n";
 return 0;
}

Critique request

I want to improve this even more. Please tell me anything that comes to mind.

Question 2

first of all you need to remove the code duplication.

Question 3

Potentially off-topic if this is purely a coding exercise, but this is a good example of why you should use a library routine rather than rolling your own numerical code! "The sum of the squares less the square of the sum" is numerically inaccurate when you are dealing with a dataset that has a small deviation around a relatively large mean.

Question 4

If you want to see some analysis of nigel's comment, I suggest you look at Incremental calculation of weighted mean and variance; section 3 in particular.

Question 5

I would remove the code duplication like this:

 double compute_sd(double x, double x_sqr, double n){
 if (distance < 2) {
 std::stringstream ss;
 ss << "The standard deviation cannot be computed for "
 "less than two elements. The input sequence has "
 << distance 
 << (distance == 1 ? " element." : " elements.");
 throw std::runtime_error(ss.str());
 }
 return std::sqrt((x_sqr - (x * x) / n) / (n - 1));
 }
 template<typename RandomAccessIter>
 double sd(RandomAccessIter begin, 
 RandomAccessIter end, 
 std::random_access_iterator_tag)
 {
 #ifdef DEBUG
 cout << "[DEBUG] RandomAccessIter version of sd is "
 "running. ";
 #endif
 auto distance = std::distance(begin, end);
 double x = 0.0;
 double x_squared = 0.0;
 for (RandomAccessIter it = begin; it != end; ++it) {
 x += *it;
 x_squared += (*it) * (*it);
 }
 return compute_sd(x, x_squared, distance);
 }
 template<typename ForwardIter>
 double sd(ForwardIter begin, 
 ForwardIter end, 
 std::forward_iterator_tag) 
 {
 #ifdef DEBUG
 cout << "[DEBUG] ForwardIter version of sd is running. "; 
 #endif 
 double x = 0.0;
 double x_squared = 0.0;
 size_t distance = 0;
 for (ForwardIter iter = begin; iter != end; ++iter) {
 x += *iter;
 x_squared += (*iter) * (*iter);
 ++distance;
 }
 return compute_sd(x, x_squared, distance);
 }

Note that for the random access iterator the sum and sum of squares are calculated before the distance < 2 check. I'm expecting an optimizing compiler to see that distance doesn't depend on the value of the loop and move the check earlier. Even if it doesn't the < 2 case should be rare in a well designed application.

I am also wondering if the there is an actual measurable performance difference between the two implementations to warrant the extra complexity. I'm thinking that the distance incrementation is performed simultaneously with the floating point ops in the ALU on a super scalar CPU, so the increment doesn't affect the loop latency. Asssuming a typical x86 super scalar CPU. Measure, Measure, Measure!

Question 6

Your two algorithms are almost identical, and parts of it can be factored out into their own functions. For example,

void throw_if_too_small(size_t distance) 
{
 if (distance < 2)
 {
 std::stringstream ss;
 ss << "The standard deviation cannot be computed for "
 "less than two elements. The input sequence has "
 << distance 
 << (distance == 1 ? " element." : " elements.");
 throw std::runtime_error(ss.str());
 }
}
template <typename Iter>
double compute_sd(Iter begin, Iter end)
{
 double x = 0.0;
 double x_squared = 0.0;
 for (auto it = begin; it != end; ++it) 
 {
 x += *it;
 x_squared += (*it) * (*it);
 }
 return std::sqrt((x_squared - (x * x) / distance) / 
 (distance - 1)
 );
}

Then you have to make the ForwardIter version a bit uglier (or you can put extra code into compute_sd, but I think I'd prefer changing the ForwardIter version.

template<typename RandomAccessIter>
double sd(RandomAccessIter begin, 
 RandomAccessIter end, 
 std::random_access_iterator_tag)
{
 #ifdef DEBUG
 cout << "[DEBUG] RandomAccessIter version of sd is "
 "running. ";
 #endif
 auto distance = std::distance(begin, end);
 throw_if_too_small(distance);
 return compute_sd(begin, end);
}
template<typename ForwardIter>
double sd(ForwardIter begin, 
 ForwardIter end, 
 std::forward_iterator_tag) 
{
 #ifdef DEBUG
 cout << "[DEBUG] ForwardIter version of sd is running. "; 
 #endif 
 size_t distance = 0;
 for (auto iter = begin; iter != end || distance < 2; ++iter) 
 {
 ++distance;
 }
 throw_if_too_small(distance);
 return compute_sd(begin, end);
}

There are a few ways you could handle the ForwardIter version - This is probably the most obvious solution. I think a better one is to push the tag identification down a layer, into a distance_at_least_two function, and then specialize there.

template <typename ForwardIter>
bool distance_at_least_two(ForwardIter begin, ForwardIter end, std::forward_iterator_tag)
{
 size_t distance = 0;
 for (auto iter = begin; iter != end || distance < 2; ++iter) 
 {
 ++distance;
 }
 return distance == 2;
}
template <typename RandomAccessIter>
bool distance_at_least_two(RandomAccessIter begin, RandomAccessIter end, std::random_access_iterator_tag)
{
 return std::distance(begin, end) >= 2;
}

and then the calling function can change into this

template<typename Iter>
double sd(Iter begin, Iter end) 
{
 if (distance_at_least_two(begin, end, std::iterator_traits<Iter>::iterator_category()))
 {
 return compute_sd(begin, end);
 }
 else
 {
 std::stringstream ss;
 ss << "The standard deviation cannot be computed for "
 "less than two elements. The input sequence has "
 << distance 
 << (distance == 1 ? " element." : " elements.");
 throw std::runtime_error(ss.str());
 }
}

And you then don't need two separate sd specializations. You could probably also play with std::enable_if to avoid explicitly passing the iterator tag.

I don't like the chunk of debug code there - that sort of logging/debugging seems unnecessary. You could also DRY it out and hide the macro ugliness too.

void debug(std::string name)
{
 #ifdef DEBUG
 cout << "[DEBUG " << name << "version of sd is running. ";
 #endif
}

You could also have it take an iterator tag as the parameter, and then overload the function and just hardcode the name. That smells worse to me, but just a thought.

I mentioned it before so I won't explain it again, but with a little extra work you could write a more functional solution, which are (imo) usually easier to read than explicit for loops.

Question 7

would readable use range for in function sd. for (const auto &el :{ begin,end}) {...

Question 8

I do not agree with your rationale of not calling std::distance: computing distance manually imposes the same \$\Theta(n)\$ penalty. However, the ForwardIterator requirement is too strong. Your algorithm works with InputIterator as well, and using them indeed justifies manual distance tracking.

I also recommend to rename x and x_squared (the latter being especially misleading) to sum and sum_squares.

Emily L. Emily L. 16.7k1 gold badge39 silver badges89 bronze badges · Accepted Answer · 2016-07-14 21:21:54Z

I would remove the code duplication like this:

 double compute_sd(double x, double x_sqr, double n){
 if (distance < 2) {
 std::stringstream ss;
 ss << "The standard deviation cannot be computed for "
 "less than two elements. The input sequence has "
 << distance 
 << (distance == 1 ? " element." : " elements.");
 throw std::runtime_error(ss.str());
 }
 return std::sqrt((x_sqr - (x * x) / n) / (n - 1));
 }
 template<typename RandomAccessIter>
 double sd(RandomAccessIter begin, 
 RandomAccessIter end, 
 std::random_access_iterator_tag)
 {
 #ifdef DEBUG
 cout << "[DEBUG] RandomAccessIter version of sd is "
 "running. ";
 #endif
 auto distance = std::distance(begin, end);
 double x = 0.0;
 double x_squared = 0.0;
 for (RandomAccessIter it = begin; it != end; ++it) {
 x += *it;
 x_squared += (*it) * (*it);
 }
 return compute_sd(x, x_squared, distance);
 }
 template<typename ForwardIter>
 double sd(ForwardIter begin, 
 ForwardIter end, 
 std::forward_iterator_tag) 
 {
 #ifdef DEBUG
 cout << "[DEBUG] ForwardIter version of sd is running. "; 
 #endif 
 double x = 0.0;
 double x_squared = 0.0;
 size_t distance = 0;
 for (ForwardIter iter = begin; iter != end; ++iter) {
 x += *iter;
 x_squared += (*iter) * (*iter);
 ++distance;
 }
 return compute_sd(x, x_squared, distance);
 }

Note that for the random access iterator the sum and sum of squares are calculated before the distance < 2 check. I'm expecting an optimizing compiler to see that distance doesn't depend on the value of the loop and move the check earlier. Even if it doesn't the < 2 case should be rare in a well designed application.

I am also wondering if the there is an actual measurable performance difference between the two implementations to warrant the extra complexity. I'm thinking that the distance incrementation is performed simultaneously with the floating point ops in the ALU on a super scalar CPU, so the increment doesn't affect the loop latency. Asssuming a typical x86 super scalar CPU. Measure, Measure, Measure!

Stack Exchange Network

Standard deviation in one pass in C++ - follow-up

3 Answers 3

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Hot Network Questions

Standard deviation in one pass in C++ - follow-up

3 Answers 3

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Related

Hot Network Questions