I wrote a program which computes the harmonic partial sum to N terms with multithreading capability. I've been working on this to sharpen my C++ skills for the upcoming semester. Just wondering if there's anything that doesn't follow the C++ standard and/or glaringly wrong. Also wondering if I can make any further optimizations.
Basically I have my program set up to only use threads if the input is divisible by 4, since I am using 4 threads due to my quad core CPU. I want this to be more flexible but I haven't thought of a better solution yet. If anyone has a suggestions, that'd be appreciated.
#include <iostream>
#include <cstdio>
#include <thread>
#include <ctime>
#include <cassert>
using namespace std;
const int NUM_THREADS = 4;
// H(N) is defined to be the harmonic partial sum to N terms.
// calculateHarmonic(): function given to threads to compute H(N).
// unsigned long start: starting point for the loop; different for each thread.
// unsigned long N: amount of terms to compute; different for each thread; divided evenly by # of threads.
// double *sum: pointer to a double which holds a thread's individual sum for H(N).
void calculateHarmonic(long long start, long long N, double *sum)
{
for(; start <= N; start++)
{
*sum += 1.0/start;
}
}
void executeWithThreads(long long N)
{
thread t[NUM_THREADS]; // using four threads
double sum[NUM_THREADS] = {0.0}; // array of each sum calculated by each thread
double finalSum = 0.0;
clock_t start = clock();
// Assign threads their task.
for(int i = 0; i < NUM_THREADS; i++)
{
// thread([name of function], [starting point], [N], [address of individual element in sum array])
t[i] = thread(calculateHarmonic, (((N/4)*i)+1), ((N/4)*(i+1)), &(sum[i]));
}
// Join threads to terminate and sum up all individual sums into finalSum.
for(int i = 0; i < NUM_THREADS; i++)
{
t[i].join();
finalSum += sum[i];
}
clock_t end = clock();
printf("Harmonic sum of %lld is %1.18lf\n", N, finalSum);
printf("Calculation took %1.18lf seconds.\n", (((double)(end-start)/CLOCKS_PER_SEC))/NUM_THREADS);
}
void executeNoThreads(long long N)
{
double sum = 0.0;
clock_t start = clock();
for(int i = 1; i <= N; i++)
{
sum += 1.0/i;
}
clock_t end = clock();
printf("Harmonic sum of %lld is %1.18lf\n", N, sum);
printf("Calculation took %1.18lf seconds.\n", (((double)(end-start)/CLOCKS_PER_SEC))/NUM_THREADS);
}
int main()
{
long long N;
cout << "Enter the amount of terms to calculate the Harmonic sum to: ";
cin >> N;
assert(N > 0 && "N must be a nonzero positive integer.");
cout << endl;
if(N > NUM_THREADS && N % 4 == 0)
{
cout << "Calculating with threads..." << endl;
executeWithThreads(N);
}
else if(N > NUM_THREADS && N % 4 != 0)
{
cout << "Calculating without threads..." << endl;
executeNoThreads(N);
}
else
{
cout << "Calculating without threads due to small N..." << endl;
executeNoThreads(N);
}
return 0;
}
Compiled with:
g++ -std=c++11 -pthread -Ofast harm_sum.cpp -o harm.out ; ./harm.out
2 Answers 2
Try not to use
using namespace std
in global scope. If you must use it, keep it in local scope, such as in a function. More info about that here.I don't need any need to use
printf()
overstd::cout
here. You are also using C-style casting and nostd::
forclock_t
, so some of this code still looks C-like.If you're using
printf()
because you're having trouble with formatting, then take a look at the<iomanip>
library for something useful.Unless you're debugging,
assert
may be unnecessary. You can easily output an error message tostd::cerr
and returnEXIT_FAILURE
frommain()
.The value
4
used inmain()
appears to be a magic number and should be a constant.Each use of
std::endl
flushes the buffer and takes extra time. Luckily, you're not using it in the timed section. It's still not needed in most cases and you can also output a newline with"\n"
, which doesn't do a flush as well. More info about that here.
Generally, you should prefer to use std::async
over naked use of threads. There are a few reasons for this: firstly, std::async
can actually return a value (wrapped in a std::future
), which means less mucking about with passing out parameters by pointer/reference, and secondly because (unless explicitly specified with launch flags) it will leave it up to the runtime to coordinate and dispatch threads appropriately.
This would look something like:
double calculate_harmonic(long long start, long long N)
{
double sum = 0.0;
for (; start <= N; ++start)
{
sum += 1.0 / start;
}
return sum;
}
int main()
{
const std::size_t n_async = 4;
const std::size_t N = 150;
std::array<std::future<double>, n_async> futures;
for (auto i = 0U; i < n_async; ++i) {
futures[i] = std::async(calculate_harmonic, ((N / 4) * i) + 1,
((N / 4) * (i + 1)));
}
double final_sum = 0.0;
for (auto&& future : futures) {
final_sum += future.get();
}
std::cout << "final sum: " << final_sum << "\n";
}
Note that this also cleans up having to do annoying things like call join()
.
The other nitpick is that for large harmonic series, you likely want to start with the smallest value and work your way backwards. The reason for this is because of the wonderful complexities of floating point arithmetic, but basically, floating point is not associative: a + b is not guaranteed to be equal to b + a for all floating point operands. To preserve as much precision as possible, you generally want to start working from smallest to largest value.
-
1\$\begingroup\$ Good answer! I'd suggest that you could set the value of
n_async = std::thread::hardware_concurrence()
and then reset it to a fallback value of1
ifhardware_concurrence()
returns0
. \$\endgroup\$Edward– Edward2015年01月19日 14:34:36 +00:00Commented Jan 19, 2015 at 14:34 -
\$\begingroup\$ @Yuushi Thanks for the response. I've spent some time researching and implementing what you've suggested. I'm sort of struggling to grasp
auto&&
actually is. Do you mind explaining it? \$\endgroup\$ctzdev– ctzdev2015年01月19日 22:16:56 +00:00Commented Jan 19, 2015 at 22:16 -
1\$\begingroup\$ :) Sorry for the typo! Looks like I made the same error twice, so I hope it didn't waste too much of your time. \$\endgroup\$Edward– Edward2015年01月19日 22:56:58 +00:00Commented Jan 19, 2015 at 22:56
-
1\$\begingroup\$ @ChrisTarazi
auto&&
is a bit tricky, but effectively it can bind to an lvalue, a const lvalue, or an rvalue. In this case you can replace it with(auto& future : futures)
if that's more understandable to you. \$\endgroup\$Yuushi– Yuushi2015年01月20日 00:40:51 +00:00Commented Jan 20, 2015 at 0:40 -
1
clock
functions now that we have<chrono>
. \$\endgroup\$