I have a set of functions in my program for conveniently generating random number in various distributions. I'm interested in the questions listed below, though other comments are also welcome.
- Am I using C++'s random tools correctly and clearly?
- From watching rand() Considered Harmful and reading this answer, I learned that RNG calls are not thread-safe. I changed my code just by adding the specifier
thread_local
to the generator and distribution variables. Is this all that's needed to make the functions thread-safe? These are called withinstd::async
procedures in my project. - Does it matter whether one
std::mt19937_64
generator is used for all the functions or if each function has its own copy?
In Random.h
#ifndef RANDOM_H
#define RANDOM_H
#include <random>
#include <algorithm>
namespace Random
{
// Random number with normal distribution and mean of zero
double random_normal(double standard_deviation);
// Random number with inclusive range
double random_real(double min, double max);
int random_integer(int min, int max);
// Return true with probability 50%
bool coin_flip();
// Return true with given probability
bool success_probability(double probability);
namespace
{
thread_local std::mt19937_64 generator(std::random_device{}());
}
// Shuffles the order of the list
template<class List>
void shuffle(List& list)
{
std::shuffle(list.begin(), list.end(), generator);
}
}
#endif // RANDOM_H
In Random.cpp
#include "Random.h"
#include <random>
int Random::random_integer(int min, int max)
{
using uid = std::uniform_int_distribution<int>;
thread_local static auto dist = uid{};
return dist(generator, uid::param_type{min, max});
}
double Random::random_normal(double standard_deviation)
{
using nd = std::normal_distribution<double>;
thread_local static auto dist = nd{};
return dist(generator, nd::param_type{0.0, standard_deviation});
}
double Random::random_real(double min, double max)
{
using urd = std::uniform_real_distribution<double>;
thread_local static auto dist = urd{};
return dist(generator, urd::param_type{min, max});
}
bool Random::coin_flip()
{
return success_probability(0.5);
}
bool Random::success_probability(double probability)
{
return random_real(0.0, 1.0) < probability;
}
An example usage:
#include <iostream>
#include <future>
#include "Random.h"
double random_walk(int number_of_steps, double step_size)
{
double position = 0;
for(int i = 0; i < number_of_steps; ++i)
{
position += Random::random_normal(step_size);
}
return position;
}
int main()
{
// Simulate 1,000,000-step continuous random walks
const int number_of_walks = 100; // adjust to suit your computer
const int number_of_steps = 1000000;
const double step_size = 1.0;
std::vector<std::future<double>> results;
for(int i = 0; i < number_of_walks; ++i)
{
results.emplace_back(std::async(random_walk, number_of_steps, step_size));
}
for(auto& result : results)
{
std::cout << result.get() << std::endl;;
}
return 0;
}
2 Answers 2
To answer your questions:
- Sure looks like it.
- No, see next point.
- No, it is not safe.
std::mersenne_twister_engine::operator()
reads and increments its internal state. AFAIK, this makes it not thread safe at all.
So, here are a few suggestions:
An extra semicolon hid itself in your code:
std::cout << result.get() << std::endl;; ^
Don't use
std::endl
. It implicitly usesstd::flush
to flush the stream. This is not very cheap, so it is better for performance if you just output'\n'
.If you do want to flush the stream, it would be better to state your intent and use
std::flush
, but this depends on you.Mark functions that do not throw
noexcept
. That way, you state your intent that the functions will not throw, and it helps the compiler optimize a bit.Currently, your
shuffle
takes a container. While that is desirable in most cases, sometimes, you just need to shuffle a subrange of a container. It's for exactly this reason that all the standard algorithms take an iterator pair instead of a container.If you try to include
Random.h
more than once, you will get symbol collision for the variablegenerator
. The same thing happens for functions. Here's what you can do:Use C++17, and mark the variable
inline
, which behaves exactly like in functions:thread_local inline std::mt19937_64 generator(std::random_device{}());
Mark the variable
extern
, and put its definition inRandom.cpp
.
You don't need
return 0;
at the end ofmain()
. The compiler implicitly adds it for you.I like to use types as parameter that try to match the preconditions of a function. For example, I would use
unsigned
integers everywhere where you usesigned
integers, because I do not see the point in calling let's sayrandom_walk
with a negativenumber_of_steps
.You can shorten the definitions in
Random.cpp
by removingRandom::
if you wrap them aroundnamespace Random
, like in the declarations.
-
\$\begingroup\$ You probably want to rewrite point 5. And the interaction between TU-private
generator
and the template-functionshuffle()
looks fascinating. \$\endgroup\$Deduplicator– Deduplicator2017年06月14日 16:27:04 +00:00Commented Jun 14, 2017 at 16:27 -
\$\begingroup\$ @Deduplicator What's so fascinating about it? \$\endgroup\$Rakete1111– Rakete11112017年06月14日 16:28:26 +00:00Commented Jun 14, 2017 at 16:28
-
\$\begingroup\$ Well, you have multiple mismatched definitions of that template-function, as it always refers to the TU-local
generator
. I don't think it's healthy if it's instantiated with the same type in multiple TUs, I think it's UB. \$\endgroup\$Deduplicator– Deduplicator2017年06月14日 16:30:55 +00:00Commented Jun 14, 2017 at 16:30 -
\$\begingroup\$ @Deduplicator It should be ok I think. If it's not, then every template in the Standard Library can be a source of undefined behavior if not used correctly (in only one TU). \$\endgroup\$Rakete1111– Rakete11112017年06月14日 16:36:07 +00:00Commented Jun 14, 2017 at 16:36
-
\$\begingroup\$ The problem is that the same function is defined with different semantics in different TUs. That doesn't generally happen with the Standard Library. \$\endgroup\$Deduplicator– Deduplicator2017年06月14日 18:19:48 +00:00Commented Jun 14, 2017 at 18:19
I would strongly suggest, that you create convenience classes for that use case, which act similar to functions.
The reason is, that the random number generators generally increment their state and more importantly that initialization is costly. Therefore, create a class that holds the generator and the distribution and create an operator()
#include <random>
class randomStreamNormal {
public:
explicit randomStreamNormal(double mean, double stddev)
: mt(rand()), norm_dist(mean, stddev) {}
explicit randomStreamNormal(double mean, double stddev, double seed)
: mt(seed), norm_dist(mean, stddev) {}
double operator ()(void) { return norm_dist(mt); }
private:
std::mt19937_64 mt;
std::normal_distribution<double> norm_dist;
};
class randomStreamUniformInt {
public:
explicit randomStreamUniformInt(int lower_bound, int upper_bound)
: mt(rand()), uniform_dist(lower_bound, upper_bound) {}
explicit randomStreamUniformInt(int lower_bound, int upper_bound, double seed)
: mt(seed), uniform_dist(lower_bound, upper_bound) {}
int operator ()(void) { return uniform_dist(mt); }
private:
std::mt19937_64 mt;
std::uniform_int_distribution<> uniform_dist;
};
Noter that i used rand() because I needed to be able to reproduce the results.
-
\$\begingroup\$ How would this class be used? Should it be instantiated once and passed around to every function that needs random numbers? Or, statically constructed in functions that need it? Both of these seem to run into the same problem of multiple
std::async
calls stomping on the internal state of the generators. Non-statically constructing instances of this class inside functions would generate the same values over and over again. \$\endgroup\$Mark H– Mark H2017年06月14日 21:40:08 +00:00Commented Jun 14, 2017 at 21:40 -
\$\begingroup\$ This class should be instanciated once per independent random stream and passed as arguments. Note that initialization of a rng is costly, so within functions performance will be terrible. Also note, that different threads should not access the same random stream as that tampers with the independence of the random processes. \$\endgroup\$miscco– miscco2017年06月15日 05:07:03 +00:00Commented Jun 15, 2017 at 5:07
Explore related questions
See similar questions with these tags.