I wrote a class around a UniformRandomBitGenerator
, which in turn also satisfies the concept of a UniformRandomBitGenerator
. It uses a buffer and parallelism (openmp
) to quickly generate large amounts of random numbers.
I'm happy about all kinds of feedback, but I'm most unsure about my use of openmp
and ensuring that the same seed always generates the same sequence of pseudo-random numbers.
Here's the class:
#pragma once
#include <random>
#include <vector>
#include <omp.h>
template<class RNG_type>
class BP_RNG {
public:
using unsigned_t = unsigned;
using result_type = typename RNG_type::result_type;
constexpr static result_type min() { return RNG_type::min(); };
constexpr static result_type max() { return RNG_type::max(); };
BP_RNG(int seed = 42, unsigned_t buffer_size = 1e8, unsigned_t nthreads = 4);
result_type operator()();
private:
using buffer_t = std::vector<result_type>;
using iter_t = typename std::vector<result_type>::iterator;
void refill();
RNG_type engine_;
buffer_t buffer_;
iter_t position_;
const unsigned_t nthreads_;
};
template<class RNG_type>
BP_RNG<RNG_type>::BP_RNG(int seed, unsigned_t buffer_size, unsigned_t nthreads):
engine_(seed),
buffer_(buffer_size),
position_(buffer_.begin()),
nthreads_(nthreads)
{
refill();
}
template<class RNG_type>
void BP_RNG<RNG_type>::refill()
{
// Get seeds for RNGs in parallel section beforehand to ensure determinism
buffer_t seedvals(nthreads_);
for (auto& seed : seedvals) seed = engine_();
#pragma omp parallel num_threads(nthreads_)
{
RNG_type local_engine( seedvals[omp_get_thread_num()] );
#pragma omp for
// Pedestrian vector-iteration so that omp works
for (unsigned_t i = 0; i < buffer_.size(); ++i) buffer_[i] = local_engine();
}
position_ = buffer_.begin();
}
template<class RNG_type>
typename BP_RNG<RNG_type>::result_type BP_RNG<RNG_type>::operator()()
{
if (position_ == buffer_.end()) refill();
return *(position_++);
}
And the main.cpp
I used to test it:
#include <iostream>
#include <chrono>
#include "buffered_parallel_rng.hpp"
template<class RNG>
void time_rng(RNG& eng, unsigned long iter, std::string name)
{
typename RNG::result_type a;
auto start = std::chrono::high_resolution_clock::now();
for (unsigned long i = 0; i < iter; ++i)
{
a = eng();
}
auto stop = std::chrono::high_resolution_clock::now();
double elapsed = static_cast<std::chrono::duration<double> >(stop-start).count();
std::cout << name << " : " << elapsed << "\n";
}
void test_determinism(unsigned ntests)
{
BP_RNG<std::mt19937_64> seeder(42);
for (unsigned j = 0; j < ntests; ++j)
{
int seed = seeder();
BP_RNG<std::mt19937_64> myeng1(seed),myeng2(seed);
for (unsigned i = 0; i < 10; ++i)
{
if (myeng1() != myeng2()) {
std::cout << "Different results from same seed!\n"
<< "Seed is: " << seed << "\n";
return;
}
}
}
}
int main() {
unsigned iter = 1e9;
std::mt19937_64 ref_eng;
ref_eng.seed(100);
BP_RNG<std::mt19937_64> my_eng(100);
time_rng(ref_eng, iter, "Other");
time_rng(my_eng , iter, "Mine ");
test_determinism(100);
}
-
2\$\begingroup\$ Welcome to Code Review! I hope you get some great answers. \$\endgroup\$Phrancis– Phrancis2017年10月22日 13:58:02 +00:00Commented Oct 22, 2017 at 13:58
1 Answer 1
There are a few (relatively minor) things here that can be improved on:
Order your includes. Specifically, group your includes by their source (i.e. current project headers, STL headers, library headers...). Most commonly, the following order is used:
- Header file the current source file implements (i.e.
a.h
for a file calleda.cpp
) if any - Headers from the same project you're working on
- Headers from different projects/libraries (in your case,
omp.h
, for example) - Headers from the standard library
This, on one hand, serves the purposes of checking whether each header file actually contains all required includes (say, for example, your header actually used something from
chrono
; with your current include order, your compiler would not complain because you includechrono
inmain.cpp
beforebuffered_parallel_rng.hpp
, thus hiding the missing include), on the other hand helps readers of you source code to verify includes quickly.- Header file the current source file implements (i.e.
using unsigned_t = unsigned;
Quite frankly, I don't see the reason you do this here. Why not just useunsigned
?Anyhow, if you really want to have this (and also have a good reason to), I would expect a similar
using
forint
here because of consistency. Alternatively, one could argue that you should be taking anSizeType
and aSeedType
as template parameters, especially since theresult_type
s and thus the seed types of random number generators in the standard library are different for different generators (for example,std::mt19937_64
usesstd::uint_fast64_t
whereasstd::minstd_rand0
usesstd::uint_fast32_t
).Instead of
using iter_t = typename std::vector<result_type>::iterator;
, you should writeusing iter_t = typename buffer_t::iterator;
to prevent subtle bugs when changing the type ofbuffer_t.
The name
BP_RNG
is not very clear. Although the abbreviation "rng" is quite common, what "bp" means will not be clear to most people on first sight. Since your header is namedbuffered_parallel_rng.hpp
, I would expect your class to have the same name.time_rng
currently takes astd::string
as argument by value, but you only ever output it tostd::cout
, so you should take it by const reference instead.Don't write
"\n"
where'\n'
would suffice. For example, when writing tostd::cout
, the former implies a call tostd::strlen
and an additional indirection while the latter just passes the value directly.iter
is not really a fitting name for the second parameter you pass totime_rng
. I was confused the first time I glossed over that function because, for me at least,iter
almost exclusively stands foriterator
. I would have grasped that parameter's meaning much more quickly if you had actually named ititerations
or anything along that line.
All in all, you code is mostly well written. None of the points above constitute a "Act now! Your code is on fire!"-grade problem, only tips that could make your already good code a little better and prevent some subtle, thus nasty bugs.
However, I can't tell whether your approach is really worth anything. You should definitely perform some benchmarks to see if there are any performance improvements compared to a single threaded approach and, if yes, how big they are.
Another thing you should consider is learning to do proper testing. There are a variety of testing frameworks (for example, Google Test) which provide functionality for writing and executing tests effectively. Especially since you are concerned about determinism here, writing and executing tests at a much larger scale should help you verify that your code works correctly.
-
\$\begingroup\$ Thanks for the extensive feedback! I use
unsigned_t
to have to option to easily change the type if needed. I'll definitely look into a more structured approach to testing. \$\endgroup\$ochsnerd– ochsnerd2017年10月22日 19:21:10 +00:00Commented Oct 22, 2017 at 19:21 -
\$\begingroup\$ @ochsnerd Having
unsigned_t
does not really allow you to change the type easily, since you'll also have to adapt every use of your class. If you need that much flexibility, you should make it a template parameter. \$\endgroup\$Ben Steffan– Ben Steffan2017年10月22日 19:24:15 +00:00Commented Oct 22, 2017 at 19:24