I have implemented the shuffling algorithm of Fisher-Yates in C++, but I've stumbled across the modulo bias. Is this random number generation by rand()
correct for Fisher-Yates?
srand(time(NULL));
vector<int> elements;
// push numbers from 1 to 9 into the vector
for (int i = 1; i < 9; ++i)
{
elements.push_back(i);
}
// the counter to be used to generate the random number
int currentIndexCounter = elements.size();
for (auto currentIndex = elements.rbegin(); currentIndex != elements.rend() - 1;
currentIndex++ , --currentIndexCounter)
{
int randomIndex = rand() % currentIndexCounter;
// if its not pointing to the same index
if (*currentIndex != elements.at(randomIndex))
{
//then swap the elements
swap(elements.at(randomIndex), *currentIndex);
}
}
2 Answers 2
@Loki Astari is correct in his comment:
rand()
is well know for its bad distribution. But you are using it correctly. Note:rand() % currentIndexCounter
does not give you a perfect distribution unlesscurrentIndexCounter
is an exact divisor ofRAND_MAX
. You may want to look at C++11 random header and the generators provided inside.
Since you're using C++11, you should instead utilize the <random>
library. It includes things such as pseudo-random number generators and random number distributions.
Beyond that, I'll just point out some additional things:
With C++11, you should now be using
nullptr
instead ofNULL
.This should be changed in
srand()
:std::srand(std::time(nullptr));
However, as mentioned above, you should not be using this with C++11. But in any situation where you must stay with
rand
, the above would be recommended.With C++11, you should also have access to initializer lists. This will allow you to initialize the vector instead of calling
push_back()
multiple times:std::vector<int> elements { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
Also note that your
push_back()
was only pushing numbers 1-8 into the vector, despite what your comment said. You've used<
instead of<=
in the loop statement, so the 9 was excluded.currentIndexCounter
should be of typestd::vector<int>::size_type
, which is the return type ofsize()
(or you can just useauto
). This should've given you compiler warnings as it's a type-mismatch and a possible loss of data. Make sure your compiler warnings are turned up high.currentIndex
is a misleading name because it's being used with iterators, not indices.Here,
auto
givescurrentIndex
the typestd::reverse_iterator
fromstd::rbegin()
. You could simply name this something likeiter
. It's okay for iterator names to be short.In the second
for
loop, you just needelements.rend()
, without the- 1
. The iterator will already go through the entire vector and stop at the last reverse element.The iterator increment should be prefix instead in order to improve performance a bit by avoiding a copy. It's best to do this with all nontrivial types such as iterators.
I've tested this by giving
randomIndex
a value greater thanelements.size()
, causing an exception to be thrown. Even ifstd::rand()
or another RNG is guaranteed to give you an intended value, it may be best to handle this properly.
Here's the final version with my own given changes, which I've also ran here. I've also included a concise way of displaying the vector before and after the shuffling.
#include <algorithm>
#include <iostream>
#include <iterator>
#include <random>
#include <vector>
int main()
{
// seed the RNG
std::random_device rd;
std::mt19937 mt(rd());
std::vector<int> elements { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
std::cout << "Before: ";
std::copy(elements.cbegin(), elements.cend(),
std::ostream_iterator<int>(std::cout, " "));
auto currentIndexCounter = elements.size();
for (auto iter = elements.rbegin(); iter != elements.rend();
++iter, --currentIndexCounter)
{
// get int distribution with new range
std::uniform_int_distribution<> dis(0, currentIndexCounter);
const int randomIndex = dis(mt);
if (*iter != elements.at(randomIndex))
{
std::swap(elements.at(randomIndex), *iter);
}
}
std::cout << "\nAfter: ";
std::copy(elements.cbegin(), elements.cend(),
std::ostream_iterator<int>(std::cout, " "));
}
Output:
Before: 1 2 3 4 5 6 7 8 9 After: 7 9 3 2 4 5 6 1 8
-
\$\begingroup\$ is it not possible that
randomIndex = elements.size()
on the first iteration of the loop thus resulting in undefined behavior? Unless the prefix decrement oncurrentIndexCounter
executes on entry of the for loop, possibly? \$\endgroup\$z.karl– z.karl2019年12月05日 05:57:36 +00:00Commented Dec 5, 2019 at 5:57 -
\$\begingroup\$ @z.karl: That is possible, though that's not one aspect that I caught from the OP's code previously (I only added
auto
). \$\endgroup\$Jamal– Jamal2019年12月07日 01:48:51 +00:00Commented Dec 7, 2019 at 1:48
Here is my take on it with the modern <random>
:
static const void fisher_yates_shuffle(std::deque<card>& cards)
{
std::random_device random ;
std::mt19937 mersenne_twister(random());
std::uniform_int_distribution<std::uint8_t> distribution ;
for (auto i = cards.size() - 1; i > 0; --i)
{
distribution.param(std::uniform_int_distribution<std::uint8_t>::param_type(0, i));
std::swap(cards[i], cards[distribution(mersenne_twister)]);
}
};
-
1\$\begingroup\$ You have presented an alternative solution, but haven't reviewed the code. Please edit to show what aspects of the question code prompted you to write this version, and in what ways it's an improvement over the original. It may be worth (re-)reading How to Answer. \$\endgroup\$Toby Speight– Toby Speight2019年11月05日 17:18:58 +00:00Commented Nov 5, 2019 at 17:18
-
1\$\begingroup\$ I just used this to implement a fisher_yates_shuffle on a Poker program and it works wonderfully, better than other answers. \$\endgroup\$artemis– artemis2019年11月29日 20:36:19 +00:00Commented Nov 29, 2019 at 20:36
rand() % currentIndexCounter
does not give you a perfect distribution unlesscurrentIndexCounter
is an exact divisor ofRAND_MAX
. You may want to look at C++11 random header and the generators provided inside. \$\endgroup\$