This is a follow on from a previously posted question:
How to generate a random number in C?
I wish to be able to generate a random number from within a particular range, such as 1 to 6 to mimic the sides of a die.
How would I go about doing this?
- If you look at the second answer to the question you refer to, you have the answer: rand() % 6. (Mats Fredriksson, Mar 24, 2010 at 16:58)
- I didn't understand how it worked, so I decided to make a separate question for clarity. (Jamie Keeling, Mar 24, 2010 at 17:29)
- Random thought: If you polled a random cross-section of programmers, you'd find a random number of them are randomly thinking of ways to randomly generate numbers. Considering the Universe is governed by precise and predictable laws, isn't it interesting that we try to generate things more randomly? Questions like this always tend to bring out the 10k+ posters. (Armstrongest, Mar 24, 2010 at 19:00)
- @Mats rand() % 6 can return a 0. Not good for a die. (new123456, Mar 5, 2011 at 19:33)
- Can you mark stackoverflow.com/a/6852396/419 as the accepted answer instead of the answer that links to it :) Thanks. (Kev, Jun 27, 2012 at 13:15)
11 Answers
All the answers so far are mathematically wrong. Returning rand() % N does not uniformly give a number in the range [0, N) unless N divides the length of the interval into which rand() returns (i.e. is a power of 2). Furthermore, one has no idea whether the moduli of rand() are independent: it's possible that they go 0, 1, 2, ..., which is uniform but not very random. The only assumption it seems reasonable to make is that rand() puts out a Poisson distribution: any two nonoverlapping subintervals of the same size are equally likely and independent. For a finite set of values, this implies a uniform distribution and also ensures that the values of rand() are nicely scattered.
This means that the only correct way of changing the range of rand() is to divide it into boxes; for example, if RAND_MAX == 11 and you want a range of 1..6, you should assign {0,1} to 1, {2,3} to 2, and so on. These are disjoint, equally-sized intervals and thus are uniformly and independently distributed.
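As a minimal sketch of the box assignment described above (the function name box_for and its parameters are illustrative, not part of the original answer), the mapping for RAND_MAX == 11 and a range of 1..6 could be written as follows. It assumes the number of raw values divides evenly into the number of boxes, which is exactly the case in the example.

```c
/* Hypothetical helper: map a raw value in [0, raw_max] to a box in [lo, hi].
 * Assumes (raw_max + 1) is an exact multiple of the number of boxes. */
int box_for(int raw, int raw_max, int lo, int hi) {
    int boxes = hi - lo + 1;               /* e.g. 6 boxes for 1..6 */
    int box_size = (raw_max + 1) / boxes;  /* e.g. 12 / 6 = 2 values per box */
    return lo + raw / box_size;            /* truncating division picks the box */
}
```

With raw_max == 11, the raw values {0,1} map to 1, {2,3} map to 2, and {10,11} map to 6.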
The suggestion to use floating-point division is mathematically plausible but suffers from rounding issues in principle. Perhaps double is high-enough precision to make it work; perhaps not. I don't know and I don't want to have to figure it out; in any case, the answer is system-dependent.
The correct way is to use integer arithmetic. That is, you want something like the following:
#include <stdlib.h> // For random(), RAND_MAX

// Assumes 0 <= max <= RAND_MAX
// Returns in the closed interval [0, max]
long random_at_most(long max) {
  unsigned long
    // max <= RAND_MAX < ULONG_MAX, so this is okay.
    num_bins = (unsigned long) max + 1,
    num_rand = (unsigned long) RAND_MAX + 1,
    bin_size = num_rand / num_bins,
    defect   = num_rand % num_bins;

  long x;
  do {
    x = random();
  }
  // This is carefully written not to overflow
  while (num_rand - defect <= (unsigned long)x);

  // Truncated division is intentional
  return x/bin_size;
}
The loop is necessary to get a perfectly uniform distribution. For example, if you are given random numbers from 0 to 2 and you want only ones from 0 to 1, you just keep pulling until you don't get a 2; it's not hard to check that this gives 0 or 1 with equal probability. This method is also described in the link that nos gave in their answer, though coded differently. I'm using random() rather than rand() as it has a better distribution (as noted by the man page for rand()).
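The uniformity claim can be checked by brute force for small sources. The sketch below is only an illustration (accepted_in_bin is a hypothetical name): it walks over every raw value a source with num_rand outcomes could produce, applies the same rejection rule and truncated division as the code above, and counts how many raw values end up in a given bin. It assumes num_bins <= num_rand.

```c
/* Hypothetical check: count the raw values in [0, num_rand) that survive
 * the rejection step and land in the given bin. Assumes num_bins <= num_rand. */
unsigned long accepted_in_bin(unsigned long num_rand,
                              unsigned long num_bins,
                              unsigned long bin) {
    unsigned long bin_size = num_rand / num_bins;
    unsigned long defect = num_rand % num_bins;
    unsigned long count = 0;
    for (unsigned long x = 0; x < num_rand; x++) {
        if (x >= num_rand - defect) {
            continue;               /* this raw value would be rejected */
        }
        if (x / bin_size == bin) {
            count++;                /* this raw value maps to the bin */
        }
    }
    return count;
}
```

For the 0-to-2 source reduced to 0..1, each bin receives exactly one raw value; for a source of 32768 values reduced to six bins, each bin receives the same count as every other bin.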
If you want to get random values outside the default range [0, RAND_MAX], then you have to do something tricky. Perhaps the most expedient is to define a function random_extended() that pulls n bits (using random_at_most()) and returns in [0, 2**n), and then apply random_at_most() with random_extended() in place of random() (and 2**n - 1 in place of RAND_MAX) to pull a random value less than 2**n, assuming you have a numerical type that can hold such a value. Finally, of course, you can get values in [min, max] using min + random_at_most(max - min), including negative values.
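The last point, shifting the result into [min, max], might look like the sketch below. It is self-contained for illustration, so it carries its own copy of the rejection loop built on rand() rather than random(); the names uniform_at_most and uniform_between are assumptions, and the code assumes 0 <= max - min <= RAND_MAX with no overflow in max - min.

```c
#include <stdlib.h>

/* Hypothetical helper: uniform value in [0, max], assuming 0 <= max <= RAND_MAX. */
static long uniform_at_most(long max) {
    unsigned long num_bins = (unsigned long)max + 1;
    unsigned long num_rand = (unsigned long)RAND_MAX + 1;
    unsigned long bin_size = num_rand / num_bins;
    unsigned long defect = num_rand % num_bins;
    unsigned long x;
    do {
        x = (unsigned long)rand();
    } while (num_rand - defect <= x);  /* reject the leftover tail */
    return (long)(x / bin_size);
}

/* Uniform value in [min, max]; min may be negative.
 * Assumes 0 <= max - min <= RAND_MAX. */
long uniform_between(long min, long max) {
    return min + uniform_at_most(max - min);
}
```

A die roll is then uniform_between(1, 6), and uniform_between(-3, 3) covers a range that straddles zero.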
27 Comments
- [...] max - min > RAND_MAX, which is more serious than the issue I stated above (e.g. VC++ has RAND_MAX of only 32767).
Following on from @Ryan Reich's answer, I thought I'd offer my cleaned-up version. The first bounds check isn't required given the second bounds check, and I've made it iterative rather than recursive. It returns values in the range [min, max], where max >= min and 1+max-min < RAND_MAX.
unsigned int rand_interval(unsigned int min, unsigned int max)
{
    int r;
    const unsigned int range = 1 + max - min;
    const unsigned int buckets = RAND_MAX / range;
    const unsigned int limit = buckets * range;

    /* Create equal size buckets all in a row, then fire randomly towards
     * the buckets until you land in one of them. All buckets are equally
     * likely. If you land off the end of the line of buckets, try again. */
    do
    {
        r = rand();
    } while (r >= limit);

    return min + (r / buckets);
}
4 Comments
- [...] limit an int (and optionally bucket too) since RAND_MAX / range < INT_MAX and buckets * range <= RAND_MAX. EDIT: I've submitted an edit proposal.
- [...] max >= min, use max-min < RAND_MAX - 1, not 1+max-min < RAND_MAX.
If you know the min and max values of a range, this formula generates numbers within it, inclusive of both ends:
r = (rand() % (max + 1 - min)) + min
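To see how far this formula is from uniform, one can count how many raw values map to each remainder. The sketch below is only an illustration under stated assumptions: modulo_hits is a hypothetical name, and the source is assumed to produce every value in [0, num_rand) with equal probability.

```c
/* Hypothetical check: how many raw values in [0, num_rand) satisfy x % n == r. */
unsigned long modulo_hits(unsigned long num_rand, unsigned long n, unsigned long r) {
    unsigned long count = 0;
    for (unsigned long x = 0; x < num_rand; x++) {
        if (x % n == r) {
            count++;
        }
    }
    return count;
}
```

With RAND_MAX == 32767 (so num_rand == 32768) and n == 6, the remainders 0 and 1 are each hit once more than the remainders 2 through 5, which is the skew toward smaller numbers that the modulo approach produces.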
3 Comments
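- [...] int overflow with max+1-min.
unsigned int
randr(unsigned int min, unsigned int max)
{
    /* Divide by RAND_MAX + 1.0 so that scaled lies in [0, 1) and the
     * result cannot reach max + 1 when rand() == RAND_MAX. */
    double scaled = (double)rand() / (RAND_MAX + 1.0);
    /* Truncate first, then add min in integer arithmetic. */
    return (unsigned int)((max - min + 1) * scaled) + min;
}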
See here for other options.
11 Comments
- [...] (((max-min+1)*rand())/RAND_MAX)+min and get probably the exact same distribution (assuming that RAND_MAX is small enough relative to int to not overflow).
- [...] max + 1, if either rand() == RAND_MAX, or rand() is very close to RAND_MAX and floating-point errors push the final result past max + 1. To be safe, you should check that the result is within range before returning it.
- [...] RAND_MAX + 1.0. I'm still not sure that's good enough to prevent a max + 1 return, though: in particular, the + min at the end involves a round that could end up producing max + 1 for large values of rand(). Safer to abandon this approach altogether and use integer arithmetic.
- [...] RAND_MAX is replaced by RAND_MAX+1.0 as Christoph suggests, then I believe that this is safe provided that the + min is done using integer arithmetic: return (unsigned int)((max - min + 1) * scaled) + min. The (non-obvious) reason is that, assuming IEEE 754 arithmetic and round-half-to-even (and also that max - min + 1 is exactly representable as a double, but that'll be true on a typical machine), it's always true that x * scaled < x for any positive double x and any double scaled satisfying 0.0 <= scaled && scaled < 1.0.
- [...] randr(0, UINT_MAX): always generates 0.
Wouldn't you just do:
srand(time(NULL));
int r = ( rand() % 6 ) + 1;
% is the modulus operator. It divides by 6 and returns the remainder, a value from 0 to 5; adding 1 shifts that to the range 1 to 6.
8 Comments
- [...] rand() includes the low-order bits of the generator's state (if it uses an LCG). I haven't seen one so far; all of them (yes, including MSVC with RAND_MAX being just 32767) remove the low-order bits. Using modulus isn't recommended for other reasons, namely that it skews the distribution in favor of smaller numbers.
For those who understand the bias problem but can't stand the unpredictable run-time of rejection-based methods, this series produces a progressively less biased random integer in the [0, n-1] interval:
r = n / 2;
r = (rand() * n + r) / (RAND_MAX + 1);
r = (rand() * n + r) / (RAND_MAX + 1);
r = (rand() * n + r) / (RAND_MAX + 1);
...
It does so by synthesising a high-precision fixed-point random number of i * log_2(RAND_MAX + 1) bits (where i is the number of iterations) and performing a long multiplication by n.
When the number of bits is sufficiently large compared to n, the bias becomes immeasurably small.
It does not matter if RAND_MAX + 1 is less than n (as in this question), or if it is not a power of two, but care must be taken to avoid integer overflow if RAND_MAX * n is large.
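One way to sidestep the overflow concern is to do the arithmetic in 64 bits. The following is a minimal sketch of the series above, not the answer's own code: the name low_bias_below and the iterations parameter are assumptions, n is assumed to be at least 1, and RAND_MAX is assumed to be at most 2^31 - 1 so that rand() * n + r fits in a uint64_t.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: value in [0, n-1] whose bias shrinks with each iteration.
 * Assumes n >= 1 and RAND_MAX <= 2^31 - 1, so the 64-bit products cannot overflow. */
uint32_t low_bias_below(uint32_t n, int iterations) {
    const uint64_t span = (uint64_t)RAND_MAX + 1;  /* number of rand() outcomes */
    uint64_t r = n / 2;                            /* starting value from the series */
    for (int i = 0; i < iterations; i++) {
        r = ((uint64_t)rand() * n + r) / span;     /* one step of the series */
    }
    return (uint32_t)r;
}
```

The run time is fixed by iterations, which is the property this approach trades against the exact uniformity of rejection sampling.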
7 Comments
- [...] RAND_MAX is often INT_MAX, so RAND_MAX + 1 --> UB (like INT_MIN)
- [...] RAND_MAX * n is large". You need to arrange to use appropriate types for your requirements.
- [...] RAND_MAX is often INT_MAX" Yes, but only on 16 bit systems! Any reasonably modern architecture will put INT_MAX at 2^32 / 2 and RAND_MAX at 2^16 / 2. Is this an incorrect assumption?
- [...] int compilers, I found RAND_MAX == 32767 on one and RAND_MAX == 2147483647 on another. My overall experience (decades) is that RAND_MAX == INT_MAX more often. So disagree that a reasonably modern 32-bit architecture will certainly have a RAND_MAX at 2^16 / 2. Since the C spec allows 32767 <= RAND_MAX <= INT_MAX, I code to that anyways rather than a tendency.
Here is a slightly simpler algorithm than Ryan Reich's solution:
/// Begin and end are *inclusive*; => [begin, end]
uint32_t getRandInterval(uint32_t begin, uint32_t end) {
    uint32_t range = (end - begin) + 1;
    uint32_t limit = ((uint64_t)RAND_MAX + 1) - (((uint64_t)RAND_MAX + 1) % range);

    /* Imagine range-sized buckets all in a row, then fire randomly towards
     * the buckets until you land in one of them. All buckets are equally
     * likely. If you land off the end of the line of buckets, try again. */
    uint32_t randVal = rand();
    while (randVal >= limit) randVal = rand();

    /// Return the position you hit in the bucket + begin as random number
    return (randVal % range) + begin;
}
Example (RAND_MAX := 16, begin := 2, end := 7)
=> range := 6 (1 + end - begin)
=> limit := 12 (RAND_MAX + 1) - ((RAND_MAX + 1) % range)
The limit is always a multiple of the range,
so we can split it into range-sized buckets:
Possible-rand-output: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
Buckets: [0, 1, 2, 3, 4, 5][0, 1, 2, 3, 4, 5][X, X, X, X, X]
Buckets + begin: [2, 3, 4, 5, 6, 7][2, 3, 4, 5, 6, 7][X, X, X, X, X]
1st call to rand() => 13
→ 13 is not in the bucket-range anymore (>= limit), while-condition is true
→ retry...
2nd call to rand() => 7
→ 7 is in the bucket-range (< limit), while-condition is false
→ Get the corresponding bucket-value 1 (randVal % range) and add begin
=> 3
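The worked example can be replayed deterministically by feeding the raw values in by hand instead of calling rand(). This is a sketch for illustration only: map_or_reject and its raw_max parameter (standing in for RAND_MAX) are hypothetical, and -1 is used here as a marker for "retry".

```c
#include <stdint.h>

/* Hypothetical helper: apply the bucket rule to one raw value.
 * Returns -1 when the raw value falls past the last full bucket (retry),
 * otherwise the mapped value in [begin, end]. raw_max stands in for RAND_MAX. */
int64_t map_or_reject(uint32_t raw, uint32_t raw_max, uint32_t begin, uint32_t end) {
    uint64_t range = (uint64_t)end - begin + 1;
    uint64_t limit = ((uint64_t)raw_max + 1) - (((uint64_t)raw_max + 1) % range);
    if (raw >= limit) {
        return -1;                              /* off the end of the buckets */
    }
    return (int64_t)(raw % range) + begin;      /* position in bucket + begin */
}
```

With raw_max 16, begin 2 and end 7, a raw value of 13 is rejected and a raw value of 7 maps to 3, matching the two calls traced above.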
1 Comment
- [...] RAND_MAX + 1 can readily overflow int addition. In that case, (RAND_MAX + 1) % range will generate questionable results. Consider (RAND_MAX + (uint32_t)1)
To avoid the modulo bias (mentioned in other answers), you can use:
arc4random_uniform(MAX-MIN)+MIN
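Here "MAX" is the upper bound and "MIN" is the lower bound. Note that arc4random_uniform(n) returns a value in [0, n-1], so MAX itself is excluded; use MAX-MIN+1 as the argument if MAX should be included. For example, for numbers from 10 to 19: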
arc4random_uniform(20-10)+10
arc4random_uniform(10)+10
Simple solution and better than using "rand() % N".
2 Comments
- [...] #include <bsd/stdlib.h> first. Also, any idea how to get this on Windows without MinGW or CygWin?
While Ryan is correct, the solution can be much simpler based on what is known about the source of the randomness. To re-state the problem:
- There is a source of randomness, outputting integer numbers in range [0, MAX) with uniform distribution.
- The goal is to produce uniformly distributed random integer numbers in range [rmin, rmax] where 0 <= rmin < rmax < MAX.
In my experience, if the number of bins (or "boxes") is significantly smaller than the range of the original numbers, and the original source is cryptographically strong - there is no need to go through all that rigamarole, and simple modulo division would suffice (like output = rnd.next() % (rmax+1), if rmin == 0), and produce random numbers that are distributed uniformly "enough", and without any loss of speed. The key factor is the randomness source (i.e., kids, don't try this at home with rand()).
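How much "enough" is can be estimated without running any experiment. The sketch below is an illustration under stated assumptions rather than part of the original answer: modulo_relative_bias is a hypothetical name, the source is assumed to be perfectly uniform over source_size values, and bins is assumed to be no larger than source_size.

```c
#include <stdint.h>

/* Hypothetical estimate: when a uniform source with source_size outcomes is
 * reduced with % bins, some bins receive floor(source_size / bins) raw values
 * and the rest receive one more. Returns how much more likely the heavier
 * bins are, relative to the lighter ones (0.0 when the split is exact).
 * Assumes 1 <= bins <= source_size. */
double modulo_relative_bias(uint64_t source_size, uint64_t bins) {
    uint64_t base = source_size / bins;   /* raw values per lighter bin */
    uint64_t extra = source_size % bins;  /* number of bins that get one more */
    if (extra == 0) {
        return 0.0;
    }
    return 1.0 / (double)base;
}
```

For 22 bins, a single random byte (256 values) leaves the heavier bins about 9% more likely than the lighter ones, while a 32-bit source shrinks that gap to roughly 5e-9, which supports the point that the size of the source relative to the number of bins is what matters.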
Here's an example/proof of how it works in practice. I wanted to generate random numbers from 1 to 22, having a cryptographically strong source that produced random bytes (based on Intel RDRAND). The results are:
Rnd distribution test (22 boxes, numbers of entries in each box):
1: 409443 4.55%
2: 408736 4.54%
3: 408557 4.54%
4: 409125 4.55%
5: 408812 4.54%
6: 409418 4.55%
7: 408365 4.54%
8: 407992 4.53%
9: 409262 4.55%
10: 408112 4.53%
11: 409995 4.56%
12: 409810 4.55%
13: 409638 4.55%
14: 408905 4.54%
15: 408484 4.54%
16: 408211 4.54%
17: 409773 4.55%
18: 409597 4.55%
19: 409727 4.55%
20: 409062 4.55%
21: 409634 4.55%
22: 409342 4.55%
total: 100.00%
This is as close to uniform as I need for my purpose (fair dice throw, generating cryptographically strong codebooks for WWII cipher machines such as http://users.telenet.be/d.rijmenants/en/kl-7sim.htm, etc). The output does not show any appreciable bias.
Here's the source of cryptographically strong (true) random number generator: Intel Digital Random Number Generator and a sample code that produces 64-bit (unsigned) random numbers.
int rdrand64_step(unsigned long long int *therand)
{
    unsigned long long int foo;
    int cf_error_status;
    asm("rdrand %%rax; \
         mov $1,%%edx; \
         cmovae %%rax,%%rdx; \
         mov %%edx,%1; \
         mov %%rax, %0;":"=r"(foo),"=r"(cf_error_status)::"%rax","%rdx");
    *therand = foo;
    return cf_error_status;
}
I compiled it on Mac OS X with clang-6.0.1 (straight), and with gcc-4.8.3 using "-Wa,q" flag (because GAS does not support these new instructions).
3 Comments
- [...] gcc randu.c -o randu -Wa,q (GCC 5.3.1 on Ubuntu 16) or clang randu.c -o randu (Clang 3.8.0) works, but dumps core at runtime with Illegal instruction (core dumped). Any ideas?
- [...] rand(). I tried some tests and posted this question but I cannot find a definitive answer yet.
As said before, modulo isn't sufficient because it skews the distribution. Here's my code, which masks off bits and uses them to ensure the distribution isn't skewed.
#include <stdint.h> /* uint32_t */
#include <stdlib.h> /* rand() */

static uint32_t randomInRange(uint32_t a,uint32_t b) {
    uint32_t v;
    uint32_t range;
    uint32_t upper;
    uint32_t lower;
    uint32_t mask;

    if(a == b) {
        return a;
    }

    if(a > b) {
        upper = a;
        lower = b;
    } else {
        upper = b;
        lower = a;
    }

    range = upper - lower;

    mask = 0;
    //XXX calculate range with log and mask? nah, too lazy :).
    while(1) {
        if(mask >= range) {
            break;
        }
        mask = (mask << 1) | 1;
    }

    while(1) {
        v = rand() & mask;
        if(v <= range) {
            return lower + v;
        }
    }
}
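The mask loop above (the part marked XXX) can also be written without a loop by smearing the highest set bit of range into every lower position. This is a sketch of that standard bit trick, offered as an alternative rather than as the author's code; mask_for is a hypothetical name.

```c
#include <stdint.h>

/* Hypothetical helper: smallest value of the form 2^k - 1 that is >= range,
 * computed by copying the highest set bit into all lower bit positions. */
uint32_t mask_for(uint32_t range) {
    uint32_t mask = range;
    mask |= mask >> 1;
    mask |= mask >> 2;
    mask |= mask >> 4;
    mask |= mask >> 8;
    mask |= mask >> 16;
    return mask;
}
```

For example, a range of 5 yields a mask of 7, and both 8 and 9 yield 15, which are the same masks the loop produces.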
The following simple code lets you look at the distribution:
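#include <stdio.h> /* printf() */

int main(void) {
    unsigned long long int i;
    unsigned int n = 10;
    unsigned int numbers[n];

    for (i = 0; i < n; i++) {
        numbers[i] = 0;
    }

    for (i = 0 ; i < 10000000 ; i++){
        /* Call the function by its actual name, and avoid naming the
         * variable rand so the library function is not shadowed. */
        uint32_t value = randomInRange(0,n - 1);
        if(value >= n){
            printf("bug: value out of range %u\n",(unsigned int)value);
            return 1;
        }
        numbers[value] += 1;
    }

    for(i = 0; i < n; i++) {
        /* i is unsigned long long, so it needs %llu */
        printf("%llu: %u\n",i,numbers[i]);
    }
    return 0;
}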
2 Comments
- [...] v = rand(); if (v > RAND_MAX - (RAND_MAX % range) -> reject and try again; else return v % range; I understand that modulo is a much slower operation than masking, but I still think ..... it should be tested.
- [...] rand() returns an int in the range [0..RAND_MAX]. That range can easily be a subrange of uint32_t and then randomInRange(0, ,b) never generates values in the range (INT_MAX...b].
Will return a floating point number in the range [0,1]:
#define rand01() (((double)random())/((double)(RAND_MAX)))
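The macro above can return exactly 1.0, and it pairs random() with RAND_MAX, which only lines up on systems where the two functions share the same maximum. A variant that stays in the half-open interval [0, 1) might look like the sketch below; the name rand01_half_open is an assumption, and it uses rand() so that RAND_MAX is the matching bound.

```c
#include <stdlib.h>

/* Hypothetical variant: floating-point value in [0, 1), never exactly 1.0,
 * because the divisor is one more than the largest value rand() can return. */
double rand01_half_open(void) {
    return (double)rand() / ((double)RAND_MAX + 1.0);
}
```

The half-open form is convenient when the result is scaled and truncated, since (int)(rand01_half_open() * 6) + 1 then cannot produce 7.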