How to generate a random integer number from within a range

Question 1

This is a follow on from a previously posted question:

I wish to be able to generate a random number from within a particular range, such as 1 to 6 to mimic the sides of a die.

How would I go about doing this?

Question 2

if you look at the second answer to the question you refer to you have the answer. rand() % 6.

Question 3

I didn't understand how it worked, so I decided to make a separate question for clarity.

Question 4

Random thought: If you polled a random cross-section of programmers, you'd find a random number of them are randomly thinking of ways to randomly generate numbers. Considering the Universe is governed by precise and predictable laws, isn't it interesting that we try to generate things more randomly? Questions like this always tend to bring out the 10k+ posters.

Question 5

@Mats rand() % 6 can return a 0. Not good for a die.

Question 6

Can you mark stackoverflow.com/a/6852396/419 as the accepted answer instead of the answer that links to it :) Thanks.

Question 7

All the answers so far are mathematically wrong. Returning rand() % N does not uniformly give a number in the range [0, N) unless N divides the length of the interval into which rand() returns (i.e. is a power of 2). Furthermore, one has no idea whether the moduli of rand() are independent: it's possible that they go 0, 1, 2, ..., which is uniform but not very random. The only assumption it seems reasonable to make is that rand() puts out a Poisson distribution: any two nonoverlapping subintervals of the same size are equally likely and independent. For a finite set of values, this implies a uniform distribution and also ensures that the values of rand() are nicely scattered.

This means that the only correct way of changing the range of rand() is to divide it into boxes; for example, if RAND_MAX == 11 and you want a range of 1..6, you should assign {0,1} to 1, {2,3} to 2, and so on. These are disjoint, equally-sized intervals and thus are uniformly and independently distributed.

The suggestion to use floating-point division is mathematically plausible but suffers from rounding issues in principle. Perhaps double is high-enough precision to make it work; perhaps not. I don't know and I don't want to have to figure it out; in any case, the answer is system-dependent.

The correct way is to use integer arithmetic. That is, you want something like the following:

#include <stdlib.h> // For random(), RAND_MAX
// Assumes 0 <= max <= RAND_MAX
// Returns in the closed interval [0, max]
long random_at_most(long max) {
 unsigned long
 // max <= RAND_MAX < ULONG_MAX, so this is okay.
 num_bins = (unsigned long) max + 1,
 num_rand = (unsigned long) RAND_MAX + 1,
 bin_size = num_rand / num_bins,
 defect = num_rand % num_bins;
 long x;
 do {
 x = random();
 }
 // This is carefully written not to overflow
 while (num_rand - defect <= (unsigned long)x);
 // Truncated division is intentional
 return x/bin_size;
}

The loop is necessary to get a perfectly uniform distribution. For example, if you are given random numbers from 0 to 2 and you want only ones from 0 to 1, you just keep pulling until you don't get a 2; it's not hard to check that this gives 0 or 1 with equal probability. This method is also described in the link that nos gave in their answer, though coded differently. I'm using random() rather than rand() as it has a better distribution (as noted by the man page for rand()).

If you want to get random values outside the default range [0, RAND_MAX], then you have to do something tricky. Perhaps the most expedient is to define a function random_extended() that pulls n bits (using random_at_most()) and returns in [0, 2**n), and then apply random_at_most() with random_extended() in place of random() (and 2**n - 1 in place of RAND_MAX) to pull a random value less than 2**n, assuming you have a numerical type that can hold such a value. Finally, of course, you can get values in [min, max] using min + random_at_most(max - min), including negative values.

Question 8

@Adam Rosenfield,@Ryan Reich : In a related question where Adam had answered:stackoverflow.com/questions/137783/… the most upvoted answer : The usage of 'modulus' would then be incorrect, no? To generate 1..7 from 1..21,the procedure what Ryan has described should be used.Please correct me if I am wrong.

Question 9

On further review, another issue here is that this won't work when max - min > RAND_MAX, which is more serious than the issue I stated above (e.g. VC++ has RAND_MAX of only 32767).

Question 10

The while loop could be made more readable. Rather than performing assignment in the conditional, you probably want a do {} while().

Question 11

Hey, This answer is cited by Comet OS book ;) First time I see that in a teaching book

Question 12

It is also cited in the OSTEP book :) pages.cs.wisc.edu/~remzi/OSTEP (Chapter 9, Page 4)

Question 13

Following on from @Ryan Reich's answer, I thought I'd offer my cleaned up version. The first bounds check isn't required given the second bounds check, and I've made it iterative rather than recursive. It returns values in the range [min, max], where max >= min and 1+max-min < RAND_MAX.

unsigned int rand_interval(unsigned int min, unsigned int max)
{
 int r;
 const unsigned int range = 1 + max - min;
 const unsigned int buckets = RAND_MAX / range;
 const unsigned int limit = buckets * range;
 /* Create equal size buckets all in a row, then fire randomly towards
 * the buckets until you land in one of them. All buckets are equally
 * likely. If you land off the end of the line of buckets, try again. */
 do
 {
 r = rand();
 } while (r >= limit);
 return min + (r / buckets);
}

Question 14

Note this will get stuck in an infinite loop if range >= RAND_MAX. Ask me how I know :/

Question 15

Note that you are comparing an int to an unsigned int (r >= limit). The problem is easily solved by making limit an int (and optionally bucket too) since RAND_MAX / range < INT_MAX and buckets * range <= RAND_MAX. EDIT: I've submitted and edit proposal.

Question 16

the solution from @Ryan Reich still gives me better (less-biased) distribution

Question 17

To assess restrictions without overflow, after max >= min, use max-min < RAND_MAX - 1, not 1+max-min < RAND_MAX.

Question 18

Here is a formula if you know the max and min values of a range, and you want to generate numbers inclusive in between the range:

r = (rand() % (max + 1 - min)) + min

Question 19

As noted in Ryan's answer, this produces a biased result.

Question 20

Biased result, potential int overflow with max+1-min.

Question 21

this work only with integer min and max. If the min and max are float, it's no possible to do the % operation

Question 22

unsigned int
randr(unsigned int min, unsigned int max)
{
 double scaled = (double)rand()/RAND_MAX;
 return (max - min +1)*scaled + min;
}

See here for other options.

Question 23

@S.Lott - not really. Each distributes the slightly-higher-odds cases differently, that's all. The double math gives the impression that there's more precision there, but you could just as easily use (((max-min+1)*rand())/RAND_MAX)+min and get probably the exact same distribution (assuming that RAND_MAX is small enough relative to int to not overflow).

Question 24

This is slightly dangerous: it's possible for this to (very rarely) return max + 1, if either rand() == RAND_MAX, or rand() is very close to RAND_MAX and floating-point errors push the final result past max + 1. To be safe, you should check that the result is within range before returning it.

Question 25

@Christoph: I agree about RAND_MAX + 1.0. I'm still not sure that's good enough to prevent a max + 1 return, though: in particular, the + min at the end involves a round that could end up producing max + 1 for large values of rand(). Safer to abandon this approach altogether and use integer arithmetic.

Question 26

If RAND_MAX is replaced by RAND_MAX+1.0 as Christoph suggests, then I believe that this is safe provided that the + min is done using integer arithmetic: return (unsigned int)((max - min + 1) * scaled) + min. The (non-obvious) reason is that assuming IEEE 754 arithmetic and round-half-to-even, (and also that max - min + 1 is exactly representable as a double, but that'll be true on a typical machine), it's always true that x * scaled < x for any positive double x and any double scaled satisfying 0.0 <= scaled && scaled < 1.0.

Question 27

Fails for randr(0, UINT_MAX): always generates 0.

Question 28

Wouldn't you just do:

srand(time(NULL));
int r = ( rand() % 6 ) + 1;

% is the modulus operator. Essentially it will just divide by 6 and return the remainder... from 0 - 5

Question 29

It will give results from 1 - 6. That's what the + 1 is for.

Question 30

Simon, show me a libc in use anywhere where rand() includes the low-order bits of the generator's state (if it uses an LCG). I haven't seen one so far—all of them (yes, including MSVC with RAND_MAX being just 32767) remove the low-order bits. Using modulus isn't recommended for other reasons, namely that it skews the distribution in favor of smaller numbers.

Question 31

@Johannes: So it's safe to say slot machines don't use modulus?

Question 32

How would I exclude a 0? It seems that if I run it in a loop of 30, maybe the second or third time it runs there's a 0 roughly half way into it. Is this some sort of fluke?

Question 33

@Johannes: Maybe it's not so much an issue nowadays, but traditionally using the low-order bits isn't advisable. c-faq.com/lib/randrange.html

Question 34

For those who understand the bias problem but can't stand the unpredictable run-time of rejection-based methods, this series produces a progressively less biased random integer in the [0, n-1] interval:

r = n / 2;
r = (rand() * n + r) / (RAND_MAX + 1);
r = (rand() * n + r) / (RAND_MAX + 1);
r = (rand() * n + r) / (RAND_MAX + 1);
...

It does so by synthesising a high-precision fixed-point random number of i * log_2(RAND_MAX + 1) bits (where i is the number of iterations) and performing a long multiplication by n.

When the number of bits is sufficiently large compared to n, the bias becomes immeasurably small.

It does not matter if RAND_MAX + 1 is less than n (as in this question), or if it is not a power of two, but care must be taken to avoid integer overflow if RAND_MAX * n is large.

Question 35

RAND_MAX is often INT_MAX, so RAND_MAX + 1 --> UB (like INT_MIN)

Question 36

@chux that's what I mean about "care must be taken to avoid integer overflow if RAND_MAX * n is large". You need to arrange to use appropriate types for your requirements.

Question 37

@chux "RAND_MAX is often INT_MAX" Yes, but only on 16 bit systems! Any reasonably modern architechture will put INT_MAX at 2^32 / 2 and RAND_MAX at 2^16 / 2. Is this an incorrect assumption?

Question 38

@cat Tested today 2 32-bit int compilers, I found RAND_MAX == 32767 on one and RAND_MAX == 2147483647 on another. My overall experience (decades) is that RAND_MAX == INT_MAX more often. So disagree that a reasonably modern 32-bit architecture will certainly have a RAND_MAX at 2^16 / 2. Since the C spec allows 32767 <= RAND_MAX <= INT_MAX, I code to that anyways rather than a tendency.

Question 39

Still covered by "care must be taken to avoid integer overflow".

Question 40

Here is a slight simpler algorithm than Ryan Reich's solution:

/// Begin and end are *inclusive*; => [begin, end]
uint32_t getRandInterval(uint32_t begin, uint32_t end) {
 uint32_t range = (end - begin) + 1;
 uint32_t limit = ((uint64_t)RAND_MAX + 1) - (((uint64_t)RAND_MAX + 1) % range);
 /* Imagine range-sized buckets all in a row, then fire randomly towards
 * the buckets until you land in one of them. All buckets are equally
 * likely. If you land off the end of the line of buckets, try again. */
 uint32_t randVal = rand();
 while (randVal >= limit) randVal = rand();
 /// Return the position you hit in the bucket + begin as random number
 return (randVal % range) + begin;
}

Example (RAND_MAX := 16, begin := 2, end := 7)
 => range := 6 (1 + end - begin)
 => limit := 12 (RAND_MAX + 1) - ((RAND_MAX + 1) % range)
The limit is always a multiple of the range,
so we can split it into range-sized buckets:
 Possible-rand-output: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
 Buckets: [0, 1, 2, 3, 4, 5][0, 1, 2, 3, 4, 5][X, X, X, X, X]
 Buckets + begin: [2, 3, 4, 5, 6, 7][2, 3, 4, 5, 6, 7][X, X, X, X, X]
1st call to rand() => 13
 → 13 is not in the bucket-range anymore (>= limit), while-condition is true
 → retry...
2nd call to rand() => 7
 → 7 is in the bucket-range (< limit), while-condition is false
 → Get the corresponding bucket-value 1 (randVal % range) and add begin
 => 3

Question 41

RAND_MAX + 1 can readily overflow int addition. In that case, (RAND_MAX + 1) % range will generate questionable results. Consider (RAND_MAX + (uint32_t)1)

Question 42

In order to avoid the modulo bias (suggested in other answers) you can always use:

arc4random_uniform(MAX-MIN)+MIN

Where "MAX" is the upper bound and "MIN" is lower bound. For example, for numbers between 10 and 20:

arc4random_uniform(20-10)+10
arc4random_uniform(10)+10

Simple solution and better than using "rand() % N".

Question 43

Woohoo, this is a billion times better than the other answers. Worth noting you need to #include <bsd/stdlib.h> first. Also, any idea how to get this on Windows without MinGW or CygWin?

Question 44

No, it is not per se better than the other answers, because the other answers are more generic. Here you are limited to arc4random, the other answers allow you to choose a different random source, operate with different number-types,... and last but not least they might help someone to understand the problem. Don't forget that the question is also interesting for other people who might have some special requirements or no access to arc4random... Nonetheless, if you have access to it and want a quick solution, it is indeed a very good answer 😊

Question 45

While Ryan is correct, the solution can be much simpler based on what is known about the source of the randomness. To re-state the problem:

There is a source of randomness, outputting integer numbers in range [0, MAX) with uniform distribution.
The goal is to produce uniformly distributed random integer numbers in range [rmin, rmax] where 0 <= rmin < rmax < MAX.

In my experience, if the number of bins (or "boxes") is significantly smaller than the range of the original numbers, and the original source is cryptographically strong - there is no need to go through all that rigamarole, and simple modulo division would suffice (like output = rnd.next() % (rmax+1), if rmin == 0), and produce random numbers that are distributed uniformly "enough", and without any loss of speed. The key factor is the randomness source (i.e., kids, don't try this at home with rand()).

Here's an example/proof of how it works in practice. I wanted to generate random numbers from 1 to 22, having a cryptographically strong source that produced random bytes (based on Intel RDRAND). The results are:

Rnd distribution test (22 boxes, numbers of entries in each box): 
 1: 409443 4.55%
 2: 408736 4.54%
 3: 408557 4.54%
 4: 409125 4.55%
 5: 408812 4.54%
 6: 409418 4.55%
 7: 408365 4.54%
 8: 407992 4.53%
 9: 409262 4.55%
10: 408112 4.53%
11: 409995 4.56%
12: 409810 4.55%
13: 409638 4.55%
14: 408905 4.54%
15: 408484 4.54%
16: 408211 4.54%
17: 409773 4.55%
18: 409597 4.55%
19: 409727 4.55%
20: 409062 4.55%
21: 409634 4.55%
22: 409342 4.55% 
total: 100.00%

This is as close to uniform as I need for my purpose (fair dice throw, generating cryptographically strong codebooks for WWII cipher machines such as http://users.telenet.be/d.rijmenants/en/kl-7sim.htm, etc). The output does not show any appreciable bias.

Here's the source of cryptographically strong (true) random number generator: Intel Digital Random Number Generator and a sample code that produces 64-bit (unsigned) random numbers.

int rdrand64_step(unsigned long long int *therand)
{
 unsigned long long int foo;
 int cf_error_status;
 asm("rdrand %%rax; \
 mov 1,ドル%%edx; \
 cmovae %%rax,%%rdx; \
 mov %%edx,%1; \
 mov %%rax, %0;":"=r"(foo),"=r"(cf_error_status)::"%rax","%rdx");
 *therand = foo;
 return cf_error_status;
}

I compiled it on Mac OS X with clang-6.0.1 (straight), and with gcc-4.8.3 using "-Wa,q" flag (because GAS does not support these new instructions).

Question 46

Compiled with gcc randu.c -o randu -Wa,q (GCC 5.3.1 on Ubuntu 16) or clang randu.c -o randu (Clang 3.8.0) works, but dumps core at runtime with Illegal instruction (core dumped). Any ideas?

Question 47

First, I don't know whether your CPU actually supports RDRAND instruction. Your OS is fairly recent, but the CPU may not be. Second (but this is less likely) - I've no idea what kind of assembler Ubuntu includes (and Ubuntu tends to be fairly backwards wrt. updating packages). Check the Intel site I referred to for ways to test whether your CPU supports RDRAND.

Question 48

You have indeed good points. What I still cannot get is what is so wrong with rand(). I tried some tests and posted this question but I cannot find a definitive answer yet.

Question 49

As said before modulo isn't sufficient because it skews the distribution. Heres my code which masks off bits and uses them to ensure the distribution isn't skewed.

static uint32_t randomInRange(uint32_t a,uint32_t b) {
 uint32_t v;
 uint32_t range;
 uint32_t upper;
 uint32_t lower;
 uint32_t mask;
 if(a == b) {
 return a;
 }
 if(a > b) {
 upper = a;
 lower = b;
 } else {
 upper = b;
 lower = a; 
 }
 range = upper - lower;
 mask = 0;
 //XXX calculate range with log and mask? nah, too lazy :).
 while(1) {
 if(mask >= range) {
 break;
 }
 mask = (mask << 1) | 1;
 }
 while(1) {
 v = rand() & mask;
 if(v <= range) {
 return lower + v;
 }
 }
}

The following simple code lets you look at the distribution:

int main() {
 unsigned long long int i;
 unsigned int n = 10;
 unsigned int numbers[n];
 for (i = 0; i < n; i++) {
 numbers[i] = 0;
 }
 for (i = 0 ; i < 10000000 ; i++){
 uint32_t rand = random_in_range(0,n - 1);
 if(rand >= n){
 printf("bug: rand out of range %u\n",(unsigned int)rand);
 return 1;
 }
 numbers[rand] += 1;
 }
 for(i = 0; i < n; i++) {
 printf("%u: %u\n",i,numbers[i]);
 }
}

Question 50

Becomes quite inefficient when you reject numbers from the rand(). This will be especially inefficient when the range has a size that can be written as 2^k + 1. Then nearly half of all your attempts from a slow rand() call will be rejected by the condition. Would it may be better to calc RAND_MAX modulo range. Like: v = rand(); if (v > RAND_MAX - (RAND_MAX % range) -> reject and try again; else return v % range; I understand that modulo is a much slower operation than masking, but I still think ..... it should be tested.

Question 51

rand() returns an int in the range [0..RAND_MAX]. That range can easily be a subrange of uint32_t and then randomInRange(0, ,b) never generates values in the range (INT_MAX...b].

Question 52

Will return a floating point number in the range [0,1]:

#define rand01() (((double)random())/((double)(RAND_MAX)))

Ryan Reich 2,4762 gold badges18 silver badges15 bronze badges · Accepted Answer · 2011-07-27 23:20:38Z

All the answers so far are mathematically wrong. Returning rand() % N does not uniformly give a number in the range [0, N) unless N divides the length of the interval into which rand() returns (i.e. is a power of 2). Furthermore, one has no idea whether the moduli of rand() are independent: it's possible that they go 0, 1, 2, ..., which is uniform but not very random. The only assumption it seems reasonable to make is that rand() puts out a Poisson distribution: any two nonoverlapping subintervals of the same size are equally likely and independent. For a finite set of values, this implies a uniform distribution and also ensures that the values of rand() are nicely scattered.

This means that the only correct way of changing the range of rand() is to divide it into boxes; for example, if RAND_MAX == 11 and you want a range of 1..6, you should assign {0,1} to 1, {2,3} to 2, and so on. These are disjoint, equally-sized intervals and thus are uniformly and independently distributed.

The suggestion to use floating-point division is mathematically plausible but suffers from rounding issues in principle. Perhaps double is high-enough precision to make it work; perhaps not. I don't know and I don't want to have to figure it out; in any case, the answer is system-dependent.

The correct way is to use integer arithmetic. That is, you want something like the following:

#include <stdlib.h> // For random(), RAND_MAX
// Assumes 0 <= max <= RAND_MAX
// Returns in the closed interval [0, max]
long random_at_most(long max) {
 unsigned long
 // max <= RAND_MAX < ULONG_MAX, so this is okay.
 num_bins = (unsigned long) max + 1,
 num_rand = (unsigned long) RAND_MAX + 1,
 bin_size = num_rand / num_bins,
 defect = num_rand % num_bins;
 long x;
 do {
 x = random();
 }
 // This is carefully written not to overflow
 while (num_rand - defect <= (unsigned long)x);
 // Truncated division is intentional
 return x/bin_size;
}

The loop is necessary to get a perfectly uniform distribution. For example, if you are given random numbers from 0 to 2 and you want only ones from 0 to 1, you just keep pulling until you don't get a 2; it's not hard to check that this gives 0 or 1 with equal probability. This method is also described in the link that nos gave in their answer, though coded differently. I'm using random() rather than rand() as it has a better distribution (as noted by the man page for rand()).

If you want to get random values outside the default range [0, RAND_MAX], then you have to do something tricky. Perhaps the most expedient is to define a function random_extended() that pulls n bits (using random_at_most()) and returns in [0, 2**n), and then apply random_at_most() with random_extended() in place of random() (and 2**n - 1 in place of RAND_MAX) to pull a random value less than 2**n, assuming you have a numerical type that can hold such a value. Finally, of course, you can get values in [min, max] using min + random_at_most(max - min), including negative values.

@Adam Rosenfield,@Ryan Reich : In a related question where Adam had answered:stackoverflow.com/questions/137783/… the most upvoted answer : The usage of 'modulus' would then be incorrect, no? To generate 1..7 from 1..21,the procedure what Ryan has described should be used.Please correct me if I am wrong.
On further review, another issue here is that this won't work when max - min > RAND_MAX, which is more serious than the issue I stated above (e.g. VC++ has RAND_MAX of only 32767).
The while loop could be made more readable. Rather than performing assignment in the conditional, you probably want a do {} while().
Hey, This answer is cited by Comet OS book ;) First time I see that in a teaching book
It is also cited in the OSTEP book :) pages.cs.wisc.edu/~remzi/OSTEP (Chapter 9, Page 4)

CollectivesTM on Stack Overflow

How to generate a random integer number from within a range

11 Answers 11

27 Comments

4 Comments

3 Comments

11 Comments

8 Comments

7 Comments

1 Comment

2 Comments

3 Comments

2 Comments

Comments

Linked

Hot Network Questions

CollectivesTM on Stack Overflow

11 Answers 11

27 Comments

4 Comments

3 Comments

11 Comments

8 Comments

7 Comments

1 Comment

2 Comments

3 Comments

2 Comments

Comments

Linked

Related