Picking a random item from array with equal distribution

Question 1

I want to pick a random item from an array at random.

Math.floor(Math.random() * array.length);

Is the way to go, but as far as I know this will cause a Uniform distribution to occur which means that the average is (lowbound+upperbound)/2 translated to an array with 10 elements the lower bound is the first element and the upper bound is the last element causes an average of 5, which is not random

Therefore, I looked at the frequency distribution of this way of random picking an item by having 10 elements and picking one with the code above. The element represents the index and is pushed into an array. After 10000 numbers, the frequency is counted and given.

This has the following results:

Index: Frequency
0: 1083
1: 996
2: 1022
3: 966
4: 958
5: 962
6: 1044
7: 1045
8: 972
9: 952

Ofc, this is only 1 run of 10k numbers. But it shows that index 0 has a 10.8% chance and index 9 has a 9.5% chance. This difference is 1.3% which I find quite a lot.

Are there methods that can do this better? For example, get to 0.05% difference in numbers? The ideal situation would be that they are all 10% (equally distributed).

Question 2

If you can precompute the result (i.e. you need a finite number of results, not an infinite stream) and the number of results is divisible by the number of items, you can get a perfect distribution:

Generate an array that repeats the items until you've got enough, i.e. [1, 2, 3, 1, 2, 3, 1, 2, 3, ...]. The array is thus guaranteed to have exactly as many instances of each item.
Shuffle the array with a fair shuffle algorithm, e.g. Fisher-Yates. The array still has exactly as many instances of each item.

If you do need an infinite stream, you could use something like an "item bag" model (which, btw, is how the blocks in Tetris are chosen):

Fill a "bag" with your items ([1, 2, 3]). Shuffle it (as above).
When you need an item, pop the first one from the shuffled bag.
If the bag is empty, re-fill it according to step 1.

The only case where this doesn't have a perfect distribution is if you stop "mid-bag".

Question 3

Thanks for the answer! In the first part, it seems you use the value of the items, rather than the index of the items. What I want is that all 10 (or if the array is larger more) items are equally likely to be drawn. Here I care about the indexes, not about the values (in the end all values are drawn with the same probability). As for the 2nd, I thought about that one, but does it give a perfect distribtion?

Question 4

You can consider the array an array of indexes or an array of items, however you like. (I.e. shuffledIndices, and you'd get an item with items[shuffledIndices[i]].)

Question 5

For the bag approach, the long-term distribution will be (according to my intuition) perfect, but it doesn't work like classical probability, since you won't have a chance of streaks of the same item occurring.

Question 6

@AKX YOu will have streaks of f the same item, but only streaks of the length of two.

Question 7

Well sure. That can also be averted, if required, by adapting the re-bag step to make sure that doesn't happen.

Question 8

Here another method - count number of samples already happen, select values from Categorical distribution but with probability INVERSE to the count, thus more frequent item will be less probable to be sampled next time.

Some code (in C#)

import MathNet.Numerics.Distributions;
static void Main() {
 const int N = 4;
 var counts = new int [N] {1, 1, 1, 1};
 var weights = new double [N] {1.0, 1.0, 1.0, 1.0};
 while (true) {
 int v = Categorical.Sample(weights[k]); // sample one value in [0...N)
 // update counts and weights
 counts[v] += 1;
 weights[v] = 1.0/(double)counts[v];
 
 // use v here for something
 ...
 }
}

Actually, any monotonically growing function of count will do, f.e.

weights[v] = 1.0/(1.0 + .5*(double)counts[v]);

might work, or

var squared => (x) => x*x;
weights[v] = 1.0/(7.0 + .25*squared((double)counts[v]));

or

weights[v] = 1.0/(3.0 + Math.Sqrt((double)counts[v]));

Question 9

What you have shown in your question is simply the fact that the JavaScript random number generator is simulating not just uniformly distributed, but also independent random numbers; each chosen number behaves as though it were independent of any other choice. Because of this independence, each number "doesn't care" how often each other number was chosen, as long as with each choice, each possible outcome is as likely as any other (according to the JavaScript generator).

If you want a distribution that "feels" more uniform, you will have to adjust the chances of each outcome, so that the chances depend on previous outcomes. A previous answer showed some ways how this can be done. Here is another, which I gave as an answer to a similar question.

Give each item the same weight, specified as a positive integer. For example, give a weight of 20 to each item.
Use a weighted-choice-with-replacement algorithm. Perhaps the simplest is rejection sampling, described as follows. Assume that the highest weight is max and each weight is 0 or greater. To choose an integer in the interval [1, weights.length] using rejection sampling:
1. Choose a uniform random integer i in [1, weights.length].
2. With probability weights[i]/max, return i. Otherwise, go to step 1. (For example, if all the weights are integers greater than 0, choose a uniform random integer in [1, max] and if that number is weights[i] or less, return i, or go to step 1 otherwise.)
There are many other ways to make a weighted choice besides rejection sampling; see my note on weighted choice algorithms.
As each item is chosen, reduce its weight by 1 to make it less likely to be chosen.
If all the weights are 0, assign each item the same weight chosen in step 1 (in this example, 20).

You didn't specify the kind of application you had in mind, but I see this desire for a "more uniform" distribution come up most often in games that wish to control which random numbers appear, to make the random outcomes appear "fairer" to players. In that case, however, you should also consider whether it may be better to make an (ordinary) independent uniform random choice instead, especially if you care whether players could gain an unfair advantage by predicting the random outcomes.

Question 10

Thanks for the insight. I have an array with X items (assume 30, don't know the exact number). These items describe an event that will happen, thus I do not want 1 event to happen more often than another event.

Question 11

In any case, you can use a "shuffle bag" approach given in another answer, or you can use the method in this answer (assigning a weight of 1 to each item). If this or any or the other answers doesn't fully solve your problem, can you show what you want to achieve and how these answers don't fully solve your problem?

Question 12

I did not have the time, yet, to try out one of the solutions. The shuffle bag approach seems the first one I'm going to try. If it doesn't work and others do not work. I will report back with data!

AKX 171k16 gold badges146 silver badges228 bronze badges · Accepted Answer · 2020-09-08 19:36:13Z

If you can precompute the result (i.e. you need a finite number of results, not an infinite stream) and the number of results is divisible by the number of items, you can get a perfect distribution:

Generate an array that repeats the items until you've got enough, i.e. [1, 2, 3, 1, 2, 3, 1, 2, 3, ...]. The array is thus guaranteed to have exactly as many instances of each item.
Shuffle the array with a fair shuffle algorithm, e.g. Fisher-Yates. The array still has exactly as many instances of each item.

If you do need an infinite stream, you could use something like an "item bag" model (which, btw, is how the blocks in Tetris are chosen):

Fill a "bag" with your items ([1, 2, 3]). Shuffle it (as above).
When you need an item, pop the first one from the shuffled bag.
If the bag is empty, re-fill it according to step 1.

The only case where this doesn't have a perfect distribution is if you stop "mid-bag".

Thanks for the answer! In the first part, it seems you use the value of the items, rather than the index of the items. What I want is that all 10 (or if the array is larger more) items are equally likely to be drawn. Here I care about the indexes, not about the values (in the end all values are drawn with the same probability). As for the 2nd, I thought about that one, but does it give a perfect distribtion?
You can consider the array an array of indexes or an array of items, however you like. (I.e. shuffledIndices, and you'd get an item with items[shuffledIndices[i]].)
For the bag approach, the long-term distribution will be (according to my intuition) perfect, but it doesn't work like classical probability, since you won't have a chance of streaks of the same item occurring.
@AKX YOu will have streaks of f the same item, but only streaks of the length of two.
Well sure. That can also be averted, if required, by adapting the re-bag step to make sure that doesn't happen.

CollectivesTM on Stack Overflow

Picking a random item from array with equal distribution

3 Answers 3

5 Comments

Comments

3 Comments

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Hot Network Questions

CollectivesTM on Stack Overflow

3 Answers 3

5 Comments

Comments

3 Comments

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Related