I want to create a list of random integers in the range [0, 9], that sums up to 100 in Excel, using VBA. The list should be printed in a single column.
The routine I've written is as follows:
Sub RandomList()
Dim arr(100) As Integer
Dim i As Integer
i = 0
Do
arr(i) = Int(10 * Rnd())
i = i + 1
Loop Until WorksheetFunction.Sum(arr) >= 100
arr(i - 1) = 100 - WorksheetFunction.Sum(arr) + arr(i - 1)
Range("A1").Resize(i, 1) = Application.Transpose(arr)
End Sub
This gives the correct result, but I'm sure there are ways to improve it:
- Initializing the size to 100, seems odd. I want to use a dynamic sized array, but I can't get it to work. I've tried different approaches, but all result in a list containing only zeros (or an error).
- I tried to sum all elements except the last one, but I didn't manage to make it work. Therefore I had to sum all elements, and subtract the last one. I can't believe that's optimal.
- Should I delete the
(100-i)
last elements of the list somehow, before writing it to Excel? How and why? - Is the last line a good way of writing to Excel, or is there a better way?
- I'm looking for a nice and clean way of doing this task, not necessarily the fastest way
I can do the same thing using Function RandomList()
too, but as far as I can tell, that's used when I want to call the function from the worksheet.
PS! I'm not interested in some builtin function that can do the task directly, as I'm doing this to learn VBA, not to do that specific task.
This is inspired by this (bad) question on SO. There are other approaches there that can be used, but they are substantially different. I'm wondering if the way I'm doing this is a "good way" of doing it, or if it's filled with "Bad practice" code.
4 Answers 4
Your logic is hacky, and your array should be dynamic.
Then, you should generalise your function so it can be used in other circumstances.
Also, Logic Error:
You've *assumed* that you won't need more elements than your target sum. But what if Rnd()
returns 100 zeroes? You'll get to i = 101
, try to set the value of arr(101)
and error because you only sized it for 100 elements. Hence why you need a dynamic array.
Dynamic array:
Dim dynamicArray () As Integer
ReDim dynamicArray (1 to 1)
Dim i As Long
For i = 1 to 100
ReDim Preserve dynamicArray (1 to i)
Next i
Redim Preserve
extends the dimension whilst preserving all the values already in it. It only works on the last dimension so if, for instance, you had a 2-D Array, you could only Redim Preserve
the 2nd Dimension.
Better Logic:
Start a counter at 100 (call it remainder
)
Add random Int from 0-9 to list
Remove Int from counter
Do until counter <= 0
Reduce the last Int by the negative amount now held by remainder
Like so:
Dim sumList() As Long
ReDim sumList(1 To 1)
Dim remainder As Long
remainder = 100
Dim i As Long
Dim rndNum As Double
Do Until remainder <= 0
i = i + 1
ReDim Preserve sumList(1 To i)
rndNum = WorksheetFunction.Floor(10 * Rnd(), 0)
sumList(i) = rndNum
remainder = remainder - rndNum
Loop
sumList(i) = sumList(i) + remainder
Refactoring/Generalising:
Your sub is very specific (0-9, sum of 100, integers). Why not make it a function that accepts these things as parameters? Then, whenever you need any kind of random list, you can just call your function with whatever parameters you need without writing a new one.
Parameters:
- Allowed Value Range (2 parameters. A
minValue
and amaxValue
) targetSum
- (Optional) level of precision (integers (0 DP), 1 Decimal Place, 2 etc.)
Mathematical Note:
Allowing for any value space (say, numbers from 2.7 to 3.81 with values of 3 decimal places):
First, compute the size of the space (max - min). For a range of 2 to 7, the size would be 7-2 = 5.
We then add an extra buffer so that Floor()
has an equal chance of hitting the top of our value space. This will be 1 * 10 ^ (-numDecimalPlaces)
.
We then multiply Rnd()
by the size of the space to get a random position within our value space.
Then add that value to our minValue
, and Floor()
to the appropriate level of precision.
Function:
Public Function RandList(ByVal minValue As Double, ByVal maxValue As Double, ByVal targetSum As Double, Optional ByVal numDecimalPlaces As Long = 0) As Double()
sumTotal = WorksheetFunction.Floor(sumTotal, numDecimalPlaces) '/ In Case, e.g. the total is 100.24 but we're only working in whole numbers
Dim sumList() As Double
ReDim sumList(1 To 1)
Dim remainder As Double
remainder = targetSum
Dim valueSpace As Double
valueSpace = maxValue - minValue + 1 * 10 ^ (-numDecimalPlaces) '/ 1 * 10 ^ (-numDecimalPlaces) is our Floor() buffer so we can still hit the top of the value space.
Dim i As Long
Dim rndNum As Double
Do Until remainder <= 0
i = i + 1
ReDim Preserve sumList(1 To i)
rndNum = WorksheetFunction.Floor(minValue + (Rnd() * valueSpace), numDecimalPlaces)
sumList(i) = rndNum
remainder = remainder - rndNum
Loop
sumList(i) = sumList(i) + remainder
RandList = sumList
End Function
Which you can then call like so:
Public Sub DescriptiveName()
Dim sumList() As Double
sumList = RandList(0, 9, 100)
'/ Stuff to print sumList goes here
End Sub
N.B. Thanks to @arcadeprecinct and @raystafarian for suggestions and improvements.
-
1\$\begingroup\$ Wouldn't
sumList(i) = sumList(i) + remainder
have the same effect as the second half of the function? \$\endgroup\$arcadeprecinct– arcadeprecinct2016年06月15日 15:10:51 +00:00Commented Jun 15, 2016 at 15:10 -
\$\begingroup\$ You know, it would ^^ \$\endgroup\$Kaz– Kaz2016年06月15日 15:11:56 +00:00Commented Jun 15, 2016 at 15:11
-
\$\begingroup\$ Wait, why is this a function? \$\endgroup\$Raystafarian– Raystafarian2016年06月15日 15:32:59 +00:00Commented Jun 15, 2016 at 15:32
-
\$\begingroup\$ @Raystafarian Oh, yeah, whoops. I should put the return statement in. \$\endgroup\$Kaz– Kaz2016年06月15日 15:33:41 +00:00Commented Jun 15, 2016 at 15:33
-
1\$\begingroup\$ Thanks! This is great! =) For the record (in my defense): It wasn't really a logical flaw, it was based on probability. The likelihood of getting to 101 elements is extremely low, but you are right... It could happen, especially since the numbers
[0,9], n=100
can change. \$\endgroup\$Stewie Griffin– Stewie Griffin2016年06月15日 16:36:54 +00:00Commented Jun 15, 2016 at 16:36
The procedure is doing a number of completely unrelated things that should be separated:
- Populating an array with random integers
- Dumping an array of random integers onto a worksheet
Make the procedure a Function
, and rename it to something that starts with a verb, something that tells the reader what it does. GenerateRandomValues
for example.
I'd take the magic cap-value 100
as a (perhaps Optional
) parameter.
Every time you use WorksheetFunction.Sum
, you're making Excel iterate all values in the array, add up all the values and return a value that you should already know.
Declare a local variable to hold the cumulative sum, and add every new generated integer to it as you add it to the array: then when you need to know what the total is, you don't need to calculate or iterate anything, it's right there waiting to be used.
Make the function return the array, and then make another procedure responsible for calling that function, and consuming its result: then you can write as many test methods (in a dedicated test module) as you like, to document your specifications and programmatically validate that the function works as intended given any input!
I'd rename arr
to either result
or values
.
Initializing the size to 100, seems odd. I want to use a dynamic sized array
Indeed. See @Zak's answer, but then I'd make sure to minimize the amount of resizing happening, and initialize the dynamic array with 11 items, since it's a given that you're going to need at least that many iterations to get to a sum of 100 if you're generating random numbers between 0 and 9. This could be calculated if the 100 is a parameter.
Is the last line a good way of writing to Excel, or is there a better way?
You're dumping an entire array of values in a single write, it doesn't get any better than that! If transposing the output makes it look the way it needs to be, then having Excel to the hard work of transposing the values is the best way to go about it IMO. The only problem I'm seeing is that you're implicitly referring to the active sheet, which may or may not be where you want to dump the results: it's always best to use an explicit worksheet reference, rather than [_Global].Range()
.
After generating the random values, you should shuffle them once. Otherwise the last elements will on average be smaller than 4.5, the expected average.
-
\$\begingroup\$ Good point, one of the numbers won't be truly random. Depending on what the list is for it's probably good enough, but if it's for something "important" like cryptography or gambling, I guess it's possible it matters :) \$\endgroup\$Colin Coghill– Colin Coghill2016年06月16日 23:27:59 +00:00Commented Jun 16, 2016 at 23:27
I think it's pretty good. An alternate way of handling some of that would be
Option Explicit
Public Sub RandomList()
Columns(1).Clear
Dim myNumbers(100) As Long
Dim i As Long
Dim j As Long
Dim k as Long
i = 0
k = 0
Do
j = Int(10 * Rnd())
'If j = 0 Then j = 1 optional if you want to keep 0s
If k + j > 100 Then j = 100 - k
k = k + j
myNumbers(i) = j
i = i + 1
Loop Until k = 100
Range("A1").Resize(i, 1) = Application.Transpose(myNumbers)
End Sub
It's just eliminating 0
s and fixing the last element before exiting the loop.
Actually, since you're looking for n I might do this
Option Explicit
Public Sub RandomList()
Const mySum = 100
Columns(1).Clear
Dim myNumbers(mySum) As Long
Dim i As Long
Dim j As Long
Dim k As Long
i = 0
k = 0
Do
j = Int(10 * Rnd())
'If j = 0 Then j = 1 optional if you want to keep 0s
If k + j > mySum Then j = mySum - k
k = k + j
myNumbers(i) = j
i = i + 1
Loop Until k = mySum
Range("A1").Resize(i, 1) = Application.Transpose(myNumbers)
End Sub