I have this pseudorandom number generator:
package com.github.coderodde.util;
public class Random {
private static final long MASK = 0xba42_87df_9103_cd55L;
private static final long LARGE_PRIME = 2396568697254547L;
private static final int TURN_OFF_INT_SIGN_BIT = 0x7fff_ffff;
private long counter;
private final long seed;
public Random(long seed) {
this.seed = seed;
}
public Random() {
this(System.nanoTime());
}
public int nextInt() {
return ((int)((seed + (counter += LARGE_PRIME)) ^ MASK))
& TURN_OFF_INT_SIGN_BIT;
}
public int nextInt(int bound) {
checkBound(bound);
return nextInt() % bound;
}
private static void checkBound(int bound) {
if (bound < 1) {
throw new IllegalArgumentException(
"Bad bound: " + bound + ". Must be positive.");
}
}
}
And the demo program is:
package com.github.coderodde.util;
public class Demo {
public static void main(String[] args) {
Random random = new Random(1L);
for (int i = 0; i < 20; i++) {
System.out.println(random.nextInt(1000));
}
}
}
Typical output
977
282
95
928
493
782
715
828
697
218
687
896
341
86
651
284
873
50
423
472
Critique request
As always, I would like to hear anything that comes to mind.
-
\$\begingroup\$ Define "more random"? \$\endgroup\$Simon Forsberg– Simon Forsberg2022年10月28日 10:51:06 +00:00Commented Oct 28, 2022 at 10:51
-
\$\begingroup\$ I have no definition for "more random". But I wouldn’t be surprised if there are some better practices. \$\endgroup\$coderodde– coderodde2022年10月28日 10:54:21 +00:00Commented Oct 28, 2022 at 10:54
-
\$\begingroup\$ Usually, a PRNG will have documentation outlining the statistical properties of the generated sequence as well as basic properties of the PRNG such as its period or its state space. I don't see that here. Is that intentional? Because without such documentation, it is essentially useless. How can I use a PRNG if it doesn't say what it can be used for? \$\endgroup\$Jörg W Mittag– Jörg W Mittag2022年11月02日 15:21:07 +00:00Commented Nov 2, 2022 at 15:21
1 Answer 1
This PRNG has some unfortunate characteristics.
The easiest to spot is that the least significant bit of the
counter
has a period of 2. It will only ever alternate between even and odd. Adding the seed and XOR-ing byMASK
does not change anything about that, and for an evenbound
that property leaks out into the result.Given a
M
of the form(1 << k) - 1
, both(a + b) & M == (a & M) + (b & M) & M
and(a ^ b) & MA == (a & M) ^ (b & M)
, which means that the bitwise AND withTURN_OFF_INT_SIGN_BIT
could be moved inside the expression, and then it becomes clear that this is really a 31-bit PRNG.counter
andseed
andMASK
andLARGE_PRIME
could beint
, and the PRNG would work exactly the same. By the wayLARGE_PRIME & TURN_OFF_INT_SIGN_BIT
(and this is the quantity that matters, notLARGE_PRIME
itself) is no longer a prime, but I'm also not convinced that it needs to be a prime.Basically this PRNG comes down to doing this, this changes the values that
counter
goes through but it shouldn't change the actual results (I didn't verify that):public int nextInt() { return (((counter++ * 568406675) + seed) ^ 0x9103_cd55) & 0x7fff_ffff; }
The XOR doesn't fundamentally change anything (it flips some bits in a completely static way, if we wear "XOR goggles" with the same constant then its influence disappears entirely), so let's ignore it. The part that is supposed to be random, is the bottom 31 bits of
counter++ * 568406675
. Now admittedly, taking a simple counter and "mixing it up" is a legitimate strategy for implementing a PRNG (as long as there is enough mixing), and multiplication by a well-chosen constant is a good piece of the puzzle, and 568406675 is not even a bad choice (well, a fortunate accident, not really a choice). But just one step of mixing is not much. Subsequent results are highly correlated, and the mixing only goes from the least significant bit up, the upper bits are never mixed into the lower bits. That gives the lower bits very short periods (like an LCG, which this PRNG is also similar to, but with the multiplier set to 1 - which is not good).The overload
int nextInt(int bound)
uses the classic biased range reduction, which can be noticeable for large values ofbound
. This range reduction is also especially weak whenbound
is a low power of two, resulting in an output sequence with a very short period.
-
\$\begingroup\$ What is "period"? \$\endgroup\$coderodde– coderodde2022年10月30日 05:28:52 +00:00Commented Oct 30, 2022 at 5:28
-
\$\begingroup\$ @coderodde the number of steps until the sequence repeats itself \$\endgroup\$user555045– user5550452022年10月30日 05:42:52 +00:00Commented Oct 30, 2022 at 5:42