I have the following Java class:
import java.util.Random;
public class RandomNameGenerator {
private Random rand;
private static String[] prefixes = { "Ultimate", "Bloody", "Crooked",
"Hallowed", "Magnificent", "Heavy", "Jagged", "Grand", "Shiny",
"Rusty" };
private static String[] items = { "Chainsaw", "Towel", "Ping-Pong Ball",
"Longsword", "Scissors", "Dagger", "Blade", "Bow", "Axe", "Dagger",
"Spoon", "Fork", "Coat", "Chain Mail", "Plate Mail", "Cloak",
"Cape", "Mirror", "Cauldron", "Pouch", "Boots", "Shoes", "Greaves",
"Pants", "Robes", "Locket", "Ring", "Amulet", "Potion", "Fish",
"Teapot", "Hood", "Crown", "Cap", "Helmet" };
private static String[] addons = { "Glorious", "Bloody", "Prolonged",
"Bitter", "Wicked", "Furious" };
private static String[] postfixes1 = { "Destruction", "Feminism",
"Twilight", "Massacre", "Dread", "Terror", "Mutual Understanding",
"Spite", "Immobility", "Mediocrity", "Anger" };
private static String[] postfixes2 = { "the Occult", "the Captain",
"the Warrior", "the Hunter", "the Haunted", "the Dead",
"the Fallen", "the Hitchhiker", "the Wicked King", "the Grue" };
public RandomNameGenerator() {
rand = new Random(System.currentTimeMillis());
}
public String getRandomName() {
StringBuilder str = new StringBuilder();
if (rand.nextInt(100) < 10) {
str.append("+");
str.append(Integer.toString(rand.nextInt(10) + 1));
str.append(" ");
}
if (rand.nextInt(100) < 50) {
str.append(prefixes[rand.nextInt(prefixes.length)]);
str.append(" ");
}
str.append(items[rand.nextInt(items.length)]);
str.append(" of ");
if (rand.nextInt(postfixes1.length + postfixes2.length) > postfixes2.length) {
if (rand.nextInt(100) < 70) {
str.append(addons[rand.nextInt(addons.length)]);
str.append(" ");
}
str.append(postfixes1[rand.nextInt(postfixes1.length)]);
} else {
str.append(postfixes2[rand.nextInt(postfixes2.length)]);
}
return str.toString();
}
}
I use it in a small program to generate sample item names like
Shiny Dagger of the Haunted
Shoes of Immobility
Bow of Mediocrity
In general I get the results that I want, however I can't help but notice, that certain combinations pop up very frequently, for example
Ping-Pong Ball of the Occult
occurs very often.
Is there a design flaw that leads to this outcome? Is there a better way to format what I'm trying to achieve?
5 Answers 5
Your randomness for calculating the last postfixes is not regular.... there are very slight deviations
(Note update on the doubling of daggers at the end though... that's not 'slight').
Let's do a couple of things here. Firstly, rename them:
postfixes1
-> concepts
:
private static String[] concepts = { "Destruction", "Feminism",
"Twilight", "Massacre", "Dread", "Terror", "Mutual Understanding",
"Spite", "Immobility", "Mediocrity", "Anger" };
postfixes2
-> people
:
private static String[] people = {"the Occult", "the Captain",
"the Warrior", "the Hunter", "the Haunted", "the Dead",
"the Fallen", "the Hitchhiker", "the Wicked King", "the Grue"};
Then, you can make the logic clearer near the end. Your code is (with the renames):
if (rand.nextInt(concepts.length + people.length) > people.length) { if (rand.nextInt(100) < 70) { str.append(addons[rand.nextInt(addons.length)]); str.append(" "); } str.append(concepts[rand.nextInt(concepts.length)]); } else { str.append(people[rand.nextInt(people.length)]); }
Now, what's the problem there?
Well, there are 2 ...
people should not have 'the' prepended to all of them. It is repeating yourself. Instead you should have:
str.append("the ").append(people[rand.nextInt(people.length)]);
there is an off-by-one error in
if (rand.nextInt(concepts.length + people.length) > people.length) {
. This is hard to explain... but....nextInt(limit)
selects a random number up to, but not including the limit.in this case, the limit is
concepts.length + people.length
, and, if you put those two arrays together, there are 21 members (11 concepts, 10 people). So, you select a random number up to, but not including 21... then, if it is greater thanpeople.length
(10), you choose to add a concept. So, the chances are 10 in 21 that it will be> 10 (values 11, 12, 13, ... 20), and then you chose to do the side that has 11 entries. You are using the proportions of people to decide whether to do a concept.....Your code should be:
if (rand.nextInt(concepts.length + people.length) >= people.length) { // add a concept
or, more logically:
if (rand.nextInt(concepts.length + people.length) < concepts.length) { // add a concept
So, you do have an imbalance. You are selecting people about 10% more often than you should....
Update: Jeroen and I were discussing some odd values coming out with dagger, and I hacked out some of your code, and ran it through with this loop:
public static void main(String[] args) {
RandomNameGenerator rng = new RandomNameGenerator();
Map<String, AtomicInteger> counts = new TreeMap<>();
for (int i = 0; i < 100000; i++) {
String val = rng.getRandomName();
if (!counts.containsKey(val)) {
counts.put(val, new AtomicInteger());
}
counts.get(val).incrementAndGet();
}
counts.forEach((k,v) -> System.out.printf("%6d %s%n", v.get(), k));
}
Note, what I am doing is just counting the occurrences of a String.
I then stripped out some of the 'qualifiers' from your code, and got results like:
110 Amulet of Anger
126 Amulet of Destruction
114 Amulet of Dread
......
127 Chain Mail of Feminism
125 Chain Mail of Immobility
118 Chain Mail of Massacre
134 Chain Mail of Mediocrity
......
140 Crown of the Hitchhiker
154 Crown of the Hunter
168 Crown of the Occult
155 Crown of the Warrior
141 Crown of the Wicked King
243 Dagger of Anger
233 Dagger of Destruction
271 Dagger of Dread
245 Dagger of Feminism
245 Dagger of Immobility
240 Dagger of Massacre
254 Dagger of Mediocrity
...
318 Dagger of the Hunter
299 Dagger of the Occult
328 Dagger of the Warrior
295 Dagger of the Wicked King
131 Fish of Anger
130 Fish of Destruction
101 Fish of Dread
129 Fish of Feminism
Note how Dagger appears about twice as often as other values?
That's because you have "Dagger" in the input data twice:
private static String[] items = { .....,
"Longsword", "Scissors", "Dagger", "Blade", "Bow", "Axe", "Dagger",
//
// Dagger twice ^^^^^^^^ ^^^^^^^^
.....
"Teapot", "Hood", "Crown", "Cap", "Helmet" };
-
6\$\begingroup\$ Note update on Daggers..... \$\endgroup\$rolfl– rolfl2014年10月01日 15:58:45 +00:00Commented Oct 1, 2014 at 15:58
-
\$\begingroup\$ That's some good catches, thank you very much for your input. I've incorporated the changes you've suggested. \$\endgroup\$Etheryte– Etheryte2014年10月01日 20:22:39 +00:00Commented Oct 1, 2014 at 20:22
-
\$\begingroup\$ I think that having "the " on all the people is a good idea, in case he ever wants to add "Morgan Freeman". \$\endgroup\$tobyink– tobyink2014年10月03日 08:51:27 +00:00Commented Oct 3, 2014 at 8:51
-
1\$\begingroup\$ @tobyink better to rename the categories and put Morgan Freeman in the first. As is, the effective difference between the two seems to be that the second group requires a definite article, and the first does not. If additional options are added that require different articles, further dividing the categories:a Jagged Fish of some Pretty Pretty Princess, or an Ultimate Spoon of some Long Forgotten Country \$\endgroup\$Mr.Mindor– Mr.Mindor2014年10月03日 19:02:41 +00:00Commented Oct 3, 2014 at 19:02
prefixes[rand.nextInt(prefixes.length)]
You have this pattern many many times in your code, and this is a small code duplication in this line as prefixes
is repeated twice. If you ever would accidentally change one of them but not the other, you'd end up with a major ArrayIndexOutOfBoundsException
.
I think this is enough to warrant a method extraction. (This is a slightly modified variant of a real method I wrote years ago. I find it immensely useful)
public static <E> E randomElement(E[] list, Random random) {
if (list == null || list.length == 0) {
return null;
}
return list[random.nextInt(list.length)];
}
It's up to you how you want to treat the handling of null elements there, although as you always use the same random
variable your method can be:
public <E> E randomElement(E[] list) {
return list[rand.nextInt(list.length)];
}
Now you can use it like this:
str.append(randomElement(prefixes));
str.append(randomElement(postfixes1));
str.append(randomElement(people)); // with the name change suggestion mentioned by @rolfl
if (rand.nextInt(100) < 70) {
Randomizing a boolean that has 70% chance of being true is more commonly written as
if (rand.nextDouble() < 0.7) {
This makes it easier to see the direct connection to the exact percentage, and it also reduces your dependency on the "magic number" 100
-
4\$\begingroup\$ I would NOT test for
list == null || list.length == 0
. This makes no sense, as any use here gets a non-null non-empty list. If the program changes and this no longer hold, would you prefer to getUltimate null of null
or immediately get an exception and fix the program? Fail fast, tolerate no nulls. \$\endgroup\$maaartinus– maaartinus2014年10月01日 15:58:04 +00:00Commented Oct 1, 2014 at 15:58 -
3\$\begingroup\$ @maaartinus
Ultimate null of null
seems to be perfectly in line with the rest of the program though :) \$\endgroup\$Simon Forsberg– Simon Forsberg2014年10月01日 16:00:46 +00:00Commented Oct 1, 2014 at 16:00 -
\$\begingroup\$ Thank you for your input, I've incorporated your suggestions, especially
public static <E> E randomElement(E[] list, Random random)
is a very neat solution. \$\endgroup\$Etheryte– Etheryte2014年10月01日 20:19:28 +00:00Commented Oct 1, 2014 at 20:19
It's likely not to be a mistake. More likely it's confirmation bias. I was wrong! @rolfl has spotted "Dagger" being in the list twice.
I'd suggest letting your code generate ~100k items, then counting the amount you get of each. Chances are that it's pretty well distributed.
You could split some things up. Right now you have one big long function and I don't like it. It's too large.
public String getRandomName() {
StringBuilder str = new StringBuilder();
if (rand.nextInt(100) < 10) {
str.append("+");
str.append(Integer.toString(rand.nextInt(10) + 1));
str.append(" ");
}
if (rand.nextInt(100) < 50) {
str.append(prefixes[rand.nextInt(prefixes.length)]);
str.append(" ");
}
str.append(items[rand.nextInt(items.length)]);
str.append(" of ");
if (rand.nextInt(postfixes1.length + postfixes2.length) > postfixes2.length) {
if (rand.nextInt(100) < 70) {
str.append(addons[rand.nextInt(addons.length)]);
str.append(" ");
}
str.append(postfixes1[rand.nextInt(postfixes1.length)]);
} else {
str.append(postfixes2[rand.nextInt(postfixes2.length)]);
}
return str.toString();
}
On a higher level, it looks like this:
public String getRandomName() {
StringBuilder str = new StringBuilder();
str.append(getRandomBonusLevel());
str.append(getRandomPrefix());
str.append(getRandomItemName());
str.append(getRandomPostfix());
return str.toString();
}
With subfunctions like
private String getRandomItemName(){
return items[rand.nextInt(items.length)] + " of ";
}
On a lower level, you have a couple hardcoded things, such as these magic numbers:
if (rand.nextInt(100) < 10) {
if (rand.nextInt(100) < 50) {
Why not store these as constants?
private static final int BONUS_LEVEL_PERCENTAGE_CHANCE = 10;
private static final int PREFIX_PERCENTAGE_CHANCE = 50;
-
\$\begingroup\$ that isn't really a percentage of chance though, the chance that the number returned is under 10 and/or under 50 is not 10% or 50%. you actually will have a higher percentage of numbers in the middle of the range than at the extremes. and I wish I could reference something here, but I don't remember why/how I know that. \$\endgroup\$Malachi– Malachi2014年10月01日 15:14:44 +00:00Commented Oct 1, 2014 at 15:14
-
3\$\begingroup\$ @Malachi - you are wrong with that statement. \$\endgroup\$rolfl– rolfl2014年10月01日 15:15:25 +00:00Commented Oct 1, 2014 at 15:15
-
\$\begingroup\$ I don't have access to running Java Code, but I could create a random number generator with C# and see what happens.... \$\endgroup\$Malachi– Malachi2014年10月01日 15:17:42 +00:00Commented Oct 1, 2014 at 15:17
-
\$\begingroup\$ @Malachi and you'd be testing the effect of using C#'s RNG with java calls. Java's
Random.nextInt(int n)
, returns0-(n-1)
, pseudorandomly distributed. \$\endgroup\$Pimgd– Pimgd2014年10月01日 15:19:09 +00:00Commented Oct 1, 2014 at 15:19 -
1\$\begingroup\$ The distribution of elements is very uneven here. I have tested it with 500.000 and 5.000.000 iterations of generating a name and both iterations give me 10 'Dagger of the' items in the first 10 spots with the 10th having at least twice the amount of occurrences as whatever comes on the 11th place. So there is a definite favoritism for 'Dagger of the'. Furthermore, when mapping the distribution with ranges, you get something like this: gist.github.com/Vannevelj/3b57a23c2cd77523b641 \$\endgroup\$Jeroen Vannevel– Jeroen Vannevel2014年10月01日 15:40:38 +00:00Commented Oct 1, 2014 at 15:40
rand = new Random(System.currentTimeMillis());
This is no good. Simple using new Random()
is much better as it assures a different seed even when multiple instances per millisecond get created.
To elaborate a bit on this answer
While numbers returned from a single Random object are random with respect to each other, There is no guarantee that sequences returned from different seeds will not be correlated in some way. Perhaps this is the problem if you are generating just a few numbers before restarting the program.
Yes and no. While java.util.Random
is far from perfect, it has 48 bits of state and chances you'll observe correlation from a few samples are IMHO negligible. Try SecureRandom
to see if it's a flaw in you program. SecureRandom
is guaranteed to be indistinguishable from a perfect randomness source.
-
2\$\begingroup\$ I would argue that this doesn't apply here. I think it's rather clear that the class is written in a manner in which multiple instances of it are never needed. \$\endgroup\$Etheryte– Etheryte2014年10月01日 16:28:26 +00:00Commented Oct 1, 2014 at 16:28
-
\$\begingroup\$ @Nit True, but we haven't seen your calling code so we can't guarantee that. If you use the no-arg constructor of
Random
, you are instead usingSystem.nanoTime()
which has a better precision, then it is a higher guarantee that the random seed is different. Although the best is of course to always re-use the class. \$\endgroup\$Simon Forsberg– Simon Forsberg2014年10月01日 16:48:39 +00:00Commented Oct 1, 2014 at 16:48 -
1\$\begingroup\$ @Nit The class is not written as a singleton, so we can't say anything about the number of its instances. Sure, there will probably be just one, so it's no big deal, but the shorter and better constructor is unconditionally better. \$\endgroup\$maaartinus– maaartinus2014年10月01日 17:32:49 +00:00Commented Oct 1, 2014 at 17:32
-
1\$\begingroup\$ Thanks for your input. I've switched to the shorter constructor as well as made the class a singleton. \$\endgroup\$Etheryte– Etheryte2014年10月01日 20:20:12 +00:00Commented Oct 1, 2014 at 20:20
-
\$\begingroup\$ @Nit Note that I'm no fan of singletons... apart from this one. But having a lightly used singleton is not bad as you can easily replace its singleton-ness by DI. \$\endgroup\$maaartinus– maaartinus2014年10月01日 20:52:37 +00:00Commented Oct 1, 2014 at 20:52
While numbers returned from a single Random object are random with respect to each other, There is no guarantee that sequences returned from different seeds will not be correlated in some way. Perhaps this is the problem if you are generating just a few numbers before restarting the program.
The following code demonstrates this effect.
import java.util.Random;
class Rander
{
public static void main(String[] args)
{
for (long i = 0; i < 10; ++i)
{
Random r = new Random(i);
System.out.println(r.nextInt());
}
}
}
Here I seed Random with 10 different seeds. The output is as follows.
-1155484576
-1155869325
-1154715079
-1155099828
-1157023572
-1157408321
-1156254074
-1156638823
-1158562568
-1158947317
These represent the first number in the random sequences that will be produced. As you can see they are not random with respect to each other at all!
Seeding the random generator does make your runs independent!
Let's try with SecureRandom.
import java.security.SecureRandom;
class SecureRander
{
public static void main(String[] args)
{
for (long i = 0; i < 10; ++i)
{
SecureRandom r = new SecureRandom();
System.out.println(r.nextInt());
}
}
}
Now you can see the sequences will not be correlated.
-2023551863
675514394
-124350109
-793410496
-1967376167
1440653928
629893770
-27120645
1006510325
1812049818
In summary, while numbers returned from a single Random object will be random enough for your purposes, different runs will not be random with respect to each other even if you use seeding. Seeding will just stop them being exactly the same.
fur coat
orbronze teapot
orbiker jacket
instead ofjacket
. you do a lot of stuff with the length of your strings, could it be that the weight for that string is more because of it's length being longer than the others? \$\endgroup\$