I'm working on developing a small Ruby program to generate random data for a Neo4J database, including people, addresses, phone numbers, etc.
I'm a total beginner to Ruby, so I wanted to post my progress so far here to get it reviewed. I've just completed the "people" generating functionality.
EntityFaker.rb
=begin
EntityFaker.rb
=end
require_relative "EntityFactory"
class Main
public
def self.generate_entities
puts "Generating entities..."
EntityFactory.test_function
end
generate_entities
end
EntityFactory.rb
=begin
Entity-Factory
=end
require 'faker'
require 'pp'
require_relative 'Entities/Person'
class EntityFactory
@@person_array = []
public
def self.test_function()
generate_people(15)
end
private
def self.generate_people(number)
number.times do |n|
sex = Faker::Gender.binary_type
age = rand(18...65)
person = Person.new(
rand_bool ? Faker::Name.prefix : nil,
(sex == 'Male') ? Faker::Name.unique.male_first_name : Faker::Name.unique.female_first_name,
rand_bool ? Faker::Name.middle_name : nil,
Faker::Name.unique.last_name,
rand_bool ? Faker::Name.last_name : nil,
rand_bool ? Faker::Name.suffix : nil,
Time.now.to_i - age * 31556900, # dob in seconds since epoch
Person.random_height(sex),
Person.random_weight,
sex,
rand_bool ? sex : Faker::Gender.type,
Person.random_blood_type,
Faker::Color.color_name,
Faker::Color.color_name,
age,
Person.random_complexion,
Person.random_build,
Faker::Demographic.race
)
@@person_array.push(person)
end
pp @@person_array
end
private
def self.rand_bool
[true, false].sample
end
end
Person.rb
=begin
Person.rb
=end
class Person
# http://chartsbin.com/view/38919
@@min_male_height = 166
@@max_male_height = 184
# http://chartsbin.com/view/38920
@@min_female_height = 147
@@max_female_height = 170
# For males and females combined, no data could be found seperating the two.
# https://en.wikipedia.org/wiki/Human_body_weight#Average_weight_around_the_world
@@min_weight = 49.591
@@max_weight = 87.398
# https://github.com/rubocop-hq/ruby-style-guide/issues/289
def initialize(
prefix,
given_name,
middle_name,
family_name,
maiden_name,
suffix,
dob,
height,
weight,
sex,
gender,
blood_type,
eye_colour,
hair_colour,
age,
complexion,
build,
race
)
@prefix = prefix
@given_name = given_name
@middle_name = middle_name
@family_name = family_name
@maiden_name = maiden_name
@suffix = suffix
create_full_name(prefix, given_name, middle_name, family_name, suffix)
@dob = dob
@height = height
@weight = weight
@sex = sex
@gender = gender
@blood_type = blood_type
@eye_colour = eye_colour
@hair_colour = hair_colour
@age = age
@complexion = complexion
@build = build
@race = race
end
private
def create_full_name(prefix, given_name, middle_name, family_name, suffix)
@legal_name = @full_name = [prefix, given_name, middle_name, family_name, suffix].compact.join(" ")
end
public
def self.random_weight
range(@@min_weight, @@max_weight)
end
public
def self.random_height(sex)
(sex == "Male") ? range(@@min_male_height, @@max_male_height) : range(@@min_female_height, @@max_female_height)
end
public
def self.random_complexion
# https://www.quora.com/What-are-all-the-different-types-of-skin-tones-or-complexions
["Type I", "Type II", "Type III", "Type IV", "Type V", "Type VI"].sample
end
public
def self.random_blood_type
# https://www.livescience.com/36559-common-blood-type-donation.html
["O-positive", "O-negative", "A-positive", "A-negative", "B-positive", "B-negative", "AB-positive", "AB-negative"].sample
end
public
def self.random_build
# https://www.cityofsacramento.org/-/media/Corporate/Files/Police/Resources/Suspect-Description-Form-SPD.pdf?la=en
["Slender", "Medium", "Heavy", "Fat", "Muscular"].sample
end
private
def self.range(min, max)
(rand * (max - min) + min).round(1)
end
end
-
1\$\begingroup\$ Should the data be realistic? For example, height, weight, and build should be correlated. Also, weight and height in a population would each follow a normal distribution. \$\endgroup\$200_success– 200_success2018年07月17日 04:24:09 +00:00Commented Jul 17, 2018 at 4:24
-
\$\begingroup\$ @200_success. There’s no requirement for the data to be totally realistic, but I’d like it to be as realistic as possible without wasting too much dev time on it, so I’d love to hear suggestions on how to improve the realism if you have any. \$\endgroup\$PythonNewb– PythonNewb2018年07月17日 07:40:21 +00:00Commented Jul 17, 2018 at 7:40
-
\$\begingroup\$ Please do not update the code in your question to incorporate feedback from answers, doing so goes against the Question + Answer style of Code Review. This is not a forum where you should keep the most updated version in your question. Please see what you may and may not do after receiving answers . \$\endgroup\$Vogel612– Vogel6122018年07月20日 06:52:43 +00:00Commented Jul 20, 2018 at 6:52
-
\$\begingroup\$ @Vogel612 Sorry, I didn't know. I'll accept the answer and move on. \$\endgroup\$PythonNewb– PythonNewb2018年07月20日 06:59:15 +00:00Commented Jul 20, 2018 at 6:59
1 Answer 1
I'm going to be reviewing your code largely for ruby style and best practices, rather then concrete improvements because it appears that you're unfamiliar with the way that ruby is typically written and are trying to shoehorn ideas from other languages (specifically, it looks like Java) into ruby syntax. This isn't a bad thing, it just shows that you (and the rest of us) are all still learning.
Let's start by going through for syntax and simpler code style changes, then we'll dive into some structural changes.
First, let's look at your EntityFaker.rb
file:
The first thing I noticed is the Main
class which you appear to be bringing over from Java. We don't need a main class, we can just put that in the outermost scope, and we don't need to wrap it in a method. Also, in ruby we typically avoid block comments with =begin
and =end
, and prefer #
even for multiline comments. Thus, our EntityFaker.rb
file ends up looking like this:
# EntityFaker.rb
require_relative "EntityFactory"
puts "Generating entities..."
EntityFactory.test_function
Now, let's move on to some structural changes in the other two files. First, you're misusing the idea of class vs instance methods and misusing the idea of public/private methods in ruby. The purpose of a class is so that it can be instantiated. By declaring class methods with the def self.name
syntax, you're effectively making it a namespace for holding methods rather than a class. In addition, when you declare class methods (def self.name
), they're all public, and so using the public or private keywords have no effect. Those keywords only apply to instance methods (def name
).
You're also using a class as a Factory of sorts. This isn't what a class is supposed to be. A class is supposed to be a template for producing instances of the class. You could build a class which generated classes, but that's not what you're doing here. It seems like you want to create a class and append instances of that class to an array.
You use your factory to generate random values for the attributes of the class, but I think it'd be much more elegant to allow the user to optionally pass in values, but if no values are passed, to pick randomly. This will let us merge your two files into one nice neat class like so:
class Person
# I've changed all these class variables because 1) class variables
# are considered bad practice (they have some weird behaviors) and 2)
# because these numbers are... well... constants.
# http://chartsbin.com/view/38919
MIN_MALE_HEIGHT = 166
MAX_MALE_HEIGHT = 184
# http://chartsbin.com/view/38920
MIN_FEMALE_HEIGHT = 147
MAX_FEMALE_HEIGHT = 170
# For males and females combined, no data could be found seperating the two.
# https://en.wikipedia.org/wiki/Human_body_weight#Average_weight_around_the_world
MIN_WEIGHT = 49.591
MAX_WEIGHT = 87.398
# The general gist of what I've done here is I've take the arguments as
# a hash instead of as a long list, which could become unclear. Taking
# arguments as a hash lets you create a person like so:
# Person.new({sex: "Male", age: 64}) which is the same thing as:
# Person.new(sex: "Male", age: 64) which is the same thing as:
# Person.new sex: "Male", age: 64 which is much clearer and neater.
# Also, I've changed it so that any options not passed in the opts hash
# will be randomly generated.
#
# You may be wondering about my usage of "||" (the OR operator). What
# I'm doing is checking if opts[:whatever] exists. If it does not exist
# it will return nil, which evaluates to false, and so will move on to
# to the other side of the operator which will do the random generation.
# This is a common idiom in ruby.
def initialize(**opts)
# Note that I've moved @sex to the top, because it's used in @given_name
# I've also moved @age, because it's used in a couple other places
@sex = opts[:sex] || Faker::Gender.binary_type
@age = opts[:age] || rand(18...65)
# I've taken the default values of @sex and @age from EntityFactory
@prefix = opts[:prefix] || rand_bool ? Faker::Name.prefix : nil
@given_name = opts[:given_name] || (@sex == 'Male') ? Faker::Name.unique.male_first_name : Faker::Name.unique.female_first_name
@middle_name = opts[:middle_name] || rand_bool ? Faker::Name.middle_name : nil
@family_name = opts[:family_name] || Faker::Name.unique.last_name
@maiden_name = opts[:maiden_name] || Faker::Name.unique.last_name
@suffix = opts[:suffix] || rand_bool ? Faker::Name.suffix : nil
create_full_name(prefix, given_name, middle_name, family_name, suffix)
@dob = opts[:dob] || Time.now.to_i - age * 31556900 # dob in seconds since epoch
@height = opts[:height] || random_height(sex)
@weight = opts[:weight] || random_weight
@gender = opts[:gender] || rand_bool ? @sex : Faker::Gender.type
@blood_type = opts[:blood_type] || random_blood_type
@eye_colour = opts[:eye_colour] || Faker::Color.color_name
@hair_colour = opts[:hair_colour] || Faker::Color.color_name
@complexion = opts[:complexion] || random_complexion
@build = opts[:build] || random_build
@race = opts[:race] || Faker::Demographic.race
end
private # Note that this private keyword applies to all instance methods below it
def rand_bool
[true, false].sample
end
def create_full_name(prefix, given_name, middle_name, family_name, suffix)
@legal_name = @full_name = [prefix, given_name, middle_name, family_name, suffix].compact.join(" ")
end
def random_weight
range(MIN_WEIGHT, MAX_WEIGHT)
end
# This could use some restructuring, which I'll go into later
def random_height(sex)
(sex == "Male") ? range(MIN_MALE_HEIGHT, MAX_MALE_HEIGHT) : range(MIN_FEMALE_HEIGHT, MAX_FEMALE_HEIGHT)
end
def random_complexion
# https://www.quora.com/What-are-all-the-different-types-of-skin-tones-or-complexions
["Type I", "Type II", "Type III", "Type IV", "Type V", "Type VI"].sample
end
def random_blood_type
# https://www.livescience.com/36559-common-blood-type-donation.html
["O-positive", "O-negative", "A-positive", "A-negative", "B-positive", "B-negative", "AB-positive", "AB-negative"].sample
end
def random_build
# https://www.cityofsacramento.org/-/media/Corporate/Files/Police/Resources/Suspect-Description-Form-SPD.pdf?la=en
["Slender", "Medium", "Heavy", "Fat", "Muscular"].sample
end
def range(min, max)
(rand * (max - min) + min).round(1)
end
end
Finally, to generate 15 people like you've set it up to do, run this:
people = (0..15).map do
Person.new
end
This uses rubys range to make an array of numbers from 0 to 15, then iterates of the array and changes each number to a new Person.
I may come back to this answer and edit in some more ideas, but for now I've gotta run. Enjoy!
-
\$\begingroup\$ Thank you for such a detailed answer! This is exactly what I was looking for since, like you said, I have no idea how Ruby should be written. My script has progressed a little between the time I posted the question and the time that you answered, so I've updated my original post and I've implemented your improvements into my code. \$\endgroup\$PythonNewb– PythonNewb2018年07月20日 06:31:47 +00:00Commented Jul 20, 2018 at 6:31
-
\$\begingroup\$ My apologies, I didn't know it was against the rules of the site to update my original question. I've implemented your improvements into my code, I will accept your answer and leave it there. \$\endgroup\$PythonNewb– PythonNewb2018年07月20日 07:00:37 +00:00Commented Jul 20, 2018 at 7:00