I wrote a bunch of Ruby code for a book project I've just finished. One criticism is that it is fine code but not very "ruby like". I agree my style was simplified for communication reasons, and although it's procedural code, it still feels "ruby-like" to me.
For the representative example below, any ideas on making it more "Ruby like"?
# Genetic Algorithm in the Ruby Programming Language
# The Clever Algorithms Project: http://www.CleverAlgorithms.com
# (c) Copyright 2010 Jason Brownlee. Some Rights Reserved.
# This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 2.5 Australia License.
def onemax(bitstring)
sum = 0
bitstring.size.times {|i| sum+=1 if bitstring[i].chr=='1'}
return sum
end
def random_bitstring(num_bits)
return (0...num_bits).inject(""){|s,i| s<<((rand<0.5) ? "1" : "0")}
end
def binary_tournament(pop)
i, j = rand(pop.size), rand(pop.size)
j = rand(pop.size) while j==i
return (pop[i][:fitness] > pop[j][:fitness]) ? pop[i] : pop[j]
end
def point_mutation(bitstring, rate=1.0/bitstring.size)
child = ""
bitstring.size.times do |i|
bit = bitstring[i].chr
child << ((rand()<rate) ? ((bit=='1') ? "0" : "1") : bit)
end
return child
end
def crossover(parent1, parent2, rate)
return ""+parent1 if rand()>=rate
point = 1 + rand(parent1.size-2)
return parent1[0...point]+parent2[point...(parent1.size)]
end
def reproduce(selected, pop_size, p_cross, p_mutation)
children = []
selected.each_with_index do |p1, i|
p2 = (i.modulo(2)==0) ? selected[i+1] : selected[i-1]
p2 = selected[0] if i == selected.size-1
child = {}
child[:bitstring] = crossover(p1[:bitstring], p2[:bitstring], p_cross)
child[:bitstring] = point_mutation(child[:bitstring], p_mutation)
children << child
break if children.size >= pop_size
end
return children
end
def search(max_gens, num_bits, pop_size, p_crossover, p_mutation)
population = Array.new(pop_size) do |i|
{:bitstring=>random_bitstring(num_bits)}
end
population.each{|c| c[:fitness] = onemax(c[:bitstring])}
best = population.sort{|x,y| y[:fitness] <=> x[:fitness]}.first
max_gens.times do |gen|
selected = Array.new(pop_size){|i| binary_tournament(population)}
children = reproduce(selected, pop_size, p_crossover, p_mutation)
children.each{|c| c[:fitness] = onemax(c[:bitstring])}
children.sort!{|x,y| y[:fitness] <=> x[:fitness]}
best = children.first if children.first[:fitness] >= best[:fitness]
population = children
puts " > gen #{gen}, best: #{best[:fitness]}, #{best[:bitstring]}"
break if best[:fitness] == num_bits
end
return best
end
if __FILE__ == 0ドル
# problem configuration
num_bits = 64
# algorithm configuration
max_gens = 100
pop_size = 100
p_crossover = 0.98
p_mutation = 1.0/num_bits
# execute the algorithm
best = search(max_gens, num_bits, pop_size, p_crossover, p_mutation)
puts "done! Solution: f=#{best[:fitness]}, s=#{best[:bitstring]}"
end
4 Answers 4
First a note on your general design: You use hashmaps to represent members of the population. I'd rather recommend you create a little class to represent them:
class Individual
include Comparable
attr_reader :bitstring, :fitness
def initialize(bitstring)
@bit_string = bitstring
@fitness = onemax(bitstring)
end
def <=>(other)
fitness <=> other.fitness
end
end
You'll now use Individual.new(bitstring)
to create a new individual (which will automatically calculate its own fittness) and individual.bitstring
and individual.fitness
to get an individual's bitstring or fitness respectively.
This has the benefit that now the logic of how to calculate an individuals fitness lives in the Individual
class instead of in multiple place in the search
method. This is a more natural place for it and makes it easier to find and change should you ever decide to use a different fitness function.
Also note that I included Comparable, implementing <=>
based on fitness. This means that you can now compare individuals to each other such that an individual is greater than another if and only if its fitness is greater. This allows you to use >
on two individuals as well as methods like max
to get the individual with the greatest fitness out of an array of individuals. It also allows you to call sort
on array of individuals without passing a block. This will simplify your code in some places.
In a comment you mentioned that you don't want to use classes. However I still think that it would greatly improve your design to have the logic for creating new individuals and calculating their fitness in one single place, so you should at least factor it out into a method like this:
def create_individual(bitstring)
{:bitstring => bitstring, :fitness => onemax(bitstring)}
end
Now some notes on particular pieces of code:
bitstring.size.times {|i| sum+=1 if bitstring[i].chr=='1'}
As has already been mentioned, this can easily be replaced with a call to count
. However as a general note I want to point out that it's good practice to avoid index-based loops wherever possible, so if count
did not exist, I'd write the above like this:
bitstring.each_char {|c| sum+=1 if c == '1'}
Or for 1.8.6 compability:
bitstring.each_byte {|b| sum+=1 if b.chr == '1'}
(0...num_bits).inject(""){|s,i| s<<((rand<0.5) ? "1" : "0")}
When dealing with inject
, it's usually a good idea to keep the function simple, so it's more easily understandable. Often you can separate the logic of generating the elements from the logic of combining them, by using map
before inject
. This would make your code look like this:
(0...num_bits).map {|i| (rand<0.5) ? "1" : "0"}.inject("") {|s,i| s<<i}
Now there's two things to notice here:
- Whenever you call
map
on a range from 0 to some size, you might rather want to useArray.new(size) {block}
which creates an array of the given size by calling the blocksize
times. - The smaller
inject
is now much easier to understand, but happens to be equivalent to callingjoin
, so let's just do that.
So your code becomes:
Array.new(num_bits) { (rand<0.5) ? "1" : "0" }.join
This way you create a temporary array, which you didn't before, but on the other hand your code became much more understandable, so that little inefficiency should be worth it.
(pop[i][:fitness] > pop[j][:fitness]) ? pop[i] : pop[j]
Thanks to the Individual
class this now looks like this:
(pop[i] > pop[j]) ? pop[i] : pop[j]
This isn't much different, but note that it now has the form (x > y) ? x : y
, which incidentally means "the maximum of two objects". In other words we can simplify this code by using the max
method:
[ pop[i], pop[j] ].max
bitstring.size.times do |i|
bit = bitstring[i].chr
Again this should be bitstring.each_char do |bit|
or bitstring.each_byte do |b| bit = b.chr
.
""+parent1 if rand()>=rate
(削除) I'm not sure what the ""+
is supposed to do here. Remove it. Note that ""+x
is not a way to turn x
into a string in ruby (like it would be in e.g. Java). Calling ""+x
when x
is not a string (or "duck-string"), will cause a type error. (削除ここまで)
The fact that you're using ""+
to create a copy is less than obvious (thus the above paragraph). If you use the dup
method, whose sole purpose it is to create a copy, instead, it will be much more understandable.
I'd also change the crossover
method to return an individual rather than a string, which makes the code a bit simpler.
-
\$\begingroup\$ Some great suggestions, thank-you for taking the time. I especially love the random bitstring suggestion. Regarding the use of the object, I explicitly chose a procedural paradigm for presentation although agree having objects can make the organization of the code much better. I had to drop each_char (and a few other funcs) for 1.8.6 support. The ""+parent1 was to make a quick copy of the string object (overly cautious, i know). Thanks again! \$\endgroup\$jasonb– jasonb2011年01月29日 06:05:08 +00:00Commented Jan 29, 2011 at 6:05
-
\$\begingroup\$ @jasonb: If you can't use
each_char
, useeach_byte
at least (it will be the same except that you get an integer, which you need to callchr
on). That's still better than iterating over the indices. To create a copy of an object, you should usedup
, which makes it clear to the reader what you're doing. \$\endgroup\$sepp2k– sepp2k2011年01月29日 11:50:57 +00:00Commented Jan 29, 2011 at 11:50 -
\$\begingroup\$ @jasonb: Even for procedural code it's still bad design to have the fitness code in several places inside the
search
method. If you don't want to use classes, I'd at least recommend acreate_individual
method which takes a bitstring and returns a hash with the bitstring and the fitness. This way the fitness calculation still takes place in a single location. \$\endgroup\$sepp2k– sepp2k2011年01月29日 11:54:18 +00:00Commented Jan 29, 2011 at 11:54
Ruby library is quite powerful:
onemax(bitstring)
can be simply bitstring.count('1')
random_bitstring(num_bits)
is probably better off implemented as rand(2**num_bits).to_s(2)
The first thing to check out is this http://ruby-doc.org/core-2.0/Enumerable.html
-
\$\begingroup\$ Thanks for pointing out the count method - your onemax is excellent. I really do need to read the API more. The suggestion for random_bitstring, although elegant, I find difficult to parse. Granted mine is worse, but I feel like that a non-ruby person could figure out my version from reading a "quick start language guide", yours would require the API open in another tab. I am finding this case far less clear cut. Thanks again for the feedback! \$\endgroup\$jasonb– jasonb2011年01月29日 05:51:32 +00:00Commented Jan 29, 2011 at 5:51
An easy one is to remove the return
at the end of each method. Ruby automatically does a return of the last value. It's actually an extra method call to use return
.
(削除) The only time you specifically need to use return is (削除ここまで) You don't even need to use return if you are returning multiple values:
def f
[1,2]
end
a,b = f
Again, using return
costs an extra method call and those add up. If you can optimize your code by not using them you've just given yourself some free optimization.
-
\$\begingroup\$ Ok, thanks. In this case, I think it would only apply to the "onemax", "random_bitstring", and "point_mutation" functions. \$\endgroup\$jasonb– jasonb2011年01月28日 05:41:18 +00:00Commented Jan 28, 2011 at 5:41
-
1\$\begingroup\$ No remove it on EVERY function. For instance in the
reproduce
method change the last line fromreturn children
to justchildren
. The only time you need to use return in Ruby is if you are returning multiple values, e.g.def f() return 1,2 end a,b = f()
\$\endgroup\$Mike Bethany– Mike Bethany2011年01月28日 05:44:46 +00:00Commented Jan 28, 2011 at 5:44 -
\$\begingroup\$ better return multiple values like this:
def f() [1, 2] end
\$\endgroup\$glebm– glebm2011年01月28日 07:13:03 +00:00Commented Jan 28, 2011 at 7:13 -
\$\begingroup\$ @glebm Agreed, I stand corrected. Return isn't even needed with multiple return values. \$\endgroup\$Mike Bethany– Mike Bethany2011年01月28日 07:32:13 +00:00Commented Jan 28, 2011 at 7:32
I'm currently going through the book and yes, as admitted it's not the most Ruby-esque code I've ever seen, but I think it does achieve the main objectives of what it was set out to accomplish and I don't think the Ruby gets in the way of that.
The thing I've noticed most when trying to following along, run, and even augment some of the methods was the general flow. There's a few things I can't say "YOU SHOULD DO IT THIS WAY THUS SAYETH THE RUBY" but it would just "feel" more comfortable if:
The configurations were done at the beginning of the file in a Hash. As far as using a Hash, you see Hash configs a lot in Ruby, many times due to the use and reading of .yml config files so I guess I just expect configs to be done in a Hash. (If this were an application, I'd say put the configs in a .yml file, but I understand the desire to have the full solution encapsulated in a single file.)
The "search" method definition were at the beginning of the file, right under a Hash outlining the config. (The calling of the search method still happens at the bottom in your if FILE section.)
Ok so the at the beginning thing is definitely debatable, but I just found myself dropping in to each of these files, scrolling to the bottom, reading the config, scrolling up a little and reading the "search" method to get a high level view of what's going on, then scrolling to the top again to dig into the supporting methods.
Another debatable style I've enjoyed in both reading and writing Ruby is using a space to pad the beginning and end of an inline block.
So
children.sort!{|x,y| y[:fitness] <=> x[:fitness]}
becomes
children.sort!{ |x,y| y[:fitness] <=> x[:fitness] }
Yeah it's a small thing and some people will say anything that adds extra keystrokes to typing is bad, but I find it much more readable.
Also just now noticing some inconsistent spacing when using operators...and as you can probably guess, I vote for more spacing in most situations. ;)
def crossover(parent1, parent2, rate)
return ""+parent1 if rand()>=rate
point = 1 + rand(parent1.size-2)
return parent1[0...point]+parent2[point...(parent1.size)]
end
becomes
def crossover(parent1, parent2, rate)
return "" + parent1 if rand() >= rate
point = 1 + rand(parent1.size-2)
return parent1[0...point] + parent2[point...(parent1.size)]
end
And now I'm getting nit picky...so I'm done for now.
<
and>
characters in it. \$\endgroup\$