Genetic algorithm for Clever Algorithms project

Question 1

I wrote a bunch of Ruby code for a book project I've just finished. One criticism is that it is fine code but not very "ruby like". I agree my style was simplified for communication reasons, and although it's procedural code, it still feels "ruby-like" to me.

For the representative example below, any ideas on making it more "Ruby like"?

# Genetic Algorithm in the Ruby Programming Language
# The Clever Algorithms Project: http://www.CleverAlgorithms.com
# (c) Copyright 2010 Jason Brownlee. Some Rights Reserved. 
# This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 2.5 Australia License.
def onemax(bitstring)
 sum = 0
 bitstring.size.times {|i| sum+=1 if bitstring[i].chr=='1'}
 return sum
end
def random_bitstring(num_bits)
 return (0...num_bits).inject(""){|s,i| s<<((rand<0.5) ? "1" : "0")}
end
def binary_tournament(pop)
 i, j = rand(pop.size), rand(pop.size)
 j = rand(pop.size) while j==i
 return (pop[i][:fitness] > pop[j][:fitness]) ? pop[i] : pop[j]
end
def point_mutation(bitstring, rate=1.0/bitstring.size)
 child = ""
 bitstring.size.times do |i|
 bit = bitstring[i].chr
 child << ((rand()<rate) ? ((bit=='1') ? "0" : "1") : bit)
 end
 return child
end
def crossover(parent1, parent2, rate)
 return ""+parent1 if rand()>=rate
 point = 1 + rand(parent1.size-2)
 return parent1[0...point]+parent2[point...(parent1.size)]
end
def reproduce(selected, pop_size, p_cross, p_mutation)
 children = [] 
 selected.each_with_index do |p1, i|
 p2 = (i.modulo(2)==0) ? selected[i+1] : selected[i-1]
 p2 = selected[0] if i == selected.size-1
 child = {}
 child[:bitstring] = crossover(p1[:bitstring], p2[:bitstring], p_cross)
 child[:bitstring] = point_mutation(child[:bitstring], p_mutation)
 children << child
 break if children.size >= pop_size
 end
 return children
end
def search(max_gens, num_bits, pop_size, p_crossover, p_mutation)
 population = Array.new(pop_size) do |i|
 {:bitstring=>random_bitstring(num_bits)}
 end
 population.each{|c| c[:fitness] = onemax(c[:bitstring])}
 best = population.sort{|x,y| y[:fitness] <=> x[:fitness]}.first 
 max_gens.times do |gen|
 selected = Array.new(pop_size){|i| binary_tournament(population)}
 children = reproduce(selected, pop_size, p_crossover, p_mutation) 
 children.each{|c| c[:fitness] = onemax(c[:bitstring])}
 children.sort!{|x,y| y[:fitness] <=> x[:fitness]}
 best = children.first if children.first[:fitness] >= best[:fitness]
 population = children
 puts " > gen #{gen}, best: #{best[:fitness]}, #{best[:bitstring]}"
 break if best[:fitness] == num_bits
 end
 return best
end
if __FILE__ == 0ドル
 # problem configuration
 num_bits = 64
 # algorithm configuration
 max_gens = 100
 pop_size = 100
 p_crossover = 0.98
 p_mutation = 1.0/num_bits
 # execute the algorithm
 best = search(max_gens, num_bits, pop_size, p_crossover, p_mutation)
 puts "done! Solution: f=#{best[:fitness]}, s=#{best[:bitstring]}"
end

Question 2

Looked at your book, algorithm-wise - good explanations. but you should've asked a rubyist to take a look at the code before it got published :(

Question 3

I'm just agreeing all over the place with you glebm. It's a bit insulting really. Like if I wrote a book on Middle East politics and US foreign policy but didn't bother to consult with experts on the matter. Oh well, at least the book is free in PDF form.

Question 4

It is time for the second edition.

Question 5

Please format your code using the code button in the format bar, not pre-tags. The way it is now, some of the code won't display because of < and > characters in it.

Question 6

Agreed, I should have got more ruby people to look at the book before I published it. I simply ran out of time, and in your eyes, it makes the work less professional. To be fair, it is not a ruby book, it is an AI book. From this perspective, the use of rubyisms is important but should be metered by the requirements of good communication (ability for non-ruby people to parse and port to their preferred language). The book is on github, I'd love to see a ruby-purist version of the sample code!!!! In fact, I'd really love to link to a "jason" and "purist" versions of each algorithm on the web.

Question 7

First a note on your general design: You use hashmaps to represent members of the population. I'd rather recommend you create a little class to represent them:

class Individual
 include Comparable
 attr_reader :bitstring, :fitness
 def initialize(bitstring)
 @bit_string = bitstring
 @fitness = onemax(bitstring)
 end
 def <=>(other)
 fitness <=> other.fitness
 end
end

You'll now use Individual.new(bitstring) to create a new individual (which will automatically calculate its own fittness) and individual.bitstring and individual.fitness to get an individual's bitstring or fitness respectively.

This has the benefit that now the logic of how to calculate an individuals fitness lives in the Individual class instead of in multiple place in the search method. This is a more natural place for it and makes it easier to find and change should you ever decide to use a different fitness function.

Also note that I included Comparable, implementing <=> based on fitness. This means that you can now compare individuals to each other such that an individual is greater than another if and only if its fitness is greater. This allows you to use > on two individuals as well as methods like max to get the individual with the greatest fitness out of an array of individuals. It also allows you to call sort on array of individuals without passing a block. This will simplify your code in some places.

In a comment you mentioned that you don't want to use classes. However I still think that it would greatly improve your design to have the logic for creating new individuals and calculating their fitness in one single place, so you should at least factor it out into a method like this:

def create_individual(bitstring)
 {:bitstring => bitstring, :fitness => onemax(bitstring)}
end

Now some notes on particular pieces of code:

bitstring.size.times {|i| sum+=1 if bitstring[i].chr=='1'}

As has already been mentioned, this can easily be replaced with a call to count. However as a general note I want to point out that it's good practice to avoid index-based loops wherever possible, so if count did not exist, I'd write the above like this:

bitstring.each_char {|c| sum+=1 if c == '1'}

Or for 1.8.6 compability:

bitstring.each_byte {|b| sum+=1 if b.chr == '1'}

(0...num_bits).inject(""){|s,i| s<<((rand<0.5) ? "1" : "0")}

When dealing with inject, it's usually a good idea to keep the function simple, so it's more easily understandable. Often you can separate the logic of generating the elements from the logic of combining them, by using map before inject. This would make your code look like this:

(0...num_bits).map {|i| (rand<0.5) ? "1" : "0"}.inject("") {|s,i| s<<i}

Now there's two things to notice here:

Whenever you call map on a range from 0 to some size, you might rather want to use Array.new(size) {block} which creates an array of the given size by calling the block size times.
The smaller inject is now much easier to understand, but happens to be equivalent to calling join, so let's just do that.

So your code becomes:

Array.new(num_bits) { (rand<0.5) ? "1" : "0" }.join

This way you create a temporary array, which you didn't before, but on the other hand your code became much more understandable, so that little inefficiency should be worth it.

(pop[i][:fitness] > pop[j][:fitness]) ? pop[i] : pop[j]

Thanks to the Individual class this now looks like this:

(pop[i] > pop[j]) ? pop[i] : pop[j]

This isn't much different, but note that it now has the form (x > y) ? x : y, which incidentally means "the maximum of two objects". In other words we can simplify this code by using the max method:

[ pop[i], pop[j] ].max

bitstring.size.times do |i|
 bit = bitstring[i].chr

Again this should be bitstring.each_char do |bit| or bitstring.each_byte do |b| bit = b.chr.

""+parent1 if rand()>=rate

(削除) I'm not sure what the ""+ is supposed to do here. Remove it. Note that ""+x is not a way to turn x into a string in ruby (like it would be in e.g. Java). Calling ""+x when x is not a string (or "duck-string"), will cause a type error. (削除ここまで)

The fact that you're using ""+ to create a copy is less than obvious (thus the above paragraph). If you use the dup method, whose sole purpose it is to create a copy, instead, it will be much more understandable.

I'd also change the crossover method to return an individual rather than a string, which makes the code a bit simpler.

Question 8

Some great suggestions, thank-you for taking the time. I especially love the random bitstring suggestion. Regarding the use of the object, I explicitly chose a procedural paradigm for presentation although agree having objects can make the organization of the code much better. I had to drop each_char (and a few other funcs) for 1.8.6 support. The ""+parent1 was to make a quick copy of the string object (overly cautious, i know). Thanks again!

Question 9

@jasonb: If you can't use each_char, use each_byte at least (it will be the same except that you get an integer, which you need to call chr on). That's still better than iterating over the indices. To create a copy of an object, you should use dup, which makes it clear to the reader what you're doing.

Question 10

@jasonb: Even for procedural code it's still bad design to have the fitness code in several places inside the search method. If you don't want to use classes, I'd at least recommend a create_individual method which takes a bitstring and returns a hash with the bitstring and the fitness. This way the fitness calculation still takes place in a single location.

Question 11

Ruby library is quite powerful:

onemax(bitstring) can be simply bitstring.count('1')

random_bitstring(num_bits) is probably better off implemented as rand(2**num_bits).to_s(2)

The first thing to check out is this http://ruby-doc.org/core-2.0/Enumerable.html

Question 12

Thanks for pointing out the count method - your onemax is excellent. I really do need to read the API more. The suggestion for random_bitstring, although elegant, I find difficult to parse. Granted mine is worse, but I feel like that a non-ruby person could figure out my version from reading a "quick start language guide", yours would require the API open in another tab. I am finding this case far less clear cut. Thanks again for the feedback!

Question 13

An easy one is to remove the return at the end of each method. Ruby automatically does a return of the last value. It's actually an extra method call to use return.

~~(削除) The only time you specifically need to use return is (削除ここまで)~~ You don't even need to use return if you are returning multiple values:

def f
 [1,2]
end
a,b = f

Again, using return costs an extra method call and those add up. If you can optimize your code by not using them you've just given yourself some free optimization.

Question 14

Ok, thanks. In this case, I think it would only apply to the "onemax", "random_bitstring", and "point_mutation" functions.

Question 15

No remove it on EVERY function. For instance in the reproduce method change the last line from return children to just children. The only time you need to use return in Ruby is if you are returning multiple values, e.g. def f() return 1,2 end a,b = f()

Question 16

better return multiple values like this: def f() [1, 2] end

Question 17

@glebm Agreed, I stand corrected. Return isn't even needed with multiple return values.

Question 18

I'm currently going through the book and yes, as admitted it's not the most Ruby-esque code I've ever seen, but I think it does achieve the main objectives of what it was set out to accomplish and I don't think the Ruby gets in the way of that.

The thing I've noticed most when trying to following along, run, and even augment some of the methods was the general flow. There's a few things I can't say "YOU SHOULD DO IT THIS WAY THUS SAYETH THE RUBY" but it would just "feel" more comfortable if:

The configurations were done at the beginning of the file in a Hash. As far as using a Hash, you see Hash configs a lot in Ruby, many times due to the use and reading of .yml config files so I guess I just expect configs to be done in a Hash. (If this were an application, I'd say put the configs in a .yml file, but I understand the desire to have the full solution encapsulated in a single file.)
The "search" method definition were at the beginning of the file, right under a Hash outlining the config. (The calling of the search method still happens at the bottom in your if FILE section.)

Ok so the at the beginning thing is definitely debatable, but I just found myself dropping in to each of these files, scrolling to the bottom, reading the config, scrolling up a little and reading the "search" method to get a high level view of what's going on, then scrolling to the top again to dig into the supporting methods.

Another debatable style I've enjoyed in both reading and writing Ruby is using a space to pad the beginning and end of an inline block.

So

children.sort!{|x,y| y[:fitness] <=> x[:fitness]}

becomes

children.sort!{ |x,y| y[:fitness] <=> x[:fitness] }

Yeah it's a small thing and some people will say anything that adds extra keystrokes to typing is bad, but I find it much more readable.

Also just now noticing some inconsistent spacing when using operators...and as you can probably guess, I vote for more spacing in most situations. ;)

def crossover(parent1, parent2, rate)
 return ""+parent1 if rand()>=rate
 point = 1 + rand(parent1.size-2)
 return parent1[0...point]+parent2[point...(parent1.size)]
end

becomes

def crossover(parent1, parent2, rate)
 return "" + parent1 if rand() >= rate
 point = 1 + rand(parent1.size-2)
 return parent1[0...point] + parent2[point...(parent1.size)]
end

And now I'm getting nit picky...so I'm done for now.

sepp2k sepp2k 9,0222 gold badges39 silver badges51 bronze badges · Accepted Answer · 2011-01-28 13:41:09Z

First a note on your general design: You use hashmaps to represent members of the population. I'd rather recommend you create a little class to represent them:

class Individual
 include Comparable
 attr_reader :bitstring, :fitness
 def initialize(bitstring)
 @bit_string = bitstring
 @fitness = onemax(bitstring)
 end
 def <=>(other)
 fitness <=> other.fitness
 end
end

You'll now use Individual.new(bitstring) to create a new individual (which will automatically calculate its own fittness) and individual.bitstring and individual.fitness to get an individual's bitstring or fitness respectively.

This has the benefit that now the logic of how to calculate an individuals fitness lives in the Individual class instead of in multiple place in the search method. This is a more natural place for it and makes it easier to find and change should you ever decide to use a different fitness function.

Also note that I included Comparable, implementing <=> based on fitness. This means that you can now compare individuals to each other such that an individual is greater than another if and only if its fitness is greater. This allows you to use > on two individuals as well as methods like max to get the individual with the greatest fitness out of an array of individuals. It also allows you to call sort on array of individuals without passing a block. This will simplify your code in some places.

In a comment you mentioned that you don't want to use classes. However I still think that it would greatly improve your design to have the logic for creating new individuals and calculating their fitness in one single place, so you should at least factor it out into a method like this:

def create_individual(bitstring)
 {:bitstring => bitstring, :fitness => onemax(bitstring)}
end

Now some notes on particular pieces of code:

bitstring.size.times {|i| sum+=1 if bitstring[i].chr=='1'}

As has already been mentioned, this can easily be replaced with a call to count. However as a general note I want to point out that it's good practice to avoid index-based loops wherever possible, so if count did not exist, I'd write the above like this:

bitstring.each_char {|c| sum+=1 if c == '1'}

Or for 1.8.6 compability:

bitstring.each_byte {|b| sum+=1 if b.chr == '1'}

(0...num_bits).inject(""){|s,i| s<<((rand<0.5) ? "1" : "0")}

When dealing with inject, it's usually a good idea to keep the function simple, so it's more easily understandable. Often you can separate the logic of generating the elements from the logic of combining them, by using map before inject. This would make your code look like this:

(0...num_bits).map {|i| (rand<0.5) ? "1" : "0"}.inject("") {|s,i| s<<i}

Now there's two things to notice here:

Whenever you call map on a range from 0 to some size, you might rather want to use Array.new(size) {block} which creates an array of the given size by calling the block size times.
The smaller inject is now much easier to understand, but happens to be equivalent to calling join, so let's just do that.

So your code becomes:

Array.new(num_bits) { (rand<0.5) ? "1" : "0" }.join

This way you create a temporary array, which you didn't before, but on the other hand your code became much more understandable, so that little inefficiency should be worth it.

(pop[i][:fitness] > pop[j][:fitness]) ? pop[i] : pop[j]

Thanks to the Individual class this now looks like this:

(pop[i] > pop[j]) ? pop[i] : pop[j]

This isn't much different, but note that it now has the form (x > y) ? x : y, which incidentally means "the maximum of two objects". In other words we can simplify this code by using the max method:

[ pop[i], pop[j] ].max

bitstring.size.times do |i|
 bit = bitstring[i].chr

Again this should be bitstring.each_char do |bit| or bitstring.each_byte do |b| bit = b.chr.

""+parent1 if rand()>=rate

(削除) I'm not sure what the ""+ is supposed to do here. Remove it. Note that ""+x is not a way to turn x into a string in ruby (like it would be in e.g. Java). Calling ""+x when x is not a string (or "duck-string"), will cause a type error. (削除ここまで)

The fact that you're using ""+ to create a copy is less than obvious (thus the above paragraph). If you use the dup method, whose sole purpose it is to create a copy, instead, it will be much more understandable.

I'd also change the crossover method to return an individual rather than a string, which makes the code a bit simpler.

Some great suggestions, thank-you for taking the time. I especially love the random bitstring suggestion. Regarding the use of the object, I explicitly chose a procedural paradigm for presentation although agree having objects can make the organization of the code much better. I had to drop each_char (and a few other funcs) for 1.8.6 support. The ""+parent1 was to make a quick copy of the string object (overly cautious, i know). Thanks again!
@jasonb: If you can't use each_char, use each_byte at least (it will be the same except that you get an integer, which you need to call chr on). That's still better than iterating over the indices. To create a copy of an object, you should use dup, which makes it clear to the reader what you're doing.
@jasonb: Even for procedural code it's still bad design to have the fitness code in several places inside the search method. If you don't want to use classes, I'd at least recommend a create_individual method which takes a bitstring and returns a hash with the bitstring and the fitness. This way the fitness calculation still takes place in a single location.

Stack Exchange Network

Genetic algorithm for Clever Algorithms project

4 Answers 4

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Hot Network Questions

Genetic algorithm for Clever Algorithms project

4 Answers 4

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Related

Hot Network Questions