1

I'm implementing Genetic Algorithm in C# WinForms to find optimal solution of a very simple mathematical equation. The word "simple" here means that the equation must have these qualities:

  • Only consists of whole positive numbers
  • End result of equation must be a positive number lesser than 1001
  • A minimum of 2 variables up to a maximum of 7 variables are allowed
  • Coefficients of each variables must be between 1 to 10 (lower bound and upper bound inclusive)
  • Variable values are from 0 - 500
  • Only allows addition

I thought that I would be able to reach optimum solution with the equation being so "easy", one would say. However, I always only get a solution which is very close to the optimal solution. To evaluate whether the solution is optimal, I use the following formula:

f(x) = absolute((sum_of_all_variables) - equation_result)

For an example, I have the following equation:

1a + 1b = 12

If a is 2 and b is 3, then the f(x) of said individual would be abs(2 + 3 - 12) = 7. My code stops when f(x) reaches 0 or I have generated 50 generations.

My current mutation rate is 50% and my current crossover rate is 25%. Selection method used is roulette-wheel selection. Mutation method is random mutation, i.e. I simply pick a random gene in my pool of genes to change its value, up to 50% genes are changed per generation.

My expectation: the code yields a solution (individual) that has a f(x) value of 0, which is one of the optimal solutions available.

Current result: the code yields a solution that is nearly optimal. One common pattern I've found is that the solution that is nearly optimal are often replicated exactly in the next generation.

My guess on the problem now:

I think that it has something to do with how I do my crossovers. What I do to crossover:

  • Assign a random value to each individual
  • If said random value is less than crossover rate, then save that individual's data to later be used for crossover
  • After gathering all to-be-crossovered individuals, I randomly select a crossover point for each individual
  • I think this is the problem: I do crossover on the first individual with the second individual on the to-be-crossovered-individual list, merging genes from the first individual and second individual that is to be crossover-ed. Then, proceed with the second individual and third individual and so on. The last individual will be crossovered with the first individual.

However, as I said, it doesn't yield optimum result. Is it a problem with my logic or is this the expected behaviour of Genetic Algorithm on this simple mathematical equation?

Crossover methods I'm using for reference (I'm using C# WinForms to display data in ListView):

private static void Crossover(List<Chromosome> chromosomes, Random seed)
{ 
 List<int> crossoverChromosome = new List<int>();
 for (int i = 0; i < 50; i++)
 {
 decimal randomedValue = RandomizeValue(seed); 
 if (randomedValue < Population.CrossoverRate) crossoverChromosome.Add(i); 
 }
 for (int i = 0; i < crossoverChromosome.Count; i++)
 {
 int crossoverPoint = seed.Next(0, chromosomes[0].GeneValues.Count);
 chromosomes[crossoverChromosome[i]] = chromosomes[crossoverChromosome[i]].MixChromosome(chromosomes[crossoverChromosome[(i + 1) % crossoverChromosome.Count]], crossoverPoint);
 }
}
private static decimal RandomizeValue(Random seed)
{
 return Math.Round((decimal)seed.NextDouble(), 5);
}
public Chromosome MixChromosome(Chromosome mixture, int crossoverPoint)
{
 List<Gene> newGenes = new List<Gene>();
 newGenes.AddRange(this.GetGenes(0, crossoverPoint));
 newGenes.AddRange(mixture.GetGenes(crossoverPoint, this.GeneValues.Count));
 return new Chromosome(DesiredValue, OperatorData, newGenes); // Ignore the DesiredValue and Operator Data, it has nothing to do with crossover
}
private List<Gene> GetGenes(int firstIndex, int lastIndex)
{
 List<Gene> slicedGenes = new List<Gene>();
 for (int i = firstIndex; i < lastIndex; i++)
 {
 slicedGenes.Add(Genes[i].CloneGene());
 }
 return slicedGenes;
}

I've only heavily tested equations with two variables.

EDIT

Additional info:

  • I do not use elitism
  • Initial population size = 50 individuals
  • Population size per generation = 50 individuals

10 reruns on the equation 1a + 1b = 10 yields the following best fitness values after 50 generations:

  • First rerun: 24
  • Second rerun: 11
  • Third rerun: 42
  • Fourth rerun: 13
  • Fifth rerun: 5
  • Sixth rerun: 19
  • Seventh rerun: 7
  • Eighth rerun: 1
  • Ninth rerun: 6
  • Tenth rerun: 29
  • Additional info on each rerun: the chromosome which yields the best fitness value in 50 generations is often repeated in the population. For example, on the eighth rerun, I find a lot of occurrences on chromosome with value a 4 and value b 7.
asked Jun 17, 2019 at 10:53
4
  • There are many parameters that play a role in building a genetic algorithm. There is usually significant trial-and-error involved, and even then, due to randomness, there is no guarantee of finding optimal answers. Your question can be improved if you clearly state all the parameters you are using. For example: how big is the initial population? Do you use any kind of elitism? How many times have you run this, and what is the typical/best/worse fitness of runs? Commented Jun 17, 2019 at 11:11
  • @tucuxi I've added more information. Please take a look. Commented Jun 17, 2019 at 11:27
  • 1
    You only posted a small part of your code and it's highly likely for the problem to be how you defined the classes for this specific problem or how you're calling the methods you posted. You should preferably just post a minimal reproducible example. Ideally this would also include a detailed high-level description of what your code is doing. Although generally these sorts of issues are best solved by debugging to see what your code is doing. But a 50% mutation rate is huge - this should probably be less than 10%, maybe closer to 1%. Commented Jun 17, 2019 at 12:03
  • @Dukeling I tweaked my mutation rate to 50% because all the chromosomes in the final generation (50th generation) of each rerun are repeated heavily (more than 50%). Even with such a high mutation rate, I still have heavy repetition of similar chromosomes that are nearest to f(x) = 0. I think that my solutions are converging to a local maximum solution, not the global maximum. Commented Jun 17, 2019 at 14:35

1 Answer 1

3

One mayor problem is (like you assumed) the crossover. In the way you describe it (or at least as far as I understand) you do crossover on every individual of your solution that is randomly chosen to generate the next population.

The basic idea behind a genetic algorithm is Darwin's survival of the fittest and not some random survival (or reproduction).

So I think the problem is that every individual has the same chance for reproduction, which doesn't make much sense in a genetic algorithm. A better solution would be to let the individuals, that lead to a result that is near to the correct one, reproduce more than an individual that doesn't lead to a good result. Otherwhise it would be quite similar to a random search.

Normally it's still usefull to have a random choosing of the individuals for reproduction, because this will lead to results that are not always the best ones but explore a bigger range of the seeking area.

One common way to do this is by using a fitness-proportional selection, that will choose the individuals for reproduction randomly, but based on the fitness values. So individuals with a high fitness (the ones that lead to a result near to the correct one in your example) have a higher chance of reproducing.

Another common way would be a stochastically distributed selection that will also choose random individuals for reproduction, with a higher chance for better individuals, but that will also guarantee that the individuals that are better than the average fitness will be reproducing at least once.

An example implementation of a Fitness-Proportional-Selection could look like this (unfortunately it's java code and no C#, but they are quite similar...):

import java.util.concurrent.ThreadLocalRandom;
import com.google.common.annotations.VisibleForTesting;
/**
 * A selector that randomly chooses pairs to be selected for reproduction based on their probability to be selected.
 */
public class FitnessProportionalSelector implements Selector {
 /**
 * Select pairs of parents (by index) that are combined to build the next generation.
 * 
 * @param selectionProbability
 * The probability to be selected for every DNA in the current population (sums up to 1).
 * 
 * @param numPairs
 * The number of pairs needed (or the number of individuals needed in the next generation).
 * 
 * @return Returns an int-array of size [numPairs * 2] including the pairs that are to be combined to create the next population (a pair is on
 * position [i, i+1] for i % 2 = 0).
 */
 @Override
 public int[] select(double[] selectionProbability, int numPairs) {
 double[] summedProbabilities = Selector.toSummedProbabilities(selectionProbability);
 int[] selectionPairs = new int[numPairs * 2];
 double chosenProbability;
 for (int i = 0; i < numPairs * 2; i++) {
 chosenProbability = getRandomNumber();
 selectionPairs[i] = Selector.getSelectedIndexByBisectionSearch(summedProbabilities, chosenProbability);
 }
 return selectionPairs;
 }
 @Override
 public String toString() {
 return "FitnessProportionalSelector []";
 }
 @VisibleForTesting
 /*private*/ double getRandomNumber() {
 return ThreadLocalRandom.current().nextDouble();
 }
}

Or a solution for the Stochastically-Distributed-Selection (also in java):

import java.util.concurrent.ThreadLocalRandom;
import com.google.common.annotations.VisibleForTesting;
/**
 * A selector that chooses the pairs to be reproduced by a stochastically distributed selection method.
 * 
 * The selection probability is proportional to the given probability, but it's ensured, that individuals with a probability above average are chosen
 * at least once.
 */
public class StochasticallyDistributedSelector implements Selector {
 /**
 * Select pairs of parents (by index) that are combined to build the next generation.
 * 
 * @param selectionProbability
 * The probability to be selected for every DNA in the current population (sums up to 1).
 * 
 * @param numPairs
 * The number of pairs needed (or the number of individuals needed in the next generation).
 * 
 * @return Returns an int-array of size [numPairs * 2] including the pairs that are to be combined to create the next population (a pair is on
 * position [i, i+1] for i % 2 = 0).
 */
 @Override
 public int[] select(double[] selectionProbability, int numPairs) {
 double[] summedProbability = Selector.toSummedProbabilities(selectionProbability);
 int[] selectedPairs = new int[2 * numPairs];
 double startPoint = getRandomNumber();
 double addedAverage = 1d / (2d * numPairs);
 double stochasticallySelectedProbability;
 for (int i = 0; i < numPairs * 2; i++) {
 //select the pairs stochastically
 stochasticallySelectedProbability = startPoint + i * addedAverage;
 stochasticallySelectedProbability %= 1;
 selectedPairs[i] = Selector.getSelectedIndexByBisectionSearch(summedProbability, stochasticallySelectedProbability);
 }
 //shuffle the pairs to distribute them stochastically
 shuffle(selectedPairs);
 return selectedPairs;
 }
 @VisibleForTesting
 /*private*/ void shuffle(int[] selectedPairs) {
 //shuffle the selected pairs in place
 int swapIndex;
 int tmp;
 for (int i = selectedPairs.length - 1; i > 0; i--) {
 swapIndex = (int) (getRandomNumber() * (i + 1));
 tmp = selectedPairs[i];
 selectedPairs[i] = selectedPairs[swapIndex];
 selectedPairs[swapIndex] = tmp;
 }
 }
 @VisibleForTesting
 /*private*/ double getRandomNumber() {
 return ThreadLocalRandom.current().nextDouble();
 }
 @Override
 public String toString() {
 return "StochasticallyDistributedSelector []";
 }
}

I took these example codes from a genetic optimizer project I created for my master thesis. If you want to have a look at it you can find it on my github account

Some further improvements:

  • try using the mean square error instead of the absolute value (f(x) = absolute((sum_of_all_variables) - equation_result))
  • using a selection pressure is another good way to choose the right individuals for reproduction (and getting the parameters to converge); you can find a solution in the github project in the package genetic_optimizer.selection if you need examples.
  • elitism is quite usefull here to not loose the best solution you already have (at least if you don't keep it in the population, keep it back to return it after the calculation)
answered Jun 17, 2019 at 12:39
Sign up to request clarification or add additional context in comments.

7 Comments

Hello, I think I got your answer! So basically, you use roulette-wheel selection to determine which pair of genes to use to do the crossover, right? In your implementation, you use cumulative probability and a random chosen value to determine the position of the first chromosome in the pair. Then, you use the chromosome that comes after that chromosome in the chromosome list to do crossover, right? I'm using one-cut point crossover, will that be okay?
Tobias, I also have a question. Before I do the crossover process, I go through a selection process which is basically selecting chromosomes which will go to the next generation IMMEDIATELY without modification, this means that some chromosomes are discarded and some are replicated using the same principle: fitness-proportionate selection. Is doing a crossover using the same selection method redundant or will this lead into erroneous results? THAT is what I'm currently doing, except for the fact that during crossover, I'm randomly selecting chromosomes.
After reading back, am I mistaking selection and crossover as two different steps when they're actually one? My current code: selection using fitness-proportionate selection, then crossover randomly chosen chromosomes. My thoughts after reading your answer: I should select chromosomes to do crossover using that fitness-proportionate selection principle. However, I have a question: from your answer, if I have 50 chromosomes in each generation, and I've gotten 25 pairs of said chromosomes, how do I turn them into 50 individuals? continue...
Do I crossover only the first chromosome in the pair with the second chromosome using whatever crossover method and just place the second chromosome without modification directly into the population, thus meaning each pair yields 2 chromosomes: the first one altered, the second one not altered? I'm sorry if my questions are lengthy, but I hope that you can get to them all. Please do ask me for more details if you need. I'm an undergraduate trying to implement this algorithm, I'm sorry if I sound ignorant.
@Richard When using the fitness proportional (or rolette-wheel) selection I choose all individuals by randomly selecting them (where better ones have a higher probability to be chosen). So I don't choose one and take the next in the list when selecting the individuals for the cross over. OneCutPoint crossover should be ok.
|

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.