hmep-0.1.1: HMEP Multi Expression Programming – a genetic programming variant

Safe HaskellNone
LanguageHaskell2010

AI.MEP

Description

Copyright Bogdan Penkovsky (c) 2017

Multiple Expression Programming

Example application: trigonometry cheating

Suppose, you forgot certain trigonometric identities. For instance, you want to express cos^2(x) using sin(x). No problem, set the target function cos^2(x) in the dataset and add sin to the arithmetic set of operators {+,-,*,/}. See app/Main.hs.

After running

 $ stack build && stack exec hmep-demo
 

We obtain

 Average loss in the initial population 15.268705681244962
 Population 10: average loss 14.709728527360586
 Population 20: average loss 13.497114190675477
 Population 30: average loss 8.953185872653737
 Population 40: average loss 8.953185872653737
 Population 50: average loss 3.3219954564955856e-15
 

The average value of 3.3e-15 is close to zero, indicating that the exact expression was found!

The produced output was:

 Interpreted expression:
 v1 = sin x0
 v2 = v1 * v1
 result = 1 - v2
 

From here we can infer that cos^2(x) = 1 - v2 = 1 - v1 * v1 = 1 - sin^2(x).

Sweet!

Synopsis

Documentation

type Chromosome a = Vector (Gene a Int) Source

A chromosome is a vector of genes

data Gene a i Source

Either a terminal symbol or a three-address code (a function and two pointers)

Instances

(Eq a, Eq i) => Eq (Gene a i) Source

Eq instance for Gene

(Show a, Show i) => Show (Gene a i) Source

Show instance for Gene

type Population a = [Chromosome a] Source

List of chromosomes

type Phenotype a = (Double, Chromosome a, Vector Int) Source

Loss value, chromosome, and the best expression indices vector

data Config a Source

MEP configuration

Constructors

Fields

p'const :: Double

Probability of constant generation

p'var :: Double

Probability of variable generation. The probability of operator generation is inferred automatically as 1 - p'const - p'var.

p'mutation :: Double

Mutation probability

p'crossover :: Double

Crossover probability

c'length :: Int

The chromosome length

c'popSize :: Int

A (sub)population size

c'popN :: Int

Number of subpopulations (1 or more) [not implemented]

c'ops :: Vector (F a)

Functions pool with their symbolic representations

c'vars :: Int

The input dimensionality

defaultConfig :: Config Double Source

defaultConfig = Config
 {
 p'const = 0.1
 , p'var = 0.4
 , p'mutation = 0.1
 , p'crossover = 0.9
 , c'length = 50
 , c'popSize = 100
 , c'popN = 1
 , c'ops = V.empty -- <-- To be overridden
 , c'vars = 1
 }

type LossFunction a = (Vector a -> Vector a) -> (Vector Int, Double) Source

A function to minimize.

The argument is a vector evaluation function whose input is a vector (length c'vars) and ouput is a vector with a different length c'length.

The result is a vector of the best indices and a scalar loss value.

Genetic algorithm

initialize :: PrimMonad m => Config Double -> RandT m (Population Double) Source

Randomly generate a new population

evaluatePopulation :: Num a => LossFunction a -> Population a -> Generation a Source

Using LossFunction , find how fit is each chromosome in the population

regressionLoss1 Source

Arguments

:: (Num result, Ord result)
=> (b -> b -> result)

Distance function

-> [(a, b)]

Dataset

-> (Vector a -> Vector b)

Chromosome evaluation function (partially applied evaluate)

-> (Vector Int, result)

Loss function for regression problems with one input and one output. Not normalized with respect to the dataset size.

avgLoss :: Generation Double -> Double Source

Average population loss

best :: Generation a -> Phenotype a Source

The best phenotype in the generation

worst :: Generation a -> Phenotype a Source

The worst phenotype in the generation

evolve Source

Arguments

:: PrimMonad m
=> Config Double

Common configuration

-> LossFunction Double

Custom loss function

-> (Chromosome Double -> RandT m (Chromosome Double))

Mutation

-> (Generation Double -> RandT m (Chromosome Double))

A chromosome selection algorithm. Does not need to be random, but may be.

-> Generation Double

Evaluated population

-> RandT m (Generation Double)

New generation

Selection operator that produces the next evaluated population.

Standard algorithm: the best offspring O replaces the worst individual W in the current population if O is better than W.

binaryTournament :: (PrimMonad m, Ord a) => Generation a -> RandT m (Chromosome a) Source

Binary tournament selection

crossover :: PrimMonad m => Chromosome a -> Chromosome a -> RandT m (Chromosome a, Chromosome a) Source

Uniform crossover operator

mutation3 Source

Arguments

:: PrimMonad m
=> Config Double

Common configuration

Mutation operator with up to three mutations per chromosome

smoothMutation Source

Arguments

:: PrimMonad m
=> Double

Probability of gene mutation

-> Config Double

Common configuration

Mutation operator with a fixed mutation probability of each gene

newChromosome Source

Arguments

:: PrimMonad m
=> Config Double

Common configuration

Randomly initialize a new chromosome. By definition, the first gene is terminal (a constant or a variable).

Expression interpretation

generateCode :: Phenotype Double -> String Source

Generate code for the functions with a single output

Random

data RandT m a :: (* -> *) -> * -> *

Instances

Monad m => Monad (RandT m)
Monad m => Functor (RandT m)

runRandIO :: RandT IO a -> IO a Source

Alias for mwc: Take a RandT value and run it in IO , generating all the random values described by the RandT .

It initializes the random number generator. For performance reasons, it is recommended to minimize the number of calls to runRandIO .

AltStyle によって変換されたページ (->オリジナル) /