Genetic algorithms explained

Genetic algorithms are a class of optimalisation algorithms (algorithms that help to maximise some function by adusting the input paramters) that are know for their abilety to handle large unpredicatable search spaces.

In Cumulus we employ them the bridge the gap between having a forward model (a model that can simulate a biological process given its parameters) and knowing what paramters makes this simulation fit the experimental data best. We do this because creating a backward model, one that tells us what paramters will fit a set of measurements is complicated for even simple systems, let allone for larger devices. Genetic algoritms will help us find the parameters

How genetic algorithms work

A genetic alogoritm mimics the process of population genetics in order to optimise some fitness criteron. In our case this criterion is based on how good the simulated data matches the experimental results.
The following pseusocode shows the basics behind a genetic algoritm.

 selectedpopulation  = initialize()
 while stop criterion has not been met
   newpopulation       = mutate(selectedpopulation)
   newpopulationscores = evaluate(newpopulation)
   selectedpopulation  = select(newpopulation, newpopulationscores)

As you can see in the code above there are five basic steps to a genetic algorithm, three of these are repeating. Each of these steps is is explained below.

Initialization

First of the algorithm needs some sort of starting point. This can be In our case this the estimation of the parameters made by the user. If he has no idea what these should he can draw inspiration by looking at parameters of the same type for different parts.

Mutation

In the mutation step we add new individuals to the population. This can be done in many different ways. Classically crossingover (using values of two individuals and mixing them up into a new individual) and point mutations (adjusting a single value by a small random amount) are very popular.
In cumulus we use a method calles gausian estimation. This method assumes the optimalety surface is roughly in the shape of a n-dimensiona gausian. In the selection step we try to estimate the shape of this guasion by taking the covariance matrix of all the individuals in the population(weigthed by fitness) Then we restore our population by randomly drawing new individuals from this gausian distribution.

Evaluation

In the evaluation step yet unevaluated individuals are evaluated. In most literatures this is seen as a part of the selection step. In our system however this means the running of several model in in even more experimental settings and comparing to aquired measurements. Then combining all these comparisons into a single fitness score. It is safe to say that the brunt of our computation is consumed in this step.

Selection

In the selection step we discard some individuals of our population that we deem not good enough. For us this is as strait-forward as throwing away the worst preforming half individuals. Because of the gausian estimation mutation method we are generating enough different individuals for us not to worry about diversity preservation.

Modularety

We programmed each of the steps (mutation, evaluation, selection) in our system as sepaparate objects. Because of this changing any of the methods by a different algoritm replacing them is as eazy as swapping a class by another one implementing the same abstract class.

Team:Groningen/modeling genetic algorithms

From 2011.igem.org