Latest revision as of 23:32, 21 September 2011

Genetic algorithms explained

Genetic algorithms are a class of optimization algorithms (algorithms that help to maximize some function by adjusting its input parameters) that are known for their ability to handle large, unpredictable search spaces.

In Cumulus we employ them to bridge the gap between having a forward model (a model that can simulate a biological process given its parameters) and knowing which parameters make this simulation fit the experimental data best. We do this because creating a backward model, one that tells us which parameters will fit a set of measurements, is complicated even for simple systems, let alone for larger devices. Genetic algorithms help us find the parameters without the need for a backward model.

How genetic algorithms work

A genetic algorithm mimics the process of population genetics in order to optimize some fitness criterion. In our case this criterion is based on how well the simulated data matches the experimental results.
The following pseudocode shows the basics behind a genetic algorithm.

 selectedpopulation  = initialize()
 while stop criterion has not been met
   newpopulation       = mutate(selectedpopulation)
   newpopulationscores = evaluate(newpopulation)
   selectedpopulation  = select(newpopulation, newpopulationscores)

As you can see in the code above, there are four basic steps to a genetic algorithm, three of which are repeated. Each of these steps is explained below.
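For readers who want to run the loop, the pseudocode can be sketched in Python roughly as follows. This is a minimal illustration, not the Cumulus code: the jittered starting population, the simple point mutation, the toy fitness with a target of (1.0, 2.0), and the fixed number of generations as stop criterion are all assumptions made for the example.

```python
import random

def initialize(estimate, size=20, spread=0.5, rng=random):
    """Build a starting population by jittering a user estimate (assumed scheme)."""
    return [[p + rng.gauss(0.0, spread) for p in estimate] for _ in range(size)]

def mutate(population, rng=random):
    """Placeholder mutation: keep the survivors and add one point-mutated copy of each."""
    children = [[p + rng.gauss(0.0, 0.1) for p in ind] for ind in population]
    return population + children

def evaluate(population):
    """Toy fitness (higher is better), maximal at the hypothetical target (1.0, 2.0)."""
    target = (1.0, 2.0)
    return [-sum((p - t) ** 2 for p, t in zip(ind, target)) for ind in population]

def select(population, scores):
    """Keep the better-scoring half of the population, best individual first."""
    ranked = sorted(zip(scores, population), key=lambda pair: pair[0], reverse=True)
    return [ind for _, ind in ranked[: len(ranked) // 2]]

def run(estimate, generations=50, seed=0):
    """Run the four steps; returns the best individual of the final population."""
    rng = random.Random(seed)
    selected = initialize(estimate, rng=rng)
    for _ in range(generations):  # assumed stop criterion: fixed generation count
        new_population = mutate(selected, rng=rng)
        new_scores = evaluate(new_population)
        selected = select(new_population, new_scores)
    return selected[0]
```

Calling `run([0.0, 0.0])` starts from a deliberately poor estimate and should drift towards the toy target, since the survivors are always carried over and the best individual can never get worse.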

Initialization

First of all, the algorithm needs some sort of starting point. This can be a set of parameters found in the literature, but also parameters that were found by fitting other parts. In our case the initial estimate of the parameters is made by the user. Users who have no idea what these should be can draw inspiration from parameters of the same type for different parts.

Mutation

In the mutation step we add new individuals to the population. This can be done in many different ways. Classically, crossover (using values of two individuals and mixing them up into a new individual) and point mutations (adjusting a single value by a small random amount) are very popular.
In Cumulus we use a method called Gaussian estimation. This method assumes the optimality surface is roughly in the shape of an n-dimensional Gaussian. In the selection step we then try to estimate the shape of this Gaussian by taking the covariance matrix of all the individuals in the population (weighted by fitness), after which we replenish our population by randomly drawing new individuals from this Gaussian distribution.
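The Gaussian estimation idea can be sketched in plain Python as below. This is only an illustration under assumptions: the source does not say how fitness is turned into weights, so the sketch shifts the scores by their minimum to keep the weights non-negative, and all function names are hypothetical. Sampling uses a small hand-written Cholesky factorization so the example needs no external libraries.

```python
import math
import random

def weighted_mean_cov(population, weights):
    """Fitness-weighted mean vector and covariance matrix of the population."""
    total = sum(weights)
    w = [x / total for x in weights]
    n = len(population[0])
    mean = [sum(wi * ind[j] for wi, ind in zip(w, population)) for j in range(n)]
    cov = [[sum(wi * (ind[j] - mean[j]) * (ind[k] - mean[k])
                for wi, ind in zip(w, population))
            for k in range(n)] for j in range(n)]
    return mean, cov

def cholesky(matrix, jitter=1e-9):
    """Lower-triangular L with L * L^T equal to matrix plus jitter on the diagonal."""
    n = len(matrix)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(max(matrix[i][i] + jitter - s, 0.0))
            else:
                L[i][j] = (matrix[i][j] - s) / L[j][j] if L[j][j] > 0 else 0.0
    return L

def sample_gaussian(mean, cov, count, rng=random):
    """Draw `count` individuals from the Gaussian N(mean, cov)."""
    L = cholesky(cov)
    n = len(mean)
    samples = []
    for _ in range(count):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        samples.append([mean[i] + sum(L[i][k] * z[k] for k in range(i + 1))
                        for i in range(n)])
    return samples

def gaussian_estimation_mutate(survivors, scores, target_size, rng=random):
    """Refill the population up to target_size with draws from the fitted Gaussian."""
    # Assumption: weights are the scores shifted by the lowest score.
    low = min(scores)
    weights = [s - low + 1e-12 for s in scores]
    mean, cov = weighted_mean_cov(survivors, weights)
    return survivors + sample_gaussian(mean, cov, target_size - len(survivors), rng)
```

Because the new individuals are drawn from a distribution fitted to the whole surviving population, the spread of the samples shrinks or stretches along with the survivors, which is what lets this scheme stand in for crossover and point mutation.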

Evaluation

In the evaluation step, individuals that have not yet been evaluated are scored. In most of the literature this is seen as part of the selection step. In our system, however, this step means running several models in an even larger number of experimental settings, comparing the results to measurements, and then combining all these comparisons into a single fitness score. It is safe to say that the bulk of our computational power is consumed in this step.
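As a rough illustration of combining several comparisons into one score, the sketch below uses a negated sum of squared errors over all experiments. The error measure, the straight-line `forward_model` stand-in, and the cache used to skip already-evaluated individuals are assumptions for the example, not a description of the Cumulus models.

```python
def forward_model(params, setting):
    """Hypothetical stand-in for a simulation: a straight line a*x + b."""
    a, b = params
    return [a * x + b for x in setting]

def fitness(params, experiments, model=forward_model):
    """Combine all model-versus-measurement comparisons into one score (higher is better)."""
    total_error = 0.0
    for setting, measured in experiments:
        simulated = model(params, setting)
        total_error += sum((s - m) ** 2 for s, m in zip(simulated, measured))
    return -total_error

def evaluate(population, experiments, cache):
    """Score each individual, reusing cached scores for those already evaluated."""
    scores = []
    for individual in population:
        key = tuple(individual)
        if key not in cache:  # only not-yet-evaluated individuals are simulated
            cache[key] = fitness(individual, experiments)
        scores.append(cache[key])
    return scores
```

Since the simulations dominate the run time, caching scores between generations means survivors carried over from the previous round cost nothing to re-score.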

Selection

In the selection step we discard the individuals of our population that we deem not to be good enough. For us this is as straightforward as throwing away the worst-performing half of the individuals. Because the Gaussian estimation mutation method generates enough different individuals, we do not have to worry about diversity preservation.
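Discarding the worst-performing half could look like the following sketch. The function name is hypothetical, and the handling of odd population sizes (the extra individual is discarded) is an assumption, since the text does not cover that case.

```python
def select_best_half(population, scores):
    """Return the better-scoring half of the population with the matching scores."""
    # Indices sorted from best to worst score.
    order = sorted(range(len(population)), key=lambda i: scores[i], reverse=True)
    # Assumption: for odd sizes the extra individual is discarded; keep at least one.
    keep = order[: max(1, len(population) // 2)]
    return [population[i] for i in keep], [scores[i] for i in keep]
```

For example, `select_best_half(["a", "b", "c", "d"], [0.1, 0.9, 0.5, 0.3])` keeps `"b"` and `"c"` together with their scores.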

Modularity

We programmed each of the steps (mutation, evaluation, selection) in our system as a separate object. Replacing any of the methods with a different algorithm is therefore as easy as swapping one class for another that implements the same abstract base class.
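The swap described above can be sketched for the selection step as follows. The class and method names are hypothetical and only illustrate the pattern; the same structure would apply to the mutation and evaluation steps.

```python
from abc import ABC, abstractmethod

class Selector(ABC):
    """Hypothetical abstract base class for the selection step."""

    @abstractmethod
    def select(self, population, scores):
        """Return the individuals that survive this round."""

class BestHalfSelector(Selector):
    """Keeps the better-scoring half of the population."""

    def select(self, population, scores):
        ranked = sorted(zip(scores, population), key=lambda pair: pair[0], reverse=True)
        return [ind for _, ind in ranked[: len(ranked) // 2]]

class BestOneSelector(Selector):
    """Alternative strategy: keeps only the single best individual."""

    def select(self, population, scores):
        best = max(range(len(population)), key=lambda i: scores[i])
        return [population[best]]

def selection_step(selector, population, scores):
    """The caller depends only on the Selector interface, not on a concrete class."""
    return selector.select(population, scores)
```

Passing `BestOneSelector()` instead of `BestHalfSelector()` to `selection_step` changes the strategy without touching the calling code.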


Team:Groningen/modeling genetic algorithms

From 2011.igem.org