Team:TU Munich/model/guide

From 2011.igem.org

Revision as of 16:09, 9 September 2011 by FabianFroehlich (Talk | contribs)

LyX Document


Introduction:

In the process of developing our model we had certain difficulties to find reference or guidelines to model genetic networks in prokariotic systems. The only book that covered differential equations for gene expression in our library was [reference]. It covers only very rudimentary basics but if you have no prior experience in the topic it will certainly be helpful. To ease the modeling for other team we put together a guide that explains how we developed the model for our system. The guide will not give minute details on what the biological/chemical/physical backgrounds are, for this we linked some of the titles to the corresponding wikipedia page that explains the processes in detail.

Choice of Model

The first step in modeling is always to choose wether you want to model your system in a stochastic or deterministic and time/state-discrete or time/state-continuous manner. The most convenient way to select between a discrete and continuous model is to look at the behaviour of the variables in the real world, since it makes no sense to model single molecule interactions with quantities that include fractions of molecules. A model where the state variable is discrete in most cases limits you to a stochastical model. If you pick a continuous model the choice to model the system with a deterministic or stochastic ansatz is highly dependent on what characteristics of the system you want to analyse in your model. For quantitative data, like what amount of Proten XY is produced, a deterministic model fits your needs. For qualitative data, like the variance in the production of Protein XY. The decision between a stochastic and deterministic models is sometimes also dependant on which parameters for the system are available or rather attainable.
The next step is to pick a level of detail. Here your decision should be based on what level of detail is feasible and what level of detail is needed and you should always choose the lower level of those two.The feasability again depends on the available parameters and how well the considered system is known. In your implementation the the level of detail should be reflected by the tolerances that you pass to your solver and the amount of simplifications that you apply to your equations.
In biological systems negative concentrations or populations do not make any sense. This should reflected by your equations and if you experience any negative data, you should probably check wether there is something wrong with your modeling.
The following sections will cover most of the basics that are needed for the development of a deterministic continuous model. This gui

Mass Action Law

The fundamental mathematical model for chemical reactions is the Mass Action Law, It provides a mathematical model for reactions of the type
α A+ β B ⇋ v r + v r - γ C+ δ D
The resulting differential equations are
[A] ˙ = α v r + [C ] γ [D ] δ - α γ v r - [A ] α [B ] β [B] ˙ = β v r + [C ] γ [D ] δ - β v r - [A ] α [B ] β [C] ˙ = γ v r - [A ] α [B ] β - γ v r + [C ] γ [D ] δ [D] ˙ = δ v r - [A ] α [B ] β - δ v r + [C ] γ [D ] δ
Where the [] brackets denote that the concentration is used
This method can be used for arbitrarly compex reactions that only involve one mechanistic step. In most cases the mass action law is used to model the boning of two elements
α A+ β B v r + γ AB
which is modeled as described above
[A] ˙ = - α v r + [A ] α [B ] β [B] ˙ = - β v r + [A ] α [B ] β [ AB ] ˙ = γ v r + [A ] α [B ] β
Notice that in this case we did not include the backwards reaction. If the compound AB can dissociate in its products the backwards reaction of course should be included.

Degradation

A compound does not always dissociate into its process and there are certain chemical processes that rob RNA and Proteins of their initial functionality. To take account for those degradation processes where the substance X decays, such that the decay-products are of no further importance to the system, we impose a degradation term on all degradable substances:
[X] ˙ = -k⁡[X]
where k is typically the the inverse of the half-value period of the degradation process. Sometimes this k also takes account of cell division, then the inverse of the half-value period of the cell cycle is added.

Transcription

Regulation of gene transcription is the essential to the behaviour of genetic networks so special attention should be given to this part of the modeling. The differential equation for the transcription consists of up to three terms. First an optional term for the constituent transcription of mRNA
[ mRNA ] ˙ = k l + with constituent transcription rate  k l
then the term for the degradation of mRNA
+ 1 τ 1 2 [ mRNA ]+ with half-value period  τ 1 2
and finally the most important term that models the influence of activators and repressors binding to the regulatory region of the promoter:
+ v max g( [A],⁡[R] )
with v max maximum transcription rate of the gene and some function g( [A],⁡[R] ) that can depend on the activator concentration [A] and the repressor concentration [R] .
The most basic model for g is a Hill-Function
g={ 1- [ R k R ] n (1+⁡[ R k R ] ) n for repression [ A k A ] n (1+⁡[ A k A ] ) n for activation [ A k A ] n (1+⁡[ A k A ] ) n *( 1- [ R k R ] n (1+⁡[ R k R ] ) n ) for both
where the parameter n is used to fit the shape of the function to gathered data. If no data is available a general rule of thumb is that n is the number of molecules involved in the binding process. k A and k R are the binding rates of activator and repressor.
This is of course a very basic model, but in our case we lacked some parameters and decided that a more sophisticated model would only give the illusion of a higher level of detail. However if parameters and exact behaviour of the regulatory region are known the following prototypes for g in case of activation only can be used
g={ { [ [A] k A ] n n k=0 (⁡[ A k A ] ) k sequential binding [ A k A ] n n k=0 ( n k )(⁡[ A k A ] ) k independent binding for no cooperativity { [A] n n i=1 k A i n j=0 [A] j j i=1 k A i sequential binding [ A k A ] n 1+n j=0 ( n j )[A] j j i=1 k A i independent binding for cooperative binding 
where cooperative binding means that the binding rate of the other binding sites is affected if an activator binds to a specific binding site and sequential binding means that an activator can only bind if the previous binding site is occupied. For cooperative binding the different binding rates are denoted by k A i whereas in case of no cooperativity all k A i are equal and k A is used respectively.
The formulas given above can be adapted for repression only and activation and repression in the same manner as in the basic version.
(ref: lecture notes system biophysics)

Translation

The next step after transcription of the DNA is the translation of the mRNA into proteins. Since the binding rates of the ribosome and folding process are widely unknown it is advisable to use a very simplifyied model for the translation process which only involves a translation rate v trans linearly scaling with availability of mRNA
[ Protein ] ˙ = v trans *[ mRNA ]- 1 τ 1 2 [ Protein ]
If the time τ fold needed for the folding of the protein is known it could be implemented by using Delay Differential Equations with the folding time as constant delay [ Protein ] ˙ = v trans *[ mRNA ]( t- τ folding )- 1 τ 1 2 [ Protein ]
Since solving DDEs requires a special much slower solver and the impact of the change can be expected to be very small, this possible improvement was omitted in our model.
If there are other regulatory processes involved in the mRNA translation as in the case of our AND-Gate where the translation of the T7 RNA Polymerase depends on the availability of tRNA a combination of the techniques for Mass Action Law and Transcription can be used to model their influence.

Protein Activity

As protein activity can be regulated by activators and repressors in a similiar manner as the transcription of DNA, analogous methods can be used. The the constituent aswell as the regulated activity rate will of course additionally depend on the concentration of the substrate and of the protein. In general Protein activity is modeled with the Michaelis-Menten principle
[ Product ] ˙ = v max [ Protein ] [ Substrate ] K+[ Substrate ]
where v max is the maximum protein activity and K is the concentration of substrate at which half the maximum activity is achieved. If the initial substrate concentration is much larger than K, we can simplify this equation to
[ Product ] ˙ = v max [ Protein ]
(Ref: Chaos Lac)

Initial Data

If you design a becterial application that reacts to a specific signal, the reactions occur independent of the signal will normally have reached a steady state. Thus it is advisable to set their initial values in your simulation to their respective steady state.

Simplification

In some cases your model will involve multiple subsystems that work on different timescales which will most probably lead to ill-posed problems which cause numerical problems. These timescales can be identified by dimensional analysis and sufficiently fast subsystems can be assumed to reach their steady state instantly. This means that if the reaction for x is fast its differential equation x ˙ =f( x,y )
simplifies to an algebraic equation
0=f( x ‾ ,y, )
whose solution can be substituted into the other equations.

MATLAB Implementation

If no simplifactions regarding fast/slow subsystems are made or if you experience your code to be unexpected slow, it is advisable to use a solver for stiff differential equations as ode23s.