Team:ULB-Brussels/modeling/42

From 2011.igem.org


Insertion model

Definitions

Let us begin with a proper definition of the different biological functions that are considered in our model:

  • $N$: total number of bacteria per ml in the considered culture;
  • $P$: average number of pINDEL plasmid copies per bacterium;
  • $F$: average amount of active FLP per bacterium;
  • $G_i (i=1,2,3)$: average amount of the Red recombinase protein $i$ (1 is Gam, 2 is Exo and 3 is Bet), per bacterium.

Experimental design of the insertion step

A colony of E. coli containing the pINDEL plasmid is grown at $30^\circ$C in $10$ml of LB medium containing ampicillin (Amp) to select for the presence of pINDEL. This culture reaches saturation after overnight (ON) culture at $30^\circ$C (titer between $2\cdot10^9$ and $5\cdot10^9$ bacteria per ml of culture). The ON culture is then diluted $100$- to $1000$-fold and grown to logarithmic phase at $30^\circ$C in LB medium ($\mbox{OD}_{600\mbox{nm}}$ around $0.2$, corresponding to $10^8$ bacteria/ml of culture). Arabinose ($0.2$ to $1\%$) is then added to the culture to induce the expression of the IN function. These cells are electroporated with a linear PCR fragment containing the gene X of interest and the FRT'-Cm-FRT' cassette. Transformants are selected at $30^\circ$C on LB plates containing chloramphenicol (Cm) and no arabinose.

Getting the equations of the insertion step

At the initial time ($t=0$), i.e. immediately after the dilution, the number of bacteria ($N(t)$) is $N_0:=N(0)\approx10^8 \ \mbox{bact}/\mbox{ml}$. We use Verhulst's logistic model: \begin{equation} \dot N=k_NN\left(1-\frac N{N_{max}}\right) \label{N30} \end{equation} where $N_{max}$ is the maximal number of bacteria in the culture and $k_N$ is the growth rate one would observe in the absence of saturation. In our case, at saturation, the $\mbox{OD}_{600\mbox{nm}}$ slightly exceeds $1$, which corresponds to approximately $N_{max}\approx 2\cdot10^9\mbox{bact}/\mbox{ml}$. On the other hand, since in our conditions E. coli ideally divides every $20$min, far from saturation ($N_{max}=\infty$) we obtain \begin{equation} \dot N=k_NN \Rightarrow N_0e^{k_Nt}=N(t)=N_02^{t/20\mbox{min}} \quad\Rightarrow k_N\approx \frac{\log{2}}{20\cdot60}\mbox{s}^{-1}. \label{k_N}\end{equation}

At this point, all the bacteria in the culture contain the pINDEL plasmid. Note that RepA101Ts is fully active for pINDEL replication at $30^\circ$C. However, the number of plasmid copies per bacterium cannot exceed a certain number $P_{max}\approx20$ (as the origin of replication of pINDEL is low copy). At the initial time, the number of pINDEL plasmid copies per bacterium is $P_0:=P(0)\approx19$, slightly less than the maximum: immediately after the overnight culture we should theoretically have $P=P_{max}$, but we have to take into account possible accidents during the manipulations that precede the insertion step. Again, we naturally postulate a logistic model: \begin{equation} \dot P=k_PP\left(1-\frac{P}{P_{max}}\right). \label{production}\end{equation}

As pINDEL is composed of $10800$ nt and as the replication rate is about $750$ nt/s, we estimate that the replication of pINDEL takes $\frac{10800}{750}\mbox{s}=14.4 \mbox{s}$. Using the same reasoning as for $k_N$ (eq(\ref{k_N})), we compute $k_P\approx\frac{\log{2}}{14.4}\mbox{s}^{-1}$. Moreover, we have to consider the contribution of the increase in population, which produces a dilution effect. To this end, let us suppose for a moment that the plasmids no longer replicate; we then have $PN=\mbox{cst}$, thus \begin{equation} P=\frac{\mbox{cst}}N\quad\Rightarrow \dot P=-\mbox{cst}\frac{\dot N}{N^2}=-\frac{\dot N}NP. \label{dilution}\end{equation}

Combining both production (eq (\ref{production})) and dilution (eq(\ref{dilution})) effects, we get the evolution equation for $P$: \begin{equation} \dot P=k_PP\left(1-\frac P{P_{max}}\right)-\frac{\dot N}NP. \label{P30}\end{equation}

Note that this equation can be written as follows: \begin{equation} \frac d{dt}(NP)=k_PNP\left(1-\frac P{P_{max}}\right), \label{equNP}\end{equation} which allows a convenient interpretation: $NP$, the total number of pINDEL plasmid copies, follows a logistic model in which the saturation is due to $P$ only. This seems quite natural, as we will see. The evolution of the total number of plasmid copies (per ml) has to be of the form \begin{equation} \frac d{dt}(NP)=NP\cdot(g(N,P)-d(N,P)) \end{equation} in terms of a generation rate of new plasmids $g(N,P)$ and a death rate $d(N,P)$. The death rate is a priori constant and even zero in our case: $d(N,P)=d=0$. Regarding the generation rate, it has to decrease when $P$ increases, but it is not expected to depend on the number of bacteria per ml ($N$); the simplest choice is then an affine function $g(N,P)=\alpha-\beta P$, so that we find \begin{equation} \frac d{dt}(NP)=NP(\alpha-\beta P), \end{equation} which is equivalent to (\ref{equNP}) with $\alpha=k_P$ and $\beta=k_P/P_{max}$. This observation thus justifies our equation for $P$ (eq(\ref{P30})), initially obtained by heuristic reasoning.
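
As a quick check (a worked step that is not spelled out in the original text), the product rule applied to eq(\ref{P30}) gives \begin{align*} \frac d{dt}(NP)&=\dot NP+N\dot P\\ &=\dot NP+N\left[k_PP\left(1-\frac P{P_{max}}\right)-\frac{\dot N}NP\right]\\ &=k_PNP\left(1-\frac P{P_{max}}\right), \end{align*} since the two $\dot NP$ terms cancel. This is exactly eq(\ref{equNP}).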

As explained above, arabinose induces expression from the pBAD promoter (the promoter controlling the expression of the three genes $i$ on pINDEL). Keeping in mind the natural decay of the three Red recombinase proteins, the simplest way to model the evolution of the total amount (per ml) of these proteins (i.e. $G_i\cdot N$) is \begin{align} &\frac d{dt}(G_iN)=C_iPN-D_iG_iN \quad (i=1,2,3)\\ \Leftrightarrow\quad&\dot G_i=C_iP-D_iG_i-\frac{\dot N}NG_i \quad (i=1,2,3) \label{Gi30}\end{align} where $C_i$ is the production rate of protein $i$ and $D_i$ its decay rate. We estimate that a pINDEL plasmid produces one protein $i$ every $40$s: to a good approximation, we only have to consider the transcription time of the three genes, and we may suppose that the transcriptions are performed one after the other; as gam, bet and exo consist of $417$nt, $786$nt and $681$nt respectively ($1884$nt in total) and as the transcription speed is about $51.5$nt/s (between $24$ and $79$ nt/s), we find a transcription time of $1884/51.5\approx37$s, i.e. about $40$s, so that $C_i\approx\frac1{40}\mbox{s}^{-1}$. Furthermore, as these three proteins are stable, we estimate their half-life to be around $60$min; we then obtain (by the same reasoning as in eq(\ref{k_N})) $D_i\approx\frac{\log2}{60\cdot60}\mbox{s}^{-1}$.

The promoter of the flp gene is repressed by the CI857 thermosensitive repressor at $30^\circ$C. However, repression is not complete and we postulate that the pR leakiness is around $10\%$. In addition, the pR transcription is inhibited by interference with the transcription of the IN genes $i$. By computer simulation, we have been able to estimate $p_{simul}$, the probability that the flp gene is entirely transcribed (see section (\ref{IntTranscr})). Note that at $30^\circ$C FLP is active. Keeping in mind the natural decay of FLP, the simplest way to model the evolution of the total amount (per ml) of FLP (i.e. $F\cdot N$) is \begin{align} &\frac d{dt}(FN)=10\%p_{simul}C_FPN-D_FFN\\ \Leftrightarrow\quad&\dot{F}=10\%p_{simul}C_FP-D_FF-\frac{\dot N}NF \label{F30}\end{align} where $C_F$ is the production rate of FLP by pINDEL (in ideal conditions, at $100\%$ of its activity, without transcriptional interference or repression) and where $D_F$ is the decay rate of FLP. We estimate that a pINDEL plasmid produces one FLP every $24$s: to a good approximation, we only have to consider the transcription time of the flp gene; as flp consists of $1272$nt and as the transcription speed is about $51.5$nt/s (between $24$ and $79$ nt/s), we find a transcription time of about $24$s, so that $C_F\approx\frac1{24}\mbox{s}^{-1}$. Furthermore, as FLP at $30^\circ$C is stable, we estimate its half-life to be around $60$min; we then obtain (by the same reasoning as in eq(\ref{k_N})) $D_F\approx\frac{\log2}{60\cdot60}\mbox{s}^{-1}$.
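
The parameter estimates used so far can be reproduced with a few lines of arithmetic. The following Python sketch is only an illustration (the original work used Mathematica, and all variable names here are our own); it recomputes the rates from the lengths, speeds and half-lives quoted in the text.

```python
import math

# Values quoted in the text
DOUBLING_TIME_S = 20 * 60          # ideal E. coli division time
PLASMID_LENGTH_NT = 10800          # length of pINDEL
REPLICATION_SPEED_NT_S = 750       # replication rate
TRANSCRIPTION_SPEED_NT_S = 51.5    # transcription speed (range 24 to 79 nt/s)
RED_GENE_LENGTHS_NT = {"gam": 417, "bet": 786, "exo": 681}
FLP_LENGTH_NT = 1272
HALF_LIFE_S = 60 * 60              # assumed half-life of Gam, Exo, Bet and FLP

# Growth rate from the doubling time: N0 * exp(k_N t) = N0 * 2^(t / 20 min)
k_N = math.log(2) / DOUBLING_TIME_S

# Plasmid replication time and the corresponding rate
replication_time = PLASMID_LENGTH_NT / REPLICATION_SPEED_NT_S
k_P = math.log(2) / replication_time

# Transcription times; the text rounds these to 40 s and 24 s
red_time = sum(RED_GENE_LENGTHS_NT.values()) / TRANSCRIPTION_SPEED_NT_S
flp_time = FLP_LENGTH_NT / TRANSCRIPTION_SPEED_NT_S

# Decay rate from the half-life (same for D_i and D_F)
decay_rate = math.log(2) / HALF_LIFE_S

print(f"k_N = {k_N:.3e} 1/s, k_P = {k_P:.3e} 1/s")
print(f"Red genes: {red_time:.1f} s, flp: {flp_time:.1f} s")
print(f"D = {decay_rate:.3e} 1/s")
```

The unrounded transcription times come out slightly below $40$s and slightly above $24$s, which is consistent with the rounded values of $C_i$ and $C_F$ used in the model.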

We thereby obtain the following system (see eqs (\ref{N30}), (\ref{P30}), (\ref{Gi30}) and (\ref{F30})): \[ \left\{ \begin{array}{c} \dot{N}=k_NN\left(1-\frac N{N_{max}}\right)\label{N30f}\\ \dot{P}=k_PP\left(1-\frac{P}{P_{max}}\right)-\frac{\dot N}NP\label{P30f}\\ \dot{G_i}=C_iP-D_iG_i-\frac{\dot N}NG_i \qquad (i=1,2,3)\label{Gi30f}\\ \dot{F}=10\%p_{simul}C_FP-D_FF-\frac{\dot N}NF\label{F30f} \end{array} \right. \]
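
The system above can also be integrated directly. The sketch below is a minimal illustration in Python, not the Mathematica computation used for the figures: it uses a hand-written fixed-step fourth-order Runge-Kutta scheme, a single variable `G` for any of the three $G_i$ (they share the same $C_i$ and $D_i$), the value $p_{simul}=0.01$ quoted with the figures, and the initial values $G_i(0)=F(0)=0$, which are our assumption since the text does not state them.

```python
import math

# Parameter values quoted in the text
K_N = math.log(2) / (20 * 60)   # 1/s
N_MAX = 2e9                     # bact/ml
K_P = math.log(2) / 14.4        # 1/s
P_MAX = 20.0
C_I = 1 / 40                    # 1/s
D_I = math.log(2) / 3600        # 1/s
C_F = 1 / 24                    # 1/s
D_F = math.log(2) / 3600        # 1/s
LEAK = 0.10                     # postulated pR leakiness
P_SIMUL = 0.01                  # value quoted with the figures

def rhs(state):
    """Right-hand side of the system for the state (N, P, G, F)."""
    n, p, g, f = state
    growth = K_N * (1 - n / N_MAX)          # equals N'/N
    return (
        growth * n,
        K_P * p * (1 - p / P_MAX) - growth * p,
        C_I * p - D_I * g - growth * g,
        LEAK * P_SIMUL * C_F * p - D_F * f - growth * f,
    )

def rk4_step(state, dt):
    """One classical Runge-Kutta step of size dt."""
    k1 = rhs(state)
    k2 = rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = rhs(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def integrate(state, t_end, dt=5.0):
    """Integrate from t = 0 to t_end (seconds) with a fixed step."""
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(state, dt)
    return state

# G(0) = F(0) = 0 is an assumption made for this illustration
initial_state = (1e8, 19.0, 0.0, 0.0)
final_state = integrate(initial_state, 60000.0)
print(final_state)
```

Because $N$ is itself a state variable, the right-hand side does not depend explicitly on time, which keeps the integrator simple. The step of $5$s is small compared with $1/k_P\approx21$s, the fastest time scale in the system.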

Solving the equations of the insertion step

In order to solve the first equation (eq(\ref{N30f})), we set $M=1/N$; the equation then reads \begin{equation} \dot M=-\frac{\dot N}{N^2}=-k_N\left(\frac1N-\frac1{N_{max}}\right)=-k_NM+\frac {k_N}{N_{max}} \end{equation} and is easily solved to give \begin{equation} M(t)=\frac1{N_{max}}+\left(\frac1{N_0}-\frac1{N_{max}}\right)e^{-k_Nt} \end{equation} thus \begin{align} N(t)&=\frac{N_{max}N_0e^{k_Nt}}{N_0e^{k_Nt}+(N_{max}-N_0)}=N_0e^{k_Nt}\frac1{1+\frac{N_0}{N_{max}}\left(e^{k_Nt}-1\right)}\label{Nsol30}\\ &\approx N_0e^{k_Nt}\label{approx30} \end{align} where the approximation (eq(\ref{approx30})) remains valid for short times, that is \begin{equation} t\ll\frac1{k_N}\log{\left(\frac{N_{max}}{N_0}+1\right)}\approx5271\mbox{s}=1\mbox{h}27\mbox{min}51\mbox{s}. \end{equation} Saturation is reached when $t\approx9000\mbox{s}=2\mbox{h}30\mbox{min}$, as we can see in the graph (fig(\ref{graph1})) (obtained for realistic values of the parameters).
The exact solution for $N$ is plotted in blue and the exponential approximation (eq(\ref{approx30})) in red. This is obtained for $N_{max}=2\cdot10^9$bact/ml, $N_0=10^8$bact/ml and $k_N=\frac{\log2}{20\cdot60}\mbox{s}^{-1}$.
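
The closed-form solution and the validity bound of the exponential approximation are easy to evaluate numerically. The following Python sketch (function names are ours) uses the parameter values of the figure.

```python
import math

N_0 = 1e8                         # bact/ml
N_MAX = 2e9                       # bact/ml
K_N = math.log(2) / (20 * 60)     # 1/s

def n_exact(t):
    """Logistic solution N(t), written as in the second form of the text."""
    e = math.exp(K_N * t)
    return N_0 * e / (1 + (N_0 / N_MAX) * (e - 1))

def n_exponential(t):
    """Short-time approximation N0 * exp(k_N t)."""
    return N_0 * math.exp(K_N * t)

# Time scale below which the exponential approximation holds
t_limit = math.log(N_MAX / N_0 + 1) / K_N

print(f"t_limit = {t_limit:.0f} s")
print(f"N(9000 s) / N_max = {n_exact(9000) / N_MAX:.2f}")
print(f"ratio at t_limit = {n_exponential(t_limit) / n_exact(t_limit):.2f}")
```

At $t$ equal to the bound itself, the denominator $1+\frac{N_0}{N_{max}}(e^{k_Nt}-1)$ equals $2$, so the exponential approximation already overestimates $N$ by a factor of two; this is why the condition is written with $\ll$.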

The equation for $P$ (eq(\ref{P30f})) then becomes, using eq(\ref{Nsol30}): \begin{equation} \dot P=k_PP\left(1-\frac{P}{P_{max}}\right)-k_N\frac{N_{max}-N_0}{N_0e^{k_Nt}+(N_{max}-N_0)}P \end{equation} which we do not attempt to solve analytically. Instead, we solve it numerically using Mathematica: for realistic values of the parameters, we obtain the graph (fig(\ref{graph2})).
This is obtained for $N_{max}=2\cdot10^9$bact/ml, $N_0=10^8$bact/ml, $k_N=\frac{\log2}{20\cdot60}\mbox{s}^{-1}$, $k_P=\frac{\log2}{14.4}\mbox{s}^{-1}$, $P_0=19$ and $P_{max}=20$. We observe that $P(t)\approx P_{max}$ as soon as $t\gtrsim50\mbox{s}$.
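
A numerical solution of this equation can be sketched in a few lines of Python. This is an illustration under our own choices (fixed-step fourth-order Runge-Kutta, hypothetical function names), not the Mathematica code behind the figure.

```python
import math

N_0, N_MAX = 1e8, 2e9             # bact/ml
K_N = math.log(2) / (20 * 60)     # 1/s
K_P = math.log(2) / 14.4          # 1/s
P_0, P_MAX = 19.0, 20.0

def dilution_rate(t):
    """N'/N computed from the closed-form solution N(t)."""
    return K_N * (N_MAX - N_0) / (N_0 * math.exp(K_N * t) + (N_MAX - N_0))

def p_dot(t, p):
    """Right-hand side of the equation for P: production minus dilution."""
    return K_P * p * (1 - p / P_MAX) - dilution_rate(t) * p

def solve_p(t_end, dt=0.5):
    """Integrate P from t = 0 to t_end (seconds) with a fixed RK4 step."""
    t, p = 0.0, P_0
    for _ in range(int(round(t_end / dt))):
        k1 = p_dot(t, p)
        k2 = p_dot(t + dt / 2, p + dt / 2 * k1)
        k3 = p_dot(t + dt / 2, p + dt / 2 * k2)
        k4 = p_dot(t + dt, p + dt * k3)
        p += dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += dt
    return p

for t_end in (10, 50, 200, 30000):
    print(t_end, solve_p(t_end, dt=0.5 if t_end < 1000 else 5.0))
```

While the culture is still growing, the dilution term keeps $P$ slightly below $P_{max}$; the gap closes once $\dot N/N$ vanishes at saturation.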

The last two equations, for $F$ and $G_i$ (eqs (\ref{F30f}) and (\ref{Gi30f})), become, using the solution for $N$ (eq(\ref{Nsol30})): $$ \left\{ \begin{array}{c} \dot{G_i}=C_iP-D_iG_i-k_N\frac{N_{max}-N_0}{N_0e^{k_Nt}+(N_{max}-N_0)}G_i \qquad (i=1,2,3)\label{tre}\\ \dot{F}=10\%p_{simul}C_FP-D_FF-k_N\frac{N_{max}-N_0}{N_0e^{k_Nt}+(N_{max}-N_0)}F\label{ert} \end{array} \right. $$ which can also be solved with Mathematica; for realistic constants, we get the graphs (fig(\ref{graph3})) and (fig(\ref{graph4})) for $F$ and $G_i$ respectively.
This is obtained for $N_{max}=2\cdot10^9$bact/ml, $N_0=10^8$bact/ml, $k_N=\frac{\log2}{20\cdot60}\mbox{s}^{-1}$, $k_P=\frac{\log2}{14.4}\mbox{s}^{-1}$, $P_0=19$, $P_{max}=20$, $D_F=\frac{\log2}{60\cdot60}\mbox{s}^{-1}$, $C_F=\frac1{24}\mbox{s}^{-1}$ and $p_{simul}=0.01$.
This is obtained for $N_{max}=2\cdot10^9$bact/ml, $N_0=10^8$bact/ml, $k_N=\frac{\log2}{20\cdot60}\mbox{s}^{-1}$, $k_P=\frac{\log2}{14.4}\mbox{s}^{-1}$, $P_0=19$, $P_{max}=20$, $D_i=\frac{\log2}{60\cdot60}\mbox{s}^{-1}$ and $C_i=\frac1{40}\mbox{s}^{-1}$.

Note that $G_i$ and $F$ increase to a stable asymptotic equilibrium: \begin{equation} \lim\limits_{t\rightarrow\infty}{G_i(t)}=\frac{C_iP_{max}}{D_i}\approx 2.60 \cdot 10^3 \end{equation} and \begin{equation} \lim\limits_{t\rightarrow\infty}{F(t)}=\frac{10\%p_{simul}C_FP_{max}}{D_F}\approx 4.33 \end{equation} as can be seen immediately from the equations (eq(\ref{tre})) and (eq(\ref{ert})): for large $t$ the dilution term vanishes and $P\rightarrow P_{max}$, so that setting $\dot G_i=\dot F=0$ gives these values.
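
The two equilibrium values follow directly from the parameter estimates. A short Python check (variable names are ours) evaluates them:

```python
import math

P_MAX = 20
C_I = 1 / 40                    # 1/s, production rate of each Red protein
D_I = math.log(2) / 3600        # 1/s, decay rate of each Red protein
C_F = 1 / 24                    # 1/s, ideal production rate of FLP
D_F = math.log(2) / 3600        # 1/s, decay rate of FLP
LEAK = 0.10                     # postulated pR leakiness
P_SIMUL = 0.01                  # value quoted with the figures

# Long-time limits: dilution term -> 0 and P -> P_max
g_inf = C_I * P_MAX / D_I
f_inf = LEAK * P_SIMUL * C_F * P_MAX / D_F

print(f"G_i equilibrium: {g_inf:.1f} per bacterium")
print(f"F equilibrium:   {f_inf:.3f} per bacterium")
```

Both limits are proportional to the production rate and to the half-life, so a relative error in either of these estimates carries over one-to-one to the equilibrium value.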

It is important to point out that the solution of our model is only weakly sensitive to the parameters around their estimated values: a small error in a parameter results only in a small change in the solution, as can be observed by varying the parameters slightly around their estimates.
