Team:ULB-Brussels/modeling/30

From 2011.igem.org

(Difference between revisions)
m (texit > em)
 
(32 intermediate revisions not shown)
Line 11: Line 11:
   tex2jax: {inlineMath: [['$','$'], ['\\(','\\)']]}
   tex2jax: {inlineMath: [['$','$'], ['\\(','\\)']]}
});
});
-
 
</script>
</script>
<script type="text/javascript" src="path-to-mathjax/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
<script type="text/javascript" src="path-to-mathjax/MathJax.js?config=TeX-AMS-MML_HTMLorMML"></script>
 +
    <style>
-
<style>
+
#main h1{
 +
color: #690115;
 +
font-weight: bolder;
 +
}
 +
 
 +
#main h2{
 +
border-bottom-style: none;
 +
text-decoration: underline;
 +
font-weight: bolder;
 +
 
 +
}
 +
 
 +
#gris
 +
{
 +
color : #27303e;
 +
}
#menubar .left-menu noprint{
#menubar .left-menu noprint{
Line 120: Line 135:
{
{
width:980px;
width:980px;
-
height:2350px;/* Adjust so that the red bar ends up in the right place */
+
height:1500px;/* Adjust so that the red bar ends up in the right place */
margin: auto;
margin: auto;
padding-left: 5px;
padding-left: 5px;
Line 150: Line 165:
{
{
float: left;
float: left;
-
padding: 4px;
+
padding: 20px;
width:980px;
width:980px;
  /* Value of the maintext height minus 16px */
  /* Value of the maintext height minus 16px */
Line 550: Line 565:
position:relative;
position:relative;
top:112px;
top:112px;
-
margin-left: 135px; /* change this to move it horizontally */
+
margin-left: 0px; /* change this to move it horizontally */
color:white;
color:white;
widht: 900px;
widht: 900px;
Line 575: Line 590:
<a href="https://2011.igem.org/Team:ULB-Brussels">Home</a>
<a href="https://2011.igem.org/Team:ULB-Brussels">Home</a>
<a href="https://2011.igem.org/Team:ULB-Brussels/project">Project</a>
<a href="https://2011.igem.org/Team:ULB-Brussels/project">Project</a>
-
<a id="couleur"  href="https://2011.igem.org/Team:ULB-Brussels/modeling">Modeling</a>
+
<a id="couleur"  href="https://2011.igem.org/Team:ULB-Brussels/modeling">Modelling</a>
<a href="https://2011.igem.org/Team:ULB-Brussels/human">Human practice</a>
<a href="https://2011.igem.org/Team:ULB-Brussels/human">Human practice</a>
-
<a href="https://2011.igem.org/Team:ULB-Brussels/Results">Results</a>
+
<a href="https://2011.igem.org/Team:ULB-Brussels/Discussion">Discussion</a>
<a href="https://2011.igem.org/Team:ULB-Brussels/parts">Parts</a>
<a href="https://2011.igem.org/Team:ULB-Brussels/parts">Parts</a>
<a href="https://2011.igem.org/Team:ULB-Brussels/safety">Safety</a>
<a href="https://2011.igem.org/Team:ULB-Brussels/safety">Safety</a>
Line 586: Line 601:
<div id="sousm">
<div id="sousm">
-
<a  href="https://2011.igem.org/Team:ULB-Brussels/modeling/introduction">Introduction</a>
+
<a  href="https://2011.igem.org/Team:ULB-Brussels/modeling">Introduction</a>
-
<a href="https://2011.igem.org/Team:ULB-Brussels/modeling/30">Phase at 30°C</a>
+
<a href="https://2011.igem.org/Team:ULB-Brussels/modeling/30">Transcriptional interference</a>
-
<a href="https://2011.igem.org/Team:ULB-Brussels/modeling/42">Phase at 42°C</a>
+
<a href="https://2011.igem.org/Team:ULB-Brussels/modeling/42">Insertion model</a>
-
<a href="https://2011.igem.org/Team:ULB-Brussels/modeling/comparison">Comparison with the Wet Lab work </a>
+
<a href="https://2011.igem.org/Team:ULB-Brussels/modeling/excision">Excision model</a>
 +
<a href="https://2011.igem.org/Team:ULB-Brussels/modeling/loss">Loss of the pINDEL plasmid at 42°C</a>
 +
<a href="https://2011.igem.org/Team:ULB-Brussels/modeling/comparison">Comparison with data</a>
<a href="https://2011.igem.org/Team:ULB-Brussels/modeling/conclusion">Conclusion</a>
<a href="https://2011.igem.org/Team:ULB-Brussels/modeling/conclusion">Conclusion</a>
</div>
</div>
Line 598: Line 615:
<div id="maintext">
<div id="maintext">
<div id="hmaint">
<div id="hmaint">
-
Modeling : Phase at 30°C </div>
+
Modelling : Transciptional interference </div>
<div id="maint">
<div id="maint">
-
<h1>Transcriptional interference: computer simulation</h1>
 
-
<p>
+
<p>In this section, we will study the transcriptional interference between the 2 functional units.
-
In this section, we study the interference in the transcription provoked by the simultaneous expression of the gene coding for the flippase (which is performed only for $\ldots\%$ at $30^\circ$C), and of the three genes $i$ for the insertion in the bacterial DNA (which is performed for $100\%$).
+
<br>
 +
<img src="https://static.igem.org/mediawiki/2011/a/ab/Figure0.png" alt="">
 +
Schematic view of the different genes and the motion of RNA-polymerase molecules. The RNA-polymerase molecule on the right will meet the 4th molecule before finishing the transcription of flp. This represent a very simple example of transcriptional interference. $N_0$: size (nt) of the gam, bet and exo genes; $N_1-N_0$: size (nt) of the flp gene
</p>
</p>
-
<p>
+
<p>As explained above, the pINDEL plasmid is constructed such that in the presence of arabinose and at  $30\circ$C, the transcription from the pBAD promoter inhibits the transcription of the flp gene by transcriptional interference.
-
...[Jo&Pierre]
+
</p>
</p>
-
 
+
<p>We have modeled the transcriptional interference by collisions (for review, see Shearwin et al., 2003, trends in genetics). The purpose is to study this transcriptional interference by simulating the movements of the RNA-polymerases along the gam, bet and exo genes and the flp gene, and therefore estimate the interference efficiency.  
-
 
+
-
<h1>Model</h1>
+
-
 
+
-
<h2>Preparation: electroporation and night culture</h2>
+
-
 
+
-
<p>
+
-
We electropore <emph>E. Coli</emph> with Pindel plasmids. Given that the plasmids include a resistance gene to ampiciline, we can see, by testing that resistance, which bacteria actually received a Pindel plasmid. One colony of those bacteria is then cultivated at $30^\circ$C in $10$ml, where she attains dew point (between $2\cdot10^9$ and $5\cdot10^9$ bacteria per ml). The solution is diluted $100$ to $1000$ times, then cultivated again, until we reach the optic density (OD) (at $600$nm) of $0.2$, which corresponds to approximatively $10^8$ bacteria. Those bacteria are then put in touch with arabinose at $30^\circ$C.
+
</p>
</p>
-
<h2>Modelisation of the $30^\circ$C phase on arabinose</h2>
+
<p>The RNA-polymerase molecules bind to the promoters located in 0 (for pBAD) and $N_1$ (for the pR promoter). The gene encoding FLP will be transcribed only if the RNA-polymerase molecules, which have initiated transcription at $N_1$ do not meet RNA-polymerase molecules between $N_1$ and $N_0$, and still interact with DNA.
-
 
+
-
<p>
+
-
At the initial time ($t=0$), the amount of bacteria ($N(t)$) is $N_0:=N(0)\approx10^8$. It seems natural to use Verhulst's logistic model:
+
-
\begin{equation}
+
-
\dot N=k_NN\left(1-\frac N{N_{max}}\right)
+
-
\label{N30}
+
-
\end{equation}
+
-
where $N_{max}$ is the maximum amount of bacteria that the culture environment is able to contain and where $k_N$ corresponds to the growth rate one would observe in the limit where the saturation would be inexistent. In our case, the saturation density slightly exceeds $1$ OD (at $600$nm), that is approximatively $N_{max}\approx 2\cdot10^9$. On the other hand, since our <emph>E. Coli</emph> ideally duplicate every $20$min, if we are far of the saturation ($N_{max}=\infty$), we obtain
+
-
\begin{equation}
+
-
\dot N=k_NN \Rightarrow N_0e^{k_Nt}=N(t)=N_02^{t/20\mbox{\footnotesize{min}}} \quad\Rightarrow k_N\approx \frac{\log{2}}{20\cdot60}\mbox{s}^{-1}.
+
-
\label{k_N}\end{equation}
+
</p>
</p>
-
 
+
<p>The way the program works:
-
<p>
+
-
At this point, all the bacteria contain Pindel. At $30^\circ$C, RepA101 becomes active and is present in sufficient quantities to allow the plasmid's replication; the evolution of $E_{tot}$ and $E$ thus do not matter. However, we shall note that the amount of plasmids per bacterium cannot exceed a certain number $P_{max}\approx20$ (because the origin of replication of the plasmid is <em>low copy</em>). At initial time, the amount of Pindel plasmids per bacterium is $P_0:=P(0)\approx19$. Again, we naturally opted for a logistic model:
+
-
\begin{equation}
+
-
\dot P=k_PP\left(1-\frac{P}{P_{max}}\right).
+
-
\end{equation}
+
</p>
</p>
 +
We enter:
 +
<ul>
 +
<li> The size of the two transcriptional units </li>
 +
<li> The frequency in which RNA-polymerases bind to the 2 promoters (i.e the strength of the promoter): $T_{pBAD}^{-1}$ and $T_{pR}^{-1}$</li>
 +
<li> The elongation rate of the RNA-polymerase </li>
 +
<li> The number of event to be simulated ($N_pR$) (<em>i.e.</em> the number of RNA-polymerases which will bind on the pR promoter)</li>
 +
</ul>
-
<p>
+
<p>Then, it will calculate the ratio between the number of RNA-polymerases, which reach $N_0$ starting from pR ($N_1$) and the number of RNA-polymerases that have initiated transcription at pR ($N_1$).
-
By the same reasoning we used for $k_N$ (eq(\ref{k_N})), we compute $k_P\approx\frac{\log{2}}{11}\mbox{s}^{-1}$, since our plasmid replicates itself every $11$s (reference?). Moreover, we have to consider the dilution of those plasmids through the population, due to its increase. In that purpose, let us suppose for a moment that the plasmids don't replicate anymore; we then have $PN=\mbox{cst}$, thus
+
-
\begin{equation}
+
-
P=\frac{\mbox{cst}}N\quad\Rightarrow \dot P=-\mbox{cst}\frac{\dot N}{N^2}=-\frac{\dot N}NP.
+
-
\label{dilution}\end{equation}
+
</p>
</p>
-
 
+
<p>We also make the following approximations:
-
<p>
+
<ul>
-
Combining both production (eq (\ref{production})) and dilution (eq(\ref{dilution})) effects, we get the evolution equation for $P$:
+
<li>The elongation rate of RNA-polymerase is the same and constant for the 2 transcriptional units</li>
-
\begin{equation}
+
<li>The probability that RNA-polymerases prematurely terminate transcription is memoryless.</li>
-
\dot P=k_PP\left(1-\frac P{P_{max}}\right)-\frac{\dot N}NP.
+
<li>When two RNA-polymerases come in collision with each other, they prematurely terminate transcription and will not interfere with the next round of transcription</li>
-
\label{P30}\end{equation}
+
</ul>
</p>
</p>
-
<p>
+
<p>$N_pR$ represents the number of RNA-polymerase molecules that bind to the pR promoter in the time interval [$0$, $N_pR \cdot T_pR$] and $N_pBAD = N_pR .T_pR /T_pBAD$ those that bind to the pBAD promoter. For both promoters, the program will randomly generate times corresponding to the RNA-polymerases binding and transcription initiation in the time interval [$0$, $N_pR \cdot T_pR$].  
-
Remark that this equation can be written as follow:
+
-
\begin{equation}
+
-
\frac d{dt}(NP)=k_PNP\left(1-\frac P{P_{max}}\right),
+
-
\label{equNP}\end{equation}
+
-
which allows a convenient interpretation: $NP$, the total amount of Pindel plasmids, follows a logistic model but where the saturation is only due to $P$. This seems natural enough, as we will see. The evolution of the amount of plasmids has to be of the form
+
-
\begin{equation}
+
-
\frac d{dt}(NP)=NP\cdot(b(N,P)-d(N,P))
+
-
\end{equation}
+
-
in term of a birth rate of new plasmids $b(N,P)$ and a death rate $d(N,P)$. The death rate is <em>a priori</em> constant and even zero in our case: $d(N,P)=d=0$. Regarding the birth rate, it has to diminish when $P$ increases, but is obviously unlinked with the amount of bacteria $N$; the easiest is then to postulate an affine function $b(N,P)=\alpha-\beta P$, so that we find
+
-
\begin{equation}
+
-
\frac d{dt}(NP)=NP(\alpha-\beta P)
+
-
\end{equation}
+
-
which is equivalent to (\ref{equNP}). This observation thus justifies our equation for $P$ (eq(\ref{P30})), initially obtained by heuristic reasoning.
+
</p>
</p>
-
 
+
<p>In a first step, if we neglect the risks of premature termination and interferences, and knowing the RNA-polymerase elongation rate, we can predict their positions along the pINDEL DNA as a function of time.  
-
<p>
+
-
Arabinose activates Pbad (the promotor of the three-genes sequence $i$ on Pindel), in order that those $3$ genes are expressed. Keeping in mind that the expressed proteins naturally deteriorate, the easiest way to modelise the evolution of their quantity ($G_i$) is
+
-
\begin{equation}
+
-
\dot{G_i}=C_iP-D_iG_i \quad (i=1,2,3)
+
-
\label{Gi30}\end{equation}
+
-
where $C_i$ is the production rate of the protein $i$ by the Pindel plasmid and $D_i$ the deterioration rate of that same protein. We estimated $C_i\approx\ldots$ and $D_i\approx\ldots$.
+
</p>
</p>
-
 
+
<p>In a second step, we will take into account the risk of premature termination.  
-
<p>
+
-
The promotor of flippase is repressed by a thermo-sensible repressor, and, at $30^\circ$C, is only partially activated ($\ldots\%$); in addition, the transcription is hindered by a possible interference with the transcription of the genes $i$. By a computer simulation, we have been able to estimate $p_{simul}$, the probability that the flippase sequence gets entirely transcribed (see section \ref{IntTranscr}). Remark that at $30^\circ$C flippase is entirely active. Furthermore, if we take the natural deterioration of flippase in account, we can write
+
-
\begin{equation}
+
-
\dot{F}=C_Fp_{simul}P-D_FF
+
-
\label{F30}\end{equation}
+
-
where $C_F$ is the production rate of flippase by Pindel (in ideal conditions, at $100\%$ of its activity) and $D_F$ the natural deterioration rate of flippase. We estimated that $C_F\approx\ldots$ et $D_F\approx0.1$.
+
</p>
</p>
-
 
+
<p>To this end, let us make some calculations taking into account the above approximations
-
<p>
+
-
We thereby obtain the following system (see eqs (\ref{P30}), (\ref{P30}), (\ref{Gi30}), (\ref{F30})):
+
-
$$
+
-
\left\{
+
-
\begin{array}{l}
+
-
\dot{N}=k_NN\left(1-\frac N{N_{max}}\right)\label{N30f}\\
+
-
\dot{P}=k_PP\left(1-\frac{P}{P_{max}}\right)-\frac{\dot N}NP\label{P30f}\\
+
-
\dot{G_i}=C_iP-D_iG_i \qquad (i=1,2,3)\label{Gi30f}\\
+
-
\dot{F}=C_Fp_{simul}P-D_FF\label{F30f}
+
-
\end{array}
+
-
\right.
+
-
$$
+
</p>
</p>
 +
<p>We know that the chance that the RNA-polymerase molecules premature terminate the transcription does not change with the amount of nucleotides already read. If we define the random variable $N$ as the number of nucleotides read by the polymerase, we can say that $N$ follows an exponential distribution ($N \sim $ Exp$(- \lambda)$). Indeed, the probability of reading, for example, $20$ more nucleotides, knowing that the polymerase already read a certain number of them doesn't depend of that number. We can then say that N is memoryless, and thus follows an exponential distribution.
-
<p>
+
<p>We still have to figure out what the parameter of that exponential distribution is. We estimated that their is only a small chance the polymerase transcribes the whole plasmid at once. By taking $\lambda = 3,5.10^{-4}$, we have that the chance of reading and transcribing all the nucleotides of the plasmid is close enough to zero.
-
In order to solve the first equation (eq(\ref{N30f})), we pose $M=1/N$; the equation then reads
+
Now that we have the distribution of $N$:
-
\begin{equation}
+
\[F(n) = P(N \leq n) = \left\{  
-
\dot M=-\frac{\dot N}{N^2}=-k_N\left(\frac1N-\frac1{N_{max}}\right)=-k_NM+\frac {k_N}{N_{max}}
+
\begin{array}{l l}
-
\end{equation}
+
  1-e^{-0,00035 n} & \quad \mbox{if $n \geq 0$}\\
-
and easily get solved to give
+
  0 & \quad \mbox{if $n \leq 0$}\\ \end{array} \right. \]
-
\begin{equation}
+
-
M(t)=\frac1{N_{max}}+(\frac1{N_0}-\frac1{N_{max}})e^{-k_Nt}
+
-
\end{equation}
+
-
and thus
+
-
\begin{equation}
+
-
N(t)=\frac{N_{max}N_0e^{k_Nt}}{N_0e^{k_Nt}+(N_{max}-N_0)}=N_0e^{k_Nt}\frac1{1+\frac{N_0}{N_{max}}\left(e^{k_Nt}-1\right)}\approx N_0e^{k_Nt}
+
-
\end{equation}
+
-
where the approximation stays valid for short times, that is
+
-
\begin{equation}
+
-
t\ll\frac1{k_N}\log{(\frac{N_{max}}{N_0}+1)}\approx5271\mbox{s}=1\mbox{h}27\mbox{min}51\mbox{s}.
+
-
\end{equation}
+
-
</p>
+
-
<p>
+
We can generate random numbers following this distribution to simulate N, by applying $F^{-1}$ on a $[0,1]$-uniform random variable.
-
Saturation is reached when $t\approx2\mbox{h}30\mbox{min}$, like we can see on the following graph (established with realistic constants)
+
Indeed, the new random variable will then follow the law which F is the distribution function. Let us show this. Let $X$ be a $[0,1]$-uniform random variable. We have that
-
\[\mbox{[insert graph 1]}\]
+
 
-
</p>
+
\[P(X \leq x) = \left\{  
 +
\begin{array}{l l l}
 +
  0 & \quad \mbox{if $x \leq 0$}\\
 +
  x & \quad \mbox{if $0 \leq x \leq 1$}\\
 +
  1 & \quad \mbox{if $x \geq 1$}\\ \end{array} \right. \]
-
<p>
+
Now, let $F$ be a given distribution function. Define the random variable $Y=F^{-1}(X)$. We have :
-
The equation for $P$ (eq(\ref{P30f})) then becomes
+
\begin{equation}
\begin{equation}
-
\dot P=k_PP\left(1-\frac{P}{P_{max}}\right)-k_N\frac{N_{max}-N_0}{N_0e^{k_Nt}+(N_{max}-N_0)}P
+
P(Y \leq y) = P(F^{-1}(X) \leq y) = P(X \leq F(y)) = F(y)
\end{equation}
\end{equation}
-
which can't be solved analytically.  However, we can solve it numerically using <em>Mathematica</em>: for realistic values of the parameters,
 
-
\[\mbox{[insert graph 2]}\]
 
-
</p>
 
-
<p>
+
Thus Y follows the law which F is the distribution function, like announced.
-
The two last equations, for $F$ and $G_i$ (eqs (\ref{F30f}) et (\ref{Gi30f})), can also be solved  via <em>Mathematica</em>: for realistic constants,
+
-
\begin{description}
+
-
\item{for $F$:}
+
-
\[\mbox{[insert graph 3]}\]
+
-
\item{for $G_i$:}
+
-
\[\mbox{[insert graph 4]}\]
+
-
\end{description}
+
</p>
</p>
 +
<p>In a last step, we take into account the supposition that when 2 RNA-polymerases come in collision with each other, they prematurely terminate transcription and will not interfere with the next round of transcription.
-
<p>
+
In order to calculate the efficiency of transcriptional interference, we simply have to count the number of RNA-polymerases that bind to $N_1$ and reach $N_0$, and compare this number to the number of RNA-polymerases that binds to $N_1$.
-
As soon as $t\approx\ldots$, $G_i$ reaches its asymptotic maximum $C_iP_{max}/D_i\approx\ldots$ and $F$ reaches its asymptotic maximum $C_Fp_{simul}P_{max}/D_F\approx\ldots$.
+
</p>
</p>
-
<p>
+
<p>As we know the size of the 2 transcriptional units  :
-
It is important to point out that here, the solution of our model only presents a small sensitivity to the parameters around the estimated values: a small error on the parameters will only result in a small change in the solution, like we can observe if we vary the values of the parameters a little around their estimation.
+
<ul>
-
</p>
+
<li>2067 nt for gam+bet+exo </li>
-
 
+
<li> 1272 nt for flp </li>
-
 
+
</ul>
 +
We also have an idea of the elongation rate: 24-79 nt/s.
 +
We can also estimate the frequencies in which RNA-polymerases bind to the promoters as we know the time it takes to a protein to be produced ($1/240 \mbox{s}^{-1}$ for pR and $1/40 \mbox{s}^{-1}$ for pBAD).<br/>
 +
The simulation finally shows that due to transcriptional interferences only 63% of the flp is transcribed. This seems to be coherent considering the fact that under $30\circ$C flp is already produced to 10% of its maximal rate. </p>
</div>
</div>
</div>
</div>

Latest revision as of 04:44, 22 September 2011

Modelling: Transcriptional interference

In this section, we study the transcriptional interference between the 2 functional units.
Figure: Schematic view of the different genes and of the motion of the RNA-polymerase molecules. The RNA-polymerase molecule on the right will meet the 4th molecule before it finishes transcribing flp. This represents a very simple example of transcriptional interference. $N_0$: size (nt) of the gam, bet and exo genes; $N_1-N_0$: size (nt) of the flp gene.

As explained above, the pINDEL plasmid is constructed such that, in the presence of arabinose and at $30^\circ$C, transcription from the pBAD promoter inhibits the transcription of the flp gene by transcriptional interference.

We have modelled the transcriptional interference as collisions between RNA-polymerases (for a review, see Shearwin et al., 2003, Trends in Genetics). The purpose is to simulate the movements of the RNA-polymerases along the gam, bet and exo genes and the flp gene, and thereby estimate the interference efficiency.

The RNA-polymerase molecules bind to the promoters located at position 0 (for pBAD) and at position $N_1$ (for the pR promoter). The gene encoding FLP is transcribed only if the RNA-polymerase molecules that initiated transcription at $N_1$ do not meet another RNA-polymerase molecule between $N_1$ and $N_0$, and remain bound to the DNA over that whole stretch.

The way the program works:

We enter:
  • The sizes of the two transcriptional units
  • The frequencies at which RNA-polymerases bind to the 2 promoters (i.e. the strengths of the promoters): $T_{pBAD}^{-1}$ and $T_{pR}^{-1}$
  • The elongation rate of the RNA-polymerase
  • The number of events to be simulated ($N_{pR}$) (i.e. the number of RNA-polymerases which will bind to the pR promoter)

Then, the program calculates the ratio between the number of RNA-polymerases that reach $N_0$ starting from pR (at $N_1$) and the number of RNA-polymerases that have initiated transcription at pR.

We also make the following approximations:

  • The elongation rate of the RNA-polymerase is constant and the same for the 2 transcriptional units
  • The probability that an RNA-polymerase prematurely terminates transcription is memoryless
  • When two RNA-polymerases collide, they both prematurely terminate transcription and will not interfere with the next round of transcription

$N_{pR}$ represents the number of RNA-polymerase molecules that bind to the pR promoter in the time interval [$0$, $N_{pR} \cdot T_{pR}$], and $N_{pBAD} = N_{pR} \cdot T_{pR} / T_{pBAD}$ the number of those that bind to the pBAD promoter. For both promoters, the program randomly generates the times at which the RNA-polymerases bind and initiate transcription within the time interval [$0$, $N_{pR} \cdot T_{pR}$].
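The generation of initiation times can be sketched as follows. This is a minimal illustration, not the team's program: the text only says that the times are generated randomly, so drawing them uniformly over the interval is an assumption, and the function name `initiation_times` and the example values are hypothetical.

```python
import random


def initiation_times(n_pr, t_pr, t_pbad, seed=None):
    """Draw random initiation times for both promoters on [0, n_pr * t_pr].

    Assumption: times are uniformly distributed over the interval
    (the distribution is not specified in the text).
    """
    rng = random.Random(seed)
    horizon = n_pr * t_pr                  # length of the simulated interval (s)
    n_pbad = round(horizon / t_pbad)       # N_pBAD = N_pR * T_pR / T_pBAD
    pr_times = sorted(rng.uniform(0.0, horizon) for _ in range(n_pr))
    pbad_times = sorted(rng.uniform(0.0, horizon) for _ in range(n_pbad))
    return pr_times, pbad_times


# Example with T_pR = 240 s and T_pBAD = 40 s (the values quoted in this section).
pr_times, pbad_times = initiation_times(n_pr=1000, t_pr=240.0, t_pbad=40.0, seed=0)
```

Sorting the times is only a convenience for later steps that scan the polymerases in chronological order.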

In a first step, if we neglect the risks of premature termination and of interference, and knowing the RNA-polymerase elongation rate, we can predict the positions of the RNA-polymerases along the pINDEL DNA as a function of time.
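Under the constant-elongation-rate approximation, this first step reduces to straight-line motion. The sketch below is an illustration under stated assumptions: positions are measured in nucleotides from pBAD, a polymerase from pBAD moves towards $N_1$, a polymerase from pR moves towards 0, and the function names are hypothetical.

```python
def position_from_pbad(t, t_start, v):
    """Position (nt) at time t of a polymerase that left pBAD (position 0) at t_start."""
    return v * (t - t_start)


def position_from_pr(t, t_start, v, n1):
    """Position (nt) at time t of a polymerase that left pR (position n1) at t_start."""
    return n1 - v * (t - t_start)


def meeting_point(t_pbad, t_pr, v, n1):
    """Time and position at which two converging polymerases would meet.

    Returns None when one polymerase has already crossed the whole region
    before the other one starts (start times differ by more than n1 / v).
    """
    if abs(t_pr - t_pbad) > n1 / v:
        return None
    # Solve v * (t - t_pbad) = n1 - v * (t - t_pr) for t.
    t_meet = (n1 / v + t_pbad + t_pr) / 2.0
    return t_meet, position_from_pbad(t_meet, t_pbad, v)
```

For example, if both polymerases start at the same time, they meet halfway, at $N_1/2$. With $N_1 = 2067 + 1272 = 3339$ nt this is position 1669.5, which lies below $N_0 = 2067$: in that case the polymerase coming from pR has already finished flp when the meeting occurs.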

In a second step, we will take into account the risk of premature termination.

To this end, let us make some calculations based on the above approximations.

We assume that the chance that an RNA-polymerase molecule prematurely terminates transcription does not change with the number of nucleotides already read. If we define the random variable $N$ as the number of nucleotides read by the polymerase before it terminates, then $N$ follows an exponential distribution with parameter $\lambda$ ($N \sim \mbox{Exp}(\lambda)$). Indeed, the probability of reading, for example, $20$ more nucleotides, knowing that the polymerase has already read a certain number of them, does not depend on that number. In other words, $N$ is memoryless, and the exponential distribution is the only continuous distribution with this property.
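As a quick check of this property, take the survival function of an exponential variable with rate $\lambda > 0$, $P(N > n) = e^{-\lambda n}$. For any $m, n \geq 0$ we get \[P(N > m+n \mid N > m) = \frac{P(N > m+n)}{P(N > m)} = \frac{e^{-\lambda (m+n)}}{e^{-\lambda m}} = e^{-\lambda n} = P(N > n),\] so the probability of reading $n$ more nucleotides is the same whatever the number $m$ of nucleotides already read.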

We still have to determine the parameter of that exponential distribution. We estimated that there is only a small chance that the polymerase transcribes the whole plasmid in one go. By taking $\lambda = 3.5 \cdot 10^{-4}$ per nucleotide, the chance of reading and transcribing all the nucleotides of the plasmid is close enough to zero. We now have the distribution function of $N$: \[F(n) = P(N \leq n) = \left\{ \begin{array}{l l} 1-e^{-0.00035\, n} & \quad \mbox{if $n \geq 0$}\\ 0 & \quad \mbox{if $n < 0$}\\ \end{array} \right. \]

We can generate random numbers following this distribution to simulate $N$ by applying $F^{-1}$ to a $[0,1]$-uniform random variable. Indeed, the new random variable then follows the law whose distribution function is $F$. Let us show this. Let $X$ be a $[0,1]$-uniform random variable. We have \[P(X \leq x) = \left\{ \begin{array}{l l} 0 & \quad \mbox{if $x \leq 0$}\\ x & \quad \mbox{if $0 \leq x \leq 1$}\\ 1 & \quad \mbox{if $x \geq 1$}\\ \end{array} \right. \]

Now, let $F$ be a given continuous and strictly increasing distribution function, and define the random variable $Y=F^{-1}(X)$. Since $F$ is increasing, $F^{-1}(X) \leq y$ is equivalent to $X \leq F(y)$, and since $0 \leq F(y) \leq 1$ we have: \begin{equation} P(Y \leq y) = P(F^{-1}(X) \leq y) = P(X \leq F(y)) = F(y) \end{equation} Thus $Y$ follows the law whose distribution function is $F$, as announced.
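For the exponential case, solving $u = 1 - e^{-\lambda n}$ for $n$ gives $F^{-1}(u) = -\ln(1-u)/\lambda$. The sampling step can then be sketched as below. This is a minimal illustration, not the team's code: the names `LAMBDA` and `sample_read_length` are hypothetical, and the check against the theoretical mean $1/\lambda$ is only a sanity test.

```python
import math
import random

LAMBDA = 3.5e-4  # premature-termination parameter quoted in the text (per nucleotide)


def sample_read_length(rng, lam=LAMBDA):
    """Draw the number of nucleotides read before premature termination.

    Inverse-transform sampling: apply F^{-1}(u) = -ln(1 - u) / lam
    to a uniform random number u in [0, 1).
    """
    u = rng.random()
    return -math.log(1.0 - u) / lam


rng = random.Random(42)
samples = [sample_read_length(rng) for _ in range(100_000)]
mean = sum(samples) / len(samples)
print(f"empirical mean: {mean:.0f} nt, theoretical mean 1/lambda: {1.0 / LAMBDA:.0f} nt")
```

Using `1.0 - u` rather than `u` keeps the argument of the logarithm strictly positive, because `random()` can return 0 but never 1.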

In a last step, we take into account the assumption that when 2 RNA-polymerases collide, they both prematurely terminate transcription and do not interfere with the next round of transcription. In order to calculate the efficiency of transcriptional interference, we simply have to count the number of RNA-polymerases that bind at $N_1$ and reach $N_0$, and compare this number to the total number of RNA-polymerases that bind at $N_1$.
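Putting the three steps together, the counting procedure can be sketched as follows. This is one possible implementation under assumptions that the text does not state: initiation times are uniform over the interval, polymerases from pBAD keep moving past $N_0$ towards $N_1$ and polymerases from pR keep moving past $N_0$ towards 0 unless they terminate, and candidate collisions are resolved in chronological order. All function and variable names are hypothetical.

```python
import bisect
import math
import random


def read_length(rng, lam):
    """Nucleotides read before premature termination (inverse-transform sampling)."""
    if lam == 0:
        return math.inf  # no premature termination
    return -math.log(1.0 - rng.random()) / lam


def simulate(n_pr, t_pr, t_pbad, v, n0, n1, lam, seed=None):
    """Fraction of pR-initiated polymerases that reach N_0 (flp fully transcribed)."""
    rng = random.Random(seed)
    horizon = n_pr * t_pr
    n_pbad = round(horizon / t_pbad)

    # Each polymerase: (initiation time, distance it would cover without collision).
    pr = sorted((rng.uniform(0.0, horizon), min(read_length(rng, lam), n1))
                for _ in range(n_pr))
    pbad = sorted((rng.uniform(0.0, horizon), min(read_length(rng, lam), n1))
                  for _ in range(n_pbad))
    pbad_starts = [start for start, _ in pbad]

    # Candidate collisions: only pairs whose start times differ by at most n1 / v.
    window = n1 / v
    events = []
    for i, (t_i, reach_i) in enumerate(pr):
        lo = bisect.bisect_left(pbad_starts, t_i - window)
        hi = bisect.bisect_right(pbad_starts, t_i + window)
        for j in range(lo, hi):
            s_j, reach_j = pbad[j]
            t_meet = (window + t_i + s_j) / 2.0
            x = v * (t_meet - s_j)  # meeting position, measured from pBAD
            # Both polymerases must still be on the DNA at the meeting point.
            if x <= reach_j and n1 - x <= reach_i:
                events.append((t_meet, i, j, x))

    # Resolve collisions in chronological order; each polymerase collides at most once.
    events.sort()
    pr_alive = [True] * n_pr
    pbad_alive = [True] * n_pbad
    hit_inside_flp = [False] * n_pr
    for _, i, j, x in events:
        if pr_alive[i] and pbad_alive[j]:
            pr_alive[i] = pbad_alive[j] = False
            if x > n0:  # collision between N_1 and N_0
                hit_inside_flp[i] = True

    flp_length = n1 - n0
    completed = sum(1 for i, (_, reach_i) in enumerate(pr)
                    if reach_i >= flp_length and not hit_inside_flp[i])
    return completed / n_pr


# Example run; v = 50 nt/s is an arbitrary value inside the 24-79 nt/s range.
ratio = simulate(n_pr=2000, t_pr=240.0, t_pbad=40.0, v=50.0,
                 n0=2067, n1=3339, lam=3.5e-4, seed=0)
print(f"fraction of pR polymerases reaching N_0: {ratio:.2f}")
```

A useful sanity check for such a sketch is to switch off the pBAD traffic (for instance with `t_pbad=math.inf`): the ratio should then approach $e^{-\lambda (N_1 - N_0)} \approx 0.64$, the probability of reading flp without premature termination. The value returned with both promoters active depends on the assumptions listed above and need not match the figure reported by the team.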

The sizes of the 2 transcriptional units are known:

  • 2067 nt for gam+bet+exo
  • 1272 nt for flp
We also have an estimate of the elongation rate: 24-79 nt/s. We can also estimate the frequencies at which RNA-polymerases bind to the promoters, as we know the time it takes for a protein to be produced ($1/240 \mbox{ s}^{-1}$ for pR and $1/40 \mbox{ s}^{-1}$ for pBAD).
The simulation finally shows that, due to transcriptional interference, only 63% of the flp transcriptions initiated at pR are completed. This seems coherent with the fact that at $30^\circ$C flp is already produced at only 10% of its maximal rate.
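A few orders of magnitude follow directly from these figures (a rough worked example that only combines the values quoted above, with $T_{pR} = 240$ s and $T_{pBAD} = 40$ s): \[N_1 = 2067 + 1272 = 3339 \mbox{ nt}, \qquad \frac{N_{pBAD}}{N_{pR}} = \frac{T_{pR}}{T_{pBAD}} = \frac{240}{40} = 6,\] \[\frac{1272 \mbox{ nt}}{79 \mbox{ nt/s}} \approx 16 \mbox{ s} \quad \leq \quad \mbox{time to transcribe flp} \quad \leq \quad \frac{1272 \mbox{ nt}}{24 \mbox{ nt/s}} = 53 \mbox{ s}.\] So about six RNA-polymerases leave pBAD for each one that leaves pR, and each polymerase coming from pR needs between roughly 16 s and 53 s to clear flp, during which it is exposed to collisions.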
