Team:ULB-Brussels/modeling/30

From 2011.igem.org


Revision as of 03:22, 22 September 2011

Modelling: Transcriptional interference

In this section, we study the transcriptional interference between the 2 functional units.
\begin{figure}[!htp]
\begin{center}
\includegraphics{figure0.png}
\caption{\label{fig0}Schematic view of the different genes and of the motion of the RNA-polymerase molecules. The RNA-polymerase molecule on the right will meet the 4th molecule before finishing the transcription of \textit{flp}. This represents a very simple example of transcriptional interference. $N_0$: size (nt) of the Gam, Bet and Exo genes; $N_1-N_0$: size (nt) of the \textit{flp} gene.}
\end{center}
\end{figure}

As explained above, the pINDEL plasmid is constructed such that, in the presence of arabinose and at $30^\circ$C, transcription from the pBAD promoter inhibits transcription of the \textit{flp} gene by transcriptional interference.

We have modeled transcriptional interference by collisions (for a review, see Shearwin et al., 2003, Trends in Genetics). The purpose is to simulate the movement of the RNA-polymerases along the Gam, Bet and Exo genes and the \textit{flp} gene, and thereby to estimate the interference efficiency.

The RNA-polymerase molecules bind to the promoters located at positions $0$ (for pBAD) and $N_1$ (for pR). The gene encoding FLP is fully transcribed only if an RNA-polymerase molecule that has initiated transcription at $N_1$ does not meet another RNA-polymerase molecule between $N_1$ and $N_0$, and remains bound to the DNA over that whole stretch.

The program works as follows.

We enter:
  • The sizes of the two transcriptional units
  • The frequencies at which RNA-polymerases bind to the 2 promoters (i.e. the strengths of the promoters): $T_{pBAD}^{-1}$ and $T_{pR}^{-1}$
  • The elongation rate of the RNA-polymerase
  • The number of events to be simulated, $N_{pR}$ (i.e. the number of RNA-polymerases that will bind to the pR promoter)

The program then calculates the ratio between the number of RNA-polymerases that reach $N_0$ after starting from pR (at $N_1$) and the total number of RNA-polymerases that have initiated transcription at pR.

We also make the following approximations:
  • The elongation rate of the RNA-polymerase is constant and identical for the 2 transcriptional units.
  • The probability that an RNA-polymerase prematurely terminates transcription is memoryless.
  • When two RNA-polymerases collide, they both prematurely terminate transcription and do not interfere with the next round of transcription.

$N_{pR}$ is the number of RNA-polymerase molecules that bind to the pR promoter in the time interval $[0, N_{pR} \cdot T_{pR}]$, and $N_{pBAD} = N_{pR} \cdot T_{pR} / T_{pBAD}$ is the number of those that bind to the pBAD promoter in the same interval. For both promoters, the program randomly generates, within $[0, N_{pR} \cdot T_{pR}]$, the times at which the RNA-polymerases bind and initiate transcription.
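The generation of initiation times can be sketched as follows. This is only a minimal illustration, not the team's program: it assumes that the times are drawn uniformly over the interval, and the function name `initiation_times`, the seed and the value of 100 events are hypothetical choices. The intervals of 240 s (pR) and 40 s (pBAD) are the ones quoted later in this section.

```python
import random

def initiation_times(n_pr, t_pr, t_pbad, rng):
    """Draw sorted random initiation times (s) for the pR and pBAD promoters.

    t_pr and t_pbad are the mean intervals between two binding events.
    Assumption of this sketch: times are uniform over [0, n_pr * t_pr].
    """
    duration = n_pr * t_pr                 # length of the simulated window (s)
    n_pbad = round(duration / t_pbad)      # N_pBAD = N_pR * T_pR / T_pBAD
    pr_times = sorted(rng.uniform(0.0, duration) for _ in range(n_pr))
    pbad_times = sorted(rng.uniform(0.0, duration) for _ in range(n_pbad))
    return pr_times, pbad_times

rng = random.Random(0)  # fixed seed, only to make the sketch reproducible
pr_times, pbad_times = initiation_times(n_pr=100, t_pr=240.0, t_pbad=40.0, rng=rng)
```

With these values the window lasts 24000 s, so 100 events are drawn for pR and 600 for pBAD.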

In a first step, we neglect the risks of premature termination and of interference. Knowing the elongation rate, we can then predict the position of each RNA-polymerase along the pINDEL DNA as a function of time.
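As an illustration of this first step, the positions can be written as simple linear functions of time. The sketch below assumes that polymerases from pBAD move from position 0 towards increasing positions and that polymerases from pR move from $N_1$ towards $N_0$; the function names and the rate of 50 nt/s (a value inside the 24-79 nt/s range quoted later) are assumptions made for the example.

```python
N0 = 2067          # size of Gam+Bet+Exo (nt)
N1 = N0 + 1272     # position of the pR promoter (nt)
RATE = 50.0        # elongation rate (nt/s), assumed value for illustration

def position_from_pbad(t, t_start, rate=RATE):
    """Position (nt) at time t of a polymerase that left pBAD (position 0) at t_start."""
    return rate * (t - t_start)

def position_from_pr(t, t_start, rate=RATE, n1=N1):
    """Position (nt) at time t of a polymerase that left pR (position N1) at t_start."""
    return n1 - rate * (t - t_start)

def time_to_cross_flp(rate=RATE, n0=N0, n1=N1):
    """Time (s) needed to travel from N1 to N0 without termination or collision."""
    return (n1 - n0) / rate
```

Under these assumptions, an undisturbed polymerase needs about 25 s to transcribe the 1272 nt of \textit{flp}.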

In a second step, we take into account the risk of premature termination. To this end, let us make some calculations based on the above approximations.

We assume that the chance that an RNA-polymerase molecule prematurely terminates transcription does not change with the number of nucleotides already read. If we define the random variable $N$ as the number of nucleotides read by the polymerase, then $N$ follows an exponential distribution, $N \sim \mathrm{Exp}(\lambda)$. Indeed, the probability of reading, for example, $20$ more nucleotides, given that the polymerase has already read a certain number of them, does not depend on that number. $N$ is therefore memoryless, and the exponential distribution is the only continuous distribution with this property.

We still have to determine the parameter of this exponential distribution. We estimated that there is only a small chance that the polymerase transcribes the whole plasmid in one go. By taking $\lambda = 3.5 \times 10^{-4}$ per nucleotide, the probability of reading and transcribing all the nucleotides of the plasmid is sufficiently small. The distribution function of $N$ is then:
\[F(n) = P(N \leq n) = \left\{
\begin{array}{l l}
  1-e^{-0.00035\, n} & \quad \mbox{if $n \geq 0$}\\
  0 & \quad \mbox{if $n < 0$}\\ \end{array} \right. \]
We can generate random numbers following this distribution to simulate $N$ by applying $F^{-1}$ to a random variable that is uniform on $[0,1]$. Indeed, the new random variable then has $F$ as its distribution function. Let us show this. Let $X$ be a random variable that is uniform on $[0,1]$, so that
\[P(X \leq x) = \left\{
\begin{array}{l l}
  0 & \quad \mbox{if $x < 0$}\\
  x & \quad \mbox{if $0 \leq x \leq 1$}\\
  1 & \quad \mbox{if $x > 1$}\\ \end{array} \right. \]
Now, let $F$ be a given invertible distribution function, and define the random variable $Y=F^{-1}(X)$. Since $F$ is increasing and $0 \leq F(y) \leq 1$, we have:
\begin{equation}
P(Y \leq y) = P(F^{-1}(X) \leq y) = P(X \leq F(y)) = F(y)
\end{equation}
Thus $Y$ has $F$ as its distribution function, as announced.
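For the exponential distribution used here, the inverse is explicit: $F^{-1}(x) = -\ln(1-x)/\lambda$. The following sketch applies it to uniform random numbers. The function names, the seed and the sample size are hypothetical; only the value of $\lambda$ comes from the text.

```python
import math
import random

LAMBDA = 3.5e-4  # termination parameter (per nucleotide), value from the text

def inverse_cdf(u, lam=LAMBDA):
    """F^{-1}(u) = -ln(1 - u) / lam, valid for 0 <= u < 1."""
    return -math.log(1.0 - u) / lam

def sample_read_length(rng, lam=LAMBDA):
    """Draw the number of nucleotides read before premature termination."""
    return inverse_cdf(rng.random(), lam)   # rng.random() lies in [0, 1)

def survival(n, lam=LAMBDA):
    """P(N > n) = exp(-lam * n): probability of reading more than n nucleotides."""
    return math.exp(-lam * n)

rng = random.Random(1)  # fixed seed, only to make the sketch reproducible
samples = [sample_read_length(rng) for _ in range(100_000)]
mean_length = sum(samples) / len(samples)   # expected to be close to 1 / LAMBDA
```

A quick sanity check is that the sample mean should approach $1/\lambda \approx 2857$ nucleotides, the mean of an exponential distribution with parameter $\lambda$.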

In a last step, we take into account the assumption that when 2 RNA-polymerases collide, they both prematurely terminate transcription and do not interfere with the next round of transcription. To calculate the efficiency of transcriptional interference, we simply count the number of RNA-polymerases that bind at $N_1$ and reach $N_0$, and compare it with the total number of RNA-polymerases that bind at $N_1$.
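The three steps can be combined in a small simulation. What follows is a sketch under stated assumptions, not a reconstruction of the team's program. It assumes that polymerases from pBAD read through towards $N_1$ and that polymerases from pR continue towards position 0. It also assumes uniform initiation times and that a head-on meeting removes both polymerases. The function name `simulate` and all default values are hypothetical.

```python
import math
import random

def simulate(n_pr, t_pr, t_pbad, rate, n0, n1, lam, seed=0):
    """Return the fraction of pR-initiated polymerases that reach N0."""
    rng = random.Random(seed)
    duration = n_pr * t_pr
    n_pbad = round(duration / t_pbad)

    def read_length():
        # Inverse-transform sample of the read length, capped at the region size n1.
        return min(-math.log(1.0 - rng.random()) / lam, n1)

    # Each polymerase: (initiation time, maximal distance before termination).
    pbad = [(rng.uniform(0.0, duration), read_length()) for _ in range(n_pbad)]
    pr = [(rng.uniform(0.0, duration), read_length()) for _ in range(n_pr)]

    # Candidate head-on meetings between a pBAD polymerase i and a pR polymerase j.
    events = []
    for i, (a, d) in enumerate(pbad):
        for j, (b, e) in enumerate(pr):
            x = 0.5 * (n1 + rate * (b - a))      # meeting position (nt)
            if 0.0 <= x <= d and 0.0 <= n1 - x <= e:
                events.append((a + x / rate, x, i, j))
    events.sort()                                 # process in chronological order

    pbad_alive = [True] * n_pbad
    collided_at = [None] * n_pr                   # collision position of each pR polymerase
    for _, x, i, j in events:
        if pbad_alive[i] and collided_at[j] is None:
            pbad_alive[i] = False                 # both polymerases fall off the DNA
            collided_at[j] = x

    flp_len = n1 - n0
    reached = sum(
        1 for (_, e), x in zip(pr, collided_at)
        if e >= flp_len and (x is None or x <= n0)
    )
    return reached / n_pr

efficiency = simulate(n_pr=200, t_pr=240.0, t_pbad=40.0, rate=50.0,
                      n0=2067, n1=2067 + 1272, lam=3.5e-4)
```

Enumerating every pair of polymerases is quadratic in the number of events, which is acceptable for a few hundred events but would need a sorted sweep for larger runs. The value returned depends on the seed and on the assumed rate, so it should not be read as a reproduction of the figure reported by the team.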

The sizes of the 2 transcriptional units are known \cite{dat2}:
  • 2067 nt for Gam+Bet+Exo
  • 1272 nt for \textit{flp}

We also have an estimate of the elongation rate: 24-79 nt/s \cite{brem}. We can also estimate the frequencies at which RNA-polymerases bind to the promoters from the time it takes for a protein to be produced ($1/240\ \mbox{s}^{-1}$ for pR and $1/40\ \mbox{s}^{-1}$ for pBAD).

With these values, the simulation shows that only 63\% of the \textit{flp} transcripts are completed, owing to transcriptional interference. This seems coherent, considering that at $30^\circ$C \textit{flp} is already expressed at only 10\% of its maximal rate.

iGEM ULB Brussels Team