Team:St Andrews/modelling

From 2011.igem.org

Modelling

Details of our promoter: pBAD Strong

We decided to utilise the pBAD strong promoter (K206000), which was created by the British Columbia iGEM 2009 team as a mutagenized form of the naturally occurring pBAD promoter. The modelling was based upon and used references pertaining to the pBAD promoter, also known as the 'ara operon'. pBAD and pBAD strong are both arabinose-inducible promoters.

In the absence and presence of arabinose, the pBAD promoter acts as a repressor and an inducer respectively. The gene products of pBAD in Escherichia coli allow the cells to take up and catabolize L-arabinose, a five-carbon sugar. pBAD’s very basic structure is depicted below:

Figure 1: A basic structural outline of the pBAD and, adjacent, AraC protein, in the absence of arabinose. (Schleif, 2011)

The three enzymes that comprise the pBAD promoter cause the catabolism of the sugar arabinose as follows:

araA – arabinose isomerase which converts arabinose to ribulose

araB – ribulokinase which phosphorylates ribulose

araD – ribulose-5-phosphate epimerase which converts ribulose-5-phosphate which can then be metabolised via the pentose phosphate pathway (Patel, GU)

Adjacent to the three pBAD structural genes is the AraC regulatory gene. The dimeric (the compound comprises of two structurally similar subunits called monomers which together make a dimer) AraC protein actively represses transcription as well as the synthesis of the pBAD genes; this occurs when arabinose is not present in the environment. The AraC protein binds to the half sites araO2 and ara I1 which creates a loop within the DNA, thus blocking the RNA polymerase from binding to the pC and pBAD promoters. This can be seen in Figure 2, where it’s possible to comprehend the position of the relative half-sites.

Figure 2: The red circles in the AraC pockets denote the inducer, L-arabinose. (Schleif, 2003)

When arabinose is present, it binds to the AraC protein and this destabilises the AraC protein binding to the araI1-O2 half-site looped complex, but stabilizes binding to the adjacent half-sites araI1 and araI2, which are upstream of the pBAD promoter. This then ‘straightens’ the DNA loop, (Carra and Schleif, 1993) allowing the activation of the transcription of pBAD (Schleif, 2011).

Creating a model

The modelling was established using the theoretical equations used in St Andrews 2010 iGEM team which can be found in their wiki under the subheading “Typical ODE Elements”. Our system is a series of five ordinary differential equations which attempt to mathematically describe the very basic functions in our biology, they are as follows:

List of Equations

Notation Table

Programs Used

We considered coding our own basic iterative method but consequently decided that there may be too many complications and bugs to worry about so opted to perform our modelling in the MATLAB environment. We also opted for the same strategy that last year’s St Andrews team adopted, namely the 4th order Runge-Kutta solver. This solver is inbuilt to the MATLAB environment, and is stated as ode45.

Parameters Used

The various parameters were researched throughout previous iGEM team’s findings, several journals and numerous articles. The initial relevant findings and their sources are given below:

Referring back to the equations, it can be seen that we had difficulties in locating five of the parameters within any literature. These particular constants and constraints were attempted, via parameter sensitivity analysis, at being established. We used ranges for these elusive parameters to find optimal values.

Parameter Ranges

These rates were estimated from a variety of sources. The optimal extracellular arabinose concentration is actually assumed to be around 1x10-6M, as if the extracellular arabinose concentration exceeds 1x10-7M then it allows the induction of the AraC promoter [8] and according to the Standard Registry of Biological Parts [1], the ‘switching point’ for arabinose in pBAD is 3.14x10-6M.

Initial Concentrations

Results

There was a significant amount of output from the model as we had a considerable volume of parameters to compare against. 5 subplots were created from the model, which were in reference to the time evolution of the concentrations of intracellular arabinose, arabinose-AraC complex, AraC protein, mRNA coding and protegrin-1 respectively. For the production of the protegrin-1, mainly encased in equation (1.5), the parameters are unknown so there was a great deal of variation in the final graph produced in the output. Below are two selected outputs from the model which depict the different behaviour and the possible instability of the protegrin-1 production.

Figure 3: The MATLAB output from the stated initial conditions (where the y-axis is molar concentration (M) for the respective compounds or complexes for each respective equation (1.1)-(1.5) and the x-axis is time (seconds)).

Figure 4: The MATLAB output from the stated initial conditions (where the y-axis is molar concentration (M) for the respective compounds or complexes for each respective equation (1.1)-(1.5) and the x-axis is time (seconds)).

These two sets of graphs use a similar set of initial conditions chosen from the ranges given above. The initial conditions that actually created each series are labelled above each respective graph array. However it can be seen that the Figure 3 has a complication when it comes to the concentration of the protegrin-1 concentration graph (the last in each of the series). These graphs depict the instability that forms because of the initial conditions. The method employed in creating the equations allows the protegrin-1 to remain positive if it starts positive. However, after the initial time lag the protegrin-1 concentration must become negative and therefore induce the instability that is shown. Also the oscillatory and almost severe peaks and troughs formed on the graph were assumed to have been contributed from the time step utilised in the odesolver inbuilt in MATLAB. The time step in MATLAB attempts to optimise the data that it collects which can also affect the output.

Figure 4 shows a much smoother inclination, and positive, for the protegrin-1 concentration and, at the end of the iterations, appears to be approaching a steady concentration value of 4x10-4. This was an expected output for the model. This comparison between just 2 selected outputs, details the variation between very small changes in the initial conditions and how they affect the behaviour of the model.

Assumptions and Complications

The modelling was based on a set of assumptions and there were a variety of complications that were neglected to simplify the model initially. Any further work would have included more of these complications to match the real biological system.

The theory based on the system was based on a unicellular system however the concentrations used are for multicellular environments. The cell’s own pH was assumed to be similar to deionised water when referring to the degradation of arabinose inside the cell. The diffusion of arabinose within the cell was assumed to be instantaneous, moreover that once the arabinose has entered the cell it would remain intracellular. In addition to this, the diffusion of compounds within the cell to the right place was deemed as instantaneous, i.e. space in the model was neglected. There is also an equilibrium created between L-arabinose (the particular structure that induces pBAD) and L-ribulose [1], so as increased concentrations of the arabinose is added there is a shift to create L-ribulose (which is another sugar that would have been consumed by the cell). This means that the arabinose concentration may decrease slightly before the equilibrium stabilises. In our case we assumed that this shift, again, was instantaneous [1]. The degradation of arabinose was to include any consumption or use of it in other metabolic pathways in the cell.

Also it was presumed that there were infinite supplies of ribosome and ribosome binding sites available and that the RNA polymerase would not become degraded. A condition we placed on the model, as well as assumed, was that the maximal transcription rate is always greater than the half-maximal transcription rate as this makes logical sense. Dissociation and association constants of the AraC protein from the half-sites were neglected at this stage as were investigating the simplest setup of the biology. This particular process is reasonably slow and would have caused a time lag in the initial stages.

Conclusions

Based on our current results, we were not able to find conclusive data. The next stage would have been to look at the stability of the system and which rate values the model would have behaved as expected or become oscillatory.

We have a number of theories that have been discussed as to the technicalities encountered in the model. The first of which involved the maximal transcription rate constants’ units and the arabinose-AraC complex association constant. These were initially supposed to be in s-1 however the kmaxprodRNA was considered to be in Ms-1, the k1/2maxprodRNA in M and the arabinose-AraC complex in M-1s-1. In order to find the association constant, it was discussed that second order chemical kinetics be utilised and, by knowing the equilibrium constant [1], the association constant could be calculated.

The second theory was that once we have a suggested value for the association constant, we can provide some mathematical estimation in finding the transcriptional rate constants. This is based on whether the equations (1.1)-(1.3), referred to as the AraC cycle, is much faster than the PG-1 cycle, equations (1.4) and (1.5), or even if both cycles perform at similar speeds. If we assume that both sets are of a relatively comparable speed then all terms on the right hand side of the ODEs should be of a similar magnitude and we could choose the constants to show this is true. However, if the two cycles are of differing speeds then the requirement would be to find the equilibrium of AraC cycle, by allowing the left hand side of each equation equal zero. Via this process, the steady value output for the arabinose-AraC (A1AraC) complex could be obtained and then entered into the AMP cycle, therefore creating two separate systems.

References:

[1] – Schleif, R (1969), "Induction of the L-arabinose operon", J.Mol.Biol, Vol 46, pg197-199. Link to paper.

[2] – Hendrickson, W and Schleif, R (1984), "Regulation of the Escherichia coli L-Arabinose Operon Studied by Gel Electrophoresis DNA Binding Assay", J.Mol.Biol, Vol 174, pg 611-628. Link to paper.

[3] – Koostra, A et al (2009), "Differential effects of mineral and organic acids on the kinetics of arabinose degradation under lignocellulose pretreatment conditions", Biochemical Engineering Journal, Vol 43, pg 92-97. Link to paper.

[4] – Davidson, C et al (2010), "Integration of transcriptional inputs at promoters of the arabinose catabolic pathway", BMC Systems Biology, Vol 4, pg75. Link to paper.

[5] – N/A (2011), "Registry of Standard Biological Parts - pBAD Strong Characterization", Last Updated 22 Oct 2009) Link.

[6] – Alon, U (2007) “An Introduction to Systems Biology Design Principles of Biological Circuits.” London: Chapman & Hall/CRC, 1st ed.

[7] – Weldon, J et al (2007), "Structure and Properties of a Truly Apo Form of AraC Dimerization Domain", Proteins: Structure, Function and Bioinformatics, Vol 66, pg646-654. Link to paper.

[8] – Schleif, R (2010), "AraC protein, regulation of the L-arabinose operon in Escherichia coli , and the light switch mechanism of AraC action", FEMS Microbiol Rev, pg1-18. Link to paper.

Patel, Bharat. "Regulation of Gene Expression in Prokaryotes." Lecture 4. Griffiths University. Link to paper.

Schleif, R (2003), "AraC Protein: A Love-Hate Relationship." BioEssays, Vol. 25, pg. 274-282. Link to paper.

Schleif, Robert. The Johns Hopkins University. Last updated August 2011. Link to paper.

Carra, John and Schleif, R (1993) "Variation of half-site organization and DNA looping by AraC protein." The EMBO Journal, Vol. 12, pg. 35-44. Link to paper.