Team:Imperial College London/Software
From 2011.igem.org
Revision as of 03:02, 22 September 2011
Software - Joint Codon Optimisation Algorithm
We wanted the flexibility to express the genes responsible for auxin production in both B. subtilis and E. coli. To achieve this, we decided to jointly codon optimise the IaaM and IaaH coding sequences. Since we could not find any software for this, we wrote our own.
Background:
The genetic code is redundant, which means that multiple codons can encode the same amino acid. Depending on the circumstances, synonymous codons are decoded by the cellular machinery at different speeds. Codon usage also varies between species. Together, these effects mean that sequence optimisation can be used to tune protein expression levels. It is tempting to think that one could codon optimise a sequence by using only an organism’s preferred codon for each amino acid. This is commonly referred to as the "one amino acid-one codon" method. Unfortunately, it does not work. Recent optimisation studies have highlighted the importance of maintaining a diverse codon population in a given coding sequence. That said, the inclusion of ‘rare codons’ has also been shown to dramatically reduce protein expression in E. coli.
Solution:
In our approach to codon optimisation, we attempted to maintain codon diversity while simultaneously limiting rare codon inclusion. This was achieved by weighting codon selection using bias tables obtained from the Codon Usage Database[a]. Joint optimisation was facilitated by combining the bias tables of E. coli and B. subtilis. After generating a seed sequence, we optimised it iteratively by stochastically pruning rare codons.
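The approach described above can be illustrated with a minimal sketch. This is not the team's actual program: the frequency values are made-up placeholders rather than Codon Usage Database entries, averaging is only one assumed way of combining two bias tables, and the function names, the rarity threshold and the pruning probability are all hypothetical.

```python
import random

# Hypothetical per-amino-acid codon frequencies for illustration only.
# These are NOT values taken from the Codon Usage Database.
ECOLI = {
    "M": {"ATG": 1.00},
    "K": {"AAA": 0.70, "AAG": 0.30},
    "R": {"CGT": 0.40, "CGC": 0.40, "AGA": 0.05,
          "AGG": 0.03, "CGA": 0.06, "CGG": 0.06},
}
BSUBTILIS = {
    "M": {"ATG": 1.00},
    "K": {"AAA": 0.60, "AAG": 0.40},
    "R": {"CGT": 0.20, "CGC": 0.20, "AGA": 0.30,
          "AGG": 0.10, "CGA": 0.10, "CGG": 0.10},
}


def combine_tables(a, b):
    """Average two bias tables codon by codon (an assumed combination rule)."""
    return {aa: {c: (a[aa][c] + b[aa][c]) / 2 for c in a[aa]} for aa in a}


def seed_sequence(protein, table, rng):
    """Pick one codon per residue, weighted by its combined frequency."""
    return [
        rng.choices(list(table[aa]), weights=list(table[aa].values()))[0]
        for aa in protein
    ]


def prune_rare(codons, protein, table, rng,
               threshold=0.10, prob=0.5, rounds=10):
    """Stochastically replace codons whose combined frequency is below threshold."""
    codons = list(codons)
    for _ in range(rounds):
        for i, aa in enumerate(protein):
            if table[aa][codons[i]] < threshold and rng.random() < prob:
                # Resample among the non-rare synonymous codons, if any exist.
                common = {c: f for c, f in table[aa].items() if f >= threshold}
                if common:
                    codons[i] = rng.choices(
                        list(common), weights=list(common.values())
                    )[0]
    return codons


if __name__ == "__main__":
    rng = random.Random(0)
    joint = combine_tables(ECOLI, BSUBTILIS)
    protein = "MKRRKR"
    seed = seed_sequence(protein, joint, rng)
    print("".join(prune_rare(seed, protein, joint, rng)))
```

Because codons are drawn in proportion to their weights rather than always taking the most frequent one, the output keeps a mix of synonymous codons, while the pruning step gradually removes those that are rare in the combined table.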
In-silico Testing:
To test our codon optimisation software, we ran the protein Dendra2 (BBa_K515007) through it. The resulting DNA sequences were then fed into GenScript’s Codon Adaptation Index (CAI) analyser, an online tool that measures the suitability of a sequence for expression in E. coli. GenScript claims that its own codon optimisation can generate sequences with a CAI > 0.8. We were able to match this.
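For readers unfamiliar with the metric, the CAI is conventionally defined as the geometric mean of each codon's relative adaptiveness, that is, its frequency divided by the frequency of the most common synonymous codon in a reference set. The sketch below shows that calculation under stated assumptions: the reference table is a made-up placeholder, the function names are hypothetical, and the reference set used by GenScript's analyser is not known to us, so this will not reproduce its scores.

```python
import math

# Hypothetical reference frequencies for illustration only (not real data).
REFERENCE = {
    "K": {"AAA": 0.70, "AAG": 0.30},
    "R": {"CGT": 0.40, "CGC": 0.40, "AGA": 0.05,
          "AGG": 0.03, "CGA": 0.06, "CGG": 0.06},
}


def relative_adaptiveness(table):
    """Map each codon to its frequency divided by the best synonymous frequency."""
    w = {}
    for codons in table.values():
        best = max(codons.values())
        for codon, freq in codons.items():
            w[codon] = freq / best
    return w


def cai(sequence, table):
    """Geometric mean of relative adaptiveness over the codons of a sequence.

    Codons absent from the table (for example ATG, if methionine is not
    listed) are skipped, which mirrors the usual exclusion of
    single-codon amino acids.
    """
    w = relative_adaptiveness(table)
    codons = [sequence[i:i + 3] for i in range(0, len(sequence) - 2, 3)]
    logs = [math.log(w[c]) for c in codons if c in w]
    if not logs:
        raise ValueError("no scorable codons in sequence")
    return math.exp(sum(logs) / len(logs))


if __name__ == "__main__":
    print(cai("AAACGT", REFERENCE))  # every codon is the preferred one
    print(cai("AAGAGG", REFERENCE))  # every codon is a less preferred one
```

A sequence built only from each amino acid's most frequent codon scores exactly 1, and every less frequent codon pulls the score down, which is why a diverse sequence that avoids rare codons can still score above 0.8.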
Figure 1: (Data generated by Imperial College iGEM team 2011.)
Future Work:
Recent work has suggested that, rather than using codon frequency tables, it is better to use codons that are read by the subset of tRNAs that are most frequently charged during amino acid starvation (http://www.plosone.org/article/info:doi/10.1371/journal.pone.0007002). Once enough data are available for both E. coli and B. subtilis, they could be incorporated into the program.