Multiplex Automated Genome Engineering
Multiplex automated genome engineering (MAGE) allows for large-scale programming and evolution of cells. Mediated by λ-Red ssDNA-binding protein β, oligos are incorporated into the lagging strand of the replication fork during DNA replication, creating a new allele that will spread through the population as the bacteria divide. The efficiency of oligo incorporation depends on several factors, but the frequency of the allele can be increased by performing multiple rounds of MAGE on the same cell culture. MAGE facilitates rapid and continuous generation of a diverse set of genetic changes (mismatches, insertions, deletions). This multiplex approach embraces engineering in the context of evolution by expediting the design and evolution of organisms with new and improved properties.
Figure adapted from Wang et al., 2009. Each cell contains a different set of mutations, producing a heterogeneous population of rich diversity (denoted by distinct chromosomes in different cells). Degenerate oligo pools that target specific genomic positions enable the generation of a diverse set of sequences at each chromosomal location.
This figure is from Wang et al., 2009. MAGE is capable of producing mismatches, insertions or deletions.
This graphic, from Farren Isaacs, describes how to design oligos for MAGE. Our oligos were approximately 90 bases long, with the first 5’ base phosphorothioated to increase recombination efficiency. Because we integrated RiAFP into the genome in replichore 1 (sites 78110 and 1415740; determined at ecocyc.org), we designed the oligos to target the appropriate strand for that replichore. Mismatches, insertions, and deletions were centered on the oligo to increase recombination efficiency.
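As a minimal sketch of the centering rule described above, the following Python builds a 90-base mismatch oligo with the changed bases in the middle. The function names, the `target_reverse_strand` flag, and the toy reference sequence are assumptions for illustration; the strand choice is left to the caller, and the phosphorothioate modification is not modeled.

```python
# Hypothetical helper names; not part of any published MAGE tool.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def design_mismatch_oligo(reference: str, position: int, new_bases: str,
                          oligo_length: int = 90,
                          target_reverse_strand: bool = False) -> str:
    """Build an oligo that replaces len(new_bases) reference bases starting
    at `position` (0-based), with the change centered in the oligo."""
    flank_total = oligo_length - len(new_bases)
    left = flank_total // 2           # homology arm upstream of the change
    right = flank_total - left        # homology arm downstream of the change
    start = position - left
    end = position + len(new_bases) + right
    if start < 0 or end > len(reference):
        raise ValueError("not enough flanking sequence for this oligo length")
    oligo = (reference[start:position]
             + new_bases
             + reference[position + len(new_bases):end])
    # Assumption: the caller decides which strand is the lagging-strand target.
    return reverse_complement(oligo) if target_reverse_strand else oligo


if __name__ == "__main__":
    toy_reference = "ACGT" * 50       # placeholder sequence, not genomic data
    print(design_mismatch_oligo(toy_reference, 100, "T"))
```

Keeping the strand decision as an explicit flag avoids hard-coding a replichore rule into the sketch.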
This figure (Wang et al., 2009) describes the efficiency of incorporation for different types of sequence modifications.
Design and modeling of AFP-evolving oligonucleotides
We designed and ordered 22 degenerate oligos to target the theorized ice-binding site of RiAFP. These oligos were designed to insert additional Tx repeats, delete Tx repeats, delete entire TxT segments, and replace regions with degenerate TxTxTxT repeats. The specific MAGE oligos and design methodology can be found in this document.
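One way to count how many distinct sequences a single degenerate oligo encodes is to multiply the number of bases allowed at each position under the standard IUPAC nucleotide codes. The sketch below assumes the oligos are written with those codes; the example sequence is hypothetical and is not one of our 22 oligos.

```python
from math import prod

# Number of bases represented by each IUPAC nucleotide code.
IUPAC_DEGENERACY = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def oligo_complexity(oligo: str) -> int:
    """Number of distinct DNA sequences encoded by one degenerate oligo."""
    return prod(IUPAC_DEGENERACY[base] for base in oligo.upper())


if __name__ == "__main__":
    # Hypothetical Thr-x-Thr stretch: ACC (Thr), NNK (any residue), ACC (Thr).
    print(oligo_complexity("ACCNNKACC"))
```

For the hypothetical `ACCNNKACC` stretch this gives 4 × 4 × 2 = 32 DNA sequences; the count is at the nucleotide level, so several of them can encode the same amino acid.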
The degeneracy of the entire set of 22 oligos represents the following complexity:
During MAGE, the distribution of genetic variants in a cell population depends on several factors: the number of genomic loci simultaneously targeted (K), the efficiency of oligo binding at each of those loci (M), and the number of MAGE cycles performed (N). For a simple system of identical mismatch modifications at discrete loci, such as ten separate gene knockouts, the cells are binomially distributed over the number of mutations j (modified from Wang et al., 2009):
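A form of this distribution consistent with the definitions above is sketched here. It is a reconstruction from the description, assuming every locus is modified independently and with the same per-cycle efficiency M, so that a locus is still unmodified after N cycles with probability (1 − M)^N:

```latex
P(j) = \binom{K}{j}\,\bigl[1-(1-M)^{N}\bigr]^{j}\,\bigl[(1-M)^{N}\bigr]^{K-j},
\qquad j = 0, 1, \dots, K
```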
(In this binomial distribution, the exponents N represent the compounding effect of multiple cycles, since the probability of remaining unmutated in cycle N also depends on being unmutated in cycle N − 1.) With such a distribution and an uptake efficiency of 30% per locus per cycle, the majority of cells will have incorporated all 10 knockouts after 90 cycles (modified from Wang et al., 2009):
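The binomial model described above can be explored numerically. The sketch below is an illustration under the same simplifying assumptions (independent loci, identical per-cycle efficiency); the function names are hypothetical.

```python
from math import comb


def mutation_distribution(k_loci: int, efficiency: float, n_cycles: int) -> list:
    """P(j) for j = 0..K: fraction of cells carrying exactly j of the K
    targeted mutations after N cycles, under the simple binomial model."""
    # Probability that a given locus has been modified at least once in N cycles.
    p_locus = 1.0 - (1.0 - efficiency) ** n_cycles
    return [
        comb(k_loci, j) * p_locus ** j * (1.0 - p_locus) ** (k_loci - j)
        for j in range(k_loci + 1)
    ]


def first_majority_cycle(k_loci: int, efficiency: float, max_cycles: int = 1000):
    """First cycle at which more than half of the cells carry all K mutations."""
    for n in range(1, max_cycles + 1):
        if mutation_distribution(k_loci, efficiency, n)[-1] > 0.5:
            return n
    return None


if __name__ == "__main__":
    for n in (1, 5, 10, 20):
        all_ten = mutation_distribution(10, 0.3, n)[-1]
        print(f"cycle {n:2d}: fraction with all 10 knockouts = {all_ten:.3f}")
    print("first majority cycle:", first_majority_cycle(10, 0.3))
```

Under these idealized assumptions, the fraction of cells carrying all ten knockouts first exceeds one half at cycle 8; real efficiencies vary by locus and modification type, so measured populations are expected to converge more slowly.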
From this curve, we can also generate the fraction of cells within the population mutated as a cumulative function of the number of MAGE cycles (left) and number of unmutated cells remaining at each cycle (right):
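Both derived curves follow from the probability that a cell escapes modification at every locus in every cycle. The sketch below assumes that "mutated" means carrying at least one targeted change and that loci and cycles are independent; the function names and the population size in the example are hypothetical.

```python
def fraction_mutated(k_loci: int, efficiency: float, n_cycles: int) -> float:
    """Fraction of cells with at least one targeted mutation after N cycles."""
    return 1.0 - (1.0 - efficiency) ** (k_loci * n_cycles)


def unmutated_cells(population: float, k_loci: int, efficiency: float,
                    n_cycles: int) -> float:
    """Expected number of cells with no targeted mutation after N cycles."""
    return population * (1.0 - efficiency) ** (k_loci * n_cycles)


if __name__ == "__main__":
    for n in range(0, 4):
        print(n, fraction_mutated(10, 0.3, n), unmutated_cells(1e9, 10, 0.3, n))
```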
Whole genome engineering for optimized protein production
Besides optimizing a highly localized genetic region (i.e., the predicted ice-binding sites of our RiAFP), MAGE can also target many sites across the genome at once. Our second MAGE experiment takes advantage of this ability to simultaneously evolve diverse regions of the E. coli genome to optimize recombinant protein production. Using a transcriptome profile of recombinant protein production in E. coli (Haddadin and Harcum, 2004), we can identify the genes that are most up-regulated and down-regulated during IPTG induction of the protein. These genes, which are consequently the best candidates for oligonucleotide targeting via MAGE, include:
- Various energy synthesis genes (down-regulated)
- Phage shock protein genes (down-regulated)
- Phage repressor genes (down-regulated)
- Transposon genes (up-regulated)
- IS-element genes (up-regulated)
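Picking targets from a transcriptome profile amounts to ranking genes by how strongly their expression changes on induction. The sketch below assumes the profile is available as log2 fold changes keyed by gene name; the gene names and values are placeholders, not data from Haddadin and Harcum (2004).

```python
from typing import Dict, List, Tuple


def select_targets(log2_fold_changes: Dict[str, float],
                   n: int = 5) -> Tuple[List[str], List[str]]:
    """Return the n most up-regulated and n most down-regulated genes."""
    ranked = sorted(log2_fold_changes.items(), key=lambda item: item[1])
    down = [gene for gene, change in ranked if change < 0][:n]
    up = [gene for gene, change in reversed(ranked) if change > 0][:n]
    return up, down


if __name__ == "__main__":
    # Placeholder values for illustration only.
    profile = {"geneA": 2.5, "geneB": -3.0, "geneC": 0.4,
               "geneD": -1.2, "geneE": 1.1}
    up, down = select_targets(profile, n=2)
    print("up-regulated:", up)
    print("down-regulated:", down)
```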
By integrating our eGFP-RiAFP fusion protein into the E. coli genome (along with the upstream T7 promoter), we are able to screen by fluorescence, after IPTG induction, for the colonies that show the highest recombinant protein production. Such strains would have significant utility in large-scale applications in both industry and academia.
Thus, the Yale iGEM team intends to submit a new RFC that supports and details full-genome engineering based on MAGE. We believe that well-engineered and well-characterized strains/chassis are an integral component of the modularity of synthetic biology, yet they have received surprisingly little attention in iGEM. As additional parts are added to the registry every year, it is increasingly crucial to demonstrate reliable performance of the submitted parts, and MAGE provides an easy platform for optimizing part expression in vivo.
To perform MAGE, we first needed to integrate RiAFP into the genome of the EcNR2 strain. The RiAFP gene was linked to a kanamycin resistance marker by crossover PCR. The dsDNA recombination efficiency data are from the Conjugative Assembly Genome Engineering paper (Isaacs et al., 2011). For further details, please see the protocols section.
Gel confirming success of PCR reactions for eventual crossover PCR. 1: 100 bp ladder, 2: 1kb ladder, 3: 1kb ladder, 4: GR7 - i (PCR reaction for RiGFP, 78 integration site, and Kan crossover site), 5: replicate of above, 6: replicate of above, 7: 141Kan - i (PCR reaction for Kanamycin, 141 integration site), 8: replicate of above, 9: replicate of above, 10: GR14 - i (PCR reaction for RiGFP, 141 integration site, and Kan crossover site), 11: replicate of above, 12: replicate of above, 13: 78Kan - i (PCR reaction for Kanamycin, 78 integration site), 14: replicate of above, 15: replicate of above
Gel confirming success of PCR reactions for eventual crossover PCR. 1: 1kb ladder, 2: 1kb ladder, 3: 100 bp ladder, 4: AR7 (RiAFP gene only, with 78 integration site and Kan crossover site; did not work), 5: replicate of above (did not work), 6: AR14 ii (RiAFP with 141 integration site and Kan crossover site; worked)
Gels confirming success of crossover PCR for lambda red integration. 1: ladder, 2: GR7 + 78 Kan (failed), 3: replicate, 4: replicate, 5: GR14 + 141 Kan (worked), 6: replicate, 7: replicate, 8: AR14 + 141 Kan (worked), 9: replicate, 10: replicate, 11: replicate
The lambda-red recombination of the AFP-Kan and AFP-eGFP-Kan constructs into the EcNR2 strain worked: there were colonies on the appropriate kanamycin plates and no colonies on the negative control kanamycin plates (no DNA added for electroporation). See below for sample colony plates (top row: negative control plates with no colonies; bottom row, left: plate with colonies of RiAFP-GFP-Kanamycin; bottom row, right: plate with colonies of RiAFP-Kanamycin).
Thus far, we have generated a diverse population of mutants of the antifreeze protein sequence. Based on data from the original MAGE paper, we have generated a predicted 434 million genomic variants. This represents more potential “biobricks” than currently exist in the iGEM registry, generated in one experiment! We are currently applying the selective pressure of multiple freeze-thaw cycles. We intend to run additional MAGE cycles on the mutants that survive multiple freeze-thaw cycles, with the aim of generating and then characterizing “superactive”, soluble antifreeze proteins.