Team:Arizona State/Project/CRISPR

From 2011.igem.org

(Difference between revisions)
 
(14 intermediate revisions not shown)
Line 1: Line 1:
{{:Team:Arizona State/Templates/main|title=CRISPR|content=
{{:Team:Arizona State/Templates/main|title=CRISPR|content=
 
 
 +
''See [[Team:Arizona State/Glossary|glossary]] for explanation of various abbreviations used on this page.''
''See [[Team:Arizona State/Glossary|glossary]] for explanation of various abbreviations used on this page.''
-
<p>'''C'''lustered '''R'''egularly '''I'''nterspaced '''S'''hort '''P'''alindromic '''R'''epeats (CRISPR) are a genomic feature of many prokaryotic and archaeal species. 40% of sequenced bacterial genomes and 90% of archaeal genomes contain at least one CRSIPR array{{:Team:Arizona State/Templates/ref|18}}. It is possible that many laboratory strains of bacteria, which are the sources of many available genome sequences, have lost CRISPR due to a lack of exposure to phages{{:Team:Arizona State/Templates/ref|40}}.</p>
+
<p>'''C'''lustered '''R'''egularly '''I'''nterspaced '''S'''hort '''P'''alindromic '''R'''epeats (CRISPR) are a genomic feature of many bacterial and archaeal species. 40% of sequenced bacterial genomes and 90% of archaeal genomes contain at least one CRISPR array{{:Team:Arizona State/Templates/ref|20}}. It is possible that many laboratory strains of bacteria, which are the sources of many available genome sequences, have lost CRISPR due to a lack of exposure to phages{{:Team:Arizona State/Templates/ref|42}}.</p>
-
<p>CRISPR functions as an adaptive and inheritable immune system{{:Team:Arizona State/Templates/ref|38}}{{:Team:Arizona State/Templates/ref|50}}{{:Team:Arizona State/Templates/ref|31}}{{:Team:Arizona State/Templates/ref|36}}{{:Team:Arizona State/Templates/ref|40}}. A CRISPR locus consists of a set of Cas (CRISPR associated) genes, a leader, or promoter, sequence, and an array. This array consists of repeating elements along with "spacers". These spacer regions direct the CRISPR machinery to degrade or otherwise inactivate a complementary sequence in the cell.</p>
+
<p>CRISPR functions as an adaptive and inheritable immune system{{:Team:Arizona State/Templates/ref|40}}{{:Team:Arizona State/Templates/ref|53}}{{:Team:Arizona State/Templates/ref|34}}{{:Team:Arizona State/Templates/ref|38}}{{:Team:Arizona State/Templates/ref|42}}. A CRISPR locus consists of a set of Cas (CRISPR associated) genes, a leader, or promoter, sequence, and an array. This array consists of repeating elements along with "spacers". These spacer regions direct the CRISPR machinery to degrade or otherwise inactivate a complementary sequence in the cell.</p>
 +
 
 +
== The CRISPR array ==
 +
<p>Genetic information from previous encounters is stored in the array as spacers. These spacers are consistent in length (30-40 bp), and are flanked by repeating elements (also 30-40 bp). The repeating elements are usually partially palindromic, and form secondary structures when transcribed into pre-crRNA. These structures may be necessary for recognition and cleavage.</p>
== Engineered arrays ==
== Engineered arrays ==
-
<p>By engineering a spacer complementary to T3 phage, increased survival was demonstrated{{:Team:Arizona State/Templates/ref|15}}{{:Team:Arizona State/Templates/ref|23}}{{:Team:Arizona State/Templates/ref|26}}{{:Team:Arizona State/Templates/ref|48}}{{:Team:Arizona State/Templates/ref|55}}. A customized spacer can prevent transformation of PC194 plasmids with a matching sequence{{:Team:Arizona State/Templates/ref|26}}.</p>
+
<p>Engineering a spacer complementary to T3 phage has been shown to increase survival{{:Team:Arizona State/Templates/ref|17}}{{:Team:Arizona State/Templates/ref|25}}{{:Team:Arizona State/Templates/ref|28}}{{:Team:Arizona State/Templates/ref|51}}{{:Team:Arizona State/Templates/ref|59}}. A customized spacer can also prevent transformation by PC194 plasmids carrying a matching sequence{{:Team:Arizona State/Templates/ref|28}}.</p>
-
== CRISPR in ''E. coli'' ==
+
== CRISPR in ''Escherichia coli K-12 substr. MG1655'' ==
-
<p>''E. coli'' contains a type I CRISPR system. There are four CRISPR loci in ''Escherichia coli K-12 substr. MG1655''. CRISPR1, the largest, is associated with eight Cas genes{{:Team:Arizona State/Templates/ref|70}}. In the classification scheme presented by Haft et al{{:Team:Arizona State/Templates/ref|13}}, these genes form the Cse family: casA, casB, casC, casD, casE, aka cse1, cse2, cse3, cse4, cas5e{{:Team:Arizona State/Templates/ref|13}}. These 5 proteins combine to form the Cascade complex{{:Team:Arizona State/Templates/ref|60}}. This is a protein complex of all 5 Cse genes, resembling a seahorse in shape{{:Team:Arizona State/Templates/ref|60}}. Its full composition is 1x casA, 2x casB, 6x casC, 1x casD, 1x casE{{:Team:Arizona State/Templates/ref|60}}. Specifically, casE cleaves pre-crRNA{{:Team:Arizona State/Templates/ref|23}}, and casA and casB can be omitted without affecting crRNA generation, but are necessary for phage resistance{{:Team:Arizona State/Templates/ref|60}}. This complex binds double stranded target DNA without need or enhancement by cofactors such as metal ions or ATP{{:Team:Arizona State/Templates/ref|60}}. It also undergoes conformational changes when binding DNA{{:Team:Arizona State/Templates/ref|60}}{{:Team:Arizona State/Templates/ref|76}}.</p>
+
'''[[Team:Arizona State/Project/E coli|Project page]]'''
-
<p>cas gene transcription is repressed by H-NS{{:Team:Arizona State/Templates/ref|67}}, and de-repressed by leuO{{:Team:Arizona State/Templates/ref|45}} or baeR{{:Team:Arizona State/Templates/ref|54}}</p>.
+
<p>''E. coli'' contains a type I CRISPR system. There are four CRISPR loci in this organism. CRISPR1, the largest, is associated with eight Cas genes{{:Team:Arizona State/Templates/ref|73}}. In the classification scheme presented by Haft et al.{{:Team:Arizona State/Templates/ref|15}}, five of these genes form the Cse family: casA, casB, casC, casD, and casE, also known as cse1, cse2, cse4, cas5e, and cse3, respectively{{:Team:Arizona State/Templates/ref|15}}. The 5 encoded proteins combine to form the Cascade complex, which resembles a seahorse in shape{{:Team:Arizona State/Templates/ref|63}}. Its full composition is 1x casA, 2x casB, 6x casC, 1x casD, 1x casE{{:Team:Arizona State/Templates/ref|63}}. casE cleaves pre-crRNA{{:Team:Arizona State/Templates/ref|25}}; casA and casB can be omitted without affecting crRNA generation, but are necessary for phage resistance{{:Team:Arizona State/Templates/ref|63}}. The complex binds double-stranded target DNA without needing, or being enhanced by, cofactors such as metal ions or ATP{{:Team:Arizona State/Templates/ref|63}}. It also undergoes conformational changes when binding DNA{{:Team:Arizona State/Templates/ref|63}}{{:Team:Arizona State/Templates/ref|76}}.</p>
-
[[Image:Arizona State E coli CRISPR I|caption=Structure of the CRISPRI locus in ''E. coli''. 3 promoters have been identified{{:Team:Arizona State/Templates/ref|75}}: Pcrispr1, Pcas, and anti-Pcas.]]
+
<p>Cas gene transcription is repressed by H-NS{{:Team:Arizona State/Templates/ref|70}} and de-repressed by leuO{{:Team:Arizona State/Templates/ref|48}} or baeR{{:Team:Arizona State/Templates/ref|57}}.</p>
 +
[[Image:E coli crispr full.png|600px]]<small>
 +
:Coordinates:
 +
:: CrisprI: 2875723-2876485
 +
:: Leader: 2876485-2876592
 +
:: Cas2 (ygbF): 2876592-2876875
 +
:: Cas1 (ygbT): 2876877-2877794
 +
:: CasE: 2877810-2878409
 +
:: CasD: 2878396-2879070
 +
:: CasC: 2879073-2880164
 +
:: CasB: 2880177-2880659
 +
:: CasA: 2880652-2882160
 +
:: IGLB regulatory region: 2882160-2882575
 +
:: Cas3 (ygcB): 2882575-2885241
 +
</small>
 +
<p>Structure of the CRISPR I locus in ''E. coli''. 3 promoters have been characterized{{:Team:Arizona State/Templates/ref|43}}: Pcrispr1, Pcas, and anti-Pcas.</p>
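The coordinate list above can be checked programmatically. The following is a minimal Python sketch, assuming each feature's span is simply `end - start` and that the features are listed in genomic order; the names `FEATURES`, `span`, and `adjacent_overlaps` are illustrative only. It reports which neighbouring features overlap.

```python
# Coordinates copied from the list above (E. coli K-12 MG1655, CRISPR I locus).
FEATURES = [
    ("CrisprI", 2875723, 2876485),
    ("Leader", 2876485, 2876592),
    ("Cas2", 2876592, 2876875),
    ("Cas1", 2876877, 2877794),
    ("CasE", 2877810, 2878409),
    ("CasD", 2878396, 2879070),
    ("CasC", 2879073, 2880164),
    ("CasB", 2880177, 2880659),
    ("CasA", 2880652, 2882160),
    ("IGLB", 2882160, 2882575),
    ("Cas3", 2882575, 2885241),
]


def span(start, end):
    """Length of a feature, assuming span = end - start (an assumption)."""
    return end - start


def adjacent_overlaps(features):
    """Return (left, right, overlap) for neighbouring features that overlap."""
    result = []
    for (name1, _s1, end1), (name2, start2, _e2) in zip(features, features[1:]):
        if start2 < end1:
            result.append((name1, name2, end1 - start2))
    return result


for name, start, end in FEATURES:
    print(f"{name}: {span(start, end)} bp")
print(adjacent_overlaps(FEATURES))
```

Under these assumptions, only the CasE/CasD and CasB/CasA pairs overlap, which is consistent with the tightly packed arrangement shown in the figure.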
-
== CRISPR in ''Pyrococcus furiosus''==
+
== CRISPR in ''Pyrococcus furiosus DSM 3638''==
-
<p>''P. furiosus'' contains 7 CRISPR loci, along with 29 Cas genes in 2 gene clusters{{:Team:Arizona State/Templates/ref|33}}. All 6 core Cas genes (cas1-cas6), as well as genes from the Cmr (type III), Cst (type I), and Csa (type I) families are present. Cmr1-6 have been found to form a Cascade-like complex that targets RNA in in-vitro experiments{{:Team:Arizona State/Templates/ref|33}}.</p>
+
<p>This organism is notable for the diversity of its Cas genes, as well as its possible RNA targeting. ''P. furiosus'' contains 7 CRISPR loci, along with 29 Cas genes in 2 gene clusters{{:Team:Arizona State/Templates/ref|35}}. All 6 core Cas genes (cas1-cas6), as well as genes from the Cmr (type III), Cst (type I), and Csa (type I) families, are present. Cmr1-6 have been found to form a Cascade-like complex that targets RNA in ''in vitro'' experiments{{:Team:Arizona State/Templates/ref|35}}.</p>
-
[[Image:Arizona State P furiosus CRISPR]]
+
== CRISPR in ''Bacillus halodurans C-125''==
== CRISPR in ''Bacillus halodurans C-125''==
-
<p>''B. halodurans'' contains 6 Cmr genes (Cmr1-6) in a single locus. This is a type III CRISPR system.</p>
+
'''[[Team:Arizona State/Project/B halodurans|Project page]]'''
 +
<p>''B. halodurans'' contains 6 Cmr genes (Cmr1-6) in a single locus. This is a type III CRISPR system. The organism also contains Csd1 and Csd2 (Dvulg subtype I-C) along with Cas1, Cas2, Cas3, Cas4, and Cas5 in another locus.</p>
-
[[Image:Arizona State B halodurans CRISPR]]
+
[[Image:ASU B halodurans crispr.png|400px]]<br>
 +
<small>*Note: These regions have not been characterized in the literature and are speculative. The two regions shown are homologous.
 +
:Coordinates:
 +
:: CrisprI: 339740-340773
 +
:: Leader: 345821-345991
 +
:: CrisprII: 345991-347087
 +
:: Cmr1 (RAMP): 347483-348859
 +
:: Cmr2 (RAMP): 348829-350643
 +
:: Cmr3 (RAMP): 350603-351874
 +
:: Cmr4 (RAMP): 351874-352785
 +
:: Cmr5 (RAMP): 352782-353159
 +
:: Cmr6 (RAMP): 353156-353902
 +
:: Leader: 355127-355361
 +
:: CrisprIII: 355361-356328
 +
:: Cas3 (Dvulg): 358059-360461
 +
:: Cas5 (Dvulg): 360504-361214
 +
:: Csd1 (Dvulg): 361216-363099
 +
:: Csd2 (Dvulg): 363096-363947
 +
:: Cas4 (Dvulg): 364072-364596
 +
:: Cas1 (Dvulg): 364593-365624
 +
:: Cas2 (Dvulg): 365633-365923
 +
:: CrisprIV: 366106-368458
 +
:: CrisprV: 378211-378913
 +
</small>
-
== CRISPR in ''Listeria innocua''==
+
== CRISPR in ''Listeria innocua Clip11262''==
-
<p>''L. innocua'' contains a type II CRISPR system. A single gene (Cas9) has been shown to be necessary for the expression and inactivation stages of the pathway{{:Team:Arizona State/Templates/ref|62}}. A separate trans-encoded small RNA (tracrRNA) binds with the repeat segment of the pre-crRNA{{:Team:Arizona State/Templates/ref|59}}, followed by cleavage by RNase III and binding with Cas9.</p>
+
'''[[Team:Arizona State/Project/L innocua|Project page]]'''
 +
<p>''L. innocua'' contains a type II CRISPR system. A single gene (Cas9 / Csn1) has been shown to be necessary for the expression and inactivation stages of the pathway{{:Team:Arizona State/Templates/ref|65}}. A separate trans-encoded small RNA (tracrRNA) binds with the repeat segment of the pre-crRNA{{:Team:Arizona State/Templates/ref|62}}, followed by cleavage by RNase III and binding with Cas9.</p>
-
[[Image:Arizona State L innocua CRISPR]]
+
[[Image:ASU L innocua crispr.png|400px]]<br>
 +
<small>
 +
:Coordinates:
 +
:: CrisprI: 2768992-2769687
 +
:: Leader: 2769687-2769814
 +
:: Cas2 (lin2742): 2769814-2770404
 +
:: Cas1 (lin2743): 2770410-2770706
 +
:: Csn1 (lin2744): 2770707-2774711
 +
:: TracrRNA: 2774711-2774865
 +
</small>
== Stages of the CRISPR pathway ==
== Stages of the CRISPR pathway ==
-
<p>There are 3 distinct stages of the CRISPR pathway: integration{{:Team:Arizona State/Templates/ref|15}}{{:Team:Arizona State/Templates/ref|51}}{{:Team:Arizona State/Templates/ref|20}}, expression, and adaptation.</p>
+
<p>There are 3 distinct stages of the CRISPR pathway: integration (also called adaptation){{:Team:Arizona State/Templates/ref|17}}{{:Team:Arizona State/Templates/ref|54}}{{:Team:Arizona State/Templates/ref|22}}, expression, and interference.</p>
=== Integration / Adaptation ===
=== Integration / Adaptation ===
-
<p>In this step, DNA, commonly derived from phages and plasmids{{:Team:Arizona State/Templates/ref|45}}, is recognized and processed by Cas proteins. Information from outside of the genome is recognized and incorporated into the leader end of an existing array. This involves cas1 and cas2{{:Team:Arizona State/Templates/ref|30}}{{:Team:Arizona State/Templates/ref|57}}. The integration stage is currently the least understood aspect of the pathway.</p>
+
<p>In this step, DNA, commonly derived from phages and plasmids{{:Team:Arizona State/Templates/ref|48}}, is recognized and processed by Cas proteins. Information from outside of the genome is recognized and incorporated into the leader end of an existing array. This involves cas1 and cas2{{:Team:Arizona State/Templates/ref|32}}{{:Team:Arizona State/Templates/ref|58}}. The integration stage is currently the least understood aspect of the pathway.</p>
=== Expression ===
=== Expression ===
-
<p>In the expression stage, the CRISPR array is transcribed in its entirety, yielding pre-crRNA. This pre-crRNA is cleaved at repeat regions{{:Team:Arizona State/Templates/ref|70}}{{:Team:Arizona State/Templates/ref|7}}{{:Team:Arizona State/Templates/ref|25}}{{:Team:Arizona State/Templates/ref|29}} to yield crRNA. In ''E. coli'', this crRNA is 61 bp long, consisting of a 31 bp spacer, flanked by repeat-derived segments on both ends{{:Team:Arizona State/Templates/ref|60}} (8 bp at 5'{{:Team:Arizona State/Templates/ref|23}}{{:Team:Arizona State/Templates/ref|26}}{{:Team:Arizona State/Templates/ref|25}}, 21 bp forming a hairpin at 3', with a 5' hydroxyl group). crRNA is then typically bound to a protein complex (known as Cascade in ''E. coli''{{:Team:Arizona State/Templates/ref|60}}).</p>
+
<p>In the expression stage, the CRISPR array is transcribed in its entirety, yielding pre-crRNA. This pre-crRNA is cleaved at repeat regions{{:Team:Arizona State/Templates/ref|73}}{{:Team:Arizona State/Templates/ref|9}}{{:Team:Arizona State/Templates/ref|27}}{{:Team:Arizona State/Templates/ref|31}} to yield crRNA. In ''E. coli'', this crRNA is 61 nt long, consisting of a 32 nt spacer flanked by repeat-derived segments on both ends{{:Team:Arizona State/Templates/ref|63}} (8 nt at the 5' end{{:Team:Arizona State/Templates/ref|25}}{{:Team:Arizona State/Templates/ref|28}}{{:Team:Arizona State/Templates/ref|27}}, 21 nt forming a hairpin at the 3' end, with a 5' hydroxyl group). crRNA is then typically bound to a protein complex (known as Cascade in ''E. coli''{{:Team:Arizona State/Templates/ref|63}}).</p>
=== Interference ===
=== Interference ===
-
<p>This stage requires bound crRNA, as well as cas3 in ''E. coli''{{:Team:Arizona State/Templates/ref|60}}. The interference stage targets DNA in most organisms{{:Team:Arizona State/Templates/ref|23}}{{:Team:Arizona State/Templates/ref|26}}{{:Team:Arizona State/Templates/ref|51}}, but RNA targeting has been demonstrated in the case of ''P. furiosus''{{:Team:Arizona State/Templates/ref|33}}. Recognition of target DNA is thought to take place by means of R-loops{{:Team:Arizona State/Templates/ref|60}}{{:Team:Arizona State/Templates/ref|70}}{{:Team:Arizona State/Templates/ref|1}}. An R-loop is an RNA strand that has base paired with a complementary DNA strand, displacing the other identical DNA strand{{:Team:Arizona State/Templates/ref|1}}. This base pairing between the crRNA spacer sequence and target strand may mark the region for interference by other proteins such as cas3{{:Team:Arizona State/Templates/ref|60}}.</p>
+
<p>This stage requires bound crRNA, as well as Cas3 in ''E. coli''{{:Team:Arizona State/Templates/ref|63}}. The interference stage targets DNA in most organisms{{:Team:Arizona State/Templates/ref|25}}{{:Team:Arizona State/Templates/ref|28}}{{:Team:Arizona State/Templates/ref|54}}, but RNA targeting has been demonstrated in the case of ''P. furiosus''{{:Team:Arizona State/Templates/ref|35}}. Recognition of target DNA is thought to take place by means of R-loops{{:Team:Arizona State/Templates/ref|63}}{{:Team:Arizona State/Templates/ref|73}}{{:Team:Arizona State/Templates/ref|1}}. An R-loop forms when an RNA strand base pairs with a complementary DNA strand, displacing the other DNA strand, whose sequence matches the RNA{{:Team:Arizona State/Templates/ref|1}}. This base pairing between the crRNA spacer sequence and target strand may mark the region for interference by other proteins such as Cas3{{:Team:Arizona State/Templates/ref|63}}.</p>
<p>In ''Streptococcus thermophilus'', only Cas9 is necessary for CRISPR functionality{{:Team:Arizona State/Templates/ref|74}}. However, a specific sequence, known as a protospacer adjacent motif (PAM), was found to be required for interference. The predicted sequence is 5'-NGGNG-3'. This sequence is found several base pairs upstream of the protospacer (target DNA). Single base pair mutations in the PAM completely abolish CRISPR interference{{:Team:Arizona State/Templates/ref|74}}.</p>
<p>In ''Streptococcus thermophilus'', only Cas9 is necessary for CRISPR functionality{{:Team:Arizona State/Templates/ref|74}}. However, a specific sequence, known as a protospacer adjacent motif (PAM), was found to be required for interference. The predicted sequence is 5'-NGGNG-3'. This sequence is found several base pairs upstream of the protospacer (target DNA). Single base pair mutations in the PAM completely abolish CRISPR interference{{:Team:Arizona State/Templates/ref|74}}.</p>
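As an illustration of how a degenerate PAM such as 5'-NGGNG-3' can be searched for, here is a minimal Python sketch. It assumes N stands for any of A, C, G, or T; the function name `find_pam_sites` and the toy sequences are hypothetical and not taken from any real genome.

```python
import re

# Map each motif symbol to a regular-expression fragment.
# Only the symbols needed for NGGNG are included (an assumption of this sketch).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}


def find_pam_sites(seq, motif="NGGNG"):
    """Return 0-based start positions of every match, including overlapping ones."""
    body = "".join(IUPAC[base] for base in motif.upper())
    # A lookahead lets overlapping matches be reported individually.
    pattern = re.compile(f"(?=({body}))")
    return [m.start() for m in pattern.finditer(seq.upper())]


intact = "TTAGGCGAA"   # contains AGGCG, which fits NGGNG
mutated = "TTAGACGAA"  # a single G-to-A change at a fixed PAM position
print(find_pam_sites(intact))
print(find_pam_sites(mutated))
```

A single substitution at one of the fixed G positions removes the match, mirroring the observation that single base pair mutations in the PAM abolish interference.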
== Core Cas genes ==
== Core Cas genes ==
-
<p>There are 6 “core” Cas genes, found in a wide variety of organisms and here referred to as Cas1-Cas6{{:Team:Arizona State/Templates/ref|13}}.</p>
+
<p>There are 6 “core” Cas genes, found in a wide variety of organisms and here referred to as Cas1-Cas6{{:Team:Arizona State/Templates/ref|15}}.</p>
=== Cas1, Cas2 ===
=== Cas1, Cas2 ===
-
<p>Cas1 is nearly universally conserved throughout organisms with CRISPR{{:Team:Arizona State/Templates/ref|30}}. It is strongly implicated in the integration stage of the pathway{{:Team:Arizona State/Templates/ref|30}}{{:Team:Arizona State/Templates/ref|57}}. Cas1 is a metal-dependent (Mg, Mn) DNA-specific endonuclease that generates an 80 bp fragment{{:Team:Arizona State/Templates/ref|30}}. How this is converted into an ~32 bp spacer is unknown.</p>
+
<p>Cas1 is nearly universally conserved throughout organisms with CRISPR{{:Team:Arizona State/Templates/ref|32}}. It is strongly implicated in the integration stage of the pathway{{:Team:Arizona State/Templates/ref|32}}{{:Team:Arizona State/Templates/ref|58}}. Cas1 is a metal-dependent (Mg, Mn) DNA-specific endonuclease that generates an 80 bp fragment{{:Team:Arizona State/Templates/ref|32}}. How this is converted into an ~32 bp spacer is unknown.</p>
-
<p>Cas2 is also involved in integration{{:Team:Arizona State/Templates/ref|30}}{{:Team:Arizona State/Templates/ref|57}}, and is a metal dependent endoribonuclease{{:Team:Arizona State/Templates/ref|22}}.</p>
+
<p>Cas2 is also involved in integration{{:Team:Arizona State/Templates/ref|32}}{{:Team:Arizona State/Templates/ref|58}}, and is a metal dependent endoribonuclease{{:Team:Arizona State/Templates/ref|24}}.</p>
=== Cas3 ===
=== Cas3 ===
-
<p>Cas3  is not regulated by H-NS{{:Team:Arizona State/Templates/ref|39}}. It cooperates with the Cascade complex{{:Team:Arizona State/Templates/ref|23}} in the interference stage. Cas3 has predicted ATP-dependent helicase activity{{:Team:Arizona State/Templates/ref|4}}, as well as demonstrated ATP independent annealing of RNA to DNA{{:Team:Arizona State/Templates/ref|70}}. It forms an R-loop with DNA, requiring magnesium or manganese as a co-factor{{:Team:Arizona State/Templates/ref|70}}, but has an antagonistic function in the presence of ATP, dissociating the R-loop.</p>
+
<p>Cas3 is not regulated by H-NS{{:Team:Arizona State/Templates/ref|41}}. It cooperates with the Cascade complex{{:Team:Arizona State/Templates/ref|25}} in the interference stage. Cas3 has predicted ATP-dependent helicase activity{{:Team:Arizona State/Templates/ref|4}}, as well as demonstrated ATP-independent annealing of RNA to DNA{{:Team:Arizona State/Templates/ref|73}}. It forms an R-loop with DNA, requiring magnesium or manganese as a cofactor{{:Team:Arizona State/Templates/ref|73}}, but has an antagonistic function in the presence of ATP, dissociating the R-loop.</p>
-
+
-
== The CRISPR array ==
+
-
<p>Genetic information from previous encounters is stored in the array as spacers. These spacers are consistent in length (30-40 bp), and are flanked by repeating elements (also 30-40 bp). The repeating elements are usually partially palindromic, and form secondary structures when transcribed into pre-crRNA. These structures may be necessary for recognition and cleavage.</p>
+
== Prevention of self targeting (autoimmunity) ==
== Prevention of self targeting (autoimmunity) ==
-
<p>The 5' handle of crRNA allows self / nonself discrimination in the csm subtypetype{{:Team:Arizona State/Templates/ref|37}}. In the Cse subtype, regions flanking the proto spacer contain PAMs{{:Team:Arizona State/Templates/ref|37}}{{:Team:Arizona State/Templates/ref|12}}{{:Team:Arizona State/Templates/ref|20}}{{:Team:Arizona State/Templates/ref|28}}{{:Team:Arizona State/Templates/ref|19}}, which may be necessary for interference. In general, it is thought that mismatches at positions outside of the spacer sequence allow for targeting, while extended base pairing with the surrounding repeats prevents targeting{{:Team:Arizona State/Templates/ref|37}}.</p>
+
<p>The 5' handle of crRNA allows self / nonself discrimination in the Csm subtype{{:Team:Arizona State/Templates/ref|39}}. In the Cse subtype, regions flanking the protospacer contain PAMs{{:Team:Arizona State/Templates/ref|39}}{{:Team:Arizona State/Templates/ref|14}}{{:Team:Arizona State/Templates/ref|22}}{{:Team:Arizona State/Templates/ref|30}}{{:Team:Arizona State/Templates/ref|21}}, which may be necessary for interference. In general, it is thought that mismatches at positions outside of the spacer sequence allow for targeting, while extended base pairing with the surrounding repeats prevents targeting{{:Team:Arizona State/Templates/ref|39}}.</p>
-
== Cas gene regulation ==
+
== CRISPR regulation ==
-
<p>In E. coli (Cse subtype), transcription of the Cascade genes and CRISPR array is repressed by H-NS{{:Team:Arizona State/Templates/ref|45}}{{:Team:Arizona State/Templates/ref|41}}. H-NS is a global repressor of transcription in many gram negative bacteria that binds AT rich sequences{{:Team:Arizona State/Templates/ref|14}}. This repression is mediated by "DNA stiffening"{{:Team:Arizona State/Templates/ref|35}}, as well as formation of "DNA-protein-DNA" bridges{{:Team:Arizona State/Templates/ref|10}}. The creation of an H-NS knockout can be shown to increase expression of cas genes{{:Team:Arizona State/Templates/ref|45}}{{:Team:Arizona State/Templates/ref|5}}. This correlates with phage sensitivity{{:Team:Arizona State/Templates/ref|45}}.</p>
+
<p>In ''E. coli'' (Cse subtype), transcription of the Cascade genes and CRISPR array is repressed by H-NS{{:Team:Arizona State/Templates/ref|48}}{{:Team:Arizona State/Templates/ref|44}}. H-NS is a global repressor of transcription in many Gram-negative bacteria that binds AT-rich sequences{{:Team:Arizona State/Templates/ref|16}}. This repression is mediated by "DNA stiffening"{{:Team:Arizona State/Templates/ref|37}}, as well as formation of "DNA-protein-DNA" bridges{{:Team:Arizona State/Templates/ref|11}}. Knocking out H-NS has been shown to increase expression of cas genes{{:Team:Arizona State/Templates/ref|48}}{{:Team:Arizona State/Templates/ref|5}}. This correlates with phage sensitivity{{:Team:Arizona State/Templates/ref|48}}.</p>
-
<p>Transcription is antagonistically{{:Team:Arizona State/Templates/ref|24}} de-repressed by LeuO{{:Team:Arizona State/Templates/ref|45}}, a protein of the lysR transcription factor family{{:Team:Arizona State/Templates/ref|24}} near the leuABCD (leucine synthesis{{:Team:Arizona State/Templates/ref|2}}) operon{{:Team:Arizona State/Templates/ref|11}}. LeuO expression is also repressed by H-NS{{:Team:Arizona State/Templates/ref|3}}{{:Team:Arizona State/Templates/ref|6}}. Expression of H-NS repressed proteins can be manipulated by plasmid-encoded leuO in a constitutive promoter{{:Team:Arizona State/Templates/ref|32}}. Plasmids: pCA24N (lac1 promoter), pKEDR13 (pTac promoter), pNH41 (IPTG). Increased LeuO expression leads to increased expression of casABCDE, cas1, and cas2{{:Team:Arizona State/Templates/ref|45}}{{:Team:Arizona State/Templates/ref|32}}, but does not affect cas3 expression{{:Team:Arizona State/Templates/ref|45}}. Constitutively expressing leuO had a stronger affect than knocking out H-NS{{:Team:Arizona State/Templates/ref|45}}.</p>
+
<p>Transcription is antagonistically{{:Team:Arizona State/Templates/ref|26}} de-repressed by LeuO{{:Team:Arizona State/Templates/ref|48}}, a protein of the lysR transcription factor family{{:Team:Arizona State/Templates/ref|26}} encoded near the leuABCD (leucine synthesis{{:Team:Arizona State/Templates/ref|2}}) operon{{:Team:Arizona State/Templates/ref|13}}. LeuO expression is also repressed by H-NS{{:Team:Arizona State/Templates/ref|3}}{{:Team:Arizona State/Templates/ref|6}}. Expression of H-NS-repressed proteins can be manipulated by plasmid-encoded leuO under a constitutive promoter{{:Team:Arizona State/Templates/ref|33}}. Plasmids: pCA24N (lac1 promoter), pKEDR13 (pTac promoter), pNH41 (IPTG). Increased LeuO expression leads to increased expression of casABCDE, cas1, and cas2{{:Team:Arizona State/Templates/ref|48}}{{:Team:Arizona State/Templates/ref|33}}, but does not affect cas3 expression{{:Team:Arizona State/Templates/ref|48}}. Constitutively expressing leuO had a stronger effect than knocking out H-NS{{:Team:Arizona State/Templates/ref|48}}.</p>
== Classification of CRISPR systems ==
== Classification of CRISPR systems ==
<p>For a comprehensive listing of Cas genes, see [ftp://ftp.ncbi.nih.gov/pub/wolf/_suppl/CRISPRclass/crisprPro.html].</p>
<p>For a comprehensive listing of Cas genes, see [ftp://ftp.ncbi.nih.gov/pub/wolf/_suppl/CRISPRclass/crisprPro.html].</p>
-
<p>Haft (2005) {{:Team:Arizona State/Templates/ref|13}}: Recognition of core Cas genes (1-6). Organized remaining genes into 9 subtypes: Ecoli, Ypest, Nmeni, Dvulg, Tneap, Hmari, Apern, Mtube, RAMP.</p>
+
<p>Haft (2005) {{:Team:Arizona State/Templates/ref|15}}: Recognition of core Cas genes (1-6). Organized remaining genes into 9 subtypes: Ecoli, Ypest, Nmeni, Dvulg, Tneap, Hmari, Apern, Mtube, RAMP.</p>
-
<p>Makarova (2011) {{:Team:Arizona State/Templates/ref|72}}: Classification into I, II, and III subtypes, based on mechanism of action as well as homology. These subtypes correspond with the 9 given by Haft to a large extent:
+
<p>Makarova (2011) {{:Team:Arizona State/Templates/ref|65}}: Classification into types I, II, and III, based on mechanism of action as well as homology. The subtypes within these types correspond with the 9 given by Haft to a large extent:
-
:* I-A: Apern
+
:* I-A: Apern (Csa)
-
:* I-B: Tneap / Hmari
+
:* I-B: Tneap (Cst) / Hmari (Csh)
-
:* I-C: Dvulg
+
:* I-C: Dvulg (Csd)
:* I-D
:* I-D
-
:* I-E: Ecoli
+
:* I-E: Ecoli (Cse)
-
:* I-F: Ypest
+
:* I-F: Ypest (Csy)
-
:* II-A: Nmeni
+
:* II-A: Nmeni (Csn)
-
:* II-B: Nmeni
+
:* II-B: Nmeni (Csn)
-
:* III-A: Mtube
+
:* III-A: Mtube (Csm)
-
:* III-B: Polymerase-RAMP</p>
+
:* III-B: Polymerase-RAMP (Cmr)</p>
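The correspondence listed above can be captured as a small lookup table. The following Python sketch is only an illustration of that list; the names `MAKAROVA_TO_HAFT`, `system_type`, and `subtypes_for_family` are hypothetical, and I-D is left empty because the list gives no Haft equivalent for it.

```python
# Makarova (2011) subtype -> list of (Haft subtype, gene family) pairs,
# transcribed from the list above.
MAKAROVA_TO_HAFT = {
    "I-A": [("Apern", "Csa")],
    "I-B": [("Tneap", "Cst"), ("Hmari", "Csh")],
    "I-C": [("Dvulg", "Csd")],
    "I-D": [],
    "I-E": [("Ecoli", "Cse")],
    "I-F": [("Ypest", "Csy")],
    "II-A": [("Nmeni", "Csn")],
    "II-B": [("Nmeni", "Csn")],
    "III-A": [("Mtube", "Csm")],
    "III-B": [("Polymerase-RAMP", "Cmr")],
}


def system_type(subtype):
    """Return the major type (I, II, or III) for a subtype label such as 'I-E'."""
    return subtype.split("-")[0]


def subtypes_for_family(family):
    """Return every Makarova subtype associated with a gene family prefix."""
    return [
        subtype
        for subtype, pairs in MAKAROVA_TO_HAFT.items()
        if any(fam == family for _haft, fam in pairs)
    ]


print(subtypes_for_family("Cse"), system_type("I-E"))
print(subtypes_for_family("Csn"))
```

A table like this makes it easy to see that the Nmeni (Csn) subtype is split across II-A and II-B, while I-B merges two of Haft's subtypes.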
=== Type I, II and III systems ===
=== Type I, II and III systems ===
-
<p>This classification takes into account differing mechanisms at all three stages of the pathway{{:Team:Arizona State/Templates/ref|72}}.</p>
+
<p>This classification takes into account differing mechanisms at all three stages of the pathway{{:Team:Arizona State/Templates/ref|65}}.</p>
:<p> Integration: In type I and II systems, the integration of protospacers depends on a protospacer adjacent motif (PAM). Cas1 and Cas2 are involved in this stage in all three types.</p>
:<p> Integration: In type I and II systems, the integration of protospacers depends on a protospacer adjacent motif (PAM). Cas1 and Cas2 are involved in this stage in all three types.</p>
-
:<p> Expression: The CRISPR locus is transcribed into pre-crRNA. In type I systems, the Cascade complex binds to pre-crRNA, which is then cleaved by the Cas6e or Cas6f. Type II systems use a trans-encoded small RNA (tracrRNA) that binds with the repeat segment of pre-crRNA{{:Team:Arizona State/Templates/ref|59}}, followed by cleavage by RNase III with Cas9. Cas6 cleaves pre-crRNA in Type III systems. The crRNAs are then transferred to a distinct Cas complex (Csm in subtype III-A and Cmr in subtype III-B).</p>
+
:<p> Expression: The CRISPR locus is transcribed into pre-crRNA. In type I systems, the Cascade complex binds to pre-crRNA, which is then cleaved by Cas6e or Cas6f. Type II systems use a trans-encoded small RNA (tracrRNA) that binds with the repeat segment of pre-crRNA{{:Team:Arizona State/Templates/ref|62}}, followed by cleavage by RNase III in the presence of Cas9. Cas6 cleaves pre-crRNA in type III systems. The crRNAs are then transferred to a distinct Cas complex (Csm in subtype III-A and Cmr in subtype III-B).</p>
:<p>Interference: In type I systems, the Cascade complex is guided by the crRNA to the target strand. Cas3 then cleaves the DNA. In type II systems, Cas9 directly targets and cleaves the DNA without any additional proteins. Type I and type II systems both probably require a specific PAM for this stage. The Csm or Cmr proteins in type III systems also directly target DNA without additional proteins.</p>
:<p>Interference: In type I systems, the Cascade complex is guided by the crRNA to the target strand. Cas3 then cleaves the DNA. In type II systems, Cas9 directly targets and cleaves the DNA without any additional proteins. Type I and type II systems both probably require a specific PAM for this stage. The Csm or Cmr proteins in type III systems also directly target DNA without additional proteins.</p>
Line 85: Line 134:
* The [http://cmr.jcvi.org/cgi-bin/CMR/shared/GenomePropDefinition.cgi?prop_acc=GenProp0021 J. Craig Venter Institute] CMR genome properties database contains Cas gene information for several hundred genomes. The CMR database is currently without direct funding and is not being actively maintained.
* The [http://cmr.jcvi.org/cgi-bin/CMR/shared/GenomePropDefinition.cgi?prop_acc=GenProp0021 J. Craig Venter Institute] CMR genome properties database contains Cas gene information for several hundred genomes. The CMR database is currently without direct funding and is not being actively maintained.
* [http://crispi.genouest.org CRISPI: a CRISPR Interactive database] analyzes archaeal and bacterial genomes for CRISPR arrays and Cas genes. This database should be used with caution, as many of the purported repeats and spacers are several hundred base pairs, which is in conflict with the literature. Last updated 2008-11-04.
* [http://crispi.genouest.org CRISPI: a CRISPR Interactive database] analyzes archaeal and bacterial genomes for CRISPR arrays and Cas genes. This database should be used with caution, as many of the purported repeats and spacers are several hundred base pairs, which is in conflict with the literature. Last updated 2008-11-04.
 +
* [http://www.drive5.com/pilercr/ PILER-CR], a software tool for detecting CRISPR arrays.
 +
* [http://www.room220.com/crt/ CRISPR Recognition Tool], another array recognition tool.
}}
}}

Latest revision as of 02:53, 29 September 2011


CRISPR



See glossary for explanation of various abbreviations used on this page.

Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a genomic feature of many bacterial and archaeal species. 40% of sequenced bacterial genomes and 90% of archaeal genomes contain at least one CRISPR array[20]. It is possible that many laboratory strains of bacteria, which are the sources of many available genome sequences, have lost CRISPR due to a lack of exposure to phages[42].

CRISPR functions as an adaptive and inheritable immune system[40][53][34][38][42]. A CRISPR locus consists of a set of Cas (CRISPR associated) genes, a leader, or promoter, sequence, and an array. This array consists of repeating elements along with "spacers". These spacer regions direct the CRISPR machinery to degrade or otherwise inactivate a complementary sequence in the cell.


The CRISPR array

Genetic information from previous encounters is stored in the array as spacers. These spacers are consistent in length (30-40 bp), and are flanked by repeating elements (also 30-40 bp). The repeating elements are usually partially palindromic, and form secondary structures when transcribed into pre-crRNA. These structures may be necessary for recognition and cleavage.
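The repeat-spacer layout described above can be illustrated with a minimal Python sketch. It assumes the repeat sequence is already known and perfectly conserved, which real arrays do not guarantee; the function name `split_array` and the toy sequences are hypothetical and chosen only for illustration.

```python
def split_array(array: str, repeat: str) -> list[str]:
    """Return the spacers found between consecutive copies of `repeat`.

    Assumes exact, non-degenerate repeats (a simplification).
    """
    parts = array.split(repeat)
    # parts[0] is whatever precedes the first repeat and parts[-1] whatever
    # follows the last one; only the pieces in between are spacers.
    return [p for p in parts[1:-1] if p]


# Toy data: a 10 bp palindromic repeat and two 10 bp spacers
# (real repeats and spacers are 30-40 bp).
REPEAT = "GGTTTAAACC"
SPACERS = ["ACGTACGTAC", "TTGACCATGA"]
ARRAY = REPEAT + SPACERS[0] + REPEAT + SPACERS[1] + REPEAT

print(split_array(ARRAY, REPEAT))
```

Dedicated tools such as PILER-CR or the CRISPR Recognition Tool are needed when the repeat is unknown or imperfect.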

Engineered arrays

Engineering a spacer complementary to T3 phage has been shown to increase survival[17][25][28][51][59]. A customized spacer can also prevent transformation by pC194 plasmids carrying a matching sequence[28].

CRISPR in Escherichia coli K-12 substr. MG1655

Project page

E. coli contains a type I CRISPR system. There are four CRISPR loci in this organism. CRISPR1, the largest, is associated with eight Cas genes[73]. In the classification scheme presented by Haft et al.[15], five of these genes form the Cse family: casA, casB, casC, casD, and casE, also known as cse1, cse2, cse3, cse4, and cas5e[15]. The five encoded proteins combine to form the Cascade complex, which resembles a seahorse in shape[63]. Its full composition is 1x casA, 2x casB, 6x casC, 1x casD, and 1x casE[63]. casE cleaves pre-crRNA[25]; casA and casB can be omitted without affecting crRNA generation, but are necessary for phage resistance[63]. The complex binds double-stranded target DNA without requiring, or being enhanced by, cofactors such as metal ions or ATP[63]. It also undergoes conformational changes when binding DNA[63][76].

Cas gene transcription is repressed by H-NS[70], and de-repressed by leuO[48] or baeR[57].

E coli crispr full.png

Coordinates:
CrisprI: 2875723-2876485
Leader: 2876485-2876592
Cas2 (ygbF): 2876592-2876875
Cas1 (ygbT): 2876877-2877794
CasE: 2877810-2878409
CasD: 2878396-2879070
CasC: 2879073-2880164
CasB: 2880177-2880659
CasA: 2880652-2882160
IGLB regulatory region: 2882160-2882575
Cas3 (ygcB): 2882575-2885241

Structure of the CRISPR I locus in E. coli. Three promoters have been characterized[43]: Pcrispr1, Pcas, and anti-Pcas.
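The E. coli coordinate list above can be turned into feature lengths with a short Python sketch. Adjacent features in the list share endpoints, so the span is computed here as end minus start; that convention is an assumption, as the page does not state whether coordinates are inclusive. The names `parse_features` and `span_length` are hypothetical.

```python
COORDS = """\
CrisprI: 2875723-2876485
Leader: 2876485-2876592
Cas2 (ygbF): 2876592-2876875
Cas1 (ygbT): 2876877-2877794
CasE: 2877810-2878409
CasD: 2878396-2879070
CasC: 2879073-2880164
CasB: 2880177-2880659
CasA: 2880652-2882160
IGLB regulatory region: 2882160-2882575
Cas3 (ygcB): 2882575-2885241
"""


def parse_features(text: str) -> dict[str, tuple[int, int]]:
    """Parse 'Name: start-end' lines into {name: (start, end)}."""
    features = {}
    for line in text.strip().splitlines():
        name, span = line.rsplit(": ", 1)
        start, end = (int(x) for x in span.split("-"))
        features[name] = (start, end)
    return features


def span_length(start: int, end: int) -> int:
    """Length in bp, ASSUMING end - start is the intended convention."""
    return end - start


for name, (start, end) in parse_features(COORDS).items():
    print(f"{name}: {span_length(start, end)} bp")
```

The same parser works unchanged on the coordinate lists given for the other organisms on this page.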

CRISPR in Pyrococcus furiosus DSM 3638

P. furiosus is notable for the diversity of its Cas genes, as well as for its possible RNA targeting. It contains 7 CRISPR loci, along with 29 Cas genes in 2 gene clusters[35]. All 6 core Cas genes (cas1-cas6), as well as genes from the Cmr (type III), Cst (type I), and Csa (type I) families are present. Cmr1-6 have been found to form a Cascade-like complex that targets RNA in in vitro experiments[35].

CRISPR in Bacillus halodurans C-125

Project page

B. halodurans contains 6 Cmr genes (Cmr1-6) in a single locus. This is a type III CRISPR system. The organism also contains Csd1 and Csd2 (Dvulg subtype I-C) along with Cas1, Cas2, Cas3, Cas4, and Cas5 in another locus.

ASU B halodurans crispr.png
*Note: These regions have not been characterized in the literature and are speculative. The two regions shown are homologous.

Coordinates:
CrisprI: 339740-340773
Leader: 345821-345991
CrisprII: 345991-347087
Cmr1 (RAMP): 347483-348859
Cmr2 (RAMP): 348829-350643
Cmr3 (RAMP): 350603-351874
Cmr4 (RAMP): 351874-352785
Cmr5 (RAMP): 352782-353159
Cmr6 (RAMP): 353156-353902
Leader: 355127-355361
CrisprIII: 355361-356328
Cas3 (Dvulg): 358059-360461
Cas5 (Dvulg): 360504-361214
Csd1 (Dvulg): 361216-363099
Csd2 (Dvulg): 363096-363947
Cas4 (Dvulg): 364072-364596
Cas1 (Dvulg): 364593-365624
Cas2 (Dvulg): 365633-365923
CrisprIV: 366106-368458
CrisprV: 378211-378913

CRISPR in Listeria innocua Clip11262

Project page

L. innocua contains a type II CRISPR system. A single gene (Cas9 / Csn1) has been shown to be necessary for the expression and inactivation stages of the pathway[65]. A separate trans-encoded small RNA (tracrRNA) binds with the repeat segment of the pre-crRNA[62], followed by cleavage by RNase III and binding with Cas9.

ASU L innocua crispr.png

Coordinates:
CrisprI: 2768992-2769687
Leader: 2769687-2769814
Cas2 (lin2742): 2769814-2770404
Cas1 (lin2743): 2770410-2770706
Csn1 (lin2744): 2770707-2774711
TracrRNA: 2774711-2774865

Stages of the CRISPR pathway

There are 3 distinct stages of the CRISPR pathway: integration (also called adaptation)[17][54][22], expression, and interference.

Integration / Adaptation

In this step, foreign DNA, commonly derived from phages and plasmids[48], is recognized and processed by Cas proteins, and a new spacer is incorporated at the leader end of an existing array. This involves cas1 and cas2[32][58]. The integration stage is currently the least understood aspect of the pathway.

Expression

In the expression stage, the CRISPR array is transcribed in its entirety, yielding pre-crRNA. This pre-crRNA is cleaved at repeat regions[73][9][27][31] to yield crRNA. In E. coli, this crRNA is 61 nt long, consisting of a 32 nt spacer flanked by repeat-derived segments on both ends[63]: 8 nt at the 5' end[25][28][27], which carries a 5' hydroxyl group, and 21 nt forming a hairpin at the 3' end. crRNA is then typically bound to a protein complex (known as Cascade in E. coli[63]).

Interference

This stage requires bound crRNA, as well as Cas3 in E. coli[63]. The interference stage targets DNA in most organisms[25][28][54], but RNA targeting has been demonstrated in the case of P. furiosus[35]. Recognition of target DNA is thought to take place by means of R-loops[63][73][1]. An R-loop forms when an RNA strand base pairs with a complementary DNA strand, displacing the other DNA strand, which is identical in sequence to the RNA[1]. This base pairing between the crRNA spacer sequence and the target strand may mark the region for interference by other proteins such as cas3[63].
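The base-pairing logic behind R-loop formation can be sketched in Python. This is a toy model that assumes perfect complementarity across the whole spacer and ignores proteins, PAMs, and mismatches; `find_rloop_sites` and its strand labels are hypothetical names introduced for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(dna: str) -> str:
    """Reverse complement of a DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def _find_all(haystack: str, needle: str) -> list[int]:
    """All (possibly overlapping) start positions of needle in haystack."""
    hits, i = [], haystack.find(needle)
    while i != -1:
        hits.append(i)
        i = haystack.find(needle, i + 1)
    return hits


def find_rloop_sites(spacer_rna: str, dna: str) -> list[tuple[int, str]]:
    """Locate sites where the crRNA spacer could pair with one DNA strand.

    "+" : the given strand matches the RNA sequence, so the RNA would pair
          with the opposite strand and displace the given one.
    "-" : the given strand is itself complementary to the RNA.
    """
    proto = spacer_rna.upper().replace("U", "T")
    sites = [(i, "+") for i in _find_all(dna, proto)]
    sites += [(i, "-") for i in _find_all(dna, revcomp(proto))]
    return sorted(sites)


print(find_rloop_sites("ACGGAUC", "TTTACGGATCAAA"))
```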

In Streptococcus thermophilus, only Cas9 is necessary for CRISPR functionality[74]. However, a specific sequence, known as a protospacer adjacent motif (PAM), was found to be required for interference. The predicted sequence is 5'-NGGNG-3'. This sequence is found several base pairs upstream of the protospacer (target DNA). Single base pair mutations in the PAM completely abolish CRISPR interference[74].
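Searching a sequence for the 5'-NGGNG-3' motif can be sketched as follows. This is a minimal illustration, assuming N stands for any of A, C, G, or T and that only the given strand is searched; the helper name `find_pam` is hypothetical.

```python
import re

# Minimal IUPAC table; only the codes needed for NGGNG are included.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}


def find_pam(seq: str, motif: str = "NGGNG") -> list[int]:
    """Return start positions of every (possibly overlapping) motif match."""
    body = "".join(IUPAC[base] for base in motif)
    # A lookahead makes finditer report overlapping matches as well.
    pattern = re.compile(f"(?=({body}))")
    return [m.start() for m in pattern.finditer(seq.upper())]


print(find_pam("AAGGTGCC"))   # the motif AGGTG starts at index 1
print(find_pam("AAGCTGCC"))   # one G mutated to C: no match
```

The second call mirrors the observation above that a single base pair change in the PAM is enough to lose the match.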

Core Cas genes

There are 6 “core” Cas genes, found in a wide variety of organisms and here referred to as Cas1-Cas6[15].

Cas1, Cas2

Cas1 is nearly universally conserved throughout organisms with CRISPR[32]. It is strongly implicated in the integration stage of the pathway[32][58]. Cas1 is a metal-dependent (Mg, Mn) DNA-specific endonuclease that generates an 80 bp fragment[32]. How this is converted into an ~32 bp spacer is unknown.

Cas2 is also involved in integration[32][58], and is a metal-dependent endoribonuclease[24].

Cas3

Cas3 is not regulated by H-NS[41]. It cooperates with the Cascade complex[25] in the interference stage. Cas3 has predicted ATP-dependent helicase activity[4], as well as demonstrated ATP-independent annealing of RNA to DNA[73]. It forms an R-loop with DNA, requiring magnesium or manganese as a cofactor[73], but has an antagonistic function in the presence of ATP, dissociating the R-loop.

Prevention of self targeting (autoimmunity)

The 5' handle of crRNA allows self / nonself discrimination in the Csm subtype[39]. In the Cse subtype, regions flanking the protospacer contain PAMs[39][14][22][30][21], which may be necessary for interference. In general, it is thought that mismatches at positions outside of the spacer sequence allow for targeting, while extended base pairing with the surrounding repeats prevents targeting[39].

CRISPR regulation

In E. coli (Cse subtype), transcription of the Cascade genes and CRISPR array is repressed by H-NS[48][44]. H-NS is a global repressor of transcription in many Gram-negative bacteria that binds AT-rich sequences[16]. This repression is mediated by "DNA stiffening"[37], as well as formation of "DNA-protein-DNA" bridges[11]. Knocking out H-NS increases expression of cas genes[48][5]. This correlates with phage sensitivity[48].

Transcription is antagonistically[26] de-repressed by LeuO[48], a protein of the LysR transcription factor family[26] encoded near the leuABCD (leucine synthesis[2]) operon[13]. LeuO expression is itself repressed by H-NS[3][6]. Expression of H-NS-repressed proteins can be manipulated with plasmid-encoded leuO under a constitutive promoter[33]. Plasmids: pCA24N (lac1 promoter), pKEDR13 (pTac promoter), pNH41 (IPTG). Increased LeuO expression leads to increased expression of casABCDE, cas1, and cas2[48][33], but does not affect cas3 expression[48]. Constitutively expressing leuO had a stronger effect than knocking out H-NS[48].

Classification of CRISPR systems

For a comprehensive listing of Cas genes, see [1].

Haft (2005) [15]: Recognition of core Cas genes (1-6). Organized remaining genes into 9 subtypes: Ecoli, Ypest, Nmeni, Dvulg, Tneap, Hmari, Apern, Mtube, RAMP.

Makarova (2011) [65]: Classification into types I, II, and III, based on mechanism of action as well as homology. The subtypes within these types correspond to a large extent with the 9 given by Haft:

  • I-A: Apern (Csa)
  • I-B: Tneap (Cst) / Hmari (Csh)
  • I-C: Dvulg (Csd)
  • I-D
  • I-E: Ecoli (Cse)
  • I-F: Ypest (Csy)
  • II-A: Nmeni (Csn)
  • II-B: Nmeni (Csn)
  • III-A: Mtube (Csm)
  • III-B: Polymerase-RAMP (Cmr)
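The correspondence listed above can be captured as a small lookup table. The Python sketch below simply restates the list; the names `SUBTYPE_TO_HAFT` and `system_type` are hypothetical, and the empty entry for I-D reflects that the list gives no Haft equivalent for it.

```python
# Makarova (2011) subtype -> Haft (2005) subtype(s), copied from the list above.
SUBTYPE_TO_HAFT: dict[str, list[str]] = {
    "I-A": ["Apern (Csa)"],
    "I-B": ["Tneap (Cst)", "Hmari (Csh)"],
    "I-C": ["Dvulg (Csd)"],
    "I-D": [],  # no Haft equivalent given in the list
    "I-E": ["Ecoli (Cse)"],
    "I-F": ["Ypest (Csy)"],
    "II-A": ["Nmeni (Csn)"],
    "II-B": ["Nmeni (Csn)"],
    "III-A": ["Mtube (Csm)"],
    "III-B": ["Polymerase-RAMP (Cmr)"],
}


def system_type(subtype: str) -> str:
    """Return the major type (I, II, or III) of a subtype such as 'III-B'."""
    if subtype not in SUBTYPE_TO_HAFT:
        raise KeyError(f"unknown subtype: {subtype}")
    return subtype.split("-", 1)[0]


print(system_type("I-E"), SUBTYPE_TO_HAFT["I-E"])
```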

Type I, II and III systems

This classification takes into account differing mechanisms at all three stages of the pathway[65].

Integration: In type I and II systems, the integration of protospacers depends on a protospacer adjacent motif (PAM). Cas1 and Cas2 are involved in this stage in all three types.

Expression: The CRISPR locus is transcribed into pre-crRNA. In type I systems, the Cascade complex binds to pre-crRNA, which is then cleaved by Cas6e or Cas6f. Type II systems use a trans-encoded small RNA (tracrRNA) that binds with the repeat segment of pre-crRNA[62], followed by cleavage by RNase III in the presence of Cas9. Cas6 cleaves pre-crRNA in type III systems. The crRNAs are then transferred to a distinct Cas complex (Csm in subtype III-A and Cmr in subtype III-B).

Interference: In type I systems, the Cascade complex is guided by the crRNA to the target strand. Cas3 then cleaves the DNA. In type II systems, Cas9 directly targets and cleaves the DNA without any additional proteins. Type I and type II systems both probably require a specific PAM for this stage. The Csm or Cmr proteins in type III systems also directly target DNA without additional proteins.

CRISPR resources

  • The [http://crispr.u-psud.fr/crispr/ CRISPR database] analyzes archaeal and bacterial genomes for CRISPR arrays. The sequence of each locus can be displayed. This database is regularly updated.
  • The [http://cmr.jcvi.org/cgi-bin/CMR/shared/GenomePropDefinition.cgi?prop_acc=GenProp0021 J. Craig Venter Institute] CMR genome properties database contains Cas gene information for several hundred genomes. The CMR database is currently without direct funding and is not being actively maintained.
  • [http://crispi.genouest.org CRISPI: a CRISPR Interactive database] analyzes archaeal and bacterial genomes for CRISPR arrays and Cas genes. This database should be used with caution, as many of the purported repeats and spacers are several hundred base pairs, which is in conflict with the literature. Last updated 2008-11-04.
  • [http://www.drive5.com/pilercr/ PILER-CR], a software tool for detecting CRISPR arrays.
  • [http://www.room220.com/crt/ CRISPR Recognition Tool], another array recognition tool.