Team:Arizona State/Project/CRISPR
From 2011.igem.org
CRISPR
Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a genomic feature of many bacterial and archaeal species. CRISPR functions as an adaptive and heritable immune system[10][18][45][46][47]. A CRISPR locus consists of a set of cas (CRISPR-associated) genes, a leader (promoter) sequence, and an array. The array consists of repeating elements separated by "spacers". These spacer regions direct the CRISPR machinery to degrade or otherwise inactivate a complementary sequence in the cell.
Engineered arrays
- Cells carrying an engineered spacer complementary to T3 phage showed increased survival[2][4][5][35][37].
- A customized spacer can prevent transformation by pC194 plasmids carrying a matching sequence[5].
CRISPR in E. coli
There are four CRISPR loci in E. coli. CRISPR1, the largest, is associated with eight cas genes[34]. In the classification scheme presented by Haft et al.[1], these genes form the cse family, outlined below:
- casA, casB, casC, casD, and casE, also known as cse1, cse2, cse4, cas5e, and cse3, respectively[1]
CRISPR in P. furiosus
P. furiosus contains 7 CRISPR loci, along with 29 cas proteins encoded in 2 gene clusters[7]. All 6 core cas genes (cas1-cas6), as well as genes from the cmr, cst, and csa families, are present.
Stages of the CRISPR pathway
There are 3 distinct stages of the CRISPR pathway: integration[2][36][52], expression, and interference. The cse subtype (E. coli) is used as the example throughout this section.
Integration / Adaptation
In this stage, foreign DNA, commonly derived from phages and plasmids[14], is recognized and processed by cas proteins, and the new information is incorporated at the leader end of an existing array. This involves cas1 and cas2[38][42]. Integration is the least understood stage of the pathway.
Expression
In the expression stage, the CRISPR array is transcribed in its entirety, yielding pre-crRNA. This pre-crRNA is cleaved within the repeat regions[34][56][57][58] by casE to yield crRNA. Each crRNA is 61 nt long and consists of a 32 nt spacer flanked by repeat-derived segments on both ends[24]: 8 nt at the 5' end[4][5][57], which carries a 5' hydroxyl group, and 21 nt forming a hairpin at the 3' end. The crRNA is then bound by CASCADE, a protein complex consisting of casA, B, C, D, and E[24].
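The processing step can be sketched in Python. This is a minimal illustration under stated assumptions, not a model of casE itself: the function names are hypothetical, the repeat and spacer sequences are obvious placeholders rather than real E. coli sequences, and the cut position (8 nt before the 3' end of each repeat) is inferred from the 8 nt and 21 nt repeat-derived segments described above, which together imply a 29 nt repeat.

```python
REPEAT_LEN = 29                      # assumption: 21 nt + 8 nt segments imply a 29 nt repeat
HANDLE_5P = 8                        # repeat-derived nucleotides kept at the crRNA 5' end
HANDLE_3P = REPEAT_LEN - HANDLE_5P   # 21 nt kept at the crRNA 3' end


def build_pre_crrna(repeat, spacers):
    """Return repeat-spacer-repeat-...-repeat as a single RNA string."""
    assert len(repeat) == REPEAT_LEN
    parts = [repeat]
    for spacer in spacers:
        parts.append(spacer)
        parts.append(repeat)
    return "".join(parts)


def process_pre_crrna(pre_crrna, repeat):
    """Cut inside every repeat, 8 nt before its 3' end, and return
    only the fragments that lie between two consecutive cuts
    (each of these contains exactly one complete spacer)."""
    cut_sites = []
    start = pre_crrna.find(repeat)
    while start != -1:
        cut_sites.append(start + HANDLE_3P)
        start = pre_crrna.find(repeat, start + len(repeat))
    return [pre_crrna[a:b] for a, b in zip(cut_sites, cut_sites[1:])]


# Placeholder sequences (not real): a 29 nt repeat and two 32 nt spacers.
repeat = "G" * 21 + "C" * 8
spacers = ["AU" * 16, "UA" * 16]
for crrna in process_pre_crrna(build_pre_crrna(repeat, spacers), repeat):
    print(len(crrna), crrna)
```

With 32 nt placeholder spacers, each product is 8 + 32 + 21 = 61 nt long, matching the crRNA length given above. The exact string search assumes all repeats in the array are identical.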
Interference
This stage requires CASCADE bound to crRNA, as well as cas3[24]. The target is DNA in the cse subtype[4][5][36], or RNA in the cmr subtype[7]. Target DNA is recognized through the formation of R-loops[24][34][54]. An R-loop is a structure in which an RNA strand base pairs with a complementary DNA strand, displacing the other DNA strand, whose sequence matches the RNA[54]. This base pairing between the crRNA spacer sequence and the target strand may mark the region for interference by other proteins such as cas3[24].
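The strand relationships in an R-loop can be made concrete with a short Python sketch. It is an illustration only: the function names are hypothetical, matching is exact (no mismatches tolerated, which is an assumption), and the sequences in the usage lines are invented. Because the displaced strand has the same sequence as the RNA (with T in place of U), finding the spacer sequence on one strand identifies the other strand as the one the crRNA pairs with.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return dna.translate(COMPLEMENT)[::-1]


def find_r_loop(spacer_rna, top_strand):
    """Locate a site where the spacer could pair with double-stranded DNA.

    Returns (displaced_strand, start), where start is given in top-strand
    coordinates, or None if the spacer matches neither strand.
    """
    spacer_dna = spacer_rna.replace("U", "T")
    pos = top_strand.find(spacer_dna)
    if pos != -1:
        # Spacer matches the top strand, so it pairs with the bottom
        # strand and the top strand is displaced.
        return ("top", pos)
    pos = reverse_complement(top_strand).find(spacer_dna)
    if pos != -1:
        # Spacer matches the bottom strand; convert to top-strand coordinates.
        return ("bottom", len(top_strand) - pos - len(spacer_dna))
    return None


# Invented example sequences.
spacer = "ACGUACGGAUUC"
target = "GGG" + "ACGTACGGATTC" + "AAA"
print(find_r_loop(spacer, target))                      # displaced strand: top
print(find_r_loop(spacer, reverse_complement(target)))  # displaced strand: bottom
```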
CASCADE complex
CASCADE is a protein complex of casA-E that resembles a seahorse in shape[24]. Its full composition is 1x casA, 2x casB, 6x casC, 1x casD, 1x casE[24]. All protein components (casA-casE) are required for virus resistance[24]. The complex binds double-stranded target DNA without need for, or enhancement by, cofactors such as metal ions or ATP[24]. It also undergoes conformational changes when binding DNA[24].
Core cas genes
There are 6 “core” cas genes, found in a wide variety of organisms and here referred to as cas1-cas6[1].
cas1, cas2
cas1 is nearly universally conserved among organisms with CRISPR[38]. It is strongly implicated in the integration stage of the pathway[38][42]. cas1 is a metal-dependent (Mg, Mn), DNA-specific endonuclease that generates an 80 bp fragment[38]. How this fragment is converted into a ~32 bp spacer is unknown. cas2 is also involved in integration[38][42], and is a metal-dependent endoribonuclease[59].
cas3
cas3 is not regulated by H-NS[40]. It cooperates with the CASCADE complex[4] in the interference stage. cas3 has predicted ATP-dependent helicase activity[41], as well as demonstrated ATP-independent annealing of RNA to DNA[34]. It forms an R-loop with DNA, requiring magnesium or manganese as a cofactor[34], but has the opposite function in the presence of ATP, dissociating the R-loop.
The CRISPR array
Genetic information from previous encounters is stored in the array as spacers. These spacers are consistent in length (30-40 bp), and are flanked by repeating elements (also 30-40 bp). The repeating elements are usually partially palindromic, and form secondary structures when transcribed into pre-crRNA. These structures may be necessary for recognition and cleavage.
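The repeat-spacer layout described above can be illustrated by splitting an array on its repeat. This is a sketch under simplifying assumptions: the function name is hypothetical, every repeat is taken to be an exact copy (real arrays may contain imperfect repeats), and the sequences are short invented placeholders rather than the 30-40 bp elements of a real array.

```python
def extract_spacers(array_dna, repeat):
    """Return the spacers found between consecutive repeats.

    Whatever precedes the first repeat (for example the leader) and
    whatever follows the last repeat is discarded.
    """
    pieces = array_dna.split(repeat)
    return pieces[1:-1]


# Invented placeholder sequences.
leader = "TTTT"
repeat = "GGGGCCCC"
spacer_1 = "ACGT" * 8
spacer_2 = "TTGA" * 8
array = leader + repeat + spacer_1 + repeat + spacer_2 + repeat + "AA"

for spacer in extract_spacers(array, repeat):
    print(len(spacer), spacer)
```

Because integration adds material at the leader end of the array, the first spacer returned for a leader-first sequence is the one closest to the leader.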
Prevention of self targeting (autoimmunity)
The 5' handle of crRNA allows self/nonself discrimination in the csm subtype[9]. In cse, regions flanking the protospacer contain protospacer adjacent motifs (PAMs)[9][50][52][53][55].
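One way to picture how a flanking motif could separate invader DNA from the host's own array is the Python sketch below. It is a toy under loudly stated assumptions: the function names are hypothetical, the motif "AAG" is a placeholder and not a claim about the actual cse PAM, the motif is assumed to sit immediately upstream of the protospacer, and all sequences are invented. In the array itself a spacer is flanked by repeat sequence rather than by the motif, so the same check fails there.

```python
def has_flanking_pam(dna, protospacer_start, pam):
    """True if `pam` sits immediately upstream (5') of the protospacer.
    The upstream position is an assumption made for this sketch."""
    pam_start = protospacer_start - len(pam)
    if pam_start < 0:
        return False
    return dna[pam_start:protospacer_start] == pam


def is_valid_target(dna, spacer, pam):
    """True if any exact copy of the spacer sequence in `dna` is
    preceded by the motif."""
    pos = dna.find(spacer)
    while pos != -1:
        if has_flanking_pam(dna, pos, pam):
            return True
        pos = dna.find(spacer, pos + 1)
    return False


PLACEHOLDER_PAM = "AAG"            # hypothetical motif, for illustration only
spacer = "ACGTTGCATGCATCGA"        # invented sequence
repeat = "CCCCGGGG"                # invented sequence

invader = "TT" + PLACEHOLDER_PAM + spacer + "TT"
host_array = repeat + spacer + repeat

print(is_valid_target(invader, spacer, PLACEHOLDER_PAM))     # motif present
print(is_valid_target(host_array, spacer, PLACEHOLDER_PAM))  # flanked by repeat
```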
cas gene regulation
In E. coli (cse subtype), transcription of the CASCADE genes and the CRISPR array is repressed by H-NS[14][60]. H-NS is a global repressor of transcription in many Gram-negative bacteria that binds AT-rich sequences[62]. This repression is mediated by "DNA stiffening"[63], as well as by the formation of "DNA-protein-DNA" bridges[64]. Knocking out H-NS increases expression of cas genes[14][61]. This correlates with phage sensitivity[14]. LeuO, a protein of the LysR transcription factor family[65] encoded near the leuABCD (leucine synthesis[67]) operon[66], antagonizes H-NS[65] and de-represses transcription[14]. LeuO expression is itself repressed by H-NS[68][69]. Expression of H-NS-repressed proteins can be manipulated with plasmid-encoded leuO under a constitutive promoter[70]. Plasmids: pCA24N (lac1 promoter), pKEDR13 (pTac promoter), pNH41 (IPTG). Increased LeuO expression leads to increased expression of casABCDE, cas1, and cas2[14][70], but does not affect cas3 expression[14]. Constitutively expressing leuO had a stronger effect than knocking out H-NS[14].
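The regulatory relationships above can be summarized as a small qualitative model. The Python sketch below is only a restatement of those relationships: the function name and the 0/1/2 ranks are hypothetical, the ranks are ordinal labels rather than measured expression levels, and the outcome when H-NS is knocked out and leuO is constitutively expressed at the same time is an assumption, since that combination is not described here.

```python
# Genes described above as responding to H-NS and LeuO.
HNS_REPRESSED = ["casA", "casB", "casC", "casD", "casE", "cas1", "cas2"]


def expression_rank(gene, hns_knockout=False, leuo_constitutive=False):
    """Ordinal rank of expression relative to wild type:
    0 = wild-type baseline, 1 = increased, 2 = increased further."""
    if gene not in HNS_REPRESSED:
        return 0  # e.g. cas3, described as unaffected
    if leuo_constitutive:
        # Constitutive leuO is described as the stronger effect.
        # Assumption: the same rank applies if H-NS is also knocked out.
        return 2
    if hns_knockout:
        return 1
    return 0


for gene in ["casA", "cas1", "cas3"]:
    print(gene,
          expression_rank(gene),
          expression_rank(gene, hns_knockout=True),
          expression_rank(gene, leuo_constitutive=True))
```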
Classification of CRISPR systems
- (todo)
- 3 important papers: