Team:Arizona State/Project/CRISPR
From 2011.igem.org
CRISPR
Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are a genomic feature of many bacterial and archaeal species. CRISPR functions as an adaptive and inheritable immune system[38][50][31][36][40]. A CRISPR locus consists of a set of cas (CRISPR associated) genes, a leader (promoter) sequence, and an array. The array consists of repeating elements separated by "spacers". These spacer regions direct the CRISPR machinery to degrade or otherwise inactivate a complementary sequence in the cell.
Engineered arrays
- By engineering a spacer complementary to T3 phage, increased survival was demonstrated[15][23][26][48][55].
- A customized spacer can prevent transformation of pC194 plasmids with a matching sequence[26].
CRISPR in E. coli
There are four CRISPR loci in E. coli. CRISPR1, the largest, is associated with eight cas genes[70]. In the classification scheme presented by Haft et al.[13], five of these genes form the cse family, outlined below:
- casA (cse1), casB (cse2), casC (cse4), casD (cas5e), casE (cse3)[13]
CRISPR in P. furiosus
P. furiosus contains 7 CRISPR loci, along with 29 cas genes in 2 gene clusters[33]. All 6 core cas genes (cas1-cas6), as well as genes from the cmr, cst, and csa families, are present.
Stages of the CRISPR pathway
There are 3 distinct stages of the CRISPR pathway: integration (also called adaptation)[15][51][20], expression, and interference. The cas subtype this section will refer to as an example is cse (E. coli).
Integration / Adaptation
In this step, foreign DNA, commonly derived from phages and plasmids[45], is recognized and processed by cas proteins, and a fragment of it is incorporated as a new spacer at the leader end of an existing array. This involves cas1 and cas2[30][57]. The integration stage is the least understood aspect of the pathway.
Expression
In the expression stage, the CRISPR array is transcribed in its entirety, yielding pre-crRNA. casE cleaves this pre-crRNA within the repeat regions[70][7][25][29] to yield crRNA. Each crRNA is 61 nt long: a 32 nt spacer flanked by repeat-derived segments on both ends[60], with 8 nt at the 5' end[23][26][25] (ending in a 5' hydroxyl group) and 21 nt forming a hairpin at the 3' end. The crRNA is then bound by CASCADE, a protein complex consisting of casA, B, C, D, and E[60].
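The cleavage pattern described above can be sketched as a toy string operation. This is a minimal illustration, not a model of casE itself: the repeat and spacer sequences are made-up placeholders, the repeat is assumed to be 29 nt so that a single cut per repeat leaves 21 nt on the upstream crRNA and 8 nt on the downstream one, and the spacer is assumed to be 32 nt so that 8 + 32 + 21 = 61. The function names are hypothetical.

```python
# Illustrative placeholders, NOT real E. coli sequences.
REPEAT = "CCGGAUCCGCGAAAGCGGAUC" + "AUCGAUCG"  # assumed 21 nt + 8 nt = 29 nt repeat
SPACERS = ["ACGU" * 8, "UGCA" * 8]              # two assumed 32 nt spacers


def build_pre_crrna(repeat, spacers):
    """Return a repeat-spacer-repeat-...-repeat transcript (leader omitted)."""
    return repeat + "".join(spacer + repeat for spacer in spacers)


def cleave_pre_crrna(pre_crrna, repeat, cut_offset=21):
    """Cut once inside every repeat, cut_offset nt from the repeat's 5' end.

    Returns only the fragments lying between two consecutive cuts, i.e.
    8 nt handle + spacer + 21 nt handle under the assumptions above.
    """
    cut_sites = []
    start = pre_crrna.find(repeat)
    while start != -1:
        cut_sites.append(start + cut_offset)
        start = pre_crrna.find(repeat, start + len(repeat))
    return [pre_crrna[a:b] for a, b in zip(cut_sites, cut_sites[1:])]


PRE_CRRNA = build_pre_crrna(REPEAT, SPACERS)
CRRNAS = cleave_pre_crrna(PRE_CRRNA, REPEAT)

for crrna in CRRNAS:
    # length, 5' handle, spacer, 3' handle
    print(len(crrna), crrna[:8], crrna[8:-21], crrna[-21:])
```

Each returned fragment carries the last 8 nt of one repeat, a full spacer, and the first 21 nt of the next repeat, which is the layout given in the text.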
Interference
This stage requires CASCADE bound to crRNA, as well as cas3[60]. The CASCADE complex may target DNA in the case of cse[23][26][51], or RNA in the cmr subtype[33]. Recognition of target DNA takes place by means of R-loops[60][70][1]. An R-loop is a structure in which an RNA strand base pairs with a complementary DNA strand, displacing the other DNA strand, whose sequence matches the RNA[1]. This base pairing between the crRNA spacer sequence and the target strand may mark the region for interference by other proteins such as cas3[60].
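As a rough illustration of what R-loop formation implies at the sequence level, the sketch below searches a double-stranded DNA for sites where a spacer could pair. It assumes perfect complementarity over the whole spacer and ignores everything else the real complex may require; the function names and the test sequences are hypothetical. Because the spacer RNA pairs with the target strand, the displaced strand reads the same as the spacer (with T in place of U), so the search looks for that sequence on either strand.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return dna.translate(COMPLEMENT)[::-1]


def find_protospacers(spacer_rna, dna):
    """Locate sites where the spacer could form an R-loop.

    dna is the 5'->3' sequence of one strand of a duplex. A hit
    ("+", i) means the displaced strand is the given strand, starting at
    index i; ("-", i) means it is the reverse-complement strand, with i
    counted along that strand.
    """
    displaced = spacer_rna.replace("U", "T")
    hits = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        pos = seq.find(displaced)
        while pos != -1:
            hits.append((strand, pos))
            pos = seq.find(displaced, pos + 1)
    return hits


# Made-up example: a 16 nt spacer and a short target carrying its match.
spacer = "ACGUUGCAAGGCUUAC"
target = "TTTT" + "ACGTTGCAAGGCTTAC" + "GGGG"
print(find_protospacers(spacer, target))
```

An exact string match is the simplest possible criterion; it is used here only to make the strand relationships concrete.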
CASCADE complex
This is a protein complex of casA-E, resembling a seahorse in shape[60]. Its full composition is 1x casA, 2x casB, 6x casC, 1x casD, and 1x casE[60]. All protein components (casA-casE) are required for virus resistance[60]. The complex binds double-stranded target DNA without requiring, or being enhanced by, cofactors such as metal ions or ATP[60]. It also undergoes conformational changes when binding DNA[60].
Core cas genes
There are 6 “core” cas genes, found in a wide variety of organisms and here referred to as cas1-cas6[13].
cas1, cas2
cas1 is nearly universally conserved throughout organisms with CRISPR[30]. It is strongly implicated in the integration stage of the pathway[30][57]. cas1 is a metal-dependent (Mg, Mn), DNA-specific endonuclease that generates an 80 bp fragment[30]. How this fragment is converted into a ~32 bp spacer is unknown. cas2 is also involved in integration[30][57], and is a metal-dependent endoribonuclease[22].
cas3
cas3 is not regulated by H-NS[39]. It cooperates with the CASCADE complex[23] in the interference stage. cas3 has predicted ATP-dependent helicase activity[4], as well as demonstrated ATP-independent annealing of RNA to DNA[70]. It forms an R-loop with DNA, requiring magnesium or manganese as a cofactor[70], but has the opposite effect in the presence of ATP, dissociating the R-loop.
The CRISPR array
Genetic information from previous encounters is stored in the array as spacers. These spacers are consistent in length (30-40 bp), and are flanked by repeating elements (also 30-40 bp). The repeating elements are usually partially palindromic, and form secondary structures when transcribed into pre-crRNA. These structures may be necessary for recognition and cleavage.
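One way to see what "partially palindromic" means for a repeat is to look for an inverted repeat: a stretch whose reverse complement appears further downstream, so that the transcribed RNA could fold back on itself as a stem with a loop in between. The sketch below is a brute-force search under stated assumptions: only perfect Watson-Crick pairing counts, a minimum loop of 3 nt is imposed arbitrarily, and the example sequence is made up. `longest_stem` is a hypothetical helper, not a real structure-prediction tool.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return dna.translate(COMPLEMENT)[::-1]


def longest_stem(repeat, min_loop=3):
    """Find the longest perfect inverted repeat in a sequence.

    Returns (arm_length, left_arm_start, right_arm_start). The right arm
    must start at least min_loop nt after the left arm ends. Brute force,
    which is fine for repeats of a few dozen nt.
    """
    best = (0, None, None)
    n = len(repeat)
    for i in range(n):
        for j in range(i + 1, n + 1):
            arm = repeat[i:j]
            partner = repeat.find(reverse_complement(arm), j + min_loop)
            if partner != -1 and len(arm) > best[0]:
                best = (len(arm), i, partner)
    return best


# Made-up sequence: arm GCCGA, loop TTTT, arm TCGGC, with 2 nt flanks.
example = "CA" + "GCCGA" + "TTTT" + "TCGGC" + "CA"
print(longest_stem(example))
```

A real repeat would be assessed with an RNA folding tool that also handles mismatches and G-U pairs; this sketch only makes the palindrome idea concrete.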
Prevention of self targeting (autoimmunity)
The 5' handle of crRNA allows self / nonself discrimination in the csm subtype[37]. In cse, regions flanking the protospacer (the target sequence matching a spacer) contain protospacer adjacent motifs (PAMs)[37][12][20][28][19].
cas gene regulation
In E. coli (cse subtype), transcription of the cascade genes and CRISPR array is repressed by H-NS[45][41]. H-NS is a global repressor of transcription in many Gram-negative bacteria that binds AT-rich sequences[14]. This repression is mediated by "DNA stiffening"[35], as well as formation of "DNA-protein-DNA" bridges[10]. An H-NS knockout shows increased expression of cas genes[45][5]. This correlates with phage sensitivity[45]. Transcription is antagonistically[24] de-repressed by LeuO[45], a protein of the lysR transcription factor family[24] encoded near the leuABCD (leucine synthesis[2]) operon[11]. LeuO expression is also repressed by H-NS[3][6]. Expression of H-NS repressed proteins can be manipulated by plasmid-encoded leuO under a constitutive promoter[32]. Plasmids: pCA24N (lac1 promoter), pKEDR13 (pTac promoter), pNH41 (IPTG). Increased LeuO expression leads to increased expression of casABCDE, cas1, and cas2[45][32], but does not affect cas3 expression[45]. Constitutively expressing leuO had a stronger effect than knocking out H-NS[45].
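The regulatory relationships in this section can be summarized as a small on/off model. This is only a qualitative sketch built from the statements above: it treats each gene as simply expressed or not, so it cannot capture the observation that constitutive leuO has a stronger effect than an H-NS knockout. The function and key names are hypothetical.

```python
def cas_expression(hns_present=True, plasmid_leuo=False):
    """Qualitative on/off model of cas regulation in E. coli (cse subtype).

    Assumptions taken from the text:
      - H-NS represses casABCDE, cas1, cas2 and the chromosomal leuO gene.
      - LeuO de-represses the H-NS repressed cas genes.
      - A plasmid-encoded, constitutively expressed leuO escapes H-NS control.
      - cas3 expression does not respond to either factor.
    """
    # LeuO is made if supplied from a plasmid or if H-NS no longer represses it.
    leuo_active = plasmid_leuo or not hns_present
    # The cas operon is on if H-NS is absent or LeuO counteracts the repression.
    cas_operon_on = (not hns_present) or leuo_active
    return {
        "leuO": leuo_active,
        "casABCDE_cas1_cas2": cas_operon_on,
        "cas3_affected": False,
    }


for hns in (True, False):
    for leuo in (False, True):
        print(f"H-NS={hns!s:5} plasmid leuO={leuo!s:5}", cas_expression(hns, leuo))
```

In this model the wild-type case (H-NS present, no plasmid) is the only one in which the cas operon stays off, matching the knockout and overexpression results described above.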
Classification of CRISPR systems
- (todo)
- 3 important papers: