Team:DTU-Denmark/Bioinformatic

From 2011.igem.org


Revision as of 14:41, 18 September 2011

Bioinformatic

Bioinformatic study

To investigate how much flexibility one has when engineering an sRNA regulator based on our system, we performed a bioinformatic study of sequence and structure conservation among 24 bacterial species representative of the diversity within Enterobacteriaceae.

First, the ChiP homolog, including its ribosome binding site (RBS), was identified in each of the 24 species by BlastP using the E. coli protein as bait. Then, to identify putative ChiX homologs, a local BlastN search was performed with the sequence flanking the RBS as query and the given genome as target (window size of 7). Hits in coding genes were excluded, since we restrict ChiX homologs to intergenic regions, as is the case in E. coli and Salmonella [Overgaard et al., 2009; Figueroa-Bossi et al., 2009]. In addition, hits without a putative -35 and -10 element at a reasonable distance from the putative start site were also excluded, leaving one hit for each of the 24 genomes. Next, the secondary structure of each of the 24 putative ChiX homologs was analysed by RNAfold (default parameter settings) from the Vienna RNA package version 2.0.0, and the sequences were trimmed for further analysis so that they start 1 or 2 nucleotides upstream of the first stem-loop and end with the 4 or 5 Ts downstream of the second stem-loop. A structural alignment was then made with the LocARNA server, also from the Vienna RNA package version 2.0.0 (default parameter settings), and the RNA secondary structure and sequence conservation were visualized with the RNAlogo web server. Finally, the percentages of G/C and A/T at each alignment position (out of 24 sequences) were calculated manually. Note that at gap positions these two numbers do not sum to 100.
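The exclusion of hits in coding genes can be sketched as a simple interval-overlap filter. This is only an illustration of the idea, not the procedure actually used: the function names and all coordinates below are hypothetical, and real gene positions would have to come from the genome annotation.

```python
from typing import List, Tuple

# (start, end) on the genome, 1-based and inclusive
Interval = Tuple[int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True if the two closed intervals share at least one position."""
    return a[0] <= b[1] and b[0] <= a[1]


def intergenic_hits(hits: List[Interval], genes: List[Interval]) -> List[Interval]:
    """Keep only the hits that do not overlap any annotated coding gene."""
    return [h for h in hits if not any(overlaps(h, g) for g in genes)]


# Hypothetical coordinates, for illustration only
genes = [(100, 900), (1200, 2000)]
hits = [(850, 860), (1000, 1090), (1500, 1510)]

print(intergenic_hits(hits, genes))
```

Only the hit lying between the two hypothetical genes survives the filter. The promoter check (-35 and -10 elements) described above was a separate, manual step and is not modelled here.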

From the sequence logo it can be seen that the structure, but not the sequence, is conserved in the first stem-loop, whereas both structure and sequence are conserved in the second stem-loop. Another interesting feature is the conserved A/U stretch around position 40, which resembles the characteristic Hfq binding motif [Vogel and Luisi, 2011]. The sequence that base-pairs with the ChiP RBS is perfectly conserved. This might be due to evolutionary selection for a strong RBS rather than to functional constraints: AAAGAGG is not a bad RBS, and it is somewhat similar to the RBS BioBricks BBa_B0030, BBa_B0034, BBa_B0035, and BBa_B0064, which have relative strengths of 0.35-1.124. Consequently, we expect that complementary mutations can be made freely to match the RBS in any mRNA of interest.
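As a small illustration of what "complementary mutations" means here, the sketch below computes the sRNA stretch that would pair with a given RBS by strict Watson-Crick complementarity. It assumes perfect antisense pairing and ignores G-U wobble pairs and any structural context; the function name is our own, and the E. coli sequence is the row from the alignment on this page with gaps removed.

```python
# Watson-Crick complement for RNA bases
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antisense_rna(target: str) -> str:
    """Return the RNA stretch (5'->3') that pairs perfectly with `target` (5'->3')."""
    rna = target.upper().replace("T", "U")
    return "".join(COMPLEMENT[base] for base in reversed(rna))


chip_rbs = "AAAGAGG"
seed = antisense_rna(chip_rbs)
print(seed)

# E. coli ChiX row from the alignment on this page, gap characters removed
ecoli_chix = (
    "-----C--ACCGUC--GCUUAA-AGUGACGGC----AUAAUAAUAAAAAAA--"
    "UGAAAUUCCUCUUUGA-CGGGCCAAUAGCGAUAUUGGCCAUUUUU"
).replace("-", "")

# The computed stretch is found inside the conserved region of ChiX
print(seed in ecoli_chix)
```

For the ChiP RBS this gives CCUCUUU, which lies within the perfectly conserved UUCCUCUUUGA stretch of the alignment. Retargeting would mean replacing that stretch with the antisense of another RBS, for example `antisense_rna("AGGAGG")` for a hypothetical target.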


Species

Below are the names and accession numbers of the 24 species included in the analysis.

Name                          Accession Number
Citrobacter koseri            NC_009792
Citrobacter rodentium         NC_013716
Cronobacter sakazakii         CP000783
Cronobacter turicensis        FN543093
Edwardsiella ictaluri         CP001600
Edwardsiella tarda            CP002154
Enterobacter aerogenes        CP002824
Enterobacter cloacae          CP002272
Enterobacter sp.              CP000653
Erwinia billingiae            FP236843
Escherichia coli              AE005174
Klebsiella pneumoniae         CP002910
Klebsiella pneumoniae         CP000647
Klebsiella variicola          CP001891
Pantoea sp.                   CP002433
Salmonella bongori            FR877557
Salmonella enterica           NC_010067
Serratia proteamaculans       CP000826
Shigella flexneri             AE014073
Shigella sonnei               NC_007384
Yersinia enterocolitica       FR729477
Yersinia pestis               CP000308
Yersinia pestis               NC_005810
Yersinia pseudotuberculosis   NC_006155

Erwinia_billingi  -----CCCCG-CUGGCAACCUGAUUGCUACCGG---AUAACUAAAAUCAUA-AAAAAAUUCCUCUUUGA-CGGGCCGAUAGCAAUAUUGGCCAUUUUU
Pantoea_sp__At_9  AUUGGCCCGCGCCAGCCA-GUAAUGGCUGUGUG---AUAACCAAAAUCAUA-AACAAAUUCCUCUUUGA-CGGGCCGAUAGUAAUAUUGGCCUUCUUU
Serratia_proteam  -C---UAUUGUAACCGUUAACA--GGGUUACAAUGCGUA-UAACUACAAUACAAGAAAUUCCUCUUUGA-CUGGCCAGUAGCGAUAUUGGCCACUUUU
Yersinia_enteroc  -CU-GCUC--AUAUUUUCCGCAAGGAAUGUGAGGGCUUAAUAACUAAAAUAAUGAAAAUUCCUCUUUGA-CUGGCCGAUAGCGAUAUCGGCCAUUUUU
Yersinia_pesti_1  --UGGCGCUCACAUUAUCCGCAAGGAGUGUGAGUGCUUAAUAACAAAAAUAAUGAAAAUUCCUCUUUGA-CUGGCCGGUAGUGAUAUCGGCCAUUUUU
Yersinia_pestis_  --UGGCGCUCACAUUAUCCGCAAGGAGUGUGAGUGCUUAAUAACAAAAAUAAUGAAAAUUCCUCUUUGA-CUGGCCGGUAGUGAUAUCGGCCAUUUUU
Yersinia_pseudot  -CUGGUGCUCACAUUAUCCGCAAGGAGUGUGAGUGCUUAAUAACAAAAAUAAUGAAAAUUCCUCUUUGA-CUGGCCGGUAGUGAUAUCGGCCAUUUUU
Citrobacter_rode  --------ACCGUC--GCUUAA-AGCGGCGGC----AUAACAAU--AAUGA--UGAAAUUCCUCUUUGA-CGGGCCAAUAGCGAUAUUGGCCAUUUUU
Cronobacter_saka  -----A--ACCGUC--CGCUAAGGCGCACGGC----AUAACGACAAUAACG--AAAAGUUCCUCUUUGA-CGGGCCAGUAGCGAUACUGGCCUUCUUU
Cronobacter_turi  --------ACCGUU--CGCUAACGCGCACGGC----AUAACGAUAAUAACG---AAAGUUCCUCUUUGA-CGGGCCAGUAGCGAUACUGGCCUUCUUU
Edwardsiella_ict  -----ACCGGGCAC--CCCUUGGGGGGCGCCC-GGAAUAAUAAU---AUC---UGAAGUUCCUCUUUGACUG-GCCAAUAGCAAUAUUGGCCUUUUUU
Edwardsiella_tar  -----A--CCGGGU--CCCUAUGGGGGCCCGGCAUAAUAAUAAU---AUC---UGAAGUUCCUCUUUGACUG-GCCAAUAGCGAUAUUGGCCUUUUUU
Enterobacter_aer  -----A--UCCGGG--AUGUAU--AUCCCGGG----AUAAUAAU--AAUGA--UGAAAUUCCUCUUUGA-CGGGCCAAUAGCAAUAUUGGCCAUUUUU
Enterobacter_clo  -----A--UCCGGA--GUGCGA--ACUCCGGG----AUAAUAAU--AACGA--UGAAAUUCCUCUUUGA-CGGGCCAAUAGAAAUAUUGGCCAUUUUU
Enterobacter_sp_  -----A--ACCGAG--GGUCU---CCUUCGGC----AUAAUAAU--AACGA--UGAAAUUCCUCUUUGA-CGGGCCAAUAGAAAUAUUGGCCAUUUUU
Escherichia_coli  -----C--ACCGUC--GCUUAA-AGUGACGGC----AUAAUAAUAAAAAAA--UGAAAUUCCUCUUUGA-CGGGCCAAUAGCGAUAUUGGCCAUUUUU
Klebsiella_pne_1  -----A--UCCGGG--AUGCAA--AUCCCGGG----AUAAUAAU--AAUGA--UGAAAUUCCUCUUUGA-CGGGCCAAUAGCAAUAUUGGCCAUUUUU
Klebsiella_pneum  -----A--UCCGGG--AUGCAA--AUCCCGGG----AUAAUAAU--AAUGA--UGAAAUUCCUCUUUGA-CGGGCCAAUAGCAAUAUUGGCCAUUUUU
Klebsiella_varii  -----A--UCCGGG--AUGCAA--AUCCCGGG----AUAAUAAU--AAUGA--UGAAAUUCCUCUUUGA-CGGGCCAAUAGCAAUAUUGGCCAUUUUU
Salmonella_bongo  -----A--UCCGAA--GUGAAA--GCUUCGGG----AUAAUAAU--AAUGA--UGAAAUUCCUCUUUGA-CGGGCCAAUAGCAAUAUUGGCCAUUUUU
Salmonella_enter  -----A--UCCGAA--GCGAAA--GCGUCGGG----AUAAUAAU--AACGA--UGAAAUUCCUCUUUGA-CGGGCCAAUAGCGAUAUUGGCCAUUUUU
Shigella_flexner  --------ACCGUC--GCUUAA-AGUGACGGC----AUAAUAAUAAAAAAA--UGAAAUUCCUCUUUGA-CGGGCCAAUAGCGAUAUUGGCCAUUUUU
Shigella_sonnei_  --------ACCGUC--GCUUAA-AGUGACGGC----AUAAUAAUAAAAAAA--UGAAAUUCCUCUUUGA-CGGGCCAAUAGCGAUAUUGGCCAUUUUU
citrobacter_kose  -----A--ACCAGG--GCGCUA-CGUCCUGGC----AUAAUAAU--AACGA--UGAAAUUCCUCUUUGA-CGGGCCAAUAGAAAUAUUGGCCAUUUUU
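The per-position G/C and A/U percentages, which were calculated manually in this study, could also be computed from the aligned rows with a few lines of code. The sketch below is our own illustration and uses a small made-up alignment rather than the 24 sequences above. It divides by the total number of sequences, which is why the two percentages do not sum to 100 at gap positions.

```python
from typing import List, Tuple


def gc_au_percent(rows: List[str]) -> List[Tuple[float, float]]:
    """Return (G/C %, A/U %) for every alignment column, out of all sequences."""
    if len({len(row) for row in rows}) != 1:
        raise ValueError("all aligned rows must have the same length")
    n = len(rows)
    result = []
    for column in zip(*rows):
        gc = sum(base in "GC" for base in column)
        au = sum(base in "AUT" for base in column)  # T is counted with A/U
        # Gaps ('-') count in the denominator but in neither numerator
        result.append((100.0 * gc / n, 100.0 * au / n))
    return result


# Made-up toy alignment, for illustration only
toy = ["GCAU-", "GCAUU", "G-AUU", "CCA-U"]

for position, (gc, au) in enumerate(gc_au_percent(toy), start=1):
    print(position, gc, au, gc + au)
```

In the toy alignment, columns 2, 4 and 5 each contain one gap, so their two percentages sum to 75 rather than 100.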


References

Figueroa-Bossi, Nara, Martina Valentini, Laurette Malleret, and Lionello Bossi. “Caught at its own game: regulatory small RNA inactivated by an inducible transcript mimicking its target.” Genes & Development 23, no. 17 (2009): 2004-2015. http://genesdev.cshlp.org/content/23/17/2004.abstract.

Overgaard, Martin, Jesper Johansen, Jakob Møller‐Jensen, and Poul Valentin‐Hansen. “Switching off small RNA regulation with trap‐mRNA.” Molecular Microbiology 73, no. 5 (2009): 790-800. http://onlinelibrary.wiley.com/doi/10.1111/j.1365-2958.2009.06807.x/abstract.

Vogel, Jörg, and Ben F. Luisi. “Hfq and its constellation of RNA.” Nature Reviews Microbiology 9, no. 8 (2011): 578-589.