Team:UPO-Sevilla/Foundational Advances/MiniTn7/Bioinformatics/Whole phylogeny

From 2011.igem.org


Results. Analysis throughout the whole phylogeny

Before studying the glmS coding nucleotide sequences, the Glms protein was analyzed to check the reported high conservation of its C-terminal region (Milewski, 2002). To this end, a multiple sequence alignment, circular and rectangular trees, and several sequence logos were obtained from all the downloaded Glms sequences (figures 5 to 8). The last 15 amino acids of the protein are highly conserved in all organisms, in contrast with the lack of any conservation in the preceding amino acid sequence; this difference is clearly visible in all the sequence logos. Interestingly, the consensus sequence of this C-terminal region matches exactly the amino acid sequence obtained by translating the nucleotide consensus shown in figure 3.
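The conservation that a sequence logo displays can be quantified column by column. The following is a minimal sketch, not the tool used for the figures: it assumes the common logo definition of information content, R = log2(20) - H, with H the Shannon entropy of the residue frequencies in a column, and it applies no small-sample correction. The function name `column_information` and the toy alignment are hypothetical and do not come from the real Glms data.

```python
import math
from collections import Counter


def column_information(alignment, alphabet_size=20):
    """Return the information content (in bits) of each alignment column.

    Assumed definition: R = log2(alphabet_size) - H, where H is the Shannon
    entropy of the residue frequencies in the column. Gap characters ('-')
    are ignored and no small-sample correction is applied.
    """
    if not alignment:
        raise ValueError("alignment must contain at least one sequence")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    max_bits = math.log2(alphabet_size)
    info = []
    for col in range(length):
        residues = [seq[col] for seq in alignment if seq[col] != "-"]
        if not residues:
            # A column made only of gaps carries no information.
            info.append(0.0)
            continue
        counts = Counter(residues)
        total = len(residues)
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        info.append(max_bits - entropy)
    return info


# Made-up toy alignment (NOT real Glms sequences): the first column is
# variable, the remaining columns are fully conserved.
toy_alignment = ["ACDKVE", "GCDKVE", "TCDKVE", "SCDKVE"]
bits = column_information(toy_alignment)
```

In this toy case the variable first column scores lower than the conserved columns, which is the same pattern the logos show between the variable region and the conserved C-terminal end.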


Figure 5. Glms protein multi-sequence alignment



Figure 6. A. Circular phylogenetic tree obtained from the figure 5 multiple sequence alignment. B. Rectangular phylogenetic tree for the same sequences as in A.



Figure 7. A. Logo obtained for Bacteria and Archaea, covering the last 16 amino acids of the C-terminal region of Glms. B. Logo obtained for Fungi, covering the last part of Glms.



Figure 8. A. Logo calculated from the eukaryotic Glms sequences shown in figure 5 (excluding fungal sequences). B. Logo obtained from all the Glms sequences aligned in figure 5.


After the conservation study of the Glms protein sequence, some glmS coding sequences (those representing the main groups and model organisms) were also aligned to assess the level of conservation of the attTn7 insertion site (figure 9.A). In this case, conservation was much lower than at the protein level. A logo was also obtained for this nucleotide range (figure 9.B), and its sequence was translated. In this logo, as in the logo obtained for the organisms carrying a Tn7 family transposon, the first two positions of each codon were conserved, while the ‘wobble’ position varied strongly. This result was expected for a coding sequence, because most synonymous codons differ only at the third position. When this logo is compared with the one obtained for the organisms with the Tn7 system (figure 10), the level of similarity is surprisingly high.
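The wobble pattern described above can be checked directly on an in-frame nucleotide alignment. The sketch below is an illustration under stated assumptions, not the procedure used for figures 9 and 10: it assumes ungapped sequences of equal length that start at the first position of a codon, and it counts a column as conserved only when every sequence carries the same nucleotide. The function name `conserved_fraction_by_codon_position` and the toy sequences are hypothetical and are not the real glmS data.

```python
def conserved_fraction_by_codon_position(alignment):
    """Return the fraction of fully conserved columns at codon positions 1, 2 and 3.

    Assumes an ungapped, in-frame alignment whose length is a non-zero
    multiple of three.
    """
    if not alignment:
        raise ValueError("alignment must contain at least one sequence")
    length = len(alignment[0])
    if length == 0 or length % 3 != 0:
        raise ValueError("alignment length must be a non-zero multiple of 3")
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    conserved = [0, 0, 0]
    totals = [0, 0, 0]
    for col in range(length):
        codon_pos = col % 3  # 0, 1, 2 -> first, second, third codon position
        totals[codon_pos] += 1
        # A column is conserved when all sequences share one nucleotide.
        if len({seq[col] for seq in alignment}) == 1:
            conserved[codon_pos] += 1
    return [c / t for c, t in zip(conserved, totals)]


# Made-up toy sequences (NOT real glmS data). Each one encodes Gly-Thr-Glu
# using different synonymous codons, so only third positions differ.
toy_coding_alignment = [
    "GGTACCGAA",
    "GGCACGGAA",
    "GGAACTGAG",
    "GGGACAGAA",
]
fractions = conserved_fraction_by_codon_position(toy_coding_alignment)
```

For these toy sequences the first and second codon positions are fully conserved and the third is not, which mirrors the pattern reported for the attTn7 logos. A stricter analysis could replace the all-or-nothing test with a per-column information score.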



Figure 9. A. Multiple alignment of glmS coding sequences from organisms of all kingdoms. B. Logo obtained for the attTn7 site. The translation of the logo nucleotide sequence is shown below.



Figure 10. Comparison of the attTn7 nucleotide logos obtained. A. Logo representing organisms that carry a Tn7 family transposon. B. Logo calculated from all the sequences shown in figure 9.A (sequences distributed across the whole phylogeny).