Team:Calgary/Notebook/Calendar/Week3
From 2011.igem.org
Emily Hicks (Talk | contribs) |
Revision as of 21:26, 27 September 2011
Bioinformatics
An Unusual Interaction
Recently, David Lloyd, a TA, came across an interesting study that analyzed the capacity of bacteria to degrade naphthenic acids. According to the study, Pseudomonas putida and Pseudomonas fluorescens can each degrade small amounts of naphthenic acids, but when grown together in co-culture, their capacity increases to 95% elimination. What's more, this degradation was effective across a broad spectrum of naphthenic acids, including those with one, two, and even three rings. The inference we drew from this effect is that each bacterium carries some unique genes that, when the two species are allowed to interact, are responsible for the degradation of naphthenic acids. In a sense, one bacterium's garbage is another bacterium's treasure. This project's hypothesis is that a bioinformatic survey can narrow down the candidates for this pathway.
Strategy for the Bioinformatic Survey
The goal of the bioinformatic survey is to provide leads to the experimental side (or wet lab) of the project. Two assumptions were made at the beginning of the survey. The first is that the two genomes are homologous enough to eliminate a substantial portion of each genome from consideration. The second is that the gene of interest is located within the non-homologous regions of either genome. If both assumptions are correct, then it should be possible to create a short list of candidate genes involved in the degradation pathway. Knowing which genes are involved in the degradation means that the wet lab can simply look upstream of them for a naphthenic acid promoter.
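The shortlisting idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline the team ran: the gene names, the coordinates, the half-open `(start, end)` interval convention, and the `max_covered_fraction` cutoff are all hypothetical. A gene stays on the shortlist when at most that fraction of its length falls inside regions that aligned to the other genome.

```python
def covered_length(gene, aligned):
    """Number of bases of `gene` that fall inside any aligned interval.

    Intervals are half-open (start, end) pairs; overlapping aligned
    intervals are only counted once.
    """
    start, end = gene
    covered = 0
    cursor = start  # everything left of cursor has already been accounted for
    for a_start, a_end in sorted(aligned):
        lo = max(a_start, cursor)
        hi = min(a_end, end)
        if hi > lo:
            covered += hi - lo
            cursor = hi
    return covered


def shortlist(genes, aligned, max_covered_fraction=0.5):
    """Names of genes lying mostly outside the aligned (homologous) regions.

    The 0.5 default is an arbitrary illustrative cutoff.
    """
    result = []
    for name, (start, end) in genes.items():
        fraction = covered_length((start, end), aligned) / (end - start)
        if fraction <= max_covered_fraction:
            result.append(name)
    return result


# Hypothetical coordinates, for illustration only.
genes = {"geneA": (0, 100), "geneB": (200, 300), "geneC": (400, 500)}
aligned = [(0, 90), (50, 120), (280, 420)]
print(shortlist(genes, aligned))
```

In this toy input, geneA is fully covered by aligned regions and is dropped, while geneB and geneC are each only 20% covered and remain as candidates.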
June 27- July 1, 2011
My initial thinking is that both genomes consist of two parts: parts that are homologous to the other genome, and parts that aren't. Eliminating the parts that are would, by definition, immediately reveal the parts that aren't. Since homology tends to be between similar sequences rather than exact matches, a statistical approach could be used that decides whether two sequences are homologous based on the significance of their similarity.
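As a rough illustration of what "significance" could mean here, the sketch below compares the identity of two equal-length sequences against the identity obtained after shuffling one of them many times. This is a simple permutation test that I am assuming for illustration; real tools score gapped alignments and report E-values instead. The function names, the number of shuffles, and the fixed seed are all hypothetical choices.

```python
import random


def identity(a, b):
    """Fraction of positions at which two equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def empirical_p_value(a, b, n_shuffles=200, seed=0):
    """Estimate how often a shuffled copy of b matches a as well as b does.

    A small value suggests the observed similarity is unlikely to arise
    from base composition alone.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = identity(a, b)
    shuffled = list(b)
    at_least_as_good = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        if identity(a, shuffled) >= observed:
            at_least_as_good += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (at_least_as_good + 1) / (n_shuffles + 1)


seq = "ACGTACGGTCAATGCCGTAGGCTTAACGTAGC"
print(identity("ACGT", "ACGA"))
print(empirical_p_value(seq, seq))
```

A pair would then be called homologous only if its p-value falls below some chosen threshold; picking that threshold is a judgment call that this sketch does not settle.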
Patrick Wu, my colleague, started off by looking for software that we could reuse for our application. DNA manipulation is sufficiently complicated that it makes no sense to reinvent the wheel; after some digging around, he found an open-source DNA alignment tool called MUMmer. MUMmer uses a suffix tree (a type of data structure), which can be built in O(n) time, to rapidly align whole genomes; in the process, MUMmer provides information on single nucleotide changes, translocations, and homologous/similar genes.
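The "MUM" in MUMmer stands for maximal unique match: a stretch of sequence that occurs exactly once in each genome and cannot be extended in either direction. The sketch below finds such matches by brute force, purely to make the definition concrete. It is not MUMmer's algorithm (it runs in quadratic time or worse rather than using a suffix tree), and the function names and the `min_len` default are my own assumptions.

```python
def count_occurrences(text, pattern):
    """Count occurrences of pattern in text, allowing overlaps."""
    count = 0
    pos = text.find(pattern)
    while pos != -1:
        count += 1
        pos = text.find(pattern, pos + 1)
    return count


def naive_mums(a, b, min_len=4):
    """Return (pos_in_a, pos_in_b, length) for each maximal unique match."""
    mums = []
    for i in range(len(a)):
        for j in range(len(b)):
            if a[i] != b[j]:
                continue
            # Skip pairs that could be extended to the left.
            if i > 0 and j > 0 and a[i - 1] == b[j - 1]:
                continue
            # Extend to the right as far as the sequences agree.
            length = 0
            while (i + length < len(a) and j + length < len(b)
                   and a[i + length] == b[j + length]):
                length += 1
            if length < min_len:
                continue
            match = a[i:i + length]
            # Keep the match only if it is unique in both sequences.
            if count_occurrences(a, match) == 1 and count_occurrences(b, match) == 1:
                mums.append((i, j, length))
    return mums


print(naive_mums("GATTACAGG", "CCGATTACATT"))  # [(0, 2, 7)]
```

On these two toy sequences the only match of length 4 or more is GATTACA, found at position 0 in the first sequence and position 2 in the second. A repeated region such as ACGT in "ACGTACGT" is rejected because it is not unique.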
Currently, we are looking into how MUMmer can be used to measure the homology between the two genomes. If it turns out that the genomes are not homologous enough, then some other criterion must be used to narrow down the list of unique genes.
July 4-8, 2011
This week, Patrick went back to working on the Wiki, and I started to develop software for performing the big computation. My hope was to develop an original application built from several modules, which would process the entire genome. Module 1 would read and transcribe each of the genes in the entire genome. Module 2 would "compare the genes" and eliminate similar genes, and Module 3 would get the identity of each gene and sort them in some relevant order. However, my program for discarding junk DNA in Module 1 was not working as well as I had hoped; it identified over 150 genes in mumps virus, when there are only 7 according to BLAST. In the process of developing this software, I learned more about evolution and homology, horizontal gene transfer, the nitty-gritty details of transcription, and codon tables.
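For illustration, a gene-finding step of the kind Module 1 needs can be sketched as a scan for open reading frames (ORFs): an ATG start codon followed in the same frame by a stop codon. This is not the program described above, just a minimal sketch that assumes the standard stop codons, looks at the forward strand only, and uses a hypothetical `min_codons` filter. Without such a filter, a naive scan reports every short start-to-stop stretch as a gene, which is one common way to over-count; whether that was the cause of the mumps virus result is not known.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def find_orfs(seq, min_codons=0):
    """Return (start, end, frame) for each ATG...stop ORF on the forward strand.

    `end` is exclusive and includes the stop codon; `min_codons` counts
    codons from the ATG up to, but not including, the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos
            elif start is not None and codon in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3, frame))
                start = None
    return orfs


toy = "ATGAAACCCGGGTAAATGTAG"
print(find_orfs(toy))                # [(0, 15, 0), (15, 21, 0)]
print(find_orfs(toy, min_codons=3))  # [(0, 15, 0)]
```

The toy sequence contains one four-codon ORF and one trivial ATG-TAG pair; raising `min_codons` removes the trivial one. A fuller version would also scan the reverse complement.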
July 11-15, 2011
This week, I realized that any organism with a sufficient number of horizontally transferred genes would, by definition, have genes that are non-homologous to those of other species within the same genus. Horizontal transfer is a process by which bacteria share genetic information with dissimilar species. Therefore, one way to find the naphthenic acid promoter is to look for genes that were horizontally transferred.
In the process of looking up how to identify such a gene, I encountered a wonderful program called DarkHorse, which uses an algorithm for finding phylogenetically atypical proteins among bacterial strains on a genome-wide basis - in other words, horizontally transferred genes. I looked up Pseudomonas putida and Pseudomonas fluorescens and instantly obtained a list of 200 such genes. We are now cataloguing the bacterial species in this list, in order to create a comprehensive list of genes to test in the lab.