Team:CBNU-Korea/Methods/Stat

From 2011.igem.org

Revision as of 21:33, 5 October 2011


We computed basic statistics and drew graphs to characterize several parameters across 15 species. After classifying the genes by EG (essential gene) status, strand, and direction, we calculated the mean and standard deviation of each group for the 15 species. The group means were similar, with no significant difference. The standard deviations also showed no significant difference, so the groups can be regarded as satisfying homoscedasticity. A test for a difference in the mean of DistToOri (distance to the replication origin) was therefore not informative, and we turned to a frequency analysis by species instead.

[Figure: https://static.igem.org/mediawiki/2011/4/4a/Synb_Stat_001.png]
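The group comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the team's actual script: the record fields (`essential`, `strand`, `direction`, `dist_to_ori`) and the toy values are hypothetical, and the source does not say which software or which formal test was used. The sketch only produces the per-group mean and sample standard deviation that such a comparison starts from.

```python
from collections import defaultdict
from statistics import mean, stdev


def summarize_by_group(records):
    """Return {(essential, strand, direction): (n, mean, sd)} for DistToOri.

    All field names are hypothetical; they stand in for whatever columns
    the real data set uses.
    """
    groups = defaultdict(list)
    for rec in records:
        key = (rec["essential"], rec["strand"], rec["direction"])
        groups[key].append(rec["dist_to_ori"])

    summary = {}
    for key, values in groups.items():
        # The sample standard deviation needs at least two observations.
        sd = stdev(values) if len(values) > 1 else 0.0
        summary[key] = (len(values), mean(values), sd)
    return summary


# Toy records (made up for illustration, not taken from DEG).
toy_records = [
    {"essential": True, "strand": "+", "direction": "leading", "dist_to_ori": 0.10},
    {"essential": True, "strand": "+", "direction": "leading", "dist_to_ori": 0.30},
    {"essential": False, "strand": "-", "direction": "lagging", "dist_to_ori": 0.20},
    {"essential": False, "strand": "-", "direction": "lagging", "dist_to_ori": 0.40},
    {"essential": False, "strand": "-", "direction": "lagging", "dist_to_ori": 0.60},
]

group_summary = summarize_by_group(toy_records)
for group, (n, avg, sd) in sorted(group_summary.items()):
    print(group, n, round(avg, 3), round(sd, 3))
```

Placing the group means and standard deviations side by side is only the descriptive step; a formal conclusion about equal means or equal variances would still need a test chosen for the data.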

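To find out the characteristics of each species, we drew the following histograms.
-The frequency of genes by scale
[Figure: https://static.igem.org/mediawiki/2011/9/94/Synb_Stat_002.png]

-The frequency by scale for each strand and direction
[Figure: https://static.igem.org/mediawiki/2011/2/25/Synb_Stat_003.png]

-The frequency of genes in two groups (leading, lagging)
[Figure: https://static.igem.org/mediawiki/2011/9/9c/Synb_Stat_004.png]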
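A frequency table like the ones behind these histograms can be sketched as below. The sketch assumes that "scale" means the distance to the replication origin expressed in steps of 10 percent of the genome size, and that the genome is circular; both are assumptions, and the function names and toy coordinates are hypothetical.

```python
def dist_to_origin(position, origin, genome_length):
    """Shortest distance along a circular genome from a gene to the origin."""
    d = abs(position - origin) % genome_length
    return min(d, genome_length - d)


def frequency_by_scale(positions, origin, genome_length):
    """Count genes in bins of 10% of genome size by distance to origin.

    On a circular genome the distance is at most half the genome length,
    so there are five bins: [0, 10%), [10%, 20%), ..., [40%, 50%].
    """
    counts = [0] * 5
    for pos in positions:
        d = dist_to_origin(pos, origin, genome_length)
        # Integer arithmetic avoids floating-point edge cases at bin borders;
        # a gene exactly opposite the origin is placed in the last bin.
        idx = min(d * 10 // genome_length, 4)
        counts[idx] += 1
    return counts


# Toy example: a 1000 bp circular genome with the origin at position 0.
toy_positions = [50, 950, 120, 400, 500, 999]
toy_counts = frequency_by_scale(toy_positions, origin=0, genome_length=1000)
print(toy_counts)  # [3, 1, 0, 0, 2]
```

Calling `frequency_by_scale` once per group (for example per strand, per direction, or for leading versus lagging genes) gives the counts for the split histograms.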


Before estimating the distribution, we examined the transformed data sets of the 8 Gamma (Gammaproteobacteria) entries and computed their basic statistics and graphs. We combined all of the Gamma data into one data set (gamma_all): NC_000907 (Haemophilus influenzae Rd KW20), NC_000913 (Escherichia coli MG1655), NC_002505 (Vibrio cholerae N16961), NC_002506 (Vibrio cholerae N16961), NC_003197 (Salmonella typhimurium LT2), NC_004631 (Salmonella enterica serovar Typhi), NC_005966 (Acinetobacter baylyi ADP1), and NC_008463 (Pseudomonas aeruginosa UCBPP-PA14). NC_002505 and NC_002506 are the two chromosomes of Vibrio cholerae N16961.

[Figure: https://static.igem.org/mediawiki/2011/6/6f/Synb_Stat_005.png]
[Figure: https://static.igem.org/mediawiki/2011/d/d5/Synb_Stat_006.png]
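Combining the per-accession tables into gamma_all could look like the sketch below. The accession numbers and organism names are taken from the list above; the record layout, the `merge_datasets` helper, and the toy gene rows are hypothetical and only illustrate the idea of tagging each row with its source before pooling.

```python
GAMMA_ACCESSIONS = {
    "NC_000907": "Haemophilus influenzae Rd KW20",
    "NC_000913": "Escherichia coli MG1655",
    "NC_002505": "Vibrio cholerae N16961",
    "NC_002506": "Vibrio cholerae N16961",
    "NC_003197": "Salmonella typhimurium LT2",
    "NC_004631": "Salmonella enterica serovar Typhi",
    "NC_005966": "Acinetobacter baylyi ADP1",
    "NC_008463": "Pseudomonas aeruginosa UCBPP-PA14",
}


def merge_datasets(per_accession):
    """Pool per-accession gene records into one list, tagging each row.

    `per_accession` maps an accession number to a list of gene records
    (dicts). The record fields are hypothetical.
    """
    merged = []
    for accession, genes in per_accession.items():
        if accession not in GAMMA_ACCESSIONS:
            raise KeyError(f"not a Gamma accession: {accession}")
        for gene in genes:
            row = dict(gene)  # copy so the input records are left unchanged
            row["accession"] = accession
            row["organism"] = GAMMA_ACCESSIONS[accession]
            merged.append(row)
    return merged


# Toy input (made-up rows, two accessions only).
toy_input = {
    "NC_000913": [{"gene": "g1", "essential": True}],
    "NC_002505": [{"gene": "g2", "essential": False},
                  {"gene": "g3", "essential": True}],
}
gamma_all = merge_datasets(toy_input)
print(len(gamma_all))  # 3
```

Keeping the accession on every row means the pooled data set can still be split back by genome when the proportions are compared.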



To see whether the proportion of essential genes differs among the Gamma species, we performed a hypothesis test. The hypotheses are as follows.