Team:Harvard/Template:NotebookData3

From 2011.igem.org

(Difference between revisions)
Line 429: Line 429:
|}
|}
*Green cells are our target sequences.</div>
*Green cells are our target sequences.</div>
-
<div id="624" style="display:none">
 
-
==June 24==
 
-
*Designed a primer for testing the HisB deletion; we will reuse His_Internal_R to check the band
 
-
 
-
===Updated Closest Zif268 Fingers===
 
-
We realized that some of our "close non-zif268 fingers" were actually not all that close to Zif268, so we went into the 88,000 zinc finger database and pulled out the zinc fingers surrounding Zif268. Many zinc fingers had sequences identical to the Zif268 F2 finger, so we looked at the sequences around it. The tree below compares the new non-Zif268 backbones, which are actually close to Zif268, with our old set. The new set is in gray, the old set is in black. This gives us potentially seven more backbones to work with.
 
-
[[File:ComparisonTree.png]]
 
-
==June 24th - Bioinformatics==
 
-
 
-
===Sequence Generation===
 
-
We made some small updates to the sequence generator, based on the frequencies we noticed in the outputs of the tests we ran.
 
-
*We decided to include pseudocounts only at position 6 for 'CNN' and 'ANN'. Originally, 'CNN' and 'ANN' used pseudocounts at all seven positions. However, this noticeably increased the frequency of amino acids, such as tyrosine (Y), that occur rarely in zinc fingers (according to our data from OPEN and Persikov). Additionally, because tyrosine occurred so rarely in the data (11 times total in the OPEN data set), we decided not to give tyrosine a pseudocount.
 
-
*We added the capability to prevent repeat backbone-helix combinations on the chip. That is, we wanted to make sure that the same exact zinc finger was not generated for different triplet inputs.
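A minimal sketch of how such a repeat check could work, assuming each finger is identified by its (backbone, helix) pair. The class name `FingerRegistry`, the method `try_add`, and the backbone and helix strings are hypothetical placeholders, not taken from our generator.

```python
class FingerRegistry:
    """Tracks (backbone, helix) pairs that are already on the chip."""

    def __init__(self):
        self._seen = set()      # every (backbone, helix) pair accepted so far
        self.by_triplet = {}    # triplet -> list of accepted pairs

    def try_add(self, triplet, backbone, helix):
        """Record the finger for `triplet`; return False if it is a repeat."""
        key = (backbone, helix)
        if key in self._seen:
            return False        # same exact finger already exists for some triplet
        self._seen.add(key)
        self.by_triplet.setdefault(triplet, []).append(key)
        return True


registry = FingerRegistry()
# Placeholder backbone names and helix sequence, for illustration only.
first = registry.try_add("GAA", "backbone_A", "QSSNLVR")
repeat = registry.try_add("CTC", "backbone_A", "QSSNLVR")   # rejected
other = registry.try_add("CTC", "backbone_B", "QSSNLVR")    # new backbone, accepted
```

When `try_add` returns False, a generator could simply draw another helix until the pair is new.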
 
-
 
-
 
-
To test the sequence generator, we made two sets of 2000 sequences for GAA, then made infographics of the results. Comparing these with the images for OPEN and OPEN+Persikov shows that our generation follows the major themes of those datasets but also introduces variation. The two generated sets also vary slightly from each other, which shows the influence of randomness on the generation.
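The role of randomness can be illustrated with a small sketch, assuming the generator draws each of the seven helix positions independently from a per-position frequency table. The table below is a toy placeholder (not OPEN or Persikov data), and `sample_helix` and `generate_set` are hypothetical names.

```python
import random


def sample_helix(freq_table, rng):
    """Draw one helix by sampling each position from its frequency dict."""
    residues = []
    for position_freqs in freq_table:
        letters = list(position_freqs)
        weights = [position_freqs[aa] for aa in letters]
        residues.append(rng.choices(letters, weights=weights, k=1)[0])
    return "".join(residues)


def generate_set(freq_table, n, seed):
    """Generate n helices with a seeded random number generator."""
    rng = random.Random(seed)
    return [sample_helix(freq_table, rng) for _ in range(n)]


# Toy table with seven positions; the frequencies are made up for illustration.
toy_table = [
    {"Q": 0.6, "R": 0.4},
    {"S": 0.7, "G": 0.3},
    {"S": 0.5, "D": 0.5},
    {"N": 0.8, "H": 0.2},
    {"L": 1.0},
    {"V": 0.6, "T": 0.4},
    {"R": 0.9, "K": 0.1},
]

round_1 = generate_set(toy_table, 2000, seed=1)
round_2 = generate_set(toy_table, 2000, seed=2)
```

Two rounds with different seeds follow the same table, so their letter frequencies are similar but not identical, which is the kind of round-to-round variation described above.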
 
-
 
-
{|
 
-
| [[File:GAA_generated_round_1.png|thumb|left|Round 1 of generating sequences for GAA with the program.]]
 
-
| [[File:GAA_generated_round_2.png|thumb|left|Round 2 of generating sequences for GAA with the program.]]
 
-
|-
 
-
| [[File:GAA_open_and_persikov.png|thumb|left|GAA sequences from Persikov and OPEN datasets.]]
 
-
| [[File:GAA_open_only.png|thumb|left|GAA sequences from the OPEN dataset.]]
 
-
|}
 
-
 
-
{| class="wikitable" border="3" cellpadding="5"
 
-
| align="center" style="background:#f0f0f0;"|'''Disease'''
 
-
| align="center" style="background:#f0f0f0;"|'''Target DNA Finger 1'''
 
-
| align="center" style="background:#f0f0f0;"|'''Helices in Zif268 Backbone'''
 
-
| align="center" style="background:#f0f0f0;"|'''Helices in Zif268 Closely-Related Backbones'''
 
-
| align="center" style="background:#f0f0f0;"|'''Helices in Zif268 Distantly-Related Backbones'''
 
-
|-
 
-
| Colorblindness ''(Bottom)''||TGG||5150||3000||1000
 
-
|-
 
-
| Colorblindness ''(Top)''||ATG||3050||3050||3050
 
-
|-
 
-
| Familial Hypercholesterolemia ''(Bottom)''||GAC||5150||3000||1000 
 
-
|-
 
-
| Familial Hypercholesterolemia ''(Top)''||CTG||3050||3050||3050
 
-
|-
 
-
| Myc ''(Top<sub>198</sub>)''||CTC||3050||3050||3050
 
-
|-
 
-
| Myc ''(Top<sub>981</sub>)''||AAA||3050||3050||3050
 
-
|-
 
-
|}
 
-
 
-
Table of target sequences and helix distribution across backbones
 
-
 
-
*Distribution: Zif268 : Zif268 similar : Zif268 dissimilar
 
-
**Conservative distribution 56.3 : 32.8 : 10.9
 
-
**Riskier distribution 33.3 : 33.3 : 33.3
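The percentages above follow from the helix counts in the table. A short check, where the function name `distribution` is only illustrative:

```python
def distribution(counts):
    """Convert helix counts per backbone class into percentages of the total."""
    total = sum(counts)
    return [round(100 * c / total, 1) for c in counts]


# Counts per row: Zif268, closely related, distantly related backbones.
conservative = distribution([5150, 3000, 1000])
riskier = distribution([3050, 3050, 3050])

print(conservative)  # [56.3, 32.8, 10.9]
print(riskier)       # [33.3, 33.3, 33.3]
```

Both splits use the same total of 9150 helices per target; only the allocation across backbone classes differs.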
 
-
 
-
==June 24th==
 
-
'''pZE21G:'''
 
-
*reinoculated culture with 100µL of saturated solution, grew to mid-log, and made glycerol stock
 
-
*backbone PCR: ran an E-Gel but saw no bands, so the PCR was unsuccessful. We may need to use a different backbone for the zinc fingers.
 
-
 
-
'''Omega and Omega+Zif268:'''
 
-
*these were the only two PCR reactions from 6/22/11 to work
 
-
*PCR purified using Qiagen kit:
 
-
**omega: 6.1 ng/µL, 260/280=1.83
 
-
**omega+Zif268: 11.3 ng/µL, 260/280=1.67
 
-
 
-
'''Lambda red recombination of selection system:'''
 
-
*reinoculated selection strain+pKD46 with 100µL of saturated solution
 
-
*just before mid-log (about 4 hours after inoculation) divided culture in half (1.5mL) and added either 37.5µL or 3.75µL of 20% arabinose solution (to try two different induction levels). Cultures grew for another hour.
 
-
*The rest of the procedure was the same as the 6/22/11 attempt but without the 42 °C water bath.
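For reference, the two induction levels can be estimated with a quick dilution calculation. This sketch assumes each half of the culture was 1.5 mL and that the added arabinose volume counts toward the final volume; `final_percent` is a hypothetical helper name.

```python
def final_percent(stock_percent, added_ul, culture_ul):
    """Final concentration (%) after adding stock solution to a culture."""
    return stock_percent * added_ul / (culture_ul + added_ul)


# Assumption: each half culture is 1.5 mL (1500 µL); the stock is 20% arabinose.
high = final_percent(20.0, 37.5, 1500.0)
low = final_percent(20.0, 3.75, 1500.0)

print(f"{high:.3f}% and {low:.3f}%")  # 0.488% and 0.050%
```

Under these assumptions the two conditions are roughly 0.5% and 0.05% arabinose, about a tenfold difference.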
 
-
 
-
==June 24th - Bioinformatics==
 
-
===Playing with Pseudocounts===
 
-
 
-
Using CTC because position 6 relies on the CNN frequencies, we looked at what difference the pseudocount value makes. If the frequency of an amino acid in the frequency table is 0, we bump it up to the pseudocount: for example, A = 0 becomes A = .015 with a pseudocount of .015. Pseudocounts are necessary for data with a small sample size: we could be missing out on working helices because a letter's frequency is 0 when it shouldn't be.
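A minimal sketch of the pseudocount step at a single position. The renormalization to a total of 1 and the `exclude` option (for example, to leave tyrosine at 0) are assumptions for illustration; the function name `apply_pseudocount` and the toy frequencies are hypothetical and do not come from our data.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def apply_pseudocount(freqs, psu, exclude=()):
    """Raise zero frequencies to `psu`, then renormalize so they sum to 1."""
    floored = {}
    for aa in AMINO_ACIDS:
        f = freqs.get(aa, 0.0)
        if f == 0 and aa not in exclude:
            f = psu             # bump unseen amino acids up to the pseudocount
        floored[aa] = f
    total = sum(floored.values())
    return {aa: f / total for aa, f in floored.items()}


# Toy position-6 frequencies with four observed letters (made-up values).
observed = {"E": 0.4, "A": 0.3, "D": 0.2, "Q": 0.1}
smoothed = apply_pseudocount(observed, psu=0.015, exclude=("Y",))
unsmoothed = apply_pseudocount(observed, psu=0.0)
```

With `psu=0.0` only the four observed letters keep a nonzero probability, while any positive `psu` gives every other non-excluded letter a small chance of being drawn.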
 
-
 
-
Various pseudocount (psu = ) values. Look at the 7th column, which is position 6 in the helix:
 
-
 
-
{|
 
-
| [[File:CTC_0.png|thumb|left|psu = 0]]
 
-
| [[File:CTC_.005_psuedo.png|thumb|left|psu = .005]]
 
-
| [[File:CTC_.008_psuedo.png|thumb|left|psu = .008]]
 
-
|-
 
-
| [[File:CTC_.01.png|thumb|left|psu = .01]]
 
-
| [[File:CTC_.015_psuedo.png|thumb|left|psu = .015]]
 
-
| [[File:CTC_.02_psuedo.png|thumb|left|psu = .02]]
 
-
|}
 
-
 
-
The top letter switches from E to A and back to E across the images because of a slight adjustment in how we add pseudocounts: the 'new' way is a more proportional approach.
 
-
 
-
Notice how psu = 0 gives only the four letters found in our dataset, while psu > 0 adds in other letters, each with a small probability ranging from .5% to 2%.
 
-
 
-
The question is how much psu to add: a lower psu means we weight our (possibly flawed) data of proven zinc fingers more heavily. A higher psu adds more randomness (variation) to our sequences, but some fraction of those sequences will not work.
 
-
 
-
'''List of Remaining Goals:'''
 
-
*Sort fingers by target
 
-
*Pick and assign primer sets
 
-
*Reverse translate fingers, avoiding type II restriction enzyme sites and primer sequences
 
-
*Append type II restriction enzyme and primer sequences to each finger
 
-
*Yay</div>
 
-
<div id="625" style="display:none">
 
-
==June 25th-26th - Bioinformatics==
 
-
 
-
This is the set of final target sequences with assigned forward and reverse primers (tags for PCR):
 
-
{| class="wikitable" cellpadding="5"
 
-
| align="center" style="background:#f0f0f0;"|'''Disease'''
 
-
| align="center" style="background:#f0f0f0;"|'''Target Sequence'''
 
-
| align="center" style="background:#f0f0f0;"|'''Forward Primer (5'-3' NOT REVERSE COMPLEMENT)'''
 
-
| align="center" style="background:#f0f0f0;"|'''Reverse Primer (5'-3' NOT REVERSE COMPLEMENT)'''
 
-
|-
 
-
| Colorblindness||GCT GGC TGG||ATATAGATGCCGTCCTAGCG||AAGTATCTTTCCTGTGCCCA
 
-
|-
 
-
| Colorblindness||GCG GTA ATG||CCCTTTAATCAGATGCGTCG||TGGTAGTAATAAGGGCGACC
 
-
|-
 
-
| Familial Hypercholesterolemia||GGC TGA GAC||TTGGTCATGTGCTTTTCGTT||AGGGGTATCGGATACTCAGA
 
-
|-
 
-
| Familial Hypercholesterolemia||GGA GTC CTG||GGGTGGGTAAATGGTAATGC||ATCGATTCCCCGGATATAGC
 
-
|-
 
-
| Myc-gene Cancer||GGC TGA CTC||TCCGACGGGGAGTATATACT||TACTAACTGCTTCAGGCCAA
 
-
|-
 
-
| Myc-gene Cancer||GGC TGG AAA||CATGTTTAGGAACGCTACCG||AATAATCTCCGTTCCCTCCC
 
-
|}
 
-
 
-
 
-
Additionally, primer tags '''(forward: GTACATGAAACGATGGACGG, reverse: CTGGTATAGTCTCCTCAGCG)''' will be assigned to the 100 control sequences.</div>
 

Revision as of 23:51, 2 August 2011