Team:Harvard/Template:NotebookData4

From 2011.igem.org

(Difference between revisions)
(June 24th - Bioinformatics)
 
(8 intermediate revisions not shown)
Line 2: Line 2:
==June 24==
==June 24==
*Designed primer for testing HisB deletion, reuse His_Internal_R to test the band
*Designed primer for testing HisB deletion, reuse His_Internal_R to test the band
 +
 +
'''pZE21G:'''
 +
*reinoculated culture with 100µL of saturated solution, grew to mid-log, and made glycerol stock
 +
*backbone PCR: ran E gel but no bands--PCR unsuccessful. We may need to use a different backbone for the zinc fingers.
 +
 +
'''Omega and Omega+Zif268:'''
 +
*these were the only two PCR reactions from 6/22/11 to work
 +
*PCR purified using Qiagen kit:
 +
**omega: 6.1ng/µL, 260/280=1.83
 +
**omega+Zif268: 11.3 ng/µL, 260/280=1.67
 +
 +
'''Lambda red recombination of selection system:'''
 +
*reinoculated selection strain+pKD46 with 100µL of saturated solution
 +
*just before mid-log (about 4 hours after inoculation) divided culture in half (1.5mL) and added either 37.5µL or 3.75µL of 20% arabinose solution (to try two different induction levels). Cultures grew for another hour.
 +
*The rest of the procedure was the same as the 6/22/11 attempt but without the 42C water bath.
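
As a rough check on the two induction levels, the final arabinose concentration in each culture can be worked out from the volumes above. This is only a sketch: it assumes each half of the divided culture is 1.5 mL, that the 20% stock is added directly to it, and that volumes are additive. The function name is ours, for illustration.

```python
def final_percent(stock_percent, added_ul, culture_ul):
    """Final concentration (%) after adding stock to a culture, assuming additive volumes."""
    return stock_percent * added_ul / (culture_ul + added_ul)

culture_ul = 1500.0  # assumed: each half of the divided culture is 1.5 mL

high = final_percent(20.0, 37.5, culture_ul)  # 37.5 uL of 20% arabinose
low = final_percent(20.0, 3.75, culture_ul)   # 3.75 uL of 20% arabinose

print(f"high induction: {high:.3f}%")
print(f"low induction:  {low:.3f}%")
```

Under these assumptions the two cultures end up at roughly 0.49% and 0.05% arabinose, about a ten-fold difference in induction level.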
 +
 +
==June 24th - Bioinformatics==
 +
===Playing with Pseudocounts===
 +
 +
Using CTC because of position 6's reliance on the CNN frequencies, we see what difference various pseudocount values make (if the frequency of an amino acid in the frequency table is 0, bump it up to the pseudocount: e.g. A = 0 becomes A = .015 with a pseudocount of .015). Pseudocounts are necessary for data with a small sample size: we could be missing out on working helices because a letter's frequency is 0 when it shouldn't be.
 +
 +
Various pseudocount (psu = ) values. Look at the 7th column, which is position 6 in the helix:
 +
 +
{|
 +
| [[File:HARVCTC_0.png|thumb|left|psu = 0]]
 +
| [[File:HARVCTC_.005_psuedo.png|thumb|left|psu = .005]]
 +
| [[File:HARVCTC_.008_psuedo.png|thumb|left|psu = .008]]
 +
|-
 +
| [[File:HARVCTC_.01.png|thumb|left|psu = .01]]
 +
| [[File:HARVCTC_.015_psuedo.png|thumb|left|psu = .015.]]
 +
| [[File:HARVCTC_.02_psuedo.png|thumb|left|psu = .020.]]
 +
|}
 +
 +
The variation from E being the top letter, to A being top, and back to E comes from a slight adjustment in how we add on pseudocounts: the 'new' way is a more proportional approach.
 +
 +
Notice how psu = 0 gives only the four letters found in our dataset, while psu > 0 adds in other letters, each with a small probability ranging from .5% to 2%.
 +
 +
The question is how much psu to add: less means we weight our (possibly flawed) data of proven zinc fingers more. Higher psu adds more randomness (variation) to our sequences, but some fraction of those sequences will not work.
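
The basic pseudocount rule described above (raise every zero frequency to psu) can be sketched as follows. The observed frequencies are made-up numbers for a single helix position, and the rescaling step that makes the column sum to 1 again is our assumption; this is not the 'new' proportional method, whose details are not recorded here.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def apply_pseudocount(freqs, psu):
    """Raise zero frequencies to psu, then rescale so the column sums to 1."""
    bumped = {}
    for aa in AMINO_ACIDS:
        f = freqs.get(aa, 0.0)
        bumped[aa] = psu if f == 0 else f
    total = sum(bumped.values())
    return {aa: f / total for aa, f in bumped.items()}

# Hypothetical frequencies for one position: only four letters were observed.
observed = {"E": 0.40, "A": 0.35, "R": 0.15, "K": 0.10}

for psu in (0, 0.005, 0.015, 0.02):
    column = apply_pseudocount(observed, psu)
    nonzero = sum(1 for f in column.values() if f > 0)
    print(f"psu = {psu}: {nonzero} letters possible, unseen letter D = {column['D']:.4f}")
```

With psu = 0 only the four observed letters can be drawn; any psu > 0 makes all 20 letters possible, at the cost of slightly lowering the weight of the observed ones.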
===Updated Closest Zif268 Fingers===
===Updated Closest Zif268 Fingers===
We realized that some of our "close non-zif268 fingers" were actually not all that close to Zif268, and so we went into the 88,000 zinc finger database and pulled out zinc fingers surrounding zif268.  In fact, there were many, many, many zinc fingers that had identical sequences to the Zif268 F2 finger, and so we looked at sequences around it.  The tree below shows the new non-zif268 backbones that are actually close to zif268 compared to our old set.  The new set is in gray, the old set is in black.  This gives us a potential seven more backbones to work with.
We realized that some of our "close non-zif268 fingers" were actually not all that close to Zif268, and so we went into the 88,000 zinc finger database and pulled out zinc fingers surrounding zif268.  In fact, there were many, many, many zinc fingers that had identical sequences to the Zif268 F2 finger, and so we looked at sequences around it.  The tree below shows the new non-zif268 backbones that are actually close to zif268 compared to our old set.  The new set is in gray, the old set is in black.  This gives us a potential seven more backbones to work with.
[[File:HARVComparisonTree.png‎]]
[[File:HARVComparisonTree.png‎]]
-
==June 24th - Bioinformatics==
 
-
 
===Sequence Generation===
===Sequence Generation===
We made some small updates to the sequence generator, based on the frequencies we noticed in the outputs of the tests we ran.  
We made some small updates to the sequence generator, based on the frequencies we noticed in the outputs of the tests we ran.  
Line 50: Line 86:
**Conservative distribution 56.3 : 32.8 : 10.9
**Conservative distribution 56.3 : 32.8 : 10.9
**Riskier distribution 33.3 : 33.3 : 33.3
**Riskier distribution 33.3 : 33.3 : 33.3
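
One way to read the two distributions is as sampling weights over three choices. What the three choices are is defined in a part of the notebook not shown in this diff, so the labels below are placeholders, and the sampling function is an illustrative sketch rather than the generator's actual code.

```python
import random

CONSERVATIVE = (56.3, 32.8, 10.9)
RISKIER = (33.3, 33.3, 33.3)
LABELS = ("first", "second", "third")  # placeholder names for the three choices

def sample_choices(weights, n, seed=0):
    """Draw n labels with probability proportional to the given weights."""
    rng = random.Random(seed)
    return rng.choices(LABELS, weights=weights, k=n)

draws = sample_choices(CONSERVATIVE, 10000)
for label in LABELS:
    print(label, draws.count(label) / len(draws))
```

Swapping in RISKIER gives each choice about a third of the draws.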
-
 
-
==June 24th==
 
-
'''pZE21G:'''
 
-
*reinoculated culture with 100µL of saturated solution, grew to mid-log, and made glycerol stock
 
-
*backbone PCR: ran E gel but no bands--PCR unsuccessful. We may need to use a different backbone for the zinc fingers.
 
-
 
-
'''Omega and Omega+Zif268:'''
 
-
*these were the only two PCR reactions from 6/22/11 to work
 
-
*PCR purified using Qiagen kit:
 
-
**omega: 6.1ng/µL, 260/280=1.83
 
-
**omega+Zif268: 11.3 ng/µL, 260/280=1.67
 
-
 
-
'''Lambda red recombination of selection system:'''
 
-
*reinoculated selection strain+pKD46 with 100µL of saturated solution
 
-
*just before mid-log (about 4 hours after inoculation) divided culture in half (1.5mL) and added either 37.5µL or 3.75µL of 20% arabinose solution (to try two different induction levels). Cultures grew for another hour.
 
-
*The rest of the procedure was the same as the 6/22/11 attempt but without the 42C water bath.
 
-
 
-
==June 24th - Bioinformatics==
 
-
===Playing with Pseudocounts===
 
-
 
-
Using CTC because of position 6's reliance on the CNN frequencies, we see what difference various pseudocount values make (if the frequency of an amino acid in the frequency table is 0, bump it up to the pseudocount: e.g. A = 0 becomes A = .015 with a pseudocount of .015). Pseudocounts are necessary for data with a small sample size: we could be missing out on working helices because a letter's frequency is 0 when it shouldn't be.
 
-
 
-
Various pseudocount (psu = ) values. Look at the 7th column, which is position 6 in the helix:
 
-
 
-
{|
 
-
| [[File:HARVCTC_0.png|thumb|left|psu = 0]]
 
-
| [[File:HARVCTC_.005_psuedo.png|thumb|left|psu = .005]]
 
-
| [[File:HARVCTC_.008_psuedo.png|thumb|left|psu = .008]]
 
-
|-
 
-
| [[File:HARVCTC_.01.png|thumb|left|psu = .01]]
 
-
| [[File:HARVCTC_.015_psuedo.png|thumb|left|psu = .015.]]
 
-
| [[File:HARVCTC_.02_psuedo.png|thumb|left|psu = .020.]]
 
-
|}
 
-
 
-
The variation from E being the top letter, to A being top, and back to E comes from a slight adjustment in how we add on pseudocounts: the 'new' way is a more proportional approach.
 
-
 
-
Notice how psu = 0 gives only the four letters found in our dataset, while psu > 0 adds in other letters, each with a small probability ranging from .5% to 2%.
 
-
 
-
The question is how much psu to add: less means we weight our (possibly flawed) data of proven zinc fingers more. Higher psu adds more randomness (variation) to our sequences, but some fraction of those sequences will not work.
 
'''List of Remaining Goals:'''
'''List of Remaining Goals:'''
Line 97: Line 94:
*Yay</div>
*Yay</div>
<div id="625" style="display:none">
<div id="625" style="display:none">
 +
==June 25th-26th - Bioinformatics==
==June 25th-26th - Bioinformatics==
Line 156: Line 154:
===100 Control Sequences===
===100 Control Sequences===
-
* See our [[File:HARVPositive Control Sequences PostMacro.xlsx]], updated June 28th
+
* See our [https://static.igem.org/mediawiki/2011/5/5d/HARVPositive_Control_Sequences_PostMacro.pdf Positive Control Sequences], updated June 28th
* Selected known binding zinc fingers from the CODA table that bind sequences similar to our target sequences  
* Selected known binding zinc fingers from the CODA table that bind sequences similar to our target sequences  
* All control helices from CODA were inserted into Zif268 F2 backbones and have been assigned a seventh primer tag separate from the tags given to the 6 target sequences.
* All control helices from CODA were inserted into Zif268 F2 backbones and have been assigned a seventh primer tag separate from the tags given to the 6 target sequences.
Line 184: Line 182:
===Cut Site Design===
===Cut Site Design===
-
*See our [[Cut Site Design]] page
+
*See our [https://2011.igem.org/Team:Harvard/Cut_Site_Design Cut Site Design] page
*We left in one proline (P) between the linker and the starting FCQ... of finger 2. This proline is the last AA of the OPEN linker (TGEKP) and occurs before the beta sheet in every zinc finger in Zif268 (see zif268's sequence on its [http://www.pdb.org/pdb/explore/remediatedSequence.do?structureId=1AAY PDB page]).
*We left in one proline (P) between the linker and the starting FCQ... of finger 2. This proline is the last AA of the OPEN linker (TGEKP) and occurs before the beta sheet in every zinc finger in Zif268 (see zif268's sequence on its [http://www.pdb.org/pdb/explore/remediatedSequence.do?structureId=1AAY PDB page]).
*This configuration also allows the library to be used at any finger position because proline ends the OPEN linker.
*This configuration also allows the library to be used at any finger position because proline ends the OPEN linker.
Line 232: Line 230:
<font color=red>'''''Attention all Harvard iGEM-ers!!!'''''</font> <font color=blue> According to the [https://2011.igem.org/Main_Page iGEM Main Page], our preliminary project descriptions and safety proposals are due on</font> <font color=red>'''''July 15'''''</font>. <font color=blue> Please see the aforementioned link so we can get this done ASAP- we don't want to miss any deadlines and have all our hard work wasted!</font>  
<font color=red>'''''Attention all Harvard iGEM-ers!!!'''''</font> <font color=blue> According to the [https://2011.igem.org/Main_Page iGEM Main Page], our preliminary project descriptions and safety proposals are due on</font> <font color=red>'''''July 15'''''</font>. <font color=blue> Please see the aforementioned link so we can get this done ASAP- we don't want to miss any deadlines and have all our hard work wasted!</font>  
-
 
+
*Finalized our [https://static.igem.org/mediawiki/2011/5/5d/HARVPositive_Control_Sequences_PostMacro.pdf Positive Control Sequences], using Justin's macro to insert the F1 helices into the appropriate zif268 F2 backbone
-
*Finalized our [[File:HARVPositive Control Sequences PostMacro.xlsx|Positive Control Sequence Table]], using Justin's macro to insert the F1 helices into the appropriate zif268 F2 backbone
+
-
 
+
*Length of chip oligos: 131-140bp (based on [[Cut Site Design]])
*Length of chip oligos: 131-140bp (based on [[Cut Site Design]])
Line 242: Line 238:
**Type II binding/cut sites= 11bp on each side (22bp total)
**Type II binding/cut sites= 11bp on each side (22bp total)
**Standard length: 40 + 69 + 22 = 131bp
**Standard length: 40 + 69 + 22 = 131bp
-
 
*Use WebLogos as a final visual check of our final generated sequences
*Use WebLogos as a final visual check of our final generated sequences
-
 
===Plasmid and Oligo Design Schematics===
===Plasmid and Oligo Design Schematics===
Line 263: Line 257:
===Harvard Logo===
===Harvard Logo===
-
 
{|
{|
  | [[File:HARVHarvard_logo.png|thumb|left|]]
  | [[File:HARVHarvard_logo.png|thumb|left|]]
|}
|}
-
 
-
 
===Running the Generator!===
===Running the Generator!===
[[File:HARVFasta_total.csv]] NOTE: LATER GENERATED NEW SEQUENCES. NOT UP TO DATE.
[[File:HARVFasta_total.csv]] NOTE: LATER GENERATED NEW SEQUENCES. NOT UP TO DATE.
Line 276: Line 267:
**During this time, we created a function that will re-translate the sequences that the generator output. It compares the original helix with the re-translated helix to make sure that our reverse-translate works properly.
**During this time, we created a function that will re-translate the sequences that the generator output. It compares the original helix with the re-translated helix to make sure that our reverse-translate works properly.
***This step went smoothly, and we verified that the sequences were reverse-translated properly.
***This step went smoothly, and we verified that the sequences were reverse-translated properly.
-
**To make sure that the distributions generated were as expected, we made [[#Generated WebLogos for Final Chip|WebLogos]] of the helices generated (see below).
+
**To make sure that the distributions generated were as expected, we made WebLogos of the helices generated (see below).
*The output file (in the Dropbox: iGem > chip > final chip.csv) originally had the following headers: 'Target', 'Backbone #', 'Helix Sequence', 'Backbone Sequence', 'Nucleotide Sequence of Zinc Finger'
*The output file (in the Dropbox: iGem > chip > final chip.csv) originally had the following headers: 'Target', 'Backbone #', 'Helix Sequence', 'Backbone Sequence', 'Nucleotide Sequence of Zinc Finger'
**We wanted to convert this information into FASTA format.
**We wanted to convert this information into FASTA format.
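
The re-translation check and the FASTA conversion described in this list can be sketched together as below. The column headers are the ones listed above; everything else is an assumption made for illustration: the toy rows, the FASTA header layout, and the check itself, which simply translates the nucleotide sequence in frame 0 and looks for the helix in the result.

```python
import csv
import io
from itertools import product

# Standard genetic code, built in TCAG codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(dna):
    """Translate a DNA string codon by codon, starting at the first base."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

def round_trip_ok(helix, dna):
    """True if the helix is recovered when the DNA is translated back."""
    return helix in translate(dna)

def to_fasta(rows):
    """One FASTA record per row; the header layout here is a placeholder."""
    records = []
    for row in rows:
        header = f">{row['Target']}|backbone_{row['Backbone #']}|{row['Helix Sequence']}"
        records.append(header + "\n" + row["Nucleotide Sequence of Zinc Finger"])
    return "\n".join(records) + "\n"

# Made-up, shortened rows using the column headers of the output file.
example_csv = io.StringIO(
    "Target,Backbone #,Helix Sequence,Backbone Sequence,Nucleotide Sequence of Zinc Finger\n"
    "CTC,1,RSD,FQC,TTTCAGTGCCGTAGCGAT\n"
    "TGG,2,QSG,FQC,TTCCAATGTCAGTCTGGT\n"
)
rows = list(csv.DictReader(example_csv))

for row in rows:
    ok = round_trip_ok(row["Helix Sequence"], row["Nucleotide Sequence of Zinc Finger"])
    print(row["Target"], "round trip ok" if ok else "MISMATCH")

fasta = to_fasta(rows)
print(fasta)
```

For the real chip file, the same functions would be pointed at the exported CSV instead of the in-memory example.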
Line 295: Line 286:
  | [[File:HARVTGG.png|thumb|left|TGG]]
  | [[File:HARVTGG.png|thumb|left|TGG]]
|}
|}
-
 
*FASTA-Formatted Chip Data:
*FASTA-Formatted Chip Data:
Line 308: Line 298:
|}
|}
-
 
+
[[File:HARVPrimer_Index_iGEM_2011.xls]]
-
[[File:HARVPrimer Index_iGEM 2011]]
+
===Design of Plate Practice Sequences===
===Design of Plate Practice Sequences===
Line 328: Line 317:
The primer tag sequences for the 90 generated sequence subset will be the same as they are on the chip (for the sake of explanation, we will refer to them now as P1F and P1R in this paragraph).  The positive controls will be flanked immediately by the same primers as the generated subset so that we can amplify everything as one pool altogether should we need to (so this will be P1F and P1R).  However, we will also put an additional set of primers outside of the P1F/P1R primers for the positive controls so that we can specifically amplify the positive control subpool, should we want to.  These primers will be the same as the primers for the positive control on the chip (which will be called P2F and P2R here).
The primer tag sequences for the 90 generated sequence subset will be the same as they are on the chip (for the sake of explanation, we will refer to them now as P1F and P1R in this paragraph).  The positive controls will be flanked immediately by the same primers as the generated subset so that we can amplify everything as one pool altogether should we need to (so this will be P1F and P1R).  However, we will also put an additional set of primers outside of the P1F/P1R primers for the positive controls so that we can specifically amplify the positive control subpool, should we want to.  These primers will be the same as the primers for the positive control on the chip (which will be called P2F and P2R here).
-
 
To recap, on the chip we will have the following oligos :
To recap, on the chip we will have the following oligos :
Line 343: Line 331:
Oligo (+ control, 6 total):  | P2F | P1F | type II binding site | generated F1 | type II binding site | P1R | P2R |
Oligo (+ control, 6 total):  | P2F | P1F | type II binding site | generated F1 | type II binding site | P1R | P2R |
</pre>
</pre>
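
The nested primer layout can be sketched as simple string assembly. All sequences below are short placeholders, not the real primer tags or type II sites, and strand orientation of the reverse primers is ignored; the point is only to show that a positive-control oligo contains the full P1F...P1R layout, so the P1 primers amplify everything while the P2 primers pick out the controls.

```python
# Placeholder sequences only; the real tags and sites are in the team's design files.
PARTS = {
    "P1F": "AAAA", "P1R": "CCCC",
    "P2F": "GGGG", "P2R": "TTTT",
    "site_L": "GAGA", "site_R": "TCTC",
}

def build_oligo(f1, positive_control=False):
    """Concatenate the parts in the order of the layout above."""
    core = PARTS["P1F"] + PARTS["site_L"] + f1 + PARTS["site_R"] + PARTS["P1R"]
    if positive_control:
        # Positive controls get the extra outer P2F/P2R pair.
        return PARTS["P2F"] + core + PARTS["P2R"]
    return core

generated = build_oligo("ACGT")
control = build_oligo("ACGT", positive_control=True)
print(generated)
print(control)
```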
-
 
Once we get our test sequences back from IDT, they will come in a 96-well plate with one oligo in each well.  We should make a mixture using some of each well in order to create a tube that contains all 96 sequences.  This will simulate the tube that we will receive from Agilent, except instead of 55,000 sequences we will have 96 sequences only in this tube.  From here, we can practice using this as a library.
Once we get our test sequences back from IDT, they will come in a 96-well plate with one oligo in each well.  We should make a mixture using some of each well in order to create a tube that contains all 96 sequences.  This will simulate the tube that we will receive from Agilent, except instead of 55,000 sequences we will have 96 sequences only in this tube.  From here, we can practice using this as a library.
Line 353: Line 340:
We will be repeating these exact same steps once we get the chip, so if we can perfect our protocols with these practice sequences, we should be golden when the chip comes in.</div>
We will be repeating these exact same steps once we get the chip, so if we can perfect our protocols with these practice sequences, we should be golden when the chip comes in.</div>
<div id="629" style="display:none">
<div id="629" style="display:none">
 +
==June 29th==
==June 29th==
Our first day with everyone in the wet lab!
Our first day with everyone in the wet lab!

Latest revision as of 16:56, 8 August 2011