Team:Harvard/Notebook

From 2011.igem.org

(Difference between revisions)

Revision as of 01:12, 2 August 2011


June 28th

Sequencing:

the following samples from 6/27 were sent to Genewiz for sequencing:
- PyrF F, R (one sample with PyrF_F, one with PyrF_R)
- rpoz F, R (one sample with rpoz_F, one with rpoz_R)

Lambda red results:

The colonies on the plates did not look promising, and the ones we chose and grew up in LB+kan did not actually grow. Just to be certain, we chose 18 more colonies: 6 from 37.5µL arabinose 100µL plated, 6 from 3.75µL arabinose 100µL plated, and 6 from 37.5µL arabinose 1.5mL plated. Three from each plate were grown in plain LB and three with kan. We will let them grow at 30˚C, overnight if necessary, and hopefully see bacteria for PCR.
Assuming this does not work, we prepared more ∆HisB∆PyrF∆rpoZ+pKD46 in two ways: we put 3 colonies from the 6/16 transformation plate in LB+amp, and we streaked a new amp plate from the glycerol stock.
Another possibility is that something is wrong with our lambda red. We designed primers to verify that the pKD46 plasmid is really in the cells.

Kan-ZFB-wp-his3-ura3 construct:

Our last few PCR purifications have given us very low yields, and consequently we have had to use large amounts of our DNA (and the large amounts of buffer salts may also be why our lambda red recombinations have failed). When we tried to amplify our current DNA using the hisura-kan_F and ZFB-wp-hisura_R primers and the Phusion mastermix, it did not work (see 6/23). We will try to gain more product in two ways:

1) Repeat 6/23 PCR but use KAPA mastermix

the KAPA mix may work better than the Phusion.
Used KAPA protocol with 1µL of kan-ZFB overlap as template, hisura-kan_F and ZFB-wp-hisura_R primers, 65˚C annealing temp, 90 sec elongation time
made 2 reactions

2) Repeat overlap extension PCR (see 6/16) with KAPA mastermix

used KAPA protocol with 1µL kan cassette and 1µL of ZFB-wp-hisura, 65˚C annealing temp, 90 sec elongation
10 cycles without primers; hisura-kan_F and ZFB-wp-hisura_R primers added; 15 more cycles
made 6 reactions

E gel to check reactions worked: all 6 overlap PCRs successful, but not the other two reactions.

File:2011.06.28kanZFBconstruct(labeled).png

kan-ZFB-wp-hisura construct 6/28/11

combined samples 1-3 and 4-6 and ran on 1% agarose gel for extraction

File:2011.06.28kanZFBconstruct for extract1(labeled).png

kan-ZFB-wp-hisura construct for gel extraction 6/28/11

used Qiagen gel extraction kit and instructions with the following modifications:
- gel bands were dissolved in 500µL of buffer QG regardless of the gel volume
- gel heated at 50˚C for 20 min (to make up for reduced amount of buffer QG)
- after melting, 10µL of NaOAc (3M) was added to adjust the pH
- DNA from samples 1-3 was eluted in 20µL of ddH2O; DNA from samples 4-6 was eluted in 20µL of buffer EB
- water sample: 273.4 ng/µL, 260/280=1.92
- EB sample: 136.9 ng/µL, 260/280=2.38
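As a rough check on how much DNA each elution recovered, the Nanodrop concentrations above can be multiplied by the elution volume. This is only a sketch: it assumes the full 20µL came off the column, which is probably slightly optimistic, and the function name is made up for illustration.

```python
def total_yield_ng(conc_ng_per_ul, volume_ul):
    """Total DNA mass in ng, given a concentration (ng/µL) and a volume (µL)."""
    return conc_ng_per_ul * volume_ul

# Nanodrop readings from the list above; 20 µL is the nominal elution volume.
samples = {"water": 273.4, "EB": 136.9}
for name, conc in samples.items():
    print(f"{name} elution: about {total_yield_ng(conc, 20):.0f} ng total")
```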

June 28th - Bioinformatics

Attention all Harvard iGEM-ers!!! According to the iGEM Main Page, our preliminary project descriptions and safety proposals are due on July 15. Please see the aforementioned link so we can get this done ASAP- we don't want to miss any deadlines and have all our hard work wasted!

Finalized our Positive Control Sequence Table, using Justin's macro to insert the F1 helices into the appropriate zif268 F2 backbone

Length of chip oligos: 131-140bp (based on Cut Site Design)
- Primers: 20bp (x2= 40bp)
- zif268 F2 backbone + helix= 23aa (x3=69bp; some fingers ~3aa longer)
- Some alternate backbones are longer than zif268 F2 backbone
- Type II binding/cut sites= 11bp on each side (22bp total)
- Standard length: 40 + 69 + 22 = 131bp
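The length budget above is simple enough to write down as a small calculation. The sketch below only restates the numbers from the list; the constant and function names are illustrative, not taken from our generator.

```python
PRIMER_BP = 20   # each flanking primer tag
FINGER_AA = 23   # zif268 F2 backbone + helix, in amino acids
SITE_BP = 11     # type II binding/cut site on each side

def oligo_length_bp(finger_aa=FINGER_AA, primer_bp=PRIMER_BP, site_bp=SITE_BP):
    """Oligo length: two primers + finger coding sequence (3 bp per aa) + two sites."""
    return 2 * primer_bp + 3 * finger_aa + 2 * site_bp

print(oligo_length_bp())    # standard zif268 F2 backbone
print(oligo_length_bp(26))  # a finger about 3 aa longer
```

With the standard 23aa finger this gives 131bp, and a finger 3aa longer gives 140bp, which is where the 131-140bp range comes from.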

Use WebLogos as a final visual check of our final generated sequences

Plasmid and Oligo Design Schematics

File:Oligo design on board.jpg

Oligo Design

File:Plasmid design on board.jpg

Expression Plasmid Design

Chip-Based Sequence Design Schematic

File:Chip protocol.png

Chip-based process for sequence design, taken from Kosuri, et al. 2010 model of scalable gene synthesis Kosuri2010

References

Kosuri2010 pmid=21113165

</biblio>

Harvard Logo

File:Harvard logo.png

Running the Generator!

File:Fasta total.csv NOTE: LATER GENERATED NEW SEQUENCES. NOT UP TO DATE.

Generated Final Chip Sequences

We ran the generator once earlier this afternoon, but had to re-run it due to a typo in the cut sites and in the number of sequences we desired for each backbone. Luckily, we caught these errors, and after checking the program once again, we ran it a final time this afternoon.
- It took about 45 minutes for the program to generate and reverse translate the 54900 sequences.
- During this time, we created a function that will re-translate the sequences that the generator output. It compares the original helix with the re-translated helix to make sure that our reverse-translate works properly.
  - This step went smoothly, and we verified that the sequences were reverse-translated properly.
- To make sure that the distributions generated were as expected, we made WebLogos of the helices generated (see below).
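The re-translation check described above can be sketched as follows. This is not our actual function, just a minimal version of the idea: translate the nucleotide output back to amino acids with the standard genetic code and compare it with the protein we started from. The names `translate` and `round_trip_ok` are made up for this example.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(dna):
    """Translate a DNA coding sequence (length divisible by 3) into amino acids."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))

def round_trip_ok(protein, dna):
    """True if the reverse-translated DNA encodes the original protein."""
    return translate(dna) == protein.upper()

# The first six residues of the zif268 F2 backbone as they appear in our oligos.
print(translate("TTCCAATGTCGTATCTGT"))  # FQCRIC
```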
The output file (in the Dropbox: iGem > chip > final chip.csv) originally had the following headers: 'Target', 'Backbone #', 'Helix Sequence', 'Backbone Sequence', 'Nucleotide Sequence of Zinc Finger'
- We wanted to convert this information into FASTA format.
  - We wrote a function that converted our original file into fasta format (in the Dropbox: iGem > chip > fasta.csv)
  - The file FASTA_total (also linked above) contains the FASTA for all 50000 sequences (including the 100 controls).
  - For those curious, the FASTA format is just a format that looks like this:

>Header (For us the header is: Target, Backbone #, Helix Sequence, Backbone Sequence)(The header for the controls are: Index Number, 'control')
 Sequence (In our case, the nucleotide sequence of the zinc fingers)
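For illustration, a conversion from the CSV columns to this FASTA layout might look like the sketch below. It assumes the column names listed above and uses a made-up row (the helix, backbone, and nucleotide values are placeholders, not chip sequences); it is not the function we actually ran.

```python
import csv
import io

def rows_to_fasta(rows):
    """Turn CSV rows (dicts keyed by our column headers) into FASTA text."""
    records = []
    for row in rows:
        header = ", ".join([row["Target"], row["Backbone #"],
                            row["Helix Sequence"], row["Backbone Sequence"]])
        records.append(f">{header}\n{row['Nucleotide Sequence of Zinc Finger']}")
    return "\n".join(records) + "\n"

# Placeholder data standing in for one line of the generator output.
example_csv = io.StringIO(
    "Target,Backbone #,Helix Sequence,Backbone Sequence,"
    "Nucleotide Sequence of Zinc Finger\n"
    "TGG,1,RSDHLTT,PLACEHOLDER,ACGTACGT\n"
)
fasta = rows_to_fasta(csv.DictReader(example_csv))
print(fasta)
```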

Generated WebLogos for Final Chip

File:AAA.png AAA	File:ACC.png ACC	File:CTC.png CTC
File:CTG.png CTG	File:GAC.png GAC	File:TGG.png TGG

FASTA-Formatted Chip Data:

>NNN(Target Triplet) BB# Helix Seq.

Nucleotide seq. of ZF

Bioinformatics Candids

File:Justin speaking.jpg

File:Justin writing.jpg

File:Zif268 sequence by memory.jpg

zif268 sequence by memory. You know you've stared at too many zif268 sequences when...

File:Primer Index iGEM 2011

Design of Plate Practice Sequences

While we wait for the chip to come in, there are a number of techniques and protocols that we can practice beforehand, so that when the chip arrives we will be ready to use it right away. We will be practicing the following techniques:

Cutting ZF1 out of our oligos
Inserting ZF1 into the expression plasmid in between the omega subunit and the linker before F2
Verifying that combination of our F1 from the oligo with the plasmid produces a viable, functional ZF
Amplifying subpools of oligos for testing
Inserting the expression plasmids into the E. coli containing our selection genome
Verifying that our ZF-binding site/GFP expression paradigm works

To this end, we will be ordering a 96-well plate from IDT containing oligos that will simulate the entire tube of oligos that we will receive from Agilent in four weeks. These oligos will consist of the following:

6 positive controls (we know which DNA sequences these bind to)
- 3 of them being the F1 fingers of Zif268, OZ052, and OZ123
- 3 of them being ZF F1s derived from CODA.
90 generated sequences, picked from a subset of the chip
- These are picked evenly across the 9,150 sequences generated on the chip for the TGG triplet F1 target from the colorblindness "bottom finger" target, GTG GGA TGG. This particular target was chosen because the F2/F1 is a GNNTNN combo, which might be more likely to get hits from our chip generation sequences.
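One way to pick 90 sequences "evenly" out of 9,150 is to take evenly spaced indices through the list. The sketch below assumes that is what "evenly" means here and that the generated sequences sit in a list in chip order; the function name is hypothetical.

```python
def evenly_spaced_indices(total, wanted):
    """Return `wanted` distinct indices spread evenly over range(total)."""
    if not 0 < wanted <= total:
        raise ValueError("wanted must be between 1 and total")
    # Integer arithmetic keeps every index inside range(total).
    return [i * total // wanted for i in range(wanted)]

picks = evenly_spaced_indices(9150, 90)
print(len(picks), picks[:3], picks[-1])
```

Every pick lands roughly 101 or 102 sequences after the previous one, so no part of the TGG set is over-represented in the test plate.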

The primer tag sequences for the 90 generated sequence subset will be the same as they are on the chip (for the sake of explanation, we will refer to them now as P1F and P1R in this paragraph). The positive controls will be flanked immediately by the same primers as the generated subset so that we can amplify everything as one pool altogether should we need to (so this will be P1F and P1R). However, we will also put an additional set of primers outside of the P1F/P1R primers for the positive controls so that we can specifically amplify the positive control subpool, should we want to. These primers will be the same as the primers for the positive control on the chip (which will be called P2F and P2R here).

To recap, on the chip we will have the following oligos :

45750 other oligos for the 5 other target sequences
Oligos (TGG set, 9150 total):   | P1F | type II binding site | generated F1 | type II binding site | P1R |
Oligos (+ control, 100 total):  | P2F | type II binding site |  control F1  | type II binding site | P2R |

In our test pool of 96 sequences, we will have two types of oligos (note the two pairs of primers around the positive controls):

Oligo (TGG set, 90 total):          | P1F | type II binding site | generated F1 | type II binding site | P1R |
Oligo (+ control, 6 total):   | P2F | P1F | type II binding site |  control F1  | type II binding site | P1R | P2R |
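The two layouts can be written as a single assembly rule in which the outer P2F/P2R pair is optional. The sketch below uses bracketed placeholder strings instead of real sequences, so it only shows the order of the parts; the function and argument names are made up, and real binding sites may differ in orientation on the two sides.

```python
def build_oligo(f1, p1f, p1r, site, p2f="", p2r=""):
    """Concatenate the parts in the order shown in the layouts above."""
    return p2f + p1f + site + f1 + site + p1r + p2r

# Placeholders only; these are not actual primer or site sequences.
parts = dict(p1f="[P1F]", p1r="[P1R]", site="[site]")
generated = build_oligo("[generated F1]", **parts)
control = build_oligo("[control F1]", p2f="[P2F]", p2r="[P2R]", **parts)

print(generated)
print(control)
```

Because the control oligo carries the P1F/P1R pair inside the P2F/P2R pair, the P1 primers amplify all 96 oligos together, while the P2 primers amplify only the positive controls.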

Once we get our test sequences back from IDT, they will come in a 96-well plate with one oligo in each well. We should make a mixture using some of each well in order to create a tube that contains all 96 sequences. This will simulate the tube that we will receive from Agilent, except that instead of 55,000 sequences we will have only 96 sequences in this tube. From here, we can practice using this as a library.

We can pretend that this tube is just 96 generated sequences on the chip, treating the positive controls as if they were also generated sequences (we only include them in the 96 to ensure that we will indeed get a "hit" from this practice screening). Thus, we can just use the P1F/P1R primer set to amplify all of them in order to use them for the subsequent steps.

These subsequent steps will be those that were outlined above, namely cutting out the F1 sequence from each oligo, ligating this F1 into our expression plasmid, putting the expression plasmid into our selection strain, observing colonies which get infused with ZFs that bind to our target site (the "hits"), and sequencing the colonies that get hits to determine which ZF they are expressing.

We will be repeating these exact same steps once we get the chip, so if we can perfect our protocols with these practice sequences, we should be golden when the chip comes in.

June 30th

Lambda Red, Backbone, and Sequencing PCR

The gel checking for a lambda red gene in the pKD46 plasmid showed that it is indeed present, so our recombination failures have not been due to an incorrect plasmid.
The gel of the pZE21G plasmid backbone PCR was a success and took us one step closer to obtaining all parts necessary for the three-part assembly.
The gel of the pyrF and rpoZ PCRs was a success.
- Therefore we sent the PCR products and primers to GENEWIZ for sequencing again

File:2011.06.30.lambda spec pyrFrpoZ(labeled).png

pKD46, pZE21G, and PyrF and rpoZ loci 6/30/11

pZE21G backbone:

Since last night's PCR was successful, we will redo it with a few protocol adaptations to get a cleaner product and to increase our yield when we purify
KAPA mastermix and protocol: primers HindIII-F and KpnI-R
- template: pZE21G miniprepped plasmid, 1µL
- 2 min elongation time, 30 cycles
- 2 samples at 55˚C annealing, 2 samples at 60˚C

Lambda red and MAGE:

Yesterday's prep produced tiny colonies on the MAGE plates and (so far) none on the kan-ZFB plates. Just in case it didn't work, we will redo the lambda red using even more DNA and perform a second round of MAGE using culture from yesterday that was not plated.
same procedure, but with the following changes:
- 5µL kan-ZFB (about 1 µg)
- recover 3 hrs
- kan-ZFB: plate 100µL and 2 mL on kan plates
- MAGE: plate 1µL and 10µL on amp plates
To see if the colonies on the MAGE plate knocked out HisB, we chose 24 colonies, resuspended them in water, and put half the cells in LB (complete media) and half in NM media (does not have histidine). 96 well plate, 150µL media, grown overnight at 30˚C.

ZF Expression Plasmid Ultramer and Primer Design

Today, we designed primers ZF_073 through ZF_085 as listed in the iGEM Primer Index spreadsheet. These were basically two sets of primers: the primers to clone out the omega subunit and linker, and the ultramers that would construct the last part of the linker along with the type II binding sites and F2/F3 fingers. One should refer to the primer list for the sequences.

Note: the annealing sequence for the ultramer overlap contained a 72 degree melting temp hairpin. To get around this, I changed one of the codons in the F2 backbone. The F2 backbone begins with "FQCRIC", and so I changed the codon for the arginine (R) from CGC to CGT, which resolved the hairpin problem.

June 30th - Bioinformatics

Updated Primer list and FASTA formatting

We ran into a small hiccup when we were informed that we had forgotten to reverse complement the reverse primer sequences that were being appended to the generated sequences. This is because the primer sequences we were given were the sequences of the actual primers, rather than the sequences to which the primers would bind. Luckily, we caught this error! We did have to re-run the generator because we had to make sure that our generated sequences did not contain the new primers.
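A minimal sketch of the two operations involved, assuming plain A/C/G/T sequences: taking the reverse complement of a primer, and screening a generated sequence for any primer site in either orientation. The function names are illustrative and this is not the generator's own code.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA sequence containing only A, C, G, T."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def contains_primer_site(seq, primers):
    """True if any primer, or its reverse complement, occurs in seq."""
    seq = seq.upper()
    return any(p in seq or reverse_complement(p) in seq for p in primers)

# The control reverse tag as appended to the oligo, and the primer that anneals to it.
tag_on_oligo = "CGCTGAGGAGACTATACCAG"
print(reverse_complement(tag_on_oligo))  # CTGGTATAGTCTCCTCAGCG
```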

Here is the updated primer list:

This is the set of final target sequences with assigned forward and reverse primers (tags for PCR):

Disease	Target Sequence	Forward Primer (5'-3' NOT REVERSE COMPLEMENT)	Reverse Primer (5'-3' REVERSE COMPLEMENT)
Colorblindness	GCT GGC TGG	ATATAGATGCCGTCCTAGCG	TGGGCACAGGAAAGATACTT
Colorblindness	GCG GTA ACC	CCCTTTAATCAGATGCGTCG	GGTCGCCCTTATTACTACCA
Familial Hypercholesterolemia	GGC TGA GAC	TTGGTCATGTGCTTTTCGTT	TCTGAGTATCCGATACCCCT
Familial Hypercholesterolemia	GGA GTC CTG	GGGTGGGTAAATGGTAATGC	GCTATATCCGGGGAATCGAT
Myc-gene Cancer	GGC TGA CTC	TCCGACGGGGAGTATATACT	TTGGCCTGAAGCAGTTAGTA
Myc-gene Cancer	GGC TGG AAA	CATGTTTAGGAACGCTACCG	GGGAGGGAACGGAGATTATT
Controls	n/a	GTACATGAAACGATGGACGG	CGCTGAGGAGACTATACCAG

There was also a small error in the FASTA formatting. There are not supposed to be any spaces in the header, so the spaces were replaced with underscores.

Example:

>1_control
GTACATGAAACGATGGACGGGGTCTCAGCCATTCCAATGTCGTATCTGTATGCGTAATTTTTCACGCAAACACCATTTGGGTCGTCATATCCGTACGCACACGGTGAGACCCGCTGAGGAGACTATACCAG
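The header fix amounts to joining the header fields and replacing any whitespace with underscores. A small sketch, with a hypothetical helper name:

```python
import re

def fasta_header(*fields):
    """Build a FASTA header line with no spaces: fields joined by underscores."""
    joined = "_".join(str(field).strip() for field in fields)
    return ">" + re.sub(r"\s+", "_", joined)

print(fasta_header(1, "control"))  # >1_control
```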

</html>

Acid	-1	1	2	3	5	6	7
A	77	140	210	197	0	312	85
C	12	24	1	6	14	0	0
D	413	16	694	258	0	142	14
E	125	74	152	107	0	58	132
F	0	0	22	0	10	0	0
G	12	201	328	125	0	177	62
H	93	144	232	652	0	51	17
I	70	21	3	26	0	94	73
K	108	372	46	169	6	321	52
L	176	37	20	22	3325	75	55
M	36	54	5	28	0	31	10
N	23	150	129	940	0	182	61
P	3	298	77	7	0	36	8
Q	813	158	180	13	0	136	30
R	870	539	137	55	3	428	2517
S	99	970	859	278	0	140	12
T	243	134	223	350	0	834	83
V	166	26	27	115	0	341	146
W	19	0	13	0	0	0	0
Y	0	0	0	10	0	0	1

'	A	C	D	E	F	G	H	I	K	L	M	N	P	Q	R	S	T	V	W	Y
A	10	0	99	55	0	29	122	20	32	332	2	59	55	63	255	87	24	43	0	0
C	0	0	15	0	0	3	0	0	0	5	0	0	6	0	31	6	14	0	0	0
D	99	15	94	92	0	39	62	6	84	342	15	120	55	42	277	290	87	21	0	8
E	55	0	92	42	0	34	77	1	38	141	2	39	4	29	134	28	90	26	0	1
F	0	0	0	0	0	0	0	10	0	0	0	22	4	0	2	4	6	0	0	0
G	29	3	39	34	0	38	56	0	14	126	1	95	28	47	119	125	38	7	0	0
H	122	0	62	77	0	56	118	9	103	498	4	88	24	26	87	159	70	2	0	0
I	20	0	6	1	10	0	9	6	8	95	3	5	17	3	62	16	17	4	0	0
K	32	0	84	38	0	14	103	8	84	386	24	44	19	102	269	163	113	22	1	0
L	332	5	342	141	0	126	498	95	386	174	32	686	16	112	362	276	875	360	0	8
M	2	0	15	2	0	1	4	3	24	32	0	7	2	11	39	14	3	1	0	0
N	59	0	120	39	22	95	88	5	44	686	7	8	36	28	120	254	84	34	1	0
P	55	6	55	4	4	28	24	17	19	16	2	36	0	3	29	150	21	13	11	0
Q	63	0	42	29	0	47	26	3	102	112	11	28	3	100	261	314	125	19	0	0
R	255	31	277	134	2	119	87	62	269	362	39	120	29	261	618	343	504	281	0	0
S	87	6	290	28	4	125	159	16	163	276	14	254	150	314	343	592	173	91	0	0
T	24	14	87	90	6	38	70	17	113	875	3	84	21	125	504	173	154	28	0	0
V	43	0	21	26	0	7	2	4	22	360	1	34	13	19	281	91	28	12	0	0
W	0	0	0	0	0	0	0	0	1	0	0	1	11	0	0	0	0	0	0	0
Y	0	0	8	1	0	0	0	0	0	8	0	0	0	0	0	0	0	0	0	0

Position	Very Phobic	Hydrophobic	Neutral	Hydrophilic
6	285	85	204	2782
5	542	312	1334	1169
4	3334	14	0	9
3	191	203	1417	1536
2	91	211	1819	1236
1	138	164	1604	1451
-1	468	90	1257	1542

Position	Polar	Nonpolar
6	2917	440
5	2290	1067
4	9	3348
3	2830	527
2	2652	705
1	2555	802
-1	2784	573

Disease	Target Range	Binding Site Location	Bottom Finger	Top Finger	Bottom AA (F3 to F1)	Top AA (F3 to F1)
Colorblindness	chrX:153,403,001-153,407,000	370	GTATTTGTT	GGGCCTGCT	N/A	N/A
Colorblindness	chrX:153,403,001-153,407,000	3627	GCTGGCTGG	GCGGTAATG	EGSGLKR.EAHHLSR.#######	RRDDLTR.QRSSLVR.#######
Cystic Fibrosis	chr7:117,074,084-117,089,556	14767	GCAGGTGAT	AAAGAGCCC	QNGTLGR.EAHHLSR.#######	N/A
Familial Hypercholesterolemia	chr19:11,175,000-11,195,000	14001	GGCTGAGAC	GGAGTCCTG	ESGHLKR.QREHLTT.#######	QTTHLSR.DHSSLKR.#######
Tay-Sachs	chr15:72,674,944-72,688,031	5888	GTCTGGTCA	TCAAACTCC	DRSSLRR.RREHLTI.#######	N/A
Pancreatic Cancer	chr7:117,074,084-117,089,556	1739	GATCAAGCT	GTTTCAGTG	N/A	N/A

@@ Line 3: / Line 3: @@
 <html>
-<table class="whitebox">
+<!--<table class="whitebox">
 <caption><b>June</b></caption>
 <tr><th>Sun</th><th>Mon</th><th>Tue</th><th>Wed</th><th>Thu</th><th>Fri</th><th>Sat</th></tr>
@@ Line 143: / Line 143: @@
 </tr>
 </table>
+-->
+{{:Team:Harvard/Template:PracticeBar2}}
+{{Template:Team:Harvard/templateabouttest}}
+<html>
+<script>
+function show(k)
+{
+  // IDs of the per-day div blocks in calendar order.
+  // show(k) reveals the k-th block (1-based) and hides all the others.
+  var ids = ['606', '607', '608', '609', '610', '613', '614', '615', '616', '617',
+             '620', '621', '622', '623', '624', '625', '627', '628', '629', '630'];
+  for (var i = 0; i < ids.length; i++) {
+    var elem = document.getElementById(ids[i]);
+    elem.style.display = (k == i + 1) ? 'block' : 'none';
+  }
+}
+</script>
+</style>
+<div id="overhead">
+<div id="besedilo">
+<div id="vse_students">
+<div id="desno_students">
+<div id="levo_students">
+<table style="text-align:center">
+<caption><b>June</b></caption>
+<tr><th>Sun</th><th>Mon</th><th>Tue</th><th>Wed</th><th>Thu</th><th>Fri</th><th>Sat</th></tr>
+<tr><td></td><tr><td></td><td></td><td></td><td>1</a></td><td> 2</a></td><td> 3</a></td><td> 4</a></td>
+</tr><td>5</td><td>
+<span id="student" onclick="show(1)">
+</span ></td><td>
+<span id="student" onclick="show(2)">
+</span ></td><td>
+<span id="student" onclick="show(3)">
+</span ></td><td>
+<span id="student" onclick="show(4)">
+</span ></td><td>
+<span id="student" onclick="show(5)">
+</span ></td><td>11</td></tr><tr><td>12</td><td>
+<span id="student" onclick="show(6)">
+</span ></td><td>
+<span id="student" onclick="show(7)">
+</span ></td><td>
+<span id="student" onclick="show(8)">
+</span ></td><td>
+<span id="student" onclick="show(9)">
+</span ></td><td>
+<span id="student" onclick="show(10)">
+</span ></td><td>18</td></tr><tr><td>19</td><td>
+<span id="student" onclick="show(11)">
+</span ></td><td>
+<span id="student" onclick="show(12)">
+</span ></td><td>
+<span id="student" onclick="show(13)">
+</span ></td><td>
+<span id="student" onclick="show(14)">
+</span ></td><td>
+<span id="student" onclick="show(15)">
+</span ></td><td>
+<span id="student" onclick="show(16)">
+</span ></td></td></tr><tr><td>26</td><td>
+<span id="student" onclick="show(17)">
+</span ></td><td>
+<span id="student" onclick="show(18)">
+</span ></td><td>
+<span id="student" onclick="show(19)">
+</span ></td><td>
+<span id="student" onclick="show(20)">
+</span ></td></tr>
+</table>
+</div>
+</div>
+</html>
+<div id="606" style="display:none"></div>
+<div id="607" style="display:none">
+== June 7th ==
+'''Miniprep of pKD46 (lambda red)'''
+The lambda red plasmid is needed to enable the recombination used to insert the selection/expression systems into our E. coli cultures.
+Procedure: followed Qiagen Kit instructions, each student (8) using 1 mL cell suspension
+Results: DNA reasonably pure (260/280 between 1.8 and 2) and between 25 and 50 ng/µL
+</div>
+<div id="608" style="display:none">
+== June 8th ==
+'''PCR to connect ultramers into OZ052 (Zif268 F2 triplicate, GCCGATGTC) and OZ123 (Zif268 F2 triplicate, GAGTGGTTA):'''
+OZ052:
+*3µL OZ052_F (10µM stock)
+*3µL OZ052_R (10µM stock)
+*5µL 10x Pfx amplification buffer
+*1.5µL dNTPs
+*1µL MgSO4
+*0.4µL DNA polymerase
+*36.1µL ddH2O
+OZ123:
+*3µL OZ123_F (10µM stock)
+*3µL OZ123_R (10µM stock)
+*5µL 10x Pfx amplification buffer
+*1.5µL dNTPs
+*1µL MgSO4
+*0.4µL DNA polymerase
+*36.1µL ddH2O
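As a sanity check on the recipe above, the volumes can be summed and the final primer and buffer concentrations worked out. The sketch only uses the volumes and stock concentrations listed; the variable names are made up.

```python
# Volumes in µL for one reaction, as listed above (same for OZ052 and OZ123).
reaction_ul = {
    "forward ultramer (10 µM stock)": 3.0,
    "reverse ultramer (10 µM stock)": 3.0,
    "10x Pfx amplification buffer": 5.0,
    "dNTPs": 1.5,
    "MgSO4": 1.0,
    "DNA polymerase": 0.4,
    "ddH2O": 36.1,
}

total_ul = sum(reaction_ul.values())
primer_final_um = 10.0 * 3.0 / total_ul   # C1 * V1 / V_total
buffer_final_x = 10.0 * 5.0 / total_ul

print(f"total volume: {total_ul:.1f} µL")
print(f"each ultramer: {primer_final_um:.2f} µM final")
print(f"buffer: {buffer_final_x:.1f}x final")
```

The reaction comes to 50 µL, which puts each ultramer at 0.6 µM and the buffer at 1x.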
+Parameters:
+*1) 94⁰C for 5 min
+*2) 94⁰C for 15 sec
+*3) 60⁰C for 30 sec
+*4) 68⁰C for 1 min
+*5) Repeat 2-4 for 25 cycles
+*6) 68⁰C for 5 min
+*7) 4⁰C forever
+</div>
+<div id="609" style="display:none">
+== June 9th ==
+*Created cell culture with selection construct (contains ZFB, His3, pyrF on plasmid) and reporter RFP (this will be used to test positive control ZFs, cells fluoresce green when ZF binds)
+**Picked colonies, grew in LB/amp liquid media until mid-log
+***3 mL of LB, 1.5 µL of 2000x amp
+**Once mid-log reached, created glycerol stock, stored stock at -80⁰C.
+***<del>300 µL bacteria, 1200 µL 80% glycerol </del> '''This should have been 1200 µL bacteria media, 300 µL 80% glycerol ''(Corrected 6/14/2011)'' ''' (80% pure glycerol, 20% molecular grade water)
+**Spiked new tubes of media with 25 µL bacteria from the mid-log tube to leave overnight
+NOTE: reporter RFP did not grow to mid-log by end of day, will let grow overnight to saturation and continue creating glycerol stock tomorrow.
+*Plated selection strain from gel stab onto tet plate.
+*Began primer design for creating the kan/selection construct fusion (see our [[Primers | primer spreadsheet]] for details).
+==June 9th - Bioinformatics==
+Today we focused on reacquainting and familiarizing ourselves with Python. We completed the parsing (reading in) of the sequence and amino acid data so that it is easy to work with: by substituting each amino acid abbreviation (ex. A, N) with its numeric equivalent (ex. 1, 14), we can use a lot of nice math comparisons instead of messy letter/"string" comparisons.
+After that, we worked on counting the number of times each amino acid appears in each of the 7 positions (unfortunately given by -1,1,2,3,5,6,7), and counting the number of times amino acids are next to each other (ex. ACTQRNF has AC, CT, TQ, etc pairings). Taken overall, we found that L is overwhelmingly in position 5.
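The two counts described above can be sketched with `collections.Counter`. This is an illustration rather than our actual script: the function names are made up, the helices are assumed to be 7-letter strings, and neighbouring pairs are treated as unordered (AC and CA count as the same pairing), which is an assumption about how the pairing table was built.

```python
from collections import Counter

POSITIONS = [-1, 1, 2, 3, 5, 6, 7]  # position labels as given in our data

def count_positions(helices):
    """Count how often each amino acid appears at each helix position."""
    counts = {pos: Counter() for pos in POSITIONS}
    for helix in helices:
        for pos, aa in zip(POSITIONS, helix):
            counts[pos][aa] += 1
    return counts

def count_pairs(helices):
    """Count neighbouring amino-acid pairs, ignoring their order."""
    pairs = Counter()
    for helix in helices:
        for a, b in zip(helix, helix[1:]):
            pairs[tuple(sorted((a, b)))] += 1
    return pairs

# The example helix from the text: ACTQRNF has AC, CT, TQ, ... pairings.
example = ["ACTQRNF"]
print(count_positions(example)[5])
print(count_pairs(example))
```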
+<!--
+Acid    Position
+	-1	1	2	3	5	6	7
+A	77	140	210	197	0	312	85
+C	12	24	1	6	14	0	0
+D	413	16	694	258	0	142	14
+E	125	74	152	107	0	58	132
+F	0	0	22	0	10	0	0
+G	12	201	328	125	0	177	62
+H	93	144	232	652	0	51	17
+I	70	21	3	26	0	94	73
+K	108	372	46	169	6	321	52
+L	176	37	20	22	3325	75	55
+M	36	54	5	28	0	31	10
+N	23	150	129	940	0	182	61
+P	3	298	77	7	0	36	8
+Q	813	158	180	13	0	136	30
+R	870	539	137	55	3	428	2517
+S	99	970	859	278	0	140	12
+T	243	134	223	350	0	834	83
+V	166	26	27	115	0	341	146
+W	19	0	13	0	0	0	0
+Y	0	0	0	10	0	0	1
+-->
+{|
+| align="center" style="background:#f0f0f0;"|'''Acid'''
+| align="center" style="background:#f0f0f0;"|'''-1'''
+| align="center" style="background:#f0f0f0;"|'''1'''
+| align="center" style="background:#f0f0f0;"|'''2'''
+| align="center" style="background:#f0f0f0;"|'''3'''
+| align="center" style="background:#f0f0f0;"|'''5'''
+| align="center" style="background:#f0f0f0;"|'''6'''
+| align="center" style="background:#f0f0f0;"|'''7'''
+|-
+| A||77||140||210||197||0||312||85
+|-
+| C||12||24||1||6||14||0||0
+|-
+| D||413||16||694||258||0||142||14
+|-
+| E||125||74||152||107||0||58||132
+|-
+| F||0||0||22||0||10||0||0
+|-
+| G||12||201||328||125||0||177||62
+|-
+| H||93||144||232||652||0||51||17
+|-
+| I||70||21||3||26||0||94||73
+|-
+| K||108||372||46||169||6||321||52
+|-
+| L||176||37||20||22||3325||75||55
+|-
+| M||36||54||5||28||0||31||10
+|-
+| N||23||150||129||940||0||182||61
+|-
+| P||3||298||77||7||0||36||8
+|-
+| Q||813||158||180||13||0||136||30
+|-
+| R||870||539||137||55||3||428||2517
+|-
+| S||99||970||859||278||0||140||12
+|-
+| T||243||134||223||350||0||834||83
+|-
+| V||166||26||27||115||0||341||146
+|-
+| W||19||0||13||0||0||0||0
+|-
+| Y||0||0||0||10||0||0||1
+|-
+|
+|}
+For pairings, we found patterns, but none as obvious as the L-in-position-5. Read this like a multiplication table: the intersection of L row and M column is how often that pairing was observed.
+{|
+| align="center" style="background:#f0f0f0;"|''''''
+| align="center" style="background:#f0f0f0;"|'''A'''
+| align="center" style="background:#f0f0f0;"|'''C'''
+| align="center" style="background:#f0f0f0;"|'''D'''
+| align="center" style="background:#f0f0f0;"|'''E'''
+| align="center" style="background:#f0f0f0;"|'''F'''
+| align="center" style="background:#f0f0f0;"|'''G'''
+| align="center" style="background:#f0f0f0;"|'''H'''
+| align="center" style="background:#f0f0f0;"|'''I'''
+| align="center" style="background:#f0f0f0;"|'''K'''
+| align="center" style="background:#f0f0f0;"|'''L'''
+| align="center" style="background:#f0f0f0;"|'''M'''
+| align="center" style="background:#f0f0f0;"|'''N'''
+| align="center" style="background:#f0f0f0;"|'''P'''
+| align="center" style="background:#f0f0f0;"|'''Q'''
+| align="center" style="background:#f0f0f0;"|'''R'''
+| align="center" style="background:#f0f0f0;"|'''S'''
+| align="center" style="background:#f0f0f0;"|'''T'''
+| align="center" style="background:#f0f0f0;"|'''V'''
+| align="center" style="background:#f0f0f0;"|'''W'''
+| align="center" style="background:#f0f0f0;"|'''Y'''
+|-
+| A||10||0||99||55||0||29||122||20||32||332||2||59||55||63||255||87||24||43||0||0
+|-
+| C||0||0||15||0||0||3||0||0||0||5||0||0||6||0||31||6||14||0||0||0
+|-
+| D||99||15||94||92||0||39||62||6||84||342||15||120||55||42||277||290||87||21||0||8
+|-
+| E||55||0||92||42||0||34||77||1||38||141||2||39||4||29||134||28||90||26||0||1
+|-
+| F||0||0||0||0||0||0||0||10||0||0||0||22||4||0||2||4||6||0||0||0
+|-
+| G||29||3||39||34||0||38||56||0||14||126||1||95||28||47||119||125||38||7||0||0
+|-
+| H||122||0||62||77||0||56||118||9||103||498||4||88||24||26||87||159||70||2||0||0
+|-
+| I||20||0||6||1||10||0||9||6||8||95||3||5||17||3||62||16||17||4||0||0
+|-
+| K||32||0||84||38||0||14||103||8||84||386||24||44||19||102||269||163||113||22||1||0
+|-
+| L||332||5||342||141||0||126||498||95||386||174||32||686||16||112||362||276||875||360||0||8
+|-
+| M||2||0||15||2||0||1||4||3||24||32||0||7||2||11||39||14||3||1||0||0
+|-
+| N||59||0||120||39||22||95||88||5||44||686||7||8||36||28||120||254||84||34||1||0
+|-
+| P||55||6||55||4||4||28||24||17||19||16||2||36||0||3||29||150||21||13||11||0
+|-
+| Q||63||0||42||29||0||47||26||3||102||112||11||28||3||100||261||314||125||19||0||0
+|-
+| R||255||31||277||134||2||119||87||62||269||362||39||120||29||261||618||343||504||281||0||0
+|-
+| S||87||6||290||28||4||125||159||16||163||276||14||254||150||314||343||592||173||91||0||0
+|-
+| T||24||14||87||90||6||38||70||17||113||875||3||84||21||125||504||173||154||28||0||0
+|-
+| V||43||0||21||26||0||7||2||4||22||360||1||34||13||19||281||91||28||12||0||0
+|-
+| W||0||0||0||0||0||0||0||0||1||0||0||1||11||0||0||0||0||0||0||0
+|-
+| Y||0||0||8||1||0||0||0||0||0||8||0||0||0||0||0||0||0||0||0||0
+|-
+|
+|}
+Follow up work on this will be to convert this table to frequencies instead of values: values are less meaningful.
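Converting counts to frequencies just means dividing each count by the total for its row or column. A minimal sketch, using the non-zero counts from the position 5 column of the first table as input (the function name is hypothetical):

```python
def to_frequencies(counts):
    """Convert a dict of raw counts into fractions that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        return {key: 0.0 for key in counts}
    return {key: value / total for key, value in counts.items()}

# Non-zero entries of the position 5 column from the position table.
position_5 = {"C": 14, "F": 10, "K": 6, "L": 3325, "R": 3}
freqs = to_frequencies(position_5)
for aa, freq in sorted(freqs.items(), key=lambda item: -item[1]):
    print(f"{aa}: {freq:.3f}")
```

Frequencies make columns with different totals comparable, which raw counts do not.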
+</div>
+<div id="610" style="display:none">
+== June 10th ==
+*What we learned today: don't put E. coli plates in the -20C freezer!
+*Observed a well populated selection strain plate and placed it in the 4C refrigerator
+*Took the selection construct culture and extracted the plasmid through miniprep
+**Observed 260/280 ratio of 1.90 and 1.88 through Nanodrop
+**Observed concentrations of 87.7 and 100.6 ng/µL through Nanodrop
+*Made 10 new agar plates with LB and amp
+==June 10th - Bioinformatics ==
+===Visualizations===
+We spent the first few hours today making cool visualizations and graphs of the data we found on the 9th: heatmaps turned out to be an annoying limitation of Python, so a Python/R hybrid was used, and bar charts were made exclusively in Python. See the dropbox for our pretty (and hopefully informative compared to spreadsheets) charts/graphs.
+{|
+ | [[File:heatmap_pairing.png|thumb|left|A heatmap of the pairing data. Darker blues indicate that the pairing occurs more often.]]
+|}
+We then started work on TNN and GNN properties specifically (essentially repeating the June 9th work, but confined to smaller data sets). There are some differences between TNN and GNN: see graphs in dropbox. We decided that there was not enough data for fingers that bind to ANN and CNN triplets to perform significant analysis on it.
+{|
+ | [[File:gnn_pairing_heatmap.png|thumb|left|A heatmap of the GNN pairing data.]]
+ | [[File:tnn_pairing_heatmap.png|thumb|left|A heatmap of the TNN pairing data.]]
+|}
+*Overall, similar color clusters are found in the heatmaps. In all cases, L and N are often placed consecutively on the helix. There are fewer clusters of high frequency when looking at TNN binders.
+We then, using the theorized framework from a paper (2011 Persikov [http://iopscience.iop.org.ezp-prod1.hul.harvard.edu/1478-3975/8/3/035010/]), tried to match amino acid binding to each base pair to see if there was a pattern. See dropbox document .../bioinformatics/Binding Frequency for that data. There's a lot of it.
+===Properties of amino acids===
+We then worked on finding properties of each position (hydrophobic/hydrophilic, polar/nonpolar):
+'''Hydrophilic vs Hydrophobic'''
+{|
+| align="center" style="background:#f0f0f0;"|'''Position'''
+| align="center" style="background:#f0f0f0;"|'''Very Phobic'''
+| align="center" style="background:#f0f0f0;"|'''Hydrophobic'''
+| align="center" style="background:#f0f0f0;"|'''Neutral'''
+| align="center" style="background:#f0f0f0;"|'''Hydrophilic'''
+|-
+| 6||285||85||204||2782
+|-
+| 5||542||312||1334||1169
+|-
+| 4||3334||14||0||9
+|-
+| 3||191||203||1417||1536
+|-
+| 2||91||211||1819||1236
+|-
+| 1||138||164||1604||1451
+|-
+| -1||468||90||1257||1542
+|-
+|
+|}
+'''Polar vs Nonpolar'''
+{|
+| align="center" style="background:#f0f0f0;"|'''Position'''
+| align="center" style="background:#f0f0f0;"|'''Polar'''
+| align="center" style="background:#f0f0f0;"|'''Nonpolar'''
+|-
+| 6||2917||440
+|-
+| 5||2290||1067
+|-
+| 4||9||3348
+|-
+| 3||2830||527
+|-
+| 2||2652||705
+|-
+| 1||2555||802
+|-
+| -1||2784||573
+|-
+|
+|}
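+A minimal sketch of how per-position tallies like the ones above could be computed, assuming each helix is stored as a 7-letter string ordered -1, 1, 2, 3, 4, 5, 6. The polar/nonpolar grouping used here is one common textbook convention and is an assumption; the grouping actually used for the tables above is not recorded in this entry.

```python
from collections import Counter

# Helix positions in the order they appear in a 7-residue helix string.
POSITIONS = [-1, 1, 2, 3, 4, 5, 6]

# Assumed grouping (one common convention), not necessarily the one used for the tables.
NONPOLAR = set("AVLIMFWPG")


def polarity_counts(helices):
    """Return {position: Counter({'polar': n, 'nonpolar': m})} for 7-residue helices."""
    counts = {pos: Counter() for pos in POSITIONS}
    for helix in helices:
        if len(helix) != len(POSITIONS):
            raise ValueError(f"expected 7 residues, got {helix!r}")
        for pos, aa in zip(POSITIONS, helix):
            label = "nonpolar" if aa in NONPOLAR else "polar"
            counts[pos][label] += 1
    return counts
```

+Swapping in a different residue grouping (for example a four-level hydrophobicity scale) only requires changing how the label is chosen.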
+Follow-up work here is to check more properties, and maybe try individual pairings (e.g. phobic-philic, polar-philic).
+</div>
+<div id="613" style="display:none">
+== June 13th ==
+The control zinc fingers OZ052 and OZ123 were amplified with overhanging primers to allow their insertion into the Wolfe plasmid:
+'''Overhang PCR for ultramers:''' the template was the product of the ultramer PCR (see 6/8/11), and several concentrations were used
+In all the tubes:
+*5 µL Pfx amplification buffer
+*1.5 µL dNTPs
+*1 µL MgSO4
+*0.4 µL polymerase
+*38.1 µL ddH2O
+*1.5 µL OZ052_up and 1.5 µL OZ052_down OR 1.5 µL OZ123_up and 1.5 µL OZ123_down
+In OZ052 (1) and OZ123 (1):
+*1 µL of ultramer PCR product
+In OZ052 (1:10) and OZ123 (1:10):
+*1 µL of a 1 in 10 dilution of ultramer PCR product
+In OZ052 (1:100) and OZ123 (1:100):
+*1 µL of a 1 in 100 dilution of ultramer PCR product
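+As a quick check on the recipe above, the listed volumes for a single tube add up to a 50 µL reaction. The short sketch below just sums them; the dictionary labels are our own shorthand.

```python
# Volumes in µL, copied from the reagent list for one overhang PCR tube.
reaction = {
    "Pfx amplification buffer": 5.0,
    "dNTPs": 1.5,
    "MgSO4": 1.0,
    "polymerase": 0.4,
    "ddH2O": 38.1,
    "up primer (OZ052_up or OZ123_up)": 1.5,
    "down primer (OZ052_down or OZ123_down)": 1.5,
    "template (ultramer PCR product or dilution)": 1.0,
}

# Round to one decimal place to hide floating-point noise in the sum.
total_ul = round(sum(reaction.values()), 1)
print(f"total reaction volume: {total_ul} µL")
```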
+Parameters:
+*94°C for 5 min
+*94°C for 15 sec
+*55°C for 30 sec
+*68°C for 30 sec
+*Repeat steps 2-4 for 25 cycles
+*68°C for 5 min
+*4°C forever
+Gel to verify proper amplification (1% agarose gel, 10 µL 1 kb ladder, 120 V):
+The OZ052 lanes (1-3) had bands at the proper length (328 bp) at all three concentrations, although there were several fainter bands likely from side products. Only the undiluted OZ123 lane showed any bands, and from the faint band at 328 and the stronger band around 250 it appears that the PCR did not work well, and the majority of the product was the ultramer from the first PCR.
+'''PCR around vector:''' the template used was the Wolfe selection construct plasmid miniprepped 6/10/11 (100.6 ng/µL stock)
+Reagents the same as above except:
+*1.5 µL of Wolfe_F and 1.5 µL of Wolfe_R primers to each tube
+*plasmid tube (1 ng) given 1 ng of template (1 µL of a 1 in 100 dilution)
+*plasmid tube (10 ng) given 10 ng of template (1 µL of a 1 in 10 dilution)
+Parameters same as above except:
+*elongation (step 4) 5 minutes (vector approximately 5 kb)
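+The template amounts quoted above follow from the miniprep concentration (100.6 ng/µL): 1 µL of a 1 in 100 dilution carries about 1 ng, and 1 µL of a 1 in 10 dilution about 10 ng. A small sketch of that arithmetic, with a helper name of our own choosing:

```python
STOCK_NG_PER_UL = 100.6  # Wolfe selection construct miniprep, 6/10/11


def template_mass_ng(stock_ng_per_ul, dilution_factor, volume_ul):
    """Mass of DNA (ng) delivered by `volume_ul` of a 1-in-`dilution_factor` dilution."""
    return stock_ng_per_ul / dilution_factor * volume_ul


for factor in (100, 10):
    mass = template_mass_ng(STOCK_NG_PER_UL, factor, 1.0)
    print(f"1 µL of a 1 in {factor} dilution: {mass:.2f} ng")
```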
+Gel to verify proper amplification (1% agarose, 10 µL 1 kb ladder, 170V)
+There were no bands of the correct size in the lanes.  The only band that appeared was a faint, short band in one lane that likely was a primer.  Since the DNA ladder worked, the problem likely was not with the electrophoresis but with the PCR reaction, perhaps due to issues with the primers.
+===Gel images===
+[[File:2011.06.13.ultrameroverhang052,123gel1(labeled).png|thumb|left|Ultramer Overhang 6/13/11]]
+[[File:2011.06.13.wolfebackbonegel1(labeled).png|thumb|none|Backbone plasmid 6/13/11]]
+== June 13th - Bioinformatics ==
+Today we started work on a program to statistically generate possible sequences.
+The four functions needed to do this are:
+* generate(matrix, pseudocounts (lambda), dependency tuples)
+:: takes a matrix of zinc-finger AA position counts, a list of dependent amino acid pairs, and a pseudocount multiplier and generates a list of potential amino acid sequences weighted by independent and dependent probabilities
+* add_pseudo(dependent matrix row,independent matrix row)
+:: given a matrix row of dependent counts (i.e. how many times 'a' occurs at position n when 'b' is set to some AA at position m) and a row of independent matrix counts (how many times 'a' occurs at n regardless of b's AA) return an adjusted matrix row, based on the dependent matrix row, that has pseudocounts added to the values that are empty in the dependent matrix row but filled in the independent matrix row.
+* generate_indep(matrix)
+:: randomly pick an amino acid, given a matrix row, from a weighted random distribution based on the values in the row
+* generate_dep(indep_row, dep_row, lambda)
+:: add pseudo counts (call add_pseudo) and generate a dependent random call for a position (using generate_indep on the adjusted matrix)
+We finished generate_indep, generate_dep, and add_pseudo today, along with creating a 140x140 matrix of needed values.
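+A minimal sketch of the three finished helpers, written from the descriptions above rather than copied from our code. It assumes each matrix row is a dict mapping amino acid to count, and that the pseudocount given to an empty dependent entry is lambda times that residue's independent frequency; both choices are assumptions made for illustration, as is the extra <code>lam</code> argument on add_pseudo.

```python
import random


def generate_indep(row, rng=random):
    """Pick one amino acid at random, weighted by the counts in `row` (dict: aa -> count)."""
    residues = [aa for aa in row if row[aa] > 0]
    if not residues:
        raise ValueError("row has no positive counts")
    weights = [row[aa] for aa in residues]
    return rng.choices(residues, weights=weights, k=1)[0]


def add_pseudo(dep_row, indep_row, lam=1.0):
    """Copy `dep_row`, adding a pseudocount to residues that are empty in
    `dep_row` but observed in `indep_row`.

    Assumed rule: pseudocount = lam * (independent frequency of that residue).
    """
    total = sum(indep_row.values())
    adjusted = dict(dep_row)
    for aa, indep_count in indep_row.items():
        if adjusted.get(aa, 0) == 0 and indep_count > 0:
            adjusted[aa] = lam * indep_count / total
    return adjusted


def generate_dep(indep_row, dep_row, lam, rng=random):
    """Draw a residue from the dependent counts after pseudocount smoothing."""
    return generate_indep(add_pseudo(dep_row, indep_row, lam), rng)
```

+Passing a seeded <code>random.Random</code> instance as <code>rng</code> makes runs reproducible, which helps when comparing generated libraries.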
+</div>
+<div id="614" style="display:none">
+== June 14 ==
+*Made four LB-based media solutions, and later created glycerol stocks from these and placed in -80°C freezer
+**Selection strain (ΔHis3ΔPyrFΔrpoZ) in 3 mL LB and 3µL of 1000x Tet solution stock
+**Selection strain (ΔHis3ΔPyrFΔrpoZ) in 3 mL of LB only solution stock (control)
+**Kan cassette (pZE22G) in 3 mL of LB solution stock and 3 µL of kanamycin solution stock
+**Lambda Red (pKD42) in 3 mL of LB and 3 µL Amp solution stock
+***For all of these stocks, we tried to grow the cultures to mid-log and then mix 1,200 µL of culture with 300 µL of 80% glycerol solution (''This is the correct protocol for creating a glycerol stock; refer to June 9th'')
+***We were only able to get the kan cassette to mid-log and created glycerol stock of the kan cassette
+***Observations included contamination of a pKD46 liquid culture, and we are leaving Lambda Red and both solutions with the selection strain for overnight growth
+*Ran 1% gel (150V) with the rest of the OZ123 and OZ052 overhang PCR samples
+**Used better ladder today, less the 1 kb ladder
+**Bands followed the same pattern as the gel run on 6/13/11
+*Used gel extraction to obtain the correct OZ123 and OZ052 PCR product from the gel
+**Used Qiagen quick gel extraction kit
+**OZ052 (from undiluted lane): 7.0 ng/µL, 260/280=2.42 (Note: this sample had a strange yellow substance in the column--may have been contaminated)
+**OZ052 (from undiluted lane): 12.0 ng/µL, 260/280=2.02
+**OZ052 (from 1:10 dilution): 10.8 ng/µL, 260/280=2.04
+**OZ052 (from 1:10 dilution): 15.2 ng/µL, 260/280=1.82
+**OZ052 (from 1:100 dilution): 20.6 ng/µL, 260/280=2.17
+**OZ123 (from undiluted lane): 6.3 ng/µL, 260/280=2.17
+*PCR the backbone fragment of the plasmid using Wolfe_R and Wolfe_L primers and a lower annealing temperature than before due to the lower melting point of Wolfe_L
+**Reagents:
+***22µL Invitrogen Platinum PCR supermix
+***1µL template from a 1 in 100 dilution (1 ng)
+***1µL Wolfe_F and 1µL Wolfe_R
+**94°C for 30 s
+**94°C for 30 s
+**53°C for 30 s
+**70°C for 5 min
+***Previous three steps repeat 30 times
+**70°C for 5 min
+**4°C forever
+*Performed a restriction enzyme digestion on the selection construct plasmid using EcoRI to test for presence/absence of inserted selection construct
+**1µL EcoRI
+**1µL buffer 4
+**2µL backbone plasmid
+**6µL ddH2O
+**Incubate at 37 degrees for 90 min
+**There is only one EcoRI site (GAATTC) in the plasmid, so we should see 1 band at about 5kb
+*Ran a gel (1%, 170 V) on the backbone fragment (Wolfe primers) PCR product and restriction digestion result
+**Observations: the EcoRI digest produced the expected band of around 5kb. The backbone did produce a 5kb band but also had a secondary smaller product, perhaps due to one of the primers annealing to a sequence that is a close match to its target.
+*Began gradient PCR on the selection strain backbone with Wolfe_R and Wolfe_F primers because of the large difference in melting temperatures between the two
+**Set the annealing temperatures within the PCR to go from 50-57 C and ran with 5 minute extension phases at 70 C
+**Ran overnight
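+The reasoning behind the EcoRI digest in this entry (one site in a circular plasmid gives one linear band at full length) can be sketched as below. The 5000 bp length stands in for "about 5kb", and the cut positions are hypothetical placeholders, not mapped sites.

```python
def circular_digest_fragments(plasmid_length_bp, cut_positions):
    """Fragment lengths (bp) from cutting a circular plasmid at the given positions."""
    cuts = sorted(set(p % plasmid_length_bp for p in cut_positions))
    if not cuts:
        return [plasmid_length_bp]  # uncut plasmid stays circular
    fragments = []
    for i, start in enumerate(cuts):
        end = cuts[(i + 1) % len(cuts)]  # wrap around the circle
        length = (end - start) % plasmid_length_bp
        # A single cut linearizes the plasmid: one full-length fragment.
        fragments.append(length if length else plasmid_length_bp)
    return sorted(fragments, reverse=True)


# One site (hypothetical position) in a ~5 kb plasmid gives a single band.
print(circular_digest_fragments(5000, [1200]))
```

+A second site would split the plasmid into two fragments whose lengths sum to the plasmid length, which is why a single ~5 kb band is consistent with a single EcoRI site.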
+===Today's Gel Images===
+[[File:2011.06.14.ultrameroverhang1(labeled).png|thumb|left|Ultramer Overhang 6/14/11]]
+[[File:2011.6.14.digestionbackbonePCRpaint.png|thumb|none|Digestion and PCR Backbone on Selection Construct Plasmid 6/14/11]]
+== June 14 - Bioinformatics ==
+We finished writing the generate function, and now have a working sequence generator. We also began more in-depth research into the 2011 work by Persikov which deals with how the zinc finger binds to DNA. He predicts several relations which we should be able to test.
+Persikov sent us his SVM code (used to calculate the probability of a sequence binding to given DNA), so we also worked on adapting this to use when narrowing our sequences to those most likely to work.
+*There are four canonical amino acid-base interactions involved in zinc finger interactions.
+**Amino acids in positions -1, 3, and 6 on the helix are known to interact with the 3 bases of the triplet. In addition, the amino acid in position 2 interacts with the upstream base of the complementary strand (Klug 2010).
+**In addition, Persikov (2011) has proposed [[#June 17 - Bioinformatics|novel interactions]] between these four amino acids based on his analysis of zinc finger binding data up to 2005.
+**Persikov uses the information between these four amino acid-base interactions in his SVM, to determine whether a finger would be a good binder to a particular DNA sequence.
+*In order to use his program, we need to convert the helices that are created by the generator into a format the SVM accepts.
+**The SVM only considers the four canonical interactions. It assigns a numerical value to each possible amino acid-base combination. The program that converts our data into a format the SVM accepts creates a string with these numerical values based on Persikov's key. (See [[Brainstorming Notes#Input Example for SVM:|this page]] for more details on this program.)
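+A rough sketch of the conversion step described above. Two things here are assumptions and not taken from Persikov's code: the numbering in <code>pair_code</code> is a made-up stand-in for his key, and the pairing of positions with bases follows the usual textbook model (6 with the 5' base, 3 with the middle base, -1 with the 3' base), with the complementary-strand base for position 2 supplied by the caller.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
BASES = "ACGT"
HELIX_POSITIONS = [-1, 1, 2, 3, 4, 5, 6]


def pair_code(aa, base):
    """Hypothetical numbering (1..80) of amino acid-base pairs; NOT Persikov's key."""
    return AMINO_ACIDS.index(aa) * len(BASES) + BASES.index(base) + 1


def canonical_contacts(helix, triplet, neighbor_base):
    """Return the four (position, amino acid, base) contacts for one finger.

    Assumed pairing: position 6 with the 5' base, 3 with the middle base,
    -1 with the 3' base, and 2 with `neighbor_base` (complementary strand).
    """
    residue = dict(zip(HELIX_POSITIONS, helix))
    five, middle, three = triplet
    return [
        (-1, residue[-1], three),
        (2, residue[2], neighbor_base),
        (3, residue[3], middle),
        (6, residue[6], five),
    ]


def svm_input(helix, triplet, neighbor_base):
    """Space-separated string of pair codes for the four canonical contacts."""
    contacts = canonical_contacts(helix, triplet, neighbor_base)
    return " ".join(str(pair_code(aa, base)) for _, aa, base in contacts)
```

+Residues at positions 1, 4, and 5 are ignored, matching the statement that the SVM only considers the four canonical interactions.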
+Brandon learned basic Python today. Justin created a JavaScript program that recognizes potential binding sites from a given sequence.
+</div>
+<div id="615" style="display:none">
+==June 15th==
+'''Gradient PCR:''' 5µL of each PCR product were run on a 1% gel. No bands appeared: the PCR appears to not have worked.
+'''Selection construct:''' bacteria containing the selection construct (plasmid containing ZF, omega subunit, ZFB, His3, URA3, etc.) were made into a glycerol stock (see 6/14 and 6/9), miniprepped, and used for PCR:
+*Miniprep: used Qiagen kit
+**82.5ng/µL, 260/280=1.99
+**91.1ng/µL, 260/280=2.01
+**77.9ng/µL, 260/280=1.98
+*'''PCR:'''
+**PCR used to amplify section of plasmid containing zinc finger binding site, weak promoter, His3, and URA3 (with homology to join it to kan cassette)
+**'''Reagents'''
+**zinc finger binding site and weak promoter, selection construct plasmid:
+***1µL ZFB-wp-f (5µM)  (made by 1:20 dilution of 100µM stock)
+***1µL ZFB-wp-hisura-r (5µM)  (made by 1:20 dilution of 100µM stock)
+***1µL selection construct (1:100 dilution of overnight culture)
+***22.5µL of invitrogen's Platinum PCR SuperMix
+**PCR used to amplify the kan cassette
+**'''Reagents'''
+**KAN cassette, pZE22g plasmid:
+***1µL hisura-kan-f (5µM)  (made by 1:20 dilution of 100µM stock)
+***1µL kan-r (5µM)  (made by 1:20 dilution of 100µM stock)
+***1µL pZE22g (1:100 dilution of glycerol stock culture)
+***22.5µL of invitrogen's Platinum PCR SuperMix
+**'''Parameters:'''
+*1) 94°C for 2 min (denature template, activate enzyme)
+*2) 94°C for 30 sec (denature)
+*3) 53°C for 30 sec (anneal)
+*4) 72°C for 2 min (extend)
+*5) Repeat 2-4 for 25 cycles total
+*6) 72°C for 5 min
+*7) 4°C forever
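+The primer concentrations in these reactions follow from simple dilution arithmetic: a 1:20 dilution of the 100 µM stock gives 5 µM, and 1 µL of that in a 25.5 µL reaction (1 + 1 + 1 + 22.5 µL) ends up at roughly 0.2 µM. A small sketch, with helper names of our own:

```python
def diluted_conc(stock_conc, dilution_factor):
    """Concentration after a 1-in-`dilution_factor` dilution."""
    return stock_conc / dilution_factor


def final_conc(working_conc, added_ul, total_ul):
    """Concentration of a reagent once `added_ul` is mixed into `total_ul` total."""
    return working_conc * added_ul / total_ul


working_um = diluted_conc(100.0, 20)   # primer working stock, µM
total_ul = 1 + 1 + 1 + 22.5            # two primers, template, SuperMix
primer_final_um = final_conc(working_um, 1.0, total_ul)
print(f"working stock: {working_um} µM, final in reaction: {primer_final_um:.3f} µM")
```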
+'''PCR Purification:'''
+*Used the Qiagen PCR purification kit and instructions in order to purify the Kan cassette and selection construct PCR products
+**Nanodropped the purified products and observed 3 ng/µL for the Kan cassette and 29.8 ng/µL for ZFB-wp-His3: purification did not work well, especially for the kan
+'''Selection strain (ΔHis3ΔPyrFΔrpoZ):'''
+*saturated overnight culture was inoculated again: 3mL LB, 3µL tet, 30µL of overnight culture, at 37C until mid-log
+*glycerol stock
+*For transformation tomorrow we grew up pKD42 in 3 mL of LB, 1.5 µL of ampicillin(2000x) and one colony at '''30 C'''
+**Also grew up more of the selection strain so it will be ready for electroporation transformation
+'''Gel'''
+*Ran gel with Kan cassette and selection construct (Binding site, His3, and URA3)
+**Observations: successful; image below
+**Used 1 kb plus ladder
+[[File:2011.06.15.kan-ZFB-wp-his3-ura3(labeled).png|thumb|Kan and ZFB-wp-his constructs 6/15/11]]
+'''PCR Overlap'''
+*since the purification was not very successful, we used 3µL saved from the original PCR product
+*Procedure
+**25µL of 2x Phusion Master Mix
+**1 µL of ZFB-wp-HisURA-R (100µM)
+**1µL of HisURA-Kan-F (100µM)
+**21 µL of water
+**1 µL Kan template and 1 µL of ZFB-wp-His3-URA3
+***4 tubes
+****Both undiluted
+****Both 1:10 dilution
+****Both 1:100 dilution
+****Both 1:1000 dilution
+*Protocol
+**98 C for 30 s
+**98 C for 10 s
+**53 C for 30 s
+**72 C for 3 min
+**Repeat steps 2-4 for 24 more cycles
+**72 C for 5 min
+**4 C 4EVA!!!
+==June 15th - Bioinformatics ==
+*We continued research into Persikov's and others' work on binding.
+*We worked on using the OPEN data to test Persikov's binding-predicting program: SVM
+*Justin continued work on a sequence-finding program; the most up-to-date version can be found in the Dropbox under code/zfsitefinder.html.
+*Justin and Will found 10 candidate sequences across 4 diseases that hopefully should encompass a good amount of diversity in terms of expanding the ZF library.  These sequences can be found in the table below, with more details [[Media:ZF_Binding_Sequence_Candidates.xlsx|here]].  The most up to date version can be found in the Dropbox under sequences/Target Loci Sequences.
+{| class="wikitable" cellpadding="5"
+| align="center" style="background:#f0f0f0;"|'''Disease'''
+| align="center" style="background:#f0f0f0;"|'''Target Range'''
+| align="center" style="background:#f0f0f0;"|'''Binding Site Location'''
+| align="center" style="background:#f0f0f0;"|'''Bottom Finger'''
+| align="center" style="background:#f0f0f0;"|'''Top Finger'''
+| align="center" style="background:#f0f0f0;"|'''Bottom AA (F3 to F1)'''
+| align="center" style="background:#f0f0f0;"|'''Top AA (F3 to F1)'''
+|-
+| Colorblindness||chrX:153,403,001-153,407,000||370||GTATTTGTT||GGGCCTGCT||N/A||N/A
+|-
+| Colorblindness||chrX:153,403,001-153,407,000||3627||GCTGGCTGG||GCGGTAATG||EGSGLKR.EAHHLSR.#######||RRDDLTR.QRSSLVR.#######
+|-
+| Cystic Fibrosis||chr7:117,074,084-117,089,556||14767||GCAGGTGAT||AAAGAGCCC||QNGTLGR.EAHHLSR.#######||N/A
+|-
+| Familial Hypercholesterolemia||chr19:11,175,000-11,195,000||14001||GGCTGAGAC||GGAGTCCTG||ESGHLKR.QREHLTT.#######||QTTHLSR.DHSSLKR.#######
+|-
+| Tay-Sachs||chr15:72,674,944-72,688,031||5888||GTCTGGTCA||TCAAACTCC||DRSSLRR.RREHLTI.#######||N/A
+|-
+| Pancreatic Cancer||chr7:117,074,084-117,089,556||1739||GATCAAGCT||GTTTCAGTG||N/A||N/A
+|}
+*We collected 15 alternative zinc finger backbones (different from zif268 backbone) and their corresponding base sequences. Many of these were from Persikov 2011 and all binding sequences were confirmed on the Protein Data Bank website at http://www.pdb.org/pdb/home/home.do. The zinc finger PDB ID's and related links are:
+{|border="1" cellpadding="5"
+ |-
+ ! scope="col" | PDB ID
+ ! scope="col" | Binding Sequence
+ ! scope="col" | Link
+ |-
+ |1F2I
+ |ATGGGCGCGCCCAT
+ |[http://www.pdb.org/pdb/explore/explore.do?structureId=1F2I]
+ |-
+ |1G2D
+ |GACGCTATAAAAGGAG
+ |[http://www.pdb.org/pdb/explore/explore.do?structureId=1G2D]
+ |-
+ |1G2F
+ |TCCTTTTATAGCGTCC
+ |[http://www.pdb.org/pdb/explore/explore.do?structureId=1G2F]
+ |-
+ |1MEY
+ |ATGAGGCAGAACT
+ |[http://www.pdb.org/pdb/explore/explore.do?structureId=1MEY]
+ |-
+ |1TF6
+ |ACGGGCCTGGTTAGTACCTGGATGGGAGACC
+ |[http://www.pdb.org/pdb/explore/explore.do?structureId=1TF6]
+ |-
+ |1UBD
+ |AGGGTCTCCATTTTGAAGCG
+ |[http://www.pdb.org/pdb/explore/explore.do?structureId=1UBD]
+ |-
+ |1TF6
+ |ACGGGCCTGGTTAGTACCTGGATGGGAGACC
+ |[http://www.pdb.org/pdb/explore/explore.do?structureId=1TF6]
+ |-
+ |1YUI
+ |GCCGAGAGTAC
+ |[http://www.pdb.org/pdb/explore/explore.do?structureId=1YUI]
+ |-
+ |2DRP
+ |CTAATAAGGATAACGTCCG
+ |[http://www.pdb.org/pdb/explore/explore.do?structureId=2DRP]
+ |-
+ |2GLI
+ |TTTCGTCTTGGGTGGTCCACG
+ |[http://www.pdb.org/pdb/explore/explore.do?structureId=2GLI]
+ |-
+ |2I13
+ |CAGATGTAGGGAAAAGCCCGGG
+ |[http://www.pdb.org/pdb/explore/explore.do?structureId=2I13]
+ |-
+ |2KMK
+ |CATAAATCACTGCCTA
+ |[http://www.pdb.org/pdb/explore/explore.do?structureId=2KMK]
+ |-
+ |2PRT
+ |CGCGGGGGCGTCTG
+ |[http://www.pdb.org/pdb/explore/explore.do?structureId=2PRT]
+ |-
+ |2WBS
+ |GAGGCGC
+ |[http://www.pdb.org/pdb/explore/explore.do?structureId=2WBS]
+ |-
+ |2WBU
+ |GAGGCGTGGC
+ |[http://www.pdb.org/pdb/explore/explore.do?structureId=2WBU]
+ |}
+</div>
+<div id="616" style="display:none">
+==June 16th==
+*So there was totally a crazy bee hive outside today!!
+'''Glycerol Stock pKD42'''
+*Grew up pKD42 at 30 C and, once it reached mid-log, created a glycerol stock and placed it in the -80 freezer
+'''Overlap PCR gel'''
+*Ran gel to test if overlap PCR that ran through the night worked, and it did not
+**Used PCR product without purification, which may explain why it didn't work
+*'''PCR:''' Since the purification of the PCR done the previous day (6/15) gave low yields, we ran a backup PCR using Phusion master mix (Finnzymes)
+**PCR used to amplify section of plasmid containing zinc finger binding site, weak promoter, His3, and URA3 (with homology to join it to kan cassette)
+**'''Reagents'''
+**zinc finger binding site and weak promoter, selection construct plasmid:
+***1µL ZFB-wp-f (100µM)  (taken directly from the primer tube)
+***1µL ZFB-wp-hisura-r (100µM)  (taken directly from the primer tube)
+***2µL selection construct (1:100 dilution of overnight culture)
+***25µL Phusion High-Fidelity PCR Master Mix
+***21µL distilled water (for total volume of 50µL)
+**PCR used to amplify the kan cassette
+**'''Reagents'''
+**KAN cassette, pZE22g plasmid:
+***1µL hisura-kan-f (100µM)  (taken directly from the primer tube)
+***1µL kan-r (100µM)  (taken directly from the primer tube)
+***2µL pZE22g (1:100 dilution of glycerol stock culture)
+***25µL of Phusion High-Fidelity PCR Master Mix
+***21µL distilled water (for total volume of 50µL)
+**'''Parameters:'''
+***1) 94°C for 2 min (denature template, activate enzyme)
+***2) 94°C for 30 sec (denature)
+***3) 53°C for 30 sec (anneal)
+***4) 72°C for 2 min (extend)
+***5) Repeat 2-4 for 25 cycles total
+***6) 72°C for 5 min
+***7) 4°C forever
+*Ran PCR product on a gel: bands of the correct size were observed, though the kan band was much fainter than the ZFB-wp-his3
+[[File:2011.06.16.kan,ZFB-wp-his-ura1|thumb|none|Kan cassette and ZFB-wp-his3 constructs 6/16/11]]
+*'''Repeat PCR''' (to get a higher concentration of the Kan cassette and ZFB-wp-his3 constructs)
+**same as the above backup PCR (since it was successful), but to a 4x total volume of 200µL, compared to 50µL
+**PCR products were run on a gel: the correct bands were observed--see image below ("second gel")
+**PCR product purification: followed Qiagen kit instructions. Strangely, the conc. was 64.6 ng/µL for Kan, purity 2.09 (260/280) and for ZFB-wp-his3, the conc. was 23.7ng/µL and the purity was 1.92 (260/280).
+[[File:2011.6.16.kan&selectionconstruct(labeled).png|thumb|none|Kan cassette and ZFB-wp-his3 constructs second gel 6/16/11]]
+'''Overlap PCR:''' used the kan cassette (64.6 ng/µL) and ZFB-wp-his3-ura3 (23.7 ng/µL) purified above
+*1 µL of kan and 1 µL of ZFB-wp-his3-ura in each tube according to the following conditions:
+**two tubes: undiluted
+**two tubes: both diluted 1 in 10
+**two tubes: both diluted 1 in 100
+*12.5µL Phusion master mix
+*8 µL ddH2O
+*primers: 1.25 µL ZFB-wp-hisura_r (10 µM) and 1.25 µL hisura-kan_f (10 µM)
+**we tried two different reaction types: one added the primers as usual before starting the PCR reaction, the other added the primers after 10 PCR cycles (allowing the polymerase to first use the overlapping kan and ZFB to elongate, and then the primers)
+*Parameters for PCR starting with primers:
+**(PCR machine 5, program name  EXT3KB in IGEM folder)
+**1) 98°C for 1 min (denature template, activate enzyme)
+**2) 98°C for 15 sec (denature)
+**3) 65°C for 15 sec (anneal)
+**4) 72°C for 2 min (extend)
+**5) Repeat 2-4 for 30 cycles total
+**6) 72°C for 5 min
+**7) 4°C forever
+*Parameters for PCR starting without primers:
+**1) 98C for 1 min
+**2) 98C for 15 sec
+**3) 65C for 15 sec
+**4) 72C for 1 min
+**5) back to step 2 for 10 cycles (PCR paused after 10 and primers added)
+**6) 98C for 15 sec
+**7) 65C for 15 sec
+**8) 72C for 2 min
+**9) back to step 6 for 20 cycles
+**10) 72C for 5 min
+**11) 4C forever
+'''Transformation'''
+*Used the selection strain (ΔHis3ΔPyrFΔrpoZ) cells at mid-log and attempted to place lambda red (pKD42) plasmid into the cell
+*Procedure
+**Keep on ice throughout the whole procedure until the electroporation step
+**Spin 1.5 mL of mid-log cells '''at 4 C''' for 1 minute at 18000 rcf (we created two tubes through the following steps)
+**Discard supernatant and resuspend with 1 mL of '''cold''' water
+**Spin again and repeat for a second water wash
+***With each wash, try to remove as much supernatant as possible (even using a pipette), because we don't want salts to interfere with the electrical pulse
+**Resuspend pellet with 50 µL of cold water
+**Add 1 ng of pKD42 to one of the tubes and 45 ng of pKD42 to the other
+**Take all of the liquid in each tube and place in two separate cuvettes for electroporation
+***Make sure the electroporation machine is on the right setting (for the cuvettes we used today it was "Ec2")
+***Wipe off all water on the side of the cuvette
+**Have 1 mL of LB in hand and after pulsing, immediately put LB in cuvette
+**Transfer to culture tube and place in 30 C for 2 hours
+**Make 4 LB/amp plates and spread E. coli using glass beads:
+***Plate 1: 10 µL of 1 ng culture
+***Plate 2: take 700 µL of 1 ng culture, spin down and remove supernatant, resuspend in about 30 µL of LB and plate
+***Plate 3: 10 µL of 45 ng culture
+***Plate 4: take 700 µL of 45 ng culture, spin down and remove supernatant, resuspend in about 30 µL of LB and plate
+**Grow overnight at 30 C
+'''PCR to confirm knockouts of selection strain'''
+*this PCR was to confirm that the ΔHis3ΔPyrFΔrpoZ was indeed a knockout for the His3, PyrF, and rpoZ genes
+*each primer set was used for two conditions: wild-type (we used a pKD42 culture) and knockout (ΔHis3ΔPyrFΔrpoZ culture, left over from transformation)
+*1 µL of either wt or ko template, diluted 1:20
+*12.5 µL Phusion master mix
+*1.25 µL of each 10µM primer:
+**test for His3:
+***1)His3_F, His3_R
+***2)His3_F, His3_internalR
+**test for PyrF:
+***3)PyrF_F, PyrF_R
+***4)PyrF_F, PyrF_internalR
+**test for rpoZ:
+***5)rpoZ_F, rpoZ_R
+***6)rpoZ_F, rpoZ_internalR
+**test for Zeocin (there are two primer sets because we don't know what orientation the Zeocin gene is in)
+***7)Zeocin_R, rpoZ_F
+***8)Zeocin_R, rpoZ_R
+*ddH2O up to 25 µL
+*Parameters:
+**98 C for 5 min
+**98 C for 10 sec
+**65 C for 25 sec
+**72 C for 45 sec
+**cycle 30 times
+**72 C for 5 min
+**4 C forever
+*Results: ran PCR products out on 1% gel (see below). There were some nonspecific bands, but the PyrF and rpoZ genes do appear to be knocked out in the selection strain. His3, however, looks like it's still present--we'll test again to confirm.
+[[File:2011.06.16.selectionstraintestfordeletion(labeled)|thumb|none|Gel Confirmation of Knockouts in Selection Strain 6/16/11]]
+==June 16 - Bioinformatics==
+*'''Research Targets'''
+#Clinically relevant targets
+#Existing ZFs that bind under-represented triplets
+'''Updating our programs'''
+*Many of our programs currently look at overall data or data based on specific DNA triplets (for example: 'GAT' or 'AAA'). However, in order to more easily understand some of the patterns that occur in the datasets, we want to examine broader subsets of data. For example, do different patterns appear when looking at fingers that bind to 'GNN' triplets versus 'NGN' triplets (where 'N' represents any of the 4 bases)?
+**We added the capability for our programs to accept inputs with the variable 'N' by using regular expressions.
+***We can now create lists of the zinc fingers that bind to any triplet, and create interaction matrices and frequency tables for any triplet input.
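+A minimal sketch of the regular-expression approach described above, assuming the finger data is available as (triplet, helix) pairs. The function names and the example records are illustrative, not our actual code or data.

```python
import re


def triplet_pattern(query):
    """Compile a regex for a triplet query in which 'N' matches any of A, C, G, T."""
    query = query.upper()
    if len(query) != 3 or set(query) - set("ACGTN"):
        raise ValueError(f"not a valid triplet query: {query!r}")
    return re.compile("^" + query.replace("N", "[ACGT]") + "$")


def fingers_binding(query, fingers):
    """Return the (triplet, helix) records whose triplet matches the query."""
    pattern = triplet_pattern(query)
    return [(t, h) for t, h in fingers if pattern.match(t.upper())]


# Illustrative records only (not taken from our datasets).
example = [("GAT", "QSSNLVR"), ("TGG", "RSDHLTT"), ("GCG", "RSDDLTR")]
print(fingers_binding("GNN", example))
print(fingers_binding("NGN", example))
```

+The filtered list can then be fed to the same interaction-matrix and frequency-table code used for exact triplets.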
+</div>
+<div id="617" style="display:none">
+==June 17th==
+'''Update on selection strain knockout status:''' We are trying to reach Addgene to check how His3 was knocked out---instead of deleting the gene, they may have simply introduced an early stop codon. If that's the case, our gel would have the correct bands because the primers we designed can only show whether a deletion or insertion was in that locus.
+'''Transformation results''': successful!!
+*The only plate with colonies was the one plated with 700 µL of cells transformed with 45 ng of pKD42
+*Chose a colony to grow in 3mL LB, 1.5µL amp, 30C; make glycerol stock with mid-log cells
+*Plate with colonies at 4C
+'''Miniprep of pZE22G:''' (to have the plasmid containing the kan cassette on hand)
+*used 2 tubes of 1.5mL overnight culture, followed Qiagen kit instructions
+*38.0 ng/µL, 260/280=1.99
+*27.8ng/µL, 260/280=2.02
+'''Overlap PCR gel and extraction''': 1%, 150V
+*Results: adding the primers in after 10 cycles was much more successful than adding them at the beginning, and all three dilutions showed the expected product band (about 2.5kb). The rest of the 1:10 dilution will be run on a gel and extracted.
+*11.2ng/µL, 260/280=2.10
+[[File:2011.06.16.kanZFBoverlap(labeled)|thumb|none|Successful Overlap of Kan Cassette and ZFB Gel 6/17/11]]
+==June 17 - Bioinformatics==
+===Goals===
+#Make BB Database in program-readable format ✓
+#Edit out BB with incomplete helices ✓
+#GNN, TNN, CNN, ANN frequencies
+*Targets (5-10; '''8''') '''x''' Backbones (???) '''x''' Helices (≥500)=55,000
+**Backbones: similar, but not ''too'' similar to zif268; more than 1-2 aa changes, but <10
+**Helices fixed based on our program-- eventually saturates and levels out
+***Graph: # of var (# of tries by the computer) vs. % space covered
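+The saturation graph mentioned above (number of tries vs. % of space covered) can be sketched with a toy simulation. Here the "space" is just a list of abstract outcomes with sampling weights; the weights, the seed, and the function name are assumptions for illustration, not our generator.

```python
import random


def coverage_curve(weights, n_draws, seed=0):
    """Fraction of distinct outcomes seen after each of `n_draws` weighted draws."""
    rng = random.Random(seed)
    outcomes = list(range(len(weights)))
    draws = rng.choices(outcomes, weights=weights, k=n_draws)
    seen = set()
    curve = []
    for outcome in draws:
        seen.add(outcome)
        curve.append(len(seen) / len(outcomes))
    return curve


# Toy example: 50 equally likely outcomes, 500 tries.
curve = coverage_curve([1] * 50, 500)
print(f"covered after 50 tries: {curve[49]:.0%}, after 500 tries: {curve[-1]:.0%}")
```

+With skewed weights the curve flattens earlier, because rare outcomes take many more tries to appear; that is the levelling-out behaviour we expect from the helix generator.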
+[[File:Interaction Map.png|frame|right|Proposed interactions between helical zinc finger residues and base pairs of the target DNA sequence (based on Persikov 2011 <cite>Persikov2011</cite>)]]
+====Options for Target DNA Sequences / ZF Helices====
+#F3(known) / F2(known) / '''F1(novel)'''
+#F3(known) / '''F2(SNP in b<sub>1</sub> position)''' / F1(known)
+#'''F3(unknown) / F2(unknown) / F1(unknown)'''
+*Excluded Rare Codons (for ''E. coli'')<cite>CodonUsage OpenWetWareCodonUsage NIHRareCodonCalculator</cite>:
+**CTA
+**ATA
+**CCC
+**CGA
+**CGG
+**AGA
+**AGG
+**GGA
+**GGG
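+A short sketch of how a designed coding sequence could be screened against the excluded codons listed above. It assumes the sequence is given in frame, starting at the first codon; the function name is our own.

```python
# Rare E. coli codons excluded from our designs (list above).
RARE_CODONS = {"CTA", "ATA", "CCC", "CGA", "CGG", "AGA", "AGG", "GGA", "GGG"}


def rare_codon_positions(coding_seq):
    """Return (codon_index, codon) for each excluded codon in an in-frame sequence."""
    seq = coding_seq.upper()
    if len(seq) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    return [(i, codon) for i, codon in enumerate(codons) if codon in RARE_CODONS]
```

+An empty result means the sequence uses none of the excluded codons; otherwise the returned indices show which codons need a synonymous replacement.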
+----
+====References====
+<biblio>
+#Persikov2011 pmid=21572177
+#CodonUsage http://www.sci.sdsu.edu/~smaloy/MicrobialGenetics/topics/in-vitro-genetics/codon-usage.html
+#OpenWetWareCodonUsage http://openwetware.org/wiki/Escherichia_coli/Codon_usage
+#NIHRareCodonCalculator http://nihserver.mbi.ucla.edu/RACC/
+</biblio>
+</div>
+<div id="620" style="display:none">
+==June 20th==
+*Grew up colony of the selection strain with pKD46 in an attempt to reach mid-log and create glycerol stock
+**Unable to reach mid-log, so going to leave growing over night and use saturated culture tomorrow
+*Determined primers in order to piece together the omega subunit and ZFP genes into the pZE21G plasmid (spec cassette)
+*Ran PCR on His3 locus and sent to GENEWIZ to be sequenced
+**used the same procedure as the earlier WT/KO PCR, but with 1µL undiluted template and only His_F and His_R primers
+**ran 3 reactions and sent in three primers (His_F, His_R, His_internalR)
+==June 20th - Bioinformatics==
+==='''Goals for the week'''===
+*Finish designing the chip, by Wednesday hopefully
+**Need chip order out, takes 4 weeks
+**Need all sequences by '''''Friday'''''!!!
+*FIRST PRIORITY: If we can get Persikov to work, good!
+**Step one: get results he’s published, get the web app to "work" with his data, then OPEN data, and finally our data
+*Brainstorming session (tomorrow?) to decide how many targets/sequences
+*Determine the importance of the first/second/third nucleotide positions
+**Look at NGN, NTN, NAN, NCN (''Not just GNN, etc.'')
+**Pick a particular GNN, plot vs. TNN- is there a pronounced difference in position 1, or -1?
+===Today===
+*Testing Persikov's Data for validation
+**Persikov v. himself  ✓
+**Persikov v. OPEN
+**Persikov v. our sequences
+===Probability data===
+*The following are graphs of the probability of finding each amino acid at each position on the alpha helix.
+{|
+| [[File:gnn_freqs.png|thumb|left|Probability data for the 783 fingers that bind to '''GNN''' triplets. Note the high probability of leucine at position 4 and arginine at position 6.]]
+| [[File:tnn_probs.png|thumb|left|Probability data for the 128 fingers that bind to '''TNN''' triplets. Note the high probability of leucine at position 4.]]
+| [[File:cnn_probs.png|thumb|left|Probability data for the 16 fingers that bind to '''CNN''' triplets. There may not be enough data to consider this information statistically significant.]]
+| [[File:ann_probs.png|thumb|left|Probability data for the 29 fingers that bind to '''ANN''' triplets. There may not be enough data to consider this information statistically significant.]]
+|-
+| [[File:ngn_probs.png|thumb|left|Probability data for the 298 fingers that bind to '''NGN''' triplets. The position 4 leucine motif remains. There is also a high probability (> 0.5) of a histidine at position 3 and an arginine at position 6.]]
+| [[File:ntn_probs.png|thumb|left|Probability data for the 177 fingers that bind to '''NTN''' triplets. The position 4 leucine motif remains.]]
+| [[File:ncn_probs.png|thumb|left|Probability data for the 244 fingers that bind to '''NCN''' triplets. The position 4 leucine motif remains. There is also a very high probability of an arginine at position 6.]]
+| [[File:nan_probs.png|thumb|left|Probability data for the 248 fingers that bind to '''NAN''' triplets. The position 4 leucine motif remains. There is also a very high probability (> 0.75) of an asparagine at position 3 and an arginine at position 6.]]
+|-
+| [[File:nng_probs.png|thumb|left|Probability data for the 234 fingers that bind to '''NNG''' triplets. The position 4 leucine motif remains. There is also a very high probability (> 0.75) of an asparagine at position 1 and a high probability (> 0.5) of an aspartic acid at position 2 and an arginine at position 6.]]
+| [[File:nnt_probs.png|thumb|left|Probability data for the 247 fingers that bind to '''NNT''' triplets. The position 4 leucine motif remains. There is also a high (> 0.5) probability of an arginine at position 6.]]
+| [[File:nnc_probs.png|thumb|left|Probability data for the 262 fingers that bind to '''NNC''' triplets. The position 4 leucine motif remains. There is also a very high (> 0.75) probability of an arginine at position 6.]]
+| [[File:nna_probs.png|thumb|left|Probability data for the 218 fingers that bind to '''NNA''' triplets. The position 4 leucine motif remains. There is also a very high (> 0.75) probability of a glutamine at position -1 and an arginine at position 6.]]
+|}
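+The probabilities plotted above are per-position amino acid frequencies. A minimal sketch of that calculation, assuming the helices for one triplet class are stored as 7-letter strings ordered -1, 1, 2, 3, 4, 5, 6 (the function name and input format are assumptions):

```python
from collections import Counter

POSITIONS = [-1, 1, 2, 3, 4, 5, 6]


def position_probabilities(helices):
    """Return {position: {amino acid: probability}} for a list of 7-residue helices."""
    if not helices:
        raise ValueError("need at least one helix")
    counts = {pos: Counter() for pos in POSITIONS}
    for helix in helices:
        if len(helix) != len(POSITIONS):
            raise ValueError(f"expected 7 residues, got {helix!r}")
        for pos, aa in zip(POSITIONS, helix):
            counts[pos][aa] += 1
    n = len(helices)
    return {
        pos: {aa: c / n for aa, c in counter.items()}
        for pos, counter in counts.items()
    }
```

+With small classes such as CNN (16 fingers) each finger shifts a probability by more than 6%, which is one way to see why those graphs should be read cautiously.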
+</div>
+<div id="621" style="display:none">
+==June 21st==
+'''His3 sequencing results:'''
+The sequencing results showed that the His3 (HisB) gene is still present in the strain and without any early stop codons. There is a 2 aa deletion in the middle of the protein, but its purpose is unknown and the gene likely is still fully functional.
+*Restreak selection strain on plate from glycerol stock--tomorrow we will PCR the His3 locus and sequence again just to be sure.
+*Made oligos for MAGE to insert stop codons and make a frame shift in the endogenous His3 gene, so that if necessary we can knockout His3 ourselves.
+'''Selection strain with lambda red:'''
+*Reinoculated and made glycerol stock
+*prepared for MAGE tomorrow
+==June 21st - Bioinformatics==
+===Persikov Statistics - Graphs===
+{|
+| [[File:Scatterplot of top bottom 20 with SVM polynomial.png|thumb|left|Scatterplot of top/bottom 20 with SVM polynomial]]
+| [[File:Sequence by sequence (lin SVM).png|thumb|left|Sequence by sequence (lin SVM)]]
+| [[File:Top_Bottom_20_ZFs_(SVM_linear).png|thumb|left|Top/Bottom 20 ZFs (SVM linear)]]
+| [[File:Comparison of polynomial vs linear distribution (polynomial generally higher values).png|thumb|left|Comparison of polynomial vs. linear distribution (polynomial generally higher values)]]
+|}
+*FQCRICMRNFS<sub>zif268 F2 Backbone</sub>/'''''Helix F1'''''/TGEKP<sub>linker</sub>
+*The Persikov data shows weak predictive power for OPEN amino acid sequences. Our conclusion is that Persikov's program is not well-suited for incorporation into our helix generator. Testing Persikov's helices in his program yielded mostly accurate results (approximately 24/25 matched known binding information). This is an important test because it showed that we are using the program correctly and that the program is in fact working properly. However, testing the OPEN sequences in Persikov's program resulted in numerous false negatives, which informed our decision not to use Persikov's program to check our own helix-generating program.
+===Phone Call with Dan===
+*How conservative/risky should we be in terms of using other backbones?
+**<u>'''Conservative'''</u>
+***'''Possible Pros:'''
+****More likely to get something that will work
+****Depending on how "smart" our probabilities are (from our ZF generation algorithm), we could cover a lot of novel space without straying too far from zif268
+****''Worst Case'':Something we can show for iGEM (we covered the same ground OPEN did, and found many of the same ZFs, but with a targeted approach, a "smarter" method-- not throwing random things at it; Chip is not ours, but the program is "smarter")
+***'''Possible Cons:'''
+****Might end up covering the same ground as OPEN, but doing a "worse" job than they did
+****Less likely to discover new/groundbreaking things (i.e., TNN triplets)
+**<u>'''Less Conservative'''</u>
+***Have 3-6 target sequences (we're currently going for 8)
+***More backbones from non-zif268 than zif268
+***'''Pros:'''
+****We could get lucky and find something no one has ever seen before (TNN, ANN). If we throw enough things at it, we're more likely to get lucky.
+***'''Cons:'''
+****''Risk:'' Many of these backbones (from the entire ZF world) may NOT bind DNA (i.e., may bind proteins)
+****''Risk:'' May not find anything that binds, then the whole project is a dud
+*'''What is the more important variable, helices or backbones?'''
+**Helices seem to be more important, backbones of secondary importance
+**Backbones: ZF's unravel DNA, open the major groove-- backbone is important here, changes the bond angle, etc. (Brandon's paper-??)
+*'''''Balance''''' needed between low and high risk
+**If we find backbones that we know bind DNA, greatly lowers our risk
+**Limited spaces on chip: zero-sum game
+**With a middle of the road approach, we diminish both benefits and risk (diminishes the benefits of the high risk approach much more than it diminishes the benefits of the conservative approach; i.e., if you're playing the lottery, you're more likely to win if you buy many more tickets)
+*We need to compare probabilities of randomly-generated OPEN sequences vs. probabilities of sequences randomly generated by our program
+**OPEN tries to cover all space: smaller probability
+**If we have a "smarter" algorithm, we can produce fewer
+**However,  the idea is not to repeat OPEN, but to go somewhere else, non-GNN sequences
+**'''''Remember:''''' OPEN is a ''Cell'' paper; the point of the project is not to compare ourselves to them
+*If we find binders for 1-2 of our sequences, that would be awesome
+**Probably we'll have some that find none, some have 10, our last one might have 1,000 hits (then, we do bioinformatics to figure out why/what those hits were)
+**Point: to learn and do high-level bioinformatics, and high-tech cloning techniques in the lab
+**If you do find binders, you can write a paper about it!
+*We have all the resources we need right now to build our chip
+**We need to pick out targets
+**'''Need to decide exactly what we want for:'''
+***No. of target sequences/which ones
+***No. of helices/ which ones
+***Ratio of zif268 backbones: non-zif268 backbones
+**Avoid switching Leucine out of position 4, then change other positions based on our frequencies
+===Chip Design===
+*No. of sequences will be more than we can put on the chip
+**Helices: essentially unlimited
+***Put more-likely-to-bind helices into the risky backbones
+***Put less-likely-to-bind helices into a zif268 backbone
+*Backbones
+**Maybe revert to a more targeted approach: pick backbones that we know are transcription factors (TFs), that we know bind to DNA
+**''OR'' research the ZFs from the phylogenetic tree
+***Pick clades to research, see if one looks better than the other
+**Why did OPEN cover so many helices, without changing the backbone, but still yield predominantly GNNs?
+**If we have an idea of how the backbone might affect binding, maybe we could look into some sort of low-level modeling, etc. so that we wouldn't be grasping? Could Vatsan help with this?
+***See 2000 Wolfe paper [http://www.ncbi.nlm.nih.gov/pubmed/10940247]
+**Backbones ''could'' affect interactions between fingers
+**Theory: energy penalty to ZF binding-- unravels DNA when binds to it
+*We have 12 target sequences
+**2 per 4 diseases, 4 for the 5th disease
+**If we want to be more conservative, we could throw out Type III, but it could be something cool
+**'''We should have mostly Type I (CoDA argument, if this is an F2)'''
+**Proposed: 3 diseases, 6 sequences
+***4 Type I (F3 and F2 known, F1 novel)
+***1 Type II (GNN, ANN, GNN)
+***1 Type III (All unknown, e.g., TNN, ANN, TNN;'''''max 1''''')
+Or, for 3 diseases:
+# Type I's
+# Type I, Type II
+# Type I, Type III
+*'''<u>Clinical Targets</u>
+# Colorblindness ('''Type I's''')
+# Familial Hypercholesterolemia (FH) (1 in 500)
+# <del>Cystic Fibrosis (CF)</del>
+# <del>Tay Sachs</del>
+# KRAS- (oncogene/cancer)
+*'''Main goal of project''': to build outside of what is already known
+**If we wanted to cure a disease only, we could just use existing ZFs (i.e., find GNN binding locations)
+**Also, we lend a level of specificity for insertion/deletion
+**There is the possibility that there might be some area where specificity might demand ANN codons
+<u>'''Current decision on chip design:'''</u>
+*We will have 6 target sequences, 2 each from colorblindness, FH, and KRAS.  All are "Type I" targets (only F1 is novel) with the middle finger chosen from the CODA paper (either GNN or TNN)
+**N.B.: the CB and FH sequences make up full ZF nuclease cut sites. The KRAS sites, due to the small number of GNNTNN F3F2 combos available in CODA, are separate, with the flanking ZF nuclease site added afterwards in parentheses
+# GGT'''G'''GT'''A'''AG (CB)
+# GGA'''G'''TC'''C'''TG (FH)
+# GGC'''T'''GA'''T'''GC (KRAS) (CTGAAAATT)
+# GGC'''T'''GA'''C'''AC (FH)
+# GGC'''T'''GG'''A'''AT (KRAS) (GACAAGAGC)
+# GTC'''G'''CC'''T'''CC (CB)
+*Targets 3, 4, and 6 are similar to sequences Zif268 variants successfully bind to, so the backbones will be weighted accordingly:
+**Zif268_F2 backbone: 6000 helices (per target)
+**10 backbones more closely related to Zif268: 300 helices each
+*Targets 1, 2, and 5 will have equal distributions of backbones:
+**Zif268_F2: 3000 helices
+**10 backbones closely related to Zif268: 300 each
+**10 backbones more distantly related to Zif268: 300 each
+===Identifying dependencies===
+*We looked at the [[#Probability data|probability graphs]] to determine which amino acid positions on the finger's helix interact with which bases.
+**Some interactions are fairly well established, while others have been more recently proposed (See [[#June 17 - Bioinformatics|interaction map (Persikov 2011)]])
+**To identify these interactions in our own data we looked at which helix positions varied most when you changed the bases. A more rigorous way to do this is to calculate how the entropy at each amino acid position changes as you change the bases.
+***'''xNN'''(Vary base 1): Amino acid 6 changes
+***'''NxN'''(Vary base 2): Amino acid 3 changes
+***'''NNx'''(Vary base 3): Amino acid -1 and 2(?) changes
+**Our program looks at dependencies between amino acids when generating sequences.
+***We decided on these amino acid dependencies, using both established data and patterns we saw in the OPEN data:
+****-1 and 2
+****2 and 1
+****6 and 5
+**Because there is not much data for 'CNN' and 'ANN' sequences (with 16 and 29 known fingers that bind to each triplet, respectively), we should use pseudocounts for these sequences, so that our frequency generator is not too biased toward probabilities that may not be significant.
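+The entropy idea mentioned above can be sketched as follows. This is only a minimal illustration, assuming helices are stored as 7-letter strings in the order -1, 1, 2, 3, 4, 5, 6; the function names and the toy helices are hypothetical and are not our actual data or program.

```python
import math
from collections import Counter

def position_entropy(helices, index):
    """Shannon entropy (in bits) of the amino acids seen at one helix position."""
    counts = Counter(h[index] for h in helices)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def entropy_profile(helices):
    """Entropy at each of the 7 helix positions (-1, 1, 2, 3, 4, 5, 6)."""
    return [position_entropy(helices, i) for i in range(7)]

# Toy helices (made up for illustration): position 4 (index 4) is always L.
toy = ["RSDHLTR", "QSGNLAR", "RSDNLTR", "QSSHLTR"]
profile = entropy_profile(toy)
```

+Comparing such profiles between sets of fingers that differ in a single base (e.g. GNN vs. TNN binders) would point to the helix positions that respond to that base.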
+</div>
+<div id="624" style="display:none">
+===June 24===
+*Designed primer for testing HisB deletion, reuse His_Internal_R to test the band
+===Updated Closest Zif268 Fingers===
+We realized that some of our "close non-zif268 fingers" were actually not all that close to Zif268, and so we went into the 88,000 zinc finger database and pulled out zinc fingers surrounding zif268.  In fact, there were many, many, many zinc fingers that had identical sequences to the Zif268 F2 finger, and so we looked at sequences around it.  The tree below shows the new non-zif268 backbones that are actually close to zif268 compared to our old set.  The new set is in gray, the old set is in black.  This gives us a potential seven more backbones to work with.
+[[File:ComparisonTree.png‎]]
+==June 24th - Bioinformatics==
+===Sequence Generation===
+We made some small updates to the sequence generator, based on the frequencies we noticed in the outputs of the tests we ran.
+*We decided to only include pseudocounts for position 6 for 'CNN' and 'ANN.' Originally, 'CNN' and 'ANN' were using pseudocounts for all seven positions. However, this introduced a noticeable increase in amino acids, such as tyrosine (Y), that have been shown to occur rarely in zinc fingers (according to our data from OPEN and Persikov). Additionally, because tyrosines occurred so rarely in the data (11 times total in the OPEN data set), we decided not to give tyrosine a pseudocount.
+*We added the capability to prevent repeat backbone-helix combinations on the chip. That is, we wanted to make sure that the same exact zinc finger was not generated for different triplet inputs.
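+A minimal sketch of how repeat backbone-helix combinations could be prevented, assuming helices are drawn independently per position from frequency tables (our real generator also uses dependencies between positions). All names and the toy frequency values are hypothetical.

```python
import random

def sample_helix(freqs, rng):
    """Draw one 7-residue helix; freqs is a list of 7 {amino_acid: weight} dicts."""
    return "".join(
        rng.choices(list(table), weights=list(table.values()))[0]
        for table in freqs
    )

def generate_unique(freqs, backbone, n, seen, rng, max_tries=100000):
    """Generate up to n (backbone, helix) pairs, skipping any pair already in `seen`."""
    out = []
    tries = 0
    while len(out) < n and tries < max_tries:
        tries += 1
        pair = (backbone, sample_helix(freqs, rng))
        if pair not in seen:
            seen.add(pair)   # remember the pair across all later calls
            out.append(pair)
    return out

# Toy frequency tables (hypothetical values, not our OPEN-derived tables).
toy_freqs = [{"R": 0.6, "Q": 0.4}, {"S": 1.0}, {"D": 0.5, "G": 0.5},
             {"H": 0.5, "N": 0.5}, {"L": 1.0}, {"T": 0.7, "A": 0.3}, {"R": 1.0}]
rng = random.Random(0)
seen = set()
first = generate_unique(toy_freqs, "zif268_F2", 5, seen, rng)   # e.g. one triplet
second = generate_unique(toy_freqs, "zif268_F2", 5, seen, rng)  # e.g. another triplet
```

+Sharing one `seen` set across all triplet inputs is what keeps the same zinc finger from appearing twice on the chip.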
+To test the sequence generator, we made two sets of 2000 sequences for GAA, then infographic-d the results. Comparing these with the images for OPEN and OPEN+Persikov shows that our generation follows the major themes of those datasets, but also introduces variation. The two generated sets also vary slightly from each other, which shows the influence of randomness on the generation.
+{|
+ | [[File:GAA_generated_round_1.png|thumb|left|Round 1 of generating sequences for GAA with the program.]]
+ | [[File:GAA_generated_round_2.png|thumb|left|Round 2 of generating sequences for GAA with the program.]]
+ |-
+ | [[File:GAA_open_and_persikov.png|thumb|left|GAA sequences from Persikov and OPEN datasets.]]
+ | [[File:GAA_open_only.png|thumb|left|GAA sequences from the OPEN dataset.]]
+|}
+{| class="wikitable" border="3" cellpadding="5"
+| align="center" style="background:#f0f0f0;"|'''Disease'''
+| align="center" style="background:#f0f0f0;"|'''Target DNA Finger 1'''
+| align="center" style="background:#f0f0f0;"|'''Helices in Zif268 Backbone'''
+| align="center" style="background:#f0f0f0;"|'''Helices in Zif268 Closely-Related Backbones'''
+| align="center" style="background:#f0f0f0;"|'''Helices in Zif268 Distantly-Related Backbones'''
+|-
+| Colorblindness ''(Bottom)''||TGG||5150||3000||1000
+|-
+| Colorblindness ''(Top)''||ATG||3050||3050||3050
+|-
+| Familial Hypercholesterolemia ''(Bottom)''||GAC||5150||3000||1000
+|-
+| Familial Hypercholesterolemia ''(Top)''||CTG||3050||3050||3050
+|-
+| Myc ''(Top<sub>198</sub>)''||CTC||3050||3050||3050
+|-
+| Myc ''(Top<sub>981</sub>)''||AAA||3050||3050||3050
+|-
+|}
+Table of target sequences and helix distribution across backbones
+*Distribution: Zif268 : Zif268 similar : Zif 268 dissimilar
+**Conservative distribution 56.3 : 32.8 : 10.9
+**Riskier distribution 33.3 : 33.3 : 33.3
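+The distribution ratios above follow directly from the helix counts in the table. A short check (the function name is only for illustration):

```python
def backbone_distribution(zif268, close, distant):
    """Percentage of helices in each backbone class, rounded to one decimal."""
    total = zif268 + close + distant
    return tuple(round(100 * n / total, 1) for n in (zif268, close, distant))

conservative = backbone_distribution(5150, 3000, 1000)
riskier = backbone_distribution(3050, 3050, 3050)

# Two conservative targets and four riskier targets, 9150 helices each.
chip_total = 2 * (5150 + 3000 + 1000) + 4 * (3 * 3050)
```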
+==June 24th==
+'''pZE21G:'''
+*reinoculated culture with 100µL of saturated solution, grew to mid-log, and made glycerol stock
+*backbone PCR: ran E gel but no bands--PCR unsuccessful. We may need to use a different backbone for the zinc fingers.
+'''Omega and Omega+Zif268:'''
+*these were the only two PCR reactions from 6/22/11 to work
+*PCR purified using Qiagen kit:
+**omega: 6.1ng/µL, 260/280=1.83
+**omega+Zif268: 11.3 ng/µL, 260/280=1.67
+'''Lambda red recombination of selection system:'''
+*reinoculated selection strain+pKD46 with 100µL of saturated solution
+*just before mid-log (about 4 hours after inoculation) divided culture in half (1.5mL) and added either 37.5µL or 3.75µL of 20% arabinose solution (to try two different induction levels). Cultures grew for another hour.
+*The rest of the procedure was the same as the 6/22/11 attempt but without the 42C water bath.
+==June 24th - Bioinformatics==
+===Playing with Pseudocounts===
+Using CTC because of position 6's reliance on the CNN frequencies, we looked at what difference various pseudocount values make. (If the frequency of an amino acid in the frequency table is 0, it is bumped up to the pseudocount: e.g., A = 0 becomes A = .015 with a pseudocount of .015.) Pseudocounts are necessary for data with a small sample size: we could be missing out on working helices because a letter's frequency is 0 when it shouldn't be.
+Various pseudocount (psu = ) values. Look at the 7th column, which is position 6 in the helix:
+{|
+ | [[File:CTC_0.png|thumb|left|psu = 0]]
+ | [[File:CTC_.005_psuedo.png|thumb|left|psu = .005]]
+ | [[File:CTC_.008_psuedo.png|thumb|left|psu = .008]]
+ |-
+ | [[File:CTC_.01.png|thumb|left|psu = .01]]
+ | [[File:CTC_.015_psuedo.png|thumb|left|psu = .015.]]
+ | [[File:CTC_.02_psuedo.png|thumb|left|psu = .020.]]
+|}
+The variation from E being the top letter to A being the top letter and back to E comes from a slight adjustment in how we add pseudocounts: the 'new' way is a more proportional approach.
+Notice how psu = 0 gives only the four letters found in our dataset, while psu > 0 adds in other letters, each with a small probability ranging from .5% to 2%.
+The question is how much psu to add: less means we weight our (possibly flawed) data of proven zinc fingers more. Higher psu adds more randomness (variation) to our sequences, but some fraction of those sequences will not work.
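+A minimal sketch of the pseudocount step described above: zero frequencies are raised to psu. Renormalizing afterwards so that the values sum to 1 is an assumption of this sketch, as are the function name and the toy numbers; it does not reproduce the 'proportional' variant mentioned above.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def apply_pseudocount(freqs, psu, skip=()):
    """Raise zero frequencies to `psu`, then renormalize so the values sum to 1.

    `skip` lists amino acids that never receive a pseudocount (e.g. tyrosine).
    """
    floored = {}
    for aa in AMINO_ACIDS:
        f = freqs.get(aa, 0.0)
        if f == 0.0 and aa not in skip:
            f = psu
        floored[aa] = f
    total = sum(floored.values())
    return {aa: f / total for aa, f in floored.items()}

# Toy position-6 table with four observed letters (made-up numbers).
observed = {"E": 0.4, "A": 0.3, "R": 0.2, "Q": 0.1}
raw = apply_pseudocount(observed, psu=0.0)
smoothed = apply_pseudocount(observed, psu=0.015, skip=("Y",))
```

+With psu = 0 only the four observed letters can be drawn; with psu > 0 every other letter (except those in `skip`) gets a small chance.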
+'''List of Remaining Goals:'''
+*Sort fingers by target
+*Pick and assign primer sets
+*Reverse translate fingers avoiding type II restriction enzymes and primers
+*Append type II restriction enzyme and primer sequences to each finger
+*Yay
+</div>
+<div id="625" style="display:none">
+==June 25th-26th - Bioinformatics==
+This is the set of final target sequences with assigned forward and reverse primers (tags for PCR):
+{| class="wikitable" cellpadding="5"
+| align="center" style="background:#f0f0f0;"|'''Disease'''
+| align="center" style="background:#f0f0f0;"|'''Target Sequence'''
+| align="center" style="background:#f0f0f0;"|'''Forward Primer (5'-3' NOT REVERSE COMPLEMENT)'''
+| align="center" style="background:#f0f0f0;"|'''Reverse Primer (5'-3' NOT REVERSE COMPLEMENT)'''
+|-
+| Colorblindness||GCT GGC TGG||ATATAGATGCCGTCCTAGCG||AAGTATCTTTCCTGTGCCCA
+|-
+| Colorblindness||GCG GTA ATG||CCCTTTAATCAGATGCGTCG||TGGTAGTAATAAGGGCGACC
+|-
+| Familial Hypercholesterolemia||GGC TGA GAC||TTGGTCATGTGCTTTTCGTT||AGGGGTATCGGATACTCAGA
+|-
+| Familial Hypercholesterolemia||GGA GTC CTG||GGGTGGGTAAATGGTAATGC||ATCGATTCCCCGGATATAGC
+|-
+| Myc-gene Cancer||GGC TGA CTC||TCCGACGGGGAGTATATACT||TACTAACTGCTTCAGGCCAA
+|-
+| Myc-gene Cancer||GGC TGG AAA||CATGTTTAGGAACGCTACCG||AATAATCTCCGTTCCCTCCC
+|}
+Additionally, primer tags '''(forward: GTACATGAAACGATGGACGG, reverse:CTGGTATAGTCTCCTCAGCG)''' will be assigned to the 100 control sequences.
+</div>
+<div id="627" style="display:none">
+==June 27, Wet lab==
+'''Sequencing PyrF, rpoZ loci:'''
+*We will sequence these genes in the selection strain just to make sure they are knocked out, especially since it appears HisB is not.
+*Picked a colony off ∆HisB∆PyrF∆rpoZ plate (6/21) and grew in 150µL LB plus tet in a 96 well plate for about 2 hrs at 37˚C
+*diluted 1 in 20 and used 1µL as template in PCR with KAPA mastermix (see protocols for reagent amounts and parameters)
+**annealing temp 65˚C, elongation time 1:15
+#PyrF_F, PyrF_R primers
+#PyrF_F, PyrF_internalR
+#rpoZ_F, rpoZ_R
+#rpoZ_F, rpoZ_internalR
+#rpoZ_R, zeocin_R
+*Run on E Gel to check PCR worked: bands are at the same sizes as the original genotyping gel.
+[[File:2011.06.27.pyrFrpoZloci2(labeled)|thumb|none|PCR of kan-ZFB-wp-his3-ura3]]
+*Tomorrow we will send samples to Genewiz for sequencing
+'''Lambda Red recombination:'''
+*The plates made from the recombination (6/24) did have colonies, but they were very small and took a long time to grow, and so they may not actually have the kan-ZFB insert. We will have to PCR the locus to see.
+*Chose 8 colonies from each plate and grew at 30˚C in 150µL LB plus kan in a 96 well plate
+*When our primers arrive, we will PCR the locus to check for the insert.
+'''Selection system media:'''
+==June 27th - Bioinformatics==
+===To Do for Today===
+# 100 sequences (and control), 2 each with the same F3 and F2, but different F1, from our test sequences [zif268, OZ123, OZ052, CoDA]✓
+#Type II nuclease cut site sequences- put the binding sites into our oligos ✓
+#Final backbones with helices ✓
+#Programming stuff- Check to make sure there are no cut sites or primers in any of our backbone/helices combinations; check translation order (translates F1&rarr;F3)✓
+===100 Control Sequences===
+* See our [[Media:Positive Control Sequences PostMacro.xlsx|Positive Control Sequence Table]], updated June 28th
+* Selected known binding zinc fingers from the CODA table that bind sequences similar to our target sequences
+* All control helices from CODA were inserted into Zif268 F2 backbones and have been assigned a seventh primer tag separate from the tags given to the 6 target sequences.
+===Updated Target Sequences===
+One of our sequences from before was bad because the F3/F2 combo did not appear in the CODA table... faulty checking, my bad :(
+Here is the newest table of target sequences:
+{| class="wikitable" cellpadding="5"
+| align="center" style="background:#f0f0f0;"|'''Disease'''
+| align="center" style="background:#f0f0f0;"|'''Target Range'''
+| align="center" style="background:#f0f0f0;"|'''Binding Site Location'''
+| align="center" style="background:#f0f0f0;"|'''Bottom Finger'''
+| align="center" style="background:#f0f0f0;"|'''Top Finger'''
+| align="center" style="background:#f0f0f0;"|'''Bottom AA (F3 to F1)'''
+| align="center" style="background:#f0f0f0;"|'''Top AA (F3 to F1)'''
+|-
+| Colorblindness||chrX:153,403,001-153,407,000||3666|| style="background:#92D050" |GTG GGA TGG || style="background:#92D050" | GAA GGG ACC||RNTALQH.QSAHLKR.#######||QDGNLGR.RREHLVR.#######
+|-
+| Familial Hypercholesterolemia||chr19:11,175,000-11,195,000||14001||style="background:#92D050" | GGC TGA GAC||style="background:#92D050" | GGA GTC CTG||ESGHLKR.QREHLTT.#######||QTTHLSR.DHSSLKR.#######
+|-
+| Myc-gene Cancer||chr8:128,938,529-128,941,440||198||GGT GCA GGG||style="background:#92D050" | GGC TGA CTC||VDHHLRR.QSTTLKR.RRAHLQN||ESGHLKR.QREHLTT.#######
+|-
+| Myc-gene Cancer||chr8:128,938,529-128,941,440||981||GGA GAG GGT||style="background:#92D050" | GGC TGG AAA||QANHLSR.RQDNLGR.TRQKLET||EKSHLTR.RREHLTI.#######
+|}
+*Green cells are our target sequences.
+===Cut Site Design===
+*See our [[Cut Site Design]] page
+*We left in one proline (P) between the linker and the starting FQC... of finger 2. This proline is the last AA of the OPEN linker (TGEKP) and occurs before the beta sheet in every zinc finger in Zif268 (see zif268's sequence on its [http://www.pdb.org/pdb/explore/remediatedSequence.do?structureId=1AAY PDB page]).
+*This configuration also allows the library to be used at any finger position because proline ends the OPEN linker.
+===Updates on the program===
+The program appears to run extremely slowly because of the computationally intensive step of checking the reverse translated sequences
+*In addition to checking for the primers and cutsites, we also have to check for 'GGGGGG' because it can lead to undesirable structures forming. In addition, we have to check for the reverse complements for all these undesirable sequences.
+*We decided on 0.8 as the maximum acceptable similarity between the sequence the primer binds to and any other part of the generated sequence. If the sequences are too similar, the primer might mishybridize. We originally had a similarity threshold of 0.6, but that made the program run too slowly, so we decided on a '''threshold of 0.8'''.
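+The checks described above could look roughly like this. This is a sketch only: it assumes that "similarity" means the fraction of identical bases between the primer and a same-length window of the sequence, which may differ from the measure our program uses. The function names are hypothetical; the primer is the first forward tag from the 6/25 table.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def has_forbidden(seq, forbidden):
    """True if seq contains any forbidden site or its reverse complement."""
    return any(site in seq or reverse_complement(site) in seq for site in forbidden)

def max_similarity(seq, primer):
    """Highest fraction of matching bases between primer and any same-length window."""
    k = len(primer)
    best = 0.0
    for i in range(len(seq) - k + 1):
        matches = sum(a == b for a, b in zip(seq[i:i + k], primer))
        best = max(best, matches / k)
    return best

def passes(seq, forbidden, primers, threshold=0.8):
    """Accept seq only if it has no forbidden site and no window too close to a primer."""
    if has_forbidden(seq, forbidden):
        return False
    for p in primers:
        for probe in (p, reverse_complement(p)):
            if max_similarity(seq, probe) > threshold:
                return False
    return True

primer = "ATATAGATGCCGTCCTAGCG"
forbidden = ["GGGGGG"]  # cut sites would be added to this list as well
```

+Scanning every window against every primer and its reverse complement is the slow part; a lower threshold rejects more sequences, which would fit the slowdown we saw at 0.6.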
+</div>
+<div id="628" style="display:none">
+==June 28th==
+'''Sequencing:'''
+*the following samples from 6/27 were sent to Genewiz for sequencing:
+**PyrF F, R (one sample with PyrF_F, one with PyrF_R)
+**rpoz F, R (one sample with rpoz_F, one with rpoz_R)
+'''Lambda red results:'''
+*the colonies on the plates did not look promising, and the ones we chose and grew up in LB+kan did not actually grow.  Just to be certain, we chose 18 more colonies: 6 from 37.5µL arabinose 100µL plated, 6 from 3.75µL arabinose 100µL plated, and 6 from 37.5µL arabinose 1.5mL plated.  Three from each plate were grown in plain LB and three with kan. We will let them grow at 30˚C, overnight if necessary, and hopefully see bacteria for PCR.
+*Assuming this does not work, we prepared more ∆HisB∆PyrF∆rpoZ+pKD46 in two ways: we put 3 colonies in LB+amp from the 6/16 transformation plate, and we streaked a new amp plate from the glycerol stock
+*Another possibility is that something is wrong with our lambda red. We designed primers to verify that the pKD46 plasmid is really in the cells.
+'''Kan-ZFB-wp-his3-ura3 construct:'''
+*Our last few PCR purifications have given us very low yields, and consequently we have had to use large amounts of our DNA (and the large amounts of buffer salts may also be why our lambda red recombinations have failed). When we tried to amplify our current DNA using the hisura-kan_F and ZFB-wp-hisura_R primers and the Phusion mastermix, it did not work (see 6/23).  We will try to gain more product in two ways:
+1) Repeat 6/23 PCR but use KAPA mastermix
+*the KAPA mix may work better than the Phusion.
+*Used KAPA protocol with 1µL of kan-ZFB overlap as template, hisura-kan_F and ZFB-wp-hisura_R primers, 65˚C annealing temp, 90 sec elongation time
+*made 2 reactions
+2) Repeat overlap extension PCR (see 6/16) with KAPA mastermix
+*used KAPA protocol with 1µL kan cassette and 1µL of ZFB-wp-hisura, 65˚C annealing temp, 90 sec elongation
+*10 cycles without primers; hisura-kan_F and ZFB-wp-hisura_R primers added; 15 more cycles
+*made 6 reactions
+*E gel to check reactions worked: all 6 overlap PCRs successful, but not the other two reactions.
+[[File:2011.06.28kanZFBconstruct(labeled).png|thumb|none|kan-ZFB-wp-hisura construct 6/28/11]]
+*combined samples 1-3 and 4-6 and ran on 1% agarose gel for extraction
+[[File:2011.06.28kanZFBconstruct_for_extract1(labeled).png|thumb|none|kan-ZFB-wp-hisura construct for gel extraction 6/28/11]]
+*used Qiagen gel extraction kit and instructions with the following modifications:
+**gel bands were dissolved in 500µL of buffer QG regardless of the gel volume
+**gel heated at 50C for 20 min (to make up for reduced amount of buffer QG)
+**after melting, 10µL of NaOAC (3M) were added to adjust the pH
+**DNA from samples 1-3 was eluted in 20µL of ddH2O; DNA from samples 4-6 was eluted in 20µL of buffer EB
+**water sample: 273.4 ng/µL, 260/280=1.92
+**EB sample: 136.9 ng/µL, 260/280=2.38
+==June 28th - Bioinformatics==
+<font color=red>'''''Attention all Harvard iGEM-ers!!!'''''</font> <font color=blue> According to the [https://2011.igem.org/Main_Page iGEM Main Page], our preliminary project descriptions and safety proposals are due on</font> <font color=red>'''''July 15'''''</font>. <font color=blue> Please see the aforementioned link so we can get this done ASAP- we don't want to miss any deadlines and have all our hard work wasted!</font>
+*Finalized our [[Media:Positive Control Sequences PostMacro.xlsx|Positive Control Sequence Table]], using Justin's macro to insert the F1 helices into the appropriate zif268 F2 backbone
+*Length of chip oligos: 131-140bp (based on [[Cut Site Design]])
+**Primers: 20bp (x2= 40bp)
+**zif268 F2 backbone + helix= 23aa (x3=69bp; some fingers ~3aa longer)
+**Some alternate backbones are longer than zif268 F2 backbone
+**Type II binding/cut sites= 11bp on each side (22bp total)
+**Standard length: 40 + 69 + 22 = 131bp
+*Use WebLogos as a final visual check of our final generated sequences
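+The oligo length arithmetic above, written out as a quick check (constant and function names are only for illustration):

```python
PRIMER_BP = 20            # each primer tag
CUT_SITE_BP = 11          # type II binding/cut site on each side
ZIF268_F2_FINGER_AA = 23  # zif268 F2 backbone + helix

def oligo_length(finger_aa=ZIF268_F2_FINGER_AA):
    """Total oligo length in bp: two primers + two cut sites + coding region."""
    return 2 * PRIMER_BP + 2 * CUT_SITE_BP + 3 * finger_aa

standard = oligo_length()                         # zif268 F2 backbone
longest = oligo_length(ZIF268_F2_FINGER_AA + 3)   # fingers ~3aa longer
```

+The two values correspond to the ends of the 131-140bp range.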
+===Plasmid and Oligo Design Schematics===
+{|
+ | [[File:Oligo design on board.jpg|thumb|left|Oligo Design]]
+ | [[File:Plasmid design on board.jpg|thumb|left|Expression Plasmid Design]]
+|}
+===Chip-Based Sequence Design Schematic===
+{|
+ |[[File:Chip_protocol.png|thumb|left|Chip-based process for sequence design, taken from Kosuri, et al. 2010 model of scalable gene synthesis <cite>Kosuri2010</cite>]]
+|}
+====References====
+<biblio>
+#Kosuri2010 pmid=21113165
+</biblio>
+===Harvard Logo===
+{|
+ | [[File:Harvard_logo.png|thumb|left|]]
+|}
+===Running the Generator!===
+[[File:Fasta_total.csv]] NOTE: LATER GENERATED NEW SEQUENCES. NOT UP TO DATE.
+====Generated Final Chip Sequences====
+*We ran the generator once earlier this afternoon, but had to re-run it due to a typo in the cut sites and in the number of sequences we desired for each backbone. Luckily, we caught these errors, and after checking the program once again, we ran it a final time this afternoon.
+**It took about 45 minutes for the program to generate and reverse translate the 54900 sequences.
+**During this time, we created a function that will re-translate the sequences that the generator output. It compares the original helix with the re-translated helix to make sure that our reverse-translate works properly.
+***This step went smoothly, and we verified that the sequences were reverse-translated properly.
+**To make sure that the distributions generated were as expected, we made [[#Generated WebLogos for Final Chip|WebLogos]] of the helices generated (see below).
+*The output file (in the Dropbox: iGem > chip > final chip.csv) originally had the following headers: 'Target', 'Backbone #', 'Helix Sequence', 'Backbone Sequence', 'Nucleotide Sequence of Zinc Finger'
+**We wanted to convert this information into FASTA format.
+***We wrote a function that converted our original file into fasta format (in the Dropbox: iGem > chip > fasta.csv)
+***The file FASTA_total (also linked above) contains the FASTA for all 55000 sequences (including the 100 controls).
+***For those curious, the FASTA format is just a format that looks like this:
+ >Header (For us the header is: Target, Backbone #, Helix Sequence, Backbone Sequence)(The header for the controls is: Index Number, 'control')
+  Sequence (In our case, the nucleotide sequence of the zinc fingers)
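+The re-translation check described above can be sketched as follows, using the standard genetic code. The choice of one fixed codon per amino acid is an assumption made for brevity (our generator picks codons so as to avoid cut sites and primers), and the function names are hypothetical.

```python
from itertools import product

# Standard genetic code in the usual TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# One codon per amino acid (the first one encountered); illustration only.
AA_TO_CODON = {}
for codon, aa in CODON_TO_AA.items():
    AA_TO_CODON.setdefault(aa, codon)

def reverse_translate(protein):
    """Protein string -> DNA string using the fixed codon table above."""
    return "".join(AA_TO_CODON[aa] for aa in protein)

def translate(dna):
    """DNA string -> protein string, reading codons from the first base."""
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))

def round_trip_ok(protein):
    """True if reverse translating and re-translating returns the same protein."""
    return translate(reverse_translate(protein)) == protein

# zif268 F2 backbone + an example helix + linker (23 aa in total).
finger = "FQCRICMRNFS" + "RSDHLTT" + "TGEKP"
dna = reverse_translate(finger)
```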
+====Generated WebLogos for Final Chip====
+{|
+ | [[File:AAA.png|thumb|left|AAA]]
+ | [[File:ACC.png|thumb|left|ACC]]
+ | [[File:CTC.png|thumb|left|CTC]]
+ |-
+ | [[File:CTG.png|thumb|left|CTG]]
+ | [[File:GAC.png|thumb|left|GAC]]
+ | [[File:TGG.png|thumb|left|TGG]]
+|}
+*FASTA-Formatted Chip Data:
+:>NNN(Target Triplet)  BB#  Helix Seq.
+:Nucleotide seq. of ZF
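+A minimal sketch of the conversion to this FASTA layout, assuming the CSV column names listed above. The function name, the two-space separator in the header, and the sample row are assumptions for illustration, not the contents of our real file.

```python
import csv
import io

DEFAULT_FIELDS = ("Target", "Backbone #", "Helix Sequence")

def rows_to_fasta(rows, header_fields=DEFAULT_FIELDS):
    """Turn chip rows into FASTA text: a '>' header line, then the nucleotide sequence."""
    lines = []
    for row in rows:
        lines.append(">" + "  ".join(row[f] for f in header_fields))
        lines.append(row["Nucleotide Sequence of Zinc Finger"])
    return "\n".join(lines) + "\n"

# One made-up row in the layout of the generator's output file.
sample_csv = (
    "Target,Backbone #,Helix Sequence,Backbone Sequence,Nucleotide Sequence of Zinc Finger\n"
    "TGG,1,RSDHLTT,FQCRICMRNFS,TTTCAGTGT\n"
)
rows = list(csv.DictReader(io.StringIO(sample_csv)))
fasta_text = rows_to_fasta(rows)
```

+Passing a different `header_fields` tuple (for example one that also includes 'Backbone Sequence') changes the header without touching the sequence line.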
+===Bioinformatics Candids===
+{|
+ | [[File:Justin speaking.jpg|thumb|left]]
+ | [[File:Justin writing.jpg|thumb|left]]
+ | [[File:Zif268 sequence by memory.jpg|thumb|left|zif268 sequence by memory. You know you've stared at too many zif268 sequences when...]]
+|}
+[[File:Primer Index_iGEM 2011]]
+===Design of Plate Practice Sequences===
+While we wait for the chip to come in, there are a number of techniques and protocols that we can practice beforehand, so that when the chip arrives we will be ready to use the oligos we receive.  We will be practicing the following techniques:
+*Cutting ZF1 out of our oligos
+*Inserting ZF1 into the expression plasmid in between the omega subunit and the linker before F2
+*Verifying that combination of our F1 from the oligo with the plasmid produces a viable, functional ZF
+*Amplifying subpools of oligos for testing
+*Inserting the expression plasmids into the E. coli containing our selection genome
+*Verifying that our ZF-binding site/GFP expression paradigm works
+To this end, we will be ordering a 96-well plate from IDT containing oligos that will simulate the entire tube of oligos that we will receive from Agilent in four weeks.  These oligos will consist of the following:
+*6 positive controls (we know which DNA sequences these bind to)
+**3 of them being the F1 fingers of Zif268, OZ052, and OZ123
+**3 of them being ZF F1s derived from CODA.
+*90 generated sequences, picked from a subset of the chip
+**These are picked evenly across the 9,150 sequences generated on the chip for the TGG triplet F1 target from the colorblindness "bottom finger" target, GTG GGA TGG.  This particular target was chosen because the F2/F1 is a GNNTNN combo, which might be more likely to get hits from our chip generation sequences.
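+One simple way to pick 90 sequences evenly across the 9,150 is to take evenly spaced indices into the list. This is a sketch under that assumption; the function name is hypothetical and the actual selection may have been done differently.

```python
def evenly_spaced_indices(total, picks):
    """Return `picks` indices spread evenly over range(total)."""
    return [(i * total) // picks for i in range(picks)]

# Indices into the 9,150 TGG sequences for the 90 practice oligos.
indices = evenly_spaced_indices(9150, 90)
```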
+The primer tag sequences for the 90 generated sequence subset will be the same as they are on the chip (for the sake of explanation, we will refer to them now as P1F and P1R in this paragraph).  The positive controls will be flanked immediately by the same primers as the generated subset so that we can amplify everything as one pool altogether should we need to (so this will be P1F and P1R).  However, we will also put an additional set of primers outside of the P1F/P1R primers for the positive controls so that we can specifically amplify the positive control subpool, should we want to.  These primers will be the same as the primers for the positive control on the chip (which will be called P2F and P2R here).
+To recap, on the chip we will have the following oligos :
+<pre>
+other oligos for the 5 other target sequences
+Oligos (TGG set, 9150 total):   | P1F | type II binding site | generated F1 | type II binding site | P1R |
+Oligos (+ control, 100 total):  | P2F | type II binding site |  control F1  | type II binding site | P2R |
+</pre>
+In our test pool of 96 sequences, we will have two types of oligos (note the two pairs of primers around the positive controls):
+<pre>
+Oligo (TGG set, 90 total):          | P1F | type II binding site | generated F1 | type II binding site | P1R |
+Oligo (+ control, 6 total):   | P2F | P1F | type II binding site |  control F1  | type II binding site | P1R | P2R |
+</pre>
+Once we get our test sequences back from IDT, they will come in a 96-well plate with one oligo in each well.  We should make a mixture using some of each well in order to create a tube that contains all 96 sequences.  This will simulate the tube that we will receive from Agilent, except instead of 55,000 sequences we will have 96 sequences only in this tube.  From here, we can practice using this as a library.
+We can pretend that this tube is just 96 generated sequences on the chip, treating the positive controls as if they were also generated sequences (we only include them in the 96 to ensure that we will indeed get a "hit" from this practice screening).  Thus, we can just use the P1F/P1R primer set to amplify all of them in order to use them for the subsequent steps.
+These subsequent steps will be those that were outlined above, namely cutting out the F1 sequence from each oligo, ligating this F1 into our expression plasmid, putting the expression plasmid into our selection strain, observing colonies which get infused with ZFs that bind to our target site (the "hits"), and sequencing the colonies that get hits to determine which ZF they are expressing.
+We will be repeating these exact same steps once we get the chip, so if we can perfect our protocols with these practice sequences, we should be golden when the chip comes in.
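The two test-pool oligo layouts above can be sketched in Python. This is only a minimal illustration, not a script from our pipeline: the function name `build_oligo` and the short placeholder sequences are hypothetical, and the real primer tags, type II binding sites, and F1 sequences would come from the chip design.

```python
def build_oligo(f1, p1f, p1r, site_left, site_right, outer=None):
    """Return primer | type II site | F1 | type II site | primer.

    If `outer` is a (P2F, P2R) pair, it is wrapped outside P1F/P1R,
    as planned for the positive controls in the 96-sequence test pool.
    """
    core = p1f + site_left + f1 + site_right + p1r
    if outer is not None:
        p2f, p2r = outer
        core = p2f + core + p2r
    return core


# Placeholder sequences for illustration only (not real primers or sites).
P1F, P1R = "AAAA", "CCCC"
P2F, P2R = "GGGG", "TTTT"
SITE_L, SITE_R = "ga", "tc"

# A generated oligo carries only the P1F/P1R pair.
generated = build_oligo("ACGT", P1F, P1R, SITE_L, SITE_R)

# A positive control carries P2F/P2R outside P1F/P1R.
control = build_oligo("TGCA", P1F, P1R, SITE_L, SITE_R, outer=(P2F, P2R))
```

With this layout, the P1F/P1R pair still flanks every oligo in the pool, while only the controls carry the outer P2F/P2R pair.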
+</div>
+<div id="629" style="display:none">
+==June 29th==
+Our first day with everyone in the wet lab!
+'''PyrF and rpoZ sequencing:'''
+*For some reason, Genewiz said that sequencing failed due to "no priming." We will redo the PCR and send the products in again today.
+'''Lambda red recombination and MAGE:'''
+*The cultures made from the kan plates from our earlier attempt at lambda red did (in some cases) grow, including one colony in kan from 3.75µL arabinose, 100µL plated.
+**PCR of liquid culture: used 1µL of culture diluted 1:20 as well as saturated culture of ∆hisB∆pyrF∆rpoZ+pKD46 (diluted 1:20) as a 1529620 locus wild-type control
+***primers (from Vatsan): 1529481-f, 1529806-r
+**KAPA mastermix and procedure with 56˚C annealing and 90 seconds elongation
+**E Gel of product: both wild-type and the sample hopefully containing the insert had the same short band of around 350bp--the recombination was unsuccessful
+[[File:1529620locus(labeled).png|thumb|none|kan-ZFB insertion into the 1529620 locus 6/29/11]]
+*Used overnight saturated culture to reinoculate; once close to mid-log, the culture was split into two 1.5mL aliquots and 37.5µL arabinose was added to each
+*same procedure as previously described, but with about 300ng kan-ZFB construct and a 2.5µM final concentration of HisBNuke3 (12.5µL of 10µM stock)
+**electroporate with 1.8 kV, about 5
+**recover 3 hrs
+**kan-ZFB insertion colonies plated on kan, MAGE on amp
+===PCR Preparation===
+*Lambda Red- Selection strain glycerol stock: 1/100 dilution, 2 uL stock with 198 uL ddH<sub>2</sub>O
+*Spec (Colony)- Touch 1 colony with pipet tip, add and mix with pipet in 20 uL ddH<sub>2</sub>O, then vortex
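The dilution arithmetic in today's notes can be checked with a short Python sketch. The helper names are hypothetical. The 50 µL total volume used for the HisBNuke3 oligo is an assumption inferred from the stated numbers (12.5 µL of 10 µM stock giving 2.5 µM); it is not recorded above.

```python
def final_concentration(stock_conc, stock_vol, total_vol):
    """C2 = C1 * V1 / V2; concentration and volume units carry through."""
    return stock_conc * stock_vol / total_vol


def dilution_factor(stock_vol, diluent_vol):
    """Fold dilution when stock_vol is added to diluent_vol."""
    return (stock_vol + diluent_vol) / stock_vol


# Glycerol stock: 2 uL stock + 198 uL water is a 1/100 dilution.
glycerol_fold = dilution_factor(2, 198)

# HisBNuke3: 12.5 uL of 10 uM stock in an ASSUMED 50 uL total volume.
oligo_um = final_concentration(10, 12.5, 50)
```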
+===PCRs===
+#pKD46 (Lambda Red)
+*Kapa Mix 2x- 12.5 uL
+*Primer_F- 0.75 uL
+*Primer_R- 0.75 uL
+*Template 1 uL
+*ddH<sub>2</sub>O- 10 uL
+*''(25 uL total)''
+#Spec (Colony)
+*Kapa Mix 2x- 12.5 uL
+*Primer_F- 0.75 uL
+*Primer_R- 0.75 uL
+*Template 1 uL
+*ddH<sub>2</sub>O- 10 uL
+*''(25 uL total)''
+#Spec (Miniprep)
+*Kapa Mix 2x- 12.5 uL
+*Primer_F- 0.75 uL
+*Primer_R- 0.75 uL
+*Template 2 uL
+*ddH<sub>2</sub>O- 9 uL
+*''(25 uL total)''
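As a quick check on the three recipes above, the water volume is whatever is left of the 25 uL reaction after the mastermix, primers, and template. A minimal Python sketch (the function and variable names are our own, for illustration only):

```python
KAPA_2X_UL = 12.5   # 2x mastermix, half of the 25 uL reaction
PRIMER_UL = 0.75    # each of Primer_F and Primer_R
TOTAL_UL = 25.0


def water_volume(template_ul):
    """Return the ddH2O volume that brings the reaction to TOTAL_UL."""
    used = KAPA_2X_UL + 2 * PRIMER_UL + template_ul
    water = TOTAL_UL - used
    if water < 0:
        raise ValueError("components exceed the total reaction volume")
    return water


# Template volumes taken from the three recipes above.
templates = {"pKD46": 1.0, "Spec (Colony)": 1.0, "Spec (Miniprep)": 2.0}
recipes = {name: water_volume(t) for name, t in templates.items()}
```

The results match the listed amounts: 10 uL of water with 1 uL of template and 9 uL with 2 uL.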
+===Expression Plasmid Design in silico===
+Today, we designed our expression plasmids in SeqBuilder.  This included plasmids for our 6 target sequences, and 3 positive controls (9 in total).  These positive controls were the following:
+{|
+| align="center" style="background:#f0f0f0;"|'''Index'''
+| align="center" style="background:#f0f0f0;"|'''Nucleotide Sequence (5'-3')'''
+| align="center" style="background:#f0f0f0;"|'''Helices (F3 to F1)'''
+| align="center" style="background:#f0f0f0;"|'''Notes'''
+|-
+| 16||GAA GGG AAC||QDGNLGR RREHLVR HRTNLIA||Very similar to one of our target sequences (CB top), which is GAA GGG ACC
+|-
+| 55||GGA GTG GTG||QTTHLSR DHSSLKR RNFILQR||Very similar to a target sequence (FH top), which is GGA GTG CTG
+|-
+| 77||TGT GAA TAG||RRRNLQI QQTNLTR QPHGLTA||Out of ze air
+|}
+Each of our expression plasmids contained:
+*Omega subunit
+*Omega/F1 linker (taken from the paper that Dan and Noah emailed us) [http://nar.oxfordjournals.org/content/36/8/2547.short]
+*type II binding sites
+*gap between type II binding sites that contains XbaI restriction enzyme site (which is not present anywhere else in the entire expression plasmid)
+*F1/F2 TGEKP linker
+*F2 for a specific target sequence
+*F2/F3 TGEKP linker
+*F3 for a specific target sequence
+*TAA stop codon immediately after F3
+All of this is on our spec-resistance-containing plasmid.  The above construct replaced the GFP which was present previously on this plasmid.
+Tomorrow we will begin design of our primers from these SeqBuilder files.
+</div>
+<div id="630" style="display:none">
+==June 30th==
+'''Lambda Red, Backbone, and Sequencing PCR'''
+*The gel of the PCR for a lambda red gene on the pKD46 plasmid showed that it is indeed present, so our recombination failures have not been due to an incorrect plasmid.
+*The gel of the pZE21G backbone PCR was a success and took us one step closer to obtaining all the parts necessary for the three-part assembly
+*The gel of the pyrF and rpoZ PCRs was a success
+**Therefore we sent the PCR products and primers to GENEWIZ for sequencing again
+[[File: 2011.06.30.lambda_spec_pyrFrpoZ(labeled).png|thumb|none|pKD46, pZE21G, and PyrF and rpoZ loci 6/30/11]]
+'''pZE21G backbone:'''
+*Since last night's PCR was successful, we will redo it with a few protocol adaptations to get a cleaner product and to increase our yield when we purify
+*KAPA mastermix and protocol: primers HindIII-F and KpnI-R
+**template: pZE21G miniprepped plasmid, 1µL
+**2 min elongation time, 30 cycles
+**2 samples at 55˚C annealing, 2 samples at 60˚C
+'''Lambda red and MAGE:'''
+*Yesterday's prep produced tiny colonies on the MAGE plates and (so far) none on the kan-ZFB plates. Just in case it didn't work, we will redo the lambda red using even more DNA and perform a second round of MAGE using culture from yesterday that was not plated.
+*same procedure, but with the following changes:
+**5µL kan-ZFB (about 1 µg)
+**recover 3 hrs
+**kan-ZFB: plate 100µL and 2 mL on kan plates
+**MAGE: plate 1µL and 10µL on amp plates
+*To see if the colonies on the MAGE plate knocked out HisB, we chose 24 colonies, resuspended them in water, and put half the cells in LB (complete media) and half in NM media (does not have histidine). 96 well plate, 150µL media, grown overnight at 30˚C.
+===ZF Expression Plasmid Ultramer and Primer Design===
+Today, we designed primers ZF_073 through ZF_085 as listed in the iGEM Primer Index spreadsheet.  These were basically two sets of primers: the primers to clone out the omega subunit and linker, and the ultramers that would construct the last part of the linker along with the type II binding sites and F2/F3 fingers.  One should refer to the primer list for the sequences.
+Note: the annealing sequence for the ultramer overlap contained a 72 degree melting temp hairpin.  To get around this, I changed one of the codons in the F2 backbone.  The F2 backbone begins with "FQCRIC", and so I changed the codon for the arginine (R) from CGC to CGT, which resolved the hairpin problem.
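The CGC to CGT swap is silent because both codons encode arginine in the standard genetic code. A minimal Python sketch of that check follows. The codon table is the standard one, but the 18-nt string is only one possible encoding of "FQCRIC" chosen for illustration, not necessarily our F2 backbone sequence.

```python
from itertools import product

# Standard genetic code, codons ordered with T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate an in-frame DNA string, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def is_silent(old_codon, new_codon):
    """True if swapping the codon leaves the amino acid unchanged."""
    return CODON_TABLE[old_codon] == CODON_TABLE[new_codon]


# Illustrative encodings of FQCRIC that differ only in the arginine codon.
before = "TTCCAATGTCGCATCTGT"
after = "TTCCAATGTCGTATCTGT"
```

Both `translate(before)` and `translate(after)` give `FQCRIC`, so the protein is unchanged while the DNA-level hairpin can be disrupted.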
+==June 30th - Bioinformatics==
+===Updated Primer list and FASTA formatting===
+We ran into a small hiccup when we were informed that we had forgotten to reverse complement the reverse primer sequences that were being appended to the generated sequences. The primer sequences we were given were the sequences of the actual primers, rather than the sequences to which the primers would bind. Luckily, we caught this error! We did have to re-run the generator because we had to make sure that our generated sequences did not contain the new primers.
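The fix amounts to taking the reverse complement of each reverse primer before appending it, and then checking that no generated sequence contains a tag. A minimal Python sketch, not the generator's actual code: the helper names are hypothetical, and the example assumes that the control reverse tag shown in our primer list is the form that appears on the oligo.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (5'-3')."""
    return seq.translate(_COMPLEMENT)[::-1]


def contains_tag(seq, tag):
    """True if the tag or its reverse complement occurs anywhere in seq."""
    return tag in seq or reverse_complement(tag) in seq


# Control reverse tag as written on the oligo (assumed orientation).
tag_on_oligo = "CGCTGAGGAGACTATACCAG"
other_strand = reverse_complement(tag_on_oligo)
```

Applying `reverse_complement` twice returns the original sequence, which makes it easy to confirm which orientation a given list is in.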
+*Here is the updated primer list:
+This is the set of final target sequences with assigned forward and reverse primers (tags for PCR):
+{| class="wikitable" cellpadding="5"
+| align="center" style="background:#f0f0f0;"|'''Disease'''
+| align="center" style="background:#f0f0f0;"|'''Target Sequence'''
+| align="center" style="background:#f0f0f0;"|'''Forward Primer (5'-3' NOT REVERSE COMPLEMENT)'''
+| align="center" style="background:#f0f0f0;"|'''Reverse Primer (5'-3' REVERSE COMPLEMENT)'''
+|-
+| Colorblindness||GCT GGC TGG||ATATAGATGCCGTCCTAGCG||TGGGCACAGGAAAGATACTT
+|-
+| Colorblindness||GCG GTA ACC||CCCTTTAATCAGATGCGTCG||GGTCGCCCTTATTACTACCA
+|-
+| Familial Hypercholesterolemia||GGC TGA GAC||TTGGTCATGTGCTTTTCGTT||TCTGAGTATCCGATACCCCT
+|-
+| Familial Hypercholesterolemia||GGA GTC CTG||GGGTGGGTAAATGGTAATGC||GCTATATCCGGGGAATCGAT
+|-
+| Myc-gene Cancer||GGC TGA CTC||TCCGACGGGGAGTATATACT||TTGGCCTGAAGCAGTTAGTA
+|-
+| Myc-gene Cancer||GGC TGG AAA||CATGTTTAGGAACGCTACCG||GGGAGGGAACGGAGATTATT
+|-
+| Controls || n/a ||GTACATGAAACGATGGACGG||CGCTGAGGAGACTATACCAG
+|}
+There was also a small error in the FASTA formatting. There are not supposed to be any spaces in the header, so the spaces were replaced with underscores.
+*Example:
+ >1_control
+ GTACATGAAACGATGGACGGGGTCTCAGCCATTCCAATGTCGTATCTGTATGCGTAATTTTTCACGCAAACACCATTTGGGTCGTCATATCCGTACGCACACGGTGAGACCCGCTGAGGAGACTATACCAG
+</div>
+<html>
+<div id="vsebina_mid_right"></div>
+<div id="vsebina_foot"></div>
+</div>
+</div>
+<body>
+</body>
+</html>
 </html>

'	A	C	D	E	F	G	H	I	K	L	M	N	P	Q	R	S	T	V	W	Y
A	10	0	99	55	0	29	122	20	32	332	2	59	55	63	255	87	24	43	0	0
C	0	0	15	0	0	3	0	0	0	5	0	0	6	0	31	6	14	0	0	0
D	99	15	94	92	0	39	62	6	84	342	15	120	55	42	277	290	87	21	0	8
E	55	0	92	42	0	34	77	1	38	141	2	39	4	29	134	28	90	26	0	1
F	0	0	0	0	0	0	0	10	0	0	0	22	4	0	2	4	6	0	0	0
G	29	3	39	34	0	38	56	0	14	126	1	95	28	47	119	125	38	7	0	0
H	122	0	62	77	0	56	118	9	103	498	4	88	24	26	87	159	70	2	0	0
I	20	0	6	1	10	0	9	6	8	95	3	5	17	3	62	16	17	4	0	0
K	32	0	84	38	0	14	103	8	84	386	24	44	19	102	269	163	113	22	1	0
L	332	5	342	141	0	126	498	95	386	174	32	686	16	112	362	276	875	360	0	8
M	2	0	15	2	0	1	4	3	24	32	0	7	2	11	39	14	3	1	0	0
N	59	0	120	39	22	95	88	5	44	686	7	8	36	28	120	254	84	34	1	0
P	55	6	55	4	4	28	24	17	19	16	2	36	0	3	29	150	21	13	11	0
Q	63	0	42	29	0	47	26	3	102	112	11	28	3	100	261	314	125	19	0	0
R	255	31	277	134	2	119	87	62	269	362	39	120	29	261	618	343	504	281	0	0
S	87	6	290	28	4	125	159	16	163	276	14	254	150	314	343	592	173	91	0	0
T	24	14	87	90	6	38	70	17	113	875	3	84	21	125	504	173	154	28	0	0
V	43	0	21	26	0	7	2	4	22	360	1	34	13	19	281	91	28	12	0	0
W	0	0	0	0	0	0	0	0	1	0	0	1	11	0	0	0	0	0	0	0
Y	0	0	8	1	0	0	0	0	0	8	0	0	0	0	0	0	0	0	0	0

PDB ID	Binding Sequence	Link
1F2I	ATGGGCGCGCCCAT	[http://www.pdb.org/pdb/explore/explore.do?structureId=1F2I]
1G2D	GACGCTATAAAAGGAG	[http://www.pdb.org/pdb/explore/explore.do?structureId=1G2D]
1G2F	TCCTTTTATAGCGTCC	[http://www.pdb.org/pdb/explore/explore.do?structureId=1G2F]
1MEY	ATGAGGCAGAACT	[http://www.pdb.org/pdb/explore/explore.do?structureId=1MEY]
1TF6	ACGGGCCTGGTTAGTACCTGGATGGGAGACC	[http://www.pdb.org/pdb/explore/explore.do?structureId=1TF6]
1UBD	AGGGTCTCCATTTTGAAGCG	[http://www.pdb.org/pdb/explore/explore.do?structureId=1UBD]
1YUI	GCCGAGAGTAC	[http://www.pdb.org/pdb/explore/explore.do?structureId=1YUI]
2DRP	CTAATAAGGATAACGTCCG	[http://www.pdb.org/pdb/explore/explore.do?structureId=2DRP]
2GLI	TTTCGTCTTGGGTGGTCCACG	[http://www.pdb.org/pdb/explore/explore.do?structureId=2GLI]
2I13	CAGATGTAGGGAAAAGCCCGGG	[http://www.pdb.org/pdb/explore/explore.do?structureId=2I13]
2KMK	CATAAATCACTGCCTA	[http://www.pdb.org/pdb/explore/explore.do?structureId=2KMK]
2PRT	CGCGGGGGCGTCTG	[http://www.pdb.org/pdb/explore/explore.do?structureId=2PRT]
2WBS	GAGGCGC	[http://www.pdb.org/pdb/explore/explore.do?structureId=2WBS]
2WBU	GAGGCGTGGC	[http://www.pdb.org/pdb/explore/explore.do?structureId=2WBU]

File:Gnn freqs.png Probability data for the 783 fingers that bind to GNN triplets. Note the high probability of leucine at position 4 and arginine at position 6.	File:Tnn probs.png Probability data for the 128 fingers that bind to TNN triplets. Note the high probability of leucine at position 4.	File:Cnn probs.png Probability data for the 16 fingers that bind to CNN triplets. There may not be enough data to consider this information statistically significant	File:Ann probs.png Probability data for the 29 fingers that bind to ANN triplets. There may not be enough data to consider this information statistically significant
File:Ngn probs.png Probability data for the 298 fingers that bind to NGN triplets. The position 4 leucine motif remains. There is also a high probability (> 0.5) of a histidine at position 3 and an arginine at position 6.	File:Ntn probs.png Probability data for the 177 fingers that bind to NTN triplets. The position 4 leucine motif remains.	File:Ncn probs.png Probability data for the 244 fingers that bind to NCN triplets. The position 4 leucine motif remains. There is also a very high probability of an arginine at position 6.	File:Nan probs.png Probability data for the 248 fingers that bind to NAN triplets. The position 4 leucine motif remains. There is also a very high probability (> 0.75) of an asparagine at position 3 and an arginine at position 6.
File:Nng probs.png Probability data for the 234 fingers that bind to NNG triplets. The position 4 leucine motif remains. There is also a very high probability (> 0.75) of an asparagine at position 1 and a high probability (> 0.5) of an aspartic acid at position 2 and an arginine at position 6.	File:Nnt probs.png Probability data for the 247 fingers that bind to NNT triplets. The position 4 leucine motif remains. There is also a high (> 0.5) probability of an arginine at position 6.	File:Nnc probs.png Probability data for the 262 fingers that bind to NNC triplets. The position 4 leucine motif remains. There is also a very high (> 0.75) probability of an arginine at position 6.	File:Nna probs.png Probability data for the 218 fingers that bind to NNA triplets. The position 4 leucine motif remains. There is also a very high (> 0.75) probability of a glutamine at position -1 and an arginine at position 6.

File:GAA generated round 1.png Round 1 of generating sequences for GAA with the program.	File:GAA generated round 2.png Round 2 of generating sequences for GAA with the program.
File:GAA open and persikov.png GAA sequences from Persikov and OPEN datasets.	File:GAA open only.png GAA sequences from the OPEN dataset.

Disease	Target DNA Finger 1	Helices in Zif268 Backbone	Helices in Zif268 Closely-Related Backbones	Helices in Zif268 Distantly-Related Backbones
Colorblindness (Bottom)	TGG	5150	3000	1000
Colorblindness (Top)	ATG	3050	3050	3050
Familial Hypercholesterolemia (Bottom)	GAC	5150	3000	1000
Familial Hypercholesterolemia (Top)	CTG	3050	3050	3050
Myc (Top₁₉₈)	CTC	3050	3050	3050
Myc (Top₉₈₁)	AAA	3050	3050	3050

File:CTC 0.png psu = 0	File:CTC .005 psuedo.png psu = .005	File:CTC .008 psuedo.png psu = .008
File:CTC .01.png psu = .01	File:CTC .015 psuedo.png psu = .015.	File:CTC .02 psuedo.png psu = .020.
