Team:Harvard/Template:NotebookData
From 2011.igem.org
Harvard iGEM 2011
June 6th
First day of iGEM!June 7th
Miniprep of pKD42 (lambda red)
The lambda red plasmid is needed to enable the recombination used to insert the selection/expression systems into our E. coli cultures.
Procedure: followed Qiagen Kit instructions, each student (8) using 1 mL cell suspension
Results: DNA reasonably pure (260/280 between 1.8 and 2) and between 25 and 50 ng/µLJune 8th
PCR to connect ultramers into OZ052 (Zif268 F2 triplicate, GCCGATGTC)and OZ123 (Zif268 F2 triplicate, GAGTGGTTA):
OZ052:
- 3µL OZ052_F (10µM stock)
- 3µL OZ052_R (10µM stock)
- 5µL 10x Pfx amplification buffer
- 1.5µL dNTPs
- 1µL MgSO4
- 0.4µL DNA polymerase
- 36.1µL ddH2O
OZ123:
- 3µL OZ123_F (10µM stock)
- 3µL OZ123_R (10µM stock)
- 5µL 10x Pfx amplification buffer
- 1.5µL dNTPs
- 1µL MgSO4
- 0.4µL DNA polymerase
- 36.1µL ddH2O
Parameters:
- 1) 94⁰C for 5 min
- 2) 94⁰C for 15 sec
- 3) 60⁰C for 30 sec
- 4) 68⁰C for 1 min
- 5) Repeat 2-4 for 25 cycles
- 6) 68⁰C for 5 min
- 7) 4⁰C forever
June 9th - Wet Lab
- Created cell culture with selection construct (contains ZFB, His3, pyrF on plasmid) and reporter RFP (this will be used to test positive control ZFs, cells fluoresce green when ZF binds)
- Picked colonies, grew in LB/amp liquid media until mid-log
- 3 mL of LB, 1.5 µL of 2000x amp
- Once mid-log reached, created glycerol stock, stored stock at -80⁰C.
300 µL bacteria, 1200 µL 80% glycerolThis should have been 1200 µL bacteria media, 300 µL 80% glycerol (Corrected 6/14/2011) (80% pure glycerol, 20% molecular grade water)
- Spiked new tubes of media with 25 µL bacteria from the mid-log tube to leave overnight
- Picked colonies, grew in LB/amp liquid media until mid-log
NOTE: reporter RFP did not grow to mid-log by end of day, will let grow overnight to saturation and continue creating glycerol stock tomorrow.
- Plated selection strain from gel stab onto tet plate.
- Began primer design for creating the kan/selection construct fusion.
June 9th - Bioinformatics
Today we focused on reacquainting and familiarizing ourselves with Python. We completed the parsing (reading in) of the sequence and amino acid data so that it is easy to work with: by substituting each amino acid abbreviation (ex. A, N) with its numeric equivalent (ex. 1, 14), we can use a lot of nice math comparisons instead of messy letter/"string" comparisons.
After that, we worked on counting the number of times each amino acid appears in each of the 7 positions (unfortunately given by -1,1,2,3,5,6,7), and counting the number of times amino acids are next to each other (ex. ACTQRNF has AC, CT, TQ, etc pairings). Taken overall, we found that L is overwhelmingly in position 5.
Acid | -1 | 1 | 2 | 3 | 5 | 6 | 7 |
A | 77 | 140 | 210 | 197 | 0 | 312 | 85 |
C | 12 | 24 | 1 | 6 | 14 | 0 | 0 |
D | 413 | 16 | 694 | 258 | 0 | 142 | 14 |
E | 125 | 74 | 152 | 107 | 0 | 58 | 132 |
F | 0 | 0 | 22 | 0 | 10 | 0 | 0 |
G | 12 | 201 | 328 | 125 | 0 | 177 | 62 |
H | 93 | 144 | 232 | 652 | 0 | 51 | 17 |
I | 70 | 21 | 3 | 26 | 0 | 94 | 73 |
K | 108 | 372 | 46 | 169 | 6 | 321 | 52 |
L | 176 | 37 | 20 | 22 | 3325 | 75 | 55 |
M | 36 | 54 | 5 | 28 | 0 | 31 | 10 |
N | 23 | 150 | 129 | 940 | 0 | 182 | 61 |
P | 3 | 298 | 77 | 7 | 0 | 36 | 8 |
Q | 813 | 158 | 180 | 13 | 0 | 136 | 30 |
R | 870 | 539 | 137 | 55 | 3 | 428 | 2517 |
S | 99 | 970 | 859 | 278 | 0 | 140 | 12 |
T | 243 | 134 | 223 | 350 | 0 | 834 | 83 |
V | 166 | 26 | 27 | 115 | 0 | 341 | 146 |
W | 19 | 0 | 13 | 0 | 0 | 0 | 0 |
Y | 0 | 0 | 0 | 10 | 0 | 0 | 1 |
For pairings, we found patterns, but none as obvious as the L-in-position-5. Read this like a multiplication table: the intersection of L row and M column is how often that pairing was observed.
' | A | C | D | E | F | G | H | I | K | L | M | N | P | Q | R | S | T | V | W | Y |
A | 10 | 0 | 99 | 55 | 0 | 29 | 122 | 20 | 32 | 332 | 2 | 59 | 55 | 63 | 255 | 87 | 24 | 43 | 0 | 0 |
C | 0 | 0 | 15 | 0 | 0 | 3 | 0 | 0 | 0 | 5 | 0 | 0 | 6 | 0 | 31 | 6 | 14 | 0 | 0 | 0 |
D | 99 | 15 | 94 | 92 | 0 | 39 | 62 | 6 | 84 | 342 | 15 | 120 | 55 | 42 | 277 | 290 | 87 | 21 | 0 | 8 |
E | 55 | 0 | 92 | 42 | 0 | 34 | 77 | 1 | 38 | 141 | 2 | 39 | 4 | 29 | 134 | 28 | 90 | 26 | 0 | 1 |
F | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 10 | 0 | 0 | 0 | 22 | 4 | 0 | 2 | 4 | 6 | 0 | 0 | 0 |
G | 29 | 3 | 39 | 34 | 0 | 38 | 56 | 0 | 14 | 126 | 1 | 95 | 28 | 47 | 119 | 125 | 38 | 7 | 0 | 0 |
H | 122 | 0 | 62 | 77 | 0 | 56 | 118 | 9 | 103 | 498 | 4 | 88 | 24 | 26 | 87 | 159 | 70 | 2 | 0 | 0 |
I | 20 | 0 | 6 | 1 | 10 | 0 | 9 | 6 | 8 | 95 | 3 | 5 | 17 | 3 | 62 | 16 | 17 | 4 | 0 | 0 |
K | 32 | 0 | 84 | 38 | 0 | 14 | 103 | 8 | 84 | 386 | 24 | 44 | 19 | 102 | 269 | 163 | 113 | 22 | 1 | 0 |
L | 332 | 5 | 342 | 141 | 0 | 126 | 498 | 95 | 386 | 174 | 32 | 686 | 16 | 112 | 362 | 276 | 875 | 360 | 0 | 8 |
M | 2 | 0 | 15 | 2 | 0 | 1 | 4 | 3 | 24 | 32 | 0 | 7 | 2 | 11 | 39 | 14 | 3 | 1 | 0 | 0 |
N | 59 | 0 | 120 | 39 | 22 | 95 | 88 | 5 | 44 | 686 | 7 | 8 | 36 | 28 | 120 | 254 | 84 | 34 | 1 | 0 |
P | 55 | 6 | 55 | 4 | 4 | 28 | 24 | 17 | 19 | 16 | 2 | 36 | 0 | 3 | 29 | 150 | 21 | 13 | 11 | 0 |
Q | 63 | 0 | 42 | 29 | 0 | 47 | 26 | 3 | 102 | 112 | 11 | 28 | 3 | 100 | 261 | 314 | 125 | 19 | 0 | 0 |
R | 255 | 31 | 277 | 134 | 2 | 119 | 87 | 62 | 269 | 362 | 39 | 120 | 29 | 261 | 618 | 343 | 504 | 281 | 0 | 0 |
S | 87 | 6 | 290 | 28 | 4 | 125 | 159 | 16 | 163 | 276 | 14 | 254 | 150 | 314 | 343 | 592 | 173 | 91 | 0 | 0 |
T | 24 | 14 | 87 | 90 | 6 | 38 | 70 | 17 | 113 | 875 | 3 | 84 | 21 | 125 | 504 | 173 | 154 | 28 | 0 | 0 |
V | 43 | 0 | 21 | 26 | 0 | 7 | 2 | 4 | 22 | 360 | 1 | 34 | 13 | 19 | 281 | 91 | 28 | 12 | 0 | 0 |
W | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 11 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Y | 0 | 0 | 8 | 1 | 0 | 0 | 0 | 0 | 0 | 8 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
June 10th - Wet Lab
- What we learned today: don't put E. coli plates in the -20C freezer!
- Observed a well populated selection strain plate and placed it in the 4C refrigerator
- Took the selection construct culture and extracted the plasmid through miniprep
- Observed 260/280 ratio of 1.90 and 1.88 through Nanodrop
- Observed concentrations of 87.7 and 100.6 ng/µL through Nanodrop
- Made 10 new agar plates with LB and amp
June 10th - Bioinformatics
Visualizations
We spent the first few hours today making cool visualizations and graphs of the data we found on the 9th: heatmaps turned out to be an annoying limitation of Python, so a Python/R hybrid was used, and bar charts were made exclusively in Python. See the dropbox for our pretty (and hopefully informative compared to spreadsheets) charts/graphs.
We then started work on TNN and GNN properties specifically (essentially repeating the June 9th work, but confined to smaller data sets). There are some differences between TNN and GNN: see graphs in dropbox. We decided that there was not enough data for fingers that bind to ANN and CNN triplets to perform significant analysis on it.
- Overall, similar color clusters are found in the heatmaps. In all cases, L and N are often placed consecutively on the helix. There are fewer clusters of high frequency when looking at TNN binders.
We then, using the theorized framework from a paper (2011 Persikov [http://iopscience.iop.org.ezp-prod1.hul.harvard.edu/1478-3975/8/3/035010/]), tried to match amino acid binding to each base pair to see if there was a pattern. See dropbox document .../bioinformatics/Binding Frequency for that data. There's a lot of it.
Properties of amino acids
We then worked on finding properties of the each position (hydrophobic/phillic, non/polar):
Hydrophilic vs Hydrophobic
Position | Very Phobic | Hydrophobic | Neutral | Hydrophillic |
6 | 285 | 85 | 204 | 2782 |
5 | 542 | 312 | 1334 | 1169 |
4 | 3334 | 14 | 0 | 9 |
3 | 191 | 203 | 1417 | 1536 |
2 | 91 | 211 | 1819 | 1236 |
1 | 138 | 164 | 1604 | 1451 |
-1 | 468 | 90 | 1257 | 1542 |
Polar vs Nonpolar
Position | Polar | Nonpolar |
6 | 2917 | 440 |
5 | 2290 | 1067 |
4 | 9 | 3348 |
3 | 2830 | 527 |
2 | 2652 | 705 |
1 | 2555 | 802 |
-1 | 2784 | 573 |
June 13th - Wet Lab
The control zinc fingers OZ052 and OZ123 were amplified with overhanging primers to allow its insertion into the Wolfe plasmid:
Overhang PCR for ultramers: the template was the product of the ultramer PCR (see 6/8/11), and several concentrations were used
In all the tubes:
- 5 µL Pfx amplification buffer
- 1.5 µL dNTPs
- 1 µL MgSO4
- 0.4 µL polymerase
- 38.1 µL ddH2O
- 1.5 µL OZ052_up and 1.5 µL OZ052_down OR 1.5 µL OZ123_up and 1.5 µL OZ123_down
In OZ052 (1) and OZ123 (1):
- 1 µL of ultramer PCR product
In OZ052 (1:10) and OZ123 (1:10):
- 1 µL of a 1 in 10 dilution of ultramer PCR product
In OZ052 (1:100) and OZ123 (1:100):
- 1 µL of a 1 in 100 dilution of ultramer PCR product
Parameters:
- 94⁰C for 5 min
- 94⁰C for 15 sec
- 55⁰C for 30 sec
- 68⁰C for 30 sec
- Repeat steps 2-4 for 25 cycles
- 68⁰C for 5 min
- 4⁰C forever
Gel to verify proper amplification (1% agarose gel, 10 µL 1 kb ladder, 120 V):
The OZ052 lanes (1-3) had bands at the proper length (328 bp) at all three concentrations, although there were several fainter bands likely from side products. Only the undiluted OZ123 lane showed any bands, and from the faint band at 328 and the stronger band around 250 it appears that the PCR did not work well, and the majority of the product was the ultramer from the first PCR.
PCR around vector: the template used was the Wolfe selection construct plasmid miniprepped 6/10/11 (100.6 ng/µL stock)
Reagents the same as above except:
- 1.5 µL of Wolfe_F and 1.5 µL of Wolfe_R primers to each tube
- plasmid tube (1 ng) given 1 ng of template (1 µL of a 1 in 100 dilution)
- plasmid tube (10 ng) given 10 ng of template (1 µL of a 1 in 10 dilution)
Parameters same as above except:
- elongation (step 4) 5 minutes (vector approximately 5 kb)
Gel to verify proper amplification (1% agarose, 10 µL 1 kb ladder, 170V)
There were no bands of the correct size in the lanes. The only band that appeared was a faint, short band in one lane that likely was a primer. Since the DNA ladder worked, the problem likely was not with the electrophoresis but with the PCR reaction, perhaps due to issues with the primers.
Gel images
June 13th - Bioinformatics
Today we started work on a program to statistically generate possible sequences.
The four functions needed to do this are:
- generate(matrix, pseudocounts (lambda), dependency tuples)
- takes a matrix of zinc-finger AA position counts, a list of dependent amino acid pairs, and a pseudocount multiplier and generates a list of potential amino acid sequences weighted by independent and dependent probabilities
- add_pseudo(dependent matrix row,independent matrix row)
- given a matrix row of dependent counts (i.e. how many times 'a' occurs at position n when 'b' is set to some AA at position m) and a row of independent matrix counts (how many times 'a' occurs at n regardless of b's AA) return an adjusted matrix row, based on the dependent matrix row, that has pseudocounts added to the values that are empty in the dependent matrix row but filled in the independent matrix row.
- generate_indep(matrix)
- randomly pick an amino acid, given a matrix row, from a weighted random distribution based on the values in the row
- generate_dep(indep_row, dep_row, lambda)
- add pseudo counts (call add_pseudo) and generate a dependent random call for a position (using generate_indep on the adjusted matrix)