
1 Optimal Combinatorial Biology & Genome Engineering. BU BME retreat, 23-Jun-2004, 9:45-10:30, Seacrest, N. Falmouth, MA. Thanks to: Broad Inst., DARPA-BioComp, DOE-GTL, EU-MolTools, NHGRI-CEGS, NHLBI-PGA, NIGMS-CECBSR, PhRMA, Lipper Foundation; Agencourt, Ambergen, Atactic, BeyondGenomics, Caliper, Genomatica, Genovoxx, Helicos, MJR, NEN, Nimblegen, ThermoFinnigan, Xeotron/Invitrogen. For more info see: arep.med.harvard.edu

2 Exponential technologies. Shendure J, Mitra R, Varma C, Church GM (May 2004) Advanced Sequencing Technologies: Methods & Goals. Nature Reviews Genetics 5, 335-344. ABI

3 Programming cells with DNA vs. digital computers simulating cells; cells simulating digital computers; drugs & devices simulating human systems [background graphic: strings of binary digits]

4 Engineering complex systems (comparative genomics). Stedman et al. (2004) [Masticatory] Myosin gene mutation correlates with anatomical changes in the human lineage. Nature 428, 415-418

5 DNA RNA Proteins Metabolites Replication rate Environment Biosystems Engineering Integrating Measures & Models Microbes Cancer & stem cells Darwinian optima In vitro replication Small multicellular organisms RNAi Insertions SNPs interactions

6 Now that we have 200 genomes, why sequence?
  Once per organism:
    Phylogenetic footprinting, biodiversity
    RNA splicing & chromatin modification patterns
    Cell-lineage during development
    NA "aptamers" & Ab for any protein
  Once per person:
    Preventative medicine & genotype–phenotype associations
  Frequently:
    Cancer: mutation sets for individual clones, loss-of-heterozygosity
    B & T-cell receptor diversity: temporal profiling, clinical
    New & old pathogen "weather map", biowarfare sensors
    DNA computing & lab selections
  Shendure et al. 2004 Nature Rev Gen 5, 335.

7 Why 'single molecule' sequencing?
  (1) Single-cell analyses, e.g. preimplantation genetic diagnosis (PGD)
  (2) Co-occurrence on a molecule, complex, or cell, e.g. RNA splice-forms
  (3) Cost: $1K-100K "personal genomes" http://grants.nih.gov/grants/guide/rfa-files/RFA-HG-04-003.html
  (4) Precision: counting 10^9 RNA tags to reduce variance (~5e5 RNAs per human cell)
  Fixed costs: EST 5e3; SAGE 5e4; MPSS 5e6; Polony-FISSeq (polymerase colony) 5e9 (goal)
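The precision argument in point (4) can be illustrated with simple counting statistics. This is a minimal sketch, assuming tag counts are Poisson-distributed (so the relative error is 1/sqrt of the expected count) and pairing the methods with the tag counts in the order they are listed on the slide; the function name and the one-copy-per-cell default are illustrative, not from the talk.

```python
import math

RNAS_PER_CELL = 5e5  # approximate RNAs per human cell (from the slide)

def relative_error(total_tags, copies_per_cell=1.0):
    """Poisson coefficient of variation for one transcript's tag count.

    Assumes tags are sampled in proportion to abundance, so the expected
    count is total_tags * copies_per_cell / RNAS_PER_CELL.
    """
    expected = total_tags * copies_per_cell / RNAS_PER_CELL
    return 1.0 / math.sqrt(expected)

# Tag counts as listed on the slide (pairing by order is an assumption)
methods = [("EST", 5e3), ("SAGE", 5e4), ("MPSS", 5e6), ("Polony-FISSeq (goal)", 5e9)]
for name, tags in methods:
    print(f"{name:>22}: relative error = {relative_error(tags):.3g}")
```

Under these assumptions a transcript present at one copy per cell is essentially uncountable at 5e3 tags and reaches about 1% relative error at 5e9 tags.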

8 Polony Fluorescent In Situ Sequencing Libraries Greg Porreca Abraham Rosenbaum 1 to 100kb Genomic L R M L R PCR bead Sequencing primers Selector bead 2x20bp after MmeI Dressman et al PNAS 2003 emulsion

9 Cleavable dNTP-Fluorophore (& terminators): reduce or photo-cleave. Mitra RD, Shendure J, Olejnik J, Olejnik EK, and Church GM (2003) Fluorescent in situ Sequencing on Polymerase Colonies. Analyt. Biochem. 320:55-65

10 Polony-FISSeq: up to 2 billion beads/slide. White = Fe-core pixels; Cy5 primer (570nm); Cy3 dNTP (666nm). Jay Shendure

11 Polony FISSeq Stats (Shendure)
   # of bases sequenced (total):     23,703,953
   # of bases sequenced (unique):    73
   Avg fold coverage:                324,711 X
   Pixels used per bead (analysis):  ~3.6
   Read length per primer:           14-15 bp
   Insertions:                       0.5%
   Deletions:                        0.7%
   Substitutions (raw):              4e-5
   Throughput:                       360,000 bp/min
   Current capillary sequencing:     1400 bp/min (600X speed/cost ratio, ~$5K/1X)
   (This may omit: PCR, homopolymer, context errors)
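The coverage and throughput figures on this slide can be cross-checked with a few lines of arithmetic. A minimal sketch using only the numbers quoted above; the variable names are illustrative. Note that the raw speed ratio computed here is not the same quantity as the slide's 600X, which is a combined speed/cost ratio.

```python
total_bases = 23_703_953      # total bases sequenced (slide)
unique_bases = 73             # unique bases in the template (slide)
polony_bp_per_min = 360_000   # Polony FISSeq throughput (slide)
capillary_bp_per_min = 1_400  # capillary sequencing throughput (slide)

# Fold coverage is total bases divided by unique template bases.
fold_coverage = total_bases / unique_bases

# Raw speed ratio only; cost is not included here.
speed_ratio = polony_bp_per_min / capillary_bp_per_min

print(f"fold coverage: about {int(fold_coverage):,} X")
print(f"raw speed ratio: about {speed_ratio:.0f} X")
```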

12 CD44 Exon Combinatorics (Zhu & Shendure). Alternatively spliced cell adhesion molecule. Specific variable exons are up- or down-regulated in various cancers (>2000 papers). v6 & v7 enable direct binding to chondroitin sulfate, heparin… Zhu J, et al. Science 301:836-8.

13 RNA exon examples (V1 V2 V3 V4 V5 V6 V7 V8 V9 V10), auto-regridded & quantitated. Zhu J, Shendure J, Mitra RD, Church GM (2003) Single Molecule Profiling of Alternative Pre-mRNA Splicing. Science 301:836-8.

14 Zhu J, Shendure J, Mitra RD, Church GM. Science 301:836-8. Single molecule profiling of alternative pre-mRNA splicing. Eph4 = murine mammary epithelial cell line Eph4bDD = stable transfection of Eph4 with MEK-1 (tumorigenic) CD44 RNA isoforms

15 DNA RNA Proteins Metabolites Replication rate Environment Biosystems Engineering Integrating Measures & Models Escherichia Darwinian optima Prochlorococcus mutant suboptimality Homo RNAi Insertions SNPs interactions

16 Integer Stoichiometric matrix (Roche/ExPASy): Metabolic Pathways, Cellular Processes

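17 Metabolite $X_i$ inside the membrane, with fluxes $V_{transport}$, $V_{syn}$, $V_{deg}$, $V_{growth}$. Growth: $c_1 X_1 + c_2 X_2 + \dots + c_m X_m \rightarrow$ Biomass. Flux ratios at each branch point yield the optimal polymer composition for replication. Steady state: $X_i = \text{const.} \Rightarrow \sum_j v_j = 0$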

18 Biomass composition. Xi = metabolites, Ci = coeff. in growth reaction. Optimize flow from input C, N, P to Biomass. Edwards & Palsson, PNAS 2000, BMC Bioinf. 2000. Components shown: AcCoA, CoA, ATP, FAD, NADH, GTP, CTP, UTP, SucCoA, Trp, Leu, Ala, Arg, Gly, Cys, Ser, Asn, Asp, His, Val, Glu, Gln, Phe, Pro, Ile, Lys, Met, Tyr, Thr, dACGT
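The idea that growth is maximized when the flux ratio at a branch point matches the biomass composition can be shown with a toy example. This is a minimal sketch, not the Edwards & Palsson model: it assumes a single uptake flux split between two hypothetical precursors X1 and X2, a growth reaction consuming them in the ratio c1:c2, and excretion of whatever is left over. A grid scan stands in for the linear program, and all numbers are made up for illustration.

```python
def growth(split, uptake=10.0, c=(3.0, 1.0)):
    """Biomass flux when a fraction `split` of uptake goes to precursor X1.

    Biomass needs X1 and X2 in the ratio c[0]:c[1]; whichever precursor
    runs out first limits growth (the excess is assumed to be excreted).
    """
    v1 = uptake * split          # flux into precursor X1
    v2 = uptake * (1.0 - split)  # flux into precursor X2
    return min(v1 / c[0], v2 / c[1])

# Scan the branch-point split on a grid and keep the best value.
splits = [i / 1000 for i in range(1001)]
best = max(splits, key=growth)

print(f"best split = {best:.3f}, growth flux = {growth(best):.3f}")
```

With the toy coefficients 3:1, the best split sends three quarters of the uptake to X1, so the optimal branch ratio mirrors the biomass coefficients.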

19 Minimization of Metabolic Adjustment (MoMA). Linear Programming (LP) to find optima, Quadratic Programming (QP) to find closest points. x, y are two of the 100s of flux dimensions. Figure labels: wild-type and mutant feasible flux polyhedra; wild-type optimum; mutant optimum; mutant initially (closest point); objective function = growth flux hyperplanes. Segre, Vitkup, & Church PNAS 99: 15112-7
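The difference between the mutant LP optimum and the MoMA closest point can be sketched in two flux dimensions. This is a toy illustration under stated assumptions, not the published implementation: the feasible polygons, the growth objective x + 2y, and all function names are hypothetical, the LP is solved by checking polygon vertices, and the QP is solved by projecting onto polygon edges (valid here because the wild-type optimum lies outside the mutant polygon).

```python
def growth(v):
    """Hypothetical growth objective over two fluxes (x, y)."""
    return v[0] + 2.0 * v[1]

def project_onto_segment(p, a, b):
    """Closest point to p on the segment from a to b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        return a
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / length_sq
    t = max(0.0, min(1.0, t))  # clamp to stay on the segment
    return (a[0] + t * dx, a[1] + t * dy)

def closest_point_on_boundary(p, vertices):
    """Euclidean projection of an outside point p onto a convex polygon."""
    edges = zip(vertices, vertices[1:] + vertices[:1])
    candidates = [project_onto_segment(p, a, b) for a, b in edges]
    return min(candidates, key=lambda q: (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2)

# Toy feasible regions: the knockout caps flux y at 4.
wild_type = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
mutant = [(0.0, 0.0), (10.0, 0.0), (6.0, 4.0), (0.0, 4.0)]

wt_opt = max(wild_type, key=growth)   # LP optimum lies at a vertex
mut_lp = max(mutant, key=growth)      # mutant optimum (LP)
mut_moma = closest_point_on_boundary(wt_opt, mutant)  # mutant initially (QP)

print("wild-type optimum:", wt_opt, growth(wt_opt))
print("mutant LP optimum:", mut_lp, growth(mut_lp))
print("mutant MoMA point:", mut_moma, growth(mut_moma))
```

In this toy case the MoMA point has lower growth than the mutant LP optimum, which is the slide's distinction between where a mutant starts and where it could eventually get to.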

20 12C / 13C MS/NMR Flux Ratio Data

21 Flux Data, C009-limited [three scatter plots of Predicted Fluxes vs. Experimental Fluxes, points numbered 1-18: WT (LP), ρ=0.91, p=8e-8; Δpyk (LP), ρ=-0.06, p=6e-1; Δpyk (QP), ρ=0.56, p=7e-3]

22 Reproducibility of mass competition: correlation between two selection experiments. Badarinarayana, et al. Nature Biotech. 19:1060

23 Competitive growth data. χ² p-values: 4x10^-3 (LP), 1x10^-5 (QP). Position effects. Novel redundancies. On minimal media negative small selection effect. Hypothesis: next optima are achieved by regulation of activities.

24 Motif co-occurrence, comparative genomics, RNA clusters, and/or ChIP2-location data. P = 10^-6 to 10^-11. Bulyk, McGuire, Masuda, Church. Genome Res. 14:201–208

25 Synthetic testing of DNA motif combinations. RNA ratio (motif- to wild type) for each flanking gene: 1.3, 2.4 (1.3 in ΔargR), 1.1, 1.3, 0.7, 2.5, 0.2, 1.4, 1.4, 3.5. Bulyk, McGuire, Masuda, Church. Genome Res. 14:201–208

26 Systems Biology Loop Synthesis / Perturbation Model Experimental design (Systematic) Data Proteasome targeting Genome Engineering

27 Engineering BioSystems
   Perturbations             Action  Specificity  %KO       "Design"
   Small molecules (drugs)   Fast    Varies       Varies    Hard
   Antibodies                Fast    Varies       Varies    Hard
   RNAi                      Slow    Varies       Medium    OK
   Insertion "traps"         Slow    Yes          Varies    Random
   Proteasome targeting      Fast    Excellent    Medium    Easy
   Homologous recombination  Slow    Perfect      Complete  Easy

28 Programming proteasome targeting. Janse DM, Crosas B, Finley D & Church GM (2004) Localization to the Proteasome is Sufficient for Degradation.

29 Synthetic Genomes & Proteomes. Why?
   Test or engineer cis-DNA/RNA-elements
   Access to any protein (complex) including post-transcriptional modifications
   Affinity agents for the above
   Mass spectrometry standards, protein design
   Utility of molecular biology DNA-RNA-Protein in vitro "kits" (e.g. PCR, SP6, Roche)
   Toward these goals design a chassis: 115 kbp genome. 150 genes. Nearly all 3D structures known. Comprehensive functional data.

30 PURE translation utility (yet room for improvement). Removing tRNA-synthetases, RNases & proteases makes feasible: optimal mRNA structure & codon usage.
   Lee et al. 2004 J Immunol Methods. 284:147-57. Selection of scFvs specific for HBV DNA polymerase using ribosome display.
   Forster et al. 2003 PNAS 100:6353-7. Programming peptidomimetic syntheses by translating genetic codes designed de novo.
   Klammt et al. 2004 Eur J Biochem. 271:568-80. High level cell-free expression & specific labeling of integral membrane proteins.
   Shimizu et al. 2001 Nat Biotechnol. 19:751-5. Cell-free translation reconstituted with purified components.

31 In vitro genetic codes [figure: mRNA drawn 5' to 3' with codons reassigned to mS, yU and eU; codon table indexed by 5' base, second base and 3' base; example peptide fMTNVE]. 80% average yield per unnatural coupling. bK = biotinyllysine, mS = O-methylserine, eU = 2-amino-4-pentenoic acid, yU = 2-amino-4-pentynoic acid. Forster, et al. (2003) PNAS 100:6353-7

32 Mirror world: resistant to enzymes, parasites, predators. L-amino acids & D-ribose (rNTPs, dNTPs) => Transition: EF-Tu, peptidyl transferase, DNA-ligase => D-amino acids & L-ribose (rNTPs, dNTPs). Dedkova, et al. (2003) Enhanced D-amino acid incorporation into protein by modified ribosomes. J Am Chem Soc 125, 6616-7

33 Forster & Church: oligos for 150 & 776 synthetic genes (for E. coli minigenome & M. mobile whole genome, respectively)

34 Up to 760K Oligos/Chip; 18 Mbp for $700 raw (6-18K genes)
   <1K   Oxamer: Electrolytic acid/base
   8K    Atactic/Xeotron/Invitrogen: Photo-Generated Acid. Sheng, Zhou, Gulari, Gao (U. Houston)
   24K   Agilent: Ink-jet standard reagents
   48K   Febit
   100K  Metrigen
   380K  Nimblegen: Photolabile 5' protection. Nuwaysir, Smith, Albert
   Tian, Gong, Church

35 Improve DNA Synthesis Cost. Synthesis on chips in pools is 5000X less expensive per oligonucleotide, but amounts are low (1e6 molecules rather than the usual 1e12) & bimolecular kinetics slow with the square of the concentration decrease! Solution: amplify the oligos, then release them. 10-50-10 => ss-70-mer (chip) => ds-90-mer (20-mer PCR primers with restriction sites at the 50-mer junctions) => ds-50-mer. Tian, Gong, Sheng, Zhou, Gulari, Gao, Church
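The kinetics penalty and the oligo lengths on this slide follow from simple arithmetic. A minimal sketch using the slide's numbers; it assumes the same reaction volume for both synthesis scales and a second-order (bimolecular) rate law, and it assumes each 20-mer primer overlaps a 10-nt flank and adds 10 nt, which is an inference from the 70-mer to 90-mer step rather than something the slide states.

```python
# Numbers taken from the slide
chip_molecules = 1e6           # molecules per oligo from chip synthesis
conventional_molecules = 1e12  # molecules per oligo from conventional synthesis

# Assumption: equal volume, so concentration scales with amount, and a
# bimolecular step has a rate proportional to [A] * [B].
dilution = conventional_molecules / chip_molecules
rate_slowdown = dilution ** 2

# Oligo layout: 10-nt flank + 50-nt construction sequence + 10-nt flank
flank, construct, primer = 10, 50, 20
chip_oligo = flank + construct + flank         # ss-70-mer on the chip
# Assumption: each primer anneals to a flank and extends it by 10 nt.
amplified = chip_oligo + 2 * (primer - flank)  # ds-90-mer after PCR
released = amplified - 2 * primer              # ds-50-mer after restriction

print(f"concentration drop: {dilution:.0e}x, rate drop: {rate_slowdown:.0e}x")
print(f"lengths: {chip_oligo} -> {amplified} -> {released}")
```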

36 Improve DNA Synthesis Accuracy via mismatch selection Tian & Church

37 Genome assembly. Challenges:
   1. Tandem, inverted and dispersed repeats (hierarchical assembly, size-selection and/or scaffolding)
   2. Reduce mutations (goal <1e-6 errors) to reduce # of intermediates
   3. >30 kbp homologous recombination (Nick Reppas)
   Stemmer et al. 1995. Gene 164:49-53. Single-step assembly of a gene and entire plasmid from large numbers of oligodeoxyribonucleotides.
   Assembly lengths: 50, 75, 125, 225, 425, 825, … 100*2^(n-1)
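The listed lengths 50, 75, 125, 225, 425, 825 are consistent with pairwise joining in which each round fuses two equal-length fragments that share a 25-bp overlap. That overlap is inferred from the numbers, not stated on the slide, so the following is a sketch under that assumption with illustrative names.

```python
def assembly_lengths(start=50, overlap=25, rounds=6):
    """Fragment length after each round of pairwise assembly.

    Each round joins two fragments of equal length that share `overlap`
    bases, so the new length is 2 * previous - overlap.
    """
    lengths = [start]
    for _ in range(rounds - 1):
        lengths.append(2 * lengths[-1] - overlap)
    return lengths

print(assembly_lengths())
```

The lengths roughly double each round, which is why reducing the per-base error rate matters: fewer faulty intermediates need to be screened at every level of the hierarchy.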

38 All 30S-Ribosomal-protein DNAs & mRNAs synthesized in vitro [gel images, lanes M and 1-21: DNA templates (Nimblegen, Xeotron/Atactic, wild-type) and RNA transcripts; size markers 0.5kb and 0.3kb; s19 labeled]. Tian, Gong, Sheng, Zhou, Gulari, Gao, Church

39 Improving synthesis accuracy 9-fold (Tian & Church)
   Method                        Total bp  # Clones  Transition  Transversion  Deletion  Addition  Bp/error
   Hyb selection, PCR            23641     9         7           3             5         2         1391
   Gel selection, PCR            24546     35        28          12            11        3         455
   No selection, ligation + PCR  6093      25        6           6             22        4         160
   No selection, PCR             9243      21        25          13            19        1         159
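The "9-fold" headline and its practical consequence can be checked from the bp/error column alone. A minimal sketch, assuming errors occur independently at a uniform per-base rate; the 500-bp construct length is a hypothetical example, and the function name is illustrative.

```python
# Bp/error values from the slide's table
bp_per_error = {
    "Hyb selection, PCR": 1391,
    "Gel selection, PCR": 455,
    "No selection, ligation + PCR": 160,
    "No selection, PCR": 159,
}

def error_free_fraction(length_bp, bp_per_err):
    """Probability that a construct of length_bp contains no errors,
    assuming independent errors at a rate of 1/bp_per_err per base."""
    return (1.0 - 1.0 / bp_per_err) ** length_bp

fold = bp_per_error["Hyb selection, PCR"] / bp_per_error["No selection, PCR"]
print(f"improvement: {fold:.1f}-fold")

# Hypothetical 500-bp construct, for illustration only
for method, bpe in bp_per_error.items():
    print(f"{method:>30}: {error_free_fraction(500, bpe):.1%} error-free")
```

Under these assumptions, hybridization selection raises the share of perfect 500-bp clones from a few percent to well over half, so far fewer clones need sequencing per construct.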

40 Extreme mRNA makeover for protein expression in vitro. RS-2, 4, 5, 6, 9, 10, 12, 13, 15, 16, 17, and 21 detectable initially. RS-1, 3, 7, 8, 11, 14, 18, 19, 20 initially weak or undetectable. Solution: iteratively resynthesize all mRNAs with less mRNA structure. Western blot based on His-tags. Tian & Church

41 Enabling technologies: Multi-gene assembly; Protein, peptidomimetic synthesis; CAD/CAM & design for manufacturing; Automated homologous recombination for E. coli & embryonic stem cells; Fidelity enhancements; Sequencing 10^7 bp/$ ($1K/human)

42 Optimal Combinatorial Biology & Genome Engineering. BU BME retreat, 23-Jun-2004, 9:45-10:30, Seacrest, N. Falmouth, MA. Thanks to: DOE-GTL, DARPA-BioComp, NIGMS-CECBSR, NHGRI-CEGS, PhRMA, EU-MolTools, NHLBI-PGA, Broad Inst., Lipper Foundation; Agencourt, Ambergen, Atactic, BeyondGenomics, Caliper, Genomatica, Genovoxx, MJR, NEN, Nimblegen, ThermoFinnigan, Xeotron/Invitrogen. For more info see: arep.med.harvard.edu

44 Improve DNA Synthesis accuracy. Synthesis on a chip: pools of "construction" ~50-mers and two complementary "selection" ~26-mers (Left & Right).
   Construction: 10-50-10 => ss-70-mer (chip) => ds/ss-50-mer (amplif/restrict)
   Selection: 10-26-10 => ss-56-mer (chip) => ss-76-mer (amplif/avidin), using 20-mer PCR primers (one biotinylated)
   Tian, Gong, Sheng, Zhou, Gulari, Gao, Church

45 Improve DNA Synthesis Accuracy via D-HPLC or MutS
   Smith & Modrich (1997) PNAS 94: 6847–50. Removal of polymerase-produced mutant sequences from PCR products. MutHLS cleaves at GATC near mismatches. Lowers error rate from 6e-6 to 6e-7.
   Bellanne-Chantelot et al. (1997) Mutat Res. 382:35-43. Search for DNA sequence variations using a MutS-based technology.
   Mulligan & Tabone (2002) US Patent 6,664,112. Methods for improving the sequence fidelity of synthetic double-stranded oligonucleotides.

