Single Cell, RNA, & Chromosome Sequencing Technologies


Single Cell, RNA, & Chromosome Sequencing Technologies
George Church
2:30-3:00 PM, Tue 3-Oct-2006
Cancer Genomics & Emerging Technologies
Thanks to: NCI/NIH HMS-CGCC, AppliedBiosystems-Agencourt, Affymetrix, Helicos, 454, Solexa, DNAdirect, CompleteGenomics, Codon Devices

Multiplex Polony Summary
Technologies for selecting genomic regions
  Mbp scale for rearrangements, RNA tags & spliceforms
  1 to 200 bp scale for SNPs & exons (1%)
Low cost & high accuracy: $0.07/kbp at a 3E-7 consensus error rate
Paired-end-tags (PET) for rearrangements
Detection of rare mutations (e.g. drug resistance alleles)
60 million reads per run

Selective genome sequencing
Numerous (100K) small regions (exons & point mutations)
  PCR: 21 Mbp, >$250K. Sjoblom et al. (2006) Science
  Highly multiplexed molecular inversion probe genotyping: over 10,000 targeted SNPs genotyped in a single tube assay. Hardenbol et al. Genome Res. 2005 Feb;15(2):269-75.
  Analyzing genes using closing and replicating circles. Nilsson et al. (2006) Trends Biotechnol 24:83.
One large region
  Single molecule amplification, 1 to 4 Mbp. Zhang et al. 2006 Nature Biotech. 24:680
  Direct genomic [BAC hybridization] selection [50% pure]. Bashiardes et al. (2005) Nat Methods 2:63.

Selective genome sequencing
Two ways to capture alleles from genomic ss-DNA
In vitro paired-tag library
Gap fill
Cleave & ligate
Figure key: Red = synthetic; Yellow = genomic
Shendure, et al. Science 309(5741):1728-32. Nilsson et al. (2006) Trends Biotechnol 24:83.
How do we optimize >100K 100mers?
Zhang, Chou, Shendure, Li, Leproust, Church, Dahl, Davis, Nilsson

How? 10 Mbp of oligos / $1000 chip
~1000X lower oligo costs
Digital Micromirror Array
  8K    Atactic/Xeotron/Invitrogen   Photo-Generated Acid
  12K   Combimatrix/Codon            Electrolytic
  44K   Agilent                      Ink-jet, standard reagents
  380K  Nimblegen/GA                 Photolabile 5' protection
Amplify pools of 50mers using flanking universal PCR primers & 3 paths to 10X error correction
Tian et al. Nature. 432:1050; Carr & Jacobson 2004 NAR; Smith & Modrich 1997 PNAS

Padlock, Molecular Inversion Probes (MIPs)
CG to CA, TG: 35% of germline, 44% of colorectal cancer mutations
(not restricted to single nucleotides nor common polymorphisms)
Figure labels: R, L, universal primers, optional multiplex tag, genomic DNA, alternative alleles (CG, CA, TG)
Zhang, Chou, Shendure, Li, Leproust, Church, Dahl, Davis, Nilsson (10K to 1M 100-mer probes per pool; see Kun Zhang’s poster)
Vitkup, Sander, Church. The Amino-acid Mutational Spectrum of Human Genetic Disease. Genome Biol. 4: R72. (CG to CA, TG)

Sequencing genomes from single cells via polymerase clones -- Plones (single chromosome, cell, RNA or particle)
Zhang, et al. (2006) Nature Biotech. June ’06
1) When we only have one cell, as in Preimplantation Genetic Diagnosis/Haplotyping (PGD/PGH) or environmental samples (poor lab growth)
2) Candidate chromosome region sequencing
3) Prioritizing or pooling (rare) species based on an initial DNA screen (metagenomics)
4) Multiple chromosomes in a cell or virus
5) RNA splicing
6) Cell-cell interactions (predator-prey, symbionts, commensals, parasites)
Phi-29 Polymerase strand-displacement amplification

Single molecule amplification sequencing
Multiple Displacement Amplification (MDA). NBT (2006) 24: 657-8.
Note!: Single human cell 1000X easier than 5 Mbp
Zhang et al., Nature Biotechnology (2006) 24:680

Single-cell sequencing: 4.7 Mbp (plones)
Ultra-clean conditions for reduction of background amplification + real-time monitoring
Post-amplification chip hybridization distinguishes alleles
Amplification variation random & easily filled by PCR

CD44 Counts (RNA splicing forms)
Eph4 = mammary epithelial cell line
Eph4bDD = stable transfection of Eph4 with MEK-1 (tumorigenic)
Zhu, Shendure, Mitra, Church. Single Molecule Profiling of Alternative Pre-mRNA Splicing. Science 301:836-8.

Reading Polonies: Beads or not, Ligase or Polymerase
Figure labels: A G C T

‘Next Generation’ Sequencing Status
Platform             Chemistry      Reaction volume
Multi-molecule
  AB/APG             Ligase beads   1 fL
  454/Roche          Pol beads      100,000 fL
  Solexa             Pol term       1 fL
  CGI                Ligase         1 fL
  Affymetrix         Hybr array     100 fL
Single molecules
  Helicos Biosci     Pol            <1 fL
  Visigen Biotech    Pol FRET       <1 fL
  Pacific Biosci     Pol            <1 fL
  Agilent            Nanopores      <1 fL
fL = 1E-15 liters (femto)
(7/9 involve our lab)

Length & run-time vs. Accuracy & Cost
"Future improvements in the read lengths, demonstrated at 7 consecutive bases per tag (Shendure et al., 2005) and reductions in the run time, currently 60 hours, will make this a useful platform for resequencing." --Leamon, et al. (454) Gene Therapy and Regulation 3: 15-31
Note that without ‘future improvements’:
Affymetrix/Illumina read-lengths of 1 base per tag are useful.
60 million reads/run is 10X faster per read than 500K reads/run,
& 50X lower cost per bp due to lower reagent & instrument costs ($500/run, $140K device).

Polony Sequencing Equipment
CCD camera
Microscope with xyz controls
Autosampler (96 wells) (HPLC-like)
Flow-cell
Syringe pump
Temperature control

Integrated Polony Sequencing Pipeline (open source hardware, software, wetware)
Monolayer immobilization
In vitro paired tag libraries
Bead polonies via emulsion PCR
Enrich amplified beads
Dressman et al. PNAS 2003
SBE or SBL sequencing
Software: Images → Tag Sequences; Tag Sequences → Genome
Epifluorescence & Flow Cell, $140K
Shendure, Porreca, Reppas, Lin, McCutcheon, Rosenbaum, Wang, Zhang, Mitra, Church (2005) Science 309:1728.

4 positions for paired-end anchor 'primers'
Amplicon on ePCR bead: 5’ L, Tag 1, M, Tag 2, R 3’
Reads from the four anchor positions: 7 bp, 6 bp, 7 bp, 6 bp
Each yields 6 to 7 bp of contiguous sequence
26 bp new sequence per 135 bp amplicon
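The arithmetic on this slide can be checked with a short sketch. The read lengths and amplicon length come from the slide, and the 60 million reads per run figure comes from the summary slide. The function names are illustrative, and the per-run total is only an upper bound that assumes every bead yields all four reads at full length.

```python
# Tag lengths read from the four anchor positions, as listed on the slide.
TAG_READ_LENGTHS_BP = [7, 6, 7, 6]
AMPLICON_LENGTH_BP = 135        # amplicon length from the slide
READS_PER_RUN = 60_000_000      # "60 million reads per run" from the summary slide


def new_sequence_per_amplicon(read_lengths):
    """Total bases of new sequence obtained from one amplicon."""
    return sum(read_lengths)


def informative_fraction(read_lengths, amplicon_length):
    """Fraction of the amplicon that is actually sequenced."""
    return sum(read_lengths) / amplicon_length


def max_raw_bases_per_run(read_lengths, reads_per_run):
    """Upper bound on raw bases per run (assumes every bead gives full-length reads)."""
    return sum(read_lengths) * reads_per_run


if __name__ == "__main__":
    print(new_sequence_per_amplicon(TAG_READ_LENGTHS_BP), "bp per amplicon")
    print(f"{informative_fraction(TAG_READ_LENGTHS_BP, AMPLICON_LENGTH_BP):.1%} of amplicon")
    print(max_raw_bases_per_run(TAG_READ_LENGTHS_BP, READS_PER_RUN), "raw bases per run (upper bound)")
```

About a fifth of each amplicon is new sequence; the rest is the universal L, M and R regions that the anchor primers bind.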

Sequencing by Ligation (SBL) with fluorescent combinatorial 9-mers
Probe                      Excitation (nm)   Emission (nm)
5’-Cy5-nnnnAnnnn-3’        647               700
5’-Cy3-nnnnGnnnn-3’        555               605
5’-TR-nnnnCnnnn-3’         572               630
5’-Cy3+Cy5-nnnnTnnnn-3’    555               700
5'PO4 ACUCAUC…
(3’)…TAGAGT????????????????TGAGTAG…(5’)
Shendure, Porreca, et al. (2005) Science 309:1728
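The dye-to-base assignment on this slide can be sketched as a simple lookup. This is a minimal illustration, not the published base-calling software: the function names are hypothetical, the dye call for each cycle is assumed to have been made already from the images, and the returned letters are the queried 9-mer bases (converting to the template strand would take the complement).

```python
# Dye-to-base assignment as listed on the slide (the fixed base of each 9-mer pool).
DYE_TO_BASE = {
    "Cy5": "A",
    "Cy3": "G",
    "TR": "C",        # Texas Red
    "Cy3+Cy5": "T",
}


def nonamer_pattern(base, query_position=5, length=9):
    """Build a degenerate probe such as 'nnnnAnnnn' with `base` at the
    1-based query position. Position 5 matches the example on the slide."""
    if not 1 <= query_position <= length:
        raise ValueError("query_position must lie within the probe")
    return "n" * (query_position - 1) + base + "n" * (length - query_position)


def decode_cycles(dye_calls):
    """Translate one dye call per ligation cycle into bases.
    Unrecognised calls become 'N' rather than raising."""
    return "".join(DYE_TO_BASE.get(dye, "N") for dye in dye_calls)


if __name__ == "__main__":
    print(nonamer_pattern("A"))
    print(decode_cycles(["Cy5", "Cy3", "TR", "Cy3+Cy5"]))
```

Shifting `query_position` models interrogating a different base relative to the anchor primer in each cycle.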

Why low error rates?
Goal of genotyping & resequencing → Discovery of variants, e.g. cancer somatic mutations 4E-6 (& lab-evolved cells)
Consensus error rate   Source            Total errors (E. coli)   Total errors (Human)
1E-4                   Bermuda/Hapmap    500                      600,000
4E-5                   454               200                      240,000
3E-7                   Polony-SbL @6X    0                        1,800
1E-8                   Goal for 2006     0                        60
Also, effectively reduce (sub)genome target size by enrichment for exons or common SNPs to reduce cost & # false positives.
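The totals on this slide follow from multiplying a consensus error rate by the number of bases called. The sketch below assumes genome sizes that the slide does not state: about 6E9 bp for a diploid human genome and about 5E6 bp for E. coli. These values were chosen because they reproduce the human column and the first two E. coli entries. With the same assumption the Polony-SbL E. coli entry would come out near 1.5 rather than the 0 shown, so that entry is presumably an observed count.

```python
# Assumed genome sizes (not stated on the slide; chosen to match its totals).
HUMAN_DIPLOID_BP = 6e9
ECOLI_BP = 5e6

# Consensus error rates as listed on the slide.
ERROR_RATES = {
    "Bermuda/Hapmap": 1e-4,
    "454": 4e-5,
    "Polony-SbL @6X": 3e-7,
    "Goal for 2006": 1e-8,
}


def expected_errors(error_rate, genome_size_bp):
    """Expected number of wrong consensus base calls across a genome."""
    return error_rate * genome_size_bp


if __name__ == "__main__":
    for name, rate in ERROR_RATES.items():
        ecoli = expected_errors(rate, ECOLI_BP)
        human = expected_errors(rate, HUMAN_DIPLOID_BP)
        print(f"{name:16s} E. coli ~{ecoli:g}  human ~{human:,.0f}")
```

Compared with the somatic mutation frequency of 4E-6 quoted on the slide, only the two lowest error rates keep false positives below the true variants being sought.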

Microbial lab evolution
Lenski: Citrate utilization
Church: Trp/Tyr exchange
Palsson: Glycerol utilization
Edwards: Radiation resistance
Ingram: Lactate production
Stephanopoulos: Ethanol resistance
Marliere: Thermotolerance
J&J: Diarylquinoline resistance (TB)
DuPont: 1,3-propanediol production

Polony-based Whole-Genome Mutation Discovery of ΔTrp clone
Position | Type | Gene | Location | Function | Mechanism
986,334 | T > G | ompF | Promoter -10 | Promoter of non-specific transport channel | Makes promoter more consensus-like
985,797 | Glu > Ala | ompF | Glu-117 (in the pore) | Non-specific transport channel | Makes pore bigger and more hydrophobic
931,960 | Δ8 bp | lrp | frameshift | General transcriptional regulator | ?
ompF: non-specific transport channel
  Glu-117 → Ala (in the pore)
  Charged residue known to affect pore size and selectivity
  Can increase import & export capability simultaneously
Shendure, et al. (2005) Science 309:1728

Evolving Population: Multiple Genotypes, Similar Themes
PCR amplification and sequencing of OmpF and Lrp from multiple clones from 3 independent lines of Trp/Tyr co-cultures:
OmpF: 42 R→G, L, C; 113 D→V; 117 E→A
  Arg → Gly, Leu, Cys; Asp → Val; Glu → Ala
  Hydrophilic and bulky → hydrophobic and smaller
Promoter: -12 A→C, -35 C→A
  More consensus-like
Lrp: 1 bp deletion, 9 bp deletion, 8 bp deletion, IS2 insertion, R→L in DBD
  Change in global gene regulation?
Heterogeneity within each time-point reflects colony heterogeneity.
Reppas, Lin, et al. (unpublished)

Mixture of wild & 2kb Inversion (pin)
Figure labels: proximal tag placement; distal tag placement; incorrect distance; 1,206k to 1,210k
Figure key: Red = same strand; Black = opposite strand
Using paired ends, rearrangement & copy-number detection is >1000X easier than point mutation detection (6X vs 6000X)
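The figure key suggests flagging paired tags by strand agreement and by the distance between them. The sketch below is a hypothetical classifier in that spirit, not the analysis used for the slide. The `TagPair` type, the distance window, and the `expect_same_strand` switch are all assumptions, and the slide does not say which orientation is the normal one, so that choice is left as a parameter.

```python
from dataclasses import dataclass


@dataclass
class TagPair:
    """Mapped positions and strands of the two tags from one bead (hypothetical type)."""
    pos1: int
    strand1: str  # '+' or '-'
    pos2: int
    strand2: str  # '+' or '-'


def classify_pair(pair, min_dist, max_dist, expect_same_strand=False):
    """Label a tag pair as concordant or as evidence of a rearrangement.

    min_dist / max_dist: assumed acceptable distance window between tags.
    expect_same_strand: which orientation counts as normal (an assumption).
    """
    same_strand = pair.strand1 == pair.strand2
    distance = abs(pair.pos2 - pair.pos1)
    if same_strand != expect_same_strand:
        return "orientation_anomaly"   # candidate inversion breakpoint
    if not (min_dist <= distance <= max_dist):
        return "distance_anomaly"      # candidate deletion, insertion or other rearrangement
    return "concordant"


if __name__ == "__main__":
    pairs = [
        TagPair(1_206_000, "+", 1_207_000, "-"),
        TagPair(1_206_500, "+", 1_207_400, "+"),
        TagPair(1_206_000, "+", 1_210_000, "-"),
    ]
    for p in pairs:
        print(p, classify_pair(p, min_dist=700, max_dist=1300))
```

Each anomalous pair gives a yes/no signal about the local arrangement. This may be part of why far lower coverage suffices for rearrangements than for point mutations.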

Polonies for human inversions
>300 kbp long inverted repeats
Turner, Hurles, et al. 2006 Nat Methods 3:439-45.
Sanger Inst. & HMS

Polonies for haplotyping, recombination, LOH
Sequencing/genotyping on single human chromosomes
153 Mbp
Zhang et al. Nature Genet. Mar 2006

Monitoring resistance to BCR-ABL-kinase inhibitors with polonies during CML patient therapy
Nardi, Raz, Chao, Wu, Stone, Cortes, Deininger, Church, Zhu, Daley (submitted)
Mutations shown: M244V, T315I, E255K

Multiplex Polony Summary
Technologies for selecting genomic regions
  Mbp scale for rearrangements, RNA tags & spliceforms
  1 to 200 bp scale for SNPs & exons (1%)
Low cost & high accuracy: $0.07/kbp at a 3E-7 consensus error rate
Paired-end-tags (PET) for rearrangements
Detection of rare mutations (e.g. drug resistance alleles)
60 million reads per run

Polonies with & without beads or gels
Increases from 14 to 57 million polony beads per run & improves data quality.
Kim, Porreca, Seidman, Church (unpublished)

Why low error rates?
Goal of genotyping & resequencing → Discovery of variants, e.g. cancer somatic mutations 4E-6 (& lab-evolved cells)
Consensus error rate   Source            Total errors (E. coli)   Total errors (Human)
1E-4                   Bermuda/Hapmap    500                      600,000
4E-5                   454 @40X          200                      240,000
3E-7                   Polony-SbL @6X    0                        1,800
1E-8                   Goal for 2006     0                        60
Also, effectively reduce (sub)genome target size by enrichment for exons or common SNPs to reduce cost & # false positives.

Cost vs consensus error rate
                 AB3730    454 (Sep05)   Polony (Sep05)   Polony (Sep06)
$/kb @4E-5       $7        $9            $0.8             $0.07
$/3e9 @1X        $3M       $300K         $30K
Paired ends      yes       no            yes
Device $         $365K     $400K         $140K

Cancer exon sequencing
$250K per sample (13,023 genes, 21 Mbp, 135,483 primer pairs) using PCR & capillary sequencing.
$3K per sample (estimate) using single tube capture & polonies.
Sjoblom et al. The Consensus Coding Sequences of Human Breast and Colorectal Cancers. Science. 2006 Sep.
Davies et al. Somatic mutations of the protein kinase gene family in human lung cancer. Cancer Res. 2005 65:7591-5.
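The two per-sample figures can be put on a common footing with a little arithmetic. All inputs below are taken from the slide, and the $3K figure is the slide's own estimate. The function names are illustrative, and the derived per-kbp and per-gene numbers appear nowhere in the talk.

```python
# Figures quoted on the slide.
PCR_COST_PER_SAMPLE_USD = 250_000      # PCR & capillary sequencing
CAPTURE_COST_PER_SAMPLE_USD = 3_000    # single tube capture & polonies (slide's estimate)
TARGET_MBP = 21
GENES = 13_023
PRIMER_PAIRS = 135_483


def cost_per_kbp(cost_usd, target_mbp):
    """Cost per kilobase of targeted sequence."""
    return cost_usd / (target_mbp * 1000)


def fold_reduction(old_cost_usd, new_cost_usd):
    """How many times cheaper the new approach is."""
    return old_cost_usd / new_cost_usd


if __name__ == "__main__":
    print(f"PCR:     ${cost_per_kbp(PCR_COST_PER_SAMPLE_USD, TARGET_MBP):.2f} per kbp of target")
    print(f"Capture: ${cost_per_kbp(CAPTURE_COST_PER_SAMPLE_USD, TARGET_MBP):.2f} per kbp of target")
    print(f"Fold reduction: ~{fold_reduction(PCR_COST_PER_SAMPLE_USD, CAPTURE_COST_PER_SAMPLE_USD):.0f}X")
    print(f"PCR cost per gene: ${PCR_COST_PER_SAMPLE_USD / GENES:.2f}")
    print(f"PCR cost per primer pair: ${PCR_COST_PER_SAMPLE_USD / PRIMER_PAIRS:.2f}")
```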