HST Advisory Council
Thursday 16-Nov :00 to 2:20 PM
Personal Genomes & Medicine
Thanks to: Broad Inst., DARPA-BioComp, DOE-GTL, EU-MolTools, NHGRI-CEGS, NHLBI-PGA, NIGMS-CECBSR, PhRMA, Lipper Foundation
Agencourt, Ambergen, Atactic, BeyondGenomics, Caliper, Genomatica, Genovoxx, Helicos, MJR, NEN, Nimblegen, ThermoFinnigan, Xeotron/Invitrogen
For more info see: arep.med.harvard.edu
Why sequence?
Cancer: mutation sets for individual clones, loss-of-heterozygosity
Pathogen "weather map", biowarfare sensors
RNA splicing & chromatin modification patterns
Synthetic biology & lab selections
Antibodies or "aptamers" for any protein
B & T-cell receptor diversity: temporal profiling, clinical
Preventative medicine & genotype–phenotype associations
Cell-lineage during development
Phylogenetic footprinting, biodiversity
Shendure et al. Nature Rev Gen 5, 335.
The idea of Common SNPs for Common Diseases has been hugely oversold. Do association studies need the added baggage of "linkage" assumptions? Should we determine genotype (haplotype) directly (at low cost) rather than infer it from population trends?
Rare Alleles / Common Diseases
Even "dispensable" regions of the genome can harbor neomorphic alleles.
Each of us has about 10^4 mutations since the last major population bottleneck.
"-463GA, has been associated with incidence or severity of inflammatory diseases, including atherosclerosis and Alzheimer's disease, and some cancers. The polymorphism is within an Alu element" Kumar AP, et al. (2004) J Biol Chem. 279:
Variable breakpoints in Burkitt lymphoma cells with chromosomal t(8;14) translocation separate c-myc and the IgH locus up to several hundred kb. Joos S, et al. (1992) Hum Mol Genet. 1:
Multiple rare alleles contribute to low plasma levels of HDL cholesterol. Cohen JC et al. (2004) Science. 305:
Personal genomics & cancer therapy
Mutations G719S, L858R, Del746ELREA in red.
EGFR Mutations in lung cancer: correlation with clinical response to gefitinib [Iressa] therapy. Paez, … Meyerson (2004) Science 304: 1497
Dulbecco R. (1986) A turning point in cancer research: sequencing the human genome. Science 231:
Why 'single molecule' sequencing?
(1) Single-cells: Preimplantation (PGD), uncultivatable
(2) Co-occurrence on a molecule, complex, cell: RNA splice-forms & DNA haplotypes
(3) Cost: $1K-100K "personal genomes"
(4) Precision: Counting 10^9 RNA tags (to reduce variance) (~5e5 RNAs per human cell)
Tags counted at fixed costs: EST 5e3; SAGE 5e4; MPSS 5e6; Polony-FISSeq (polymerase colony) 5e9 (goal)
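The precision argument in point (4) can be illustrated with a minimal sketch. It assumes tag counts follow Poisson statistics (an assumption not stated on the slide), so the coefficient of variation of a count with expectation n is 1/sqrt(n). The per-method tag totals are read from the slide's figure labels, and the function names are hypothetical.

```python
import math

# Total tags per experiment, as read from the slide's figure labels.
TAGS_PER_METHOD = {
    "EST": 5e3,
    "SAGE": 5e4,
    "MPSS": 5e6,
    "Polony-FISSeq (goal)": 5e9,
}
RNAS_PER_CELL = 5e5  # slide's estimate of RNA molecules per human cell


def expected_tag_count(total_tags, copies_per_cell, rnas_per_cell=RNAS_PER_CELL):
    """Expected number of tags from a transcript present at copies_per_cell."""
    return total_tags * copies_per_cell / rnas_per_cell


def poisson_cv(expected_count):
    """Coefficient of variation of a Poisson count (assumed noise model)."""
    if expected_count <= 0:
        raise ValueError("expected_count must be positive")
    return 1.0 / math.sqrt(expected_count)


if __name__ == "__main__":
    # A transcript present at one copy per cell.
    for method, tags in TAGS_PER_METHOD.items():
        n = expected_tag_count(tags, copies_per_cell=1)
        print(f"{method}: expected tags = {n:g}, CV = {poisson_cv(n):.3g}")
```

Under these assumptions, a one-copy-per-cell transcript is expected to yield far fewer than one tag at EST or SAGE depth, while at the 5e9 goal it yields about 1e4 tags, for a counting error near 1%.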
CD44 Exon Combinatorics (Zhu & Shendure)
Alternatively Spliced Cell Adhesion Molecule
Specific variable exons are up- or down-regulated in various cancers (>2000 papers)
v6 & v7 enable direct binding to chondroitin sulfate, heparin…
Zhu J, et al. Science. 301:836-8.
Zhu J, Shendure J, Mitra RD, Church GM. Science 301: Single molecule profiling of alternative pre-mRNA splicing. Eph4 = murine mammary epithelial cell line Eph4bDD = stable transfection of Eph4 with MEK-1 (tumorigenic) CD44 RNA isoforms
Multi-locus haplotyping on pooled samples Kun Zhang Throughput = (# loci × # samples) / time
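The throughput definition above is simple arithmetic. As a small sketch, with a hypothetical function name and made-up example numbers that do not come from the talk:

```python
def haplotyping_throughput(num_loci, num_samples, elapsed_hours):
    """Throughput = (# loci x # samples) / time, in locus-samples per hour."""
    if elapsed_hours <= 0:
        raise ValueError("elapsed_hours must be positive")
    return (num_loci * num_samples) / elapsed_hours


if __name__ == "__main__":
    # Hypothetical run: 7 loci typed across 96 pooled samples in 24 hours.
    print(haplotyping_throughput(num_loci=7, num_samples=96, elapsed_hours=24))
```

Pooling raises the number of samples handled in the same elapsed time, which is the term this formula rewards.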
Multi-locus haplotyping
NOS3, ~24-Kb, Chr 7
C/T C/T G/A G/A G/T G/T G/A G/A T/A T/A C/T C/T C/T C/T
Chromosome-wide haplotyping IL : A/C ~60-Mb CD : T/A Human Chr. 7 A..T A..A
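The idea of reading phase directly from co-occurrence on single molecules can be sketched as follows. This is a minimal illustration, not the method used in the talk: it assumes each amplified molecule yields one allele call at each of two loci, that the sample is diploid, and that the two most frequent allele pairs are the true haplotypes. The function name and the read counts are hypothetical.

```python
from collections import Counter


def phase_two_loci(molecule_calls):
    """Return (pair counts, two most frequent haplotypes).

    molecule_calls: iterable of (allele_at_locus_1, allele_at_locus_2),
    one tuple per amplified single molecule.
    """
    counts = Counter(molecule_calls)
    # Assumption: a diploid sample carries at most two haplotypes, so
    # rarer pairs are treated as noise (e.g. miscalls or chimeras).
    haplotypes = [pair for pair, _ in counts.most_common(2)]
    return counts, haplotypes


if __name__ == "__main__":
    # Hypothetical counts; the A..T and A..A haplotypes mirror the slide.
    calls = [("A", "T")] * 48 + [("A", "A")] * 45 + [("C", "T")] * 2
    counts, haplotypes = phase_two_loci(calls)
    print(counts)
    print(haplotypes)
```

Because each call pair comes from one molecule, no population-level linkage assumption is needed to assign phase; a real pipeline would also need a noise threshold and handling of homozygous samples.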
Convergence on non-electrophoretic tag-sequencing methods?
Tag > bp (2-ends)
EST SAGE MPSS 454 Polony-Seq Ronaghi
Single-molecule vs. amplified single molecule
Array vs. bead packing vs. random
Rapid scans vs. long scans (chemically limited, 454)
Number of immobilized primers:
0: Chetverin'97 "Molecular Colonies"
1: Mitra'99 > Agencourt "Bead Polonies"
2: Kawashima'88, Adams'97 > Lynx/Solexa: "Clusters"
Bead Polony Sequencing Pipeline
In vitro libraries via paired tag manipulation
Bead polonies via emulsion PCR [Dre03]
Monolayered immobilization in acrylamide
Enrichment of amplified beads
SOFTWARE: Images → Tag Sequences; Tag Sequences → Genome
FISSEQ or “wobble” sequencing
Epifluorescence Scope with Integrated Flow Cell
Polony Fluorescent In Situ Sequencing Libraries
Greg Porreca, Abraham Rosenbaum
1 to 100kb Genomic
M L R M
PCR bead, Sequencing primers, Selector bead
2x20bp after MmeI (BceAI, AcuI)
emulsion: Dressman et al PNAS 2003
Cleavable dNTP-Fluorophore (& terminators)
Reduce or photo-cleave
Mitra RD, Shendure J, Olejnik J, Olejnik EK, and Church GM (2003) Fluorescent in situ Sequencing on Polymerase Colonies. Analyt. Biochem. 320:55-65
Polony-FISSeq: up to 2 billion beads/slide
Cy5 primer (666nm); Cy3 dNTP (570nm)
Jay Shendure
Self Organizing Monolayer
Polony FISSeq Stats (Shendure)
# of bases sequenced (total): 23,703,953
# bases sequenced (unique): 73
Avg fold coverage: 324,711 X
Pixels used per bead (analysis): ~3.6
Read length per primer: 14-15 bp
Insertions: 0.5%
Deletions: 0.7%
Substitutions (raw): 4e-5
Throughput: 360,000 bp/min
Current capillary sequencing: 1400 bp/min (600X speed/cost ratio, ~$5K/1X)
(This may omit: PCR, homopolymer, context errors)
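The coverage figure follows directly from the two base counts, and the raw speed comparison from the two throughput figures. A short arithmetic check using only numbers from the slide (variable names are illustrative):

```python
# Figures taken from the stats slide.
total_bases = 23_703_953
unique_bases = 73
polony_bp_per_min = 360_000
capillary_bp_per_min = 1_400

# Average fold coverage = total bases sequenced / unique bases covered.
fold_coverage = total_bases / unique_bases

# Raw speed ratio only; it ignores any difference in cost per run.
raw_speed_ratio = polony_bp_per_min / capillary_bp_per_min

if __name__ == "__main__":
    print(f"fold coverage: {fold_coverage:,.2f}")
    print(f"raw speed ratio: {raw_speed_ratio:.1f}")
```

The quotient truncates to the 324,711 X shown on the slide. The raw speed ratio comes out near 257, so the quoted 600X speed/cost ratio presumably also folds in a cost term that the slide does not itemize.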
Anonymity, privacy, identity Required disclosure > optional > required privacy
Non-anonymous healthy genotype-phenotype studies Are information-rich resources (e.g. facial imaging & genome sequence) really anonymous? What are the risks and benefits of "open-source"? What level of training is needed to give informed consent on open-ended studies? Harvard Medical School IRB Human Subjects protocol submitted 16-Sep-2004