Finishing the Human Genome

Slides:

Advertisements

Similar presentations

Next-Generation Sequencing: Methodology and Application

Advertisements

Polymerase Chain Reaction

Polymerase Chain Reaction (PCR). PCR produces billions of copies of a specific piece of DNA from trace amounts of starting material. (i.e. blood, skin.

High-Throughput Sequencing Technologies

Module 12 Human DNA Fingerprinting and Population Genetics p 2 + 2pq + q 2 = 1.

Doug Brutlag 2011 Sequencing the Human Genome Doug Brutlag Professor Emeritus of Biochemistry.

DNA Fingerprinting and Forensic Analysis Chapter 8.

Polymerase Chain Reaction (PCR2) fourth lecture Zoology department 2007 Dr.Maha H. Daghestani.

1 Library Screening, Characterization, and Amplification Screening of libraries Amplification of DNA (PCR) Analysis of DNA (Sequencing) Chemical Synthesis.

BME 130 – Genomes Lecture 4 Sequencing technology II Next generation sequencing.

Genomic DNA purification

Computational Molecular Biology Biochem 218 – BioMedical Informatics Doug Brutlag Professor.

Polymerase Chain Reaction

Doug Brutlag 2011 Next Generation Sequencing and Human Genome Databases Doug Brutlag Professor Emeritus of Biochemistry & Medicine Stanford University.

WORKSHOP (1) Presented by: Afsaneh Bazgir Polymerase Chain Reaction

IN THE NAME OF GOD. PCR Primer Design Lecturer: Dr. Farkhondeh Poursina.

PCR optimization. Primers – design must be good but influenced by template sequence Quality of template DNA/impurities Components of PCR may need to be.

Polymerase Chain Reaction

Polymerase Chain Reaction (PCR)

EDVOKIT#300: Blue/White Cloning of a DNA Fragment

Chapter 19 – Molecular Genetic Analysis and Biotechnology

Recombinant DNA Technology………..

DNA Cloning and PCR.

Polymerase Chain Reaction PCR. invented by Karry B. Mullis (1983, Nobel Prize 1993) patent sold by Cetus corp. to La Roche for $300 million depends on.

Qai Gordon and Maddy Marchetti. What is Polymerase Chain Reaction? Polymerase Chain Reaction ( PCR ) is a process that amplifies small pieces of DNA to.

The Polymerase Chain Reaction

Polymerase Chain Reaction PCR. PCR allows for amplification of a small piece of DNA. Some applications of PCR are in: –forensics (paternity testing, crimes)

Polymerase Chain Reaction (PCR) Developed in 1983 by Kary Mullis Major breakthrough in Molecular Biology Allows for the amplification of specific DNA fragments.

Stratton Nature 45: 719, 2009 Evolution of DNA sequencing technologies to present day DNA SEQUENCING & ASSEMBLY.

Success criteria - PCR By the end of this lesson we will be able to: 1. The polymerase chain reaction (PCR) is a technique for the amplification ( making.

Polymerase Chain Reaction (PCR)

INTRODUCTION. INTRODUCTION Introduction In the past, amplifying (replication) of DNA was done in bacteria and took weeks. In 1971, paper in the.

1. 2 VARIANTS OF PCR APPLICATIONS OF PCR MECHANICS OF PCR WHAT IS PCR? PRIMER DESIGN.

The polymerase chain reaction

A program of ITEST (Information Technology Experiences for Students and Teachers) funded by the National Science Foundation Background Session #5 Polymerase.

PPT-1. Experiment Objective: The objective of this experiment is to amplify a DNA fragment by Polymerase Chain Reaction (PCR) and to clone the amplified.

Polymerase Chain Reaction A process used to artificially multiply a chosen piece of genetic material. May also be known as DNA amplification. One strand.

Biotechnology and Genetic Engineering PBIO 450/550 Characterization of DNA clones including: Restriction Enzyme (RE) mapping Subcloning Southerns Northerns*

PCR With PCR it is possible to amplify a single piece of DNA, or a very small number of pieces of DNA, over many cycles, generating millions of copies.

The Polymerase Chain Reaction (PCR)

Introduction to PCR Polymerase Chain Reaction

Lab 22 Goals and Objectives: EDVOKIT#300: Blue/White Cloning of a DNA Fragment Calculate transformation efficiencies Compare control efficiency to cloned.

Higher Human Biology Unit 1 Human Cells KEY AREA 5: Human Genomics.

Polymerase Chain Reaction (PCR). What’s the point of PCR? PCR, or the polymerase chain reaction, makes copies of a specific piece of DNA PCR allows you.

CATEGORY: EXPERIMENTAL TECHNIQUES Polymerase Chain Reaction (PCR) Tarnjit Khera, University of Bristol, UK Background The polymerase chain reaction (PCR)

Lecturer: Bahiya Osrah Background PCR (Polymerase Chain Reaction) is a molecular biological technique that is used to amplify specific.

Rajan sharma.  Polymerase chain reaction Is a in vitro method of enzymatic synthesis of specific DNA sequences.  This method was first time developed.

Unit II Lecture 3 B. Tech. (Biotechnology) III Year V th Semester EBT-501, Genetic Engineering.

Polymerase Chain Reaction (PCR). DNA DNA is a nucleic acid that is composed of two complementary nucleotide building block chains. The nucleotides are.

BME 130 – Genomes Lecture 4 Sequencing technology II Next generation sequencing.

Presented by: Khadija Balubaid.  PCR (Polymerase Chain Reaction) is a molecular biological technique  used to amplify specific fragment of DNA in vitro.

Introduction to PCR Polymerase Chain Reaction

Polymerase Chain Reaction

Success criteria - PCR By the end of this lesson we will be know:

Polymerase Chain Reaction

Topics to be covered Basics of PCR

EDVOKIT#300: Blue/White Cloning of a DNA Fragment

DNA Sequencing -sayed Mohammad Amin Nourion -A’Kia Buford

Gel electrophoresis analysis Automated DNA analyzer.

Polymerase Chain Reaction & DNA Profiling

Polymerase Chain Reaction

Polymerase Chain Reaction

BIOTECHNOLOGY BIOTECHNOLOGY: Use of living systems and organisms to develop or make useful products GENETIC ENGINEERING: Process of manipulating genes.

Polymerase Chain Reaction

Polymerase Chain Reaction (PCR) technique

High-throughput sequencing techniques

Introduction to Bioinformatics II

Molecular Cloning.

Dr. Israa ayoub alwan Lec -12-

Standard (Sanger) sequencing

Presentation transcript:

Finishing the Human Genome http://biochem158.stanford.edu/ Genomics, Bioinformatics & Medicine I have been asked to speak to you about the impact that the human genome project will have on biomedical research and education. It is clear that the new fields of genomics and bioinformatics will have the same revolutionary effects on medicine as they have had on molecular biology and biomedical research. I would like to present you with a vision of that revolution and some of its implications for medicine and medical education in particular. I would first like to call your attention to some reference materials that can give you further background on these issues: Scientifc American Introduction to Molecular Medicine, just enough, BMA. Science Magazine features the Human Genome project, Lee Rowan (sequencing human genome), Phil Hieter on Understanding the Human genome, Joe Derisi, current graduate students (studying gene expression) MIS Short Course on Medical Informatics http://biochem158.stanford.edu/ - you may see a videolectures here Doug Brutlag Professor Emeritus of Biochemistry & Medicine Stanford University School of Medicine

Chromosome 21: Public vs Celera Assemblies https://www.celera.com/ Chromosome 21: Public vs Celera Assemblies

Chromosome 8: Public vs. Celera https://www.celera.com/ Chromosome 8: Public vs. Celera

Finishing Strategy for the Public Genome Project http://www.genome.gov/ Finishing Strategy for the Public Genome Project Finishing the sequence of clones Sequencing of large-insert clones began with production of an initial assembly based on shotgun sequence data. For a typical BAC clone, we generated 6–10-fold coverage in paired end-sequences from random, small-insert (2–4 kb) plasmid clones and used computer programs (P. Green, unpublished and see ref. 68) to assemble the data into sequence contigs connected by linking information. Each base was assigned a quality score reflecting its predicted accuracy, based on the underlying shotgun sequence data. The assembly typically had gaps and lowquality regions, with the number varying greatly across clones. These regions are highly enriched in sequences that are difficult to clone or sequence and thus are not represented even after deep (6–10-fold) coverage with random reads. (These regions are also poorly covered by whole-genome shotgun strategies69.) The finishing phase converted this draft assembly into a high-quality continuous sequence by obtaining directed information. It involved iterative cycles of computational analysis and laboratory work. Box 2 Fig. 1 shows a simplified flowchart. The first step was to inspect the draft assembly for evidence of mis-assembly, arising from inappropriate merger of repeated sequences. Such evidence would include inconsistent patterns of linking among contigs, regions with unusually high coverage in sequence reads and bases with ‘high-quality discrepancies’ among the underlying sequence reads. In general, sequence assembly is more straightforward for the clone-based hierarchical shotgun strategy than for the whole-genome shotgun strategy, because the use of clones avoids problems arising from polymorphism and from different copies of repeated regions elsewhere in the genome. Most clones passed assembly inspection, but some failed due to the presence of very similar local dispersed, tandem or inverted repeats. Careful inspection could resolve the problem in some cases, but specific strategies had to be devised in other cases. One approach was to isolate distinct copies of the repeat in subclones of intermediate size (10-kb plasmids or fosmids) and sequence these subclones. Box 2 Fig. 2 illustrates an initially mis-assembled BAC clone from chromosome Y that could be assembled correctly with careful editing. The second step was gap closure. Because gaps tended to be enriched for problematic sequences, gap closure was challenging; it often required multiple attempts using a variety of alternative methods. Gaps were classified into two types: ‘spanned’ and ‘unspanned’. Spanned gaps were those for which the two flanking contig ends were linked by an end-sequenced plasmid. Most such gaps could be closed by primer-directed sequencing of the plasmid, serially extending the

Polymerase Chain Reaction Overview: Exponential Amplification of DNA

The First Three Cycles After Cycle 1 Original DNA After Cycle 2 After N cycles, amount of target DNA is 2N-2N

PCR Requirements DNA DNA Polymerase Enzyme Thermocycler Need to know at least the beginning and end of DNA sequence These flanking regions have to be unique to strand interested in amplifying Region of interest can be present in as little as one copy Enough DNA in 0.1 microliter of human saliva to use PCR DNA Polymerase Enzyme DNA polymerase from Thermus aquaticus--Yellowstone Alternatives: Thermococcus litoralis, Pyrococcus furiosus Thermocycler

Temperature Cycling TAQ polymerase optimum at 72° C

PCR on a Chip Uses: Reaction complete in 2-20 minutes Extremely portable

Fluidigm PCR Arrays http://www.fluidigm.com/access-array-system.html

Real-Time PCR Uses: •Portable means to diagnose bacteria: epidemics Bioterrorism detection Military, medical, and municipal applications Fast: Results in less than seven minutes

Quantitative PCR

QuantaLife http://www.quantalife.com/

PCR Applications Forensics assessment/reassessment of crimes Archaeology determine gene sequences of ancient organisms rethinking the past, human origins Molecular Biology Cloning genes Sequencing genes Finishing genome sequences Amplification of DNA or RNA Medicine Diagnostics for inherited disease Diagnostics for gene expression Diagnostics for gene methylation

Finishing Strategy for the Public Genome Project http://www.genome.gov/ Finishing Strategy for the Public Genome Project Finishing the sequence of clones Sequencing of large-insert clones began with production of an initial assembly based on shotgun sequence data. For a typical BAC clone, we generated 6–10-fold coverage in paired end-sequences from random, small-insert (2–4 kb) plasmid clones and used computer programs (P. Green, unpublished and see ref. 68) to assemble the data into sequence contigs connected by linking information. Each base was assigned a quality score reflecting its predicted accuracy, based on the underlying shotgun sequence data. The assembly typically had gaps and lowquality regions, with the number varying greatly across clones. These regions are highly enriched in sequences that are difficult to clone or sequence and thus are not represented even after deep (6–10-fold) coverage with random reads. (These regions are also poorly covered by whole-genome shotgun strategies69.) The finishing phase converted this draft assembly into a high-quality continuous sequence by obtaining directed information. It involved iterative cycles of computational analysis and laboratory work. Box 2 Fig. 1 shows a simplified flowchart. The first step was to inspect the draft assembly for evidence of mis-assembly, arising from inappropriate merger of repeated sequences. Such evidence would include inconsistent patterns of linking among contigs, regions with unusually high coverage in sequence reads and bases with ‘high-quality discrepancies’ among the underlying sequence reads. In general, sequence assembly is more straightforward for the clone-based hierarchical shotgun strategy than for the whole-genome shotgun strategy, because the use of clones avoids problems arising from polymorphism and from different copies of repeated regions elsewhere in the genome. Most clones passed assembly inspection, but some failed due to the presence of very similar local dispersed, tandem or inverted repeats. Careful inspection could resolve the problem in some cases, but specific strategies had to be devised in other cases. One approach was to isolate distinct copies of the repeat in subclones of intermediate size (10-kb plasmids or fosmids) and sequence these subclones. Box 2 Fig. 2 illustrates an initially mis-assembled BAC clone from chromosome Y that could be assembled correctly with careful editing. The second step was gap closure. Because gaps tended to be enriched for problematic sequences, gap closure was challenging; it often required multiple attempts using a variety of alternative methods. Gaps were classified into two types: ‘spanned’ and ‘unspanned’. Spanned gaps were those for which the two flanking contig ends were linked by an end-sequenced plasmid. Most such gaps could be closed by primer-directed sequencing of the plasmid, serially extending the

Finished Sequence in 2004 (Build 35) http://www.genome.gov/ Finished Sequence in 2004 (Build 35) * The total length of tiling paths including only finished bases of clones in Build 35. Roughly 2.19 Mb of sequence on chromosome Y was derived directly from the equivalent pseudoautosomal region on chromosome X. † Defined as gaps in euchromatic regions, including junctions with heterochromatic/centromeric sequences, for which no clone was available (see text). ‡ Defined here as gaps in heterochromatic regions (see text and Supplementary Note 2 on heterochromatic sequence). Separate gaps were counted for centromeres and pericentric heterochromatin, even when the two were contiguous. Centromere sizes were taken from ref. 62 or in some cases provided directly by the sequencing centres (see Supplementary Note 2). Acrocentric sizes are based on centromere ratios from ref. 63. The sizes of large heterochromatic gaps are typically difficult to estimate accurately owing to their repeat structure and polymorphic nature62,64. Other regions might arguably be called heterochromatin (for example, the pericentric regions of chromosomes 19 and 3 and a ,400-kb gap on the Y chromosome23), but are classified as euchromatin here. § The sum of lengths for finished sequence, estimated heterochromatic gaps, euchromatic gaps and unfinished clone gaps. The total length is only approximate because of uncertainty in gap sizes, particularly for heterochromatic gaps and centromeres. k Those in the tiling path but for which it has not been possible to obtain finished sequence. Unfinished sequence from these clones is deposited in public databases. These gaps are all listed at 50 kb, reflecting the approximate average size of the gap.

Comparison of Chromosome 7 Draft versus Finished Sequence http://www.genome.gov/ Comparison of Chromosome 7 Draft versus Finished Sequence Figure 1 Comparison of previous draft sequence with current near-complete sequence of chromosome 7 (ref. 24). At large scale, there was good collinearity between draft and near-complete sequence, although some inversions were present in the draft due to lack of sufficient anchors in some regions. At finer scale, the draft sequence contained some sequence contigs for which order and orientation were not known. The inset shows a region of 500 kb with sequence derived from three overlapping BACs. BACs at each end were finished at the time of draft assembly, whereas the middle BAC was at an early stage of shotgun coverage in which contigs were not yet ordered and oriented.

Substitutions in BAC Overlaps with BACs from Same or Different Libraries http://www.genome.gov/ Substitutions in BAC Overlaps with BACs from Same or Different Libraries Figure 2 Assessment of potential errors by analysis of BAC overlaps. a, Single-base differences between overlapping ﬁnished BAC clones (with 45 kb overlap). The number of single-base differences in overlaps for clones from the same library and from different libraries is plotted. The results are consistent with half of the clones from the same library representing identical underlying DNA sequence with low error rate, and half representing different haplotypes as expected. b, Insertion/deletion (indel) differences between overlapping clones. The number of indels per Mb for a given size range is compared for clones with no single-base mismatches (presumed to be derived from the same haploid source) and .3 single-base mismatches (presumed to be derived from different haploid sources). Indels in the former class primarily represent errors in ﬁnished sequence; they occur at ,20-fold lower frequency (inset) than indels in the latter class, which primarily representpolymorphicdifferences.

Gaps in BAC Overlaps with BACs from Same or Different Libraries http://www.genome.gov/ Gaps in BAC Overlaps with BACs from Same or Different Libraries Figure 2 Assessment of potential errors by analysis of BAC overlaps. a, Single-base differences between overlapping ﬁnished BAC clones (with $5 kb overlap). The number of single-base differences in overlaps for clones from the same library and from different libraries is plotted. The results are consistent with half of the clones from the same library representing identical underlying DNA sequence with low error rate, and half representing different haplotypes as expected. b, Insertion/deletion (indel) differences between overlapping clones. The number of indels per Mb for a given size range is compared for clones with no single-base mismatches (presumed to be derived from the same haploid source) and .3 single-base mismatches (presumed to be derived from different haploid sources). Indels in the former class primarily represent errors in ﬁnished sequence; they occur at ,20-fold lower frequency (inset) than indels in the latter class, which primarily representpolymorphicdifferences.

Duplications and Deletions in the Human Genome http://www.genome.gov/ Figure 4 Segmental duplications across the genome. a, Segmental duplications and sequence gaps across the genome. Segmental duplications are indicated below the chromosomes in blue (length $10 kb and sequence identity $95%). Large duplications are shown to approximate scale; smaller ones are indicated as ticks. Sequence gaps are indicated above the chromosomes in red. Large gaps (.300 kb) are shown to approximate scale; smaller gaps are indicated as ticks with those that are 50 kb or smaller shown as shorter ticks. Unﬁnished clones are indicated as black ticks. b, Percentage of large segmental duplications by chromosome. This count includes both interchromosomal and intrachromosomal duplications with length $1 kb and sequence identity $90%. The blue bars show the result of direct analysis of near-complete sequence. The gold bars show an independent estimate65 using whole-genome shotgun data to correct for potential mis-assembly of such segmental duplications. The strong agreement suggests that most segmental duplications are properly represented in near-complete genome sequence. The discrepancy for chromosome X is probably a result of errors in the independent estimate, due to limited coverage and diversity of data from this chromosome15.

Percentage of Chromosomes Duplicated http://www.genome.gov/ Percentage of Chromosomes Duplicated Figure 4 Segmental duplications across the genome. a, Segmental duplications and sequence gaps across the genome. Segmental duplications are indicated below the chromosomes in blue (length $10 kb and sequence identity $95%). Large duplications are shown to approximate scale; smaller ones are indicated as ticks. Sequence gaps are indicated above the chromosomes in red. Large gaps (.300 kb) are shown to approximate scale; smaller gaps are indicated as ticks with those that are 50 kb or smaller shown as shorter ticks. Unfinished clones are indicated as black ticks. b, Percentage of large segmental duplications by chromosome. This count includes both interchromosomal and intrachromosomal duplications with length$1 kb and sequence identity$90%. The blue bars show the result of direct analysis of near-complete sequence. The gold bars show an independent estimate65 using whole-genome shotgun data to correct for potential mis-assembly of such segmental duplications. The strong agreement suggests that most segmental duplications are properly represented in near-complete genome sequence. The discrepancy for chromosome X is probably a result of errors in the independent estimate, due to limited coverage and diversity of data from this chromosome15.

Duplications near Centromeres http://www.genome.gov/ Duplications near Centromeres Examples of repeat structure near centromeres and telomeres. a, Repeats in pericentric regions of chromosomes 7 and 8. The most proximal regions are crowded with alpha satellite sequences and other centromeric repeats; composition, density and order may vary considerably between chromosome arms62. Just outside this region, there is usually a high density of inter- and intra-chromosomal duplication. For details, see text and refs 39, 40, 66 and 67.

Duplications near Telomeres http://www.genome.gov/ Duplications near Telomeres b, Sequence organization in human subtelomeric DNA regions. The terminal repeat tract consists of 2–15 kb of simple repeat sequence (TTAGGG)n and is indicated by the black arrow at right. Short (50–250 bp) and often degenerate (TTAGGG) tracts (internal black arrows) are highly enriched (.25-fold) in subtelomeric DNA relative to elsewhere in the genome. A subtelomeric repeat (Srpt) region (blue) consists of a mosaic patchwork of segmentally duplicated DNA tracts that occur in two or more subtelomere regions and range in size from ,10 kb to .300 kb. TAR1, D4Z4 and beta satellite sequences are frequently associated with Srpt regions. Proximal to the Srpt region is chromosome-specific genomic DNA, typically with a high GC content and high gene density. Stretches of segmentally duplicated DNA that occur only once within subtelomeric regions (tan) are interspersed with 1-copy subtelomeric DNA (yellow) in a telomere-specific fashion. Overall, segmentally duplicated DNA comprises approximately 25% of the most telomeric 500 kb of the chromosome, a fivefold enrichment over the genome-wide average.

Deletions and Duplications can Arise from Unequal Crossing Over in Repeated Regions Crossing over between maternal and paternal chromosomes Unequal crossing over between maternal and paternal chromosomes Maternal Paternal Offspring Offspring Maternal Paternal Offspring Offspring

The Diploid Sequence of an Individual Human (HuRef) http://www.jcvi.org/ The Diploid Sequence of an Individual Human (HuRef)

Karyotype of J.Craig Venter http://www.jcvi.org/ Karyotype of J.Craig Venter Giemsa Stain FISH Stain

Comparing NCBI Assembly to HuRef Assembly http://www.jcvi.org/ Comparing NCBI Assembly to HuRef Assembly

SNPs & InDels in HuRef Autosomes http://www.jcvi.org/ SNPs & InDels in HuRef Autosomes Across all autosomal chromosomes, the observed diversity values for SNPs and indels are 6.15 3 104 and 0.84 3 104 respectively. When restricted to coding regions only, hSNP 1/4 3.59 3 104 and hindel 1/4 0.07 3 104, indicating that 42% of SNPs and 91% of indels have been eliminated by selection in coding regions. The strong selection against coding indels is not surprising, because most will introduce a frameshift and produce a nonfunctional protein. Our observed hSNP falls within the range of 5.4 3 104 to 8.3 3 104 that has been previously reported by other groups [44–47]

Illumina Solexa Sequencing Technology http://www.illumina.com/ Illumina Solexa Sequencing Technology

Illumina Solexa Sequencing Technology http://www.illumina.com/ Illumina Solexa Sequencing Technology

Illumina Solexa Sequencing Technology http://www.illumina.com/ Illumina Solexa Sequencing Technology

Illumina Solexa Sequencing Technology http://www.illumina.com/ Illumina Solexa Sequencing Technology

Illumina Solexa Sequencing Technology http://www.illumina.com/ Illumina Solexa Sequencing Technology

Illumina Solexa Sequencing Technology http://www.illumina.com/ Illumina Solexa Sequencing Technology

Illumina Solexa Sequencing Technology http://www.illumina.com/ Illumina Solexa Sequencing Technology

Illumina Solexa Sequencing Technology http://www.illumina.com/ Illumina Solexa Sequencing Technology

Illumina Solexa Sequencing Technology http://www.illumina.com/ Illumina Solexa Sequencing Technology

Illumina Solexa Sequencing Technology http://www.illumina.com/ Illumina Solexa Sequencing Technology

Illumina Solexa Sequencing Technology http://www.illumina.com/ Illumina Solexa Sequencing Technology

Illumina Solexa Sequencing Technology http://www.illumina.com/ Illumina Solexa Sequencing Technology

Life Sciences 454 Process Overview 4) Perform Sequencing by synthesis on the 454 Instrument 1) Prepare Adapter Ligated ssDNA Library 2) Clonal Amplification on 28 µ beads 3) Load beads and enzymes in PicoTiter Plate™ http://www.my454.com/ Life Sciences 454 Process Overview

Emulsion Based Clonal Amplification Micro-reactors Adapter carrying library DNA Anneal DNA template to capture beads Break micro-reactors Isolate DNA containing beads Single test tube generation of millions of clonally amplified sequencing templates No cloning and colony picking “Water-in-oil” emulsion + PCR Reagents + Emulsion Oil Perform emulsion PCR A B http://www.my454.com/ Emulsion Based Clonal Amplification

Depositing DNA Beads into the PicoTiter™Plate Centrifuge Step Load Enzyme Beads Load beads into PicoTiter™Plate 44 μm 70x75mm array of fused optical fibers with etched wells 1.6 million wells monitored optically through fiber http://www.my454.com/ Depositing DNA Beads into the PicoTiter™Plate

Sequencing-By-Synthesis Simultaneous sequencing of the entire genome in hundreds of thousands of picoliter-size wells Pyrophosphate signal generation http://www.my454.com/ Sequencing-By-Synthesis

Flowgrams and BaseCalling Flow Order 4-mer TACG 3-mer 2-mer 1-mer http://www.my454.com/ Flowgrams and BaseCalling

Pacific Biosciences SMRT Sequencing http://www.pacificbiosciences.com/ Pacific Biosciences SMRT Sequencing

Pacific Biosciences SMRT Sequencing http://www.pacificbiosciences.com/ Pacific Biosciences SMRT Sequencing

Phospholinked Fluorophores http://www.pacificbiosciences.com/ Phospholinked Fluorophores

Processive Synthesis http://www.pacificbiosciences.com/

Synthesis of Long Duplex DNA http://www.pacificbiosciences.com/ Synthesis of Long Duplex DNA

Highly Parallel Optics System http://www.pacificbiosciences.com/ Highly Parallel Optics System

Circular Templates Gives Redundant Sequencing and Accuracy http://www.pacificbiosciences.com/ Circular Templates Gives Redundant Sequencing and Accuracy

Circular Templates Gives Redundant Sequencing and Accuracy http://www.pacificbiosciences.com/ Circular Templates Gives Redundant Sequencing and Accuracy

Ion Torrent Sequencing http://www.iontorrent.com/ Ion Torrent Sequencing

Ion Torrent Sequencing http://www.iontorrent.com/ Ion Torrent Sequencing

Ion Torrent Sequencing http://www.iontorrent.com/ Ion Torrent Sequencing

The Human Genome How fast is the cost going down? 2006: $ 50 million 2008: $500,000 2009: $50,000 2010: $20,000 2011: $5,000 2012:??? $1,000 Thanks to Serafim Batzoglou

Archon Genomics X-Prize http://genomics.xprize.org/ Archon Genomics X-Prize

Archon Genomics X-Prize http://genomics.xprize.org/ Archon Genomics X-Prize

PCR Primer Design Primer sequence: Length: 20-30 base pairs long (1/4N) 50% ± 15% G-C Avoid repeating sequences and poly-A sequences Avoid inverted repeating sequences Two primers should have little self complementarity 3’ end of primer should be G/C Melting Temperature Tm = [(number of A+T residues) x 2 °C] + [(number of G+C residues) x 4 °C] Both primers should have comparable melting temperatures Annealing temperature is about 5° C lower than melting temp

GeneFisher http://bibiserv.techfak.uni-bielefeld.de/genefisher/ 1 ccgtaacgga ggatgttttt cagaatgtgg ttgggattga tggatgggag gtagaacgag 61 ttcgagttga aaaggttttg tgtgccatgt gaaaaggtta gcatctatta cgtagacgag 121 agaaattcat tggaaatttg agaaggagat tgagcataat gaaacttgtt ttggaaaaat 181 atgttgttat taatgtggag gtgggcaaga atgagaataa tcagtagcaa tgaggtgtca 241 ataatttgat actgtctaca tggaagacgg cgaccagagc catggaagtc agaatgaaaa 301 atgataaatg tgaaaacatt ctagagaaga aatgaatacg cgaaggcccg tggtgggtga 361 tgacatgatg tgatttctgc ccagtgctct gaatgtcaaa gtgaagaaat tcaatgaagg 421 acgggtaaac ggcgggagta actatgactc tcttaaggta gccaaatgca tagtcatcta 481 attagtgacg ttcatgaatg gatgaacgag attcccactg tccctaccta ctatccagcg 541 aaaccacagc caaggtaacg ggcttggtgg aatccgcggg gaaagaagac cctgttgagc 601 ttgactctag tctggcacgg tgaagagcca tgagaagtgt agaataagtg ggaggcccct 661 gggcccccct gcccagcaag gggacagagt ggggcaaggc cagaggtgaa ataccactac 721 tctgattgtt tattcactga cccgtgaggt gccccaaggg gctcttgctt ctggcgccga 781 gtgcccggcc acatgcacat gccaaattgt aaagaccatc gat http://bibiserv.techfak.uni-bielefeld.de/genefisher/ Forward Primer Data Reverse Primer Data Sequence GATTGATGGATGGGAGGTA CTGTGGTTTCGCTGGA GC Content 47 56 Position 34 533 Degeneracy 0 0 3' GC 50 50 3' Degeneracy 0 0 Tm 52.7996 51.349 Location 34 533 http://bibiserv.techfak.uni-bielefeld.de/genefisher/