Sequencing technology and assembly



Sanger sequencing
Sanger sequencing with radioactivity
High-throughput Sanger sequencing with fluorescence

Roche/454 sequencing
Yield: 500,000,000 bp
Cost: $5,000
Time: ~1 min per bp
Read length: 450 bp to >1 kb

Pyrosequencing

Illumina sequencing
Yield: 8,000,000,000 – 80,000,000,000 bp
Time: ~1 hour per bp
Read length: ~150 bp
Cost:
  Sample Extraction, $14.00/sample
  Automated Sample Library, $90.00/sample
  MiSeq (2x250), 1 lane, 8-10 Gb/lane, $1,700.00/sample
  MiSeq (2x300), 1 lane, 10-12 Gb/lane, $2,100.00/sample
  HiSeq2500 (2x150), 1 lane, ~40 Gb/lane, $2,500.00/lane
  HiSeq2500 (2x250), 1 lane, ~65 Gb/lane, $3,500.00/lane

Illumina sequencing

Ion Torrent
Yield: 50,000,000 bp
Time: 2 hours (<1 min per bp)
Read length: 500 bp
Cost: $500

Ion Torrent
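The yield and cost figures on the slides above can be put on a common scale by computing a cost per megabase. The following is only a rough sketch: the helper name `cost_per_mb` is made up for illustration, the input numbers are copied from the slides, and the Illumina figure uses the HiSeq2500 (2x150) lane price only (no extraction or library prep).

```python
def cost_per_mb(cost_usd, yield_bp):
    """Return the cost in USD per megabase (1 Mb = 1,000,000 bp)."""
    return cost_usd / (yield_bp / 1_000_000)

# Figures taken from the slides; the platform labels are just dictionary keys.
runs = {
    "Roche/454": (5_000, 500_000_000),
    "Ion Torrent": (500, 50_000_000),
    "Illumina HiSeq2500 (2x150), lane only": (2_500, 40_000_000_000),
}

for platform, (cost, bases) in runs.items():
    print(f"{platform}: ${cost_per_mb(cost, bases):.4f} per Mb")
```

On these numbers Roche/454 and Ion Torrent come out at the same cost per megabase, while a HiSeq lane is far cheaper per base but needs a much larger run.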

PacBio
Long reads (5-10 kb)
High error rate, but read at 150x coverage
Library prep: $600
Sequencing: $300

PacBio

MinION
Quick sample prep
Long reads (~50 kb)
High error rate
$150 per run

MinION

Errors
Different technologies have different error rates.

Base calling
Need to be sure which base you have identified
Depends on the technology
Each machine includes its own base-calling software
Phred is a historical package developed at U. Washington
Phred scores encode the probability that the base call is wrong: Q = -10 log10(P_error), so a higher score means a more reliable base

Quality values
Phred 10: 1 in 10^1 chance that the base is wrong
Phred 99: 1 in 10^9.9 chance that the base is wrong, so the base is effectively correct
FASTQ stores each score as the ASCII character with code (score + 33)
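The relationship between Phred scores, error probabilities and FASTQ quality characters can be sketched in a few lines of Python. This is a minimal illustration, not the code of any particular base caller: the function names are invented here, and the offset of 33 is the one stated on the slide (some older pipelines used a different offset).

```python
import math

def phred_to_error_prob(q):
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)

def error_prob_to_phred(p):
    """Phred score for an error probability p (0 < p <= 1)."""
    return -10 * math.log10(p)

def decode_fastq_quality(qual, offset=33):
    """Turn a FASTQ quality string into a list of Phred scores."""
    return [ord(ch) - offset for ch in qual]

def encode_fastq_quality(scores, offset=33):
    """Turn a list of integer Phred scores into a FASTQ quality string."""
    return "".join(chr(q + offset) for q in scores)

# Example quality string made up for illustration.
scores = decode_fastq_quality("!+5I")
for q in scores:
    print(q, phred_to_error_prob(q))
```

A score of 10 corresponds to a 1 in 10 chance of error, 20 to 1 in 100, and so on; each extra 10 points divides the error probability by 10.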

Homopolymeric errors
Homopolymeric runs: the signal does not scale linearly with run length
Not clear if a run is 5 or 6 bases
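To see where a read is exposed to this kind of error, one can list its homopolymeric runs. The sketch below is illustrative only: the function name `homopolymer_runs`, the `min_len` threshold of 3 and the example sequence are assumptions, not part of the slides.

```python
from itertools import groupby

def homopolymer_runs(seq, min_len=3):
    """Return (start, base, length) for every run of identical bases
    that is at least min_len long. Positions are 0-based."""
    runs = []
    pos = 0
    for base, group in groupby(seq):
        length = sum(1 for _ in group)
        if length >= min_len:
            runs.append((pos, base, length))
        pos += length
    return runs

# Made-up read: one run of five T's and one run of three G's.
print(homopolymer_runs("ACGTTTTTAGGGC"))
```

Runs reported here are the positions where a pyrosequencing or Ion Torrent read is most likely to carry an insertion or deletion, since the instrument has to infer the run length from signal intensity.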

Errors
Different technologies have different error rates:
Pyrosequencing/Ion Torrent – homopolymeric tracts
Illumina – substitution errors
PacBio – the machine cannot keep up with the biology
MinION – noise coming through the membrane