Sequencing technologies

Slides:



Advertisements
Similar presentations
High-Throughput Sequencing Technologies
Advertisements

SOLiD Sequencing & Data
Next-generation sequencing
The past, present, and future of DNA sequencing Dan Russell.
Canadian Bioinformatics Workshops
RNA-Seq An alternative to microarray. Steps Grow cells or isolate tissue (brain, liver, muscle) Isolate total RNA Isolate mRNA from total RNA (poly.
Greg Phillips Veterinary Microbiology
RNA-Seq An alternative to microarray. Steps Grow cells or isolate tissue (brain, liver, muscle) Isolate total RNA Isolate mRNA from total RNA (poly.
Evaluation of PacBio sequencing to improve the sunflower genome assembly Stéphane Muños & Jérôme Gouzy Presented by Nicolas Langlade Sunflower Genome Consortium.
High Throughput Sequencing
Delon Toh. Pitfalls of 2 nd Gen Amplification of cDNA – Artifacts – Biased coverage Short reads – Medium ~100bp for Illumina – 700bp for 454.
CS 6293 Advanced Topics: Current Bioinformatics
Next Generation DNA Sequencing Platforms: Evolving Tools for
GENOME SEQUENCING. I. Genome sequencing The Sanger Method (1977) Denaturation +priming Polymerization.
NGS Data Generation Dr Laura Emery. Overview The NGS data explosion Sequencing technologies An example of a sequencing workflow Bioinformatics challenges.
Update on Next-Generation Sequencing
Next Now-Generation Genomics: methods and applications for modern disease research Aaron J. Mackey, Ph.D. Center for Public Health.
High-Throughput Sequencing Technologies
DNA Sequencing Today, laboratories routinely sequence the order of nucleotides in DNA. DNA sequencing is done to: Confirm the identity of genes isolated.
Analyzing your clone 1) FISH 2) “Restriction mapping” 3) Southern analysis : DNA 4) Northern analysis: RNA tells size tells which tissues or conditions.
From Haystacks to Needles AP Biology Fall Isolating Genes  Gene library: a collection of bacteria that house different cloned DNA fragments, one.
High Throughput Sequencing Methods and Concepts
Introduction to next generation sequencing Rolf Sommer Kaas.
MES Genome Informatics I - Lecture IV. NGS basics Sangwoo Kim, Ph.D. Assistant Professor, Severance Biomedical Research Institute, Yonsei University.
Genomics – Next-Gen sequencing and Microarrays
High Throughput Sequencing Methods and Concepts Cedric Notredame adapted from S.M Brown.
Next Generation DNA Sequencing
Chromatin Immunoprecipitation DNA Sequencing (ChIP-seq)
De Novo Genome Assembly - Introduction Henrik Lantz - BILS/SciLife/Uppsala University.
Stratton Nature 45: 719, 2009 Evolution of DNA sequencing technologies to present day DNA SEQUENCING & ASSEMBLY.
Genome Characterization DNA sequence-ULTIMATE Map DNA sequencing-methods Assembly/sequencing BIO520 BioinformaticsJim Lund Assigned reading: Service 2006.
SEQUENCING – THE BENCHTOPS. Roche 454 Junior Same technology as 454 FLX Read length: 400 bases Paired-end 100,000 reads 12 hours (instrument time) Output.
UK NGS Sequencing Update July 2009 Dr Gerard Bishop - Division of Biology Dr Sarah Butcher – Centre for Bioinformatics.
Ultra-High Throughput DNA Sequencing on the 454/Roche GS-FLX
De Novo Genome Assembly - Introduction
Introduction to PCR Polymerase Chain Reaction
Topic Cloning and analyzing oxalate degrading enzymes to see if they dissolve kidney stones with Dr. VanWert.
Introduction to Illumina Sequencing
Cse587A/Bio 5747: L2 1/19/06 1 DNA sequencing: Basic idea Background: test tube DNA synthesis DNA polymerase (a natural enzyme) extends 2-stranded DNA.
DNA Sequencing First generation techniques
Next-generation sequencing technology
Introduction to PCR Polymerase Chain Reaction
Research Techniques Made Simple: Next-Generation Sequencing:
DNA Sequencing Second generation techniques
Short Read Sequencing Analysis Workshop
Next generation sequencing
The NGS Era is Now Eric T. Weimer, PhD, D(ABMLI)
Sequence assembly Jose Blanca COMAV institute bioinf.comav.upv.es.
Introduction to next generation sequencing
DNA Sequencing -sayed Mohammad Amin Nourion -A’Kia Buford
Next-generation sequencing technology
NGS technologies.
Sequencing technology and assembly
SVM 2FG.
Sequencing Technologies
Teagasc/APC Sequencing Facility
SOLEXA aka: Sequencing by Synthesis
B3- Olympic High School Bioinformatics
CISC 667 Intro to Bioinformatics (Spring 2007) Molecular Biology Tools
DNA Sequencing The DNA from the genome is chopped into bits- whole chromosomes are too large to deal with, so the DNA is broken into manageably-sized overlapping.
2nd (Next) Generation Sequencing
ULTRASEQUENCING. Next Generation Sequencing: methods and applications.
High-Throughput Sequencing Technologies
High-Throughput Sequencing Technologies
Next-generation DNA sequencing
Single-Molecule Sequencing: Towards Clinical Applications
BF nd (Next) Generation Sequencing
Canadian Bioinformatics Workshops
(Top) Construction of synthetic long read clouds with 10× Genomics technology. (Top) Construction of synthetic long read clouds with 10× Genomics technology.
Standard (Sanger) sequencing
Presentation transcript:

Sequencing technologies Jose Blanca COMAV institute bioinf.comav.upv.es

Genome.org/sequencingcosts Public US project 1000$

Sanger sequencing

Sanger sequencing Traditional DNA sequencing method Ideal for small sequencing projects Read length around 600-800 bp Around 5-10$ per reaction 384 reactions in parallel at most Applied Biosystems is the main technological provider

Sanger sequencing

Sanger sequencing

ABI files Sequences are usually delivered as ABI files. Binary files. Contain real signal, chromatogram and called sequence and quality They can be opened with: chromas (win) (www.technelysium.com.au/chromas.html) Sequence scanner (win) (Applied Biosystems) FinchTv (win, mac, linux) (www.geospiza.com/Products/finchtv.shtml) trev (win, mac, linux) (part of Staden). Usually they have a .ab1 extension.

Sequence and quality Phred score = - 10 log (prob error)

Any evidence has error bars Any conclusion has error bars

Sequence and quality Due to technical limitations different technologies have different errors patterns.

Sanger sequencing Quality is worst at the beginning and at the end.

Other sources of error Pre-sequencing: PCR mutation-like errors PCR primers (e.g. hexamers in random priming) Cloning artifacts, chimeric molecules Post-sequencing: Assembly artifacts Alignment errors due to: Reference Alignment algorithms SNV calling software

2nd generation sequencing

Sanger vs NGS sequencing Num. sequences per reaction 1 clone Millions of molecules Max. parallelization 384 Several millions Sequence quality High Low Sequence length 600-800 bp 35-20000 (depends on the platform) Throughtput

Sanger vs NGS

Library preparation Fragmentation Sonication Nebulization Shearing Size selection End repair Sequencing adaptor ligation Purification

Illumina Previously known as Solexa Reversible terminators based sequencing technique Short reads (50 or 250bp depending on the version) Lowest cost per base Ideal for resequencing projects Highest throughput Runs divided in 8 lanes Up to 4000 million reads Can sequence both ends of the molecules (paired ends) HiSeq2500, NextSeq 500 and MiSeq

Illumina

Illumina Quality diminishes with sequence length. No homopolymer problem, mainly substitution errors.

3rd generation sequencing

PacBio 3rd generation platform (single molecule) Polymerase based chemistry (SMRT) Longest NGS reads (up to 20,000bp) Very high error rate Ideal for de novo sequencing projects 45000 reads

PacBio 3rd generation, single molecule detection. No amplification step required. Nucleotides labeled on the phosphate removed during the polymerization. Sequencing based on the time required by the polymerase to incorporate a nucleotide (Polymerase requires milliseconds versus microseconds for the stochastic diffusion)

PacBio length distribution Longest lengths Limited by the polymerase damages due to the laser Sequencing strategies Standard Circular consensus Taken from flxlexblog.wordpress.com Distributions for standard sequencing, for circular the mean is around 250 pb.

PacBio quality distribution Distributions for standard sequencing. For circular sequencing the quality is at least 99.9% Taken from flxlexblog.wordpress.com

Nanopore Senses differences in ion flow USB powered

Bioinformatic challenges Huge data files handling. Beefy computers required. Software still being developed or missing. Ad-hoc software required during the analysis. Existing software tailored to experienced bioinformaticians. Dollar for dollar rule proposed EDSAC by Computer Laboratory Cambridge

Bioinformatic challenges Amount of data managed on a typical transcriptome assembly

Genome.org/sequencingcosts

This work is licensed under the Creative Commons Attribution 3 This work is licensed under the Creative Commons Attribution 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by/3.0/ or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA. Jose Blanca COMAV institute bioinf.comav.upv.es