 Sequencing technology › Roche/454 GS-FLX (‘454’) › Illumina  Prokaryotic profiling › De novo genome sequencing › Metagenomics › SNP profiling › Species.

Presentation transcript:

 Sequencing technology › Roche/454 GS-FLX (‘454’) › Illumina  Prokaryotic profiling › De novo genome sequencing › Metagenomics › SNP profiling › Species quantification  Viral profiling › De novo genome sequencing

Classic chain-terminator sequencing; dye chain-terminator sequencing; next-generation sequencing

 Next-gen sequencing principle › Massively parallel › Add nucleotides (A, C, G, T) › Catch a signal

 Roche/454 GS-FLX+ (‘454’) › Pyrosequencing: problems with homopolymers (e.g. AAAAAA) › Long-read sequencing: bp › Variable sequencing length › 1 million reads/run, i.e. about 1 Gb/run › Sequencing speed: ~ 1 day/run › Next-next generation: Ion Torrent PGM/Proton
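The homopolymer problem mentioned above can be made concrete with a small check that flags the stretches of a read where a pyrosequencing length call is most likely to be off. This is only an illustrative sketch: the function name, the example read, and the minimum run length of 4 are assumptions, not part of the original slides.

```python
from itertools import groupby


def homopolymer_runs(read, min_len=4):
    """Return (start, base, length) for every run of identical bases
    that is at least min_len long (min_len=4 is an arbitrary choice)."""
    runs = []
    pos = 0
    for base, group in groupby(read):
        length = sum(1 for _ in group)
        if length >= min_len:
            runs.append((pos, base, length))
        pos += length
    return runs


# Hypothetical read containing two homopolymer stretches
print(homopolymer_runs("ACGTAAAAAAGGCTTTTTC"))
# [(4, 'A', 6), (13, 'T', 5)]
```

Positions reported this way could be used to treat base calls inside long runs with extra caution.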

 Illumina › Sequencing by synthesis › Short-read sequencing: 36, 72, …, 150 bp › Fixed sequencing length › 1 billion reads/run, i.e. about 100 Gb/run (= 33 x human genome!) › Sequencing speed: 3 to 10 days, depending on read length  SOLiD › Short-read sequencing (similar to Illumina)

 454  Illumina

 Price per run: $10000/run  Price per machine: $ › Supporting IT hardware › Peripheral devices such as fragmentation instrument, PCR equipment … › Negotiating power…  Use service centers! › Nxtgnt (BE), GATC(EU), Baseclear(NL), BGI … › No overhead cost, no maintenance etc. › Cheaper

 Next-generation sequencing has become 2nd-generation sequencing  Next-next-generation sequencing is almost there: 3rd-generation sequencing › Helicos: True Single Molecule Sequencing › IonTorrent/Life: Cheap and fast › Nanopore: Unlimited read size › …

 The evolution of sequencing technology goes hand in hand with the evolution of › IT infrastructure/hardware › Analysis software  Hardware › 1 Illumina run ~ 100Gb text file ~ 5-million-page book › Processing power and storage are an issue!  Software › Mapping to a human genome: ‘couple of hours’

 Prokaryotic genomics 101 › Prokaryotes = bacteria + archaea › Prokaryotic genomes  Large circular genome (0.5 – 10 Mb) ‘chromosome’  Small plasmids ( kb) (virulence factors, antibiotic resistance …)  (Almost) no introns  Easy ORF annotation
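Why the near-absence of introns makes ORF annotation easy can be shown with a toy scanner: an open reading frame is simply an uninterrupted stretch from a start codon to the next in-frame stop codon. The sketch below is a simplification under stated assumptions: it only scans the forward strand, only accepts ATG as a start codon, and the function name and minimum length are made up for illustration. Real annotation tools do considerably more.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return (start, end) half-open intervals of forward-strand ORFs.

    An ORF here runs from an ATG to the first in-frame stop codon and
    must contain at least min_codons codons before the stop.
    """
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return orfs


# Hypothetical sequence with one ORF: ATG AAA CCC GGG TAG
seq = "CCATGAAACCCGGGTAGTT"
print(find_orfs(seq))  # [(2, 17)]
print(seq[2:17])       # ATGAAACCCGGGTAG
```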

 1953: Watson and Crick describe the DNA double helix  1977: First complete genome: bacteriophage φX174  1995: First genome of a free-living organism: H. influenzae  2001: First draft of the human genome  2006: >200 complete bacterial genomes  2012: Countless bacterial genomes have been sequenced using next-gen sequencing

 Complete bacterial genomes used to be › Expensive › Difficult to obtain › ‘Nature’ or ‘Science’ work › Remained complex until the invention of next-generation sequencing

 Using next-generation sequencing, de novo sequencing has become › Relatively easy › Relatively cheap › Routine research  Already >10 complete bacterial genomes published in 2012 › More than just an assembly!

 Practical
1. Get some DNA from an isolated species of interest
2. Sequence: long or short reads (1-10 days)
3. Obtain your sequences
4. Assemble (1h)
    Pure de novo assembly
    Guided assembly
5. Annotate the genome (days-weeks)

 Assembly: Multiple ‘short’ reads → 1 long sequence  Existing software › Velvet › SSAKE › Newbler › … Source: Nature 2009, MacLean et al.
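The idea of turning many short reads into one long sequence can be illustrated with a toy greedy assembler that repeatedly merges the two reads with the longest suffix/prefix overlap. This is a minimal sketch, not the algorithm of any of the packages named above: it assumes error-free reads from a single strand, and the function names, the example reads, and the minimum overlap of 3 bases are all made up for illustration.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b
    (0 if shorter than min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Merge reads greedily by best overlap until no overlaps remain."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing left to merge: remaining reads are contigs
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads)
                 if idx not in (best_i, best_j)]
        reads.append(merged)
    return reads


# Three hypothetical overlapping reads
print(greedy_assemble(["ATGGCGT", "GCGTACC", "TACCTGA"]))
# ['ATGGCGTACCTGA']
```

If the reads cannot all be joined, the function returns several contigs instead of one sequence, which mirrors what happens with low coverage in practice.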

 Relatively cheap › Sequencing cost: depending on coverage  Illumina, 30x, 5Mb genome: $10-$100  454, 30x, 5Mb genome: $1000-$5000 › Equipment  IT infrastructure, sequencing equipment, people …  Relatively easy › Need for IT support › No out-of-the-box standard solution for everything › Several different software packages for assembly
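The cost ranges above follow from simple arithmetic: the bases needed are genome size times coverage, and the cost is that fraction of a run multiplied by the price per run. The sketch below assumes a 5 Mb bacterial genome, the run yields quoted earlier (about 100 Gb for Illumina, about 1 Gb for 454), and that the $10000 per-run price applies to both platforms; that last point is an assumption, as are the function and variable names.

```python
def sequencing_cost(genome_size_bp, coverage, run_yield_bp, price_per_run):
    """Cost of the share of one run needed to reach the given coverage."""
    bases_needed = genome_size_bp * coverage
    return bases_needed * price_per_run / run_yield_bp


GENOME_BP = 5_000_000   # assumed 5 Mb bacterial genome
COVERAGE = 30           # 30x, as on the slide
PRICE_PER_RUN = 10_000  # USD, assumed identical for both platforms

illumina = sequencing_cost(GENOME_BP, COVERAGE, 100_000_000_000, PRICE_PER_RUN)
roche454 = sequencing_cost(GENOME_BP, COVERAGE, 1_000_000_000, PRICE_PER_RUN)

print(illumina)  # 15.0
print(roche454)  # 1500.0
```

Both results fall inside the ranges quoted on the slide, which is consistent with a megabase-scale genome.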

 De novo genome assembly › Study of 1 single species › Need for species isolation  Metagenomics analysis › Study of a community of species › No need for isolation (culturing bias!) › Study the collective gene pool and function of the community/ecology › No need for individual functions

 Practical
1. Get bacterial DNA or RNA from a sample
    Soil
    Gut/Fecal
    Ocean water (e.g. Craig Venter)
    …
2. Sequence: long or short reads (1-10 days)
3. Obtain your sequences
4. Map on a database of known genes (1 day)
5. Annotate/analyse the community (weeks)

 2010: Giant Panda genome (2nd carnivore) › No umami taste receptor -> no meat affinity › The panda is more a dog than a bear › The panda is a carnivore eating bamboo!

 Still 2010 !: Panda ‘microbiome’  Gut microbiome of the panda reveals the presence of bamboo/cellulose degrading pathways

 A clinical example: the gut microbiome can predict diabetes and malnourishment › Brown et al., PLoS ONE (2011) › Valladares et al., PLoS ONE (2010) › Gupta et al., Gut Pathogens (2011)

 Classical SNP analysis - practical
1. Design PCR primers
2. Generate amplicons
3. Re-sequence using long read sequencing
    Conserve ‘SNP blocks’
4. Detect SNPs
5. Correlate SNPs to drug resistance, severity of symptoms …
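Step 4, detecting SNPs, boils down to comparing the resequenced amplicon reads with a reference, position by position. The sketch below assumes the reads are already aligned to the reference (same coordinates, `-` for gaps) and calls a SNP where a non-reference base makes up at least a chosen fraction of the reads. The function name, the 0.75 threshold, and the example sequences are illustrative assumptions only.

```python
from collections import Counter


def call_snps(reference, aligned_reads, min_fraction=0.75):
    """Return (position, ref_base, alt_base) where a non-reference base
    reaches min_fraction of the reads covering that position."""
    snps = []
    for pos, ref_base in enumerate(reference):
        counts = Counter(
            read[pos]
            for read in aligned_reads
            if pos < len(read) and read[pos] != "-"
        )
        if not counts:
            continue  # no read covers this position
        base, n = counts.most_common(1)[0]
        if base != ref_base and n / sum(counts.values()) >= min_fraction:
            snps.append((pos, ref_base, base))
    return snps


# Hypothetical amplicon: 4 of 5 reads carry an A instead of T at position 3
reference = "ACGTACGT"
reads = ["ACGAACGT"] * 4 + ["ACGTACGT"]
print(call_snps(reference, reads))  # [(3, 'T', 'A')]
```

The resulting list of positions is what would then be correlated with drug resistance or symptom severity in step 5.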

 Amplicon resequencing is the same for human, prokaryotic, viral analyses  Many standardized out-of-the-box solutions available  Very simple analysis  Watch out for the overkill… › Don’t use a bazooka to kill a fly! › Throughput can be too high

 Profile the coding region of hepatitis C Lauck et al. 2012

 Use next-generation sequencing to predict the optimal HIV therapy Thielen et al. 2012

 Imagine the following research questions › Which (known) species/groups are present in a certain sample? › Does this composition change with a certain treatment, change of conditions, patient group etc.?  No need for de novo genome sequencing  No metagenomics: species instead of functions

 Prokaryotes carry the 16S rDNA gene, which codes for ribosomal RNA  The 16S rDNA region is 1.5 kb long  16S rDNA is specific for each species/strain  Theoretical: 4^1,500 possible sequences  In practice: 16S rDNA sequence known for millions of species
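The theoretical number of distinct 1,500-base sequences over four nucleotides is 4 raised to the power 1,500. The figure from the original slide did not survive in this transcript, so the short calculation below is only a way to get a feel for the order of magnitude; the variable names are illustrative.

```python
import math

SEQUENCE_LENGTH = 1500  # length of the 16S rDNA region in bases
ALPHABET_SIZE = 4       # A, C, G, T

n_sequences = ALPHABET_SIZE ** SEQUENCE_LENGTH  # exact integer
digits = len(str(n_sequences))
exponent = SEQUENCE_LENGTH * math.log10(ALPHABET_SIZE)

print(digits)              # 904
print(round(exponent, 2))  # 903.09
```

In other words, the theoretical space is on the order of 10^903 sequences, vastly more than the number of sequences in any reference database.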

 16S rDNA can be isolated in different species using universal PCR primers › Isolate/amplify different regions using the same primers  Compare the isolated sequences against a database of known sequences

 Practical procedure
1. Sample an environment and isolate DNA
2. Do a universal PCR amplification
3. Sequence using long read sequencing: the longer the better!
4. Obtain sequences
5. Map sequences against a reference database
6. Annotate the data

 Example: The Antarctica project › Which parameters determine the composition of bacterial communities in Antarctic lakes? › 20 different samples/lakes › Sequence 16S rDNA genes › 1 x 454 run (1 million 500bp sequences) › Map all sequences back to the RDP database

 Analyse the data using computing power › Compare different locations  Is species A present in location 1, location 2, …? › Assess the distribution in a single location  How dominant is the most dominant species in location 1?  How many species are in location 1?  …  Visualize!
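Once every read has been assigned to a taxon, the questions above reduce to counting. The sketch below assumes the mapping step has already produced one taxon label per read for each location; the sample names, taxon labels, and function names are invented for illustration and do not come from the Antarctica project.

```python
from collections import Counter


def richness(taxa):
    """How many distinct taxa were observed in one location."""
    return len(set(taxa))


def dominance(taxa):
    """Most abundant taxon and its share of the reads (taxa must be non-empty)."""
    taxon, n = Counter(taxa).most_common(1)[0]
    return taxon, n / len(taxa)


def present_in(assignments, taxon):
    """Locations in which the given taxon was observed at least once."""
    return sorted(loc for loc, taxa in assignments.items() if taxon in taxa)


# Hypothetical per-read taxon assignments for two locations
samples = {
    "lake_1": ["taxon_A"] * 6 + ["taxon_B"] * 3 + ["taxon_C"],
    "lake_2": ["taxon_B"] * 5 + ["taxon_C"] * 5,
}

print(richness(samples["lake_1"]))     # 3
print(dominance(samples["lake_1"]))    # ('taxon_A', 0.6)
print(present_in(samples, "taxon_A"))  # ['lake_1']
print(present_in(samples, "taxon_B"))  # ['lake_1', 'lake_2']
```

The same counts can be fed into any plotting tool for the visualization step.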

 Analyse different samples on different taxonomic levels › Include the taxonomic tree of life of bacteria › Use a ‘taxonomy browser’

 Analyse a single location

 Compare different locations

Analysis                 Lab work difficulty       Analysis difficulty
De novo genome           ++ (isolate)              +
Metagenomics             +                         +++ (pathways etc.)
SNP                      +++ (design primers)      ++ (correlate)
Species quantification   ++ (universal primers)    ++

 Viral profiling › Viral profiling = prokaryotic profiling, but…  Cheaper  Faster  Easier › De novo genome sequencing = OK › Don’t spend $ on a 100kb genome! › Multiplexing/pooling capacity is limited!

 Watch out for the overkill › An Illumina run can be split into 8 lanes › >20 samples per lane can be combined › Still >100Mb per sample…
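The overkill is easy to quantify: divide the run yield over lanes and pooled samples, then divide by the genome size to get the coverage. The sketch below assumes the roughly 100 Gb Illumina run yield quoted earlier, exactly 20 samples per lane, and a 100 kb viral genome; the function names are illustrative.

```python
def yield_per_sample(run_yield_bp, lanes, samples_per_lane):
    """Bases available per sample when a run is split evenly."""
    return run_yield_bp / (lanes * samples_per_lane)


def coverage(yield_bp, genome_size_bp):
    """Average number of times each genome position is sequenced."""
    return yield_bp / genome_size_bp


per_sample = yield_per_sample(100_000_000_000, lanes=8, samples_per_lane=20)
print(per_sample)                     # 625000000.0
print(coverage(per_sample, 100_000))  # 6250.0
```

Under these assumptions each pooled sample still receives about 625 Mb, which is several thousand-fold coverage of a 100 kb genome.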

Thanks for your attention !