 Sequencing technology › Roche/454 GS-FLX (‘454’) › Illumina  Prokaryotic profiling › De novo genome sequencing › Metagenomics › SNP profiling › Species.


1

2  Sequencing technology › Roche/454 GS-FLX (‘454’) › Illumina  Prokaryotic profiling › De novo genome sequencing › Metagenomics › SNP profiling › Species quantification  Viral profiling › De novo genome sequencing

3 Classic chain-terminator sequencing Dye chain-terminator sequencing Next-generation sequencing

4  Next-gen sequencing principle › Massive parallel › Add ACTGs › Catch a signal

5  Roche/454 GS-FLX+ (‘454’) › Pyrosequencing → problems with homopolymers (e.g. AAAAAA) › Long-read sequencing: 500-1000 bp › Variable sequencing length › 1 million reads/run → 1 Gb/run › Sequencing speed: ~1 day/run › Next-next generation: IonTorrent PGM/Proton
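Homopolymer stretches are where pyrosequencing tends to miscount bases, so it can help to flag them in a sequence of interest before choosing a platform. The following is a minimal sketch; the function name `homopolymer_runs` and the `min_len` threshold of 4 are illustrative assumptions, not part of any 454 software.

```python
from itertools import groupby


def homopolymer_runs(seq, min_len=4):
    """Return (start, base, length) for each run of one repeated base.

    min_len is an assumed, illustrative threshold for a 'risky' run.
    """
    runs = []
    pos = 0
    for base, group in groupby(seq):
        length = len(list(group))
        if length >= min_len:
            runs.append((pos, base, length))
        pos += length
    return runs


# Toy sequence built from pieces so the run lengths are explicit.
example = "ACGT" + "A" * 6 + "GGC" + "T" * 4 + "C"
print(homopolymer_runs(example))
```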

6  Illumina › Sequence by synthesis › Short-read sequencing: 36, 72, …, 150 bp › Fixed sequencing length › 1 billion reads/run → 100 Gb/run (= 33 x human genome!)  Sequencing speed: 3-10 days, depending on read length  SOLiD › Short-read sequencing (similar to Illumina)
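The throughput figures on the two platform slides follow from simple arithmetic: yield = reads x read length, and fold coverage = yield / genome size. The sketch below assumes a 100 bp Illumina read length, a 1000 bp 454 read length, and a human genome of roughly 3 Gb; these specific values are assumptions chosen to reproduce the numbers on the slides.

```python
def run_yield_bases(reads, read_length):
    """Total bases produced by one run."""
    return reads * read_length


def fold_coverage(total_bases, genome_size):
    """How many times the run covers a genome of the given size."""
    return total_bases / genome_size


HUMAN_GENOME = 3 * 10**9  # assumed size, ~3 Gb

illumina = run_yield_bases(10**9, 100)    # 1 billion reads, 100 bp assumed
roche_454 = run_yield_bases(10**6, 1000)  # 1 million reads, 1000 bp assumed

print(illumina, roche_454)
print(fold_coverage(illumina, HUMAN_GENOME))
```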

7  454  Illumina

8  Price per run: $10,000/run  Price per machine: $200,000-500,000 › Supporting IT hardware › Peripheral devices such as fragmentation instrument, PCR equipment … › Negotiating power…  Use service centers! › Nxtgnt (BE), GATC (EU), Baseclear (NL), BGI … › No overhead cost, no maintenance etc. › Cheaper

9  Next-generation sequencing has become 2nd-generation sequencing  Next-next-generation sequencing is almost there: 3rd-generation sequencing › Helicos: True Single Molecule Sequencing › IonTorrent/Life: Cheap and fast › Nanopore: Unlimited read size › …

10  Evolution of sequencing technology goes hand in hand with evolution of › IT infrastructure/hardware › Analysis software  Hardware › 1 Illumina run ~ 100 Gb text file ~ 5-million-page book › Processing power/storage are an issue!  Software › Mapping to a human genome: ‘couple of hours’

11  Sequencing technology › Roche/454 GS-FLX (‘454’) › Illumina  Prokaryotic profiling › De novo genome sequencing › Metagenomics › SNP profiling › Species quantification  Viral profiling › De novo genome sequencing

12  Prokaryotic genomics 101 › Prokaryotes = bacteria + archaea › Prokaryotic genomes  Large circular genome (0.5-10 Mb) ‘chromosome’  Small plasmids (1-1000 kb) (virulence factors, antibiotic resistance …)  (Almost) no introns  Easy ORF annotation
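Because prokaryotic genes have (almost) no introns, an open reading frame can be approximated as an uninterrupted stretch from a start codon to the next in-frame stop codon. The sketch below illustrates that idea only: it scans the three forward frames, accepts only ATG as a start, and uses an assumed `min_codons` cutoff. Real annotation also covers the reverse strand and alternative start codons and relies on dedicated tools.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return (start, end) of ATG...stop stretches in the 3 forward frames.

    end is exclusive and includes the stop codon. min_codons is an
    assumed, illustrative length cutoff (stop codon included).
    """
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)


# Toy sequence: 2 flanking bases, ATG, 3 codons, TAA, 2 flanking bases.
toy = "CC" + "ATG" + "AAATTTGGG" + "TAA" + "CC"
for start, end in find_orfs(toy):
    print(start, end, toy[start:end])
```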

13  Sequencing technology › Roche/454 GS-FLX (‘454’) › Illumina  Prokaryotic profiling › De novo genome sequencing › Metagenomics › SNP profiling › Species quantification  Viral profiling › De novo genome sequencing

14  1953: Watson/Crick discover the DNA double helix  1977: First complete genome: bacteriophage φX174  1995: First genome of a free-living organism: H. influenzae  2001: First draft of the human genome  2006: >200 complete bacterial genomes  2012: Countless bacterial genomes have been sequenced using next-gen sequencing

15  Complete bacterial genomes used to be › Expensive › Difficult to obtain › ‘Nature’ or ‘Science’ work › Remained complex until the invention of next-generation sequencing

16  Using next-generation sequencing, de novo sequencing has become › Relatively easy › Relatively cheap › Routine research  Already >10 complete bacterial genomes published in 2012 › More than just an assembly!

17  Practical 1. Get some DNA from an isolated species of interest 2. Sequence: long or short reads (1-10 days) 3. Obtain your sequences 4. Assemble (1h)  Pure de novo assembly  Guided assembly 5. Annotate the genome (days-weeks)

18  Assembly: Multiple ‘short’ reads → 1 long sequence  Existing software › Velvet › SSAKE › Newbler › … Source: Nature 2009, MacLean et al.
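To make the "multiple short reads to 1 long sequence" idea concrete, here is a toy greedy overlap assembler: it repeatedly merges the two sequences with the longest suffix/prefix overlap. This is a teaching sketch under simplifying assumptions (error-free reads, one strand, no repeats), not how Velvet, SSAKE or Newbler work internally; the function names and the `min_len` cutoff are illustrative.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b."""
    for size in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:size]):
            return size
    return 0


def greedy_assemble(reads, min_len=3):
    """Merge the best-overlapping pair until no overlap >= min_len is left."""
    contigs = list(reads)
    while len(contigs) > 1:
        best = (0, None, None)
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best[0]:
                        best = (size, i, j)
        size, i, j = best
        if size == 0:
            break  # nothing left to merge
        merged = contigs[i] + contigs[j][size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (i, j)]
        contigs.append(merged)
    return contigs


# Toy genome and three overlapping reads cut from it.
genome = "ATGGCGTACGTTAGC"
reads = [genome[0:8], genome[3:11], genome[7:15]]
print(greedy_assemble(reads))
```

Greedy merging breaks down on repeats and sequencing errors, which is one reason several different assembly packages exist.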

19  Relatively cheap › Sequencing cost: depending on coverage  Illumina, 30x, 5 Mb genome: $10-$100  454, 30x, 5 Mb genome: $1000-$5000 › Equipment  IT infrastructure, sequencing equipment, people …  Relatively easy › Need for IT support › No out-of-the-box standard solution for everything › Several different software packages for assembly
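The cost ranges above can be sanity-checked with a back-of-the-envelope calculation: bases needed = coverage x genome size, and cost = the fraction of a run that this occupies x the price per run. The sketch assumes a bacterial-size genome of 5 Mb (within the 0.5-10 Mb range given earlier), the $10,000 per-run price and the per-run yields quoted on earlier slides, and perfect sharing of a run between samples; all of these are simplifying assumptions.

```python
def sequencing_cost(coverage, genome_size, run_yield, price_per_run):
    """Cost of the share of one run needed to reach the requested coverage."""
    bases_needed = coverage * genome_size
    return bases_needed * price_per_run / run_yield


GENOME = 5 * 10**6        # assumed 5 Mb bacterial genome
PRICE = 10_000            # assumed price per run in USD, for both platforms
ILLUMINA_YIELD = 10**11   # 100 Gb/run
ROCHE_454_YIELD = 10**9   # 1 Gb/run

print(sequencing_cost(30, GENOME, ILLUMINA_YIELD, PRICE))
print(sequencing_cost(30, GENOME, ROCHE_454_YIELD, PRICE))
```

Under these assumptions both results fall inside the ranges quoted on the slide; library preparation, labour and IT costs are not included.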

20  Sequencing technology › Roche/454 GS-FLX (‘454’) › Illumina  Prokaryotic profiling › De novo genome sequencing › Metagenomics › SNP profiling › Species quantification  Viral profiling › De novo genome sequencing

21  De novo genome assembly › Study of 1 single species › Need for species isolation  Metagenomics analysis › Study of a community of species › No need for isolation (culturing bias!) › Study the collective gene pool and function of the community/ecology › No need for individual functions

22  Practical 1. Get bacterial DNA or RNA from a sample  Soil  Gut/Fecal  Ocean water (e.g. Craig Venter)  … 2. Sequence: long or short reads (1-10 days) 3. Obtain your sequences 4. Map on a database of known genes (1 day) 5. Annotate/analyse the community (weeks)

23

24  2010: Giant Panda genome (2nd carnivore) › No umami taste receptor -> no meat affinity › The panda is more a dog than a bear › The panda is a carnivore eating bamboo!

25  Still 2010 !: Panda ‘microbiome’  Gut microbiome of the panda reveals the presence of bamboo/cellulose degrading pathways

26

27  A clinical example: gut microbiome can predict diabetes and malnourishment › PLoS ONE (2011), Brown et al. › PLoS ONE (2010), Valladares et al. › Gut Pathogens (2011), Gupta et al.

28  Sequencing technology › Roche/454 GS-FLX (‘454’) › Illumina  Prokaryotic profiling › De novo genome sequencing › Metagenomics › SNP profiling › Species quantification  Viral profiling › De novo genome sequencing

29  Classical SNP analysis - practical 1. Design PCR primers 2. Generate amplicons 3. Re-sequence using long read sequencing  Conserve ‘SNP blocks’ 4. Detect SNPs 5. Correlate SNPs to drug resistance, severity of symptoms …
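Step 4 (detect SNPs) can be illustrated with a toy caller that compares already-aligned amplicon reads to a reference, column by column. This is a sketch under strong assumptions: reads are pre-aligned to the same coordinates, gaps are written as "-", and the `min_fraction` threshold of 0.8 is an arbitrary illustrative value. Real pipelines use an aligner and a statistical variant caller.

```python
from collections import Counter


def call_snps(reference, aligned_reads, min_fraction=0.8):
    """Return (position, ref_base, alt_base) where most reads disagree
    with the reference. min_fraction is an assumed, illustrative cutoff."""
    snps = []
    for pos, ref_base in enumerate(reference):
        column = [
            read[pos]
            for read in aligned_reads
            if pos < len(read) and read[pos] != "-"
        ]
        if not column:
            continue  # no read covers this position
        base, count = Counter(column).most_common(1)[0]
        if base != ref_base and count / len(column) >= min_fraction:
            snps.append((pos, ref_base, base))
    return snps


reference = "ACGTACGT"
reads = ["ACGTTCGT", "ACGTTCGT", "ACGTTCGA"]
print(call_snps(reference, reads))
```

In the toy data, position 4 differs in all three reads and is called, while the single mismatch at position 7 is treated as a likely sequencing error.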

30  Amplicon resequencing is the same for human, prokaryotic, viral analyses  Many standardized out-of-the-box solutions available  Very simple analysis  Watch out for the overkill… › Don’t use a bazooka to kill a fly! › Throughput can be too high

31

32  Profile the coding region of hepatitis C Lauck et al. 2012

33  Use next-generation sequencing to predict the optimal HIV therapy Thielen et al. 2012

34  Sequencing technology › Roche/454 GS-FLX (‘454’) › Illumina  Prokaryotic profiling › De novo genome sequencing › Metagenomics › SNP profiling › Species quantification  Viral profiling › De novo genome sequencing

35  Imagine the following research questions › Which (known) species/groups are present in a certain sample › Does this composition alter given a certain treatment, change of conditions, patients etc.  No need for de novo genome sequencing  No metagenomics: species instead of functions

36  Prokaryotes have the 16S rDNA gene, coding for ribosomal RNA  The 16S rDNA region is 1.5 kb long  16S rDNA is specific for each species/strain  Theoretical: 4^1,500 ≈ 10^903 possibilities  In practice: 16S rDNA sequence known for millions of species
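The theoretical number of 16S variants is 4 choices per position over 1,500 positions. A quick check of the order of magnitude, using base-10 logarithms (1,500 x log10(4) ≈ 903.09):

```python
import math

LENGTH = 1500   # 16S rDNA length in bp, from the slide
ALPHABET = 4    # A, C, G, T

exponent = LENGTH * math.log10(ALPHABET)
print(f"4^{LENGTH} is about 10^{exponent:.2f}")

# Python integers are arbitrary precision, so the exact value can be
# computed and its number of decimal digits counted as a cross-check.
digits = len(str(ALPHABET ** LENGTH))
print(digits)
```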

37  16S rDNA can be isolated in different species using universal PCR primers › Isolate/amplify different regions using the same primers  Compare the isolated sequences against a database of known sequences

38  Practical procedure 1. Sample an environment and isolate DNA 2. Do a universal PCR amplification 3. Sequence using long read sequencing: the longer the better! 4. Obtain sequences 5. Map sequences against a reference database 6. Annotate the data
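Step 5 (map sequences against a reference database) can be sketched with a toy classifier that assigns each read to the reference sharing the most k-mers with it. The two-entry `reference_db`, the species names, the sequences and the choice of k = 4 are all made up for illustration; a real analysis would use full-length 16S references and an established aligner or classifier.

```python
def kmers(seq, k=4):
    """Set of all substrings of length k."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def classify(read, reference_db, k=4):
    """Return (best_name, score) using Jaccard similarity of k-mer sets."""
    read_kmers = kmers(read, k)
    best_name, best_score = None, 0.0
    for name, ref_seq in reference_db.items():
        ref_kmers = kmers(ref_seq, k)
        union = read_kmers | ref_kmers
        score = len(read_kmers & ref_kmers) / len(union) if union else 0.0
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score


# Hypothetical reference database with made-up sequences.
reference_db = {
    "species_A": "ACGTACGTTAGCCGATAGCT",
    "species_B": "TTGACCGGTTAACCGGATTA",
}
read = reference_db["species_A"][:12]
print(classify(read, reference_db))
```

Longer reads share more k-mers with the correct reference, which is one way to see why "the longer the better" holds for this application.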

39  Example: The Antarctica project › Which parameters determine the composition of bacterial communities in Antarctic lakes? › 20 different samples/lakes › Sequence 16S rDNA genes › 1 x 454 run (1 million 500 bp sequences) › Map all sequences back to the RDP database

40  Analyse the data using computing power › Compare different locations  Is species A present in location1, location2,… › Assess the distribution in a single location  How dominant is the most dominant species in location 1  How many species are in location 1  …  Visualize !
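Once each read has a species label, the questions on this slide reduce to counting. The sketch below assumes the input is a plain list of species labels per location (one label per classified read); the labels and counts are invented for illustration.

```python
from collections import Counter


def summarize(assignments):
    """Richness and dominance for one location."""
    counts = Counter(assignments)
    total = sum(counts.values())
    species, n = counts.most_common(1)[0]
    return {
        "richness": len(counts),   # how many species are present
        "dominant": species,       # most abundant species
        "dominance": n / total,    # its share of all reads
    }


def shared_species(location_a, location_b):
    """Species present in both locations."""
    return set(location_a) & set(location_b)


# Hypothetical read-level assignments for two locations.
location1 = ["A"] * 6 + ["B"] * 3 + ["C"]
location2 = ["B", "B", "D"]

print(summarize(location1))
print(shared_species(location1, location2))
```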

41  Analyse different samples on different taxonomic levels › Include taxonomic tree of life of bacteria › Use a ‘taxonomy browser’

42  Analyse a single location

43  Compare different locations

44
Analysis                 Lab work difficulty       Analysis difficulty
De novo genome           ++ (isolate)              +
Metagenomics             +                         +++ (pathways etc.)
SNP                      +++ (design primers)      ++ (correlate)
Species quantification   ++ (universal primers)    ++

45  Sequencing technology › Roche/454 GS-FLX (‘454’) › Illumina  Prokaryotic profiling › De novo genome sequencing › Metagenomics › SNP profiling › Species quantification  Viral profiling › De novo genome sequencing

46  Viral profiling › Viral profiling = prokaryotic profiling, but…  Cheaper  Faster  Easier › De novo genome sequencing = OK › Don’t spend $10,000 on a 100 kb genome! › Multiplexing/pooling capacity is limited!

47  Watch out for the overkill › An Illumina run can be split into 8 lanes › >20 samples per lane can be combined  Still >100 Mb per sample…
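The overkill is easy to quantify. Assuming the 100 Gb/run figure from the Illumina slide, an even split over 8 lanes and 20 samples per lane, and a 100 kb viral genome (the size mentioned on the previous slide), a rough calculation looks like this:

```python
def per_sample_yield(run_bases, lanes, samples_per_lane):
    """Bases per sample, assuming the run is shared evenly."""
    return run_bases / (lanes * samples_per_lane)


def fold_coverage(bases, genome_size):
    """How many times the yield covers a genome of the given size."""
    return bases / genome_size


RUN_BASES = 100 * 10**9   # 100 Gb/run, from the Illumina slide
VIRAL_GENOME = 100_000    # assumed 100 kb viral genome

sample_bases = per_sample_yield(RUN_BASES, lanes=8, samples_per_lane=20)
print(sample_bases)
print(fold_coverage(sample_bases, VIRAL_GENOME))
```

Even after pooling 160 samples on one run, each sample gets several hundred megabases, far more than a small viral genome needs.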

48 Thanks for your attention !

49 joachim.deschrijver@ugent.be

