Introduction to next generation sequencing Rolf Sommer Kaas.


1 Introduction to next generation sequencing Rolf Sommer Kaas

2 National Food Institute, Technical University of Denmark. Outline: History; Next generation sequencing (Ion Torrent, 454, PacBio, Illumina, MinION); Output; Data Analysis.

3 History. [Timeline figure with the years 1953, 1972, 1975, 1977, 1980, 1981 and 1990]

4 History. 1990-2003: Human genome project. 1998: Random shotgun sequencing (fast, 300 mill. $) versus hierarchical shotgun sequencing (3 billion $).

5 History. 1990-2003: Human genome project. 2001: Draft. 2003: Complete.

6 History. [Timeline figure with the years 1953, 1972, 1975, 1977, 1980, 1981, 1990 and 2003]

7 History. 2004: Next Generation Sequencing. 454 Life Sciences: parallelized pyrosequencing. Reduced costs 6-fold.

8 History. 2004: Next Generation Sequencing. Sources: Wetterstrand KA. DNA Sequencing Costs: Data from the NHGRI Genome Sequencing Program (GSP). Accessed 31-Oct-14. European Nucleotide Archive (ENA): http://www.ebi.ac.uk/ena/about/statistics

9 Next generation sequencing. Roche, 454 Life Sciences (GS FLX Titanium); Life Technologies (Ion Torrent & Ion Proton); Illumina (HiSeq, MiSeq, GenomeAnalyzer); Pacific Biosciences (PacBio RS); Oxford Nanopore (MinION, PromethION, GridION).

10 Next generation sequencing. Method outline - library: 1. Fragment DNA. 2. Ligate adapters (amplification primer, sequencing primer, barcode). 3. Amplification. 4. Sequencing.

11 Next generation sequencing technologies. Ion Torrent. Problem with homopolymers. Fast. Expensive. Long insert sizes. Low throughput. Cheapest.

12 Next generation sequencing. Illumina (Genome Analyzer, HiSeq, MiSeq): short reads (~50-250 bp); good accuracy; high throughput.

13 Next generation sequencing technologies. PacBio: expensive; lower accuracy; long reads (~5000 bp).

14 Next generation sequencing technologies. Nanopore: upcoming technology; released to select labs.

15 Next generation sequencing technologies. Nanopore (GridION, MinION, PromethION): up to 80,000 bp reads. MinION: 150 mill. bp per 6 h (30x coverage of E. coli).

16 Next generation sequencing technologies. Machine distribution: Illumina is the most common; ABI SOLiD is not as big as it appears.

17 Output. [Figure labels: Sample, Reads, Raw reads]

18 Output. What is sequence data? Sequence data is stored in fasta files. Fasta example: [figure with the parts Header/ID and Sequence labelled]

19 Output. Handling sequence data? Watch out! Same FASTA file in Word: this should be fine…

20 Output. Handling sequence data? Watch out! What your data actually looks like! Oh no! This won't work… Take home message: use “pure text editors”. Examples: Notepad (Win), TextEdit (Mac), Sublime Text (all). Save files in “txt” format.

21 Output. What is the data? Fastq files. What is Fastq? Fasta + quality scores. 1 read, 4 lines. Fastq example:
@FCC0CD5ACXX:1:1101:1103:2048#ACCGT/1
ACNGTGTTTTTAGTTATTGTTTTGTTAAGTTGGGTTTTTTGTACCCAATAGCCAACAAGCCGCCTTTATGGCGGTTTTTTTGTGCCTGAAAAGTGGGCGCA
+
_BP`ccceggcegihiiighiifhihfddgfhi^efgfhhhhhegiiiiiiiihiihihggeeccdddcccacWTT^acc[ab_`]`[_b`^BBBBBBBB
@FCC0CD5ACXX:1:1101:1165:2058#ACGTT/1
ACGTTAGCAGAATCGCTTTCTGTTCGTTTTCCACCTGCGACAGACGCACCGGACCACGGTTGGCGAGATCGTCGCGCAGAATATCGGCGGCACGCTGCGAC
+
bb_eeceefeggehhdagfghhiihfghighhffhifhhcghfdhiihafgdceba`a\aaccc^V]^baccaccXaaX^bbcccaac[_X]]a[aacXT
@FCC0CD5ACXX:1:1101:1135:2082#AGCGT/1
AGCGTGACAAACATTTTATTGCGCCCGGTTTTATCCAGCTTGAATGCCTGACGAAAGAAGATGATGGTGACGACGATGGAGAGAACAATCAGCACCAGATT
+
bbbeeeeefggfgiihgiigiiiiiiiffgifgeghiiihhfefffhhhfgh_fhggdgegeaceeacbdcbcc\^aa]``_^bb]bcccccbac_a^bc
@FCC0CD5ACXX:1:1101:1239:2083#AGCGT/1
AGCGTCTGACTCACACAAAAACGGTAACACAGTTATCCACAGAATCAGGGGATAAGGCCGGAAAGAACATGTGAGCAAAAAGGCAAAGCCAGGACAAAAGG
+
bbbeeeeegggggiiiiiiiiiigifhhiiighiiihhiiiiiiihiiiiiiiiiihiigcdbbdcdcccccdccccccccacccccccbcccacccccc

22 Output. What is the data? Fastq files. Same Fastq example as on slide 21. Line 1 of each read (starting with “@”): Header/ID.

23 Output. What is the data? Fastq files. Same Fastq example as on slide 21. Line 2 of each read: DNA sequence.

24 Output. What is the data? Fastq files. Same Fastq example as on slide 21. Line 3 of each read (starting with “+”): Name field (optional).

25 Output. What is the data? Fastq files. Same Fastq example as on slide 21. Line 4 of each read: Quality scores.
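
A minimal Python sketch of reading the four-line records described on slides 21-25. The function names and the short example record are illustrative and not from the slides. The quality offset of 64 is an assumption based on the characters seen in the slide example; many newer files use an offset of 33 instead.

```python
import io

def read_fastq(handle):
    """Yield (header, sequence, name_field, quality) for each 4-line record."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break  # end of file (or a trailing blank line)
        sequence = handle.readline().rstrip("\n")
        name_field = handle.readline().rstrip("\n")
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or not name_field.startswith("+"):
            raise ValueError("not a 4-line FASTQ record: " + header)
        if len(sequence) != len(quality):
            raise ValueError("sequence/quality length mismatch: " + header)
        yield header[1:], sequence, name_field[1:], quality

def phred_scores(quality, offset=64):
    """Convert quality characters to integer scores (offset is an assumption)."""
    return [ord(ch) - offset for ch in quality]

def error_probability(score):
    """Probability that a base call with this Phred score is wrong."""
    return 10 ** (-score / 10)

# Short hypothetical record, shaped like the reads on slide 21.
EXAMPLE = "@read1/1\nACNGT\n+\n_BP`c\n"

for header, sequence, name_field, quality in read_fastq(io.StringIO(EXAMPLE)):
    print(header, sequence, phred_scores(quality))
```

With an offset of 64, the quality string of the example record decodes to the scores 31, 2, 16, 32 and 35.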

26 Output. Paired and Single End. Single end reads; insert size (e.g. 300 bp); paired end reads; long insert size (e.g. 8000 bp).

27 Output. Splitting & clipping data using barcodes, aka multiplexing. Fastq example: same four reads as on slide 21. Each header ends with its barcode after “#” (ACCGT, ACGTT, AGCGT, AGCGT), and each read starts with those five bases (the first read has an N at the third position). De-multiplexing is usually done by the sequencer.
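
As the slide says, de-multiplexing is usually done by the sequencer, so the following is only a sketch of the idea. It assumes the header layout seen in the example (barcode between “#” and “/”) and that the barcode occupies the first bases of the read; the function names and the shortened reads are hypothetical.

```python
from collections import defaultdict

def barcode_from_header(header):
    """Return the text between '#' and '/' in a header such as 'ID#ACCGT/1'."""
    return header.split("#", 1)[1].split("/", 1)[0]

def demultiplex(records):
    """Group (header, sequence, quality) records by barcode.

    The barcode bases are clipped from the start of the sequence and the
    matching quality characters are clipped as well.
    """
    bins = defaultdict(list)
    for header, sequence, quality in records:
        barcode = barcode_from_header(header)
        n = len(barcode)
        bins[barcode].append((header, sequence[n:], quality[n:]))
    return dict(bins)

# Shortened, hypothetical reads in the style of the slide example.
reads = [
    ("r1#ACCGT/1", "ACNGTGTTT", "_BP`ccceg"),
    ("r2#AGCGT/1", "AGCGTGACA", "bbbeeeeef"),
    ("r3#AGCGT/1", "AGCGTCTGA", "bbbeeeeeg"),
]

for barcode, members in sorted(demultiplex(reads).items()):
    print(barcode, len(members))
```

The clipping uses the barcode length rather than an exact match, because a read may carry an uncalled base (N) inside its barcode, as the first read in the example does.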

28 Output. Data quality.

29 Output. Trimming data. Fastq example:
@FCC0CD5ACXX:1:1101:1103:2048#ACCGT/1
ACNGTGTTTTTAGTTATTGTTTTGTTAAGTTGGGTTTTTTGTACCCAATAGCCAACAAGCCGCCTTTATGGCGGTTTTTTGTGCCTGAAAAGTGGGCGCA
+
_BP`ccceggcegihiiighiifhihfddgfhi^efgfhhhhhegiiiiiiiihiihihggeeccdddcccacWTT^acc[ab_`]`[_b`^BBBBBBBB
@FCC0CD5ACXX:1:1101:1165:2058#ACGTT/1
ACGTTAGCAGAATCGCTTTCTGTTCGTTTTCCACCTGCGACAGACGCACCGGACCACGGTTGGCGAGATCGTCGCGCAGAATATCGGCGGCACGCTGCGAC
+
bb_eeceefeggehhdagfghhiihfghighhffhifhhcghfdhiihafgdceba`a\aaccc^V]^baccaccXaaacc[ab_`]`[_b`^BBBBBBBB
@FCC0CD5ACXX:1:1101:1135:2082#AGCGT/1
AGCGTGACAAACATTTTATTGCGCCCGGTTTTATCCAGCTTGAATGCCTGACGAAAGAAGATGATGGTGACGACGATGGAGAGAACAATCAGCACCAGATT
+
bbbeeeeefggfgiihgiigiiiiiiiffgifgeghiiihhfefffhhhfgh_fhggdgegeaceeacbdcbcc\^aa]``_^bb]bcccccbac_a^bc
@FCC0CD5ACXX:1:1101:1239:2083#AGCGT/1
AGCGTCTGACTCACACAAAAACGGTAACACAGTTATCCACAGAATCAGGGGATAAGGCCGGAAAGAACATGTGAGCAAAAAGGCAAAGCCAGGACAAAAGG
+
bbbeeeeegggggiiiiiiiiiigifhhiiighiiihhiiiiiiihiiiiiiiiiihiigcdbbdcdcccccdccccccccacccccccbcccacccccc

30 Output. Data quality. Trimming data. Fastq example: same as on slide 29.
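
One simple way to trim is to cut low-quality bases from the 3' end of a read. The sketch below does only that and is deliberately simpler than what dedicated trimming tools do. The function name, the threshold of 20 and the quality offset of 64 are assumptions, not values given in the slides.

```python
def trim_3prime(sequence, quality, min_q=20, offset=64):
    """Remove bases from the 3' end while their Phred score is below min_q.

    Returns the trimmed (sequence, quality) pair. Both the threshold and the
    offset are assumptions and should be adapted to the data at hand.
    """
    end = len(sequence)
    while end > 0 and ord(quality[end - 1]) - offset < min_q:
        end -= 1
    return sequence[:end], quality[:end]

# Hypothetical read: 'h' decodes to 40 and 'B' to 2 with an offset of 64.
print(trim_3prime("ACGTACGTAC", "hhhhhhBBBB"))
```

In this example the four trailing bases with quality “B” are removed and the first six bases are kept.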

31 Output. Data quality. Coverage & Depth.
Coverage: average number of times each position in the genome is covered by the data. C = N*L/G, where N: number of reads, L: read length, G: genome size.
Example: N = 5 mill, L = 100 bp, G = 5 Mbp, C = 5*100/5 = 100X. On average, 100 reads cover each position in the genome.
Depth: number of reads that cover a particular nucleotide position in the genome. reads/site = depth.
Breadth-of-coverage (target or assembly): c = assembly size / target size.
Example: assembly = 4.9 mill, target = 5 mill, c = 4.9/5 = 0.98.
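
The three quantities on this slide can be sketched in a few lines of Python. The function names are illustrative, and the depth function assumes that read alignments are given as hypothetical 0-based (start, end) pairs with the end excluded.

```python
def average_coverage(n_reads, read_length, genome_size):
    """C = N * L / G, the average number of reads covering each position."""
    return n_reads * read_length / genome_size

def breadth_of_coverage(assembly_size, target_size):
    """Fraction of the target that is represented in the assembly."""
    return assembly_size / target_size

def depth_per_position(alignments, genome_size):
    """Count how many reads cover each position (0-based, end exclusive)."""
    depth = [0] * genome_size
    for start, end in alignments:
        for pos in range(start, end):
            depth[pos] += 1
    return depth

# The two worked examples from the slide.
print(average_coverage(5_000_000, 100, 5_000_000))   # 100.0
print(breadth_of_coverage(4_900_000, 5_000_000))

# Two hypothetical reads on a genome of six positions.
print(depth_per_position([(0, 3), (2, 5)], 6))       # [1, 1, 2, 1, 1, 0]
```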

32 Output. Data storage & Access. International Nucleotide Sequence Database Collaboration (INSDC). Europe: European Bioinformatics Institute (EBI). United States: National Center for Biotechnology Information (NCBI). Asia: DNA Data Bank of Japan (DDBJ).

33 Output. Data storage & Access. European Bioinformatics Institute (EBI): http://www.ebi.ac.uk/ena

34 Data Analysis. [Workflow figure: Data splitting, clipping, and trimming; De novo; Reference; Assembly; Mapping to a reference; Further analysis (e.g. gene finding); Further analysis (e.g. SNP trees)]

35 Data Analysis. Bioinformatic platforms: Unix, DOS, Mac OS X, Linux, Windows. Bioinformatic tools: CLC bio, MEGA, Geneious.

36 Data Analysis. Bioinformatic platforms: Unix…

37 Data Analysis. Bioinformatic platforms: Web-tools to the rescue! + Platform independent. + Requires little computer resources. + Can be done everywhere. - Requires patience. http://www.genomicepidemiology.org/ : MLST, resistance genes, SNP calling and tree creation, species identification. https://main.g2.bx.psu.edu/ : many NGS tools, steep learning curve.

38 Data Analysis. Assembly, De novo. Different sequencers require different assemblers, depending on output and error profile. Assembler: Newbler (454, Ion Torrent). Assembler: Velvet (Illumina, ABI SOLiD (color space)).

39 Data Analysis. Assembly, De novo. Velvet: the unnecessarily complex assembler. K-mer based assembler. User needs to set K. Longer reads equal larger K. Everything is defined in “Kmer-space”. Nucleotide length = Kmer_length + K-1. Kmer_coverage = Nucleotide_coverage * (Read_length-K+1)/Read_length.

40 Data Analysis. Assembly, De novo. Velvet assembly. Example:
>NODE_1_length_91928_cov_23.136574
AGTTCATTGATAAATCTTTTTTGATTATCATCAACGAGTGCCCACACAGATTGATTGGTT
TATATTGTTAAAGAGCTTTTCCTATCGAAATCGCTTTTAAGCTCAATTCGCTAGGGCTGC
GTATATTACGCTTATTCAGTTGAGTGTCAAACGTTATTTTCTA...
K = 83
Nucleotide length = Kmer_length + K-1 = 91928 + 83 - 1 = 92010
Kmer_coverage = Nucleotide_coverage * (Read_length-K+1)/Read_length, so Nucleotide_coverage = Kmer_coverage / ((Read_length-K+1)/Read_length) = 23.136574 / ((300 - 83 + 1)/300) = 31.84
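
The conversion out of “Kmer-space” can be sketched in Python as follows. The function names are illustrative; the header layout is taken from the example node on this slide, and K = 83 and a read length of 300 are the values used in that example.

```python
def parse_velvet_header(header):
    """Split a header like '>NODE_1_length_91928_cov_23.136574'.

    Returns (node_id, kmer_length, kmer_coverage).
    """
    fields = header.lstrip(">").split("_")
    return int(fields[1]), int(fields[3]), float(fields[5])

def nucleotide_length(kmer_length, k):
    """Nucleotide length = Kmer_length + K - 1."""
    return kmer_length + k - 1

def nucleotide_coverage(kmer_coverage, read_length, k):
    """Invert Kmer_coverage = Nucleotide_coverage * (Read_length-K+1)/Read_length."""
    return kmer_coverage * read_length / (read_length - k + 1)

node_id, kmer_len, kmer_cov = parse_velvet_header(">NODE_1_length_91928_cov_23.136574")
print(nucleotide_length(kmer_len, 83))                    # 92010
print(round(nucleotide_coverage(kmer_cov, 300, 83), 2))   # 31.84
```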

41 Data Analysis. De novo quality check. Number of contigs: fewer is generally better. N50: [figure labels: total size of contigs; 50% of size]

42 Data Analysis. De novo quality check. Number of contigs: fewer is generally better. N50: with the contigs ordered from largest to smallest, the size of the contig at which 50% of the total size of contigs is reached.
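
A small Python sketch of the N50 calculation, using the common definition: sort the contigs from largest to smallest and report the size of the contig at which the running total reaches half of the total size. The function name and the contig lengths are made up for illustration.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths."""
    lengths = sorted(contig_lengths, reverse=True)
    half = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half:
            return length
    raise ValueError("no contigs given")

# Hypothetical assembly: total size 300, so half is 150.
# 80 + 70 = 150 reaches the halfway point, so N50 is 70.
print(n50([10, 20, 30, 40, 50, 70, 80]))
```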

43 Data Analysis. [Workflow figure: Data splitting, clipping, and trimming; De novo; Reference; Assembly; Further analysis (e.g. gene finding)]

44 Data Analysis. Further data analysis. Contigs: gene finding, resistance, MLST, etc.

45 Data Analysis. Further data analysis: Genes are not just genes… Find genes by Open Reading Frames + Shine-Dalgarno + motifs. Not there does not mean it is NOT there: not assembled; truncated. “Hypothetical” & “Putative”: the curse of bioinformatics. Annotated gene, verified in the lab; “Hypothetical” or “Putative” annotations; no match to original sequence: the evil circle of BLAST similarity. Suggested annotation service: RAST: http://rast.nmpdr.org/
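
To illustrate only the Open Reading Frame part of gene finding, here is a deliberately naive Python sketch. It scans the three forward reading frames for an ATG followed in frame by a stop codon. It ignores the reverse strand, Shine-Dalgarno sites and other motifs, and the function name and minimum length are assumptions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(sequence, min_codons=30):
    """Return (start, end, frame) for forward-strand ORFs.

    start is the 0-based position of the ATG, end is one past the stop codon,
    and the ORF length in codons (stop included) must be at least min_codons.
    """
    sequence = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(sequence) - 2, 3):
            codon = sequence[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs

# Tiny hypothetical sequence with one ORF (ATG AAA TTT TAG) in frame 2.
print(find_orfs("CCATGAAATTTTAGCC", min_codons=3))   # [(2, 14, 2)]
```

A real gene finder combines several signals, which is why the slide lists Shine-Dalgarno sites and motifs alongside open reading frames.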

46 Data Analysis. [Workflow figure: Data splitting, clipping, and trimming; De novo; Reference; Assembly; Mapping to a reference; Further analysis (e.g. gene finding)]

47 Data Analysis. Mapping to a reference. [Figure labels: raw reads; reference sequence; do not match any reads; do not match reference] Mappers: BWA, Bowtie, MAQ, CGE.

48 Data Analysis. [Workflow figure: Data splitting, clipping, and trimming; De novo; Reference; Assembly; Mapping to a reference; Further analysis (e.g. gene finding); Further analysis (e.g. SNP trees)]

49 Thank you for listening. Questions?

