Download presentation
Presentation is loading. Please wait.
Published byAmbrose Sims Modified over 7 years ago
1
Konstantin Okonechnikov Qualimap v2: advanced quality control of
high throughput sequencing data Max Planck Institute For Infection Biology Molecular Biology department BOSC 2015 Dublin 1 1
2
Quality Control of HTS data
High-throughput sequencing : high speed, deep coverage, various applications However there are platform specific, protocol-based and analysis errors (duplicates, PCR, algorithm induced biases, etc…) There are tools that allow to perform Quality Control task: FastQC Samtools Picard tools RSeQC … Qualimap There are various solutions. Also Qualimap. Previous year second version of Qualimap.
3
Qualimap A Java application, computes statistics and graphs for the evaluation and quality control of HTS alignment data. Both GUI and command line interfaces available. Input: BAM/SAM, GTF/GFF/BED Output: HTML /PDF, simple text format Analysis modes: BAM QC, RNA-seq QC, Counts QC So we provided our solution. Since we are here in the visualization Qualimap
4
Qualimap 2 A Java application, computes statistics and graphs for the evaluation and quality control of HTS alignment data. Both GUI and command line interfaces available. Input: BAM/SAM, GTF/GFF/BED Output: HTML /PDF, simple text format Analysis modes: BAM QC, Multi-sample BAM QC, RNA-seq QC + Counts QC v2 So we provided our solution. Since we are here in the visualization Qualimap
5
BAM QC mode BAM file global statistics: alignment types, coverage analysis, insert size, duplication rate, etc. Improvements and novelties: Additional metrics for mapping quality, coverage, insert size Region/out-of-region analysis Duplicate alignments exclusion … many more … Global data (reference size, number of reads), coverage (mapped, paired, per chromosome) , reads info (insert size, quality, homopolymers, duplication rate)
6
Novel mode: multisample BAM QC
Common analysis of generated BAM QC results Detailed plots: coverage, GC-content, insert size, etc. Principal component analysis based on selected parameters A new feature, available from the snapshots
7
Novel mode: multisample BAM QC
Common analysis of generated BAM QC results: Detailed plots: coverage, GC-content, insert size, etc. Principal component analysis based on selected parameters A new feature, available from the snapshots
8
Redesigned RNA-seq quality control: RNA-seq QC + Counts QC
Analysis types: transcript coverage and proportion, GC bias, 5‘-3‘ bias, counts analysis (expression level, gene types, etc.) Novel :multisample analysis in Counts QC Main properties are comparable to RSeQC and RNASeQC. Novelty multisample analysis, example saturation, expression analysis
9
Becoming more “open-source”
Source code repository: bitbucket Discussion forum: google-groups User activity ( – ): 15 bug reports 9 novel issue suggestions 3 bug-fixes from users Supporters are mentioned in commits, news and history log Each user mentioned in the log
10
Thank you for your attention!
Useful links: Web-site: Bitbucket: Reference: García-Alcalde F, Okonechnikov K, et al “Qualimap: evaluating next-generation sequencing alignment data." Bioinformatics 28, no. 20 (2012): This is it. If you are working with NGS alignment data, give Qualimap a try. Hope you will find it as useful as we find it for our research. Thanks a lot.
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.