Canadian Bioinformatics Workshops www.bioinformatics.ca
Module 6 Structural variant calling Guillaume Bourque Informatics on High-throughput Sequencing Data June 9-10, 2016
Learning Objectives of Module: To understand what structural variants (SVs) are. To appreciate how SVs are discovered from NGS data. To appreciate the strengths and weaknesses of each SV discovery strategy. To recognize the SV “signals” in sequence alignments. To be able to visually explore read support for SVs.
Structural Variants (SVs): Genomic rearrangements that affect more than 50 bp of sequence (the cutoff varies by definition; some use 100 bp or 1 kb), including: deletions, novel insertions, inversions, mobile-element transpositions, duplications, translocations. Adapted from Alkan et al. Nat Rev Genet 2011
Detection and confirmation of SVs Feuk et al. Nat Rev Genet 2006
Structural variants in cancer Can higher resolution maps help identify recurrent aberrations and driver mutations in cancer?
Classes of SVs. Copy number variants (CNVs): deletions, duplications. Copy-neutral rearrangements: inversions, translocations. Other structural variants: novel insertions, mobile-element transpositions.
Classes of SVs Alkan et al. Nat Rev Genet 2011
Our understanding is driven by technology Aaron Quinlan
Array-based detection of CNVs Alkan et al. Nat Rev Genet 2011
Detecting SVs from NGS data Meyerson et al. Nat Rev Genet 2010
Strategies for calling SVs from NGS data Baker Nat Methods 2012
Strategies for calling SVs from NGS data 1. Baker Nat Methods 2012
Discordant read pairs. [Figure: Read 1 and Read 2 of a pair; the insert size is the genomic distance between the mapped paired tags.] Concordant: distance as expected. Discordant: distance too long or distance too short. Read pairs are also discordant when their order or orientation isn’t as expected.
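The rules on this slide can be sketched in a few lines of Python. This is a minimal illustration, not how any particular caller works: the `ReadPair` fields, the default insert-size mean and standard deviation, and the 3-SD cutoff are all assumptions chosen for the example, and it assumes a standard paired-end library in which a concordant pair has the left read on the forward strand and the right read on the reverse strand.

```python
from dataclasses import dataclass


@dataclass
class ReadPair:
    """Hypothetical minimal record for one mapped read pair."""
    chrom1: str
    pos1: int       # leftmost mapped position of read 1
    strand1: str    # '+' or '-'
    chrom2: str
    pos2: int       # leftmost mapped position of read 2
    strand2: str
    read_len: int = 100


def classify_pair(pair, mean_insert=400, sd_insert=40, n_sd=3):
    """Label a pair as concordant or give the reason it is discordant.

    mean_insert, sd_insert and n_sd are illustrative values; in practice
    they would be estimated from the library itself.
    """
    if pair.chrom1 != pair.chrom2:
        return "discordant: different chromosomes"

    # Sort the two reads by mapped position so we can reason about order.
    (left_pos, left_strand), (right_pos, right_strand) = sorted(
        [(pair.pos1, pair.strand1), (pair.pos2, pair.strand2)]
    )

    if left_strand == right_strand:
        return "discordant: orientation"
    if left_strand == "-" and right_strand == "+":
        return "discordant: order"

    # Outer distance spanned by the pair on the reference.
    span = right_pos + pair.read_len - left_pos
    if span > mean_insert + n_sd * sd_insert:
        return "discordant: distance too long"
    if span < mean_insert - n_sd * sd_insert:
        return "discordant: distance too short"
    return "concordant"
```

For example, `classify_pair(ReadPair("chr1", 1000, "+", "chr1", 5000, "-"))` reports a distance that is too long, the signal commonly associated with a deletion in the sample relative to the reference.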
Using discordant reads to detect SVs Adapted from Aaron Quinlan
Read-pair tools: BreakDancer, VariationHunter, MoDIL, GASV-PRO, DELLY, LUMPY, GenomeSTRiP, etc.
Detecting SVs with read-pairs Hillmer et al. Genome Res 2011
Read-pairs in complex regions Hillmer et al. Genome Res 2011
Read-pair summary. Weaknesses: difficult to interpret read-pairs in repetitive regions; difficult to fully characterize highly rearranged regions; high rate of false positives. Strengths: most classes of variation can, in principle, be detected.
Strategies for calling SVs from NGS data 2. Baker Nat Methods 2012
Read-depth Aaron Quinlan
Normalization issues
Population-based SV detection: PopSV. Monlong et al. bioRxiv 034165
Read-depth tools: ReadDepth, RDXplorer, cnvSeq, CNVer, CopySeq, GenomeSTRiP, CNVnator, PopSV, etc.
Read-depth summary. Weaknesses: relatively low resolution (normally ~10 kb); cannot detect balanced rearrangements (e.g., inversions) or transposon insertions. Strengths: determines DNA copy number (unlike most other methods); provides useful information even with low coverage, albeit at low resolution.
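The read-depth idea can be illustrated with a short sketch: count reads in fixed-size bins, then scale each bin against a baseline depth to estimate copy number. Everything here is an assumption made for illustration: the function names, the choice of the sample's own median bin count as the diploid baseline, and the absence of any GC-content or mappability correction, which real tools handle as part of normalization.

```python
from statistics import median


def bin_depth(read_starts, region_len, bin_size):
    """Count reads per fixed-size bin from 0-based read start positions."""
    n_bins = (region_len + bin_size - 1) // bin_size
    counts = [0] * n_bins
    for start in read_starts:
        if 0 <= start < region_len:
            counts[start // bin_size] += 1
    return counts


def copy_number(bin_counts, ploidy=2):
    """Estimate an integer copy number for each bin.

    Assumes most bins are at normal copy number, so the median bin
    count stands in for the expected depth at `ploidy` copies.
    """
    baseline = median(bin_counts)
    if baseline == 0:
        raise ValueError("median depth is zero; cannot normalize")
    return [round(ploidy * count / baseline) for count in bin_counts]


# Toy counts: two bins at roughly half depth, one at roughly 1.5x depth.
counts = [100, 98, 102, 51, 49, 100, 150, 101]
print(copy_number(counts))  # [2, 2, 2, 1, 1, 2, 3, 2]
```

The bin size sets the resolution: larger bins give more stable counts at low coverage, but they blur the boundaries of each event, which is the trade-off noted in the summary.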
Strategies for calling SVs from NGS data 3. Baker Nat Methods 2012
Split reads Rausch et al. Bioinformatics 2012
Split-read tools: Pindel, DELLY, LUMPY, PRISM, Mobster, etc.
Split reads summary. Weaknesses: requires sufficient coverage; can have false positives, especially in repetitive regions. Strengths: can be added to read-pair methods; base-pair resolution of breakpoints.
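To show where the base-pair resolution comes from, here is a minimal sketch that reads candidate breakpoints off soft-clipped alignments. It assumes SAM-style input, that is, a 1-based leftmost position plus a CIGAR string in which `S` marks soft-clipped bases. The function names, the `min_clip` and `min_support` thresholds, and the decision to ignore hard clips are assumptions for illustration, not the behaviour of any tool listed above.

```python
import re
from collections import Counter

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # CIGAR operations that advance along the reference


def clip_breakpoints(pos, cigar, min_clip=10):
    """Return reference positions where a read's alignment is soft-clipped.

    pos is the 1-based position of the first aligned base (SAM POS).
    Clips shorter than min_clip are ignored as likely noise.
    """
    ops = [(int(length), op) for length, op in CIGAR_RE.findall(cigar)]
    breakpoints = []
    # Leading soft clip: the breakpoint sits at the first aligned base.
    if ops and ops[0][1] == "S" and ops[0][0] >= min_clip:
        breakpoints.append(pos)
    # Trailing soft clip: the breakpoint sits at the last aligned base.
    if len(ops) > 1 and ops[-1][1] == "S" and ops[-1][0] >= min_clip:
        ref_len = sum(length for length, op in ops if op in REF_CONSUMING)
        breakpoints.append(pos + ref_len - 1)
    return breakpoints


def candidate_breakpoints(alignments, min_support=2):
    """Keep positions supported by at least min_support clipped reads.

    alignments is an iterable of (pos, cigar) tuples on one chromosome.
    """
    support = Counter()
    for pos, cigar in alignments:
        support.update(clip_breakpoints(pos, cigar))
    return sorted(p for p, n in support.items() if n >= min_support)
```

Requiring several reads to agree on the same position is what ties this signal to coverage: with too few reads across a breakpoint, no position reaches the support threshold.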
Strategies for calling SVs from NGS data 4. Baker Nat Methods 2012
De novo assembly for SVs Adapted from Alkan et al. Nat Rev Genet 2011
De novo assembly tools for SVs: Cortex, SGA, DISCOVAR, ABySS, Ray, etc.
De novo assembly for SVs summary. Weaknesses: computationally very intensive; hard to resolve repetitive and complex regions. Strengths: base-pair resolution of breakpoints; all classes of variation can, in principle, be detected.
Summary of strategies for calling SVs Aaron Quinlan
Bottom line: try many methods and validate Mills et al. Nature 2011 Kloosterman et al. Genome Res 2015
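One common way to combine several methods is to keep only calls that more than one tool reports, matching calls by reciprocal overlap. The sketch below is an illustration under stated assumptions: the tuple layout `(chrom, start, end, svtype)`, the 50% reciprocal-overlap threshold, the two-tool minimum, and the tool names in the usage note are all hypothetical. Agreement between callers is also not a substitute for the visual or experimental validation the slide calls for.

```python
def reciprocal_overlap(a, b):
    """Smallest fraction of either interval covered by their overlap.

    a and b are (start, end) tuples with end > start for real intervals.
    """
    (a_start, a_end), (b_start, b_end) = a, b
    overlap = min(a_end, b_end) - max(a_start, b_start)
    if overlap <= 0:
        return 0.0
    return min(overlap / (a_end - a_start), overlap / (b_end - b_start))


def supported_calls(calls_by_tool, min_ro=0.5, min_tools=2):
    """Return calls backed by at least min_tools tools.

    calls_by_tool maps a tool name to a list of
    (chrom, start, end, svtype) tuples. Each supported call is returned
    once per tool that made it, together with the sorted supporter names.
    """
    results = []
    for tool, calls in calls_by_tool.items():
        for call in calls:
            chrom, start, end, svtype = call
            supporters = {tool}
            for other, other_calls in calls_by_tool.items():
                if other == tool:
                    continue
                for o_chrom, o_start, o_end, o_type in other_calls:
                    same_event = o_chrom == chrom and o_type == svtype
                    if same_event and reciprocal_overlap(
                        (start, end), (o_start, o_end)
                    ) >= min_ro:
                        supporters.add(other)
                        break
            if len(supporters) >= min_tools:
                results.append((call, sorted(supporters)))
    return results
```

Given `{"toolA": [("chr1", 1000, 2000, "DEL")], "toolB": [("chr1", 1100, 2100, "DEL")]}`, both calls are returned because each overlaps the other by 90%. A call reported by only one tool is dropped from this list and would need other evidence to be retained.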
Visual validation: a deletion Aaron Quinlan
Visual validation: a duplication Aaron Quinlan
Visual validation: an inversion Aaron Quinlan
Visual validation: an insertion (in the reference) Aaron Quinlan
SVs summary view: Circos plots (circos.ca)
Lab time!