Canadian Bioinformatics Workshops

Canadian Bioinformatics Workshops www.bioinformatics.ca

Module 6: Structural variant calling. Guillaume Bourque. Informatics on High-throughput Sequencing Data, June 9-10, 2016

Learning Objectives of Module: to understand what structural variants (SVs) are; to appreciate how SVs are discovered from NGS data; to appreciate the strengths and weaknesses of each SV discovery strategy; to recognize the sequence alignment SV “signals”; to be able to visually explore read support for SVs.

Structural Variants (SVs): Genomic rearrangements that affect >50 bp (or 100 bp, or 1 kb) of sequence, including deletions, novel insertions, inversions, mobile-element transpositions, duplications, and translocations. Adapted from Alkan et al. Nat Rev Genet 2011

Detection and confirmation of SVs Feuk et al. Nat Rev Genet 2006

Structural variants in cancer Can higher resolution maps help identify recurrent aberrations and driver mutations in cancer?

Classes of SVs. Copy number variants (CNVs): deletions, duplications. Copy neutral rearrangements: inversions, translocations. Other structural variants: novel insertions, mobile-element transpositions.
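The grouping on this slide can be written down as a small lookup table. The following is a minimal sketch, not part of the original slides: the names `SVClass`, `SV_CLASS_BY_TYPE`, `sv_class`, and `is_structural_variant` are hypothetical, and the default 50 bp cutoff is just the first of the size thresholds quoted in the SV definition slide.

```python
from enum import Enum


class SVClass(Enum):
    COPY_NUMBER_VARIANT = "copy number variant (CNV)"
    COPY_NEUTRAL = "copy neutral rearrangement"
    OTHER = "other structural variant"


# Mapping taken from the "Classes of SVs" slide.
SV_CLASS_BY_TYPE = {
    "deletion": SVClass.COPY_NUMBER_VARIANT,
    "duplication": SVClass.COPY_NUMBER_VARIANT,
    "inversion": SVClass.COPY_NEUTRAL,
    "translocation": SVClass.COPY_NEUTRAL,
    "novel insertion": SVClass.OTHER,
    "mobile-element transposition": SVClass.OTHER,
}


def sv_class(sv_type: str) -> SVClass:
    """Return the class of an SV type name (case-insensitive)."""
    return SV_CLASS_BY_TYPE[sv_type.lower()]


def is_structural_variant(length_bp: int, min_length_bp: int = 50) -> bool:
    """Apply a size cutoff; 50 bp is an assumed default (100 bp or 1 kb are also used)."""
    return length_bp > min_length_bp
```

Keeping the cutoff as a parameter reflects the fact that the slides list several thresholds rather than a single agreed value.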

Classes of SVs Alkan et al. Nat Rev Genet 2011

Our understanding is driven by technology Aaron Quinlan

Array-based detection of CNVs Alkan et al. Nat Rev Genet 2011

Detecting SVs from NGS data Meyerson et al. Nat Rev Genet 2010

Strategies for calling SVs from NGS data Baker Nat Methods 2012

Strategies for calling SVs from NGS data 1. Baker Nat Methods 2012

Discordant read pairs. Concordant: the genomic distance between the mapped paired tags (Read 1 and Read 2) matches the expected insert size. Discordant: the distance is too long or too short. Read pairs are also discordant when order or orientation isn’t as expected.
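The concordant/discordant distinction can be sketched as a simple classifier. This is an illustrative sketch only, under several assumptions that are not in the slides: a paired-end library where the left read maps to the forward strand and the right read to the reverse strand, an insert-size distribution summarized by a mean and standard deviation, and a distance measured between the leftmost mapped positions (real tools account for read length). The names `ReadPair` and `classify_pair` and all default values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ReadPair:
    chrom1: str
    pos1: int      # leftmost mapped position of read 1
    strand1: str   # "+" or "-"
    chrom2: str
    pos2: int      # leftmost mapped position of read 2
    strand2: str


def classify_pair(pair, mean_insert=300, sd_insert=30, n_sd=3):
    """Label a read pair as concordant or give the reason it is discordant."""
    if pair.chrom1 != pair.chrom2:
        return "discordant: different chromosomes"

    # Order the two reads by genomic position.
    (left_pos, left_strand), (right_pos, right_strand) = sorted(
        [(pair.pos1, pair.strand1), (pair.pos2, pair.strand2)]
    )

    if left_strand == right_strand:
        return "discordant: orientation"
    if left_strand == "-" and right_strand == "+":
        return "discordant: order"  # reads point away from each other

    distance = right_pos - left_pos
    if distance > mean_insert + n_sd * sd_insert:
        return "discordant: distance too long"
    if distance < mean_insert - n_sd * sd_insert:
        return "discordant: distance too short"
    return "concordant"
```

In practice the mean and standard deviation would be estimated from the library itself rather than fixed in advance.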

Using discordant reads to detect SVs Adapted from Aaron Quinlan

Read-pair tools: BreakDancer, VariationHunter, MoDIL, GASV-PRO, DELLY, LUMPY, GenomeSTRiP, etc.

Detecting SVs with read-pairs Hillmer et al. Genome Res 2011

Read-pairs in complex regions Hillmer et al. Genome Res 2011

Read-pair summary. Weaknesses: difficult to interpret read-pairs in repetitive regions; difficult to fully characterize highly rearranged regions; high rate of false positives. Strengths: most classes of variation can, in principle, be detected.

Strategies for calling SVs from NGS data 2. Baker Nat Methods 2012

Read-depth Aaron Quinlan

Normalization issues

Population-based SV detection: PopSV. Monlong et al. bioRxiv 034165

Read-depth tools: ReadDepth, RDXplorer, cnvSeq, CNVer, CopySeq, GenomeSTRiP, CNVnator, PopSV, etc.

Read-depth summary. Weaknesses: relatively low resolution (normally ~10 kb); cannot detect balanced rearrangements (e.g., inversions) or transposon insertions. Strengths: determines DNA copy number (unlike most other methods); provides useful information even with low coverage, albeit at low resolution.
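The idea behind read-depth calling can be illustrated with a toy calculation: count reads in fixed windows, then scale each window's count against a baseline to get a copy-number estimate. This is a rough sketch under stated assumptions, not the method of any tool listed above: it assumes a diploid genome, uses the median window count as the two-copy baseline, and ignores the normalization issues (GC content, mappability) mentioned earlier. The function names are hypothetical.

```python
from statistics import median


def window_depths(read_starts, region_length, window_size):
    """Count reads per fixed-size window from 0-based read start positions."""
    n_windows = (region_length + window_size - 1) // window_size
    counts = [0] * n_windows
    for start in read_starts:
        if 0 <= start < region_length:
            counts[start // window_size] += 1
    return counts


def estimate_copy_number(counts, ploidy=2):
    """Scale each window count by the median count, assumed to represent `ploidy` copies."""
    baseline = median(counts)
    if baseline == 0:
        raise ValueError("median depth is zero; cannot normalize")
    return [round(ploidy * count / baseline) for count in counts]


# Windows at roughly half the typical depth come out as one copy,
# windows at roughly 1.5x the typical depth as three copies.
counts = [100, 98, 52, 49, 101, 150, 99]
print(estimate_copy_number(counts))  # [2, 2, 1, 1, 2, 3, 2]
```

The window size sets the resolution: larger windows give steadier counts at low coverage but blur event boundaries, which matches the trade-off noted in the summary.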

Strategies for calling SVs from NGS data 3. Baker Nat Methods 2012

Split reads Rausch et al. Bioinformatics 2012

Split read tools: Pindel, DELLY, LUMPY, PRISM, Mobster, etc.

Split reads summary. Weaknesses: requires sufficient coverage; can have false positives, especially in repetitive regions. Strengths: can be added to read-pair methods; base-pair resolution of breakpoints.
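Base-pair resolution comes from reads that align only partially, with the unaligned part clipped at the breakpoint. As a minimal sketch of that idea, the function below reads a SAM-style CIGAR string and reports the reference coordinate of the aligned base next to each sufficiently long clip. It is an illustration, not the algorithm of any tool listed above; `clip_breakpoints` and the `min_clip` default are hypothetical, and `pos` is assumed to be the 1-based leftmost aligned position as in the SAM POS field.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = "MDN=X"  # CIGAR operations that advance along the reference
CLIP_OPS = "SH"          # soft and hard clips


def clip_breakpoints(pos, cigar, min_clip=20):
    """Return (side, coordinate) for each end of the alignment clipped by >= min_clip bases.

    The coordinate is the aligned base adjacent to the clipped sequence.
    """
    ops = [(int(length), op) for length, op in CIGAR_OP.findall(cigar)]
    breakpoints = []

    # Clip at the start of the read: the breakpoint sits at the first aligned base.
    if ops and ops[0][1] in CLIP_OPS and ops[0][0] >= min_clip:
        breakpoints.append(("left", pos))

    # Clip at the end of the read: the breakpoint sits at the last aligned base.
    ref_length = sum(length for length, op in ops if op in REF_CONSUMING)
    if len(ops) > 1 and ops[-1][1] in CLIP_OPS and ops[-1][0] >= min_clip:
        breakpoints.append(("right", pos + ref_length - 1))

    return breakpoints
```

A caller would normally require several reads clipped at the same coordinate before trusting a breakpoint, which is one reason the approach needs sufficient coverage.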

Strategies for calling SVs from NGS data 4. Baker Nat Methods 2012

De novo assembly for SVs Adapted from Alkan et al. Nat Rev Genet 2011

De novo assembly tools for SVs: Cortex, SGA, DISCOVAR, ABySS, Ray, etc.

De novo assembly for SVs summary. Weaknesses: computationally very intensive; hard to resolve repetitive and complex regions. Strengths: base-pair resolution of breakpoints; all classes of variation can, in principle, be detected.

Summary of strategies for calling SVs Aaron Quinlan

Bottom line: try many methods and validate Mills et al. Nature 2011 Kloosterman et al. Genome Res 2015
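One common way to combine calls from several methods is to keep those that another method reproduces with sufficient reciprocal overlap. The sketch below illustrates that idea only; it is not taken from the slides or from the cited papers. The 50% reciprocal-overlap threshold, the two-tool minimum, the half-open `(start, end)` intervals, and the tool names in the example are all assumptions.

```python
def reciprocal_overlap(a, b):
    """Smallest fraction of either interval covered by their overlap (half-open intervals)."""
    (a_start, a_end), (b_start, b_end) = a, b
    overlap = min(a_end, b_end) - max(a_start, b_start)
    if overlap <= 0:
        return 0.0
    return min(overlap / (a_end - a_start), overlap / (b_end - b_start))


def supported_calls(calls_by_tool, min_tools=2, min_ro=0.5):
    """Return (call, supporting_tools) for calls matched by at least `min_tools` tools.

    `calls_by_tool` maps a tool name to a list of (chrom, start, end, sv_type).
    A call is reported once per tool that made it, so matched calls appear
    once for each supporting tool.
    """
    supported = []
    for tool, calls in calls_by_tool.items():
        for chrom, start, end, sv_type in calls:
            supporters = {tool}
            for other, other_calls in calls_by_tool.items():
                if other == tool:
                    continue
                for o_chrom, o_start, o_end, o_type in other_calls:
                    same_event = o_chrom == chrom and o_type == sv_type
                    if same_event and reciprocal_overlap(
                        (start, end), (o_start, o_end)
                    ) >= min_ro:
                        supporters.add(other)
                        break
            if len(supporters) >= min_tools:
                supported.append(((chrom, start, end, sv_type), sorted(supporters)))
    return supported


# Hypothetical call sets from two tools.
calls = {
    "tool_a": [("chr1", 1000, 2000, "DEL"), ("chr2", 500, 900, "DUP")],
    "tool_b": [("chr1", 1100, 2050, "DEL")],
}
for call, tools in supported_calls(calls):
    print(call, tools)
```

Agreement between tools is only a filter; the calls that survive still need the kind of validation the slide recommends.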

Visual validation: a deletion Aaron Quinlan

Visual validation: a duplication Aaron Quinlan

Visual validation: an inversion Aaron Quinlan

Visual validation: an insertion (in the reference) Aaron Quinlan

SVs summary view: Circos plots. circos.ca

Lab time!
