Canadian Bioinformatics Workshops



Module 4 Visual Analysis of HT-seq data

Module 4 bioinformatics.ca

Learning Objectives of Module
– to appreciate the different data visualization tools in genomics
– to know when to use a particular tool
– to gain more experience with genome browsers
– to become an expert in variation inspection (single nucleotide and structural variants)
– to become familiar with next-gen variant analysis tools

Organization
– Part I (9:00-10:30)
  – genome browsers
  – visualizing single nucleotide and structural variants
– Part II (11:00-12:30)
  – variant search engines
  – finding disease-causing genetic mutations

Part I: browsing HT-seq data, inspecting variants

Why visualize our data?

Anscombe’s quartet
each of these four datasets has nearly the same mean and variance, yet they look very different when plotted
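The summary statistics of the quartet can be checked numerically. The following is a minimal Python sketch using the published quartet values and only the standard library; `summarize` is a hypothetical helper name chosen for illustration.

```python
from statistics import mean, variance

# Anscombe's quartet (Anscombe, 1973). Sets I-III share the same x values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(quartet):
    """Return {name: (mean_x, var_x, mean_y, var_y)} using sample variance."""
    return {
        name: (mean(x), variance(x), mean(y), variance(y))
        for name, (x, y) in quartet.items()
    }


SUMMARY = summarize(QUARTET)
for name, (mx, vx, my, vy) in SUMMARY.items():
    print(f"{name:>3}: mean_x={mx:.2f} var_x={vx:.2f} mean_y={my:.2f} var_y={vy:.2f}")
```

The four rows print almost identical numbers, which is the point of the slide: only a plot reveals how different the datasets are.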

Preattentive processing
encoded properly, outliers are easily identified

Preattentive processing (video)

Why visualize?
the human visual system is a low-cost* and high-performance
– sense maker, to identify patterns
– debugger, to identify issues and outliers
* compared to the cost of writing, debugging, and running computational scripts

Visualization Tools in Genomics

Which tool to use?
there are over 40 different genome browsers; which one to use depends on
– task at hand
– kind and size of data
– data privacy

HT-seq Genome Browsers
– task at hand: visualizing HT-seq reads, especially good for inspecting previously identified variants
– kind and size of data: large BAM files, stored locally or remotely
– data privacy: run on the desktop, can keep all data private
Integrative Genomics Viewer (IGV), Savant Genome Browser

You might also want to try
– new web technologies are being applied to make HT-seq data browsing more interactive
– the UCSC Genome Browser has been retrofitted to display BAM files
– Trackster is a genome browser that can perform visual analytics on small windows of the genome, then deploy the full analysis with Galaxy
UCSC Genome Browser, Trackster (part of Galaxy)

Savant
desktop genome browser, designed for HT-seq data
– emphasis on manually inspecting single nucleotide and structural variations

Review: structural variation detection
covered in Module 3
two complementary approaches:
– depth of coverage (DOC)
– paired end mapping (PEM)
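Depth of coverage looks for regions whose read depth departs from the sample's typical depth. The following is a minimal Python sketch of that idea, assuming per-window depths have already been computed. The function name `flag_depth_outliers` and the 0.5 and 1.5 ratio cut-offs are hypothetical and are not taken from Module 3.

```python
from statistics import median


def flag_depth_outliers(depths, low=0.5, high=1.5):
    """Label each window 'loss', 'gain' or 'normal' relative to the median depth.

    `low` and `high` are illustrative ratio cut-offs, not recommended values.
    """
    baseline = median(depths)
    if baseline <= 0:
        raise ValueError("median depth must be positive")
    labels = []
    for depth in depths:
        ratio = depth / baseline
        if ratio < low:
            labels.append("loss")    # candidate deletion
        elif ratio > high:
            labels.append("gain")    # candidate duplication
        else:
            labels.append("normal")
    return labels


# Toy per-window read depths: two low windows and one high window.
WINDOW_DEPTHS = [30, 31, 29, 14, 13, 30, 61, 30]
LABELS = flag_depth_outliers(WINDOW_DEPTHS)
print(LABELS)
```

Real DOC callers also correct for GC content and mappability; this sketch leaves those steps out.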

PEM: small insertions
[figure labels: donor, reference]

PEM: large insertions
[figure labels: donor, reference]

PEM: deletions
[figure labels: reference, donor]

PEM: inversions
[figure labels: reference, donor]
one read inverted when mapped

PEM: tandem duplications
[figure labels: reference, donor]
order of read mappings reversed
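The PEM signatures above can be sketched as a small classifier for a single mapped read pair. This is an illustrative Python sketch, not the method used by Savant or by any particular caller. It assumes a forward-reverse paired-end library, and the function name `classify_pair`, the expected mapped distance of 400 bp, and the tolerance of 100 bp are hypothetical. Large insertions, where one mate may fail to map at all, are not handled.

```python
def classify_pair(pos_a, strand_a, pos_b, strand_b, expected=400, tolerance=100):
    """Label one mapped read pair with the PEM signature it matches.

    Positions are reference coordinates; strands are '+' or '-'.
    `expected` and `tolerance` are illustrative values in base pairs.
    """
    # Order the two reads by reference coordinate.
    (left_pos, left_strand), (right_pos, right_strand) = sorted(
        [(pos_a, strand_a), (pos_b, strand_b)]
    )
    distance = right_pos - left_pos

    if left_strand == right_strand:
        return "inversion"            # one read inverted when mapped
    if left_strand == "-" and right_strand == "+":
        return "tandem_duplication"   # order of read mappings reversed
    if distance > expected + tolerance:
        return "deletion"             # mates map farther apart than expected
    if distance < expected - tolerance:
        return "insertion"            # mates map closer together than expected
    return "concordant"


print(classify_pair(1000, "+", 3000, "-"))
```

A single discordant pair is weak evidence; in practice a call is supported by a cluster of pairs that share the same signature.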

Structural Variants in Savant
– Savant has a visualization mode for BAM files called “Matepair (Arc)” that is specialized for identifying structural variants using the PEM methodology
– it connects the locations of paired mappings by an arc
  – arc height represents the mapped distance
  – arc color represents the relative orientation of the reads (for complex rearrangements, like inversions)

Savant demo

Lab Time

We are on a Coffee Break & Networking Session

Quiz for Module 4 Part I

Question 1
which visualization mode in Savant is best for finding SNPs? why?

Question 2
which visualization mode in Savant is best for finding structural variations? why?

Question 3
what kind of event does this image depict?
e.g. chr1: 5,195, ,199,144

A: INSERTION
[figure labels: donor, reference]

Question 4
what kind of event does this image depict?
chr1: 26,489, ,490,661

A: DELETION
[figure labels: reference, donor]

Question 5
what would a heterozygous deletion look like?
chr1: 31,574, ,578,242

Question 6
what kind of event does this image depict?
chr1: 81,659, ,661,916

A: INVERSION
[figure labels: reference, donor]
one read inverted when mapped

Question 7
what kind of event does this image depict?
chr1: 11,050, ,055,457

A: TANDEM DUPLICATION
[figure labels: reference, donor]
order of read mappings reversed

Part II: visual analytics for variants
this is bonus material, covered if time permits
contact for

Genetic Variant Analysis
finding a disease-causing genetic mutation is “like trying to find a needle in a needlestack” (rather than in a haystack)
– lots of variants
– many distractors
  – many false positives
    – errors in sequencing
    – errors in variant prediction
  – most true positives are not causal
    – not related to phenotype of interest, not damaging

Genetic Variant Analysis
filter variants based on quality, effect, and relevance to disease
[figure: variant calling, annotation, filtration, visualization; labeled Modules 1-3 and Module 4.1]
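Filtering on quality, effect, and relevance can be sketched as three successive checks on annotated variant records. This is a minimal Python sketch under stated assumptions: the `Variant` record, the `filter_variants` function, the quality threshold of 30, the set of damaging effects, and the placeholder gene names are all hypothetical and do not come from the workshop material or from any specific tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    chrom: str
    pos: int
    qual: float   # caller-reported quality score
    effect: str   # annotated effect, e.g. "missense" or "synonymous"
    gene: str     # annotated gene symbol (placeholders below)


# Illustrative set of effects treated as potentially damaging.
DAMAGING = frozenset({"missense", "nonsense", "frameshift"})


def filter_variants(variants, min_qual=30.0, effects=DAMAGING, genes=None):
    """Keep variants that pass quality, effect, and (optional) gene-list filters."""
    kept = []
    for v in variants:
        if v.qual < min_qual:
            continue                  # quality: likely a false positive
        if v.effect not in effects:
            continue                  # effect: unlikely to be damaging
        if genes is not None and v.gene not in genes:
            continue                  # relevance: not in a candidate gene
        kept.append(v)
    return kept


EXAMPLE = [
    Variant("chr1", 1000, 55.0, "missense", "GENE_A"),
    Variant("chr1", 2000, 12.0, "nonsense", "GENE_A"),    # fails quality
    Variant("chr2", 3000, 80.0, "synonymous", "GENE_B"),  # fails effect
    Variant("chr3", 4000, 60.0, "frameshift", "GENE_C"),  # fails relevance
]
KEPT = filter_variants(EXAMPLE, genes={"GENE_A", "GENE_B"})
print(KEPT)
```

Each filter removes a different class of distractor, so applying them in sequence narrows the list before any visual inspection.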

Existing Tools
– command-line is powerful but not interactive
– Excel / Genome Browsers are interactive but not powerful

[figure text: chr1 : 102,435,394 – 129,485,349 GO]

MedSavant, a variant search engine

MedSavant
visual analytics from variant calling to disease mutation discovery
[figure: variant calling, annotation, filtration, visualization; labeled MedSavant]

MedSavant demo

You might also want to try
– VarSifter works in memory, good for small projects
– this space is evolving; difficult to do a comprehensive comparison
– much more commercial activity compared to genome browsers
VarSifter, Golden Helix SVS (commercial)

We are on a Coffee Break & Networking Session