1
Canadian Bioinformatics Workshops www.bioinformatics.ca
3
Module 4 Visual Analysis of HT-seq data
4
Module 4 bioinformatics.ca
Learning Objectives of Module
– to appreciate the different data visualization tools in genomics
– to know when to use a particular tool
– to gain more experience with genome browsers
– to become an expert in variant inspection (single nucleotide and structural variants)
– to become familiar with next-gen variant analysis tools
5
Organization
Part I (9:00-10:30)
– genome browsers
– visualizing single nucleotide and structural variants
Part II (11:00-12:30)
– variant search engines
– finding disease-causing genetic mutations
6
Part I: browsing HT-seq data, inspecting variants
7
Why visualize our data?
8
Anscombe’s quartet: each of these four datasets has nearly the same mean and variance in x and in y (and nearly the same correlation and regression line), yet they look very different when plotted
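The claim about Anscombe's quartet can be checked numerically. The sketch below is not part of the original slides; it uses the published quartet values and only the Python standard library, and the names `QUARTET`, `summarize`, and `SUMMARY` are illustrative.

```python
from statistics import mean, variance

# Anscombe's quartet: datasets I-III share the same x values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(xs, ys):
    """Return (mean of x, sample variance of x, mean of y, sample variance of y)."""
    return (mean(xs), variance(xs), mean(ys), variance(ys))


SUMMARY = {name: summarize(xs, ys) for name, (xs, ys) in QUARTET.items()}

if __name__ == "__main__":
    for name, (mx, vx, my, vy) in SUMMARY.items():
        print(f"{name:>3}: mean_x={mx:.2f} var_x={vx:.2f} mean_y={my:.2f} var_y={vy:.2f}")
```

The four rows of summary statistics are almost indistinguishable, which is the point of the slide: only a plot reveals that the datasets differ.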
9
Preattentive processing: when data are encoded properly, outliers are easily identified
10
Preattentive processing (video)
11
Why visualize?
the human visual system is a low-cost* and high-performance
– sense maker, to identify patterns
– debugger, to identify issues and outliers
* compared to the cost of writing, debugging, and running computational scripts
12
Visualization Tools in Genomics
13
Which tool to use?
there are over 40 different genome browsers; the choice depends on
– task at hand
– kind and size of data
– data privacy
14
HT-seq Genome Browsers
– task at hand: visualizing HT-seq reads, especially good for inspecting previously identified variants
– kind and size of data: large BAM files, stored locally or remotely
– data privacy: run on the desktop, can keep all data private
Examples: Integrative Genomics Viewer (IGV), Savant Genome Browser
15
You might also want to try
– new web technologies are being applied to make HT-seq data browsing more interactive
– UCSC Genome Browser has been retrofitted to display BAM files
– Trackster (part of Galaxy) is a genome browser that can perform visual analytics on small windows of the genome, then deploy the full analysis with Galaxy
16
Savant
– desktop genome browser, designed for HT-seq data
– emphasis on manually inspecting single nucleotide and structural variations
17
Review: structural variation detection (covered in Module 3)
two complementary approaches:
– depth of coverage (DOC)
– paired end mapping (PEM)
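As a rough illustration of the depth-of-coverage idea, the sketch below bins read alignments into fixed windows and flags windows whose depth departs from the median. It is not the method taught in Module 3; the function names and the 0.5 / 1.5 ratio cut-offs are assumptions chosen only for the example, and reads are given as half-open `(start, end)` intervals.

```python
from statistics import median


def window_depths(reads, region_start, region_end, window):
    """Mean per-base depth in each window of [region_start, region_end)."""
    n_windows = -(-(region_end - region_start) // window)  # ceiling division
    covered = [0] * n_windows
    for start, end in reads:
        pos = max(start, region_start)
        stop = min(end, region_end)
        while pos < stop:
            idx = (pos - region_start) // window
            window_end = region_start + (idx + 1) * window
            step = min(stop, window_end) - pos
            covered[idx] += step  # bases this read contributes to window idx
            pos += step
    depths = []
    for idx, bases in enumerate(covered):
        w_start = region_start + idx * window
        w_len = min(window, region_end - w_start)  # last window may be shorter
        depths.append(bases / w_len)
    return depths


def flag_windows(depths, low=0.5, high=1.5):
    """Label each window by its depth relative to the median (hypothetical cut-offs)."""
    baseline = median(depths)
    if baseline == 0:
        return ["normal"] * len(depths)
    calls = []
    for depth in depths:
        ratio = depth / baseline
        if ratio < low:
            calls.append("loss")
        elif ratio > high:
            calls.append("gain")
        else:
            calls.append("normal")
    return calls


if __name__ == "__main__":
    reads = [(0, 200), (0, 200), (300, 500), (300, 500), (0, 100), (0, 100)]
    depths = window_depths(reads, 0, 500, 100)
    print(list(zip(depths, flag_windows(depths))))
```

Real data would also need normalization for GC content and mappability, which this sketch ignores.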
18
PEM: small insertions [figure: read pairs, donor vs. reference]
19
PEM: large insertions [figure: read pairs, donor vs. reference]
20
PEM: deletions [figure: read pairs, reference vs. donor]
21
PEM: inversions (one read inverted when mapped) [figure: read pairs, reference vs. donor]
22
PEM: tandem duplications (order of read mappings reversed) [figure: read pairs, reference vs. donor]
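The PEM signatures above can be sketched as a small classifier for a single read pair. This is an illustration rather than any tool's actual algorithm: it assumes an inward-facing (forward/reverse) paired-end library, and `ReadPair`, `classify_pair`, the expected distance of 300 and the tolerance of 60 are hypothetical. Large insertions, where a mate may fail to map, are not handled.

```python
from dataclasses import dataclass


@dataclass
class ReadPair:
    left_pos: int       # start of the read that maps leftmost on the reference
    left_strand: str    # '+' or '-'
    right_pos: int      # start of the read that maps rightmost on the reference
    right_strand: str   # '+' or '-'


def classify_pair(pair, expected=300, tolerance=60):
    """Classify one read pair by its paired-end mapping signature."""
    if pair.left_strand == pair.right_strand:
        # One read inverted when mapped.
        return "inversion"
    if pair.left_strand == "-" and pair.right_strand == "+":
        # Order of read mappings reversed (pair faces outward).
        return "tandem duplication"
    distance = pair.right_pos - pair.left_pos
    if distance > expected + tolerance:
        # Donor lacks sequence, so reads map farther apart on the reference.
        return "deletion"
    if distance < expected - tolerance:
        # Donor has extra sequence, so reads map closer together on the reference.
        return "insertion"
    return "concordant"


if __name__ == "__main__":
    print(classify_pair(ReadPair(1000, "+", 2500, "-")))
```

A single discordant pair is weak evidence; in practice several pairs with the same signature are expected to cluster at a locus before a variant is called.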
23
Structural Variants in Savant
Savant has a visualization mode for BAM files called “Matepair (Arc)” that is specialized for identifying structural variants using the PEM methodology
it connects the locations of paired mappings by an arc
– arc height represents the mapped distance
– arc color represents the relative orientation of the reads (for complex rearrangements, like inversions)
24
Savant demo
25
Lab Time
26
We are on a Coffee Break & Networking Session
30
Quiz for Module 4 Part I
31
Question 1: which visualization mode in Savant is best for finding SNPs? why?
32
Question 2: which visualization mode in Savant is best for finding structural variations? why?
33
Question 3: what kind of event does this image depict? (e.g. chr1: 5,195,017 - 5,199,144)
34
A: insertion [figure: read pairs, donor vs. reference]
35
Question 4: what kind of event does this image depict? (chr1: 26,489,321 - 26,490,661)
36
A: deletion [figure: read pairs, reference vs. donor]
37
Question 5: what would a heterozygous deletion look like? (chr1: 31,574,172 - 31,578,242)
38
Question 6: what kind of event does this image depict? (chr1: 81,659,802 - 81,661,916)
39
A: inversion (one read inverted when mapped) [figure: read pairs, reference vs. donor]
40
Question 7: what kind of event does this image depict? (chr1: 11,050,416 - 11,055,457)
41
A: tandem duplication (order of read mappings reversed) [figure: read pairs, reference vs. donor]
42
Part II: visual analytics for variants
this is bonus material, covered if time permits
contact mfiume@cs.toronto.edu for questions
43
Genetic Variant Analysis
finding a disease-causing genetic mutation is “like trying to find a needle in a needlestack”
lots of variants, many distractors
– many false positives: errors in sequencing, errors in variant prediction
– most true positives are not causal: not related to phenotype of interest, not damaging
44
Genetic Variant Analysis
filter variants based on quality, effect, and relevance to disease
variant calling → annotation → filtration → visualization
[figure labels: Modules 1-3, Module 4.1]
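The filtration step can be sketched as three successive tests on each annotated variant: quality, predicted effect, and relevance to the disease. Everything in the example is hypothetical and is not MedSavant's data model: the `Variant` fields, the `DAMAGING` effect labels, the quality threshold of 30, and the gene names are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    chrom: str
    pos: int
    qual: float   # variant-call quality score
    effect: str   # predicted effect from the annotation step
    gene: str     # gene the variant falls in


# Hypothetical set of effect labels treated as potentially damaging.
DAMAGING = frozenset({"missense", "nonsense", "frameshift", "splice_site"})


def filter_variants(variants, min_qual=30.0, damaging=DAMAGING, candidate_genes=None):
    """Keep variants that pass quality, effect, and (optionally) gene-relevance filters."""
    kept = []
    for variant in variants:
        if variant.qual < min_qual:
            continue  # quality: likely sequencing or calling error
        if variant.effect not in damaging:
            continue  # effect: unlikely to be damaging
        if candidate_genes is not None and variant.gene not in candidate_genes:
            continue  # relevance: outside the genes of interest
        kept.append(variant)
    return kept


if __name__ == "__main__":
    calls = [
        Variant("chr1", 1000, 55.0, "missense", "GENE_A"),
        Variant("chr1", 2000, 12.0, "nonsense", "GENE_A"),
        Variant("chr2", 3000, 80.0, "synonymous", "GENE_B"),
        Variant("chr3", 4000, 70.0, "frameshift", "GENE_C"),
    ]
    print(filter_variants(calls, candidate_genes={"GENE_A"}))
```

Applying the cheapest and most aggressive filter first (here, quality) keeps the later, more specific filters working on a smaller list.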
45
Existing Tools
– command-line is powerful but not interactive
– Excel / Genome Browsers are interactive but not powerful
46
[figure: genome browser navigation bar showing chr1: 102,435,394 – 129,485,349 and a GO button]
47
MedSavant, a variant search engine
48
MedSavant: visual analytics from variant calling to disease mutation discovery
variant calling → annotation → filtration → visualization
[figure label: MedSavant]
49
MedSavant demo
50
You might also want to try
– VarSifter works in memory, good for small projects
– this space is evolving; difficult to do a comprehensive comparison
– much more commercial activity compared to genome browsers
Examples: VarSifter, Golden Helix SVS (commercial)
51
We are on a Coffee Break & Networking Session