Published by Eleanore Richardson. Modified over 9 years ago.
1
Tutorial 6 High Throughput Sequencing
2
HTS tools and analysis
– Review of resequencing pipeline
– Visualization: IGV
– Analysis platform: Galaxy
– Tuning up the pipelines
3
Review of resequencing pipeline
4
Demultiplexing
[Figure labels: Lane; Unknown inserts]
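The demultiplexing step can be sketched as follows. This is a minimal illustration, not the method used on the slide: it assumes each read starts with its sample barcode and that barcodes must match exactly. The barcode sequences and sample names are hypothetical.

```python
from collections import defaultdict

def demultiplex(reads, barcodes):
    """Split the reads of one lane into per-sample bins by leading barcode.

    Reads whose prefix matches no barcode are collected under "unknown".
    """
    bins = defaultdict(list)
    for read in reads:
        for barcode, sample in barcodes.items():
            if read.startswith(barcode):
                # Strip the barcode so only the insert is kept
                bins[sample].append(read[len(barcode):])
                break
        else:
            bins["unknown"].append(read)
    return dict(bins)

# Hypothetical barcodes and reads, for illustration only
barcodes = {"ACGT": "sample_1", "TGCA": "sample_2"}
lane = ["ACGTTTGACC", "TGCAGGATCA", "NNNNACGTAC"]
result = demultiplex(lane, barcodes)
```

Real demultiplexers usually tolerate a small number of barcode mismatches; exact matching is used here only to keep the sketch short.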
5
Mapping
Pipeline so far: Demultiplexing, Mapping
[Figure labels: Reference Genome; Sample]
Example of mapping parameters:
– Number of mismatches per read
– Scores for mismatch or gaps
Mapping parameters affect the rest of the analysis.
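To make the "number of mismatches per read" parameter concrete, here is a toy ungapped mapper. It is a sketch under simplifying assumptions (brute-force scan, no gaps, no scoring scheme) and does not represent how a production aligner works. The function name `best_ungapped_hit` and the example read are hypothetical.

```python
def best_ungapped_hit(read, reference, max_mismatches):
    """Return (position, mismatches) of the best ungapped placement,
    or None if every placement exceeds max_mismatches."""
    best = None
    for pos in range(len(reference) - len(read) + 1):
        window = reference[pos:pos + len(read)]
        mismatches = sum(1 for a, b in zip(read, window) if a != b)
        if mismatches <= max_mismatches and (best is None or mismatches < best[1]):
            best = (pos, mismatches)
    return best

reference = "ACTTCGTCGAAAGG"
read = "TCGACGTA"  # hypothetical read; differs from the reference at two bases

strict = best_ungapped_hit(read, reference, max_mismatches=1)
relaxed = best_ungapped_hit(read, reference, max_mismatches=5)
```

With a limit of 1 mismatch the read is left unmapped, while with a limit of 5 it is placed at offset 3 with two mismatches. The same read therefore contributes coverage, and possibly variant calls, only under the more permissive setting.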
6
Removing duplicates and non-unique mappings
Pipeline so far: Demultiplexing, Mapping, Removing duplicates and non-unique mappings
[Figure labels: Reference Genome; ?]
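A minimal sketch of this filtering step, assuming each alignment is a plain dictionary. The field names (`n_hits`, `chrom`, `pos`, `strand`) are hypothetical, and duplicates are defined here simply as reads sharing the same chromosome, start position, and strand.

```python
def filter_alignments(alignments):
    """Keep uniquely mapped reads and one read per (chrom, pos, strand)."""
    seen = set()
    kept = []
    for aln in alignments:
        if aln["n_hits"] != 1:
            continue  # read maps to more than one location: non-unique
        key = (aln["chrom"], aln["pos"], aln["strand"])
        if key in seen:
            continue  # same placement already kept: treat as a duplicate
        seen.add(key)
        kept.append(aln)
    return kept

# Hypothetical alignments, for illustration only
alignments = [
    {"name": "r1", "chrom": "chr1", "pos": 100, "strand": "+", "n_hits": 1},
    {"name": "r2", "chrom": "chr1", "pos": 100, "strand": "+", "n_hits": 1},
    {"name": "r3", "chrom": "chr1", "pos": 250, "strand": "-", "n_hits": 3},
    {"name": "r4", "chrom": "chr1", "pos": 100, "strand": "-", "n_hits": 1},
]
kept_names = [aln["name"] for aln in filter_alignments(alignments)]
```

Here `r2` is dropped as a duplicate of `r1`, and `r3` is dropped because it maps to three locations.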
7
Resequencing/ Exome Pipeline
8
Coverage profile and variant calling
Pipeline so far: Demultiplexing, Mapping, Removing duplicates and non-unique mappings, Coverage profile and variant calling
[Figure labels: Reference Genome …ACTTCGTCGAAAGG…; G]
9
Variant filtering
Pipeline so far: Demultiplexing, Mapping, Removing duplicates and non-unique mappings, Coverage profile and variant calling, Variant filtering
[Figure labels: Reference Genome …ACTTCGTCGAAAGG… (shown twice)]
Filter thresholds: Frequency >= 20%, Coverage >= 5
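The two thresholds on this slide (frequency >= 20%, coverage >= 5) can be expressed as a small predicate. This is a sketch only; the function name and arguments are hypothetical, and frequency is taken to mean the fraction of reads at the position that carry the variant base.

```python
def passes_filter(alt_count, coverage, min_frequency=0.20, min_coverage=5):
    """Return True if a candidate variant meets both thresholds."""
    if coverage < min_coverage:
        return False
    return alt_count / coverage >= min_frequency

kept = passes_filter(alt_count=3, coverage=10)        # 30% at 10x coverage
low_coverage = passes_filter(alt_count=2, coverage=4)  # 50% but only 4x
low_frequency = passes_filter(alt_count=1, coverage=10)  # 10% at 10x
```

Only the first candidate is kept: the second fails on coverage and the third on frequency.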
10
Genes and known variants
Pipeline: Demultiplexing, Mapping, Removing duplicates and non-unique mappings, Variant calling, Variant filtering, Genes and known variants
[Figure labels: Reference Genome …ACTTCGTCGAAATG… …GTCCCGTGATACTCCGT…; G; A; rs230985; Gene X]
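Annotation against genes and known variants amounts to two lookups, sketched below. The coordinates are invented for illustration: the slide names rs230985 and Gene X but gives no positions, so the values in `known_variants` and `genes` are assumptions, as are the function and field names.

```python
def annotate(variant, known_variants, genes):
    """Attach a known-variant ID and overlapping gene names to a variant.

    variant is (chrom, pos, ref, alt); genes are (chrom, start, end, name).
    """
    chrom, pos, ref, alt = variant
    known_id = known_variants.get((chrom, pos, ref, alt))
    gene_names = [
        name for g_chrom, start, end, name in genes
        if g_chrom == chrom and start <= pos <= end
    ]
    return {"variant": variant, "known_id": known_id, "genes": gene_names}

# Made-up coordinates; only the labels come from the slide
known_variants = {("chr1", 1200, "G", "A"): "rs230985"}
genes = [("chr1", 1000, 2000, "Gene X")]

annotated = annotate(("chr1", 1200, "G", "A"), known_variants, genes)
novel = annotate(("chr1", 5000, "C", "T"), known_variants, genes)
```

A variant with no entry in the known-variant table and no overlapping gene comes back with `known_id` set to `None` and an empty gene list.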
11
Resequencing results
12
Working with IGV http://www.broadinstitute.org/igv/
13
Integrative Genomics Viewer
IGV is a high-performance visualization tool for interactive exploration of large, integrated genomic datasets. It supports a wide variety of data types, including array-based and next-generation sequence data, and genomic annotations.
14
– Genome used for mapping
– Name of sample (BAM file)
– Lowest resolution of the genome (zoomed out)
15
Zooming in
16
– Coverage track
– Alignment track
17
Zoomed in to base-pair resolution
SNP
18
Hover over the coverage track to see details of all bases at a specific position.
Can we trust this SNP?
19
Hover over the alignment track to see details of a specific read.
What is the quality of this read and its mapping?
20
Right-click on the alignment track to change the view of this track
21
Color reads by strand to verify there is no strand bias
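The same check can be done numerically on the reads that support a variant. The sketch below is an assumption-laden illustration: the 10% cutoff for the less common strand is arbitrary, and the function names are hypothetical.

```python
def strand_balance(strands):
    """Count forward and reverse reads and the fraction on the rarer strand."""
    forward = sum(1 for s in strands if s == "+")
    reverse = sum(1 for s in strands if s == "-")
    total = forward + reverse
    minor_fraction = min(forward, reverse) / total if total else 0.0
    return forward, reverse, minor_fraction

def looks_strand_biased(strands, min_minor_fraction=0.1):
    """Flag a variant whose supporting reads sit almost entirely on one strand."""
    return strand_balance(strands)[2] < min_minor_fraction

# Hypothetical strands of the reads supporting two candidate SNPs
one_sided = ["+", "+", "+", "+", "+", "+"]
balanced = ["+", "-", "+", "-", "-", "+"]
```

`looks_strand_biased(one_sided)` flags the first candidate, while the second, with three reads on each strand, is not flagged.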
22
Why and how to work with IGV
23
Base qualities, comparison between samples
24
False positive indels
25
Same mapping statistics – different meaning
What might cause this low percentage of mapping?
26
– The sample contains a high percentage of contamination
– The sample is very different from the reference genome
27
One image is worth a thousand words…
28
Structural Variations
Large deletion in the sample compared to the reference genome
29
Galaxy
30
https://main.g2.bx.psu.edu/
31
Use your account name and password to log in to Galaxy:
32
Uploading data to Galaxy
37
Use the “eye” icon to view the contents of a file
38
Mapping, filtering and conversion to BAM
39
Mapping
40
Filter SAM file
41
Convert SAM to BAM
42
Variant calling
43
Create pileup
44
Find variants
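The "create pileup, then find variants" sequence can be sketched in plain Python. This is an illustration of the idea, not the Galaxy tools shown on the slides: reads are assumed to be ungapped and given as 0-based (start, sequence) pairs, and the reads themselves are hypothetical.

```python
from collections import Counter

def build_pileup(reference, alignments):
    """Count the bases observed at each reference position."""
    pileup = [Counter() for _ in reference]
    for start, seq in alignments:
        for offset, base in enumerate(seq):
            pileup[start + offset][base] += 1
    return pileup

def find_variants(reference, pileup):
    """List (pos, ref_base, alt_base, alt_count, coverage) for each
    non-reference base seen in the pileup."""
    variants = []
    for pos, counts in enumerate(pileup):
        coverage = sum(counts.values())
        for base, n in counts.items():
            if base != reference[pos]:
                variants.append((pos, reference[pos], base, n, coverage))
    return variants

reference = "ACTTCGTCGAAAGG"
# Hypothetical reads: two of the three carry a T at position 11
alignments = [(8, "GAAAGG"), (8, "GAATGG"), (9, "AATGG")]
variants = find_variants(reference, build_pileup(reference, alignments))
```

The result reports a single A-to-T candidate at position 11, supported by 2 of 3 reads. Thresholds on frequency and coverage would then be applied to this list.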
45
Tuning up the pipelines
46
How can mapping parameters affect the results?
– 1 mismatch per read
– 5 mismatches per read
47
False positives vs. true negatives
3-base insertion
One pipeline for all projects?
48
How can you tune your analysis?
Try different programs.
Mapping:
– Change mapping parameters
– Use non-unique mappings
– Don’t filter duplicates
Variants:
– Change variant filtration
– Change variant merging: penetrance, different heredity, low coverage in one individual…
– Look for bigger variants: big insertions/deletions, inversions, copy number variations, etc.