Published by Jadyn Bough. Modified over 9 years ago.
1
Reference mapping and variant detection
Peter Tsai
Bioinformatics Institute, University of Auckland
2
Reference mapping
Mapping is the process of comparing each read with the reference genome.
Many software tools are available for reference mapping. They differ in how they handle:
◦ Multiple placement of reads (multi-hits)
◦ Allow gaps
◦ Don’t allow gaps at all
◦ Limits on the number of mismatches
Assess your mapping results:
◦ % of total reads mapped
◦ % of uniquely mapped reads
◦ Coverage statistics, variance in depth
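The assessment metrics listed above can be sketched in a few lines of Python. This is a minimal illustration, not the output of any particular mapper: the function name `mapping_summary` and its two inputs (a per-read count of reported placements and a per-position depth list) are assumptions made for the example.

```python
from statistics import mean, pvariance


def mapping_summary(hit_counts, depths):
    """Summarise a mapping run.

    hit_counts[i]: number of placements reported for read i (0 = unmapped).
    depths[j]:     number of reads covering reference position j.
    Both inputs are hypothetical; a real pipeline would derive them
    from the mapper's alignment output.
    """
    total = len(hit_counts)
    mapped = sum(1 for h in hit_counts if h >= 1)   # placed at least once
    unique = sum(1 for h in hit_counts if h == 1)   # placed exactly once
    return {
        "pct_mapped": 100.0 * mapped / total,
        "pct_unique": 100.0 * unique / total,
        "mean_depth": mean(depths),
        "depth_variance": pvariance(depths),
    }


# Toy data: 10 reads and an 8-base reference.
summary = mapping_summary(
    hit_counts=[1, 1, 0, 3, 1, 2, 1, 0, 1, 1],
    depths=[4, 5, 5, 6, 0, 2, 6, 4],
)
print(summary)
```

A large gap between the mapped and uniquely mapped percentages points to many multi-hit reads, and a high depth variance points to uneven coverage.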
3
Mapped read depth
4
Variant detection
Identification of point mutations, short insertions and deletions.
We go through every column of the alignment and see how many alleles are found and how many differ from the reference genome.
Reference: ACGAAACGTAGTGAGGAC-GTA
Sample:    ACCAAACGTAGAGAGGACCGTA
The mismatched columns are SNPs; the gapped column is an indel.
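The column-by-column scan can be sketched for the two aligned sequences on this slide. This is a simplified illustration: a real variant caller examines a pileup of many reads at each column, and the function name `find_variants` is an assumption made for the example.

```python
def find_variants(reference, sample):
    """Compare two aligned sequences column by column.

    Returns (reference_position, kind, ref_base, sample_base) tuples.
    Positions are 1-based on the ungapped reference; an insertion is
    reported at the reference base that precedes it.
    """
    if len(reference) != len(sample):
        raise ValueError("aligned sequences must have equal length")
    variants = []
    ref_pos = 0
    for ref_base, sample_base in zip(reference, sample):
        if ref_base != "-":
            ref_pos += 1                      # gap columns consume no reference base
        if ref_base == sample_base:
            continue
        if ref_base == "-":
            kind = "insertion"
        elif sample_base == "-":
            kind = "deletion"
        else:
            kind = "SNP"
        variants.append((ref_pos, kind, ref_base, sample_base))
    return variants


variants = find_variants("ACGAAACGTAGTGAGGAC-GTA",
                         "ACCAAACGTAGAGAGGACCGTA")
for v in variants:
    print(v)
# (3, 'SNP', 'G', 'C')
# (12, 'SNP', 'T', 'A')
# (18, 'insertion', '-', 'C')
```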
5
Complexity of variant detection
2nd-generation sequencing is NOT single-molecule sequencing.
Because of PCR amplification, some DNA fragments are sequenced more often than others, which results in uneven coverage across the genome.
Duplicates provide false support in variant detection, because we are usually more confident in variants that have higher coverage support.
Solution: mark or remove exact duplicate reads when doing variant detection.
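Marking exact duplicates can be sketched as follows. The sketch assumes a read is a duplicate when an earlier read has the same chromosome, start position, strand and sequence; this definition and the name `mark_duplicates` are assumptions made for the example, and production tools use more elaborate criteria.

```python
def mark_duplicates(reads):
    """Flag reads that exactly repeat an earlier read.

    reads: list of (chrom, start, strand, sequence) tuples (hypothetical layout).
    Returns a list of booleans, True where the read duplicates an earlier one.
    """
    seen = set()
    flags = []
    for read in reads:
        flags.append(read in seen)   # first occurrence stays unflagged
        seen.add(read)
    return flags


reads = [
    ("chr1", 100, "+", "ACGTAC"),
    ("chr1", 100, "+", "ACGTAC"),   # exact duplicate of the first read
    ("chr1", 100, "-", "ACGTAC"),   # same position, opposite strand
    ("chr1", 103, "+", "TACGGA"),
    ("chr1", 100, "+", "ACGTAC"),   # another duplicate of the first read
]
flags = mark_duplicates(reads)
unique_reads = [r for r, dup in zip(reads, flags) if not dup]
print(flags)               # [False, True, False, False, True]
print(len(unique_reads))   # 3
```

Marking rather than deleting keeps the original data intact, so a downstream variant caller can decide whether to ignore the flagged reads.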
6
Complexity of variant detection
◦ Cloning process artifacts (e.g. PCR-induced mutations).
◦ Error rate associated with the sequence reads.
◦ Error rate associated with the mapping.
◦ Reliability of the reference genome.
7
Calling a variant
A hard cut-off on the percentage of reads that differ from the reference base.
75% is the minimum threshold for a variant to be called a homozygous variant.
A percentage-based cut-off assumes you have sufficient coverage.
Examples:
◦ A (ref): 0%, G: 100%
◦ A (ref): 7%, T: 93%
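A percentage cut-off of this kind can be sketched as below. The 75% threshold comes from the slide; the minimum depth of 10 reads, the function name `call_site` and the read counts in the examples are assumptions chosen only for illustration.

```python
def call_site(base_counts, ref_base, min_fraction=0.75, min_depth=10):
    """Apply a hard percentage cut-off at a single alignment column.

    base_counts: dict mapping base -> number of reads supporting it.
    min_depth is an assumed guard; the slide only says coverage must be sufficient.
    """
    depth = sum(base_counts.values())
    if depth < min_depth:
        return "no call (insufficient depth)"
    # Most frequent non-reference base, or (None, 0) if there is none.
    alt_base, alt_count = max(
        ((b, c) for b, c in base_counts.items() if b != ref_base),
        key=lambda bc: bc[1],
        default=(None, 0),
    )
    if alt_base is not None and alt_count / depth >= min_fraction:
        return f"homozygous variant {ref_base}>{alt_base}"
    return "not called homozygous"


print(call_site({"A": 0, "G": 30}, "A"))   # homozygous variant A>G
print(call_site({"A": 2, "T": 28}, "A"))   # homozygous variant A>T
print(call_site({"A": 0, "G": 3}, "A"))    # no call (insufficient depth)
```

The last call shows why the depth guard matters: three reads all supporting G is 100% by percentage but far too little evidence to trust.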
8
When to call a variant?
A: 18% C: 0% G: 55% T: 27%
9
Alignment considerations
Perform local realignment and calculate the mapping score to determine which alignment is better.
10
What depth do I need?
12
Factors to consider
Read length
◦ Longer reads are more likely to be mapped with high confidence
Sequencing depth
◦ Require sufficient depth, ~30x
Base call quality for each supporting base
◦ Use high-quality bases, Q30
Mapping quality
◦ Local realignment to improve variant calling
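Two of the numbers above can be made concrete with a short sketch. It uses the standard Phred definition, Q = -10 log10(p), to turn Q30 into an error probability, and the usual estimate of mean depth as reads × read length / genome size. The function names and the genome size and read length in the example are hypothetical.

```python
import math


def phred_to_error_probability(q):
    """Phred scale: Q = -10 * log10(p), so p = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)


def mean_depth(num_reads, read_length, genome_size):
    """Expected average depth if reads were spread evenly over the genome."""
    return num_reads * read_length / genome_size


def reads_needed(target_depth, read_length, genome_size):
    """Number of reads required to reach a target average depth."""
    return math.ceil(target_depth * genome_size / read_length)


# Q30 corresponds to roughly one base-call error in 1000.
print(phred_to_error_probability(30))

# Hypothetical example: 5 Mb genome, 100 bp reads, 30x target.
n = reads_needed(30, 100, 5_000_000)
print(n, mean_depth(n, 100, 5_000_000))
```

This average says nothing about evenness: real coverage varies across the genome, so some regions will fall below the target even when the mean reaches 30x.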