Reference mapping and variant detection Peter Tsai Bioinformatics Institute, University of Auckland.


1 Reference mapping and variant detection Peter Tsai Bioinformatics Institute, University of Auckland

2 Reference mapping
• Mapping is the process of comparing each read with the reference genome.
• Many software tools are available for reference mapping. They differ in options such as:
  ◦ Multiple placement of reads (multi-hits)
  ◦ Allow gaps
  ◦ Don't allow gaps at all
  ◦ Limits on the number of mismatches
• Assess your mapping results:
  ◦ % of total reads mapped
  ◦ % of uniquely mapped reads
  ◦ Coverage statistics, variance in depth
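The assessment figures listed on this slide can be computed from very little information. The sketch below is illustrative only: `mapping_summary`, its two inputs (the number of placements per read and a per-position depth list) and the sample numbers are assumptions made for the example, not the output format of any particular mapper.

```python
from statistics import mean, pvariance


def mapping_summary(hit_counts, depths):
    """Summarise a mapping run.

    hit_counts: placements per read (0 = unmapped, 1 = unique, >1 = multi-hit).
    depths: read depth at each reference position.
    Both inputs are hypothetical, simplified representations.
    """
    total = len(hit_counts)
    mapped = sum(1 for hits in hit_counts if hits >= 1)
    unique = sum(1 for hits in hit_counts if hits == 1)
    return {
        "pct_mapped": 100.0 * mapped / total,
        "pct_unique": 100.0 * unique / total,
        "mean_depth": mean(depths),
        "depth_variance": pvariance(depths),
    }


# Made-up data: 10 reads and 6 reference positions.
summary = mapping_summary([1, 1, 0, 3, 1, 2, 1, 0, 1, 1], [10, 12, 8, 30, 11, 9])
print(summary)
```

A high depth variance relative to the mean is one sign of uneven coverage.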

3 Mapped read depth

4 Variant detection
• Identification of point mutations and short insertions and deletions.
• We go through every column of the alignment and see how many alleles are found and how many differ from the reference genome.

Reference: ACGAAACGTAGTGAGGAC-GTA
Sample:    ACCAAACGTAGAGAGGACCGTA
             ^        ^      ^
(^ marks the differing columns: two SNPs, then one indel)
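The column-by-column comparison can be sketched in a few lines of Python using the two aligned sequences from this slide. The function name `find_variants` is hypothetical, and the sketch compares a single sample sequence with the reference rather than a full pileup of reads. Positions are alignment columns (1-based), not reference coordinates.

```python
def find_variants(reference, sample):
    """Walk two gapped, equal-length aligned sequences column by column."""
    if len(reference) != len(sample):
        raise ValueError("aligned sequences must have the same length")
    variants = []
    for column, (ref_base, alt_base) in enumerate(zip(reference, sample), start=1):
        if ref_base == alt_base:
            continue
        if ref_base == "-":
            kind = "insertion"   # base present in the sample only
        elif alt_base == "-":
            kind = "deletion"    # base present in the reference only
        else:
            kind = "SNP"
        variants.append((column, kind, ref_base, alt_base))
    return variants


variants = find_variants("ACGAAACGTAGTGAGGAC-GTA", "ACCAAACGTAGAGAGGACCGTA")
for variant in variants:
    print(variant)
```

Each gap column is reported separately here; merging adjacent gap columns into one longer indel is left out to keep the sketch short.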

5 Complexity of variant detection
• 2nd-generation sequencing is NOT single-molecule sequencing.
• Because of PCR amplification, some DNA fragments are sequenced more often than others, which results in uneven coverage across the genome.
• Duplicates provide false support for a variant, because we are usually more confident in variants that have higher coverage support.
• Solution: mark or remove exact duplicate reads when doing variant detection.
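One way to picture "mark exact duplicates" is the minimal sketch below. It assumes each read is reduced to a hypothetical `(start, strand, sequence)` tuple and treats two reads as duplicates only when all three fields match; production tools use more elaborate criteria, so this illustrates the idea rather than any specific tool's behaviour.

```python
def mark_duplicates(reads):
    """Return one flag per read: True if an identical read was seen earlier.

    reads: list of (start, strand, sequence) tuples (an assumed, simplified
    representation of mapped reads).
    """
    seen = set()
    flags = []
    for read in reads:
        flags.append(read in seen)  # the first copy stays unflagged
        seen.add(read)
    return flags


reads = [
    (100, "+", "ACGT"),
    (100, "+", "ACGT"),   # exact duplicate of the first read
    (100, "-", "ACGT"),   # same start and sequence, other strand
    (250, "+", "GGCA"),
    (100, "+", "ACGT"),   # another exact duplicate
]
flags = mark_duplicates(reads)
deduplicated = [read for read, is_dup in zip(reads, flags) if not is_dup]
print(flags)
print(deduplicated)
```

Marking rather than deleting keeps the original data intact, so the variant caller can decide whether to ignore the flagged reads.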

6 Complexity of variant detection
• Cloning process artifacts (e.g. PCR-induced mutations).
• Error rate associated with the sequence reads.
• Error rate associated with the mapping.
• Reliability of the reference genome.

7 Calling a variant
• A hard cut-off on the percentage of bases that differ from the reference base.
• 75% as the minimum threshold for a variant to be called a homozygous variant.
• A percentage-based cut-off assumes you have sufficient coverage.

Examples:
A (ref): 0%   G: 100%
A (ref): 7%   T: 93%
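The hard cut-off can be sketched as follows. The 75% threshold comes from the slide; the function name `call_genotype`, the `min_depth=10` guard and the return strings are assumptions added for illustration, the depth guard being there because a percentage is only meaningful with sufficient coverage.

```python
def call_genotype(base_counts, ref_base, min_fraction=0.75, min_depth=10):
    """Apply a hard percentage cut-off at one alignment column.

    base_counts: dict mapping base -> number of reads showing that base.
    min_depth is an assumed guard, not a value given on the slide.
    """
    depth = sum(base_counts.values())
    if depth < min_depth:
        return "no call (insufficient depth)"
    non_ref = {base: n for base, n in base_counts.items() if base != ref_base and n > 0}
    if not non_ref:
        return "reference"
    alt_base, alt_count = max(non_ref.items(), key=lambda item: item[1])
    if alt_count / depth >= min_fraction:
        return f"homozygous variant {ref_base}>{alt_base}"
    return "ambiguous"


# The two slide examples, expressed as counts out of 100 reads.
print(call_genotype({"A": 0, "G": 100}, "A"))
print(call_genotype({"A": 7, "T": 93}, "A"))
# Hypothetical cases: too few reads, and a split below the threshold.
print(call_genotype({"A": 0, "G": 3}, "A"))
print(call_genotype({"A": 45, "G": 55}, "A"))
```

With only three reads, 100% support is not convincing, which is why the sketch refuses to call that site.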

8 When to call a variant?
A: 18%   C: 0%   G: 55%   T: 27%

9 Alignment considerations
• Perform local realignment and calculate a mapping score for each candidate alignment to determine which one is better.

10 What depth do I need?

11

12 Factors to consider
• Read length
  ◦ Longer reads are more likely to be mapped with high confidence
• Sequencing depth
  ◦ Require sufficient depth, ~30x
• Base call quality of each supporting base
  ◦ Use high-quality bases, Q30
• Mapping quality
  ◦ Local realignment to improve variant calling
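To make the Q30 and ~30x figures concrete, the sketch below uses the standard Phred definition, Q = -10 log10(p), where p is the probability that a base call is wrong. The function names and the way the two thresholds are combined into a single check are assumptions for illustration, not a rule stated on the slide.

```python
def phred_to_error_probability(quality):
    """Convert a Phred quality score to the probability of a wrong base call."""
    return 10 ** (-quality / 10)


def usable_depth(base_qualities, min_quality=30):
    """Count the bases at one site whose quality is at least min_quality."""
    return sum(1 for quality in base_qualities if quality >= min_quality)


def site_is_callable(base_qualities, min_quality=30, min_depth=30):
    """Assumed rule: require about 30x depth from Q30-or-better bases."""
    return usable_depth(base_qualities, min_quality) >= min_depth


print(phred_to_error_probability(30))           # about 1 error in 1000 calls
print(usable_depth([35, 12, 30, 40, 28, 33]))   # bases at Q30 or above
print(site_is_callable([35] * 32 + [10] * 5))   # 32 good bases out of 37
print(site_is_callable([40] * 25))              # good bases, but only 25x
```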

