Haplotype Estimation Using Sequencing Reads

Slides:



Advertisements
Similar presentations
The Structure of Common Genetic Variation in United States Populations
Advertisements

Michael Dannemann, Janet Kelso  The American Journal of Human Genetics 
Maryam Sayadi, Seiichiro Tanizaki, Michael Feig  Biophysical Journal 
Comprehensively Evaluating cis-Regulatory Variation in the Human Prostate Transcriptome by Using Gene-Level Allele-Specific Expression  Nicholas B. Larson,
Marc A. Coram, Huaying Fang, Sophie I. Candille, Themistocles L
CYP3A Variation and the Evolution of Salt-Sensitivity Variants
Tracing the Route of Modern Humans out of Africa by Using 225 Human Genome Sequences from Ethiopians and Egyptians  Luca Pagani, Stephan Schiffels, Deepti.
Comparison of High-Resolution Melting Analysis, TaqMan Allelic Discrimination Assay, and Sanger Sequencing for Clopidogrel Efficacy Genotyping in Routine.
Introgression of Neandertal- and Denisovan-like Haplotypes Contributes to Adaptive Variation in Human Toll-like Receptors  Michael Dannemann, Aida M.
Model-free Estimation of Recent Genetic Relatedness
Daniel Greene, Sylvia Richardson, Ernest Turro 
Comparing Algorithms for Genotype Imputation
From Static to Dynamic Risk Prediction: Time Is Everything
Rare-Variant Extensions of the Transmission Disequilibrium Test: Application to Autism Exome Sequence Data  Zongxiao He, Brian J. O’Roak, Joshua D. Smith,
Alessia Ranciaro, Michael C. Campbell, Jibril B
Thomas Willems, Melissa Gymrek, G
Accuracy of Haplotype Frequency Estimation for Biallelic Loci, via the Expectation- Maximization Algorithm for Unphased Diploid Genotype Data  Daniele.
Improved Heritability Estimation from Genome-wide SNPs
Brian K. Maples, Simon Gravel, Eimear E. Kenny, Carlos D. Bustamante 
Rounak Dey, Ellen M. Schmidt, Goncalo R. Abecasis, Seunggeun Lee 
Alternative Splicing QTLs in European and African Populations
Weight Loss after Gastric Bypass Is Associated with a Variant at 15q26
Gene-Expression Variation Within and Among Human Populations
Arpita Ghosh, Fei Zou, Fred A. Wright 
Michael Dannemann, Janet Kelso  The American Journal of Human Genetics 
Evidence for Widespread Reticulate Evolution within Human Duplicons
Genomic Signatures of Selective Pressures and Introgression from Archaic Hominins at Human Innate Immunity Genes  Matthieu Deschamps, Guillaume Laval,
Variant Association Tools for Quality Control and Analysis of Large-Scale Sequence and Genotyping Array Data  Gao T. Wang, Bo Peng, Suzanne M. Leal  The.
A Flexible Bayesian Framework for Modeling Haplotype Association with Disease, Allowing for Dominance Effects of the Underlying Causative Variants  Andrew.
A Subset-Based Approach Improves Power and Interpretation for the Combined Analysis of Genetic Association Studies of Heterogeneous Traits  Samsiddhi.
Japanese Population Structure, Based on SNP Genotypes from 7003 Individuals Compared to Other Ethnic Groups: Effects on Population-Based Association Studies 
Volume 173, Issue 1, Pages e9 (March 2018)
Biased Allelic Expression in Human Primary Fibroblast Single Cells
Maximizing the Power of Principal-Component Analysis of Correlated Phenotypes in Genome-wide Association Studies  Hugues Aschard, Bjarni J. Vilhjálmsson,
CYP3A Variation and the Evolution of Salt-Sensitivity Variants
Sequencing the IL4 locus in African Americans implicates rare noncoding variants in asthma susceptibility  Gabe Haller, BA, Dara G. Torgerson, PhD, Carole.
Ivan P. Gorlov, Olga Y. Gorlova, Shamil R. Sunyaev, Margaret R
Matthieu Foll, Oscar E. Gaggiotti, Josephine T
Simultaneous Genotype Calling and Haplotype Phasing Improves Genotype Accuracy and Reduces False-Positive Associations for Genome-wide Association Studies 
Haplotypes at ATM Identify Coding-Sequence Variation and Indicate a Region of Extensive Linkage Disequilibrium  Penelope E. Bonnen, Michael D. Story,
Genotype Imputation with Millions of Reference Samples
E. Wang, Y. -C. Ding, P. Flodman, J. R. Kidd, K. K. Kidd, D. L
Accurate Non-parametric Estimation of Recent Effective Population Size from Segments of Identity by Descent  Sharon R. Browning, Brian L. Browning  The.
Rare-Variant Association Testing for Sequencing Data with the Sequence Kernel Association Test  Michael C. Wu, Seunggeun Lee, Tianxi Cai, Yun Li, Michael.
Highly Punctuated Patterns of Population Structure on the X Chromosome and Implications for African Evolutionary History  Charla A. Lambert, Caitlin F.
Complex Signatures of Natural Selection at the Duffy Blood Group Locus
Shuhua Xu, Wei Huang, Ji Qian, Li Jin 
Validation for Clinical Use of, and Initial Clinical Experience with, a Novel Approach to Population-Based Carrier Screening using High-Throughput, Next-Generation.
A Unified Approach to Genotype Imputation and Haplotype-Phase Inference for Large Data Sets of Trios and Unrelated Individuals  Brian L. Browning, Sharon.
A Fast, Powerful Method for Detecting Identity by Descent
Wei Pan, Il-Youp Kwak, Peng Wei  The American Journal of Human Genetics 
Jared R. Kohler, David J. Cutler 
Identifying Darwinian Selection Acting on Different Human APOL1 Variants among Diverse African Populations  Wen-Ya Ko, Prianka Rajan, Felicia Gomez, Laura.
L-GATOR: Genetic Association Testing for a Longitudinally Measured Quantitative Trait in Samples with Related Individuals  Xiaowei Wu, Mary Sara McPeek 
Trevor J. Pemberton, Chaolong Wang, Jun Z. Li, Noah A. Rosenberg 
Complex History of Admixture between Modern Humans and Neandertals
Long Runs of Homozygosity Are Enriched for Deleterious Variation
Are Variants in the CAPN10 Gene Related to Risk of Type 2 Diabetes
Yu Zhang, Tianhua Niu, Jun S. Liu 
A Definitive Haplotype Map as Determined by Genotyping Duplicated Haploid Genomes Finds a Predominant Haplotype Preference at Copy-Number Variation Events 
Jung-Ying Tzeng, Chih-Hao Wang, Jau-Tsuen Kao, Chuhsing Kate Hsiao 
Enhanced Localization of Genetic Samples through Linkage-Disequilibrium Correction  Yael Baran, Inés Quintela, Ángel Carracedo, Bogdan Pasaniuc, Eran Halperin 
External model validation of binary clinical risk prediction models in cardiovascular and thoracic surgery  Graeme L. Hickey, PhD, Eugene H. Blackstone,
Evaluating the Effects of Imputation on the Power, Coverage, and Cost Efficiency of Genome-wide SNP Platforms  Carl A. Anderson, Fredrik H. Pettersson,
Matthew A. Saunders, Jeffrey M. Good, Elizabeth C. Lawrence, Robert E
Genotype-Imputation Accuracy across Worldwide Human Populations
Zuoheng Wang, Mary Sara McPeek  The American Journal of Human Genetics 
Introgression of Neandertal- and Denisovan-like Haplotypes Contributes to Adaptive Variation in Human Toll-like Receptors  Michael Dannemann, Aida M.
Brian C. Verrelli, Sarah A. Tishkoff 
Gonçalo R. Abecasis, Janis E. Wigginton 
Presentation transcript:

Haplotype Estimation Using Sequencing Reads Olivier Delaneau, Bryan Howie, Anthony J. Cox, Jean-François Zagury, Jonathan Marchini  The American Journal of Human Genetics  Volume 93, Issue 4, Pages 687-696 (October 2013) DOI: 10.1016/j.ajhg.2013.09.002 Copyright © 2013 The American Society of Human Genetics Terms and Conditions

Figure 1 Phasing Accuracy on the High-Coverage Trio Parents’ Genotypes Phased Together with the 1000GP Genotypes The x axes show the amount of sequencing reads on the trio parents that was used by SHAPEIT2. The y axes show the accuracy measured on the trio parents by MSD on the left and SER on the right. Results of Beagle are shown with black dots. SHAPEIT2 was run with (orange line) and without (green line) use of the ∼4x sequencing reads on the 1000GP samples. The American Journal of Human Genetics 2013 93, 687-696DOI: (10.1016/j.ajhg.2013.09.002) Copyright © 2013 The American Society of Human Genetics Terms and Conditions

Figure 2 Results of Simulations by using Paired-End Reads with Varying Insert Size Distribution The x axis shows the sequence coverage and the y axis shows the mean switch distance. Beagle and SHAPEIT2 were run without any phase informative reads, and these results are plotted as gray and black points, respectively. The results are shown for different mean insert sizes: 300 bp insert (green), 500 bp insert (orange), 1,000 bp insert (blue), a mixture of 300/500/1,000 bp inserts (pink), 90% 500 bp and 10% 6 kb (yellow), 50% 500 bp and 50% 6 kb (light green), and distribution of insert sizes matched to the distribution of distances between heterozygote SNPs (light brown). The American Journal of Human Genetics 2013 93, 687-696DOI: (10.1016/j.ajhg.2013.09.002) Copyright © 2013 The American Society of Human Genetics Terms and Conditions

Figure 3 Results of Simulations by using Long Reads with High Error Rates The x axis shows the sequence coverage and the y axis shows the mean switch distance. Beagle and SHAPEIT2 were run without any phase informative reads (0×) and these results are plotted as gray and black points respectively. The results are shown for different lengths of reads: 100 bp reads with 300 bp insert (green), 5 kb single reads (orange), 10 kb single reads (blue), and 20 kb single reads (pink). Two different base error rates were used in the simulations: 15% (dashed line) and 4% (solid line). The American Journal of Human Genetics 2013 93, 687-696DOI: (10.1016/j.ajhg.2013.09.002) Copyright © 2013 The American Society of Human Genetics Terms and Conditions

Figure 4 Phasing Accuracy at Rare Variants by using Simulated Data The left panel shows the results for the short paired-end reads with varying insert sizes. The right panel shows the results for the long reads with 4% error rates. The x axis shows the minor allele count ranging form 1 (singleton) to 10. The y axis shows the SER relative to the closest heterozygous genotype taken at a site with a minor allele count greater than 10. Results for Beagle and SHAPEIT2 without using any reads are plotted as black and red lines respectively. The legends in the two panels provide details of the different simulations. The American Journal of Human Genetics 2013 93, 687-696DOI: (10.1016/j.ajhg.2013.09.002) Copyright © 2013 The American Society of Human Genetics Terms and Conditions

Figure 5 Proportion of Heterozygous Genotypes Covered by Phase Informative Reads (PIRs) in Simulated Data The x axis shows the sequence coverage while the y axis gives the proportion of heterozygous SNPs covered by PIRs. Results for short pair-end reads are shown on the left plot with read length of 100 bp and various insert sizes: 300 bp insert (green), 500 bp insert (orange), 1,000 bp insert (blue), a mixture of 300/500/1000 bp inserts (pink), and distribution of insert sizes matched to the distribution of distances between heterozygote SNPs (light brown). Results for long reads are shown on the right plot with various read lengths: 5 kb single reads (orange), 10 kb single reads (blue), and 20 kb single reads (pink). The American Journal of Human Genetics 2013 93, 687-696DOI: (10.1016/j.ajhg.2013.09.002) Copyright © 2013 The American Society of Human Genetics Terms and Conditions

Figure 6 Comparison of Methods for Single Sample Phasing by using a Reference Panel The plots show the performance of several experiments in which single samples were phased in terms of SER (left panel) and MSD (right panel) on the y axis. The x axis shows the coverage of the sequencing reads used on the single samples. The results of using SHAPEIT2 with an unphased 1000GP reference panel are shown as a blue line. The results of using SHAPEIT2 with the official 1000GP Phase 1 phased reference panel (denoted v1) are shown as a red line. The results of using SHAPEIT2 with our own version of the 1000GP Phase 1 phased reference panel (denoted v2) are shown as a green line. The use of Beagle is shown a black dot. The American Journal of Human Genetics 2013 93, 687-696DOI: (10.1016/j.ajhg.2013.09.002) Copyright © 2013 The American Society of Human Genetics Terms and Conditions