Robust Inference of Identity by Descent from Exome-Sequencing Data

Slides:



Advertisements
Similar presentations
The Structure of Common Genetic Variation in United States Populations
Advertisements

Pharmacogenetics: Implications of race and ethnicity on defining genetic profiles for personalized medicine  Victor E. Ortega, MD, Deborah A. Meyers,
Katarzyna Bryc, Eric Y. Durand, J
Genetic Landscape of Eurasia and “Admixture” in Uyghurs
Katarzyna Bryc, Eric Y. Durand, J
Population Genetic Structure of the People of Qatar
Introgression of Neandertal- and Denisovan-like Haplotypes Contributes to Adaptive Variation in Human Toll-like Receptors  Michael Dannemann, Aida M.
Model-free Estimation of Recent Genetic Relatedness
Inferring Identical-by-Descent Sharing of Sample Ancestors Promotes High-Resolution Relative Detection  Monica D. Ramstetter, Sushila A. Shenoy, Thomas.
The Genetic Legacy of Zoroastrianism in Iran and India: Insights into Population Structure, Gene Flow, and Selection  Saioa López, Mark G. Thomas, Lucy.
Volume 26, Issue 24, Pages (December 2016)
Alicia R. Martin, Christopher R. Gignoux, Raymond K
Estimating Kinship in Admixed Populations
Caroline Durrant, Krina T. Zondervan, Lon R
Genome-wide Analysis of Body Proportion Classifies Height-Associated Variants by Mechanism of Action and Implicates Genes Important for Skeletal Development 
Proportioning Whole-Genome Single-Nucleotide–Polymorphism Diversity for the Identification of Geographic Population Structure and Genetic Ancestry  Oscar.
Brian K. Maples, Simon Gravel, Eimear E. Kenny, Carlos D. Bustamante 
Alternative Splicing QTLs in European and African Populations
Elizabeth Theusch, Analabha Basu, Jane Gitschier 
Parisa Shooshtari, Hailiang Huang, Chris Cotsapas 
Genomic Signatures of Selective Pressures and Introgression from Archaic Hominins at Human Innate Immunity Genes  Matthieu Deschamps, Guillaume Laval,
Variant Association Tools for Quality Control and Analysis of Large-Scale Sequence and Genotyping Array Data  Gao T. Wang, Bo Peng, Suzanne M. Leal  The.
Ida Moltke, Matteo Fumagalli, Thorfinn S. Korneliussen, Jacob E
Towfique Raj, Manik Kuchroo, Joseph M
A Selection Operator for Summary Association Statistics Reveals Allelic Heterogeneity of Complex Traits  Zheng Ning, Youngjo Lee, Peter K. Joshi, James.
Guidelines for Large-Scale Sequence-Based Complex Trait Association Studies: Lessons Learned from the NHLBI Exome Sequencing Project  Paul L. Auer, Alex.
Japanese Population Structure, Based on SNP Genotypes from 7003 Individuals Compared to Other Ethnic Groups: Effects on Population-Based Association Studies 
Volume 173, Issue 1, Pages e9 (March 2018)
Sequencing the IL4 locus in African Americans implicates rare noncoding variants in asthma susceptibility  Gabe Haller, BA, Dara G. Torgerson, PhD, Carole.
Selection and Reduced Population Size Cannot Explain Higher Amounts of Neandertal Ancestry in East Asian than in European Human Populations  Bernard Y.
Ivan P. Gorlov, Olga Y. Gorlova, Shamil R. Sunyaev, Margaret R
Characteristics of Neutral and Deleterious Protein-Coding Variation among Individuals and Populations  Wenqing Fu, Rachel M. Gittelman, Michael J. Bamshad,
Catarina D. Campbell, Nick Sampas, Anya Tsalenko, Peter H
Mamoru Kato, Yusuke Nakamura, Tatsuhiko Tsunoda 
Matthieu Foll, Oscar E. Gaggiotti, Josephine T
Simultaneous Genotype Calling and Haplotype Phasing Improves Genotype Accuracy and Reduces False-Positive Associations for Genome-wide Association Studies 
A Genetic Landscape Reshaped by Recent Events: Y-Chromosomal Insights into Central Asia  Tatiana Zerjal, R. Spencer Wells, Nadira Yuldasheva, Ruslan Ruzibakiev,
A Three–Single-Nucleotide Polymorphism Haplotype in Intron 1 of OCA2 Explains Most Human Eye-Color Variation  David L. Duffy, Grant W. Montgomery, Wei.
Molecular Convergence of Neurodevelopmental Disorders
Pier Francesco Palamara, Laurent C. Francioli, Peter R
Accurate Non-parametric Estimation of Recent Effective Population Size from Segments of Identity by Descent  Sharon R. Browning, Brian L. Browning  The.
Highly Punctuated Patterns of Population Structure on the X Chromosome and Implications for African Evolutionary History  Charla A. Lambert, Caitlin F.
Dan-Yu Lin, Zheng-Zheng Tang  The American Journal of Human Genetics 
Shuhua Xu, Wei Huang, Ji Qian, Li Jin 
Pier Francesco Palamara, Todd Lencz, Ariel Darvasi, Itsik Pe’er 
Brent S. Pedersen, Aaron R. Quinlan 
Whole-Genome-Sequence-Based Haplotypes Reveal Single Origin of the Sickle Allele during the Holocene Wet Phase  Daniel Shriner, Charles N. Rotimi  The.
Leonardo Arbiza, Srikanth Gottipati, Adam Siepel, Alon Keinan 
James A. Lautenberger, J. Claiborne Stephens, Stephen J
Long Homozygous Chromosomal Segments in Reference Families from the Centre d'Étude du Polymorphisme Humain  Karl W. Broman, James L. Weber  The American.
A Fast, Powerful Method for Detecting Identity by Descent
Stephen Leslie, Peter Donnelly, Gil McVean 
A Genomewide Search Using an Original Pairwise Sampling Approach for Large Genealogies Identifies a New Locus for Total and Low-Density Lipoprotein Cholesterol.
Haplotype Diversity across 100 Candidate Genes for Inflammation, Lipid Metabolism, and Blood Pressure Regulation in Two Populations  Dana C. Crawford,
Human Population Genetic Structure and Inference of Group Membership
Selecting a Maximally Informative Set of Single-Nucleotide Polymorphisms for Association Analyses Using Linkage Disequilibrium  Christopher S. Carlson,
Volume 78, Issue 7, Pages (October 2010)
Identifying Darwinian Selection Acting on Different Human APOL1 Variants among Diverse African Populations  Wen-Ya Ko, Prianka Rajan, Felicia Gomez, Laura.
Trevor J. Pemberton, Chaolong Wang, Jun Z. Li, Noah A. Rosenberg 
Widespread Allelic Heterogeneity in Complex Traits
Complex History of Admixture between Modern Humans and Neandertals
Analysis of protein-coding genetic variation in 60,706 humans
Markers for Mapping by Admixture Linkage Disequilibrium in African American and Hispanic Populations  Michael W. Smith, James A. Lautenberger, Hyoung.
Abraham's Children in the Genome Era: Major Jewish Diaspora Populations Comprise Distinct Genetic Clusters with Shared Middle Eastern Ancestry  Gil Atzmon,
Sarah E. Medland, Dale R. Nyholt, Jodie N. Painter, Brian P
Genotype-Imputation Accuracy across Worldwide Human Populations
Zuoheng Wang, Mary Sara McPeek  The American Journal of Human Genetics 
Introgression of Neandertal- and Denisovan-like Haplotypes Contributes to Adaptive Variation in Human Toll-like Receptors  Michael Dannemann, Aida M.
Brian C. Verrelli, Sarah A. Tishkoff 
Population Genetic Structure of the People of Qatar
Presentation transcript:

Robust Inference of Identity by Descent from Exome-Sequencing Data Wenqing Fu, Sharon R. Browning, Brian L. Browning, Joshua M. Akey  The American Journal of Human Genetics  Volume 99, Issue 5, Pages 1106-1116 (November 2016) DOI: 10.1016/j.ajhg.2016.09.011 Copyright © 2016 American Society of Human Genetics Terms and Conditions

Figure 1 Evaluation and Implementation of ExIBD In the first step (identification), candidate IBD segments are detected for the whole genome using Beagle fastIBD with fastibdthreshold = 10−10. In the second step (refinement), endpoints of candidate IBD segments are refined with Beagle IBD with ibd2nonibd = 0.01 and nonibd2ibd = 0.0001. Finally, in the third step (filtering), IBD segments from the call set that are likely false positives using the locus-specific FDR are removed. The locus-specific FDR is estimated from the locus-specific power, false-positive rate, and the true IBD rate in the population. The American Journal of Human Genetics 2016 99, 1106-1116DOI: (10.1016/j.ajhg.2016.09.011) Copyright © 2016 American Society of Human Genetics Terms and Conditions

Figure 2 Feasibility of Detecting IBD Segments in Exome-Sequencing Data (A) The locus-specific detection power, evaluated by fastIBD with fastibdthreshold of 10−10 (in red) and Beagle IBD with ibd2nonibd = 0.01 and nonibd2ibd = 0.0001 (in blue), is heterogeneous across the genome. (B) Locus-specific power is significantly positively correlated with exon density (i.e., the proportion of sequence in the exons) and genetic diversity (as measured by nucleotide diversity) in exome-sequencing data. (C) The average locus-specific precision and recall (with 95% confidence interval) to detect IBD segments in exome-sequencing data by fastIBD and Beagle IBD under different parameter settings. All the evaluation results shown here were based on the detection of 2 cM IBD segments in chromosome 1. The American Journal of Human Genetics 2016 99, 1106-1116DOI: (10.1016/j.ajhg.2016.09.011) Copyright © 2016 American Society of Human Genetics Terms and Conditions

Figure 3 Performance of the Exome-Based IBD Detection Method ExIBD (A) The average locus-specific power (with 95% confidence interval) to detect IBD segments with size of 2 cM, 5 cM, and 10 cM after the first (identification) and second (refinement) steps. (B) The average locus-specific F-score measuring the tradeoff between precision and recall (with 95% confidence interval) after the first (identification) and second (refinement) steps. (C) The false-positive rate to detect IBD segments with different size intervals (e.g., 1.5–2.5 cM, 4.5–5.5 cM, and 9.5–10.5 cM) after the first (identification) and the second (refinement) steps. See Figure S4 for more results. The American Journal of Human Genetics 2016 99, 1106-1116DOI: (10.1016/j.ajhg.2016.09.011) Copyright © 2016 American Society of Human Genetics Terms and Conditions

Figure 4 Genomic Regions Can Be Used to Detect IBD Segments in Exome-Sequencing Data Proportion of genomic regions that can be used to detect IBD segments with different size intervals (e.g., 0.5–1.5 cM, 1.5–2.5 cM, …, 9.5–10.5 cM, and ≥10.5 cM) with a FDR < 0.1 for the three major continental populations (i.e., African, European, and East Asian populations). The American Journal of Human Genetics 2016 99, 1106-1116DOI: (10.1016/j.ajhg.2016.09.011) Copyright © 2016 American Society of Human Genetics Terms and Conditions

Figure 5 Performance of GERMLINE under Different Parameter Settings (A) The false-positive rate to detect IBD segments with different size intervals (e.g., 1.5–2.5 cM, 4.5–5.5 cM, and 9.5–10.5 cM). (B) The average locus-specific power (with 95% confidence interval) to detect IBD segments with size of 2 cM, 5 cM, and 10 cM. GERMLINE ran by setting the size of slice as 256, 128, 64, and 8, the maximum number of mismatching homozygous markers for a slice as 2, and turn-off of haplotype extension. See Figures S5 and S6 for more results. For comparison, the performance of ExIBD under the default setting (fastibdthreshold = 10−10, ibd2nonibd = 0.01, and nonibd2ibd = 0.0001) was shown in red line. A critical line, any false-positive rates above which can not control FDR < 0.1 even assuming the detection power is as high as 100%, was shown in black line. The American Journal of Human Genetics 2016 99, 1106-1116DOI: (10.1016/j.ajhg.2016.09.011) Copyright © 2016 American Society of Human Genetics Terms and Conditions

Figure 6 Delineating Fine-Scale Population Structure in 6,515 US Individuals (A) The relationship of IBD sharing among individuals summarized as a network, where each node represents an individual with the size proportional to the number of IBD segments shared between this individual and others, and each edge connects two nodes when individuals share IBD segments with the transparency proportional to the maximum IBD segment size. The software conf-informap identified two significant clusters (i.e., cluster 1 in blue and cluster 2 in brown). The significant subset of nodes within each cluster that are clustered together in at least 95% of all bootstrap networks is shown in darker colors. The significant subset of cluster 1 is a good representation of African Americans. The significant subset of cluster 2 represents a cryptic population structure in European Americans, likely with Ashkenazi Jewish ancestry. (B) Principal-component analysis of 6,515 US individuals based on common (MAF ≥ 0.1) and rare (MAF ≤ 0.005) variants. Individuals were colored as in (A). A small number of individuals with Ashkenazi Jewish ancestry, who are likely with higher admixture contribution from Europeans and clustered with other European Americans in the IBD networks with larger segment size, are highlighted in black. (C) Changes in the structure of IBD networks as a function of IBD size categories. Patterns of IBD sharing with different size intervals [0.5, 1.5), [1.5, 2.5), [2.5, 3.5), [3.5, 4.5), and ≥4.5 cM were separately shown by the networks. For each network, the node represents a cluster of individuals identified by conf-infomap, with the size proportional to the number of individuals in this cluster. The pie plot in each node represents the ancestry composition for individuals from the cluster, with the size proportional to the information flow contributed by each individual. An edge connects two nodes (clusters) if any individual pairs from these two clusters share IBD segments, with the transparency proportional to information flow between the clusters. The American Journal of Human Genetics 2016 99, 1106-1116DOI: (10.1016/j.ajhg.2016.09.011) Copyright © 2016 American Society of Human Genetics Terms and Conditions