Evaluating the Effects of Imputation on the Power, Coverage, and Cost Efficiency of Genome-wide SNP Platforms  Carl A. Anderson, Fredrik H. Pettersson,

Slides:



Advertisements
Similar presentations
Michael Dannemann, Janet Kelso  The American Journal of Human Genetics 
Advertisements

Rare, Low-Frequency, and Common Variants in the Protein-Coding Sequence of Biological Candidate Genes from GWASs Contribute to Risk of Rheumatoid Arthritis 
Three Genome-wide Association Studies and a Linkage Analysis Identify HERC2 as a Human Iris Color Gene  Manfred Kayser, Fan Liu, A. Cecile J.W. Janssens,
Marc A. Coram, Huaying Fang, Sophie I. Candille, Themistocles L
Claudio Verzilli, Tina Shah, Juan P
Chromosomal Microarray Detection of Constitutional Copy Number Variation Using Saliva DNA  Jennifer Reiner, Lisa Karger, Ninette Cohen, Lakshmi Mehta,
Introgression of Neandertal- and Denisovan-like Haplotypes Contributes to Adaptive Variation in Human Toll-like Receptors  Michael Dannemann, Aida M.
2016 Curt Stern Award Address: From Rare to Common Diseases: Translating Genetic Discovery to Therapy1  Brendan Lee  The American Journal of Human Genetics 
Comparing Algorithms for Genotype Imputation
Haplotype Estimation Using Sequencing Reads
Estimating Kinship in Admixed Populations
Caroline Durrant, Krina T. Zondervan, Lon R
Genetic variations associated with diabetic nephropathy and type II diabetes in a Japanese population  S. Maeda, N. Osawa, T. Hayashi, S. Tsukada, M.
Deep Whole-Genome Sequencing of 100 Southeast Asian Malays
Improved Heritability Estimation from Genome-wide SNPs
Brian K. Maples, Simon Gravel, Eimear E. Kenny, Carlos D. Bustamante 
Rounak Dey, Ellen M. Schmidt, Goncalo R. Abecasis, Seunggeun Lee 
Elizabeth Theusch, Analabha Basu, Jane Gitschier 
Gene-Expression Variation Within and Among Human Populations
Biased Gene Conversion Skews Allele Frequencies in Human Populations, Increasing the Disease Burden of Recessive Alleles  Joseph Lachance, Sarah A. Tishkoff 
10 Years of GWAS Discovery: Biology, Function, and Translation
Arpita Ghosh, Fei Zou, Fred A. Wright 
Michael Dannemann, Janet Kelso  The American Journal of Human Genetics 
Jingjing Li, Xiumei Hong, Sam Mesiano, Louis J
Multilocus Bayesian Meta-Analysis of Gene-Disease Associations
Genomic Signatures of Selective Pressures and Introgression from Archaic Hominins at Human Innate Immunity Genes  Matthieu Deschamps, Guillaume Laval,
A Flexible Bayesian Framework for Modeling Haplotype Association with Disease, Allowing for Dominance Effects of the Underlying Causative Variants  Andrew.
Imputation of KIR Types from SNP Variation Data
An Excess of Risk-Increasing Low-Frequency Variants Can Be a Signal of Polygenic Inheritance in Complex Diseases  Yingleong Chan, Elaine T. Lim, Niina.
Sequencing the IL4 locus in African Americans implicates rare noncoding variants in asthma susceptibility  Gabe Haller, BA, Dara G. Torgerson, PhD, Carole.
Ivan P. Gorlov, Olga Y. Gorlova, Shamil R. Sunyaev, Margaret R
Wei Zhang, Shiwei Duan, Emily O. Kistner, Wasim K. Bleibel, R
Sang Hong Lee, Naomi R. Wray, Michael E. Goddard, Peter M. Visscher 
The Genetics of Major Depression
Structural Architecture of SNP Effects on Complex Traits
Simultaneous Genotype Calling and Haplotype Phasing Improves Genotype Accuracy and Reduces False-Positive Associations for Genome-wide Association Studies 
Genotype Imputation with Millions of Reference Samples
Christoph Lange, Nan M. Laird  The American Journal of Human Genetics 
A Variant in LIN28B Is Associated with 2D:4D Finger-Length Ratio, a Putative Retrospective Biomarker of Prenatal Testosterone Exposure  Sarah E. Medland,
10 Years of GWAS Discovery: Biology, Function, and Translation
E. Wang, Y. -C. Ding, P. Flodman, J. R. Kidd, K. K. Kidd, D. L
Jon Wakefield  The American Journal of Human Genetics 
CAG Expansion in the Huntington Disease Gene Is Associated with a Specific and Targetable Predisposing Haplogroup  Simon C. Warby, Alexandre Montpetit,
Five Years of GWAS Discovery
Rare-Variant Association Testing for Sequencing Data with the Sequence Kernel Association Test  Michael C. Wu, Seunggeun Lee, Tianxi Cai, Yun Li, Michael.
No Gene Is an Island: The Flip-Flop Phenomenon
Privacy Risks from Genomic Data-Sharing Beacons
Highly Punctuated Patterns of Population Structure on the X Chromosome and Implications for African Evolutionary History  Charla A. Lambert, Caitlin F.
Shuhua Xu, Wei Huang, Ji Qian, Li Jin 
Daniel J. Schaid, Shannon K. McDonnell, Scott J. Hebbring, Julie M
Benjamin A. Rybicki, Robert C. Elston 
A Unified Approach to Genotype Imputation and Haplotype-Phase Inference for Large Data Sets of Trios and Unrelated Individuals  Brian L. Browning, Sharon.
Gad Kimmel, Ron Shamir  The American Journal of Human Genetics 
Stephen Leslie, Peter Donnelly, Gil McVean 
Increasing the Power and Efficiency of Disease-Marker Case-Control Association Studies through Use of Allele-Sharing Information  Tasha E. Fingerlin,
Wei Pan, Il-Youp Kwak, Peng Wei  The American Journal of Human Genetics 
Jared R. Kohler, David J. Cutler 
Selecting a Maximally Informative Set of Single-Nucleotide Polymorphisms for Association Analyses Using Linkage Disequilibrium  Christopher S. Carlson,
L-GATOR: Genetic Association Testing for a Longitudinally Measured Quantitative Trait in Samples with Related Individuals  Xiaowei Wu, Mary Sara McPeek 
Trevor J. Pemberton, Chaolong Wang, Jun Z. Li, Noah A. Rosenberg 
Deleterious- and Disease-Allele Prevalence in Healthy Individuals: Insights from Current Predictions, Mutation Databases, and Population-Scale Resequencing 
Long Runs of Homozygosity Are Enriched for Deleterious Variation
Yu Zhang, Tianhua Niu, Jun S. Liu 
A Definitive Haplotype Map as Determined by Genotyping Duplicated Haploid Genomes Finds a Predominant Haplotype Preference at Copy-Number Variation Events 
Jung-Ying Tzeng, Chih-Hao Wang, Jau-Tsuen Kao, Chuhsing Kate Hsiao 
Iuliana Ionita-Laza, Seunggeun Lee, Vlad Makarov, Joseph D
Sarah E. Medland, Dale R. Nyholt, Jodie N. Painter, Brian P
Genotype-Imputation Accuracy across Worldwide Human Populations
Zuoheng Wang, Mary Sara McPeek  The American Journal of Human Genetics 
Introgression of Neandertal- and Denisovan-like Haplotypes Contributes to Adaptive Variation in Human Toll-like Receptors  Michael Dannemann, Aida M.
Presentation transcript:

Evaluating the Effects of Imputation on the Power, Coverage, and Cost Efficiency of Genome-wide SNP Platforms  Carl A. Anderson, Fredrik H. Pettersson, Jeffrey C. Barrett, Joanna J. Zhuang, Jiannis Ragoussis, Lon R. Cardon, Andrew P. Morris  The American Journal of Human Genetics  Volume 83, Issue 1, Pages 112-119 (July 2008) DOI: 10.1016/j.ajhg.2008.06.008 Copyright © 2008 The American Society of Human Genetics Terms and Conditions

Figure 1 Minor-Allele Frequency of SNPs Directly Genotyped in 1,376 Samples from the 58C (A) Minor-allele frequency for the 450,769 SNPs that are featured on the HumanHap 550 but not the Affymetrix SNP Array 5.0 and are also polymorphic in the 58C. (B) Minor-allele frequency for the subset of 427,839 SNPs from (A) that are also polymorphic in the CEU HapMap data. (C) Minor-allele frequency for the 215,998 that are featured on the HumanHap 550 but not the Illumina HumanHap 300∗ and are also polymorphic in the 58C. (D) Minor-allele frequency for the subset of 203,860 SNPs from (C) that are also polymorphic in the CEU HapMap data. Basing imputations on haplotype data from the HapMap causes variation at rare SNPs (MAF ≤ 0.02) to be lost. The American Journal of Human Genetics 2008 83, 112-119DOI: (10.1016/j.ajhg.2008.06.008) Copyright © 2008 The American Society of Human Genetics Terms and Conditions

Figure 2 Assessment of Imputed-Genotype Filtering Criteria Assessment of filtering criteria for the Illumina HumanHap 550 genotypes based on Affymetrix SNP Array 5.0 (A–C) and Illumina HumanHap300∗ (D–F) genotype data. (A and D) The number of SNPs passing filter thresholds based on per-SNP measures of mean maximum posterior probability (blue) or genotype call rate (red). The number of these SNPs with an r2 ≥ 0.8 between direct and imputed genotype calls is shown after the removal of SNPs not passing filtering thresholds based on per-SNP measures of mean maximum posterior probability (dark gray) and genotype call rate (light gray). (B and E) The PLS-DA Q2 value after the removal of SNPs not passing filtering thresholds based on per-SNP measures of mean maximum posterior probability (blue) and genotype call rate (red). A Q2 value of 1 indicates that the current PLS model can perfectly predict whether a given genotype vector is of direct or imputed origin. A Q2 of 0 indicates that the model has no power to predict the genotype's origin. (C and F) Mean r2 between direct and imputed genotypes after the removal of SNPs not passing filtering thresholds based on per-SNP measures of mean maximum posterior probability (blue) and genotype call rate (red). The American Journal of Human Genetics 2008 83, 112-119DOI: (10.1016/j.ajhg.2008.06.008) Copyright © 2008 The American Society of Human Genetics Terms and Conditions

Figure 3 Mean power to Detect Association to a Disease with a Fixed Baseline Sample Size Mean power to detect association (α = 10−5) to a disease with a population prevalence of 0.0001 and a fixed baseline sample size across different genome-wide platforms (simulated under varying risk allele frequency [RAF] and sample size). RAF ranges are as follows: (A–C) 0.05 ≤ RAF < 0.10; (D–F) 0.10 ≤ RAF < 0.20; (G and H) 0.20 ≤ RAF ≤ 0.50. Cases and controls are as follows: (A, D, and G) 1,000 cases, 1,000 controls; (B, E, and F) 2,000 cases, 2,000 controls; (C, F, and I) 5,000 cases, 5,000 controls. Mean power was calculated after 10,000 simulations where sample size per simulation for each SNP set was weighted by the maximum r2 between a randomly selected HapMap SNP (satisfying RAF constraints) and the SNPs on the given genotyping platform (with HapMap release 21 CEU data). The American Journal of Human Genetics 2008 83, 112-119DOI: (10.1016/j.ajhg.2008.06.008) Copyright © 2008 The American Society of Human Genetics Terms and Conditions

Figure 4 Mean Power to Detect Association to a Disease Where Baseline Sample Size Has Been Varied across Genome-wide SNP Platforms to Reflect Relative Cost Mean power to detect association (α = 10−5) to a disease with a population prevalence of 0.0001 where baseline sample size has been varied across genome-wide SNP platforms to reflect the genotyping cost per sample (sample-size ratios: SNP Array 5.0 = 1.22; HumanHap 300 = 1.32; HumanHap 550 = 1; SNP Array 6.0 = 0.99; HumanHap 1M = 0.57). RAF ranges are as follows: (A–C) 0.05 ≤ RAF < 0.10; (D–F) 0.10 ≤ RAF < 0.20; (G and H) 0.20 ≤ RAF ≤ 0.50. Cases and controls are as follows: (A, D, and G) 1,000 cases, 1,000 controls; (B, E, and F) 2,000 cases, 2,000 controls; (C, F, and I) 5000 cases, 5,000 controls. Mean power was calculated after 10,000 simulations where sample size per simulation for each SNP set was weighted by the maximum r2 between a randomly selected HapMap SNP (satisfying RAF constraints) and the SNPs on the given genotyping platform (with HapMap release 21 CEU data). The American Journal of Human Genetics 2008 83, 112-119DOI: (10.1016/j.ajhg.2008.06.008) Copyright © 2008 The American Society of Human Genetics Terms and Conditions