Selecting a Maximally Informative Set of Single-Nucleotide Polymorphisms for Association Analyses Using Linkage Disequilibrium  Christopher S. Carlson,

Slides:



Advertisements
Similar presentations
SNP Selection University of Louisville Center for Genetics and Molecular Medicine January 10, 2008 Dana Crawford, PhD Vanderbilt University Center for.
Advertisements

A single-nucleotide polymorphism tagging set for human drug metabolism and transport Kourosh R Ahmadi, Mike E Weale, Zhengyu Y Xue, Nicole Soranzo, David.
The Structure of Common Genetic Variation in United States Populations
Extensive Linkage Disequilibrium in Small Human Populations in Eurasia
Michael Dannemann, Janet Kelso  The American Journal of Human Genetics 
Analysis of Positive Selection at Single Nucleotide Polymorphisms Associated with Body Mass Index Does Not Support the “Thrifty Gene” Hypothesis  Guanlin.
Itsik Pe’er, Yves R. Chretien, Paul I. W. de Bakker, Jeffrey C
Power to detect QTL Association
CYP3A Variation and the Evolution of Salt-Sensitivity Variants
Lipopolysaccharide binding protein promoter variants influence the risk for Gram-negative bacteremia and mortality after allogeneic hematopoietic cell.
Population Genetic Structure of the People of Qatar
Introgression of Neandertal- and Denisovan-like Haplotypes Contributes to Adaptive Variation in Human Toll-like Receptors  Michael Dannemann, Aida M.
Henry R. Johnston, David J. Cutler 
Polycystic ovary syndrome: an ancient disorder?
Linkage Thresholds for Two-stage Genome Scans
Caroline Durrant, Krina T. Zondervan, Lon R
Alessia Ranciaro, Michael C. Campbell, Jibril B
David H. Spencer, Kerry L. Bubb, Maynard V. Olson 
Accuracy of Haplotype Frequency Estimation for Biallelic Loci, via the Expectation- Maximization Algorithm for Unphased Diploid Genotype Data  Daniele.
Gene-Expression Variation Within and Among Human Populations
Biased Gene Conversion Skews Allele Frequencies in Human Populations, Increasing the Disease Burden of Recessive Alleles  Joseph Lachance, Sarah A. Tishkoff 
Michael Dannemann, Janet Kelso  The American Journal of Human Genetics 
Structural Analysis of Insulin Minisatellite Alleles Reveals Unusually Large Differences in Diversity between Africans and Non-Africans  John D.H. Stead,
A Flexible Bayesian Framework for Modeling Haplotype Association with Disease, Allowing for Dominance Effects of the Underlying Causative Variants  Andrew.
Complex Role of TNF Variants in Psoriatic Arthritis and Treatment Response to Anti- TNF Therapy: Evidence and Concepts  Ulrike Hüffmeier, Rotraut Mössner 
Guidelines for Large-Scale Sequence-Based Complex Trait Association Studies: Lessons Learned from the NHLBI Exome Sequencing Project  Paul L. Auer, Alex.
CYP3A Variation and the Evolution of Salt-Sensitivity Variants
Haplotype Fine Mapping by Evolutionary Trees
Systematic Linkage Disequilibrium Analysis of SLC12A8 at PSORS5 Confirms a Role in Susceptibility to Psoriasis Vulgaris  Ulrike Hüffmeier, Jesús Lascorz,
Sequencing the IL4 locus in African Americans implicates rare noncoding variants in asthma susceptibility  Gabe Haller, BA, Dara G. Torgerson, PhD, Carole.
Ivan P. Gorlov, Olga Y. Gorlova, Shamil R. Sunyaev, Margaret R
Volume 9, Issue 12, Pages (December 2016)
Ellen E. Quillen, Mark D. Shriver  Journal of Investigative Dermatology 
Structural Architecture of SNP Effects on Complex Traits
Matthieu Foll, Oscar E. Gaggiotti, Josephine T
Haplotypes at ATM Identify Coding-Sequence Variation and Indicate a Region of Extensive Linkage Disequilibrium  Penelope E. Bonnen, Michael D. Story,
Analysis of Positive Selection at Single Nucleotide Polymorphisms Associated with Body Mass Index Does Not Support the “Thrifty Gene” Hypothesis  Guanlin.
Christoph Lange, Nan M. Laird  The American Journal of Human Genetics 
A Three–Single-Nucleotide Polymorphism Haplotype in Intron 1 of OCA2 Explains Most Human Eye-Color Variation  David L. Duffy, Grant W. Montgomery, Wei.
E. Wang, Y. -C. Ding, P. Flodman, J. R. Kidd, K. K. Kidd, D. L
CAG Expansion in the Huntington Disease Gene Is Associated with a Specific and Targetable Predisposing Haplogroup  Simon C. Warby, Alexandre Montpetit,
Are Interactions between cis-Regulatory Variants Evidence for Biological Epistasis or Statistical Artifacts?  Alexandra E. Fish, John A. Capra, William.
Highly Punctuated Patterns of Population Structure on the X Chromosome and Implications for African Evolutionary History  Charla A. Lambert, Caitlin F.
Complex Signatures of Natural Selection at the Duffy Blood Group Locus
Shuhua Xu, Wei Huang, Ji Qian, Li Jin 
Jonathan K. Pritchard, Joseph K. Pickrell, Graham Coop  Current Biology 
A modest but significant effect of CGB5 gene promoter polymorphisms in modulating the risk of recurrent miscarriage  Kristiina Rull, M.D., Ph.D., Ole.
A Unified Approach to Genotype Imputation and Haplotype-Phase Inference for Large Data Sets of Trios and Unrelated Individuals  Brian L. Browning, Sharon.
Leonardo Arbiza, Srikanth Gottipati, Adam Siepel, Alon Keinan 
James A. Lautenberger, J. Claiborne Stephens, Stephen J
Extensive Linkage Disequilibrium in Small Human Populations in Eurasia
Haplotype Diversity across 100 Candidate Genes for Inflammation, Lipid Metabolism, and Blood Pressure Regulation in Two Populations  Dana C. Crawford,
Evolutionary History of the ADRB2 Gene in Humans
Identifying Darwinian Selection Acting on Different Human APOL1 Variants among Diverse African Populations  Wen-Ya Ko, Prianka Rajan, Felicia Gomez, Laura.
L-GATOR: Genetic Association Testing for a Longitudinally Measured Quantitative Trait in Samples with Related Individuals  Xiaowei Wu, Mary Sara McPeek 
Genome-Wide Association Study of Generalized Vitiligo in an Isolated European Founder Population Identifies SMOC2, in Close Proximity to IDDM8   Stanca.
Genetic Polymorphisms in the Polycomb Group Gene EZH2 and the Risk of Lung Cancer  Kyong-Ah Yoon, MD, Hye Jin Gil, MS, Jihye Han, MS, Jaehee Park, MS,
Stephen Wooding, Un-kyung Kim, Michael J
Tao Wang, Robert C. Elston  The American Journal of Human Genetics 
Sarah E. Medland, Dale R. Nyholt, Jodie N. Painter, Brian P
Enhanced Localization of Genetic Samples through Linkage-Disequilibrium Correction  Yael Baran, Inés Quintela, Ángel Carracedo, Bogdan Pasaniuc, Eran Halperin 
Evaluating the Effects of Imputation on the Power, Coverage, and Cost Efficiency of Genome-wide SNP Platforms  Carl A. Anderson, Fredrik H. Pettersson,
Lactase Haplotype Diversity in the Old World
Genotype-Imputation Accuracy across Worldwide Human Populations
Volume 23, Issue 11, Pages (November 2015)
Introgression of Neandertal- and Denisovan-like Haplotypes Contributes to Adaptive Variation in Human Toll-like Receptors  Michael Dannemann, Aida M.
Michael P. Epstein, Richard Duncan, Erin B. Ware, Min A
Brian C. Verrelli, Sarah A. Tishkoff 
Beyond GWASs: Illuminating the Dark Road from Association to Function
Population Genetic Structure of the People of Qatar
Presentation transcript:

Selecting a Maximally Informative Set of Single-Nucleotide Polymorphisms for Association Analyses Using Linkage Disequilibrium  Christopher S. Carlson, Michael A. Eberle, Mark J. Rieder, Qian Yi, Leonid Kruglyak, Deborah A. Nickerson  The American Journal of Human Genetics  Volume 74, Issue 1, Pages 106-120 (January 2004) DOI: 10.1086/381000 Copyright © 2004 The American Society of Human Genetics Terms and Conditions

Figure 1 Common variation and LD in European Americans at BDKRB2. At the BDKRB2 gene, 22 SNPs with MAF>10% were described for the European American samples (A). Patterns of genotype at each SNP are shown as a visual genotype plot, in which each column represents a site and each row represents a sample. Genotype is color coded, as shown, with SNPs presented in the order they were identified across the gene. Patterns of genotype are clearly similar for many SNPs (e.g., sites 10922 and 12574) but not necessarily for adjacent SNPs. The same data are shown in panel B, with the order of SNPs rearranged such that each SNP is adjacent to SNPs with similar patterns of genotype. Among the 22 SNPs, the LD-based SNP-selection algorithm identified five bins of tagSNPs at an r2 threshold of 0.5. tagSNP bins are boxed (B). The LD statistic r2 describes the similarity of pattern between pairs of polymorphic sites: pairwise r2 between SNPs is shown for the same order of SNPs as in panel B, and bins of SNPs with similar patterns are visible as reddish triangles above the diagonal (C). The American Journal of Human Genetics 2004 74, 106-120DOI: (10.1086/381000) Copyright © 2004 The American Society of Human Genetics Terms and Conditions

Figure 2 tagSNPs per gene, with threshold r2>0.5 and MAF>10%. The complete genomic region of 100 genes was resequenced in 24 unrelated African American and 23 unrelated European American samples. Within each population, tagSNPs were selected from all SNPs with MAF>10% at an r2 threshold of 0.5. A, The number of tagSNPs selected in each gene under these parameters, plotted against the size of the genomic region for each gene. Although there is a clear trend toward more tagSNPs in larger genes, there is considerable variance in the required tagSNP density in both populations. B, The number of tagSNPs selected in each gene, plotted against nucleotide diversity (π) per base pair. Thus, variance in tagSNP density between genes reflects both variation in nucleotide diversity and variation in the average extent of LD within genes. Within each gene, a greater number of tagSNPs is generally required in the African American population, reflecting both greater nucleotide diversity and shorter range LD, relative to the European American population. The American Journal of Human Genetics 2004 74, 106-120DOI: (10.1086/381000) Copyright © 2004 The American Society of Human Genetics Terms and Conditions

Figure 3 Total tagSNP bins in 100 genes, versus threshold r2. At each r2 threshold, tagSNP bins were identified for 100 genes within African American (“AA tagSNPs”) and European American (“EA tagSNPs”) populations. As expected, more tagSNP bins were identified in African American samples than in European American samples. To measure the effects of population stratification on the LD-select algorithm, tagSNPs were also selected from merged African American and European American populations (“Merged tagSNPs”). The minimal set of tagSNPs relevant to both populations was also assembled at each r2 threshold as the union of the tagSNP sets selected in each subpopulation (“Optimal tagSNPs”); this set was larger than the tagSNP set in either subpopulation alone but considerably smaller than the sum of the population-specific site sets, reflecting the fact that many (but not all) tagSNPs were useful in both populations. The American Journal of Human Genetics 2004 74, 106-120DOI: (10.1086/381000) Copyright © 2004 The American Society of Human Genetics Terms and Conditions

Figure 4 The relationship between LD-selected tagSNPs and haplotypes. For each gene, haplotypes were inferred computationally. Results are shown as the fraction of haplotypes resolved using only LD-selected tagSNPs, relative to haplotypes resolved using all common SNPs. Results are shown across a range of r2 values in each population (A). The effective number of haplotypes weights the number of haplotypes by frequency, with common haplotypes more heavily weighted. For each gene, the fraction of effective haplotypes resolved using only LD-selected tagSNPs, relative to effective haplotypes resolved, by use of all common SNPs is shown across a range of r2 values in each population (B). For r2 thresholds >0.5, >80% of effective haplotypes were resolved, demonstrating how, at adequately stringent r2 thresholds, LD-selected tagSNPs efficiently resolve common haplotypes. The American Journal of Human Genetics 2004 74, 106-120DOI: (10.1086/381000) Copyright © 2004 The American Society of Human Genetics Terms and Conditions

Figure 5 tagSNP bins and the evolutionary relationships between haplotypes. A hypothetical nonrecombinant region with five existing haplotypes is shown, with each row (A–E) representing a haplotype and each column (1–7) representing an SNP with a unique pattern of alleles. The common allele is shown as blue and the rare allele as yellow. There are five possible patterns (1–5) that are haplotype specific, and two (6 and 7) that are specific to clades of related haplotypes. LD-based tagSNP selection at an adequately stringent r2 threshold would identify all seven patterns in this hypothetical region. Thus, directly testing LD-selected tagSNPs can identify disease associations with either specific haplotypes or with clades of related haplotypes. The American Journal of Human Genetics 2004 74, 106-120DOI: (10.1086/381000) Copyright © 2004 The American Society of Human Genetics Terms and Conditions