A Fine Mapping Theorem to Refine Results from Association Genetics Studies
S.J. Schrodi, V.E. Garcia, C.M. Rowland
Celera, Alameda, CA


ABSTRACT

Much remains to be explained about human genetic architecture and the specific variants underlying important traits such as disease phenotypes, both of which are critical to successful fine mapping following GWAS. High-density mapping and inference of susceptibility variants rely heavily on the positional pattern of disease association peaks. In this work we describe the decay curve of association patterns caused by declining LD from a causative site. Under a variety of disease models, we show that the central tendency approximation $\chi^2_M \sim r^2 \chi^2_D$ holds, where $\chi^2_M$ and $\chi^2_D$ are the chi-square association statistics at a marker and at the disease-causing site, respectively, and $r^2$ is the standard measure of LD between the two sites. We use the phrase "fine mapping theorem" for this approximation to underscore its potential utility in discovering specific variants underlying traits studied in very high density mapping studies. Monte Carlo simulations were used to characterize the error in the approximation. These results showed that the maximum mean squared error is a concave function of $r^2$, peaking at intermediate levels of $r^2$ across all the disease models screened.

Next, given a potential causative polymorphism and several closely linked sites with disease association data, a method was developed to quantify the departure from the fine mapping theorem. Calculating this departure metric for all SNPs in an associated region gives a measure of correspondence with the fine mapping theorem for each polymorphism, and enables one to determine the most likely disease-causing variants (i.e. those with the smallest departure metric) under a theoretical model (i.e. a single disease-predisposing variant and numerous closely linked markers associated with disease solely through LD).

Lastly, we applied these approaches to previously published fine mapping datasets for type 1 diabetes (IL2RA region, 305 SNPs) and rheumatoid arthritis (TRAF1 region, 138 SNPs). In both datasets, single SNPs with the highest correspondence to the theoretical association decay patterns were identified. Conversely, SNPs whose chi-square values deviate from those expected under the theorem may constitute additional susceptibility polymorphisms in the region studied. Similar applications of this fine mapping theorem may prove to be a pragmatic approach to delineating the genes, gene regions, or functional motifs responsible for disease etiology after initial genetic results from GWAS.

THEORETICAL RESULTS

Justification of Fine Mapping Theorem

Following the Fundamental Theorem of the HapMap originally described in Lai et al. (1994), the derivation of the Fine Mapping Theorem directly follows: (the derivation is missing from this transcript)

Two-locus disease models were analytically modeled and simulated under additive, multiplicative, recessive and dominant effects. The results demonstrate the wide-ranging applicability of the fine mapping theorem across disease models and aid in characterizing the error in the approximation.

Figure 1. Performance Under Disease Models. The decay of disease association (as measured by a P-value) at a marker as a function of decreasing LD with the disease-causing site. Three different instances of each of the classic disease models were evaluated. Decay patterns approximately follow a linear relationship between log P and $r^2$.

Figure 2. Simulation Results. The figure shows that the central tendency of the chi-square statistic at the marker is closely approximated by the product of the $r^2$ value and the chi-square association statistic at the disease site. The two-locus simulation was performed under Haldane recombination and an additive disease model at one SNP. The result shows the correlation between the product of the standard measure of LD ($r^2$) and the chi-square statistic at the disease site, on the one hand, and the chi-square statistic at the marker, on the other.

Figure 3. Error Analyses. The mean squared error between the left-hand and right-hand sides of the approximation is presented as a function of LD under simulated data. Two disease models of small effect and 2-5% disease prevalence at the disease site were employed.

Use of the Fine Mapping Theorem for Association Studies

There are four direct uses of the fine mapping theorem:
1) The fine mapping theorem gives insight into how to select SNPs for fine mapping studies given an initial association finding. Good LD coverage (i.e. SNPs in varying ranges of LD from the initial hit) is a key feature of fine mapping coverage.
2) The fine mapping theorem graphically shows which markers are good candidates for association tests of conditional independence, which identify markers in the region that are independently associated with disease status.
3) The fine mapping theorem enables one to directly calculate the minimum LD with a causative site needed to detect association at a marker: (the equation is missing from this transcript)
4) The fine mapping theorem provides a framework to test for the best-fit causative marker given all of the genotyping data in a region (see Multipoint Determination of the Most Likely Causal Site in a Region).

Multipoint Determination of the Most Likely Causal Site in a Region

An analytic/computational method has been developed to test, for every marker in a fine mapped region, the fit of the decay of association with decreasing LD against the theoretical prediction from the fine mapping theorem. The marker with the best score (i.e. the smallest departure from the predicted decay pattern) will have the highest likelihood of being the causative site under the model constructed. The initial measure used is the sum of the squared deviations from the decay pattern predicted by the model. A Bayesian method is currently under development.

EMPIRICAL RESULTS

Figure 4. T1D Fine Mapping. Analysis results from the Lowe et al. data at IL2RA-linked markers. The T1D P-values for 305 SNPs were plotted as a function of LD with the most significant SNP. The diagonal line is the prediction made by the fine mapping theorem approximation. The circled SNP is rs[ID missing], which deviates significantly from the expected pattern. In addition, this analysis shows how information is borrowed from neighboring sites to further support association at the top left marker.

Figure 5. Decay of Association for TRAF1 Region in RA. Combined analyses of P-values across rheumatoid arthritis sample sets (data from Chang et al., 2008) are plotted as a function of LD with rs[ID missing]. If rs[ID missing] were the sole driving force of the association observed in this region, then these data should fall along the theoretical line. Departures from this theoretical result can indicate markers that independently contribute to disease risk. Patterns show that the most likely causative SNPs are found in the TRAF1 gene.

CONCLUSIONS

In this poster we have presented a simple theoretical result that could potentially aid fine mapping efforts to refine association signals in a region of linkage disequilibrium. While this theorem is an approximation, it nonetheless provides an elementary, multipoint association pattern expected under a basic disease model with one causal site and a closely linked set of associated markers. Use of the theorem enables higher-powered analysis to detect causative sites as well as markers with independent effects.

REFERENCES

1. Schrodi SJ. A Fine Mapping Theorem. Manuscript in preparation.
2. Lai C, Lyman RF, Long AD, Langley CH, Mackay TF (1994) Science 266:
3. Pritchard JK, Przeworski M (2001) AJHG 69:
4. Schrodi SJ, Garcia VE, Rowland C, Jones HB (2007) EJHG 15:
5. Lowe CE, Cooper JD, Brusko T, et al. (2007) Nat Genet 39:
6. Chang M, Rowland CM, Garcia VE, et al. (2008) PLoS Genet 4(6):e
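
As a rough illustration of two uses described in the poster (the minimum LD needed to detect association at a marker, and the sum-of-squared-deviations measure for ranking candidate causal sites), the following Python sketch applies the approximation $\chi^2_M \sim r^2 \chi^2_D$ to toy numbers. It is not the authors' implementation. The function names, the three-SNP data, the choice to measure deviations on the chi-square scale, and the use of a 1-df critical value as the detection threshold are all assumptions made for this example.

```python
from statistics import NormalDist


def expected_marker_chisq(r2, chisq_disease):
    """Central-tendency prediction from the approximation: chi2_M ~ r^2 * chi2_D."""
    return r2 * chisq_disease


def min_r2_to_detect(chisq_disease, alpha=0.05):
    """Smallest r^2 at which the predicted marker statistic reaches the
    1-df chi-square critical value (assumed detection threshold)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # 1-df chi-square quantile equals z^2
    return min(1.0, z * z / chisq_disease)


def departure(candidate, chisq, r2):
    """Sum of squared deviations between observed marker statistics and the
    values predicted if `candidate` were the single causal site.
    Deviations are taken on the chi-square scale (an assumption)."""
    return sum(
        (chisq[m] - expected_marker_chisq(r2[candidate][m], chisq[candidate])) ** 2
        for m in chisq
        if m != candidate
    )


def rank_candidates(chisq, r2):
    """Order SNPs from smallest to largest departure (best fit first)."""
    return sorted(chisq, key=lambda snp: departure(snp, chisq, r2))


# Hypothetical toy region: SNP "A" is causal and the others follow the
# approximation exactly (chi2_B = 0.5 * 40, chi2_C = 0.25 * 40).
toy_chisq = {"A": 40.0, "B": 20.0, "C": 10.0}
toy_r2 = {
    "A": {"B": 0.5, "C": 0.25},
    "B": {"A": 0.5, "C": 0.5},
    "C": {"A": 0.25, "B": 0.5},
}

ranking = rank_candidates(toy_chisq, toy_r2)
print(ranking)  # ['A', 'B', 'C']
print(round(min_r2_to_detect(toy_chisq["A"]), 3))
```

In this toy region the causal SNP has a departure of zero because the other two statistics were constructed to follow the approximation exactly. With real data every candidate has a nonzero departure, and markers that sit far from the predicted pattern for the best-fitting candidate are the ones the poster flags as possible independent effects.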