Linkage in Selected Samples

Slides:



Advertisements
Similar presentations
Chapter 6: Quantitative traits, breeding value and heritability Quantitative traits Phenotypic and genotypic values Breeding value Dominance deviation.
Advertisements

Hypothesis: It is an assumption of population parameter ( mean, proportion, variance) There are two types of hypothesis : 1) Simple hypothesis :A statistical.
Summarizing Variation Matrix Algebra & Mx Michael C Neale PhD Virginia Institute for Psychiatric and Behavioral Genetics Virginia Commonwealth University.
Linkage Analysis: An Introduction Pak Sham Twin Workshop 2001.
Estimating “Heritability” using Genetic Data David Evans University of Queensland.
Power in QTL linkage: single and multilocus analysis Shaun Purcell 1,2 & Pak Sham 1 1 SGDP, IoP, London, UK 2 Whitehead Institute, MIT, Cambridge, MA,
Quantitative Genetics
Summarizing Variation Michael C Neale PhD Virginia Institute for Psychiatric and Behavioral Genetics Virginia Commonwealth University.
Missing Data Michael C. Neale International Workshop on Methodology for Genetic Studies of Twins and Families Boulder CO 2006 Virginia Institute for Psychiatric.
Mx Practical TC18, 2005 Dorret Boomsma, Nick Martin, Hermine H. Maes.
Raw data analysis S. Purcell & M. C. Neale Twin Workshop, IBG Colorado, March 2002.
Statistical Analysis. Purpose of Statistical Analysis Determines whether the results found in an experiment are meaningful. Answers the question: –Does.
Chapter 12 Inferential Statistics Gay, Mills, and Airasian
Statistical Power Calculations Boulder, 2007 Manuel AR Ferreira Massachusetts General Hospital Harvard Medical School Boston.
Shaun Purcell & Pak Sham Advanced Workshop Boulder, CO, 2003
Copy the folder… Faculty/Sarah/Tues_merlin to the C Drive C:/Tues_merlin.
Summarizing Variation Matrix Algebra Benjamin Neale Analytic and Translational Genetics Unit, Massachusetts General Hospital Program in Medical and Population.
Introduction to QTL analysis Peter Visscher University of Edinburgh
Statistical Decision Making. Almost all problems in statistics can be formulated as a problem of making a decision. That is given some data observed from.
Introduction to Linkage Analysis Pak Sham Twin Workshop 2003.
Linkage in selected samples Manuel Ferreira QIMR Boulder Advanced Course 2005.
Permutation Analysis Benjamin Neale, Michael Neale, Manuel Ferreira.
Power and Sample Size Boulder 2004 Benjamin Neale Shaun Purcell.
Type 1 Error and Power Calculation for Association Analysis Pak Sham & Shaun Purcell Advanced Workshop Boulder, CO, 2005.
Power of linkage analysis Egmond, 2006 Manuel AR Ferreira Massachusetts General Hospital Harvard Medical School Boston.
Regression-Based Linkage Analysis of General Pedigrees Pak Sham, Shaun Purcell, Stacey Cherny, Gonçalo Abecasis.
Lecture 12: Linkage Analysis V Date: 10/03/02  Least squares  An EM algorithm  Simulated distribution  Marker coverage and density.
Tutorial #10 by Ma’ayan Fishelson. Classical Method of Linkage Analysis The classical method was parametric linkage analysis  the Lod-score method. This.
Lecture 3: Statistics Review I Date: 9/3/02  Distributions  Likelihood  Hypothesis tests.
Linkage and association Sarah Medland. Genotypic similarity between relatives IBS Alleles shared Identical By State “look the same”, may have the same.
Lecture 21: Quantitative Traits I Date: 11/05/02  Review: covariance, regression, etc  Introduction to quantitative genetics.
1 Matrix Algebra Variation and Likelihood Michael C Neale Virginia Institute for Psychiatric and Behavioral Genetics VCU International Workshop on Methodology.
Epistasis / Multi-locus Modelling Shaun Purcell, Pak Sham SGDP, IoP, London, UK.
Practical With Merlin Gonçalo Abecasis. MERLIN Website Reference FAQ Source.
Lecture 23: Quantitative Traits III Date: 11/12/02  Single locus backcross regression  Single locus backcross likelihood  F2 – regression, likelihood,
Powerful Regression-based Quantitative Trait Linkage Analysis of General Pedigrees Pak Sham, Shaun Purcell, Stacey Cherny, Gonçalo Abecasis.
Mx Practical TC20, 2007 Hermine H. Maes Nick Martin, Dorret Boomsma.
David M. Evans Multivariate QTL Linkage Analysis Queensland Institute of Medical Research Brisbane Australia Twin Workshop Boulder 2003.
Categorical Data Frühling Rijsdijk 1 & Caroline van Baal 2 1 IoP, London 2 Vrije Universiteit, A’dam Twin Workshop, Boulder Tuesday March 2, 2004.
Efficient calculation of empirical p- values for genome wide linkage through weighted mixtures Sarah E Medland, Eric J Schmitt, Bradley T Webb, Po-Hsiu.
Educational Research Inferential Statistics Chapter th Chapter 12- 8th Gay and Airasian.
QTL Mapping Using Mx Michael C Neale Virginia Institute for Psychiatric and Behavioral Genetics Virginia Commonwealth University.
Power in QTL linkage analysis
Regression Models for Linkage: Merlin Regress
Boulder Colorado Workshop March
HGEN Thanks to Fruhling Rijsdijk
Power and p-values Benjamin Neale March 10th, 2016
Genome Wide Association Studies using SNP
Can resemblance (e.g. correlations) between sib pairs, or DZ twins, be modeled as a function of DNA marker sharing at a particular chromosomal location?
Introduction to Linkage and Association for Quantitative Traits
I Have the Power in QTL linkage: single and multilocus analysis
Regression-based linkage analysis
Chapter 9 Hypothesis Testing.
Statistical Methods For Engineers
Power to detect QTL Association
GxE and GxG interactions
Pak Sham & Shaun Purcell Twin Workshop, March 2002
David M. Evans Sarah E. Medland
What are BLUP? and why they are useful?
EM for Inference in MV Data
Power and Sample Size I HAVE THE POWER!!! Boulder 2006 Benjamin Neale.
Lecture 9: QTL Mapping II: Outbred Populations
EM for Inference in MV Data
Univariate Linkage in Mx
William F. Forrest, Eleanor Feingold 
Power Calculation for QTL Association
BOULDER WORKSHOP STATISTICS REVIEWED: LIKELIHOOD MODELS
Chapter 9 Estimation: Additional Topics
Multivariate Genetic Analysis: Introduction
Presentation transcript:

Linkage in Selected Samples Boulder Methodology Workshop 2003 Michael C. Neale Virginia Institute for Psychiatric & Behavioral Genetics

Pihat = p(IBD=2) + .5 p(IBD=1) Basic Genetic Model Pihat = p(IBD=2) + .5 p(IBD=1) 1 1 1 1 Pihat 1 1 1 E1 F1 Q1 Q2 F2 E2 e f q q f e P1 P2 Q: QTL Additive Genetic F: Family Environment E: Random Environment 3 estimated parameters: q, f and e Every sibship may have different model P P

Mixture distribution model Each sib pair i has different set of WEIGHTS rQ=1 rQ=.5 rQ=.0 weightj x Likelihood under model j p(IBD=2) x P(LDL1 & LDL2 | rQ = 1 ) p(IBD=1) x P(LDL1 & LDL2 | rQ = .5 ) p(IBD=0) x P(LDL1 & LDL2 | rQ = 0 ) Total likelihood is sum of weighted likelihoods

QTL's are factors Multiple QTL models possible, at different places on genome A big QTL will introduce non-normality Introduce mixture of means as well as covariances (27ish component mixture) Mixture distribution gets nasty for large sibships

Biometrical Genetic Model Genotype means AA m + a -a d +a Aa m + d aa m – a Courtesy Pak Sham, Boulder 2003

Mixture of Normal Distributions :aa :Aa :AA Equal variances, Different means and different proportions

Implementing the Model Estimate QTL allele frequency p Estimate distance between homozygotes 2a Compute QTL additive genetic variance as 2pq[a+d(q-p)]2 Compute likelihood conditional on IBD status QTL allele configuration of sib pair

27 Component Mixture Sib1 AA Aa aa Sib 2 AA Aa aa

19 Possible Component Mixture AA Aa aa Sib 2 AA Aa aa

Results of QTL Simulation 3 Component vs 19 Component 3 Component 19 Component Parameter True Q 0.4 0.414 0.395 A 0.08 0.02 0.02 E 0.6 0.56 0.58 Test Q=0 (Chisq) --- 13.98 15.88 200 simulations of 600 sib pairs each GASP http://www.nhgri.nih.gov/DIR/IDRB/GASP/

Information in selected samples Concordant or discordant sib pairs Deviation of pihat from .5 Concordant high pairs > .5 Concordant low pairs > .5 Discordant pairs < .5 How come?

Pihat deviates > .5 in ASP Larger proportion of IBD=2 pairs t IBD=2 Sib 2 IBD=1 IBD=0 t Sib 1

Pihat deviates <.5 in DSP’s Larger proportion of IBD=0 pairs t IBD=2 Sib 2 IBD=1 IBD=0 t Sib 1

Sibship informativeness : sib pairs -4 -3 -2 -1 1 2 3 4 Sib 1 trait Sib 2 trait 0.2 0.4 0.6 0.8 1.2 1.4 1.6 Sibship NCP Courtesy Shaun Purcell, Boulder Workshop 03/03

Sibship informativeness : sib pairs -4 -3 -2 -1 1 2 3 4 Sib 1 trait Sib 2 trait 0.5 1.5 Sibship NCP -4 -3 -2 -1 1 2 3 4 Sib 1 trait Sib 2 trait 0.5 1.5 Sibship NCP dominance -4 -3 -2 -1 1 2 3 4 Sib 1 trait Sib 2 trait 0.5 1.5 Sibship NCP unequal allele frequencies rare recessive Courtesy Shaun Purcell, Boulder Workshop 03/03

Two sources of information Forrest & Feingold 2000 Phenotypic similarity IBD 2 > IBD 1 > IBD 0 Even present in selected samples Deviation of pihat from .5 Concordant high pairs > .5 Concordant low pairs > .5 Discordant pairs < .5 These sources are independent

Implementing F&F Simplest form test mean pihat = .5 Predict amount of pihat deviation Expected pihat for region of sib pair scores Expected pihat for observed scores Use multiple groups in Mx

Predicting Expected Pihat deviation IBD=2 Sib 2 + IBD=1 IBD=0 t Sib 1

Expected Pihats: Theory IBD probability conditional on phenotypic scores x1,x2 E(pihat) = p(IBD=2|(x1,x2))+p(IBD=1|(x1,x2)) p(IBD=2|(x1,x2)) = Normal pdf: NIBD=2(x1,x2) p(IBD=2 |(x1,x2) )+2p(IBD=1 |(x1,x2) )+p(IBD=0 |(x1,x2) )

Expected Pihats Compute Expected Pihats with pdfnor \pdfnor(X_M_C) Observed scores X (row vector 1 x nvar) Means M (row vector) Covariance matrix C (nvar x nvar)

How to measure covariance? IBD=2 + IBD=1 IBD=0 t

Ascertainment Critical to many QTL analyses Deliberate Accidental Study design Accidental Volunteer bias Subjects dying

Advantages & Disadvantages Likelihood approach Advantages & Disadvantages Usual nice properties of ML remain Flexible Simple principle Consideration of possible outcomes Re-normalization May be difficult to compute

Example: Two Coin Toss Probability i = freq i / sum (freqs) Frequency 3 outcomes Frequency 2.5 2 1.5 1 0.5 HH HT/TH TT Outcome Probability i = freq i / sum (freqs)

Example: Two Coin Toss Probability i = freq i / sum (freqs) Frequency 3 outcomes Frequency 2.5 2 1.5 1 0.5 HH HT/TH TT Outcome Probability i = freq i / sum (freqs)

Non-random ascertainment Example Probability of observing TT globally 1 outcome from 4 = 1/4 Probability of observing TT if HH is not ascertained 1 outcome from 3 = 1/3 or 1/4 divided by ‘Ascertainment Correction' of 3/4 = 1/3

Correcting for ascertainment Univariate case; only subjects > t ascertained N t 0.5 0.4 0.3 G 0.2 likelihood 0.1 -4 -3 -2 -1 1 2 3 4 : xi

Ascertainment Correction Be / All you can be N(x) Itx N(x) dx

Affected Sib Pairs 4 4 Itx Ity N(x,y) dy dx ty tx 1 0 1 + + + + + + + + + + + + + ty + + + + + + + + + + + + + + + + + + + + + + + + tx 0 1

Selecting on sib 1 4 4 4 Ity I-4 N(x,y) dx dy Ity N(y) dy = ty tx 1 + + + 1 + + + + + + ty + + + + + + Twin 1 + + + + + + + + + + + + + + + + + + tx Twin 2 0 1

Correcting for ascertainment Dividing by the realm of possibilities Without ascertainment, we compute With ascertainment, the correction is pdf, N(:ij,Gij), at observed value xi divided by: I-4 N(:ij,Gij) dx = 1 4 4 It N(:ij,Gij) dx < 1

Ascertainment Corrections for Sib Pairs Itx Ity N(x,y) dy dx 4 ASP ++ 4 Itx Ity N(x,y) dy dx DSP +- -4 Itx Ity N(x,y) dy dx CUSP +- -4 -4

Weighted likelihood Correction not always necessary ML MCAR/MAR Prediction of missingness Correct through weight formula Use Mx’s \mnor(C_M_U_L_T) Cov matrix C Means M Upper Threshold U Lower Threshold L Type T 0 – below threshold U 1 – above threshold L 2 – between threshold 3 – -infinity to + infinity (ignore)

Normal Theory Likelihood Function For raw data in Mx m ln Li = fi ln [ 3 wj g(xi,:ij,Gij)] j=1 xi - vector of observed scores on n subjects :ij - vector of predicted means Gij - matrix of predicted covariances - functions of parameters

Correcting for ascertainment Linkage studies Multivariate selection: multiple integrals double integral for ASP four double integrals for EDAC Use (or extend) weight formula Precompute in a calculation group unless they vary by subject

Initial Results of Simulations Null Model 50% heritability No QTL Used to generate null distribution .05 empirical significance level at approximately 91 Chi-square QTL Simulations 37.5% heritability 12.5% QTL Mx: 879 significant at nominal .05 p-value Merlin: 556 significant at nominal .05 p-value Some apparent increase in power

Conclusion Quantifying QTL variation in selected samples can be done It ain’t easy Can provide more power Could be extended for analysis of correlated traits