Power in QTL linkage: single and multilocus analysis Shaun Purcell 1,2 & Pak Sham 1 1 SGDP, IoP, London, UK 2 Whitehead Institute, MIT, Cambridge, MA, USA

Overview
1) Brief power primer
2) Calculating power for QTL linkage analysis
   Practical 1: Using GPC for linkage power calculations
3) The adequacy of additive single locus analysis
   Practical 2: Using Mx for linkage power calculations

1) Power primer
[Diagram: test statistic, with YES / NO outcomes]

[Figure: P(T) plotted against T, showing the sampling distribution if H0 were true, the sampling distribution if HA were true, the critical value, and the areas α and β]

                 Rejection of H0                        Nonrejection of H0
H0 true          Type I error at rate α                 Nonsignificant result
HA true          Significant result, POWER = (1 - β)    Type II error at rate β

Impact of effect size, N
[Figure: P(T) against T, with the critical value and the areas α and β marked]

Impact of alpha
[Figure: P(T) against T, with the critical value and the areas α and β marked]

2) Power for QTL linkage
For chi-squared tests on large samples, power is determined by the non-centrality parameter (λ) and the degrees of freedom (df):
$$\lambda = E(2 \ln L_A - 2 \ln L_0) = E(2 \ln L_A) - E(2 \ln L_0)$$
where expectations are taken at the asymptotic values of the maximum likelihood estimates (MLE) under an assumed true model.
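As a rough illustration of how an NCP translates into power, here is a minimal sketch for the 1 df case only. It assumes a plain two-sided 1 df chi-squared test (linkage tests on variance components are often treated as one-sided mixtures, which this sketch ignores). The function name `power_chi2_1df` is made up for illustration. It relies on the fact that a 1 df non-central chi-squared variable is the square of a normal variable with mean sqrt(NCP).

```python
from statistics import NormalDist


def power_chi2_1df(ncp, alpha=0.05):
    """Power of a 1-df chi-squared test with non-centrality parameter `ncp`.

    Assumption: a two-sided 1-df test, so the statistic is (Z + sqrt(ncp))**2
    with Z standard normal, and we reject when it exceeds the critical value.
    """
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)  # square root of the chi-squared critical value
    shift = ncp ** 0.5                    # mean of the underlying normal variable
    upper_tail = 1 - norm.cdf(z_crit - shift)
    lower_tail = norm.cdf(-z_crit - shift)
    return upper_tail + lower_tail


if __name__ == "__main__":
    for ncp in (0.0, 2.0, 5.0, 10.0):
        print(f"NCP = {ncp:5.1f}  power = {power_chi2_1df(ncp):.3f}")
```

With an NCP of zero the power equals alpha, as it should under the null.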

Linkage test
[Equations: elements of the covariance matrix under H_A and under H_0, each given separately for i = j and for i ≠ j]

Linkage test
For sib-pairs under complete marker information
[Equation: expected NCP]
Determinant of 2-by-2 standardised covariance matrix = $1 - r^2$
Note: standardised trait
See Sham et al (2000) AJHG, 66, for further details

Concrete example
200 sibling pairs; sibling correlation 0.5.
To calculate NCP if QTL explained 10% variance: 200 × (expected NCP per pair) = (total expected NCP)
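The concrete example can be worked through numerically. This is a sketch under stated assumptions, not a reproduction of the slide's own formula (which is missing from the transcript): the trait is standardised, marker information is complete, the QTL is purely additive, and the expected -2lnL per pair is taken as the log determinant (1 - r^2) of the covariance matrix plus a constant that cancels between models. The function names are hypothetical.

```python
from math import log


def sib_correlations_by_ibd(v_a, r_sib, v_d=0.0):
    """Expected sib correlations at IBD 0, 1, 2 for a standardised trait.

    v_a, v_d: QTL additive and dominance variance; r_sib: overall sib correlation.
    The residual shared variance is whatever is left of r_sib after removing
    the average QTL contribution (v_a / 2 + v_d / 4).
    """
    shared = r_sib - v_a / 2 - v_d / 4
    return shared, shared + v_a / 2, shared + v_a + v_d


def ncp_per_sib_pair(r0, r1, r2):
    """Expected NCP per pair: E(-2lnL) under H0 minus E(-2lnL) under HA."""
    r_null = 0.25 * r0 + 0.5 * r1 + 0.25 * r2  # H0 ignores IBD status
    e_alt = 0.25 * log(1 - r0**2) + 0.5 * log(1 - r1**2) + 0.25 * log(1 - r2**2)
    e_null = log(1 - r_null**2)
    return e_null - e_alt


if __name__ == "__main__":
    r0, r1, r2 = sib_correlations_by_ibd(v_a=0.10, r_sib=0.5)
    per_pair = ncp_per_sib_pair(r0, r1, r2)
    print(f"NCP per pair = {per_pair:.6f}; for 200 pairs = {200 * per_pair:.4f}")
```

Under these assumptions the 200 pairs give a total NCP of roughly 0.56, far too small for useful power, which is the point of the example.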

Approximation of NCP
NCP per sibship is proportional to
- the number of pairs in the sibship (large sibships are powerful)
- the square of the additive QTL variance (decreases rapidly for QTL of very small effect)
- the sibling correlation (structure of residual variance is important)
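The three proportionalities can be seen in a second-order Taylor approximation of the sib-pair NCP. The sketch below is an assumption-laden illustration: it covers additive QTL variance only, assumes complete marker information (so the variance of the proportion of alleles shared IBD is 1/8), and extends to larger sibships simply by multiplying by the number of pairs, following the proportionality stated on the slide. The function name is hypothetical.

```python
def approx_ncp_per_sibship(v_a, r_sib, sibship_size=2):
    """Approximate expected NCP per sibship for an additive QTL.

    Derived by expanding -ln(1 - r**2) to second order around the overall
    sib correlation r_sib, with Var(pi-hat) = 1/8 for fully informative sib pairs.
    """
    n_pairs = sibship_size * (sibship_size - 1) / 2
    correlation_term = (1 + r_sib**2) / (1 - r_sib**2) ** 2
    return n_pairs * correlation_term * v_a**2 / 8


if __name__ == "__main__":
    for v_a in (0.05, 0.10, 0.20):
        for size in (2, 3, 4):
            ncp = approx_ncp_per_sibship(v_a, r_sib=0.5, sibship_size=size)
            print(f"V_A = {v_a:.2f}  sibship size = {size}  NCP = {ncp:.5f}")
```

Halving the QTL variance cuts the NCP to a quarter, while moving from pairs to quads multiplies it by six.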

P(IBD at M | IBD at QTL)
[Table: probability of each IBD state at the marker M, given each IBD state at the QTL]
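The table's entries did not survive in the transcript, but the conditional probabilities for sib pairs have a standard form in terms of psi = theta^2 + (1 - theta)^2, where theta is the marker-QTL recombination fraction. The sketch below uses that standard result (an assumption about what the slide showed) and then mixes QTL-level sib correlations into marker-level ones. All names are hypothetical.

```python
def marker_ibd_given_qtl(theta):
    """Rows: IBD at the QTL (0, 1, 2); columns: IBD at the marker (0, 1, 2)."""
    psi = theta**2 + (1 - theta) ** 2  # chance one parent's IBD status is preserved
    return [
        [psi**2, 2 * psi * (1 - psi), (1 - psi) ** 2],
        [psi * (1 - psi), psi**2 + (1 - psi) ** 2, psi * (1 - psi)],
        [(1 - psi) ** 2, 2 * psi * (1 - psi), psi**2],
    ]


def correlations_by_marker_ibd(theta, r_by_qtl_ibd):
    """Expected sib correlation at each marker IBD state, via Bayes' rule."""
    prior = [0.25, 0.5, 0.25]  # sib-pair IBD probabilities at the QTL
    cond = marker_ibd_given_qtl(theta)
    result = []
    for m in range(3):
        joint = [prior[q] * cond[q][m] for q in range(3)]
        total = sum(joint)
        result.append(sum(j * r for j, r in zip(joint, r_by_qtl_ibd)) / total)
    return result


if __name__ == "__main__":
    for theta in (0.0, 0.1, 0.5):
        rs = correlations_by_marker_ibd(theta, [0.45, 0.50, 0.55])
        print(theta, [round(r, 4) for r in rs])
```

As theta grows, the marker-level correlations shrink towards the overall sib correlation, which is why incomplete linkage costs power.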

Using GPC
Comparison to Haseman-Elston regression linkage, Amos & Elston (1989)
H-E regression
- 90% power (at significance level 0.05)
- QTL variance
- marker & major gene completely linked (θ = 0): 320 sib pairs
- if θ = 0.1: 778 sib pairs

GPC input parameters
Proportions of variance
- additive QTL variance
- dominance QTL variance
- residual variance (shared / nonshared)
Recombination fraction (θ)
Sample size & sibship size
Type I error rate
Type II error rate

GPC output parameters
Expected sibling correlations
- by IBD status at the QTL
- by IBD status at the marker
Expected NCP per sibship
Power
- at different levels of alpha, given sample size
Sample size
- for specified power at different levels of alpha, given power

GPC

From GPC
Modelling additive effects only

                     Sibships     Individuals
Pairs                216 (320)    432
Pairs (θ = 0.1)      543 (778)    1086
Trios (θ = 0.1)
Quads (θ = 0.1)      90           360
Quints (θ = 0.1)     55           275

Practical 1
Using GPC, what is the effect on power to detect linkage of:
1. QTL variance?
2. residual sibling correlation?
3. marker-QTL recombination fraction?

Pairs required (θ = 0, p = 0.05, power = 0.8)

Effect of residual correlation
QTL additive effects account for 10% trait variance
Sample size required for 80% power (α = 0.05)
No dominance
θ = 0.1
A: residual correlation 0.35
B: residual correlation 0.50
C: residual correlation 0.65

Individuals required

Effect of incomplete linkage

Some factors influencing power
1. QTL variance
2. Sib correlation
3. Sibship size
4. Marker informativeness & density
5. Phenotypic selection

Marker informativeness
Markers should be highly polymorphic
- alleles inherited from different sources are likely to be distinguishable
Heterozygosity (H) and Polymorphism Information Content (PIC)
- measure number and frequency of alleles at a locus

Polymorphism Information Content
If a parent is heterozygous, their gametes will usually be informative, but if both parents and the child are heterozygous for the same genotype, the origins of the child's alleles are ambiguous.
If C = the probability of this occurring, then PIC = H - C.
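A small sketch of how H and PIC can be computed from allele frequencies. It assumes the standard definitions rather than anything recovered from the slide: H = 1 - sum(p_i^2), C = sum over allele pairs i < j of 2 p_i^2 p_j^2, and PIC = H - C. Function names are hypothetical.

```python
from itertools import combinations


def heterozygosity(freqs):
    """Expected heterozygosity H = 1 - sum of squared allele frequencies."""
    return 1 - sum(p * p for p in freqs)


def pic(freqs):
    """Polymorphism Information Content: H minus the ambiguity term C."""
    # C: probability of the uninformative configuration described above,
    # summed over all pairs of distinct alleles.
    c = sum(2 * (p_i**2) * (p_j**2) for p_i, p_j in combinations(freqs, 2))
    return heterozygosity(freqs) - c


if __name__ == "__main__":
    for freqs in ([0.5, 0.5], [0.25] * 4, [0.1] * 10):
        print(len(freqs), "alleles:",
              f"H = {heterozygosity(freqs):.3f}, PIC = {pic(freqs):.3f}")
```

PIC is always a little below H, and the gap narrows as the number of equally frequent alleles grows.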

[Diagram. Singlepoint: trait locus and Marker 1, separated by θ1. Multipoint: trait locus positions T1 to T20 evaluated along a map containing Marker 1 and Marker 2]

Multipoint PIC: 10 cM map

Multipoint PIC: 5 cM map

The Singlepoint Information Content of the markers:
Locus 1 PIC =
Locus 2 PIC =
Locus 3 PIC =
The Multipoint Information Content of the markers:
Pos   MPIC
…
meaninf

Selective genotyping
ASP, Extreme Discordant, Maximally Dissimilar, Proband Selection, EDAC, Unselected, Mahalanobis Distance

Sibship informativeness : sib pairs

Impact of selection

E(-2LL) Sib 1 Sib 2 Sib

3) Single additive locus model
- locus A shows an association with the trait
- locus B appears unrelated
[Figure panels: Locus A, Locus B]

Joint analysis locus B modifies the effects of locus A: epistasis

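Partitioning of effects
Locus A and Locus B, each with alleles M and P

4 main effects (M and P at each locus): additive effects

6 two-way interactions, of which 2 are within-locus (M × P at locus A, M × P at locus B): dominance

6 two-way interactions, of which 4 are between-locus (one allele at A × one allele at B): additive-additive epistasis

4 three-way interactions (both alleles at one locus × one allele at the other): additive-dominance epistasis

1 four-way interaction (M × P at locus A × M × P at locus B): dominance-dominance epistasis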
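The counts of 4, 6, 4 and 1 are simply the numbers of ways of choosing one, two, three or four of the four alleles. A short sketch that enumerates them and labels each term, assuming the classification described in the partition above (the labels and the `classify` helper are for illustration only):

```python
from collections import Counter
from itertools import combinations

# Four alleles: M and P at locus A, M and P at locus B.
ALLELES = [("A", "M"), ("A", "P"), ("B", "M"), ("B", "P")]


def classify(term):
    """Label an effect by how many alleles it involves at each locus."""
    per_locus = Counter(locus for locus, _ in term)
    both = sum(1 for n in per_locus.values() if n == 2)    # loci giving both alleles
    single = sum(1 for n in per_locus.values() if n == 1)  # loci giving one allele
    if len(term) == 1:
        return "additive"
    if both == 1 and single == 0:
        return "dominance"
    if both == 0 and single == 2:
        return "additive-additive epistasis"
    if both == 1 and single == 1:
        return "additive-dominance epistasis"
    return "dominance-dominance epistasis"


counts = Counter(
    classify(term)
    for size in range(1, 5)
    for term in combinations(ALLELES, size)
)

if __name__ == "__main__":
    for label, n in counts.items():
        print(f"{n}  {label}")
```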

One locus
Genotypic means
AA   m + a
Aa   m + d
aa   m - a
[Scale: -a, 0, d, +a]

Two loci

      AA                      Aa                      aa
BB    m + a_A + a_B + aa      m + d_A + a_B + da      m - a_A + a_B - aa
Bb    m + a_A + d_B + ad      m + d_A + d_B + dd      m - a_A + d_B - ad
bb    m + a_A - a_B - aa      m + d_A - a_B - da      m - a_A - a_B + aa

Expected sib correlation by IBD at locus 1 and locus 2 (subscripts 1 and 2 denote the locus)

IBD (locus 1, locus 2)   Expected sib correlation
0, 0                     σ²_S
0, 1                     σ²_A2/2 + σ²_S
0, 2                     σ²_A2 + σ²_D2 + σ²_S
1, 0                     σ²_A1/2 + σ²_S
1, 1                     σ²_A1/2 + σ²_A2/2 + σ²_AA/4 + σ²_S
1, 2                     σ²_A1/2 + σ²_A2 + σ²_D2 + σ²_AA/2 + σ²_AD/2 + σ²_S
2, 0                     σ²_A1 + σ²_D1 + σ²_S
2, 1                     σ²_A1 + σ²_D1 + σ²_A2/2 + σ²_AA/2 + σ²_DA/2 + σ²_S
2, 2                     σ²_A1 + σ²_D1 + σ²_A2 + σ²_D2 + σ²_AA + σ²_AD + σ²_DA + σ²_DD + σ²_S
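Every row of the two-locus table follows one rule: each variance component is weighted by the proportion of alleles shared IBD (0, 1/2 or 1) for additive terms and by an indicator of IBD = 2 for dominance terms, with epistatic components taking the product of the weights at the two loci. A minimal sketch of that rule follows. The dictionary keys and the function name are made up for illustration, and the numeric values are arbitrary.

```python
def expected_sib_covariance(ibd1, ibd2, vc):
    """Expected sib covariance given IBD (0, 1 or 2) at two unlinked QTL.

    vc: dict with keys A1, D1, A2, D2, AA, AD, DA, DD, S (variance components).
    """
    pi1, pi2 = ibd1 / 2, ibd2 / 2                      # proportion of alleles shared IBD
    z1, z2 = float(ibd1 == 2), float(ibd2 == 2)        # both alleles shared IBD
    return (
        pi1 * vc["A1"] + z1 * vc["D1"]
        + pi2 * vc["A2"] + z2 * vc["D2"]
        + pi1 * pi2 * vc["AA"] + pi1 * z2 * vc["AD"]
        + z1 * pi2 * vc["DA"] + z1 * z2 * vc["DD"]
        + vc["S"]
    )


if __name__ == "__main__":
    # Arbitrary illustrative values, not taken from the slides.
    vc = {"A1": 0.10, "D1": 0.02, "A2": 0.08, "D2": 0.01,
          "AA": 0.04, "AD": 0.02, "DA": 0.02, "DD": 0.01, "S": 0.20}
    for ibd1 in range(3):
        for ibd2 in range(3):
            print(ibd1, ibd2, round(expected_sib_covariance(ibd1, ibd2, vc), 4))
```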

Estimating power for QTL models
Using Mx to calculate power
i. Calculate expected covariance matrices under the full model
ii. Fit model to data with value of interest fixed to null value

          i. True model    ii. Submodel
Q                          0
S
N
-2LL      0.000            = NCP
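Once the submodel's -2LL gives an NCP for the sample size used in the script, the sample size needed for a target power follows because the expected NCP scales linearly with the number of families. The sketch below covers a 1 df test only and treats it as a plain two-sided chi-squared test, both of which are simplifying assumptions. The function names are hypothetical, and the NCP and sample size in the example are made-up inputs rather than Mx output.

```python
from statistics import NormalDist


def power_1df(ncp, alpha):
    """Power of a two-sided 1-df chi-squared test with non-centrality `ncp`."""
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    shift = ncp ** 0.5
    return (1 - norm.cdf(z_crit - shift)) + norm.cdf(-z_crit - shift)


def required_ncp(power, alpha, lo=0.0, hi=200.0):
    """Find the NCP giving the target power by bisection (power rises with NCP)."""
    for _ in range(100):
        mid = (lo + hi) / 2
        if power_1df(mid, alpha) < power:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def required_sample_size(ncp_fitted, n_fitted, power=0.8, alpha=0.05):
    """Scale the fitted sample size so that its NCP reaches the required NCP."""
    return n_fitted * required_ncp(power, alpha) / ncp_fitted


if __name__ == "__main__":
    # Hypothetical inputs: an NCP of 2.5 obtained from a script run with 1000 pairs.
    print(round(required_sample_size(ncp_fitted=2.5, n_fitted=1000)))
```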

Model misspecification
Using the domqtl.mx script

          i. True    ii. Full    iii. Null
          Q_A        Q_A         0
          Q_D        0           0
          S          S           S
          N          N           N
-2LL      0.000      T1          T2

Test:
dominance only           T1
additive & dominance     T2
additive only            T2 - T1

Results
Using the domqtl.mx script
[Table: columns i. True, ii. Full, iii. Null; rows Q_A, Q_D, S, N, -2LL]

Test:
dominance only (1 df)           1.269
additive & dominance (2 df)
additive only (1 df)            = 11.28

Expected variances, covariances
[Table: columns i. True, ii. Full, iii. Null; rows Var, Cov(IBD=0), Cov(IBD=1), Cov(IBD=2)]

Potential importance of epistasis
“… a gene’s effect might only be detected within a framework that accommodates epistasis…”
[Table: Locus A genotypes (A1A1, A1A2, A2A2, Marginal) by Locus B genotypes (B1B1, B1B2, B2B2, Marginal), with genotype frequencies]

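Nested models and the variance components retained in each (V_S and V_N estimated in all models)

Model    Components
Full     V_A1 V_D1 V_A2 V_D2 V_AA V_AD V_DA V_DD
- DD     V*_A1 V*_D1 V*_A2 V*_D2 V*_AA V*_AD V*_DA
- AD     V*_A1 V*_D1 V*_A2 V*_D2 V*_AA
- AA     V*_A1 V*_D1 V*_A2 V*_D2
- D      V*_A1 V*_A2
- A      V*_A [remaining entries not recoverable]
H        [entries not recoverable]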

True model VC Means matrix

NCP for test of linkage
NCP1: full model
NCP2: additive-only model

Apparent VC under additive-only model Means matrix

Summary
- Linkage has low power to detect QTL of small effect
- Using selected and/or larger sibships increases power
- Single locus additive analysis is usually acceptable

GPC: two-locus linkage
Using the module, for unlinked loci A and B with
Means: 0 0 1
Frequencies: p_A = p_B =
Power of the full model to detect linkage?
Power to detect epistasis?
Power of the single additive locus model?
(1000 pairs, 20% joint QTL effect, VS = VN)