1
Association analysis
Shaun Purcell
Boulder Twin Workshop 2004
2
Overview
Candidate gene association
Haplotypes and linkage disequilibrium
Linkage and association
Family-based association
3
What is association?
Categorical traits
– disease susceptibility genes
Continuous traits
– quantitative trait loci (QTL)
4
Disease traits
        Case   Control
AA      n1     n2
Aa      n3     n4
aa      n5     n6
Is there a difference in allele/genotype frequency between cases and controls?
5
Disease traits
        Case   Control   Expected frequency (Hardy-Weinberg)
AA      30     25        p^2
Aa      50     50        2p(1-p)
aa      20     25        (1-p)^2
Is there a difference in allele/genotype frequency between cases and controls?
Test for independence, p-value
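The test for independence on this slide can be sketched as a Pearson chi-square test on the 3 x 2 genotype table. This is a minimal illustration using only the counts shown above; the function name `chi_square_independence` is an assumption made for this sketch, not something from the workshop materials.

```python
import math

def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# Rows are genotypes AA, Aa, aa; columns are cases, controls (counts from the slide).
genotype_counts = [[30, 25], [50, 50], [20, 25]]
stat, df = chi_square_independence(genotype_counts)

# For 2 df the chi-square upper tail has the closed form exp(-x / 2).
p_value = math.exp(-stat / 2)
print(f"chi-square = {stat:.3f}, df = {df}, p = {p_value:.3f}")
```

With these counts the genotype distributions of cases and controls are close, so the test gives no evidence of association.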
6
Disease traits
General model (2 df)
        Case        Control
AA      n1          n2
Aa      n3          n4
aa      n5          n6
Additive model (1 df)
        Case        Control
A       2n1 + n3    2n2 + n4
a       2n5 + n3    2n6 + n4
Dominant model for A (1 df)
        Case        Control
A*      n1 + n3     n2 + n4
aa      n5          n6
(A* = AA or Aa)
Effect sizes calculated as odds ratios
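As a rough sketch of how the genotype table collapses under the additive and dominant models, the snippet below derives allele counts and dominant-model counts and then an odds ratio for each. The example counts are the 30/50/20 cases and 25/50/25 controls from the preceding Disease traits slide; all function names are hypothetical.

```python
def allele_counts(n_AA, n_Aa, n_aa):
    """Additive model: count A and a alleles (each person contributes two)."""
    return 2 * n_AA + n_Aa, 2 * n_aa + n_Aa

def dominant_counts(n_AA, n_Aa, n_aa):
    """Dominant model for A: carriers of A (AA or Aa) versus aa."""
    return n_AA + n_Aa, n_aa

def odds_ratio(case_pair, control_pair):
    """Odds ratio for a 2 x 2 table given (row1, row2) counts in cases and controls."""
    (case_x, case_y), (ctrl_x, ctrl_y) = case_pair, control_pair
    return (case_x * ctrl_y) / (case_y * ctrl_x)

cases = (30, 50, 20)      # AA, Aa, aa
controls = (25, 50, 25)   # AA, Aa, aa

allelic_or = odds_ratio(allele_counts(*cases), allele_counts(*controls))
dominant_or = odds_ratio(dominant_counts(*cases), dominant_counts(*controls))
print(f"allelic OR = {allelic_or:.3f}, dominant OR = {dominant_or:.3f}")
```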
7
Relative risk
        D+   D-
E+      a    b
E-      c    d
Risk in E+ = a / (a + b)
Risk in E- = c / (c + d)
Relative risk of exposure = (a / (a + b)) / (c / (c + d))
8
Odds ratio
        D+   D-
E+      a    b
E-      c    d
Odds in D+ = a / c
Odds in D- = b / d
Odds ratio = (a / c) / (b / d) = ad / bc
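The two effect-size definitions can be compared side by side with a small sketch. The cell counts below are hypothetical and chosen only for illustration; a, b, c and d follow the 2 x 2 layout on the slides (rows E+ and E-, columns D+ and D-).

```python
def relative_risk(a, b, c, d):
    """Risk in the exposed divided by risk in the unexposed."""
    return (a / (a + b)) / (c / (c + d))

def odds_ratio(a, b, c, d):
    """Odds of exposure among D+ divided by odds of exposure among D-."""
    return (a / c) / (b / d)

# Hypothetical counts: 20 of 100 exposed and 10 of 100 unexposed are affected.
a, b, c, d = 20, 80, 10, 90
rr = relative_risk(a, b, c, d)
or_ = odds_ratio(a, b, c, d)
print(f"RR = {rr:.2f}, OR = {or_:.2f}")
```

The odds ratio is further from 1 than the relative risk here because the disease is not rare in this made-up table; the two converge as the disease becomes rarer.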
9
Quantitative traits
[Figure: trait values plotted by genotype (aa, Aa, AA)]
ID    Y      G    A    D
001   0.34   aa   -1   0
002   1.23   Aa    0   1
003   1.66   Aa    0   1
004   2.74   AA    1   0
005   1.33   AA    1   0
…
Y = aA + dD + e
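A minimal sketch of the A/D coding and of estimating a and d from genotype means, using the five rows shown on the slide. It assumes an intercept m (the midpoint between the two homozygote means), which the slide's equation leaves implicit; with three genotype groups and three parameters this reproduces the least-squares fit exactly. The names `fit_biometrical`, `ADDITIVE` and `DOMINANCE` are made up for this example.

```python
from statistics import mean

ADDITIVE = {"AA": 1, "Aa": 0, "aa": -1}   # A column on the slide
DOMINANCE = {"AA": 0, "Aa": 1, "aa": 0}   # D column on the slide

records = [
    ("001", 0.34, "aa"),
    ("002", 1.23, "Aa"),
    ("003", 1.66, "Aa"),
    ("004", 2.74, "AA"),
    ("005", 1.33, "AA"),
]

def fit_biometrical(rows):
    """Return (m, a, d) for Y = m + a*A + d*D + e from genotype group means.

    The intercept m is an assumption of this sketch.
    """
    by_genotype = {}
    for _, y, g in rows:
        by_genotype.setdefault(g, []).append(y)
    mean_AA = mean(by_genotype["AA"])
    mean_Aa = mean(by_genotype["Aa"])
    mean_aa = mean(by_genotype["aa"])
    m = (mean_AA + mean_aa) / 2   # midpoint of the homozygotes
    a = (mean_AA - mean_aa) / 2   # additive effect
    d = mean_Aa - m               # dominance deviation
    return m, a, d

m, a, d = fit_biometrical(records)
print(f"m = {m:.4f}, a = {a:.4f}, d = {d:.4f}")
```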
10
Some web resources
BGIM: http://statgen.iop.kcl.ac.uk/bgim/
– Introductory tutorials on twin analysis, primer on maximum likelihood, Mx language
GxE moderator models: http://statgen.iop.kcl.ac.uk/gxe/
Power calculation: http://statgen.iop.kcl.ac.uk/gpc/
Case/control association tools: http://statgen.iop.kcl.ac.uk/gpc/model/
12
Relative risk
Genotype   P(D|G)     RR
AA         P(D|AA)    P(D|AA) / P(D|aa)
Aa         P(D|Aa)    P(D|Aa) / P(D|aa)
aa         P(D|aa)    1
P(D|AA) / P(D|aa) is labelled RR(AA)
P(D|Aa) / P(D|aa) is labelled RR(Aa)
13
Genetic models
Model            RR(Aa)   RR(AA)
General          x        y
Multiplicative   x        x^2
Dominant         x        x
Recessive        1.000    x
No effect        1.000    1.000
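The constraints in this table can be written down directly. The sketch below maps each model to its pair of genotype relative risks and, as an illustration of what those risks imply, computes the genotype distribution expected among cases. The second function assumes Hardy-Weinberg genotype frequencies in the population, which is an assumption of this sketch; both function names are hypothetical.

```python
def genotype_relative_risks(model, x=1.0, y=1.0):
    """Return (RR(Aa), RR(AA)) under the named genetic model."""
    if model == "general":
        return x, y
    if model == "multiplicative":
        return x, x ** 2
    if model == "dominant":
        return x, x
    if model == "recessive":
        return 1.0, x
    if model == "none":
        return 1.0, 1.0
    raise ValueError(f"unknown model: {model}")

def case_genotype_frequencies(p, rr_Aa, rr_AA):
    """P(genotype | affected), assuming Hardy-Weinberg proportions with P(A) = p."""
    population = {"AA": p * p, "Aa": 2 * p * (1 - p), "aa": (1 - p) ** 2}
    risk = {"AA": rr_AA, "Aa": rr_Aa, "aa": 1.0}
    weights = {g: population[g] * risk[g] for g in population}
    total = sum(weights.values())
    return {g: w / total for g, w in weights.items()}

rr_Aa, rr_AA = genotype_relative_risks("multiplicative", x=2.0)
in_cases = case_genotype_frequencies(0.5, rr_Aa, rr_AA)
print(rr_Aa, rr_AA, in_cases)
```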
14
Tests
Test                                             Alternate        Null
Any effect?                                      General          No effect
Any effect assuming a multiplicative gene?       Multiplicative   No effect
Any effect assuming a dominant gene?             Dominant         No effect
Any effect assuming a recessive gene?            Recessive        No effect
Can we assume a multiplicative effect?           General          Multiplicative
Can we assume a dominant effect?                 General          Dominant
Can we assume a recessive effect?                General          Recessive
15
Multiple samples
Constrain frequencies across samples
Constrain effects across samples
– Can test genetic models with effects and/or frequencies constrained to be equal
– Can perform tests of homogeneity of effects and/or frequencies across samples
16
An example
2 case/control samples
Population frequency 5%
Sample 1   Case   Control
AA         17     11
Aa         35     59
aa         24     40
Sample 2   Case   Control
AA         37     10
Aa         67     43
aa         20     37
18
Homogeneous effects across samples
Homogeneous allele frequencies across samples
Model   Sample   p       RR(Aa)   RR(AA)   -2LL
Gen     1        0.367   1.979    3.663
        2        0.367   1.979    3.663    793.143
Mult    1        0.367   1.911    3.651
        2        0.367   1.911    3.651    793.199
Dom     1        0.401   1.990    1.990
        2        0.401   1.990    1.990    802.927
Rec     1        0.405   1.000    1.921
        2        0.405   1.000    1.921    805.064
None    1        0.442   1.000    1.000
        2        0.442   1.000    1.000    815.628
19
Heterogeneous effects across samples
Homogeneous allele frequencies across samples
Model   Sample   p       RR(Aa)   RR(AA)   -2LL
Gen     1        0.367   1.235    2.136
        2        0.367   2.890    5.547    786.498
Mult    1        0.367   1.440    2.073
        2        0.367   2.282    5.208    788.262
Dom     1        0.401   1.216    1.216
        2        0.401   2.936    2.936    796.422
Rec     1        0.405   1.000    1.519
        2        0.405   1.000    2.195    803.849
None    1        0.443   1.000    1.000
        2        0.443   1.000    1.000    815.628
20
TESTS OF GENETIC MODELS -- ASSUMING EQ EFFECTS & EQ FREQS
=========================================================
Gen vs None (2 df)  : 22.485   p = 0.000
Mult vs None (1 df) : 22.429   p = 0.000
Dom vs None (1 df)  : 12.701   p = 0.000
Rec vs None (1 df)  : 10.564   p = 0.001
Gen vs Mult (1 df)  :  0.056   p = 0.813
Gen vs Dom (1 df)   :  9.784   p = 0.002
Gen vs Rec (1 df)   : 11.921   p = 0.001

TESTS OF GENETIC MODELS -- ASSUMING UNEQ EFFECTS & EQ FREQS
===========================================================
Gen vs None (4 df)  : 29.130   p = 0.000
Mult vs None (2 df) : 27.366   p = 0.000
Dom vs None (2 df)  : 19.205   p = 0.000
Rec vs None (2 df)  : 11.779   p = 0.003
Gen vs Mult (2 df)  :  1.764   p = 0.414
Gen vs Dom (2 df)   :  9.925   p = 0.007
Gen vs Rec (2 df)   : 17.351   p = 0.000

TESTS OF EQUAL EFFECTS -- ASSUMING EQ FREQS
===========================================
w/ Gen model (2 df)  : 6.645   p = 0.036
w/ Mult model (1 df) : 4.938   p = 0.026
w/ Dom model (1 df)  : 6.505   p = 0.011
w/ Rec model (1 df)  : 1.215   p = 0.270
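Each statistic in this output is the difference in -2LL between a reduced and a fuller model, referred to a chi-square distribution. The sketch below reproduces a few of the equal-effects tests from the -2LL values on the homogeneous-effects slide. It uses closed-form chi-square tail probabilities (valid for 1 df and for even df) so that it needs only the standard library; the function names are illustrative.

```python
import math

def chi2_sf(x, df):
    """Upper-tail chi-square probability, for df = 1 or any even df."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2))
    if df % 2 == 0:
        half = x / 2
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    raise ValueError("closed form implemented only for df = 1 or even df")

# -2LL values with effects and frequencies equated across the two samples.
MINUS2LL = {"Gen": 793.143, "Mult": 793.199, "Dom": 802.927,
            "Rec": 805.064, "None": 815.628}

def likelihood_ratio_test(full, reduced, df):
    """Return (statistic, p-value) for a reduced model nested in a fuller one."""
    stat = MINUS2LL[reduced] - MINUS2LL[full]
    return stat, chi2_sf(stat, df)

for full, reduced, df in [("Gen", "None", 2), ("Mult", "None", 1),
                          ("Gen", "Mult", 1), ("Gen", "Dom", 1)]:
    stat, p = likelihood_ratio_test(full, reduced, df)
    print(f"{full} vs {reduced} ({df} df): {stat:.3f}  p = {p:.3f}")
```

In this example the general model fits no better than the multiplicative one, while the dominant and recessive models fit significantly worse than the general model.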
21
Indirect association
[Figure labels: QTL, genotyped markers, ungenotyped markers]
22
Recombination
Homologous chromosomes in one parent (paternal chromosome, maternal chromosome)
Recombination event during meiosis
Recombinant gamete transmitted, harboring mutation
23
Recombination
Homologous chromosomes in one parent (paternal chromosome, maternal chromosome)
No recombination event during meiosis
Nonrecombinant gamete transmitted, not harboring mutation
24
Linkage: affected sib pairs
Paternal chromosome, maternal chromosome
First affected offspring: no recombination
Second affected offspring: recombinant gamete
IBD sharing from this one parent (0 or 1) changes along the chromosome: 1 on one side of the recombination, 0 on the other
25
Association analysis
Mutation occurs on a ‘red’ chromosome
27
Association analysis
Association due to ‘linkage disequilibrium’
28
Haplotypes
        A     a
M       AM    aM
m       Am    am
This individual has aa and Mm genotypes and am and aM haplotypes
29
Haplotypes
        A     a
M       AM    aM
m       Am    am
This individual has Aa and Mm genotypes and AM and am haplotypes
… but given only genotype data, the individual is consistent with Am/aM as well as AM/am
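The phase ambiguity described here can be made concrete by enumerating every haplotype pair that is consistent with a set of unphased genotypes. This is an illustrative sketch only; the function name and the two-character genotype encoding are assumptions.

```python
from itertools import product

def consistent_haplotype_pairs(genotypes):
    """List the unordered haplotype pairs compatible with unphased genotypes.

    genotypes: one two-character string per locus, e.g. ["Aa", "Mm"].
    """
    pairs = set()
    # At each locus either allele may sit on the first haplotype.
    for choice in product(*[(g, g[::-1]) for g in genotypes]):
        hap1 = "".join(locus[0] for locus in choice)
        hap2 = "".join(locus[1] for locus in choice)
        pairs.add(tuple(sorted((hap1, hap2))))
    return sorted(pairs)

print(consistent_haplotype_pairs(["Aa", "Mm"]))  # double heterozygote: two phases
print(consistent_haplotype_pairs(["aa", "Mm"]))  # one heterozygous locus: one phase
print(consistent_haplotype_pairs(["AA", "Mm"]))
```

Only individuals heterozygous at two or more loci have ambiguous phase, which is why haplotype frequencies generally have to be estimated rather than counted.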
30
Haplotypes
        A     a
M       AM    aM
m       Am    am
This individual has AA and Mm genotypes and AM and Am haplotypes
31
Equilibrium haplotype frequencies
        A     a
M       pr    ps    p
m       qr    qs    q
        r     s
32
Linkage disequilibrium
        A         a
M       pr + D    ps - D    p
m       qr - D    qs + D    q
        r         s
D_max = min(ps, qr) if D > 0; min(pr, qs) if D < 0
D' = D / D_max
r^2 = D^2 / (pqrs)
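A short sketch of these quantities computed from four haplotype frequencies. It uses the standard definitions D = f(AM) - pr, D' = D / D_max and r^2 = D^2 / (pqrs), with p and q the frequencies of M and m and r and s the frequencies of A and a, and it picks the bound on D according to its sign. The haplotype frequencies in the example are hypothetical.

```python
def ld_measures(f_AM, f_aM, f_Am, f_am):
    """Return (D, D', r^2) from the four two-locus haplotype frequencies."""
    p = f_AM + f_aM   # frequency of M
    q = f_Am + f_am   # frequency of m
    r = f_AM + f_Am   # frequency of A
    s = f_aM + f_am   # frequency of a
    if min(p, q, r, s) <= 0:
        raise ValueError("both loci must be polymorphic")
    d = f_AM - p * r
    # The bound keeps every haplotype frequency non-negative.
    d_max = min(p * s, q * r) if d >= 0 else min(p * r, q * s)
    d_prime = d / d_max
    r_squared = d * d / (p * q * r * s)
    return d, d_prime, r_squared

# Hypothetical frequencies for haplotypes AM, aM, Am, am.
d, d_prime, r_squared = ld_measures(0.4, 0.1, 0.1, 0.4)
print(f"D = {d:.3f}, D' = {d_prime:.3f}, r^2 = {r_squared:.3f}")
```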
33
Haplotype analysis
1. Estimate haplotypes from genotypes
2. Associate haplotypes with trait
Haplotype   Freq.   Odds ratio
AAGG        40%     1.00*
AAGT        30%     2.21
CGCG        25%     1.07
AGCT        5%      0.92
* baseline, fixed to 1.00
35
Linkage and association
[Figure, linkage panels: sib correlation against IBD (0, 1, 2) at the QTL and at the marker; the two are related through the recombination fraction (RF)]
[Figure, association panels: trait against QTL genotype (aa, Aa, AA) and against marker genotype; the two are related through LD]
36
Variance Components
Means (ASSOCIATION)
  M1   M2
Variance-covariance matrix (LINKAGE)
  V1    C21
  C12   V2
37
Variance Components
Means
  M1 + bG1   M2 + bG2
Variance-covariance matrix
  V1                  C21 + q(π - ½)
  C12 + q(π - ½)      V2
LINKAGE: q = regression coefficient, π = IBD sharing (0, ½, 1)
ASSOCIATION: b = regression coefficient, G = individual’s genotype
38
Components of a Genetic Theory
POPULATION MODEL
– Allele & genotype frequencies
– Demographics & population history
– Linkage disequilibrium, haplotype structure
TRANSMISSION MODEL
– Mendelian segregation
– Identity by descent & genetic relatedness
PHENOTYPE MODEL
– Biometrical model of quantitative traits
– Additive & dominance components
[Figure: genotypes (G) passed down over time, with phenotypes (P) in the most recent generation]
39
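Linkage without association
[Pedigree figure, marker genotypes. Family 1: parents 3/5 and 2/6, offspring 3/2 and 5/2. Family 2: parents 3/5 and 2/6, offspring 3/6 and 5/6.]
Both families are ‘linked’ with the marker…
…but a different allele is involved.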
40
Linkage and association
[Pedigree figure, marker genotypes shown: 3/6, 2/4, 3/2, 6/2; 3/5, 2/6, 3/6, 5/6; 4/6, 2/6, 6/6, 6/6]
All families are ‘linked’ with the marker…
… and allele 6 is ‘associated’ with disease
Linkage is just association within families
41
Association without linkage
[Figure: cases and controls drawn from two populations; marker genotypes shown: 3/6, 2/4, 3/2, 6/2, 3/5, 2/5, 3/6, 5/6, 4/6, 2/6, 6/6, 6/6, 2/2, 3/4, 5/2]
Allele 6 is more common in the GREEN population
The disease is more common in the GREEN population
… a ‘spurious association’
42
TDT
Transmission disequilibrium test
– test for linkage and association
[Pedigree figure: parents and offspring with genotypes AA, Aa and aa]
43
TDT
“A” is the disease allele
[Figure: for AA x Aa matings (offspring AA or Aa) and aa x Aa matings (offspring Aa or aa), each offspring type is marked +, - or 0.5 under the Additive, Dominant and Recessive models]
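The slides do not spell out the test statistic. As a hedged illustration, the sketch below uses the usual McNemar-type form of the TDT, (T - U)^2 / (T + U), where T and U count how often heterozygous parents did and did not transmit allele A to an affected child. The trio data are hypothetical, and the function names are assumptions of this sketch.

```python
import math

def count_A(genotype):
    """Number of A alleles in a genotype string such as 'Aa'."""
    return genotype.count("A")

def tdt(trios):
    """TDT from (father, mother, affected child) genotypes.

    Returns (transmitted, untransmitted, chi-square, p-value with 1 df).
    """
    transmitted = untransmitted = 0
    for father, mother, child in trios:
        parents = (father, mother)
        n_het = sum(count_A(g) == 1 for g in parents)
        n_hom_A = sum(count_A(g) == 2 for g in parents)
        # A alleles in the child not explained by AA parents came from Aa parents.
        from_het = count_A(child) - n_hom_A
        transmitted += from_het
        untransmitted += n_het - from_het
    if transmitted + untransmitted == 0:
        raise ValueError("no heterozygous parents, TDT is undefined")
    stat = (transmitted - untransmitted) ** 2 / (transmitted + untransmitted)
    return transmitted, untransmitted, stat, math.erfc(math.sqrt(stat / 2))

# Hypothetical trios, written as (father, mother, child).
trios = ([("AA", "Aa", "AA")] * 12 + [("AA", "Aa", "Aa")] * 4
         + [("aa", "Aa", "Aa")] * 8 + [("aa", "Aa", "aa")] * 6)
t, u, stat, p = tdt(trios)
print(f"T = {t}, U = {u}, chi-square = {stat:.3f}, p = {p:.3f}")
```

Only heterozygous parents contribute, which is what makes the test robust to population stratification.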
44
Between and within components
Sib1 = B + W
Sib2 = B - W
45
Between and within components
Fulker et al (1999)
Genotype        Coded
S1    S2        S1    S2     B      W      S1       S2
AA    AA        1     1      1      0      B + W    B - W
AA    Aa        1     0      0.5    0.5    B + W    B - W
AA    aa        1     -1     0      1      B + W    B - W
Note: W = S1 – B
46
Parental genotypes
Use parental genotypes to generate B
Examples
– AA from AA x AA: W = 0
– Aa from AA x Aa: W = -0.5
– Aa from Aa x Aa: W = 0
Pat   Mat   B
 1     1     1
 1     0     0.5
 1    -1     0
 0     1     0.5
 0     0     0
 0    -1    -0.5
-1     1     0
-1     0    -0.5
-1    -1    -1
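The between and within components can be computed mechanically once genotypes are coded AA = 1, Aa = 0, aa = -1. This minimal sketch covers both cases described on these slides: B as the sib-pair mean, and B as the parental mean when parents are genotyped. Function names are hypothetical.

```python
CODE = {"AA": 1, "Aa": 0, "aa": -1}

def between_within_from_sibs(g1, g2):
    """B is the sib-pair mean; each W is that sib's deviation from B."""
    b = (g1 + g2) / 2
    return b, g1 - b, g2 - b

def between_within_from_parents(pat, mat, offspring):
    """B is the parental mean; W is the offspring's deviation from B."""
    b = (pat + mat) / 2
    return b, offspring - b

# Sib pair with genotypes aa and Aa.
print(between_within_from_sibs(CODE["aa"], CODE["Aa"]))
# Aa offspring of an AA x Aa mating.
print(between_within_from_parents(CODE["AA"], CODE["Aa"], CODE["Aa"]))
```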
47
assoc.mx
Sibling pair sample
B and W components precalculated in input file
Single SNP genotype
Quantitative trait
48
assoc.dat
s1       s2       g1   g2   b      w1     w2
-0.007   -0.972   -1   0    -0.5   -0.5   0.5
-0.829   -0.196   1    1    1      0      0
0.369    0.645    1    1    1      0      0
0.318    1.55     0    1    0.5    -0.5   0.5
1.52     0.910    0    0    0      0      0
-0.948   -1.55    1    1    1      0      0
0.596    -0.394   1    0    0.5    0.5    -0.5
-1.91    -0.905   0    1    0.5    -0.5   0.5
0.499    0.940    1    0    0.5    0.5    -0.5
-1.17    -1.29    1    0    0.5    0.5    -0.5
-0.16    -1.81    1    1    1      0      0
49
! Mx script for QTL association: sib pairs, univariate
Group 1 :
Calc NG=2
Begin Matrices;
! ** Parameters
B Full 1 1 free   ! association : between component
W Full 1 1 free   ! association : within component
M Full 1 1 free   ! mean
S Full 1 1 free   ! Shared residual variance
N Full 1 1 free   ! Nonshared residual variance
! ** Definition variables **
C Full 1 1        ! association : between
X Full 1 1        ! association : within, sib 1
Y Full 1 1        ! association : within, sib 2
End Matrices;
! ** Uncomment for B=W model
! Equate W 1 1 1 B 1 1 1
! Starting values
Matrix B 0
Matrix W 0
Matrix M 0
Matrix S 0.5
Matrix N 0.5
End
50
Group2 : Data Group
Data NI=7 NO=0
RE file=assoc.dat
Labels Sib1 Sib2 g1 g2 b w1 w2
Select Sib1 Sib2 b w1 w2 /
Definition b w1 w2 /
Matrices = Group 1
Means M + B*C + W*X | M + B*C + W*Y /
Covariance S + N | S _
           S | S + N /
Specify C b /
Specify X w1 /
Specify Y w2 /
End
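For readers without Mx, the model in this script can be sketched as a bivariate normal likelihood: each sib's expected trait value is M + B*b + W*w, and the sib-pair covariance matrix has S + N on the diagonal and S off the diagonal. The Python below is an illustration of that likelihood under those assumptions, not a translation of Mx internals; the function name `minus2ll` is made up, and the three data rows are taken from assoc.dat with the g1 and g2 columns dropped.

```python
import math

def minus2ll(rows, B, W, M, S, N):
    """-2 log-likelihood of sib pairs under the between/within association model.

    rows: iterable of (s1, s2, b, w1, w2) as selected from assoc.dat.
    """
    var = S + N                    # trait variance for each sib
    cov = S                        # covariance between sibs
    det = var * var - cov * cov    # determinant of the 2 x 2 covariance matrix
    if det <= 0:
        raise ValueError("covariance matrix is not positive definite")
    total = 0.0
    for s1, s2, b, w1, w2 in rows:
        r1 = s1 - (M + B * b + W * w1)   # residual, sib 1
        r2 = s2 - (M + B * b + W * w2)   # residual, sib 2
        quad = (var * (r1 * r1 + r2 * r2) - 2 * cov * r1 * r2) / det
        total += 2 * math.log(2 * math.pi) + math.log(det) + quad
    return total

rows = [
    (-0.007, -0.972, -0.5, -0.5, 0.5),
    (-0.829, -0.196, 1.0, 0.0, 0.0),
    (0.369, 0.645, 1.0, 0.0, 0.0),
]
# Evaluate at the script's starting values: B = W = M = 0, S = N = 0.5.
print(minus2ll(rows, B=0.0, W=0.0, M=0.0, S=0.5, N=0.5))
```

Minimising this function over B, W, M, S and N with any numerical optimiser would play the role of the Mx fit; constraining B = W or fixing parameters to zero gives the nested models compared on the following slides.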
51
Models
B & W
  B Full 1 1 free
  W Full 1 1 free
  !Equate W 1 1 1 B 1 1 1
B = W
  B Full 1 1 free
  W Full 1 1 free
  Equate W 1 1 1 B 1 1 1
B
  B Full 1 1 free
  W Full 1 1
  !Equate W 1 1 1 B 1 1 1
B=W=0
  B Full 1 1
  W Full 1 1
  !Equate W 1 1 1 B 1 1 1
52
Tests
Test                        H_A     H_0
Standard association test   B = W   B=W=0
Test of stratification      B & W   B = W
Robust association test     B & W   B
53
assoc.mx
Model    B         W        -2LL      df
B & W    -0.478    -0.365   2103.96   795
B = W    -0.420    -0.420   2105.05   796
B        -0.4778            2127.01   796
B=W=0                       2163.34   797
Test of total association
H_A   B = W    2105.05
H_0   B=W=0    2163.34
Δ-2LL = 58.29, df = 1, p ≈ 2e-14
54
assoc.mx
Test of stratification
H_A   B & W    2103.96
H_0   B = W    2105.05
Δ-2LL = 1.09, df = 1, p = 0.29
55
assoc.mx
Test of within association
H_A   B & W    2103.96
H_0   B        2127.01
Δ-2LL = 23.06, df = 1, p ≈ 1.6e-6
56
Implementation
QTDT
– Abecasis et al (2001) AJHG
– extends between/within model to general pedigrees
– multiple alleles
– covariates
– combined test of linkage and association
– discrete as well as quantitative traits
57
Linkage
– families
– detectable over large distances (>10 cM)
– large effects (OR > 3, variance > 10%)
Association
– unrelateds or families
– detectable over small distances (<1 cM)
– small effects (OR < 2, variance < 1%)