Shaun Purcell Psychiatric & Neurodevelopmental Genetics Unit Center for Human Genetic Research Massachusetts General Hospital


1 Gene-environment & gene-gene interaction in association studies: a methodologic introduction
Shaun Purcell
Psychiatric & Neurodevelopmental Genetics Unit, Center for Human Genetic Research, Massachusetts General Hospital
http://pngu.mgh.harvard.edu/~purcell
spurcell@pngu.mgh.harvard.edu

2 Finding disease-causing variation
[Figure: the human genome, chromosome 4, and a DNA sequence containing a SNP (single nucleotide polymorphism)]
…GGCGGTGTTCCGGGCCATCACCATTGCGGG CCGGATCAACTGCCCTGTGTACATCACCAAG GTCATGAGCAAGAGTGCAGCCGACATCATCG CTCTGGCCAGGAAGAAAGGGCCCCTAGTTTT TGGAGAGCCCATTGCCGCCAGCCTGGGGACC GATGGCACCCATTACTGGAGCAAGAACTGGG CCAAGGCTGCGGCGTTCGTGACTTCCCCTCC CCTGAGCCCGGACCCTACCACGCCCGACTA…

3 Rare disease, major gene effect
Genotype   Risk of disease
DD         0.001
Dd         0.001
dd         0.95
Disease prevalence ~1 in 1000
Individuals with dd are ~1000 times more likely to get disease
Frequency of d in controls ~5%
Frequency of d in cases ~96%

4 Common polygenic disease

5 Common disease, polygenic effects
Genotype   Risk of disease
DD         0.01
Dd         0.012
dd         0.0144
Disease prevalence ~1 in 100
Each extra d allele increases risk by ~1.2 times
Frequency of d in controls ~5%
Frequency of d in cases ~6%
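The case and control allele frequencies on this slide can be checked from the three genotype risks. The sketch below assumes Hardy-Weinberg proportions and a population frequency of 0.05 for d (the slide quotes ~5% for controls); the function name is illustrative and not taken from the talk.

```python
def case_control_allele_freq(q, penetrance):
    """Return (freq of d in cases, freq of d in controls, prevalence).

    q          -- population frequency of allele d (assumed, Hardy-Weinberg)
    penetrance -- risk of disease for genotypes "DD", "Dd", "dd"
    """
    p = 1.0 - q
    geno_freq = {"DD": p * p, "Dd": 2 * p * q, "dd": q * q}
    d_copies = {"DD": 0, "Dd": 1, "dd": 2}

    # Prevalence is the genotype-frequency-weighted average risk.
    prevalence = sum(geno_freq[g] * penetrance[g] for g in geno_freq)

    # Count d alleles among affected and unaffected individuals.
    d_in_cases = sum(geno_freq[g] * penetrance[g] * d_copies[g] for g in geno_freq)
    d_in_controls = sum(
        geno_freq[g] * (1.0 - penetrance[g]) * d_copies[g] for g in geno_freq
    )

    case_freq = d_in_cases / (2.0 * prevalence)
    control_freq = d_in_controls / (2.0 * (1.0 - prevalence))
    return case_freq, control_freq, prevalence


polygenic = {"DD": 0.01, "Dd": 0.012, "dd": 0.0144}
case_freq, control_freq, prevalence = case_control_allele_freq(0.05, polygenic)
print(f"cases {case_freq:.4f}, controls {control_freq:.4f}, prevalence {prevalence:.4f}")
```

Under these assumptions the frequency of d comes out close to 6% in cases and 5% in controls, which is why a very large sample is needed to see such a difference.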

6 [Diagram: Genotype, Environment, Phenotype; the relationship between them is marked "?"]

7 Gene-environment interaction (G × E interaction)
[Diagram labels: gene effect; environmental effect; gene-environment correlation (?)]
The environment modifies the effect of a gene
A gene modifies the effect of an environment

8 Gene × gene interaction (epistasis)
[Diagram labels: gene effect; linkage disequilibrium (LD); epistasis]
Epistasis: one gene modifies the effect of another

9 Classical definition of epistasis
The aa genotype masks the effect of the bb genotype
[Figure: trait by genotype at locus A (AA, Aa, aa) and locus B (BB, Bb, bb)]

10 Separate analysis
Locus A shows an association with the trait
Locus B appears unrelated
[Figure panels: Marker A; Marker B]

11 Joint analysis
Locus B modifies the effects of locus A

12 Two-locus genotypes
                   Locus A
Locus B    AA      Aa      aa
BB         AABB    AaBB    aaBB
Bb         AABb    AaBb    aaBb
bb         AAbb    Aabb    aabb

13 Epistasis & haplotypes
Two-locus genotype A/a B/b (AaBb): A and B need not even be on the same chromosome
Haplotype AB / ab: A and B on the same chromosome; effect could appear as “interaction”
cis versus trans effects
[Figure: arrangements of alleles A/a and B/b under two scenarios, “AB haplotype causes disease” and “A and B interact to cause disease”, each arrangement labelled disease or no disease]

14 Two-locus genotypes
                   Locus A
Locus B    AA        Aa        aa
BB         f_AABB    f_AaBB    f_aaBB    f_BB
Bb         f_AABb    f_AaBb    f_aaBb    f_Bb
bb         f_AAbb    f_Aabb    f_aabb    f_bb
           f_AA      f_Aa      f_aa      f
“Penetrance” = probability of developing disease given genotype

15 Common disease, polygenic effects
Genotype   Risk of disease
DD         0.01
Dd         0.012
dd         0.0144
Disease prevalence ~1 in 100
Each extra d allele increases risk by ~1.2 times
Frequency of d in controls ~5%
Frequency of d in cases ~6%

16 Small single SNP effects might represent larger epistatic effects
[Figure: risk of developing disease by genotype at locus A (AA, Aa, aa) and locus B (BB, Bb, bb); plotted values 0.01 and 0.20, with 0.01 and 0.012 alongside]
Frequency a = b = 0.1
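One reading of the numbers on this slide is sketched below. The assignment of 0.20 to the aabb cell and 0.01 to every other cell is an assumption, because the figure itself is not recoverable from the transcript; independence of the two loci is also assumed. Averaging over locus B then gives the risk seen when locus A is analysed alone.

```python
def genotype_freqs(minor_freq, labels):
    """Hardy-Weinberg frequencies for (major hom, het, minor hom) labels."""
    q = minor_freq
    p = 1.0 - q
    return dict(zip(labels, (p * p, 2 * p * q, q * q)))


def marginal_risk_at_a(freq_a, freq_b, joint_risk):
    """Average the two-locus risks over locus B for each locus A genotype."""
    b_freqs = genotype_freqs(freq_b, ("BB", "Bb", "bb"))
    return {
        ga: sum(b_freqs[gb] * joint_risk[(ga, gb)] for gb in b_freqs)
        for ga in ("AA", "Aa", "aa")
    }


# Assumed penetrance pattern: baseline 0.01, raised to 0.20 only for aabb.
joint_risk = {
    (ga, gb): 0.01 for ga in ("AA", "Aa", "aa") for gb in ("BB", "Bb", "bb")
}
joint_risk[("aa", "bb")] = 0.20

risks = marginal_risk_at_a(0.1, 0.1, joint_risk)
for genotype, risk in risks.items():
    print(genotype, round(risk, 4))
```

With a = b = 0.1, the aa genotype has a marginal risk of about 0.012 against 0.01 for the others, so a 20-fold effect in one cell looks like a 1.2-fold single-SNP effect.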

17 Interaction may be a common feature of genetic variation
Brem et al (2005) Nature
–gene expression phenotypes in yeast
–two-stage approach to find pairs of loci
65% of these pairs showed significant interaction
Many secondary loci would be missed by standard approaches, though

18 Examples of interactions?
Risk                                    Environment             Outcome
phenylalanine hydroxylase deficiency    dietary phenylalanine   mental retardation
debrisoquine metabolism                 smoking                 lung cancer
fair skin                               sun exposure            skin cancer
Lewis blood group                       alcohol intake          coronary atherosclerosis
APOE genotype                           head injury             Alzheimer's disease

19 The rest of this talk… Statistical issues Study designs Examples

20 Family-based transmission disequilibrium test (TDT)
Population-based case/control
[Figure: example genotypes (AA, AC, CC) for each design]

21 [Figure: paternal and maternal haplotypes at Marker 1 (“Locus”, “SNP”), Marker 2 and Marker 3]
Genotypes: A/C, G/G, T/G
Haplotypes: AGT/CGG or AGG/CGT?
In the population, 2 alleles imply 3 genotypes:
Allele      A       C
Frequency   p       q = 1 - p
Genotype    AA (homozygote)   AC (heterozygote)   CC (homozygote)
Frequency   $p^2$             $2pq$               $q^2$
An “association study”: does allele/genotype/haplotype frequency differ between cases and controls?
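The genotype frequencies on this slide follow from the allele frequencies under Hardy-Weinberg proportions. A minimal sketch, with an illustrative function name and an arbitrary example value of p:

```python
def hardy_weinberg(p):
    """Genotype frequencies for alleles A (frequency p) and C (frequency 1 - p)."""
    q = 1.0 - p
    return {"AA": p * p, "AC": 2 * p * q, "CC": q * q}


freqs = hardy_weinberg(0.7)  # p = 0.7 is an arbitrary example value
for genotype, freq in freqs.items():
    print(genotype, round(freq, 2))
```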

22 Relative risk
        D+    D-
E+      a     b
E-      c     d
Risk in E+ = a / (a + b)
Risk in E- = c / (c + d)
Relative risk of exposure = (a / (a + b)) / (c / (c + d))

23 Odds ratio: measure of association
           A     a
Case       a     b
Control    c     d
Odds of A in cases = a/b
Odds of A in controls = c/d
Odds ratio = (a/b) / (c/d) = ad / bc
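Both measures are simple functions of the four cell counts a, b, c, d. The sketch below uses illustrative function names; the relative-risk counts are made up for the example, while the odds-ratio counts are the E+ stratum used later in the talk (60, 40, 80, 20).

```python
def relative_risk(a, b, c, d):
    """Risk in exposed (a / (a + b)) divided by risk in unexposed (c / (c + d))."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """(a / b) / (c / d), which simplifies to ad / bc."""
    return (a * d) / (b * c)


# Hypothetical exposure table: 30 of 100 exposed and 10 of 100 unexposed affected.
print(relative_risk(30, 70, 10, 90))

# Allele counts: cases 60 A / 40 a, controls 80 A / 20 a.
print(odds_ratio(60, 40, 80, 20))
```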

24
              E-            E+
              A     a       A     a
Case          80    20      60    40
Control       80    20      80    20
Odds ratio    1.00          0.375
              (80*20)/(80*20)   (60*20)/(80*40)
$Z = \dfrac{\ln(OR_{E-}) - \ln(OR_{E+})}{\sqrt{V_{E-} + V_{E+}}}$
$V(\ln(OR)) = 1/a + 1/b + 1/c + 1/d$
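The test on this slide compares the log odds ratios of the two exposure strata. A short sketch using the slide's counts (function names are illustrative; the two-sided p-value from a standard normal is an added convenience, not part of the slide):

```python
import math


def log_or_and_variance(a, b, c, d):
    """Return ln(OR) and its approximate variance 1/a + 1/b + 1/c + 1/d."""
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def interaction_z(unexposed, exposed):
    """Z statistic for a difference in ln(OR) between two strata."""
    ln_or_0, var_0 = log_or_and_variance(*unexposed)
    ln_or_1, var_1 = log_or_and_variance(*exposed)
    return (ln_or_0 - ln_or_1) / math.sqrt(var_0 + var_1)


# Counts in the order (case A, case a, control A, control a).
z = interaction_z(unexposed=(80, 20, 80, 20), exposed=(60, 40, 80, 20))
p_two_sided = math.erfc(abs(z) / math.sqrt(2))
print(f"Z = {z:.3f}, two-sided p = {p_two_sided:.3f}")
```

For these counts Z is a little above 2, so the difference between the stratum odds ratios of 1.00 and 0.375 is nominally significant at the 0.05 level.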

25 Regression modeling of interaction
$Y = b_X X + e$
$Y = b_X X + b_Z Z + b_I XZ + e$ (the $b_I XZ$ term is the interaction component)
$Y = (b_X + b_I Z) X + b_Z Z + e$ (the effect of X on Y is modified by Z)

26 $Y = b_0 + b_1 G + b_2 E + b_3 (G \times E)$
Linear for continuous outcomes
Logistic regression for yes/no outcomes
G = 0, 1, 2 copies of allele “A”
E = yes/no exposure (0/1) or a continuous measure
[Figure: Y against gene dosage (0, 1, 2), with separate lines for E- and E+]
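In the logistic form of this model, the per-allele effect in each exposure group is b1 + b3·E on the log-odds scale. The sketch below uses made-up coefficients purely to illustrate how b3 changes the per-allele odds ratio; nothing here is estimated from data.

```python
import math


def log_odds(g, e, b0, b1, b2, b3):
    """Linear predictor b0 + b1*G + b2*E + b3*G*E for G copies and exposure E."""
    return b0 + b1 * g + b2 * e + b3 * g * e


def per_allele_odds_ratio(b1, b3, e):
    """Odds ratio for one extra allele at exposure level e: exp(b1 + b3*e)."""
    return math.exp(b1 + b3 * e)


# Hypothetical coefficients: OR 1.2 per allele when unexposed, and an
# interaction term that multiplies it by 1.5 when exposed.
b0, b1, b2, b3 = -4.0, math.log(1.2), 0.3, math.log(1.5)

print(per_allele_odds_ratio(b1, b3, e=0))
print(per_allele_odds_ratio(b1, b3, e=1))
```

A value of b3 = 0 would make the two printed odds ratios identical, which is the no-interaction null on this scale.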

27 Epistasis & dominance
Dominance as intralocus interaction
–the dominance component can also interact, e.g. with an environment:
$Y = b_0 + b_1 A + b_2 D + b_3 E + b_4 (A \times E) + b_5 (D \times E)$
A coded { -1, 0, +1 }
D coded { 0, 1, 0 }
Epistasis as interlocus interaction
–additive × additive (two-way interactions)
–additive × dominance (three-way interactions)
–dominance × dominance (four-way interactions)

28 [Figure: Y by genotype (AA, Aa, aa), with separate lines for E- and E+]
$Y = b_0 + b_1 A + b_2 D + b_3 E + b_4 (A \times E) + b_5 (D \times E)$
A coded { -1, 0, +1 }
D coded { 0, 1, 0 }
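The additive and dominance codes can be turned into predicted values directly. In this sketch the mapping of AA, Aa, aa to A = -1, 0, +1 assumes the codes are listed in the same order as the genotypes on the slide, and all coefficient values are made up for illustration.

```python
ADDITIVE = {"AA": -1, "Aa": 0, "aa": 1}   # assumed ordering of the slide's codes
DOMINANCE = {"AA": 0, "Aa": 1, "aa": 0}   # heterozygote deviation


def predict(genotype, exposure, coef):
    """Predicted Y from additive, dominance, exposure and interaction terms."""
    a = ADDITIVE[genotype]
    d = DOMINANCE[genotype]
    return (
        coef["intercept"]
        + coef["A"] * a
        + coef["D"] * d
        + coef["E"] * exposure
        + coef["AxE"] * a * exposure
        + coef["DxE"] * d * exposure
    )


# Hypothetical coefficients, chosen only to show the shape of the model.
coef = {"intercept": 10.0, "A": 2.0, "D": 1.0, "E": 3.0, "AxE": 0.5, "DxE": -1.0}

for exposure in (0, 1):
    print(exposure, [predict(g, exposure, coef) for g in ("AA", "Aa", "aa")])
```

Here the heterozygote sits above the midpoint of the homozygotes when unexposed and exactly on it when exposed, which is the kind of pattern a dominance × environment term describes.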

29 The “Interactome”

30 Definitions of epistasis
Biological: individual-level phenomenon
Statistical: population-level phenomenon

31 Requires: 1) Variation between individuals 2) Effect on disease
Requires: 1) Correct statistical definition of effect

32 What do interactions mean?
TEST MAIN EFFECT
–Null hypothesis straightforward
TEST INTERACTION
–Null hypothesis is a mathematical model describing joint effects
        A-    A+
B-      1     a
B+      b     ?

33 Multiplicative risk ratios
        A-    A+    RR(A)
B-      1     a     a/1 = a
B+      b     ab    ab/b = a

Additive risk differences
        A-    A+       RD(A)
B-      1     a        a - 1
B+      b     a+b-1    (a+b-1) - b = a - 1

34 “…we defined interaction as departure from a multiplicative model…”
Multiplicative model (a × b)
–common, easy to implement, logistic regression
–additive on log-odds scale, multiplicative on risk scale
Other common models (on risk)
–additive (a + b)
–heterogeneity model (a + b - ab)

35 LENGTH = A + B
[Figure: A-/A+ by B-/B+, with values 10, 20, 30]

36 AREA = A + B + A×B
[Figure: A-/A+ by B-/B+, with values 100, 400, 900]

37 [Figure panels: Original; Log-transform; Cubic-transform; Censored; 7-point scale. Effects shown for G1, G2 and G1 × G2]

38 OR(A) = 2, OR(B) = 2
[Figure: joint effect on a scale running from 1/3 to 5]
Additive (3.00)
Multiplicative (4.00)
???

39 OR(A) = 1.2, OR(B) = 1.2
[Figure: joint effect on a scale running from 1/3 to 5]
Additive (1.40)
Multiplicative (1.44)
?
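The joint effects quoted on these two slides follow from the additive (a + b - 1) and multiplicative (a × b) null models for two relative effects a and b. A small sketch with an illustrative function name:

```python
def expected_joint_effect(effect_a, effect_b):
    """Expected joint effect under the two null models of no interaction."""
    return {
        "additive": effect_a + effect_b - 1.0,
        "multiplicative": effect_a * effect_b,
    }


for a, b in ((2.0, 2.0), (1.2, 1.2)):
    models = expected_joint_effect(a, b)
    gap = models["multiplicative"] - models["additive"]
    print(a, b, {k: round(v, 2) for k, v in models.items()}, "gap", round(gap, 2))
```

For effects of 2 the models predict 3.00 against 4.00; for effects of 1.2 they predict 1.40 against 1.44, so with small effects the choice of null model makes little practical difference.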

40 Family-based controls
Population-based controls
No controls (case-only design)
[Figure: the designs arranged between two poles, “More robust, fewer assumptions” vs. “More efficient, powerful”]

41 Case-only design
Detect interaction only, no main effects
Risk factors    Prevalence
G- E-           $p_0$
G+ E-           $p_G$
G- E+           $p_E$
G+ E+           $p_{GE} = p_0 \cdot (p_G / p_0) \cdot (p_E / p_0)$

42 Case-only design
Detect interaction only, no main effects
Risk factors    Prevalence
G- E-           $p_0$
G+ E-           $p_G$
G- E+           $p_E$
G+ E+           $p_{GE} = p_0 \cdot (p_G / p_0) \cdot (p_E / p_0)$
Leads to $OR_{INT} = OR_{GE} / (OR_G \cdot OR_E)$
It turns out, $OR_{INT} = OR_{Case} / OR_{Control}$, where $OR_{Case}$ is the association of G and E in cases and $OR_{Control}$ is the association of G and E in controls
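The ratio OR_Case / OR_Control can be computed from two G-by-E tables. The counts below are hypothetical and chosen so that G and E are independent in controls; in that situation the control odds ratio is 1 and the case-only odds ratio alone estimates the interaction, which is the assumption the case-only design relies on.

```python
def ge_odds_ratio(g1e1, g1e0, g0e1, g0e0):
    """Odds ratio for the association between G and E within one group."""
    return (g1e1 * g0e0) / (g1e0 * g0e1)


def interaction_odds_ratio(case_counts, control_counts):
    """OR_INT = OR_Case / OR_Control."""
    return ge_odds_ratio(*case_counts) / ge_odds_ratio(*control_counts)


# Hypothetical counts in the order (G+E+, G+E-, G-E+, G-E-).
cases = (40, 60, 60, 240)
controls = (25, 75, 75, 225)

print("OR in cases:   ", round(ge_odds_ratio(*cases), 3))
print("OR in controls:", round(ge_odds_ratio(*controls), 3))
print("OR interaction:", round(interaction_odds_ratio(cases, controls), 3))
```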

43 Case-only designs offer efficient detection of interaction
[Figure: % replicates significant at p=0.05, under “No interaction” and “Interaction”]

44 Case-only design isn’t always valid
Chromosomal proximity [Figure: Gene A, Gene B]
Multiple ethnicities in case sample: stratification [Figure: Gene A, Gene B]

45 Epistasis: LD in cases ≠ LD in controls

46 Genes in 5q GABA cluster
[Figure panels: Cases (Scz); Controls]
Pamela Sklar, Tracey Petryshen, C&M Pato

47 TDT requires independence assumption
[Figure: parent-offspring genotypes at locus A (AA, Aa, aa), stratified for bb probands and for BB probands, with transmission rates → 100%, → 0%, → 100%]
If variants A and B are in LD (common haplotypes AB / ab) → false positive interactions (due to linkage or population stratification)

48 An “all pairs of SNPs” approach to epistasis does not scale well
# SNPs     # pairs
5          10
10         45
50         1,225
100        4,950
500        124,750
500,000    124,999,750,000
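The pair counts in this table are n(n - 1)/2 for n SNPs. A quick check of each row, using the standard library's `math.comb`:

```python
import math


def n_snp_pairs(n_snps):
    """Number of unordered SNP pairs: n choose 2 = n * (n - 1) / 2."""
    return math.comb(n_snps, 2)


for n in (5, 10, 50, 100, 500, 500_000):
    print(f"{n:>7,} SNPs -> {n_snp_pairs(n):,} pairs")
```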

49 Multiple testing increases false positives
[Figure: P(at least 1 false positive) against the number of independent tests performed, for a per-test false positive rate of 0.05 and of 0.001 (= 0.05/50)]
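For independent tests, the probability of at least one false positive is 1 - (1 - α)^n. The sketch below evaluates it for the two per-test rates named on the slide; the choice of 50 tests matches the slide's 0.05/50 correction, and the function name is illustrative.

```python
def prob_any_false_positive(alpha, n_tests):
    """P(at least one false positive) across n independent tests at level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


n_tests = 50
for alpha in (0.05, 0.05 / n_tests):
    p_any = prob_any_false_positive(alpha, n_tests)
    print(f"per-test rate {alpha:.3f}: P(any false positive) = {p_any:.3f}")
```

At 0.05 per test, 50 tests almost guarantee a false positive; dividing the threshold by the number of tests brings the overall rate back to just under 0.05.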

50 Tests for interaction have low power
[Figure: statistical power against increasing sample N, for the standard association test and the epistasis test]

51 Dysbindin-1 (DTNBP1) & schizophrenia
DTNBP1 & 7 other genes encode proteins that make up the BLOC1 protein complex
–biogenesis of lysosome-related organelles complex 1
DTNBP1’s effect on Scz mediated via BLOC1?
–if so, an analysis including all 8 genes might help to resolve inconsistent studies
Derek Morris, Aiden Corvin, Michael Gill

52 DTNBP1 association studies
[Figure: DTNBP1 gene map (exons 1-10) showing the SNPs and associated haplotypes reported by each study]
SNPs: rs1047631, P1328, P1333, rs734129, P1287, rs3829893, P1655, P1635, rs2619542, P1325, rs2619550, P1765, P1757, P1320, P1763, P1578, P1792, P1795, P1583, rs2743852, rs2619538
Studies: Straub et al. (2002); Schwab et al. (2003); Van den Oord et al. (2003); Van den Bogaert et al. (2003); Tang et al. (2003); Kirov et al. (2004); Williams et al. (2004); Funke et al. (2004); Numakawa et al. (2004); Li et al. (2005)

53 Types of interaction
[Figure: three panels comparing G+ and G-: direction of effect; presence of effect; magnitude of effect]

54 Duplicate gene action
Example: Kernel Color in Wheat
Only 1 dominant allele required, either A or B
A_B_    Normal
A_bb    Normal
aaB_    Normal
aabb    No product
[Figure: 3 × 3 grid of genotypes AA, Aa, aa by BB, Bb, bb]

55 Complementary gene action
Example: Flower color in sweet pea
One recessive genotype at either gene would increase disease risk, i.e. genes A and B required
A_B_    Normal
A_bb    No product
aaB_    No product
aabb    No product
[Figure: 3 × 3 grid of genotypes AA, Aa, aa by BB, Bb, bb]
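The two patterns differ only in whether a dominant allele is needed at either locus or at both. The sketch below enumerates the nine two-locus genotypes for each model; the names and the True/False encoding of "Normal" versus "No product" are illustrative choices.

```python
HAS_DOMINANT = {"AA": True, "Aa": True, "aa": False,
                "BB": True, "Bb": True, "bb": False}


def makes_product(model, geno_a, geno_b):
    """True if the two-locus genotype is 'Normal' under the given model."""
    has_a = HAS_DOMINANT[geno_a]
    has_b = HAS_DOMINANT[geno_b]
    if model == "duplicate":        # one dominant allele at either locus suffices
        return has_a or has_b
    if model == "complementary":    # a dominant allele is required at both loci
        return has_a and has_b
    raise ValueError(f"unknown model: {model}")


def genotype_grid(model):
    """Map each (locus A, locus B) genotype pair to Normal (True) or not."""
    return {
        (ga, gb): makes_product(model, ga, gb)
        for gb in ("BB", "Bb", "bb")
        for ga in ("AA", "Aa", "aa")
    }


for model in ("duplicate", "complementary"):
    grid = genotype_grid(model)
    print(model)
    for gb in ("BB", "Bb", "bb"):
        row = ["Normal" if grid[(ga, gb)] else "None" for ga in ("AA", "Aa", "aa")]
        print(" ", gb, row)
```

Under duplicate gene action only the aabb cell lacks product; under complementary gene action every cell containing aa or bb does.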

56 [Figure: 3 × 3 two-locus genotype grids (AA, Aa, aa by BB, Bb, bb) for four models: complementary gene action; duplicate gene action; heterogeneity model; “checkerboard” model]

57 Negative feedback: a common biological mechanism

58 Negative feedback: simple model of dysregulation
[Figure: two-locus grid of genotypes -/-, +/-, +/+ at each locus]

59 Negative feedback: single marker analysis leads to the “opposite allele” problem
[Figure: single marker relative risk against frequency of one locus (other locus fixed, p=0.4), for genotypes -/-, +/-, +/+]

60 Standard single SNP analyses
[Figure: -log10(p-value) per SNP for DTNBP1, MUTED, PLDN, SNAPAP, CNO, BLOC1S1, BLOC1S2 and BLOC1S3, with the p=0.05 threshold marked]
Dysbindin-1 by itself shows no evidence of association with Scz
373 Irish schizophrenics, 812 controls

61 [Figure: grid of markers A-J by markers 1-8, giving the pairs A1, A2, …, J7, J8]
A single gene-based test
80 allele-based tests

62 An independent replication? DTNBP1 × MUTED epistasis (Straub et al., WCPG meeting, Oct 2005)
[Figure: odds ratio for DTNBP1 by MUTED genotype]
[Figure: known protein interactions in the BLOC-1 complex: DTNBP1, MUTED, BLOC1S2, CNO, PLDN, SNAPAP, BLOC1S1, BLOC1S3]
Gene-based p = 0.0009
Correcting for multiple tests, p = 0.025

63 DTNBP1 & MUTED
DTNBP1 × MUTED gene-based test: p = 0.0009; corrected, p = 0.025
Most significant DTNBP1 × MUTED allele-based result (rs2619539 × rs10458217):
               Single marker    Joint
DTNBP1         1.02 (0.794)     0.77 (0.07)
MUTED          0.93 (0.549)     0.93 (0.54)
INTERACTION    n/a              1.54 (0.009)
Values are odds ratio (nominal p-value)

64 Methylenetetrahydrofolate reductase (MTHFR) polymorphisms and serum folate interact to influence negative symptoms and cognitive impairment in schizophrenia
Joshua Roffman, Donald Goff, et al
Folic acid deficiency may contribute to negative symptoms and cognitive impairment in schizophrenia
–underlying mechanism remains uncertain
A cohort of 159 outpatients with schizophrenia, measured:
–negative symptoms
–frontal lobe deficits

65 [Figure panels: PANSS Negative Symptoms; Verbal Fluency; WCST % Perseverative Errors; each comparing C/C & C/T with T/T]
Interaction of low serum folic acid and homozygosity for the MTHFR 677T allele confers risk.
Patients homozygous for the MTHFR 677T allele may therefore benefit specifically from folic acid supplementation.

66 Further reading
Cordell HJ (2002) Human Molecular Genetics, 11, 2463-2468.
–a statistical review of epistasis, methods and definitions
Clayton D & McKeigue P (2001) The Lancet, 358, 1357-1360.
–a critical appraisal of GxE research
Marchini J, Donnelly P & Cardon LR (2005) Nature Genetics, 37, 413-417.
–epistasis in whole-genome association studies

