Gene-gene and gene-environment interactions Manuel Ferreira Massachusetts General Hospital Harvard Medical School Center for Human Genetic Research.

1 Gene-gene and gene-environment interactions Manuel Ferreira Massachusetts General Hospital Harvard Medical School Center for Human Genetic Research

2 Slides can be found at: http://pngu.mgh.harvard.edu/~mferreira/

3 Outline
1. G-G and G-E in the context of gene mapping
2. Genetic concepts
3. What is epistasis?
4. Study designs and tests to detect epistasis
5. Application to genome-wide datasets

4 1. G-G and G-E in context

5 The Human Genome: finding disease-causing variation. Chromosome 4 DNA sequence, with a SNP (single nucleotide polymorphism) marked: …GGCGGTGTTCCGGGCCATCACCATTGCGGG CCGGATCAACTGCCCTGTGTACATCACCAAG GTCATGAGCAAGAGTGCAGCCGACATCATCG CTCTGGCCAGGAAGAAAGGGCCCCTAGTTTT TGGAGAGCCCATTGCCGCCAGCCTGGGGACC GATGGCACCCATTACTGGAGCAAGAACTGGG CCAAGGCTGCGGCGTTCGTGACTTCCCCTCC CCTGAGCCCGGACCCTACCACGCCCGACTA…

6 Common disease

7 Common disease, polygenic effects
Genotype   Risk of disease
DD         0.01
Dd         0.012
dd         0.0144
Disease prevalence: ~1 in 100. Each extra d allele increases risk by ~1.2 times. Frequency of d in controls: ~5%. Frequency of d in cases: ~6%.
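The arithmetic behind slide 7 can be sketched as follows. This is an illustration only, assuming Hardy-Weinberg proportions and treating the ~5% frequency of d as the population frequency (neither assumption is stated on the slide); the function names are hypothetical.

```python
# Sketch: genotype risks and case allele frequency under a multiplicative
# per-allele model. Assumptions (not from the slide): Hardy-Weinberg
# proportions, and q = 0.05 is the population frequency of allele d.

def genotype_risks(baseline, per_allele_rr):
    """Risk for 0, 1 and 2 copies of the risk allele d (DD, Dd, dd)."""
    return [baseline * per_allele_rr ** copies for copies in range(3)]

def case_allele_frequency(q, risks):
    """Return (frequency of d among cases, disease prevalence)."""
    p = 1.0 - q
    genotype_freq = [p * p, 2 * p * q, q * q]           # DD, Dd, dd
    weighted = [f * r for f, r in zip(genotype_freq, risks)]
    prevalence = sum(weighted)
    case_freq = [w / prevalence for w in weighted]       # genotype frequencies in cases
    return 0.5 * case_freq[1] + case_freq[2], prevalence

risks = genotype_risks(baseline=0.01, per_allele_rr=1.2)
q_cases, prevalence = case_allele_frequency(q=0.05, risks=risks)
print(risks, round(prevalence, 4), round(q_cases, 3))
```

Under these assumptions the frequency of d among cases comes out just under 6%, which is consistent with the rounded figures on the slide.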

8 Gene-environment interaction (G x E interaction): the environment modifies the effect of a gene; a gene modifies the effect of an environment. [Diagram labels: gene effect, environmental effect, gene-environment correlation, ?] S.Purcell ©

9 Gene × gene interaction. Epistasis: one gene modifies the effect of another. [Diagram labels: gene effect, linkage disequilibrium (LD), epistasis] S.Purcell ©

10 2. Genetic concepts

11 Two-locus genotypes
Locus A: alleles A (frequency p_A) and a (frequency q_A), with p_A + q_A = 1
Locus B: alleles B (frequency p_B) and b (frequency q_B), with p_B + q_B = 1

                    AA (p_A^2)   Aa (2 p_A q_A)   aa (q_A^2)
BB (p_B^2)          AABB         AaBB             aaBB
Bb (2 p_B q_B)      AABb         AaBb             aaBb
bb (q_B^2)          AAbb         Aabb             aabb

2-locus genotype versus haplotype: AAbb = Ab / Ab if and only if [diagram: A b / A b]; AAbb ≠ Ab / Ab if [diagram: A A / b b].
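As a small illustration of slide 11, the nine two-locus genotype frequencies can be computed from the allele frequencies. This sketch assumes Hardy-Weinberg proportions at each locus and independence between the loci; the allele frequencies used (0.8 for A and for B) are hypothetical.

```python
# Sketch: two-locus genotype frequencies from allele frequencies.
# Assumptions: Hardy-Weinberg proportions at each locus and independence
# between the two loci. The input frequencies are hypothetical.

def single_locus(p, upper, lower):
    """Genotype frequencies at one locus, e.g. AA, Aa, aa."""
    q = 1.0 - p
    return {upper * 2: p * p, upper + lower: 2 * p * q, lower * 2: q * q}

def two_locus(p_a, p_b):
    """Frequencies of the nine two-locus genotypes (AABB ... aabb)."""
    locus_a = single_locus(p_a, "A", "a")
    locus_b = single_locus(p_b, "B", "b")
    return {ga + gb: fa * fb
            for ga, fa in locus_a.items()
            for gb, fb in locus_b.items()}

freqs = two_locus(p_a=0.8, p_b=0.8)
for genotype, freq in freqs.items():
    print(genotype, round(freq, 4))
```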

12 Effect of a locus on disease risk can be expressed as: genotype penetrances, genotype relative risks, or genotype odds ratios. Effect of a locus on a continuous trait is usually expressed as: genotype means.

13 Genotype penetrance: probability of developing disease for a given genotype, P(D=affected | G). Example: P(D=affected | G=AABB) = 0.01, so P(D=unaffected | G=AABB) = 0.99. Penetrance scale: P(D=affected | G) runs from 0 to 1 (ticks at 0, .25, .5, .75, 1), and P(D=unaffected | G) is its complement.

14 Genotype relative risk (RR): risk of developing disease for a given genotype, relative to the risk for a reference genotype.
RR(G) = P(D=affected | G) / P(D=affected | G=ref)
Example: RR(AABb) = P(D=affected | G=AABb) / P(D=affected | G=AABB) = 2.6
Relative risk scale: 0 to ∞ (ticks at 0, 1, 2, 3, ∞)

15 Genotype odds ratio (OR): odds of developing disease for a given genotype, relative to the odds for a reference genotype.
OR(G) = [P(D=affected | G) / P(D=unaffected | G)] / [P(D=affected | G=ref) / P(D=unaffected | G=ref)]
Example: OR(AABb) = [P(D=affected | G=AABb) / P(D=unaffected | G=AABb)] / [P(D=affected | G=AABB) / P(D=unaffected | G=AABB)] = 2.7
Odds ratio scale: 0 to ∞ (ticks at 0, 1, 2, 3, ∞)
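The relationship between penetrance, relative risk and odds ratio on slides 13 to 15 can be sketched in a few lines. The reference penetrance 0.01 is the example from slide 13; the penetrance 0.026 for AABb is an assumed value chosen only so that the relative risk is 2.6, and the resulting odds ratio need not reproduce the illustrative figure on slide 15.

```python
# Sketch: relative risk and odds ratio computed from two penetrances.
# pen_ref is the slide-13 example; pen_alt is an assumed, illustrative value.

def odds(probability):
    """Odds corresponding to a probability."""
    return probability / (1.0 - probability)

def relative_risk(penetrance, penetrance_ref):
    return penetrance / penetrance_ref

def odds_ratio(penetrance, penetrance_ref):
    return odds(penetrance) / odds(penetrance_ref)

pen_ref = 0.01    # P(D=affected | G=AABB)
pen_alt = 0.026   # P(D=affected | G=AABb), assumed for illustration

rr = relative_risk(pen_alt, pen_ref)
or_ = odds_ratio(pen_alt, pen_ref)
print(f"RR = {rr:.2f}, OR = {or_:.2f}")
```

For a rare disease the two measures stay close; the odds ratio moves further from 1 than the relative risk as the penetrances grow.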

16 Disease trait: penetrances, relative risks, odds ratios. Continuous trait: genotype means.

17 3. Definition(s) of epistasis

18 Epistasis or not?
      AA  Aa  aa
BB     1   1   3
Bb     2   2   4
bb     3   3   5

19 Definitions of epistasis. Biological: an individual-level phenomenon. Statistical: a population-level phenomenon. S.Purcell ©

20 Requires: 1) variation between individuals; 2) effect on disease. Requires: 1) correct statistical definition of effect. S.Purcell ©

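21 [Diagram: Gene RED, Gene YELLOW, Pigment 1, Pigment 2, final pigment (?)]

22 [Diagram: Gene RED, Gene YELLOW, Pigment 1, Pigment 2, final pigment; genotypes AA, Aa, aa and BB, Bb, bb]

23 [Diagram: Gene RED, Gene YELLOW, Pigment 1, Pigment 2, final pigment, with one step blocked (X); genotypes Aa, aa and BB, Bb, bb] Bateson (1909)

24 [Diagram: Gene RED, Gene YELLOW, Pigment 1, Pigment 2, final pigment, with one step blocked (X); genotypes AA and BB, Bb, bb] Bateson (1909)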

25 Bateson (1909): epistasis as a “masking effect”, whereby a variant or allele at one locus prevents the variant at another locus from manifesting its effect. A Mendelian concept, closer to the biological definition of an interaction between 2 molecules. [Diagram: Gene RED, Gene YELLOW, Pigment 1, Pigment 2, final pigment; genotypes AA, Aa, aa and BB, Bb, bb]

26 Fisher (1918): epistasis defined as the extent to which the joint contribution of two alleles at different loci towards a phenotype deviates from that expected under a purely additive model. A mathematical concept, closer to the statistical definition of an interaction between 2 variables on a linear scale.
Single-locus effects (Gene RED, Gene YELLOW): 0, 2, 2 across AA, Aa, aa and 0, 1, 1 across BB, Bb, bb.
Expected (additive)        Observed
      AA  Aa  aa                 AA  Aa  aa
BB     0   2   2           BB     0   2   2
Bb     1   3   3           Bb     1   2   2
bb     1   3   3           bb     1   2   2

27 Dominance is defined as the extent to which the joint contribution of two alleles at the same locus towards a phenotype deviates from that expected under a purely additive model. Epistasis is defined as the extent to which the joint contribution of two alleles at different loci towards a phenotype deviates from that expected under a purely additive model. [Figure: genotypic value (0, 1, 2) for AA, Aa, aa under additive, dominant and recessive models]

28 Epistasis is very similar: deviation from additivity between loci, just as dominance is deviation from additivity within a locus. [Figure: genotypic mean (0 to 4) by Locus A genotype (AA, Aa, aa), one line per Locus B genotype (BB, Bb, bb); panels labeled by whether Locus A and Locus B are additive or have no effect]

29 Between loci: additive (i.e. NO epistasis). [Grid: Locus A (additive, dominant, recessive) by Locus B (additive, dominant, recessive)]

30 Between loci: additive (i.e. NO epistasis). Genotypic values for each combination of single-locus models (columns AA, Aa, aa; rows BB, Bb, bb):

                     Locus A additive   Locus A dominant   Locus A recessive
                     AA  Aa  aa         AA  Aa  aa         AA  Aa  aa
Locus B additive
  BB                  0   1   2          0   2   2          0   0   2
  Bb                  1   2   3          1   3   3          1   1   3
  bb                  2   3   4          2   4   4          2   2   4
Locus B dominant
  BB                  0   1   2          0   2   2          0   0   2
  Bb                  2   3   4          1   3   3          2   2   4
  bb                  2   3   4          1   3   3          2   2   4
Locus B recessive
  BB                  0   1   2          0   2   2          0   0   2
  Bb                  0   1   2          0   2   2          0   0   2
  bb                  2   3   4          1   3   3          2   2   4

31 Between loci: non-additive (i.e. epistasis). Six example tables of genotypic values (columns AA, Aa, aa; rows BB, Bb, bb):

      AA  Aa  aa      AA  Aa  aa      AA  Aa  aa
BB     0   0   0       0   0   0       0   0   0
Bb     0   1   1       0   1   2       0   0   1
bb     0   1   1       0   2   3       0   1   2

BB     0   0   0       2   2   4       1   1   1
Bb     0   0   0       2   4   2       1   1   1
bb     0   0   4       4   2   2       1   1   8
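One way to check whether a 3 × 3 table of genotypic values is additive between loci is to test that every cell equals its row effect plus its column effect. The sketch below does this with interaction contrasts; the function name is hypothetical, and the two example tables are taken from the values on slides 30 and 31.

```python
# Sketch: a table is additive between loci when every interaction contrast
# x[i][j] - x[i][0] - x[0][j] + x[0][0] is zero.

def is_additive(table, tol=1e-9):
    """True if the table shows no departure from additivity between loci."""
    base = table[0][0]
    return all(
        abs(table[i][j] - table[i][0] - table[0][j] + base) <= tol
        for i in range(len(table))
        for j in range(len(table[0]))
    )

additive_example = [[0, 1, 2], [1, 2, 3], [2, 3, 4]]    # slide 30, additive x additive
epistatic_example = [[0, 0, 0], [0, 1, 1], [0, 1, 1]]   # slide 31, first table

print(is_additive(additive_example), is_additive(epistatic_example))
```

The check does not depend on which locus is placed in rows and which in columns, because the contrast is symmetric in the two indices.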

32 Epistasis or not?
      AA  Aa  aa
BB     1   1   3
Bb     2   2   4
bb     3   3   5

33 Statistical epistasis is scale dependent. We defined epistasis as a departure from the genotype effects expected under an additive model. Crucial assumption: the genotype effect is measured on the appropriate scale. For the table 1 1 3 / 2 2 4 / 3 3 5 (columns AA, Aa, aa; rows BB, Bb, bb): epistasis on the linear scale? NO. Epistasis on the OR scale? YES.

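34 Untransformed values (x): no departure from additivity. After a log(x) transformation: significant departure from additivity. log(x) values: log 1 = 0.00, log 2 = 0.69, log 3 = 1.10, log 4 = 1.39, log 5 = 1.61, so the steps from 1 to 2 to 3 become +0.7 and +0.4. [Figure: values by genotype (AA, Aa, aa) before and after the log transformation]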
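The scale dependence shown on slides 33 and 34 can be reproduced with the 1 1 3 / 2 2 4 / 3 3 5 table. This sketch compares the step between successive rows within each column, first on the original scale and then after a natural-log transformation; the helper name is hypothetical.

```python
import math

# Sketch: the same table is additive on the linear scale but not on the log scale.
table = [[1, 1, 3],
         [2, 2, 4],
         [3, 3, 5]]

def steps_by_column(values):
    """Differences between successive rows, listed separately for each column."""
    columns = list(zip(*values))
    return [[round(col[k + 1] - col[k], 2) for k in range(len(col) - 1)]
            for col in columns]

linear_steps = steps_by_column(table)
log_steps = steps_by_column([[math.log(x) for x in row] for row in table])

print("linear:", linear_steps)   # identical steps in every column
print("log:   ", log_steps)      # steps differ between columns
```

On the linear scale every column shows the same row-to-row step, so there is no interaction; on the log scale the steps in the first and last columns differ, which is what a departure from additivity means.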

35 Genotype effects measured on:        Epistasis defined as departure from:
Penetrance scale, linear scale        Additive model: y = LocusA + LocusB
RR scale, OR scale                    Multiplicative model: y = LocusA × LocusB

36 4. Designs and methods to detect epistasis

37 Study designs: family-based, case-control, case-only. Family-based designs are more robust and need fewer assumptions; moving towards case-only designs is more efficient and powerful.

38 Methods 1. Regression 2. Linkage Disequilibrium 3. Transmission distortion

39 Methods 1. Regression
y = m1.LocusA + m2.LocusB + m3.(LocusA × LocusB)
y = (m1 + m3.LocusB).LocusA + m2.LocusB
The effect of LocusA on y is modified by LocusB.
Continuous trait y: linear regression. Disease trait y: logistic regression.

40 Methods 1. Regression
y = m1.LocusA + m2.Env + m3.(LocusA × Env)
y = (m1 + m3.Env).LocusA + m2.Env
The effect of LocusA on y is modified by Env.
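A minimal sketch of the regression on slide 39 for a continuous trait, using only the standard library. Genotypes are coded as allele counts (0, 1, 2), the true coefficients and noise level are hypothetical, and the small normal-equations solver stands in for whatever regression software would be used in practice; it is not the method of any particular package.

```python
import random

def fit_least_squares(rows, y):
    """Ordinary least squares via the normal equations (Gauss-Jordan elimination)."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
           + [sum(r[i] * t for r, t in zip(rows, y))]
           for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [aug[i][k] for i in range(k)]

# Hypothetical true effects: m1 (LocusA), m2 (LocusB), m3 (interaction)
TRUE_M1, TRUE_M2, TRUE_M3 = 0.5, 0.3, 0.4

random.seed(1)
rows, y = [], []
for _ in range(2000):
    a, b = random.randint(0, 2), random.randint(0, 2)   # allele counts at each locus
    rows.append([1.0, a, b, a * b])                      # intercept, A, B, A x B
    y.append(TRUE_M1 * a + TRUE_M2 * b + TRUE_M3 * a * b + random.gauss(0, 0.5))

intercept, m1, m2, m3 = fit_least_squares(rows, y)
print(f"m1={m1:.2f} m2={m2:.2f} m3={m3:.2f}")
```

A test of epistasis in this framework is a test of whether m3 differs from zero. For a disease trait the same design matrix would go into a logistic regression instead.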

41 Methods 2. LD-based
Epistasis induces LD in cases, even for unlinked loci. Example with p(a) = 0.2, p(b) = 0.2.
Epistasis model (no epistasis; columns AA, Aa, aa; rows BB, Bb, bb):
      AA   Aa   aa
BB     1    1    1
Bb     1    1    1
bb     1    1    1
Genotype frequencies (the same in cases and controls):
      AA   Aa   aa
BB    .41  .21  .02
Bb    .21  .10  .01
bb    .03  .01  .00
Haplotype frequencies (A/a by B/b, cases and controls): .640, .160, .040. LD ~ 0.

42 Methods 2. LD-based
Epistasis induces LD in cases, even for unlinked loci. Example with p(a) = 0.2, p(b) = 0.2.
Epistasis model:
      AA   Aa   aa
BB     1    1    1
Bb     1    1    1
bb     1    1   20
Genotype frequencies:
Cases                      Controls
      AA   Aa   aa               AA   Aa   aa
BB    .40  .20  .03        BB    .41  .21  .02
Bb    .20  .10  .01        Bb    .21  .10  .01
bb    .03  .01  .02        bb    .03  .01  .00
Haplotype frequencies (A/a by B/b): cases .638, .158, .046 (LD ~ 0.02); controls .640, .160, .040 (LD ~ 0).
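The idea on slides 41 and 42 can be sketched numerically. The code below assumes Hardy-Weinberg proportions, independent loci in the population, and that the entries of the epistasis model act as relative risks; it measures association as the covariance between allele counts at the two loci rather than as haplotype-based D, so its numbers are not expected to reproduce the rounded values on the slides.

```python
# Sketch: epistasis creates association between unlinked loci among cases.
# Assumptions: HWE, independent loci in the population, model entries are
# relative risks. Index 0, 1, 2 = number of copies of the minor allele.

def hwe(q):
    """Genotype frequencies for 0, 1, 2 copies of an allele with frequency q."""
    p = 1.0 - q
    return [p * p, 2 * p * q, q * q]

def case_genotype_frequencies(q_a, q_b, rel_risk):
    """Two-locus genotype frequencies among cases."""
    fa, fb = hwe(q_a), hwe(q_b)
    weighted = [[fa[i] * fb[j] * rel_risk[i][j] for j in range(3)] for i in range(3)]
    total = sum(sum(row) for row in weighted)
    return [[w / total for w in row] for row in weighted]

def allele_count_covariance(joint):
    """Covariance between the count of a alleles and the count of b alleles."""
    cells = [(i, j, joint[i][j]) for i in range(3) for j in range(3)]
    mean_a = sum(i * f for i, _, f in cells)
    mean_b = sum(j * f for _, j, f in cells)
    mean_ab = sum(i * j * f for i, j, f in cells)
    return mean_ab - mean_a * mean_b

no_epistasis = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
epistasis = [[1, 1, 1], [1, 1, 1], [1, 1, 20]]   # only aabb carries extra risk

cov_null = allele_count_covariance(case_genotype_frequencies(0.2, 0.2, no_epistasis))
cov_cases = allele_count_covariance(case_genotype_frequencies(0.2, 0.2, epistasis))
print(f"no epistasis: {cov_null:.4f}  epistasis: {cov_cases:.4f}")
```

With equal risks the covariance is zero, as in the population; with the extra risk for aabb the covariance among cases becomes positive even though the loci are independent in the population.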

43 Methods 2. LD-based
In the presence of epistasis: LD in cases > 0, and LD in cases > LD in controls.
Statistics that measure the strength of association (δ) between two loci: LD (D, r²), correlation.
Case-only test: H0: δ = 0
Case-control test: H0: δ_cases = δ_controls

44 Dysbindin-1 by itself shows no evidence of association with Scz. 373 Irish schizophrenics, 812 controls. Standard single SNP analyses. [Plot: -log10(p-value) for DTNBP1, MUTED, PLDN, SNAPAP, CNO, BLOC1S1, BLOC1S2, BLOC1S3, with a line at p = 0.05] Genes in 5q GABA cluster. S.Purcell ©

45 Genes in 5q GABA cluster: cases (Scz) versus controls. [Figure] Pamela Sklar, Tracey Petryshen, C&M Pato

46 Methods 3. Transmission distortion
If the effect of locus A on disease risk is modified by Locus B, transmission at locus A differs between subsets of probands defined by their genotype at locus B. [Pedigrees: parents aa and Aa, proband Aa]
Subset of BB probands: 50%
Subset of Bb probands: 52%
Subset of bb probands: 56%
Same applies for Env instead of Locus B.
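The stratified comparison on slide 46 can be sketched with a basic TDT statistic computed within each locus B subset. The counts below are hypothetical, chosen only to match the 50%, 52% and 56% transmission rates on the slide; the statistic is the usual (T − U)² / (T + U) chi-square with 1 degree of freedom, and a formal interaction test would compare the strata rather than test each one separately.

```python
# Sketch: TDT within subsets of probands defined by genotype at locus B.
# Counts of transmitted (T) and untransmitted (U) A alleles from heterozygous
# parents are hypothetical.

def tdt_chi_square(transmitted, untransmitted):
    """McNemar-type TDT statistic, 1 degree of freedom."""
    return (transmitted - untransmitted) ** 2 / (transmitted + untransmitted)

strata = {"BB": (500, 500), "Bb": (520, 480), "bb": (560, 440)}

results = {}
for genotype, (t, u) in strata.items():
    results[genotype] = (t / (t + u), tdt_chi_square(t, u))
    rate, chi2 = results[genotype]
    print(f"{genotype}: transmission {rate:.0%}, chi-square {chi2:.1f}")
```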

47 If variants A and B are in LD (common haplotypes AB / ab), the TDT can give false positive interactions (due to linkage or population stratification). The TDT requires the assumption of independence between loci. [Pedigrees for the subset of bb probands and the subset of BB probands, with transmission → 100%, → 0%, → 100%]

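48 Design & Methods. Designs: case-control, case-only, family-based. Methods: regression, LD-based, TDT. [Table of check marks matching methods to designs; the marks are not recoverable from the transcript]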

49 Case-only designs offer efficient detection of epistasis

50 Case-only design isn’t always valid. Gene A and Gene B can be associated in cases for reasons other than epistasis: 1. physical distance between the genes; 2. population substructure (stratification) in the case sample.

51 Pros and cons
Designs compared: case-control, case-only, family-based. Methods and tools named: regression, LD, TDT, PLINK.
Pros listed: fast, often more powerful; efficient, powerful; applicable to linked loci; more robust; many extensions possible (GxE, covariates, etc).
Cons listed: less useful for continuous traits and/or family data; assumptions (unlinked loci, no stratification, etc); less efficient; few methods that efficiently handle relatives; slow(er).
[The row alignment of the table is not recoverable from the transcript]

52 5. Application to genome-wide datasets

53 An “all pairs of SNPs” approach to epistasis does not scale well…
# SNPs     # pairs
5          10
10         45
50         1,225
100        4,950
500        124,750
500,000    124,999,750,000
… but it is feasible! ~1 week, running PLINK using ~200 CPUs.

54 Multiple testing increases false positives

55 The P-value required for experiment-wide significance must be adjusted for the number of tests performed.
# SNPs     # pairs            P-value needed
5          10                 5e-3
50         1,225              4e-5
500        124,750            4e-7
250,000    31,249,875,000     2e-12
500,000    124,999,750,000    4e-13
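The numbers on slide 55 follow from counting SNP pairs, n(n − 1)/2, and dividing a significance level by that count. The sketch below assumes a Bonferroni-style correction at an experiment-wide level of 0.05, which is consistent with the slide's values but not stated on it.

```python
from math import comb

# Sketch: number of SNP pairs and the per-test P-value threshold.
# Assumption: Bonferroni correction at an experiment-wide alpha of 0.05.

def pairwise_threshold(n_snps, alpha=0.05):
    """Return (number of SNP pairs, per-test P-value threshold)."""
    n_pairs = comb(n_snps, 2)          # n * (n - 1) / 2
    return n_pairs, alpha / n_pairs

table = {n: pairwise_threshold(n) for n in (5, 50, 500, 250_000, 500_000)}
for n_snps, (n_pairs, threshold) in table.items():
    print(f"{n_snps:>9,} SNPs  {n_pairs:>17,} pairs  P < {threshold:.0e}")
```

Bonferroni is conservative when tests are correlated, as they are for SNPs in LD, so these thresholds are best read as upper bounds on the stringency required.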

56 [Figure: results by chromosome, axis ticks 1 to 22; label: Chromosome 13]

57 Genes A to J, each with SNPs 1 to 8 (A1 … A8, B1 … B8, …, J1 … J8): 80 allele-based tests, versus a single gene-based test.

58 Further reading
Cordell HJ (2002) Human Molecular Genetics 11: 2463-2468. A statistical review of epistasis, methods and definitions.
Clayton D & McKeigue P (2001) The Lancet 358: 1357-60. A critical appraisal of GxE research.
Marchini J, Donnelly P & Cardon LR (2005) Nature Genetics 37: 413-417. Epistasis in whole-genome association studies.

