Gene-gene and gene-environment interactions
Manuel Ferreira
Massachusetts General Hospital, Harvard Medical School, Center for Human Genetic Research



Outline
1. G-G and G-E in the context of gene mapping
2. Genetic concepts
3. What is epistasis?
4. Study designs and tests to detect epistasis
5. Application to genome-wide datasets

1. G-G and G-E in context

The Human Genome: finding disease-causing variation
[Figure: chromosome 4, zooming in to a stretch of DNA sequence that contains a SNP (single nucleotide polymorphism)]
…GGCGGTGTTCCGGGCCATCACCATTGCGGG CCGGATCAACTGCCCTGTGTACATCACCAAG GTCATGAGCAAGAGTGCAGCCGACATCATCG CTCTGGCCAGGAAGAAAGGGCCCCTAGTTTT TGGAGAGCCCATTGCCGCCAGCCTGGGGACC GATGGCACCCATTACTGGAGCAAGAACTGGG CCAAGGCTGCGGCGTTCGTGACTTCCCCTCC CCTGAGCCCGGACCCTACCACGCCCGACTA…

Common disease

Common disease, polygenic effects
Genotype    Risk of disease
DD          0.01
Dd          0.012
dd          (value not shown)
Disease prevalence: ~1 in 100
Each extra d allele increases risk by ~1.2 times
Frequency of d in controls: ~5%
Frequency of d in cases: ~6%
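The small gap between the control and case allele frequencies follows from the per-allele risk. A minimal sketch of that arithmetic, assuming Hardy-Weinberg genotype frequencies, a multiplicative per-allele relative risk (so the dd risk would be 1.2 × 1.2 times the DD risk, a value the slide does not show), and treating the ~5% control frequency as the population frequency; the function name is hypothetical:

```python
def case_allele_frequency(q, rr_per_allele):
    """Frequency of the risk allele d among cases.

    Assumes Hardy-Weinberg proportions in the population and a
    multiplicative model: relative risk 1, r, r^2 for 0, 1, 2 copies of d.
    """
    p = 1.0 - q
    # Population genotype frequencies for 0, 1, 2 copies of d
    geno_freq = [p * p, 2 * p * q, q * q]
    # Relative risk for each genotype (assumption: multiplicative)
    rel_risk = [rr_per_allele ** copies for copies in range(3)]
    # Genotype frequencies among cases are proportional to freq * risk
    weighted = [f * r for f, r in zip(geno_freq, rel_risk)]
    total = sum(weighted)
    case_geno = [w / total for w in weighted]
    # Allele frequency = half the heterozygotes plus the homozygotes
    return 0.5 * case_geno[1] + case_geno[2]


freq_cases = case_allele_frequency(q=0.05, rr_per_allele=1.2)
print(round(freq_cases, 3))
```

Under these assumptions the case frequency comes out at just under 6%, in line with the slide.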

Gene-environment interaction (G x E interaction)
The environment modifies the effect of a gene.
A gene modifies the effect of an environment.
[Diagram labels: Gene effect, Environmental effect, Gene-environment correlation (marked "?")]
S.Purcell ©

Gene × gene interaction
Epistasis: one gene modifies the effect of another.
[Diagram labels: Gene effect, Epistasis, Linkage disequilibrium (LD)]
S.Purcell ©

2. Genetic concepts

Two-locus genotypes
Locus A: alleles A (p_A) and a (q_A), with p_A + q_A = 1
Locus B: alleles B (p_B) and b (q_B), with p_B + q_B = 1

                  AA (p_A^2)    Aa (2 p_A q_A)    aa (q_A^2)
BB (p_B^2)        AABB          AaBB              aaBB
Bb (2 p_B q_B)    AABb          AaBb              aaBb
bb (q_B^2)        AAbb          Aabb              aabb

2-locus genotype vs. haplotype
[Diagram: AAbb = Ab / Ab if and only if the alleles are arranged as A b and A b; AAbb ≠ Ab / Ab if they are arranged as A A and b b]
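The marginal frequencies above can be combined into frequencies for the nine two-locus genotypes. A minimal sketch, assuming Hardy-Weinberg proportions at each locus and independence between the two loci (no LD), so that each joint frequency is the product of the marginals; the function name and the allele frequencies used are illustrative only:

```python
def two_locus_genotype_frequencies(p_A, p_B):
    """Joint frequencies of the nine two-locus genotypes.

    Assumes Hardy-Weinberg proportions at each locus and independence
    between loci, so joint frequency = product of single-locus frequencies.
    """
    q_A, q_B = 1.0 - p_A, 1.0 - p_B
    locus_a = {"AA": p_A ** 2, "Aa": 2 * p_A * q_A, "aa": q_A ** 2}
    locus_b = {"BB": p_B ** 2, "Bb": 2 * p_B * q_B, "bb": q_B ** 2}
    return {
        geno_a + geno_b: freq_a * freq_b
        for geno_a, freq_a in locus_a.items()
        for geno_b, freq_b in locus_b.items()
    }


# Illustrative allele frequencies (not from the slides)
freqs = two_locus_genotype_frequencies(p_A=0.8, p_B=0.7)
for genotype, freq in freqs.items():
    print(genotype, round(freq, 4))
```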

Effect of a locus on disease risk can be expressed as:
  Genotype penetrances
  Genotype relative risks
  Genotype odds ratios
Effect of a locus on a continuous trait is usually expressed as:
  Genotype means

Genotype penetrance
Probability of developing disease for a given genotype: P(D=affected | G)
P(D=unaffected | G) = 1 - P(D=affected | G)
Example: P(D=affected | G=AABB) = 0.01, so P(D=unaffected | G=AABB) = 0.99
Penetrance scale: P(D=affected | G)

Genotype relative risk (RR)
Risk of developing disease for a given genotype, relative to the risk for a reference genotype:
RR(G) = P(D=affected | G) / P(D=affected | G=ref)
Example, with AABB as the reference genotype:
RR(AABb) = P(D=affected | G=AABb) / P(D=affected | G=AABB) = 2.6
Relative risk scale

Genotype odds ratio (OR)
Odds of developing disease for a given genotype, relative to the odds for a reference genotype:
OR(G) = [P(D=affected | G) / P(D=unaffected | G)] / [P(D=affected | G=ref) / P(D=unaffected | G=ref)]
Example, with AABB as the reference genotype:
OR(AABb) = [P(D=affected | G=AABb) / P(D=unaffected | G=AABb)] / [P(D=affected | G=AABB) / P(D=unaffected | G=AABB)] = 2.7
Odds ratio scale
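The three measures are simple functions of the genotype penetrances. A minimal sketch of the definitions; the reference penetrance of 0.01 is the AABB value given earlier, while the penetrance for the second genotype is a hypothetical value chosen for illustration and is not meant to reproduce the figures on the slides:

```python
def relative_risk(penetrance, penetrance_ref):
    """RR = P(affected | G) / P(affected | G = ref)."""
    return penetrance / penetrance_ref


def odds(penetrance):
    """Odds = P(affected | G) / P(unaffected | G)."""
    return penetrance / (1.0 - penetrance)


def odds_ratio(penetrance, penetrance_ref):
    """OR = odds for genotype G divided by odds for the reference genotype."""
    return odds(penetrance) / odds(penetrance_ref)


pen_ref = 0.01   # P(D=affected | G=AABB), as in the penetrance example
pen_alt = 0.02   # hypothetical penetrance for another genotype

rr = relative_risk(pen_alt, pen_ref)
or_ = odds_ratio(pen_alt, pen_ref)
print(round(rr, 3), round(or_, 3))
```

For a rare disease the two measures are close, with the odds ratio slightly further from 1 than the relative risk.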

Disease trait: penetrances, relative risks, odds ratios
Continuous trait: genotype means

3. Definition(s) of epistasis

Epistasis or not?
[Figure: the nine two-locus genotypes, AA / Aa / aa by BB / Bb / bb]

Definitions of epistasis
Biological: individual-level phenomenon
Statistical: population-level phenomenon
S.Purcell ©

Requires: 1) Variation between individuals; 2) Effect on disease
Requires: 1) Correct statistical definition of effect
S.Purcell ©

[Diagram, built up over several slides: Gene RED and Gene YELLOW, Pigment 1 and Pigment 2, and the final pigment, shown for genotypes AA, Aa, aa and BB, Bb, bb; an X marks a blocked step. Bateson (1909)]

Epistasis as a “masking effect”, whereby a variant or allele at one locus prevents the variant at another locus from manifesting its effect. Bateson (1909)
Mendelian concept, closer to the biological definition of interaction between 2 molecules.

Epistasis defined as the extent to which the joint contribution of two alleles at different loci towards a phenotype deviates from that expected under a purely additive model. Fisher (1918)
Mathematical concept, closer to the statistical definition of interaction between 2 variables on a linear scale.
[Figure: expected vs. observed phenotype for Gene RED genotypes (AA, Aa, aa) and Gene YELLOW genotypes (BB, Bb, bb)]

Dominance is defined as the extent to which the joint contribution of two alleles at the same locus towards a phenotype deviates from that expected under a purely additive model.
[Figure: three panels of phenotype by genotype (AA, Aa, aa): Additive, Dominant, Recessive]
Epistasis is defined as the extent to which the joint contribution of two alleles at different loci towards a phenotype deviates from that expected under a purely additive model.

Epistasis is very similar: deviation from additivity between loci.
[Figure: genotypic mean by genotype. Within locus: panels across AA, Aa, aa. Between loci: Locus A (additive or no effect) by Locus B (additive or no effect), with lines for BB, Bb, bb across AA, Aa, aa]

Between loci: additive (i.e. NO epistasis)
[Figure: 3 x 3 grid of panels, Locus A (additive, dominant, recessive) by Locus B (additive, dominant, recessive); each panel shows lines for BB, Bb, bb across AA, Aa, aa]

Between loci: non-additive (i.e. epistasis)
[Figure: panels showing lines for BB, Bb, bb across AA, Aa, aa]
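The distinction between the additive and non-additive grids can be checked numerically. A minimal sketch, assuming a 3 x 3 table of genotypic means and giving each of the nine cells equal weight (a simplification: a population-level analysis would weight cells by genotype frequency). The additive expectation for a cell is row mean + column mean - grand mean, and any remaining deviation is attributed to epistasis. Function names and the example tables are hypothetical:

```python
def additive_expectation(means):
    """Expected cell means if the two loci combine additively.

    means[i][j] is the genotypic mean for locus-A genotype i and
    locus-B genotype j. Cells are weighted equally (an assumption).
    """
    n_rows, n_cols = len(means), len(means[0])
    grand = sum(sum(row) for row in means) / (n_rows * n_cols)
    row_means = [sum(row) / n_cols for row in means]
    col_means = [sum(means[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    return [[row_means[i] + col_means[j] - grand for j in range(n_cols)]
            for i in range(n_rows)]


def epistatic_deviations(means):
    """Observed minus additive expectation, cell by cell."""
    expected = additive_expectation(means)
    return [[means[i][j] - expected[i][j] for j in range(len(means[0]))]
            for i in range(len(means))]


# Locus A dominant, locus B recessive, effects simply added: no epistasis
a_effect, b_effect = [0.0, 1.0, 1.0], [0.0, 0.0, 1.0]
no_epistasis = [[a + b for b in b_effect] for a in a_effect]

# Only the double homozygote (last row, last column) has a raised mean
with_epistasis = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 2.0]]

dev_none = epistatic_deviations(no_epistasis)
dev_epi = epistatic_deviations(with_epistasis)
print(max(abs(d) for row in dev_none for d in row))
print(round(dev_epi[2][2], 4))
```

Note that dominance or recessiveness within a locus does not by itself produce a deviation here; only departures from the sum of the two single-locus effects do.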

Epistasis or not?
[Figure: the nine two-locus genotypes, AA / Aa / aa by BB / Bb / bb]

Statistical epistasis is scale dependent
Epistasis was defined as a departure from the genotype effects expected under an additive model.
Crucial assumption: the genotype effect is measured on the appropriate scale.
Epistasis? Linear scale: NO. OR scale: YES.
[Figure: genotype effects for AA, Aa, aa by BB, Bb, bb]

[Figure: the same genotype effects (AA, Aa, aa) shown on two scales related by log(x); one panel shows no departure from additivity, the other a significant departure from additivity]

Genotype effects measured on:        Epistasis defined as departure from:
Penetrance scale, linear scale       Additive model: y = LocusA + LocusB
RR scale, OR scale                   Multiplicative model: y = LocusA × LocusB
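Scale dependence can be seen with a small numerical example. A minimal sketch, assuming hypothetical per-genotype effects that combine multiplicatively (y = LocusA × LocusB): the interaction contrast is non-zero on the raw scale but vanishes after a log transformation, because a product becomes a sum on the log scale. All values and names are illustrative:

```python
import math

# Hypothetical multiplicative effects for the three genotypes at each locus
locus_a_effect = [1.0, 2.0, 4.0]   # AA, Aa, aa
locus_b_effect = [1.0, 3.0, 9.0]   # BB, Bb, bb

# Multiplicative model: y = LocusA x LocusB
y = [[a * b for b in locus_b_effect] for a in locus_a_effect]


def interaction_contrast(table):
    """Corner contrast: zero when the two loci combine additively."""
    return table[2][2] - table[2][0] - table[0][2] + table[0][0]


raw_contrast = interaction_contrast(y)
log_y = [[math.log(value) for value in row] for row in y]
log_contrast = interaction_contrast(log_y)

print(raw_contrast)            # departure from additivity on the raw scale
print(round(log_contrast, 12)) # no departure on the log scale
```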

4. Designs and methods to detect epistasis

Study designs: Family-based, Case-Control, Case-only
[Arrow across the designs: from "more robust, fewer assumptions" (family-based end) to "more efficient, powerful" (case-only end)]

Methods
1. Regression
2. Linkage disequilibrium
3. Transmission distortion

Methods: 1. Regression
Continuous trait y: linear regression
Disease trait y: logistic regression
y = m1.LocusA + m2.LocusB + m3.(LocusA × LocusB)
y = (m1 + m3.LocusB).LocusA + m2.LocusB
Effect of LocusA on y is modified by LocusB
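The rearranged equation says that the slope of LocusA depends on the value of LocusB. A minimal sketch of that reading, assuming genotypes are coded as allele counts 0, 1, 2 (the slides do not state a coding) and using hypothetical coefficients; it only evaluates the model and does not fit it to data:

```python
def predict(locus_a, locus_b, m1, m2, m3):
    """y = m1.LocusA + m2.LocusB + m3.(LocusA x LocusB)."""
    return m1 * locus_a + m2 * locus_b + m3 * (locus_a * locus_b)


def locus_a_slope(locus_b, m1, m3):
    """Effect of one extra unit of LocusA on y, given LocusB: m1 + m3.LocusB."""
    return m1 + m3 * locus_b


# Hypothetical coefficients, for illustration only
m1, m2, m3 = 0.5, 0.2, 0.3

slopes = []
for b in (0, 1, 2):
    # Change in y when LocusA goes from 0 to 1, holding LocusB fixed
    observed_slope = predict(1, b, m1, m2, m3) - predict(0, b, m1, m2, m3)
    slopes.append(observed_slope)
    print(b, round(observed_slope, 3), round(locus_a_slope(b, m1, m3), 3))
```

With m3 = 0 the slope would be the same at every value of LocusB, which is the no-interaction case.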

Methods: 1. Regression
y = m1.LocusA + m2.Env + m3.(LocusA × Env)
y = (m1 + m3.Env).LocusA + m2.Env
Effect of LocusA on y is modified by Env

Methods: 2. LD-based
Epistasis induces LD in cases, even for unlinked loci.
Allele frequencies: p(a) = 0.2, p(b) = (value not shown)
[Figure: epistasis model; genotype frequencies (AA, Aa, aa by BB, Bb, bb) and haplotype frequencies (A / a with B / b) in cases and in controls]
LD: ~0 in controls, ~0.02 in cases

Methods: 2. LD-based
In the presence of epistasis: LD in cases > 0, and LD in cases > LD in controls.
Statistics that measure the strength of association (δ) between two loci: LD (D, r^2), correlation.
Case-only test: H0: δ = 0
Case-control test: H0: δ_cases = δ_controls
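The claim that epistasis induces association between unlinked loci among cases can be illustrated with an exact calculation. A minimal sketch, assuming Hardy-Weinberg proportions, independent loci in the population, and two hypothetical penetrance models; association is measured here as the covariance of allele counts at the two loci, which is one possible choice of δ and not necessarily the statistic used in any particular software. The frequency p(a) = 0.2 is from the slide; the frequency at locus B and all penetrances are made up:

```python
def genotype_freqs(q):
    """Hardy-Weinberg frequencies for 0, 1, 2 copies of the minor allele."""
    p = 1.0 - q
    return [p * p, 2 * p * q, q * q]


def allele_count_covariance(q_a, q_b, penetrance, in_cases):
    """Covariance of allele counts at two unlinked loci, in cases or controls."""
    freq_a, freq_b = genotype_freqs(q_a), genotype_freqs(q_b)
    weights = {}
    for i in range(3):
        for j in range(3):
            pen = penetrance(i, j)
            # Cases are sampled in proportion to penetrance, controls to 1 - penetrance
            weights[(i, j)] = freq_a[i] * freq_b[j] * (pen if in_cases else 1.0 - pen)
    total = sum(weights.values())
    mean_a = sum(i * w for (i, j), w in weights.items()) / total
    mean_b = sum(j * w for (i, j), w in weights.items()) / total
    mean_ab = sum(i * j * w for (i, j), w in weights.items()) / total
    return mean_ab - mean_a * mean_b


def epistatic(i, j):
    """Hypothetical model: risk is raised only when both loci carry the minor allele."""
    return 0.05 if (i >= 1 and j >= 1) else 0.01


def multiplicative(i, j):
    """Hypothetical model: each allele multiplies risk by 1.5, with no interaction."""
    return 0.01 * 1.5 ** i * 1.5 ** j


cov_cases_epi = allele_count_covariance(0.2, 0.2, epistatic, in_cases=True)
cov_controls_epi = allele_count_covariance(0.2, 0.2, epistatic, in_cases=False)
cov_cases_mult = allele_count_covariance(0.2, 0.2, multiplicative, in_cases=True)

print(round(cov_cases_epi, 4), round(cov_controls_epi, 4), round(cov_cases_mult, 4))
```

Under the multiplicative model the case distribution still factorises across loci, so no association appears; this is the sense in which the case-only test targets departure from a multiplicative model.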

Standard single SNP analyses: 373 Irish schizophrenics, 812 controls
Dysbindin-1 by itself shows no evidence of association with Scz.
[Figure: -log10(p-value) across DTNBP1, MUTED, PLDN, SNAPAP, CNO, BLOC1S1, BLOC1S2, BLOC1S3, with a reference line at p = 0.05; a further label reads "Genes in 5q GABA cluster"]
S.Purcell ©

Genes in 5q GABA cluster
[Figure: cases (Scz) vs. controls]
Pamela Sklar, Tracey Petryshen, C&M Pato

Methods: 3. Transmission distortion
If the effect of locus A on disease risk is modified by locus B, the transmission rate differs by proband genotype at locus B:
Subset of BB probands: 50%
Subset of Bb probands: 52%
Subset of bb probands: 56%
[Figure: trios with parents aa and Aa and an Aa proband, one per subset]
The same applies with Env in place of locus B.
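One way to look at transmission distortion within each subset is the standard TDT statistic, (b - c)^2 / (b + c), where b and c are the numbers of heterozygous parents who did and did not transmit the tested allele. A minimal sketch with hypothetical counts chosen to match the 50%, 52% and 56% rates above; it computes a separate statistic per subset and does not formally test whether the rates differ between subsets:

```python
def tdt_chi_square(transmitted, not_transmitted):
    """TDT statistic (1 degree of freedom) from heterozygous-parent transmissions."""
    total = transmitted + not_transmitted
    if total == 0:
        return 0.0
    return (transmitted - not_transmitted) ** 2 / total


# Hypothetical counts: (transmitted, not transmitted) per proband subset
strata = {"BB": (500, 500), "Bb": (520, 480), "bb": (560, 440)}

results = {}
for genotype, (t, nt) in strata.items():
    rate = t / (t + nt)
    results[genotype] = (rate, tdt_chi_square(t, nt))
    print(genotype, rate, round(results[genotype][1], 2))
```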

TDT requires the assumption of independence between loci.
If variants A and B are in LD (common haplotypes AB / ab), false positive interactions can arise (due to linkage or population stratification).
[Figure: trios in the subset of bb probands and in the subset of BB probands, with transmission rates of →100%, →0%, →100%]

Design & Methods
Designs: Case-Control, Case-only, Family-based
Methods: Regression, LD-based, TDT
[Table marking which methods apply to which designs; the marks are not legible]

Case-only designs offer efficient detection of epistasis

Case-only design isn’t always valid
Association between Gene A and Gene B in the case sample can arise without epistasis, from:
1. Physical distance
2. Population substructure (stratification) in the case sample

Pros and cons

Study design    Pros                        Cons
Case-Control    Applicable to linked loci   Less efficient
Case-only       Efficient, powerful         Assumptions (unlinked loci, no stratification, etc)
Family-based    More robust                 Few methods that efficiently handle relatives

Method          Pros                                            Cons
Regression      Many extensions possible (GxE, covariates, etc) Slow(er)
LD              Fast, often more powerful                       Less useful for continuous traits and/or family data
TDT                                                             Assumptions (unlinked loci, no stratification, etc)

Software: PLINK

5. Application to genome-wide datasets

An “all pairs of SNPs” approach to epistasis does not scale well…
[Table: # SNPs against # pairs; the largest legible entry ends in …,999,750,000 pairs]
… but it is feasible! ~1 week, running PLINK using ~200 CPUs.

Multiple testing increases false positives

P-value required for experiment-wide significance must be adjusted for the number of tests performed.

# SNPs      # pairs            P-value needed
50          1,225              4e-5
500         124,750            4e-7
250,000     31,249,875,000     2e-12
500,000     124,999,750,000    4e-13
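The number of pairwise tests and the corresponding threshold follow from simple arithmetic. A minimal sketch, assuming every unordered pair of SNPs is tested once, giving n(n - 1) / 2 pairs, and assuming a Bonferroni correction at an experiment-wide level of 0.05 (the slide does not name the correction or the level; these are inferred from its values):

```python
def n_pairs(n_snps):
    """Number of unordered SNP pairs: n(n - 1) / 2."""
    return n_snps * (n_snps - 1) // 2


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test P-value needed for experiment-wide significance (assumed alpha)."""
    return alpha / n_tests


for n_snps in (50, 500, 250_000, 500_000):
    pairs = n_pairs(n_snps)
    print(f"{n_snps:>9,} {pairs:>18,} {bonferroni_threshold(pairs):.1e}")
```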

Chromosome

Genes: A, B, C, D, E, F, G, H, I, J
Alleles: A1 to A8, B1 to B8, ..., J1 to J8
A single gene-based test vs. 80 allele-based tests

Further reading
Cordell HJ (2002) Human Molecular Genetics, 11. A statistical review of epistasis, methods and definitions.
Clayton D & McKeigue P (2001) The Lancet, 358. A critical appraisal of GxE research.
Marchini J, Donnelly P & Cardon LR (2005) Nature Genetics, 37. Epistasis in whole-genome association studies.