Estimating “Heritability” using Genetic Data David Evans University of Queensland.


The Majority of Heritability for Most Complex Traits and Diseases is Yet to Be Explained Maher (2009) Nature

Places the Missing Heritability Could Be Hiding
- In the form of common variants of small effect scattered across the genome
- In the form of low-frequency variants only partially tagged by common variants
- Estimates of heritability from twin models are inflated (GASP!!!)

/software/gcta/

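GCTA - The Mixed Model Framework
\[
y = X\beta + Wu + \varepsilon
\]
\[
\underbrace{\begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix}}_{(n \times 1)}
=
\underbrace{\begin{pmatrix} x_{11} & \cdots & x_{1m} \\ \vdots & \ddots & \vdots \\ x_{n1} & \cdots & x_{nm} \end{pmatrix}}_{(n \times m)}
\underbrace{\begin{pmatrix} \beta_1 \\ \vdots \\ \beta_m \end{pmatrix}}_{(m \times 1)}
+
\underbrace{\begin{pmatrix} w_{11} & \cdots & w_{1k} \\ \vdots & \ddots & \vdots \\ w_{n1} & \cdots & w_{nk} \end{pmatrix}}_{(n \times k)}
\underbrace{\begin{pmatrix} u_1 \\ \vdots \\ u_k \end{pmatrix}}_{(k \times 1)}
+
\underbrace{\begin{pmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_n \end{pmatrix}}_{(n \times 1)}
\]
where n is the number of individuals, m is the number of covariates and k is the number of SNPs, and:
- y is a vector of phenotypes
- X contains covariates
- β contains fixed effects regression coefficients
- W contains standardized genotype dosages
- u contains random effects coefficients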
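
As a worked step connecting this model to the covariance matrices on the following slides, the phenotypic variance can be derived under the usual assumptions that the SNP effects and residuals are independent with u ~ N(0, Iσ²_u) and ε ~ N(0, Iσ²_ε). These distributional assumptions and the symbols σ²_u and σ²_ε are not given on the slide and are introduced here for illustration only:
\[
\operatorname{Var}(y) = W W^{\top} \sigma_u^2 + I \sigma_\varepsilon^2
= A \sigma_g^2 + I \sigma_e^2,
\qquad
A = \frac{W W^{\top}}{k}, \quad \sigma_g^2 = k\,\sigma_u^2, \quad \sigma_e^2 = \sigma_\varepsilon^2
\]
The fixed effects Xβ shift the mean of y but do not contribute to its variance, which is why they do not appear in V.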

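The Classical Twin Design
[Path diagram: latent factors A, C and E load on each twin's phenotype P with path coefficients a, c and e.]
\[
P_1 = aA_1 + cC_1 + eE_1, \qquad P_2 = aA_2 + cC_2 + eE_2
\]
\[
r_{g(MZ)} = 1, \qquad r_{g(DZ)} = 0.5, \qquad r_c = 1
\]
\[
V_{MZ} = \begin{pmatrix} a^2 + c^2 + e^2 & a^2 + c^2 \\ a^2 + c^2 & a^2 + c^2 + e^2 \end{pmatrix},
\qquad
V_{DZ} = \begin{pmatrix} a^2 + c^2 + e^2 & \tfrac{1}{2}a^2 + c^2 \\ \tfrac{1}{2}a^2 + c^2 & a^2 + c^2 + e^2 \end{pmatrix}
\]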

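Expected Covariance Matrix - Twin Pairs (AE Model)
\[
V = R\sigma_A^2 + I\sigma_E^2
\]
\[
\begin{pmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{21} & \sigma_2^2 \end{pmatrix}
=
\begin{pmatrix} 1 & r \\ r & 1 \end{pmatrix} \sigma_A^2
+
\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \sigma_E^2
=
\begin{pmatrix} \sigma_A^2 + \sigma_E^2 & r\sigma_A^2 \\ r\sigma_A^2 & \sigma_A^2 + \sigma_E^2 \end{pmatrix}
\]
All matrices are 2 x 2, where:
- V is the expected phenotypic covariance matrix
- σ²_A is the additive genetic variance
- σ²_E is the unique environmental variance
- R is a matrix containing twice the kinship coefficient (r = 1 for MZ, r = 0.5 for DZ)
- I is an identity matrix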
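
The AE expectation V = Rσ²_A + Iσ²_E can be evaluated numerically for a single twin pair. The following is a minimal sketch in plain Python; the function name and the variance values 0.6 and 0.4 are illustrative assumptions, not taken from the slides.

```python
def expected_twin_cov(var_a, var_e, r):
    """Return the 2x2 expected covariance V = R*var_a + I*var_e for one twin pair.

    r is twice the kinship coefficient: 1.0 for MZ pairs, 0.5 for DZ pairs.
    """
    relatedness = [[1.0, r], [r, 1.0]]        # R
    identity = [[1.0, 0.0], [0.0, 1.0]]       # I
    return [
        [relatedness[i][j] * var_a + identity[i][j] * var_e for j in range(2)]
        for i in range(2)
    ]


# Hypothetical variance components, chosen only for illustration.
v_mz = expected_twin_cov(var_a=0.6, var_e=0.4, r=1.0)
v_dz = expected_twin_cov(var_a=0.6, var_e=0.4, r=0.5)
print("MZ:", v_mz)
print("DZ:", v_dz)
```

The diagonal is the same for both zygosities; only the off-diagonal covariance changes with r, which is the contrast the twin design relies on.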

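The GCTA Design - Unrelateds
[Path diagram: latent factors A and E load on each individual's phenotype P with path coefficients a and e.]
\[
P_1 = aA_1 + eE_1, \qquad P_2 = aA_2 + eE_2, \qquad r_g = A_{ij}
\]
\[
V = \begin{pmatrix} a^2 + e^2 & A_{ij}a^2 \\ A_{ij}a^2 & a^2 + e^2 \end{pmatrix}
\]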

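Expected Covariance Matrix - Unrelateds
\[
V = A\sigma_g^2 + I\sigma_e^2
\]
\[
\begin{pmatrix} \sigma_1^2 & \cdots & \sigma_{1n} \\ \vdots & \ddots & \vdots \\ \sigma_{n1} & \cdots & \sigma_n^2 \end{pmatrix}
=
\begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{n1} & \cdots & a_{nn} \end{pmatrix} \sigma_g^2
+
I\sigma_e^2
\]
All matrices are n x n, where:
- V is the expected phenotypic covariance matrix
- σ²_g is the additive genetic variance
- σ²_e is the unique environmental variance
- A is a GRM containing average standardized genome-wide IBS between individuals i and j
- I is an identity matrix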

GCTA - Genetic Relationship Matrix
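
The formula on this slide was not captured in the transcript. As a rough illustration of what "average standardized genome-wide IBS" means, the sketch below assumes the standardization usually quoted for GCTA: each allele count x is centred by 2p and scaled by sqrt(2p(1-p)), where p is the sample allele frequency, and the products are averaged over SNPs. The function name and toy genotypes are hypothetical, and the diagonal is computed with the same formula as the off-diagonal elements, which is a simplification.

```python
from math import sqrt


def genetic_relationship_matrix(genotypes):
    """Build a GRM from genotypes[j][i] = allele count (0, 1 or 2)
    of individual j at SNP i."""
    n = len(genotypes)
    n_snps = len(genotypes[0])
    standardized = [[] for _ in range(n)]
    for i in range(n_snps):
        column = [genotypes[j][i] for j in range(n)]
        p = sum(column) / (2 * n)              # sample allele frequency
        if p <= 0.0 or p >= 1.0:
            continue                           # skip monomorphic SNPs
        sd = sqrt(2 * p * (1 - p))
        for j in range(n):
            standardized[j].append((column[j] - 2 * p) / sd)
    used = len(standardized[0])
    if used == 0:
        raise ValueError("no polymorphic SNPs to build the GRM from")
    # A[j][k] is the average over SNPs of the product of standardized genotypes.
    return [
        [sum(a * b for a, b in zip(standardized[j], standardized[k])) / used
         for k in range(n)]
        for j in range(n)
    ]


# Toy data: 4 individuals x 3 SNPs (illustrative only).
toy_genotypes = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [1, 2, 1]]
grm = genetic_relationship_matrix(toy_genotypes)
for row in grm:
    print([round(value, 3) for value in row])
```

Because the genotypes are centred on sample allele frequencies, each row of the resulting matrix sums to zero, so unrelated individuals get off-diagonal values scattered around zero rather than around a positive baseline.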

Intuitively...
- If a trait is genetically influenced, then individuals who are more genetically similar should be more phenotypically similar
- Can be thought of like a Haseman-Elston regression
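
The Haseman-Elston analogy can be made concrete with a small sketch: for every pair of individuals, the product of their standardized phenotypes is regressed on their genetic relationship, and the slope is read as an estimate of the proportion of variance tagged by the GRM. This is a simplified cross-product version with an intercept, written for illustration; the function names and toy values are assumptions, and it is not the REML procedure that GCTA itself uses.

```python
from math import sqrt


def ols_slope(x, y):
    """Slope of the least-squares line of y on x (with an intercept)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def haseman_elston(phenotypes, grm):
    """Regress pairwise products of standardized phenotypes on GRM off-diagonals."""
    n = len(phenotypes)
    mean = sum(phenotypes) / n
    sd = sqrt(sum((v - mean) ** 2 for v in phenotypes) / (n - 1))
    z = [(v - mean) / sd for v in phenotypes]
    relatedness, products = [], []
    for i in range(n):
        for j in range(i + 1, n):              # each unordered pair once
            relatedness.append(grm[i][j])
            products.append(z[i] * z[j])
    return ols_slope(relatedness, products)


# Toy inputs (illustrative only; far too small to give a meaningful estimate).
toy_grm = [
    [1.00, 0.02, -0.01, 0.00],
    [0.02, 1.00, 0.01, -0.02],
    [-0.01, 0.01, 1.00, 0.03],
    [0.00, -0.02, 0.03, 1.00],
]
toy_y = [1.2, 0.9, -0.4, -1.1]
print(haseman_elston(toy_y, toy_grm))
```

With real data the number of pairs grows as n(n-1)/2, which is what gives the approach power even though each pair of unrelated individuals shares very little.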

GCTA Process
Two-step process:
- Estimate GRM (exclude one from each pair of individuals who are >2.5% IBS)
- Estimate variance components via “REML”
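
The exclusion rule in the first step can be sketched as a filter on the GRM. The version below uses a simple greedy rule (keep an individual only if they are not related above the cutoff to anyone already kept); this is an assumption made for illustration and is not claimed to be the selection algorithm GCTA uses. The function name and toy matrix are hypothetical.

```python
def prune_related(grm, cutoff=0.025):
    """Return indices of individuals to keep so that no kept pair has
    an estimated relationship above the cutoff (greedy, order-dependent)."""
    kept = []
    for i in range(len(grm)):
        if all(grm[i][j] <= cutoff for j in kept):
            kept.append(i)
    return kept


# Toy GRM: individuals 0 and 1 are closely related, individual 2 is not.
toy_related_grm = [
    [1.00, 0.30, 0.01],
    [0.30, 1.00, -0.02],
    [0.01, -0.02, 1.00],
]
kept_individuals = prune_related(toy_related_grm)
print(kept_individuals)
```

Removing close relatives keeps the estimate from being driven by shared family environment or by the long shared segments that relatives carry.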

GCTA - Some Results [Figure adapted from Yang et al. (2011) Nat Genet]

GCTA Interpretation
- GCTA does not estimate “heritability”
- GCTA does not estimate the proportion of trait variance due to common SNPs
- GCTA tells you nothing definitive about the number of variants influencing a trait, their size or their frequency

GCTA - Some Assumptions
- The GRM accurately reflects the underlying causal variants
- Underlying variants explain the same amount of variance
  - Relationship between MAF and effect size
- Independent effects
  - Contributions to h² are overestimated by causal variants in regions of high LD and underestimated in regions of low LD

Extending the Model - Genome Partitioning
- The genetic component can be partitioned further into, e.g., different chromosomes, or genic vs non-genic regions
- A different GRM (A_c) needs to be computed for each of these components
\[
V = \sum_{c=1}^{22} A_c \sigma_{g,c}^2 + I\sigma_e^2
\]
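
The partitioned expectation is a weighted sum of component GRMs plus a residual term on the diagonal. The sketch below assembles V from a list of GRMs and their variance components; the function name, the two toy GRMs and all variance values are assumptions for illustration (a real analysis would use one GRM per chromosome or annotation class).

```python
def expected_covariance(grms, genetic_variances, residual_variance):
    """Return V = sum_c A_c * var_c + I * residual_variance as nested lists."""
    n = len(grms[0])
    v = [[0.0] * n for _ in range(n)]
    for a_c, var_c in zip(grms, genetic_variances):
        for i in range(n):
            for j in range(n):
                v[i][j] += a_c[i][j] * var_c
    for i in range(n):
        v[i][i] += residual_variance           # identity-matrix term
    return v


# Two hypothetical component GRMs for a pair of individuals.
grm_component_1 = [[1.0, 0.1], [0.1, 1.0]]
grm_component_2 = [[1.0, -0.2], [-0.2, 1.0]]
v_partitioned = expected_covariance(
    [grm_component_1, grm_component_2],
    genetic_variances=[0.3, 0.2],
    residual_variance=0.5,
)
print(v_partitioned)
```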

Extending the Model - Genome Partitioning [Figure panels: Height, BMI, Von Willebrand Factor, QT Interval. Adapted from Yang et al. (2011) Nat Genet]

Extending the Model: Gene-Environment Interaction
\[
V = A_g\sigma_g^2 + A_{ge}\sigma_{ge}^2 + I\sigma_e^2
\]
- A_ge = A_g for pairs of individuals in the same environment and A_ge = 0 for pairs of individuals in different environments
- “Environmental” factors could be sex or medical treatment, for example
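
The rule defining A_ge translates directly into code: copy the entry of A_g when two individuals share an environment and set it to zero otherwise. A minimal sketch, with a hypothetical function name and toy labels:

```python
def gxe_relationship_matrix(grm, environments):
    """Return A_ge: equal to the GRM for pairs in the same environment, 0 otherwise."""
    n = len(grm)
    return [
        [grm[i][j] if environments[i] == environments[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]


# Toy example with sex as the "environment" (illustrative values only).
toy_a_g = [
    [1.00, 0.02, 0.01],
    [0.02, 1.00, -0.01],
    [0.01, -0.01, 1.00],
]
toy_env = ["F", "F", "M"]
a_ge = gxe_relationship_matrix(toy_a_g, toy_env)
print(a_ge)
```

Each individual shares an environment with themselves, so the diagonal of A_ge always matches the diagonal of A_g.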

Extending the Model - Binary Traits
- Assume an underlying normal distribution of liability
- Transform estimates from the observed scale to the liability scale

Extending the Model - Binary Traits
- Estimate GRM (exclude one from each pair of individuals who are >2.5% IBS)
- Estimate variance components via “REML”
- Transform from observed scale to liability scale
- Adjust estimates to take account of ascertainment (i.e. the fact that case-control proportions are not the same as in the population)
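
The last two steps can be sketched with the transformation commonly used for case-control data, in which the observed-scale estimate is multiplied by K(1-K)/z² and then by K(1-K)/(P(1-P)). Here K is the population prevalence, P is the proportion of cases in the sample, and z is the height of the standard normal density at the liability threshold. The slides do not give this formula, so treat it, the function name and the example values as assumptions for illustration.

```python
from statistics import NormalDist


def observed_to_liability(h2_observed, prevalence, case_fraction):
    """Convert an observed-scale estimate to the liability scale,
    adjusting for case-control ascertainment."""
    std_normal = NormalDist()
    threshold = std_normal.inv_cdf(1.0 - prevalence)   # liability threshold
    z = std_normal.pdf(threshold)                      # density at the threshold
    k, p = prevalence, case_fraction
    scale_change = k * (1.0 - k) / z ** 2              # observed -> liability scale
    ascertainment = k * (1.0 - k) / (p * (1.0 - p))    # sample vs population case rate
    return h2_observed * scale_change * ascertainment


# Hypothetical inputs: 1% prevalence, a sample that is half cases.
h2_liability = observed_to_liability(0.2, prevalence=0.01, case_fraction=0.5)
print(h2_liability)
```

When the sample case fraction equals the prevalence, the ascertainment factor is 1 and only the change of scale remains.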

Extending the Model - Bivariate Association
- Estimate the genetic and residual correlation between different traits/diseases
- Individuals need not be measured on both traits

Extending the Model - Identity By Descent (IBD)
Two alleles are IBD if they are descended from the same ancestral allele
[Figure panels: "Identical by Descent" and "Identical by state only"]

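Extending the Model - IBD
\[
V = \pi_{IBD}\sigma_A^2 + C\sigma_C^2 + I\sigma_e^2
\]
\[
\begin{pmatrix}
\sigma_1^2 & \sigma_{12} & \sigma_{13} & \sigma_{14} \\
\sigma_{21} & \sigma_2^2 & \sigma_{23} & \sigma_{24} \\
\sigma_{31} & \sigma_{32} & \sigma_3^2 & \sigma_{34} \\
\sigma_{41} & \sigma_{42} & \sigma_{43} & \sigma_4^2
\end{pmatrix}
=
\sigma_A^2
\begin{pmatrix}
1 & \pi_{12} & \pi_{13} & \pi_{14} \\
\pi_{21} & 1 & \pi_{23} & \pi_{24} \\
\pi_{31} & \pi_{32} & 1 & \pi_{34} \\
\pi_{41} & \pi_{42} & \pi_{43} & 1
\end{pmatrix}
+ \sigma_C^2 C + \sigma_e^2 I
\]
Each matrix is n x n (written out here for n = 4).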

Use IBD variation within sibs to estimate heritability
- Use variation in genetic sharing within a relative type rather than across different types of relatives
- Gets around the problem of the “Equal Environment” assumption in twin studies


Application to Height & BMI

Idea…
- It should be obvious by now that pretty much all the models that we have touched on this week can be expressed within this GCTA framework
- Yet only a small proportion of these have been parameterized in GCTA
- Considerable scope exists for parameterization of the GCTA framework in Mx…