Correlation for a pair of relatives

Presentation transcript:

Correlation for a pair of relatives

[Path diagram: genetic values G and phenotypes P for relatives 1 and 2, with paths labeled h and e]

$\gamma_i$ = correlation in genetic values for the ith type of relative
$\eta_i$ = correlation in environmental values for the ith type of relative

Two Generic Methods:

(1) Genetic Epidemiology
Unmeasured genotypes
Use correlations between informative relatives (e.g., twins, adoptees)

(2) Molecular approach
Measured genotypes (almost always SNPs)
Several approaches; the two most common are GCTA analysis and LD score regression

Twin Method: 1

Three Structural Equations:
(1) $R_{MZ} = 1.0\,h^2 + \eta_{MZ}\,e^2$
(2) $R_{DZ} = \gamma_{DZ}\,h^2 + \eta_{DZ}\,e^2$
(3) $1 = h^2 + e^2$

Problem: 3 equations but 5 unknowns ($h^2$, $e^2$, $\gamma_{DZ}$, $\eta_{MZ}$, $\eta_{DZ}$)

Solution: Make assumptions
Additive gene action, no assortative mating, therefore $\gamma_{DZ} = 0.5$
Equal environments assumption: $\eta_{MZ} = \eta_{DZ} = \eta$

Twin Method: 2

Three Structural Equations Rewritten:
(1) $R_{MZ} = 1.0\,h^2 + \eta\,e^2$
(2) $R_{DZ} = 0.5\,h^2 + \eta\,e^2$
(3) $1 = h^2 + e^2$

Solution:
(1.A) Subtract Eq (2) from Eq (1):
$R_{MZ} - R_{DZ} = h^2 + \eta\,e^2 - 0.5\,h^2 - \eta\,e^2 = 0.5\,h^2$
(1.B) Multiply both sides by 2:
$2(R_{MZ} - R_{DZ}) = h^2$

Twin Method: 3

Three Structural Equations Rewritten:
(1) $R_{MZ} = 1.0\,h^2 + \eta\,e^2$
(2) $R_{DZ} = 0.5\,h^2 + \eta\,e^2$
(3) $1 = h^2 + e^2$

Solution:
(2) Substitute the estimate of $h^2$ into Eq (3):
$e^2 = 1 - h^2$
(3) Substitute the estimates of $h^2$ and $e^2$ into either Eq (1) or Eq (2):
From Eq (1): $\eta = (R_{MZ} - h^2)/e^2$
From Eq (2): $\eta = (R_{DZ} - 0.5\,h^2)/e^2$
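The three solution steps of the twin method can be sketched in a few lines of Python. This is only an illustration under the slide's assumptions ($\gamma_{DZ} = 0.5$ and equal environments); the function name `twin_estimates` and the example correlations (0.6 and 0.4) are hypothetical and do not come from the slides.

```python
def twin_estimates(r_mz, r_dz):
    """Solve the three twin equations for h2, e2 and eta.

    Assumes gamma_DZ = 0.5 and eta_MZ = eta_DZ = eta, as on the slides.
    """
    h2 = 2.0 * (r_mz - r_dz)      # steps (1.A) and (1.B)
    e2 = 1.0 - h2                 # step (2), from Eq (3)
    if e2 == 0.0:
        raise ValueError("e2 is zero, so eta cannot be solved for")
    eta = (r_mz - h2) / e2        # step (3), using Eq (1)
    return h2, e2, eta


# Hypothetical twin correlations, chosen only for illustration.
h2, e2, eta = twin_estimates(r_mz=0.6, r_dz=0.4)
print(round(h2, 3), round(e2, 3), round(eta, 3))
```

Solving Eq (2) for $\eta$ instead of Eq (1) gives the same value, because both equations were used to obtain $h^2$.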

Adoption Method

(1) $R_{BioSibs} = .24 = .5\,h^2 + \eta\,e^2$
(2) $R_{AdpSibs} = .06 = \eta\,e^2$
(3) $h^2 + e^2 = 1$

Solution:
(1) Subtract Equation (2) from Equation (1):
$R_{BioSibs} - R_{AdpSibs} = .24 - .06 = .18 = .5\,h^2$
(2) Multiply this result by 2:
$2(.18) = 2(.5\,h^2)$, so $h^2 = .36$
(3) Substitute this quantity into Equation (3):
$.36 + e^2 = 1$, so $e^2 = .64$
(4) Substitute $e^2$ into Equation (2) and solve for $\eta$:
$.06 = \eta(.64)$, so $\eta = .06/.64 \approx .09$
Substituting $h^2$ and $e^2$ into Equation (1) gives the same value: $.24 = .5(.36) + \eta(.64)$.
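The adoption arithmetic can be checked with a short Python sketch. The correlations .24 and .06 are the slide's values; the function name `adoption_estimates` is a hypothetical label introduced here.

```python
def adoption_estimates(r_bio, r_adp):
    """Solve the adoption equations for h2, e2 and eta.

    r_bio: correlation for biological siblings, .5*h2 + eta*e2
    r_adp: correlation for adoptive siblings, eta*e2
    """
    h2 = 2.0 * (r_bio - r_adp)   # steps (1) and (2)
    e2 = 1.0 - h2                # step (3)
    eta = r_adp / e2             # step (4), from Equation (2)
    return h2, e2, eta


h2, e2, eta = adoption_estimates(r_bio=0.24, r_adp=0.06)
print(round(h2, 2), round(e2, 2), round(eta, 2))  # prints 0.36 0.64 0.09
```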

GCTA Analysis

Select random individuals from the general population
Genotype them on a large number of loci
Compute the genetic similarity between each pair of individuals
Pairs with high genetic similarity should have more similar phenotypes than pairs with low genetic similarity

$\mathbf{X} = \mathbf{A}\,V_A + \mathbf{D}$

$X_{ij} = (P_i - \mu)(P_j - \mu)$
$A_{ij}$ = correlation between additive genetic values for the ijth pair
$\mathbf{D}$ = diagonal matrix of residual effects
$V_A$ = additive genetic variance
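To make the matrix equation concrete, here is a minimal Python sketch. Because D is diagonal, the off-diagonal elements satisfy $E[X_{ij}] = A_{ij} V_A$, so $V_A$ can be approximated by regressing the phenotypic cross-products on $A_{ij}$ through the origin. This is a simplified stand-in and not the GCTA software's procedure, which fits the variance components by restricted maximum likelihood. The function names and the toy data are hypothetical.

```python
import numpy as np


def cross_products(p):
    """Return X with X[i, j] = (P_i - mu) * (P_j - mu), using the sample mean for mu."""
    p = np.asarray(p, dtype=float)
    d = p - p.mean()
    return np.outer(d, d)


def estimate_va(A, p):
    """Regress off-diagonal cross-products on genetic similarity, through the origin.

    Simplified illustration of X = A * V_A + D; only pairs with i < j are used,
    so the diagonal residual matrix D drops out.
    """
    A = np.asarray(A, dtype=float)
    X = cross_products(p)
    upper = np.triu_indices_from(A, k=1)   # index pairs with i < j
    a, x = A[upper], X[upper]
    return float(a @ x / (a @ a))          # least-squares slope with no intercept


# Toy data (hypothetical): similarities built so that the slope is exactly 0.5.
p = np.array([1.0, -1.0, 2.0, -2.0])
A = 2.0 * np.outer(p, p)
print(estimate_va(A, p))
```

With real data the pairwise similarities among unrelated individuals are small, so a very large sample is needed for a stable estimate.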

Linkage Disequilibrium (LD) Score Regression

Logic: a causal variant in a haplotype block in strong disequilibrium is more likely to show a high association with each locus in the block than one in a block with weak disequilibrium.

[Figure: three haplotype blocks. Block 3: High Probability; Block 2: Medium Probability; Block 1: Low Probability]

Bulik-Sullivan et al. (2015), Nat. Genetics, 47, 291-295

Linkage Disequilibrium (LD) Score Regression

So, compute an LD score for each locus. For the ith locus, the LD score equals the sum of the squared correlations with all the loci in the block, or

$\ell_i = \sum_j r_{ij}^2$

The larger the value of $\ell_i$, the greater the chance that the locus tags a causal variant, so regress the observed $\chi^2$ for each locus on its $\ell_i$ value.

Bulik-Sullivan et al. (2015), Nat. Genetics, 47, 291-295
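The LD score definition can be sketched in Python as follows. The sketch assumes a small genotype matrix with individuals in rows and loci in columns, treats all columns as one block, and includes each locus's correlation with itself ($r_{ii}^2 = 1$) in the sum. The function name `ld_scores` and the toy genotypes are hypothetical.

```python
import numpy as np


def ld_scores(genotypes):
    """Return the LD score of each locus: the row sums of squared correlations.

    genotypes: array of shape (individuals, loci), e.g. allele counts 0/1/2.
    Every locus must vary across individuals, or its correlation is undefined.
    """
    g = np.asarray(genotypes, dtype=float)
    r = np.corrcoef(g, rowvar=False)   # loci-by-loci correlation matrix
    return (r ** 2).sum(axis=1)        # sum over j of r_ij^2, including j = i


# Toy block (hypothetical): loci 0 and 1 are in perfect LD, locus 2 is independent.
toy = np.array([[0, 0, 1],
                [1, 1, 0],
                [2, 2, 1],
                [1, 1, 0]])
print(ld_scores(toy))
```

In the toy block, loci 0 and 1 each get a score of 2 (themselves plus one perfectly correlated neighbour), while locus 2 gets a score of 1.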

Linkage Disequilibrium (LD) Score Regression

$E[\chi^2 \mid \ell_i] = \dfrac{N h^2 \ell_i}{M} + N a + 1$

$N$ = sample size
$M$ = number of loci
$h^2$ = heritability
$a$ = confounding effects (e.g., population stratification)
$\ell_i = \sum_j r_{ij}^2$, where $r_{ij}^2$ is the squared correlation between the ith locus and each of the j loci in LD with it

Bulik-Sullivan et al. (2015), Nat. Genetics, 47, 291-295

Linkage Disequilibrium (LD) Score Regression

$E[\chi^2 \mid \ell_i] = \dfrac{N h^2 \ell_i}{M} + N a + 1$

Although it may not look like it, this equation is a linear regression equation. The dependent variable is $\chi^2$, the independent variable is $\ell_i$, the intercept is $Na + 1$, and the slope is $N h^2 / M$. Because we know $N$ and $M$, we can calculate $h^2$ from the estimated slope: $h^2 = \text{slope} \times M / N$.

Bulik-Sullivan et al. (2015), Nat. Genetics, 47, 291-295
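The regression step can be illustrated with a short Python sketch. It fits an ordinary, unweighted least-squares line with `numpy.polyfit` and converts the slope and intercept back to $h^2$ and $a$. This is a simplification of the published method, which uses a weighted regression. The function name `ldsc_estimates` and all numbers are hypothetical, and the $\chi^2$ values are generated without noise purely to show that the algebra recovers the inputs.

```python
import numpy as np


def ldsc_estimates(chi2, ell, n, m):
    """Recover h2 and a from a straight-line fit of chi-square on LD score.

    Slope = N * h2 / M and intercept = N * a + 1, as in the slide's equation.
    Uses plain unweighted least squares (an assumption of this sketch).
    """
    slope, intercept = np.polyfit(ell, chi2, 1)   # highest-degree coefficient first
    h2 = slope * m / n
    a = (intercept - 1.0) / n
    return float(h2), float(a)


# Hypothetical inputs: N = 1000 individuals, M = 500 loci, h2 = 0.4, a = 0.0001.
n, m = 1000, 500
ell = np.array([1.0, 2.0, 5.0, 10.0, 20.0])
chi2 = n * 0.4 * ell / m + n * 0.0001 + 1.0       # noise-free expected chi-square
h2_hat, a_hat = ldsc_estimates(chi2, ell, n, m)
print(round(h2_hat, 4), round(a_hat, 6))
```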