Lecture 17: Likelihood and estimates of variances

Presentation transcript:

Statistical Genomics
Lecture 17: Likelihood and estimates of variances
Zhiwu Zhang, Washington State University

Administration
- Homework 4: due Wednesday, March 23, 3:10PM. Include Hypotheses, Results, Methods, Conclusion and Discussion; the Results/Discussion should emphasize reasoning and explanation.
- Final exam: May 3, 120 minutes (3:10-5:10PM), 50 questions, reasoning.

Outline
- Solve MLM with unknown heritability
- Likelihood (univariate)
- Multivariate likelihood
- Full ML and REML
- EMMA

Teaching model
Hypothesis: the current way of teaching is the best.
Work: reject the null hypothesis.
Goal: improve power.

Mixed Model Equation
y = Xb + Zu + e
$$\begin{bmatrix} X'X & X'Z \\ Z'X & Z'Z + \frac{\sigma_e^2}{\sigma_a^2} A^{-1} \end{bmatrix} \begin{bmatrix} \hat{b} \\ \hat{u} \end{bmatrix} = \begin{bmatrix} X'y \\ Z'y \end{bmatrix}$$
$$\delta = \frac{\sigma_e^2}{\sigma_a^2}, \qquad h^2 = \frac{\sigma_a^2}{\sigma_a^2 + \sigma_e^2}, \qquad \delta = \frac{1-h^2}{h^2}$$
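As a concrete illustration, here is a minimal R sketch (not from the slides; the toy X, Z, A, y, and delta below are made up) of building and solving the MME once delta is fixed:

set.seed(99164)
n=10
X=matrix(1,n,1)                        # fixed effects: intercept only
Z=diag(n)                              # one record per individual
A=diag(n)                              # toy relationship matrix
y=rnorm(n,100,10)
delta=1                                # assumed ratio ve/va
LHS=rbind(cbind(t(X)%*%X, t(X)%*%Z),
          cbind(t(Z)%*%X, t(Z)%*%Z+delta*solve(A)))
RHS=rbind(t(X)%*%y, t(Z)%*%y)
sol=solve(LHS,RHS)                     # sol[1] is b-hat, sol[-1] is u-hat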

Unknown heritability
$$h^2 = \frac{\sigma_a^2}{\sigma_a^2 + \sigma_e^2}, \qquad \delta = \frac{\sigma_e^2}{\sigma_a^2}$$
Dividing the numerator and denominator by $\sigma_a^2$ gives
$$h^2 = \frac{1}{1+\delta}$$
so searching over a single parameter, $\delta$ (equivalently $h^2$), is enough to set up the MME when heritability is unknown.

A variable was observed as 95 from a normal distribution. The mean and SD of the distribution are most likely to be:
- 100 and 1
- 100 and 2
- 85 and 5
- 85 and 10

By approximation:
- 100 and 1: 5 SD from mean, P << 1%
- 100 and 2: 2.5 SD from mean, 1% < P < 5%
- 85 and 5: 2 SD from mean, P ≈ 5%
- 85 and 10: 1 SD from mean, P ≈ 32%
The last option is the most likely.
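These approximations can be verified in R (a quick check, not on the original slide), using two-tailed probabilities:

z=c(5, 2.5, 2, 1)        # SDs from the mean for the four options
2*pnorm(-z)              # 5.7e-07, 1.2e-02, 4.6e-02, 3.2e-01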

Visualization
x=rnorm(10000,100,1)
plot(density(x),xlim=c(60,105))

By density function
$$f(x;\mu,\sigma^2) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$
$$\frac{1}{\sqrt{2\pi}\,\sigma} \int_{-\infty}^{\infty} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right) dx = 1$$

Density function of the univariate normal distribution
$$f(x;\mu,\sigma^2) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$
dnormal=function(x=0,mean=0,sd=1){
  p=1/(sqrt(2*pi)*sd)*exp(-(x-mean)^2/(2*sd^2))
  return(p)
}

Density function of the normal distribution
x=c(95,95,95,95)
mean=c(100,100,85,85)
sd=c(1,2,5,10)
dnormal(x,mean,sd)
dnorm(x,mean,sd)   # R's built-in density function gives the same values

Two variables were observed as 95 and 97 from a normal distribution. The mean and SD of the distribution are most likely to be:
- 100 and 1
- 100 and 2
- 85 and 5
- 85 and 10

mean=c(100,100,85,85)
sd=c(1,2,5,10)
x1=rep(95,4)
x2=rep(97,4)
p1=dnormal(x1,mean,sd)
p2=dnormal(x2,mean,sd)
p1*p2   # A: 6.5e-09, B: 1.4e-04, C: 4.8e-05, D: 4.7e-04

The joint likelihood is the product of the two densities, so 85 and 10 is again the most likely.

Three individuals have the kinship matrix K given in the code below and observations of 95, 100 and 70, respectively. The population has a mean of 90. The genetic and residual variances are most likely to be:
- 95 and 5
- 5 and 95
- 50 and 50
- I do not know

Variance in MLM
y = Xb + Zu + e
$$\mathrm{Var}(y) = V = \mathrm{Var}(u) + \mathrm{Var}(e)$$
$$\mathrm{Var}(u) = G = 2K\sigma_a^2 = A\sigma_a^2, \qquad \mathrm{Var}(e) = R = I\sigma_e^2$$

Density function of the multivariate normal distribution
$$f(x;\mu,V) = \frac{1}{(2\pi)^{n/2}\,|V|^{1/2}} \exp\!\left(-\frac{1}{2}(x-\mu)^T V^{-1} (x-\mu)\right)$$
dmnormal=function(x=0,mean=0,V=NULL){
  n=length(x)
  p=1/(sqrt(2*pi)^n*sqrt(det(V)))*exp(-t(x-mean)%*%solve(V)%*%(x-mean)/2)
  return(p)
}

Density function of the multivariate normal distribution
x=matrix(c(100,95,70),3,1)
mean=rep(90,3)
K=matrix(c(1,.75,.25,.75,1,.25,.25,.25,1),3,3)
va=95
ve=5
V=2*K*va+diag(3)*ve   # residual variance enters only on the diagonal (R = I*ve)
dmnormal(x,mean,V)
va=5; ve=95           # rebuild V and re-evaluate for each candidate pair
va=50; ve=50

Log Likelihood
$$f(x;\mu,V) = \frac{1}{(2\pi)^{n/2}\,|V|^{1/2}} \exp\!\left(-\frac{1}{2}(x-\mu)^T V^{-1} (x-\mu)\right)$$
$$\sigma^2 = \sigma_a^2, \qquad H = \sigma^{-2} V$$
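Taking the log of this density with $V = \sigma^2 H$ (so $|V| = \sigma^{2n}|H|$ and $V^{-1} = \sigma^{-2}H^{-1}$) gives the log likelihood that is maximized over the variance parameters; this step is standard and is spelled out here for completeness:
$$\log L(y;\mu,\sigma^2,\delta) = -\frac{1}{2}\left[n\log(2\pi\sigma^2) + \log|H| + \frac{1}{\sigma^2}(y-\mu)^T H^{-1}(y-\mu)\right]$$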

Full and REsidual (REstricted) Maximum Likelihood
q is the rank of X.
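For reference, the two criteria can be written as follows (a sketch following the form used in Kang et al. 2008, with $\hat\beta$ the generalized least squares estimate of the fixed effects):
$$\ell_F(y;\beta,\sigma^2,\delta) = -\frac{1}{2}\left[n\log(2\pi\sigma^2) + \log|H| + \frac{1}{\sigma^2}(y-X\beta)^T H^{-1}(y-X\beta)\right]$$
$$\ell_R(y;\sigma^2,\delta) = \ell_F(y;\hat\beta,\sigma^2,\delta) + \frac{1}{2}\left[q\log(2\pi\sigma^2) + \log|X^T X| - \log|X^T H^{-1} X|\right]$$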

Differences between Full ML and REML

Feature            Full ML                    REML
Likelihood of      y                          residuals y - Xb
Fixed effects      depended on                removed
Model comparison   fixed and random effects   random effects only
Bias               negatively biased          unbiased

Optimization in two dimensions
(Figure: likelihood surface searched over the two variance components, Va and Ve.)
http://www.cc.gatech.edu/~dellaert/FrankDellaert/Notes/Entries/2002/2/1_Expectation_Maximization.html

Expectation and Maximization (EM)
y = Xb + Zu + e
Expectation (E) step: solve the MME at the current variance components,
$$\begin{bmatrix} \hat{b} \\ \hat{u} \end{bmatrix} = \begin{bmatrix} X'X & X'Z \\ Z'X & Z'Z + \frac{\sigma_e^2}{\sigma_a^2} A^{-1} \end{bmatrix}^{-1} \begin{bmatrix} X'y \\ Z'y \end{bmatrix}$$
with the inverted coefficient matrix partitioned as
$$\begin{bmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{bmatrix}$$

Expectation and Maximization (EM)
Maximization (M) step: update the variance components from the current solutions and the C partition.
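One standard form of these updates (a sketch following common EM-REML presentations, e.g. Mrode's textbook treatment; here $m$ is the number of levels of $u$, $q$ the rank of $X$, and $C_{22}$ the block of the inverted coefficient matrix shown above):
$$\hat\sigma_a^2 = \frac{\hat{u}^T A^{-1} \hat{u} + \sigma_e^2\,\mathrm{tr}(A^{-1} C_{22})}{m}, \qquad \hat\sigma_e^2 = \frac{y^T y - \hat{b}^T X^T y - \hat{u}^T Z^T y}{n - q}$$
The two steps are iterated until the variance components converge.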

EMMA: from two-dimensional to one-dimensional optimization
$$H = \sigma_a^{-2} V = A + \delta I = U_F\,\mathrm{diag}(\xi_1 + \delta, \ldots, \xi_n + \delta)\,U_F'$$
$U_F$ and $\xi$ are the eigenvectors and eigenvalues from the spectral decomposition of the A matrix.
Kang, H. M. et al. Efficient control of population structure in model organism association mapping. Genetics 178, 1709-1723 (2008).
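A minimal R sketch of why this helps (assuming A = 2K with the kinship matrix from the earlier example, and delta as a trial value): a single eigendecomposition of A yields the inverse and log-determinant of H for any delta, with no new factorization inside the search loop.

K=matrix(c(1,.75,.25,.75,1,.25,.25,.25,1),3,3)
A=2*K                                   # assumed relationship matrix
eig=eigen(A,symmetric=TRUE)
UF=eig$vectors
xi=eig$values
delta=0.5                               # trial value of ve/va
Hinv=UF%*%diag(1/(xi+delta))%*%t(UF)    # H^{-1} = UF diag(1/(xi+delta)) UF'
logdetH=sum(log(xi+delta))              # log|H| from the same decomposition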

Highlight
- Solve MLM with unknown heritability
- Likelihood (univariate)
- Multivariate likelihood
- Full ML and REML
- EMMA