Published by Annis Golden. Modified over 6 years ago.
1
Least-squares, Maximum likelihood and Bayesian methods
Xuhua Xia
2
Parameter estimation
Conceptual: Given a model and observed data, find the parameter values for which the model best fits the data.
Operational criteria:
- minimizing the sum of squared deviations (SS): the least-squares (LS) method
- minimizing the sum of absolute deviations (SAD)
- maximizing the likelihood: the maximum likelihood (ML) method
- maximizing the posterior probability: the Bayesian method
3
LS and ML method
$$ f(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}} $$
x <- c(1,5,8,9,8,9,10)
f <- function(mu,x) sum((x-mu)^2)
xmin <- optimize(f,c(0,10),x=x)
xmin
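As a quick check on the least-squares criterion, the sketch below compares the numerical minimizer of the sum of squares with the sample mean, which is the closed-form LS estimate of mu. The names `ss`, `mu_ls` and `mu_closed` are illustrative and not part of the slides.

```r
# Data from the slide
x <- c(1, 5, 8, 9, 8, 9, 10)

# Sum of squared deviations of x from a candidate value mu
ss <- function(mu, x) sum((x - mu)^2)

# Numerical minimization over the interval [0, 10]
fit <- optimize(ss, interval = c(0, 10), x = x)
mu_ls <- fit$minimum

# Closed-form LS estimate: the sample mean
mu_closed <- mean(x)

print(c(numerical = mu_ls, closed_form = mu_closed))
```

The two values should agree up to the tolerance of `optimize`.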
4
Least-Squares Estimation of Regression Coefficient
Replacing x by (x - x̄) does not change the slope but simplifies the derivation of b
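The derivation can be sketched as follows, assuming the substitution meant here is centering x at its mean, i.e. $x'_i = x_i - \bar{x}$ (the symbols were lost in extraction, so this is an assumption). The primed symbols and $a'$ are introduced only for this sketch.

```latex
% Centered predictor: x'_i = x_i - \bar{x}, so \sum_i x'_i = 0
SS = \sum_i \left( y_i - a' - b x'_i \right)^2

% Setting the partial derivatives to zero
\frac{\partial SS}{\partial a'} = -2 \sum_i \left( y_i - a' - b x'_i \right) = 0
  \;\Rightarrow\; a' = \bar{y}

\frac{\partial SS}{\partial b} = -2 \sum_i x'_i \left( y_i - a' - b x'_i \right) = 0
  \;\Rightarrow\; b = \frac{\sum_i x'_i y_i}{\sum_i x'^2_i}
  = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}

% Intercept on the original scale
a = \bar{y} - b \bar{x}
```

Because the centered values sum to zero, the two normal equations decouple, which is why the slope can be solved for directly.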
5
The LS method in linear regression
X    Y      R (Residual)
3    11.5   a + b*3 - 11.5
2    7.5    a + b*2 - 7.5
1    5      a + b*1 - 5
4    14     a + b*4 - 14
y = a + bx
$$ SS = \sum_i (\hat{y}_i - y_i)^2 = (a + b*3 - 11.5)^2 + (a + b*2 - 7.5)^2 + (a + b*1 - 5)^2 + (a + b*4 - 14)^2 $$
$$ SAD = \sum_i |\hat{y}_i - y_i| = |a + b*3 - 11.5| + |a + b*2 - 7.5| + |a + b*1 - 5| + |a + b*4 - 14| $$
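The two criteria can be tried on the four data points from the table. This is a minimal sketch: the function names `ss` and `sad` are illustrative, and `optim` with its default Nelder-Mead search is only one possible minimizer (the slides use EXCEL's solver).

```r
# Data from the table on this slide
x <- c(3, 2, 1, 4)
y <- c(11.5, 7.5, 5, 14)

# Criteria as functions of par = c(a, b)
ss  <- function(par, x, y) sum((par[1] + par[2] * x - y)^2)
sad <- function(par, x, y) sum(abs(par[1] + par[2] * x - y))

# Least squares: numerical search versus the closed-form solution from lm()
ls_fit    <- optim(c(1, 1), ss, x = x, y = y)
ls_closed <- coef(lm(y ~ x))   # a = 1.75, b = 3.1

# Least absolute deviation: numerical search only; the minimum need not be unique
sad_fit <- optim(c(1, 1), sad, x = x, y = y)

print(rbind(LS_numerical = ls_fit$par, LS_closed = ls_closed, SAD = sad_fit$par))
```

The SAD surface is not smooth, so the reported SAD estimates may depend on the starting values.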
6
Least SS or SAD in regression
- LS regression using EXCEL's regression function
- LS regression using EXCEL's solver
- Regression by minimizing SAD with EXCEL's solver
- Use resampling methods (bootstrap and jackknife) to estimate the standard errors of the intercept and slope estimated by the least SAD method
- Demonstration with EXCEL (LS_ML_Goodness_of_fit.xlsx)
7
Numerical demonstration
EXCEL (intro_LS_ML.xlsx)
R
y <- c(1,3,3,3,5)
x <- c(1,2,3,4,5)
# least squares: Param[1] is the intercept, Param[2] is the slope
f <- function(Param,x,y) sum((y-(Param[1]+Param[2]*x))^2)
o <- optim(c(1,1),f,x=x,y=y)
# absolute deviation
f2 <- function(Param,x,y) sum(abs(y-(Param[1]+Param[2]*x)))
o2 <- optim(c(1,1),f2,x=x,y=y)
# bootstrap the least-SAD fit on the iris data (100 resamples with replacement)
a <- rep(0,100)
b <- rep(0,100)
for(i in 1:100) {
  nd <- iris[sample(nrow(iris),nrow(iris),replace=TRUE),]
  o2 <- optim(c(1,1),f2,x=nd$Sepal.Length,y=nd$Petal.Length)
  a[i] <- o2$par[1]
  b[i] <- o2$par[2]
}
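The bootstrap loop collects 100 intercepts and slopes; one way to turn them into standard errors is to take the standard deviation of each set of bootstrap estimates. The sketch below is self-contained and assumes a fixed seed (an arbitrary choice made only for reproducibility); `sad`, `n_boot` and `se` are illustrative names.

```r
set.seed(1)  # arbitrary seed, assumed here only to make the run reproducible

# Sum of absolute deviations for par = c(intercept, slope)
sad <- function(par, x, y) sum(abs(y - (par[1] + par[2] * x)))

n_boot <- 100
a <- numeric(n_boot)
b <- numeric(n_boot)
for (i in seq_len(n_boot)) {
  # Resample the rows of the built-in iris data with replacement
  nd <- iris[sample(nrow(iris), nrow(iris), replace = TRUE), ]
  fit <- optim(c(1, 1), sad, x = nd$Sepal.Length, y = nd$Petal.Length)
  a[i] <- fit$par[1]
  b[i] <- fit$par[2]
}

# Bootstrap standard errors: SD of the resampled estimates
se <- c(intercept = sd(a), slope = sd(b))
print(se)
```

With only 100 resamples and a non-smooth criterion, the standard errors are rough; more resamples would give more stable values.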
8
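R
x <- c(1,5,8,9,8,9,10)
# LS estimate of mu
f <- function(mu,x) sum((x-mu)^2)
xmin <- optimize(f,c(0,10),x=x)
xmin
# ML estimate: negative log-likelihood, Par[1] is the mean, Par[2] is the SD
lnL <- function(Par,x) -sum(dnorm(x,Par[1],Par[2],log=TRUE))
o <- optim(c(1,1),lnL,x=x)
o
$$ f(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}} $$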
9
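R
$$ f(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}} $$
x <- c(1,5,8,9,8,9,10)
# negative log-likelihood of a normal sample, Par[1] is the mean, Par[2] is the SD
lnL <- function(Par,x) -sum(dnorm(x,Par[1],Par[2],log=TRUE))
o <- optim(c(1,1),lnL,x=x)
o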
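For the normal model the ML estimates have closed forms, which gives a way to check the numerical result: the ML mean is the sample mean, and the ML standard deviation divides by N rather than N - 1. The sketch below searches over log(sigma) instead of sigma, a choice made here (not in the slides) so that the search cannot step into negative standard deviations.

```r
x <- c(1, 5, 8, 9, 8, 9, 10)

# Negative log-likelihood; par[1] = mu, par[2] = log(sigma)
negLogLik <- function(par, x) {
  -sum(dnorm(x, mean = par[1], sd = exp(par[2]), log = TRUE))
}

fit <- optim(c(1, 0), negLogLik, x = x)
mu_ml    <- fit$par[1]
sigma_ml <- exp(fit$par[2])

# Closed-form ML estimates for comparison
mu_closed    <- mean(x)
sigma_closed <- sqrt(mean((x - mean(x))^2))  # divisor N, not N - 1

print(rbind(numerical = c(mu_ml, sigma_ml),
            closed_form = c(mu_closed, sigma_closed)))
```

Note that `sd(x)` uses the N - 1 divisor, so it is slightly larger than the ML estimate of sigma.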
10
A simple problem
Suppose we wish to estimate the proportion of males (p) in a fish population in a large lake. A random sample of N fish contains M males and F females (N = M + F). Any statistics book will tell us that p = M/N and that the standard deviation of p is SD = sqrt(pq/N), where q = 1 - p.
p = M/N is obvious, but how do we get the variance?
11
Mean and variance
Table columns: Fish, Sex, D_m, where D_m = 1 for a male and 0 for a female. In this example, 3 of the 10 fish are males.
The mean of D_m = 3/10 = 0.3, which is p.
The variance of D_m = 7(0 - 0.3)^2/10 + 3(1 - 0.3)^2/10 = 0.21
Standard deviation (SD) of D_m = sqrt(0.21) = 0.458
We want to know not the SD of D_m but the SD of the mean of D_m (the SD of p). The SD of the mean is defined as the standard error (SE). Thus, the standard deviation of p is SD(p) = 0.458/sqrt(10) = 0.145 = sqrt(pq/N)
In general:
The mean of D_m = ΣD_mi/N = M/N = p
The variance of D_m = Σ(D_mi - M/N)^2/N = F(0 - M/N)^2/N + M(1 - M/N)^2/N = pq
SD(p) = sqrt(pq/N)
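The arithmetic on this slide can be reproduced directly. The sketch assumes the sample of 10 fish with 3 males implied by the mean 3/10; the vector `D_m` is built from those counts, since the individual rows of the table did not survive extraction.

```r
# Dummy variable: 1 = male, 0 = female (3 males, 7 females)
D_m <- c(rep(1, 3), rep(0, 7))
N <- length(D_m)

p <- mean(D_m)                 # proportion of males
q <- 1 - p
var_Dm <- mean((D_m - p)^2)    # variance with divisor N, as on the slide
sd_Dm  <- sqrt(var_Dm)
se_p   <- sd_Dm / sqrt(N)      # SD of the mean, i.e. the standard error of p

print(c(p = p, variance = var_Dm, SD = sd_Dm, SE = se_p, formula = sqrt(p * q / N)))
```

The variance here uses the divisor N to match the slide; R's built-in `var()` uses N - 1 and would give a slightly larger value.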
12
Maximum likelihood illustration
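The likelihood approach always needs a model. As a fish is either a male or a female, we use the binomial distribution as the model, and the likelihood function is
$$ L = \frac{N!}{M!\,F!} \, p^M (1-p)^F $$
The maximum likelihood method finds the value of p that maximizes the likelihood. This maximization is simplified by maximizing the natural logarithm of L instead:
$$ \ln L = \ln\frac{N!}{M!\,F!} + M \ln p + F \ln(1-p) $$
$$ \frac{d \ln L}{dp} = \frac{M}{p} - \frac{F}{1-p} = 0 \;\Rightarrow\; p = \frac{M}{N} $$
The likelihood estimate of the variance of p is the negative reciprocal of the second derivative,
$$ \frac{d^2 \ln L}{dp^2} = -\frac{M}{p^2} - \frac{F}{(1-p)^2}, \qquad \mathrm{Var}(p) = -\frac{1}{d^2 \ln L / dp^2} = \frac{pq}{N} $$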
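A numerical version of this argument is sketched below, assuming the binomial log-likelihood M ln(p) + F ln(1 - p) (the constant term is dropped because it does not depend on p) and the counts M = 3, F = 7 from the earlier fish example. The finite-difference step `h` and the variable names are choices made for this sketch.

```r
M <- 3
F_count <- 7          # number of females (F is avoided because it is an alias for FALSE in R)
N <- M + F_count

# Binomial log-likelihood without the constant term
logLik <- function(p) M * log(p) + F_count * log(1 - p)

# Maximize over (0, 1)
fit <- optimize(logLik, interval = c(0.001, 0.999), maximum = TRUE, tol = 1e-8)
p_hat <- fit$maximum

# Second derivative at the maximum by central finite differences
h <- 1e-4
d2 <- (logLik(p_hat + h) - 2 * logLik(p_hat) + logLik(p_hat - h)) / h^2

# Variance of p as the negative reciprocal of the second derivative
var_hat <- -1 / d2

print(c(p_hat = p_hat, var_hat = var_hat, pq_over_N = p_hat * (1 - p_hat) / N))
```

The numerical variance should be close to pq/N = 0.3 * 0.7 / 10 = 0.021.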
13
Quadrat Sampling
14
Three Distribution Patterns
Random, Even, Contagious
15
Quadrat Sampling
Quadrat    N
1          2
2          2
3          3
4          0
5          6
.          .
100        1
Mean
Variance
16
Three Distribution Patterns
17
Three Probability Distributions
- Poisson distribution (random distribution): $p(x) = e^{-\mu}\mu^x/x!$; $\sigma^2 = \mu$
- Binomial distribution (even distribution): $\sigma^2 < \mu$
- Negative binomial distribution (contagious distribution): $\sigma^2 > \mu$
EXCEL demonstration:
- parameter estimation by least-squares and maximum likelihood methods
- goodness-of-fit test
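The variance-to-mean relationships can be illustrated with simulated quadrat counts. Everything in this sketch is hypothetical: the counts are simulated, not the data from the slides, and the seed, sample size and distribution parameters are arbitrary choices.

```r
set.seed(42)   # arbitrary seed for reproducibility
n <- 10000     # number of simulated quadrats (hypothetical)

# Simulated counts per quadrat under the three patterns
random_counts     <- rpois(n, lambda = 5)              # Poisson: variance = mean
even_counts       <- rbinom(n, size = 10, prob = 0.5)  # binomial: variance < mean
contagious_counts <- rnbinom(n, size = 2, mu = 5)      # negative binomial: variance > mean

# Variance-to-mean ratio: about 1, below 1, above 1
vmr <- function(counts) var(counts) / mean(counts)
ratios <- c(random = vmr(random_counts),
            even = vmr(even_counts),
            contagious = vmr(contagious_counts))
print(round(ratios, 2))
```

A ratio near 1 is consistent with a random pattern, below 1 with an even pattern, and above 1 with a contagious pattern; a formal decision would use the goodness-of-fit test mentioned on the slide.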