Lecture 5: More maximum likelihood estimation


Lecture 5: More maximum likelihood estimation
Trevor A. Branch
FISH 553 Advanced R
School of Aquatic and Fishery Sciences, University of Washington

Von Bertalanffy growth model
Scenario: the predicted length (Lpred) of a fish is related to its age (a) according to this equation:
Lpred = L∞ (1 − e^(−K a))
where L∞ is the asymptotic maximum length in cm, and K is the growth rate (I will set t0 = 0 so it vanishes)
The blue dots are the observed ages (a) and the observed lengths (Lobs). They are the data.
[Figure: observed length-at-age data, in separate panels for males and females]
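In R, this prediction is a one-liner; here is a minimal sketch (the parameter values 100 cm and 0.2 per yr are illustrative assumptions, not fitted estimates):

vb.length <- function(a, Linfinity, K) {
  # von Bertalanffy predicted length at age a, with t0 = 0
  Linfinity * (1 - exp(-K * a))
}
vb.length(a=0:10, Linfinity=100, K=0.2)  # predicted lengths (cm) at ages 0-10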

Questions to answer
1. Estimate the parameters (L∞, K, σ) for this equation separately for males and females
2. Calculate 95% confidence intervals for the asymptotic length L∞
3. Can we conclude that asymptotic length differs for males and females?

Normal likelihood: SD = 10
Each curve is a normal distribution with the mean at the model prediction. The height of the curve at each red data point is the likelihood. The likelihood depends on the curve type (normal), the model prediction at that age (the mean), the data points, and the standard deviation chosen (σ = 10 here)
[Figure LikeDensLines.pdf: normal likelihood curves along the growth curve; x-axis age (years), y-axis length (cm)]
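Written out, the likelihood of a single observation is the normal density evaluated at the observed length (standard notation; this matches the dnorm() calls used later in the lecture):

$$ L_i = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(L_{i,obs}-L_{i,pred})^2}{2\sigma^2}\right) $$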

Negative log-likelihood
The maximum likelihood estimate (MLE) is where the product of the likelihoods is highest
The MLE is also where the sum of the negative log-likelihoods is the smallest
To answer Question 1, we must find values of L∞, K, and σ that produce L_i,pred values that minimize −lnLtot
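Concretely, taking the negative log of the product of normal likelihoods gives the expression that the "manual" R code on the next slide implements:

$$ -\ln L_{tot} = \frac{n}{2}\ln(2\pi) + n\ln\sigma + \frac{1}{2\sigma^2}\sum_{i=1}^{n}\left(L_{i,obs}-L_{i,pred}\right)^2 $$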

Coding −lnL in R: three methods
Manually:
NLL <- 0.5*ndata*log(2*pi) + ndata*log(sigma) +
       1/(2*sigma*sigma) * sum((lengths-model.predL)^2)
Negative log of the product of normal likelihoods:
NLL2 <- -log(prod(dnorm(x=lengths, mean=model.predL, sd=sigma)))
Negative sum of the logs of the normal likelihoods:
NLL3 <- -sum(dnorm(x=lengths, mean=model.predL, sd=sigma, log=TRUE))
(The product form can underflow to zero for large datasets, so the log-sum form NLL3 is the numerically safest choice.)
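A quick way to confirm that the three methods agree is to evaluate them on toy values (the numbers below are made up for illustration, not the lecture data):

lengths <- c(25, 38, 47)      # toy observed lengths
model.predL <- c(27, 36, 45)  # toy model predictions
sigma <- 10
ndata <- length(lengths)
NLL <- 0.5*ndata*log(2*pi) + ndata*log(sigma) +
       1/(2*sigma*sigma) * sum((lengths-model.predL)^2)
NLL2 <- -log(prod(dnorm(x=lengths, mean=model.predL, sd=sigma)))
NLL3 <- -sum(dnorm(x=lengths, mean=model.predL, sd=sigma, log=TRUE))
c(NLL, NLL2, NLL3)  # all three should print the same value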

Exhaustive search for minimum −lnL
nLinf <- length(Linfinity)  # Linfinity is a vector of Linf values
nK <- length(K)             # K is a vector of K values
sigma <- 10                 # fix sigma (should loop over sigma too)
NLL <- matrix(nrow=nLinf, ncol=nK)
for (i in 1:nLinf) {
  for (j in 1:nK) {
    model.predL <- Linfinity[i] * (1-exp(-K[j]*ages))
    NLL[i,j] <- -sum(dnorm(x=lengths, mean=model.predL,
                           sd=sigma, log=TRUE))
  }
}
filled.contour(x=Linfinity, y=K, z=NLL)
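Continuing this sketch, the crude grid-search estimate can be read straight off the matrix using base R:

best <- which(NLL == min(NLL), arr.ind=TRUE)  # row and column of the smallest -lnL
Linfinity[best[1]]  # grid value of Linf at the minimum
K[best[2]]          # grid value of K at the minimum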

Surface of negative log-likelihood (fixed σ = 10)
[Figure: filled contour plot of −lnL as a function of Linf (cm) and K (per yr)]

What is optimization?
Optimization involves finding the values for the parameters of a function so that the function itself is minimized (or maximized)
Mathematically: find θ̂ = argmin over θ of f(θ)
For us, the function f will be the negative log-likelihood
For optimization we can use optim() in the base package or mle2() in the bbmle package
The main focus here will be on mle2()

Using mle2() to minimize functions
mle2(minuslogl, start, method, fixed, lower, upper)
Key arguments (there are many more):
minuslogl: function to minimize; must return a negative log-likelihood (the name ends in an "el", not a "one")
start: named list of starting values for the parameters being estimated
method: method to use for minimization; "Nelder-Mead", "BFGS", "CG", "L-BFGS-B", "SANN", or "Brent"
fixed: named list of parameters to hold fixed (treated as data)
For the "L-BFGS-B" method you can also specify
lower: lower bounds for the parameters (may include -Inf)
upper: upper bounds for the parameters (may include Inf)

Simple example
minim <- function(x, y) {
  result <- (x-3)^2 + (y-4)^2
  return(result)
}
> mle2(minuslogl=minim, start=list(x=4, y=5))

Call:
mle2(minuslogl = minim, start = list(x = 4, y = 5))

Coefficients:
x y
3 4

Log-likelihood: 0
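Note that minim() is not really a negative log-likelihood: mle2() simply minimizes whatever function it is given, so the "Log-likelihood" it reports is just the negated minimum, which is 0 at the solution (x = 3, y = 4).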

Von Bertalanffy −lnL function
VB.NLL <- function(Linfinity, K, sigma) {
  gender <- "Male"
  LA <- read.csv(file="LengthAge.csv")
  ages <- LA[LA$Gender==gender,]$Ages
  lengths <- LA[LA$Gender==gender,]$Lengths
  model.predL <- Linfinity*(1-exp(-K*ages))
  ndata <- length(ages)
  NLL <- -sum(dnorm(x=lengths, mean=model.predL,
                    sd=sigma, log=TRUE))
  return(NLL)
}
> VB.NLL(Linfinity=100, K=0.2, sigma=10)
[1] 202.9833

In-class exercise 1
Copy the code for the VB.NLL() function from the previous slide and download "LengthAge.csv" from the website
Use mle2() to find values for Linfinity, K, and sigma that minimize the negative log-likelihood from VB.NLL(), for males (a sketch of one possible call follows below)
Change the code to minimize the function for females
Try different starting values and different methods... some may not reach the best answer
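A minimal sketch of one such call, assuming the bbmle package is installed (the starting values are guesses, not the answers):

library(bbmle)
fit <- mle2(minuslogl=VB.NLL,
            start=list(Linfinity=100, K=0.2, sigma=10),
            method="Nelder-Mead")
coef(fit)    # MLEs of Linfinity, K, and sigma
logLik(fit)  # maximized log-likelihood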

What about mle2 and data?
VB.NLL2 <- function(Linfinity, K, sigma, gender, filename) {
  LA <- read.csv(file=filename)
  ages <- LA[LA$Gender==gender,]$Ages
  ...
}
mle2(minuslogl=VB.NLL2, ...,
     data=list(gender="Male", filename="Data//LengthAge.csv"))
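As a toy illustration of this data mechanism (minim2() and its values are assumptions for demonstration only): entries in the data list are passed to minuslogl as arguments but are not estimated.

library(bbmle)
minim2 <- function(x, y, offset) {
  (x-3)^2 + (y-4)^2 + offset  # offset is data, not an estimated parameter
}
mle2(minuslogl=minim2, start=list(x=4, y=5), data=list(offset=0))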

Using the optim() function
mle2() is based on optim(), which is called somewhat differently
All parameters being estimated must be packed into a single vector; other arguments can be passed as before
VB.NLL3 <- function(params) {
  Linfinity <- params[1]
  K <- params[2]
  sigma <- params[3]
  ...
}
When calling optim(), it requires vectors (not lists) for the starting parameter values, lower, upper, etc.
optim(fn=VB.NLL3, par=c(100, 0.2, 10))
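For example, a minimal sketch of a bounded fit, assuming VB.NLL3() has been completed as in the exercise below (the bounds are illustrative assumptions):

fit <- optim(par=c(100, 0.2, 10), fn=VB.NLL3,
             method="L-BFGS-B",
             lower=c(1, 0.01, 0.01), upper=c(500, 2, 100))
fit$par    # estimated Linfinity, K, and sigma
fit$value  # minimized negative log-likelihood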

In-class exercise 2
Modify the code for the VB.NLL() function, renaming it VB.NLL3(), and use optim() to minimize the function for males and for females

What about optim and data?
VB.NLL4 <- function(params, gender, filename) {
  Linfinity <- params[1]
  K <- params[2]
  sigma <- params[3]
  ...
}
optim(fn=VB.NLL4, par=c(100, 0.2, 10), ...,
      gender="Male", filename="Data//LengthAge.csv")
Confusingly, gender and filename are added to the optim() call as if they were arguments of optim() itself: any extra named arguments are passed through optim()'s "..." to fn
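A toy illustration of this pass-through (the function f() is an assumption for demonstration):

f <- function(params, offset) {
  (params[1]-3)^2 + (params[2]-4)^2 + offset
}
optim(par=c(0, 0), fn=f, offset=10)$par  # approx c(3, 4); offset reaches f via "..."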