Lecture 5 More maximum likelihood estimation Trevor A. Branch FISH 553 Advanced R School of Aquatic and Fishery Sciences University of Washington.



Von Bertalanffy growth model Scenario: the predicted length (L_pred) of a fish is related to its age (a) according to this equation: L_pred = L∞ (1 − exp(−K a)), where L∞ is the asymptotic maximum length in cm, and K is the growth rate (I will set t_0 = 0 so it vanishes). [Figure: two panels, Males and Females. The blue dots are the observed ages (a) and the observed lengths (L_obs). They are the data.]

Questions to answer 1. Estimate the parameters (L∞, K, σ) for this equation separately for males and females 2. Calculate 95% confidence intervals for asymptotic length 3. Can we conclude that asymptotic length differs for males and females?

Normal likelihood: SD = 10 [Figure (LikeDensLines.pdf): length (cm) plotted against age (years), with a normal curve drawn at each age.] Each curve is a normal distribution with its mean at the model prediction. The height of the curve at each red data point is the likelihood. The likelihood depends on the curve type (normal), the model prediction at that age (the mean), the data points, and the standard deviation chosen (σ = 10 here).

Negative log-likelihood The maximum likelihood estimate (MLE) is where the product of the likelihoods is highest. The MLE is also where the sum of the negative log-likelihoods is the smallest. To answer Question 1, we must find values of L∞, K, and σ that produce L_i,pred values that minimize −lnL_tot.
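Written out for n independent observations with normally distributed errors, the total negative log-likelihood has the standard closed form (the same expression the "manual" R code computes):

```latex
-\ln L_{\mathrm{tot}} = \frac{n}{2}\ln(2\pi) + n\ln\sigma
  + \frac{1}{2\sigma^{2}} \sum_{i=1}^{n} \left(L_{i,\mathrm{obs}} - L_{i,\mathrm{pred}}\right)^{2}
```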

Coding −lnL in R: three methods
Manually:
NLL <- 0.5*ndata*log(2*pi) + ndata*log(sigma) +
       1/(2*sigma*sigma) * sum((lengths-model.predL)^2)
Negative log of the product of normal likelihoods:
NLL2 <- -log(prod(dnorm(x=lengths, mean=model.predL, sd=sigma)))
Negative sum of the logs of normal likelihoods:
NLL3 <- -sum(dnorm(x=lengths, mean=model.predL, sd=sigma, log=TRUE))
(The prod() version can underflow to zero for large datasets, so the log=TRUE version is numerically safest.)
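As a quick check that the three methods agree, here is a self-contained sketch with made-up toy numbers (not the lecture's length-at-age data):

```r
# Toy data: observed lengths, model predictions, and a fixed sigma
lengths     <- c(30, 45, 52, 58)
model.predL <- c(32, 44, 50, 60)
sigma <- 10
ndata <- length(lengths)

# Method 1: the normal negative log-likelihood written out manually
NLL <- 0.5*ndata*log(2*pi) + ndata*log(sigma) +
       1/(2*sigma*sigma) * sum((lengths - model.predL)^2)

# Method 2: negative log of the product of the likelihoods
NLL2 <- -log(prod(dnorm(x=lengths, mean=model.predL, sd=sigma)))

# Method 3: negative sum of the log-likelihoods (numerically safest)
NLL3 <- -sum(dnorm(x=lengths, mean=model.predL, sd=sigma, log=TRUE))

all.equal(NLL, NLL2)  # TRUE
all.equal(NLL, NLL3)  # TRUE
```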

Exhaustive search for minimum −lnL
nLinf <- length(Linfinity)   # Linfinity: a vector of L∞ values
nK <- length(K)              # K: a vector of K values
sigma <- 10                  # fix sigma (should loop over sigma too)
NLL <- matrix(nrow=nLinf, ncol=nK)
for (i in 1:nLinf) {
  for (j in 1:nK) {
    model.predL <- Linfinity[i]*(1-exp(-K[j]*ages))
    NLL[i,j] <- -sum(dnorm(x=lengths, mean=model.predL,
                           sd=sigma, log=TRUE))
  }
}
filled.contour(x=Linfinity, y=K, z=NLL)
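The same grid search can be run end to end on simulated data. Everything below is made up for illustration (noise-free data with assumed true values L∞ = 100 and K = 0.2); because the truth lies exactly on the grid, the grid minimum lands on it:

```r
# Simulated, noise-free von Bertalanffy data (illustrative values)
ages    <- 1:20
lengths <- 100 * (1 - exp(-0.2 * ages))   # true L_inf = 100, true K = 0.2

Linfinity <- seq(80, 120, by=1)      # candidate L_inf values
K         <- seq(0.1, 0.3, by=0.01)  # candidate K values
sigma     <- 10                      # fixed

NLL <- matrix(nrow=length(Linfinity), ncol=length(K))
for (i in seq_along(Linfinity)) {
  for (j in seq_along(K)) {
    model.predL <- Linfinity[i] * (1 - exp(-K[j] * ages))
    NLL[i, j] <- -sum(dnorm(x=lengths, mean=model.predL,
                            sd=sigma, log=TRUE))
  }
}

# Locate the grid cell with the smallest negative log-likelihood
best <- which(NLL == min(NLL), arr.ind=TRUE)
Linfinity[best[1]]  # 100
K[best[2]]          # 0.2
```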

Surface of negative log-likelihood (fixed σ = 10, using filled.contour) [Figure: filled contour plot of −lnL over L∞ (cm) on the x-axis and K (per yr) on the y-axis.]

What is optimization? Optimization involves finding the values of the parameters of a function so that the function itself is minimized (or maximized). Mathematically: find the parameter vector θ that minimizes f(θ). For us, the function f will be the negative log-likelihood. For optimization we can use optim() in the base package or mle2() in the bbmle package. The main focus here will be on mle2().
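As a minimal illustration of what an optimizer does (a toy one-parameter function, not the growth model), base R's optimize() finds the minimum of f(x) = (x − 3)² numerically:

```r
# Minimize f(x) = (x - 3)^2 over the interval [-10, 10]
f <- function(x) (x - 3)^2
result <- optimize(f, interval = c(-10, 10))
result$minimum    # close to 3, the minimizing value of x
result$objective  # close to 0, the function value at the minimum
```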

Using mle2() to minimize functions
mle2(minuslogl, start, method, fixed, lower, upper)
Key parameters (there are many more):
minuslogl: function to minimize; must return the negative log-likelihood (it ends in an "el", not a "one")
start: list of named starting values of the parameters to minimize
method: method to use for minimization; "Nelder-Mead", "BFGS", "CG", "L-BFGS-B", "SANN", or "Brent"
fixed: list of fixed parameters (data)
For the "L-BFGS-B" method you can also specify:
lower: lower bounds for parameters (including -Inf)
upper: upper bounds for parameters (including Inf)

Simple example minim <- function(x, y) { result <- (x-3)^2 + (y-4)^2 return(result) } > mle2(minuslogl=minim, start=list(x=4, y=5)) Call: mle2(minuslogl = minim, start = list(x = 4, y = 5)) Coefficients: x y 3 4 Log-likelihood: 0

Von Bertalanffy –lnL function VB.NLL <- function(Linfinity, K, sigma) { gender <- "Male" LA <- read.csv(file="LengthAge.csv") ages <- LA[LA$Gender==gender,]$Ages lengths <- LA[LA$Gender==gender,]$Lengths model.predL <- Linfinity*(1-exp(-K*ages)) ndata <- length(ages) NLL <- -sum(dnorm(x=lengths, mean=model.predL,sd=sigma,log=TRUE)) return(NLL) } > VB.NLL(Linfinity=100, K=0.2, sigma=10) [1]

In-class exercise 1 Download the code file "6 VB NLL.r" containing the VB.NLL() function from the previous slide, and download "LengthAge.csv" from the website. Use mle2() (package bbmle) to find values for Linfinity, K, and sigma that minimize the negative log-likelihood for VB.NLL(), for males. Change the code to minimize the function for females. Use different starting values and different methods... some may not get the best answer.

What about mle2 and data? VB.NLL2 <- function(Linfinity, K, sigma, gender, filename) { LA <- read.csv(file=filename) ages <- LA[LA$Gender==gender,]$Ages... } mle2(minuslogl=VB.NLL2,..., data=list(gender="Male", filename="Data//LengthAge.csv"))

Using the optim() function
mle2 is based on optim(), which is called somewhat differently. All arguments that are being varied need to be in a single vector; other arguments can be as before.
VB.NLL3 <- function(params) {
  Linfinity <- params[1]
  K <- params[2]
  sigma <- params[3]
  ...
}
When calling optim(), it requires vectors (not lists) of parameter values, lower bounds, upper bounds, etc.:
optim(fn=VB.NLL3, par=c(100, 0.2, 10))

In-class exercise 2 Modify the code for the VB.NLL() function, renaming it VB.NLL3(), and use optim() to minimize the function for males and females.

What about optim and data?
VB.NLL4 <- function(params, gender, filename) {
  Linfinity <- params[1]
  K <- params[2]
  sigma <- params[3]
  ...
}
optim(fn=VB.NLL4, par=c(100, 0.2, 10), ...,
      gender="Male", filename="Data//LengthAge.csv")
Confusingly, gender and filename are added to the optim() call as if they were arguments of optim() itself: optim() passes any unmatched named arguments on to the function fn through its ... argument.
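This pass-through can be seen end to end in a self-contained sketch. The data are simulated with made-up true values (L∞ = 100 cm, K = 0.2 per yr), since LengthAge.csv is not reproduced here; the function name VB.NLL5 is hypothetical, and σ is held fixed so the noise-free fit stays well behaved:

```r
# Simulated, noise-free length-at-age data (illustrative true values)
sim.ages    <- 1:20
sim.lengths <- 100 * (1 - exp(-0.2 * sim.ages))  # true L_inf = 100, K = 0.2

# All estimated parameters in one vector, as optim() requires;
# the data arrive via extra named arguments
VB.NLL5 <- function(params, ages, lengths, sigma = 10) {
  Linfinity <- params[1]
  K         <- params[2]
  model.predL <- Linfinity * (1 - exp(-K * ages))
  -sum(dnorm(x = lengths, mean = model.predL, sd = sigma, log = TRUE))
}

# ages= and lengths= are not arguments of optim(); they are passed
# through optim()'s ... straight to VB.NLL5()
fit <- optim(par = c(80, 0.3), fn = VB.NLL5,
             ages = sim.ages, lengths = sim.lengths)
fit$par  # close to c(100, 0.2)
```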