
Lecture 5: The Maximum Likelihood Method

- Suppose we are trying to measure the true value of some quantity (x_T).
  - We make repeated measurements of this quantity: {x_1, x_2, ..., x_n}.
  - The standard way to estimate x_T from our measurements is to calculate the mean value

      $\mu_x = \frac{1}{n}\sum_{i=1}^{n} x_i$

    and set x_T = \mu_x. DOES THIS PROCEDURE MAKE SENSE??? The maximum likelihood method (MLM) answers this question and provides a general method for estimating parameters of interest from data.
- Statement of the Maximum Likelihood Method:
  - Assume we have made n measurements of x: {x_1, x_2, ..., x_n}.
  - Assume we know the probability distribution function that describes x: f(x, \alpha).
  - Assume we want to determine the parameter \alpha.
  - MLM: pick \alpha to maximize the probability of getting the measurements (the x_i's) that we did!
- How do we use the MLM?
  - The probability of measuring x_1 is f(x_1, \alpha) dx.
  - The probability of measuring x_2 is f(x_2, \alpha) dx.
  - ...
  - The probability of measuring x_n is f(x_n, \alpha) dx.
  - If the measurements are independent, the probability of getting the measurements we did is

      $L = f(x_1, \alpha)\,dx \cdot f(x_2, \alpha)\,dx \cdots f(x_n, \alpha)\,dx = \left[\prod_{i=1}^{n} f(x_i, \alpha)\right] dx^n$

    We can drop the dx^n term, as it is only a proportionality constant, leaving the likelihood function

      $L(\alpha) = \prod_{i=1}^{n} f(x_i, \alpha)$
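As a minimal numerical sketch of this idea (not part of the original lecture): for an assumed pdf, compute the log of the product above for a grid of parameter values and pick the maximum. The pdf, data values, and function names here are all hypothetical.

```python
import numpy as np

def log_likelihood(alpha, data, pdf):
    """Log of L(alpha) = prod_i f(x_i, alpha); summing logs avoids numerical underflow."""
    return np.sum(np.log(pdf(data, alpha)))

# Hypothetical example: f(x, alpha) is a unit-width Gaussian with unknown mean alpha.
gauss = lambda x, alpha: np.exp(-0.5 * (x - alpha) ** 2) / np.sqrt(2.0 * np.pi)

data = np.array([9.8, 10.1, 10.3, 9.9, 10.0])   # made-up measurements
alphas = np.linspace(9.0, 11.0, 2001)           # scan the parameter
best = alphas[np.argmax([log_likelihood(a, data, gauss) for a in alphas])]
print(best, data.mean())  # the scan maximum lands on the sample mean
```

For this Gaussian case the brute-force scan reproduces the sample mean, which is exactly the result derived analytically on the next slides.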

- We want to pick the \alpha that maximizes L:

    $\frac{\partial L}{\partial \alpha}\bigg|_{\alpha = \alpha^*} = 0$

- It is often easier to maximize ln L instead: L and ln L are both maximum at the same location, and the logarithm converts the product into a summation. The new maximization condition is

    $\frac{\partial \ln L}{\partial \alpha}\bigg|_{\alpha = \alpha^*} = \sum_{i=1}^{n} \frac{\partial \ln f(x_i, \alpha)}{\partial \alpha}\bigg|_{\alpha = \alpha^*} = 0$

  - \alpha could be an array of parameters (e.g. slope and intercept) or just a single variable.
  - The equations that determine \alpha range from simple linear equations to coupled non-linear equations.
- Example:
  - Let f(x, \alpha) be given by a Gaussian distribution function, and let \alpha = \mu be the mean of the Gaussian. We want to use our data plus the MLM to find the mean, \mu.
  - We want the best estimate of \mu from our set of n measurements {x_1, x_2, ..., x_n}.
  - Let's assume that \sigma is the same for each measurement.
  - The likelihood function for this problem is

      $L = \prod_{i=1}^{n} f(x_i, \mu) = \prod_{i=1}^{n} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x_i - \mu)^2}{2\sigma^2}}$    (Gaussian pdf)
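The maximization condition can be checked symbolically. The sketch below (an addition, not from the slides) uses SymPy with three symbolic measurements; the pattern generalizes to any n.

```python
import sympy as sp

n = 3  # small symbolic example; the result generalizes to any n
x = sp.symbols('x1:%d' % (n + 1))            # measurements x1, x2, x3
mu, sigma = sp.symbols('mu sigma', positive=True)

# Gaussian log-likelihood: sum over measurements of ln f(x_i, mu)
lnL = sum(sp.log(1 / (sigma * sp.sqrt(2 * sp.pi))) - (xi - mu)**2 / (2 * sigma**2)
          for xi in x)

# Solve d(lnL)/d(mu) = 0 for mu
print(sp.solve(sp.diff(lnL, mu), mu))  # [(x1 + x2 + x3)/3] -> the sample mean
```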

- Find the \mu that maximizes the log-likelihood function:

    $\ln L = \sum_{i=1}^{n} \left[\ln\frac{1}{\sigma\sqrt{2\pi}} - \frac{(x_i - \mu)^2}{2\sigma^2}\right]$

    $\frac{\partial \ln L}{\partial \mu} = \frac{1}{\sigma^2}\sum_{i=1}^{n}(x_i - \mu) = 0$    (factor out \sigma since it is a constant)

    $\sum_{i=1}^{n} x_i - n\mu = 0 \quad\Rightarrow\quad \mu = \frac{1}{n}\sum_{i=1}^{n} x_i$    (the average; don't forget the factor of n)

- If the \sigma_i are different for each data point, then \mu is just the weighted average:

    $\mu = \frac{\sum_{i=1}^{n} x_i/\sigma_i^2}{\sum_{i=1}^{n} 1/\sigma_i^2}$    (weighted average)
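A minimal sketch of the weighted-average estimator (the data values and function name below are made up for illustration):

```python
import numpy as np

def mlm_mean(x, sigma):
    """Maximum likelihood estimate of a Gaussian mean with per-point errors:
    the error-weighted average mu = sum(x_i/sigma_i^2) / sum(1/sigma_i^2)."""
    w = 1.0 / np.asarray(sigma) ** 2
    return np.sum(w * np.asarray(x)) / np.sum(w)

x     = np.array([10.2, 9.9, 10.4])  # made-up measurements
sigma = np.array([0.1, 0.1, 0.5])    # the third point is much less precise
print(mlm_mean(x, sigma))            # pulled toward the two precise points
# With equal sigmas the weights cancel and this reduces to x.mean().
```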

- Example:
  - Let f(x, \alpha) be given by a Poisson distribution, and let \alpha = \mu be the mean of the Poisson.
  - We want the best estimate of \mu from our set of n measurements {x_1, x_2, ..., x_n}.
  - The likelihood function for this problem is

      $L = \prod_{i=1}^{n} \frac{e^{-\mu}\mu^{x_i}}{x_i!} = \frac{e^{-n\mu}\,\mu^{x_1 + x_2 + \cdots + x_n}}{x_1!\,x_2!\cdots x_n!}$

  - Find the \mu that maximizes the log-likelihood function:

      $\frac{\partial \ln L}{\partial \mu} = \frac{\partial}{\partial \mu}\left[-n\mu + \ln\mu\sum_{i=1}^{n} x_i - \ln(x_1!\,x_2!\cdots x_n!)\right] = -n + \frac{1}{\mu}\sum_{i=1}^{n} x_i = 0$

      $\mu = \frac{1}{n}\sum_{i=1}^{n} x_i$    (the average)

- Some general properties of the maximum likelihood method:
  - For large data samples (large n) the likelihood function L approaches a Gaussian distribution.
  - Maximum likelihood estimates are usually consistent: for large n the estimates converge to the true values of the parameters we wish to determine.
  - Maximum likelihood estimates are usually unbiased: for all sample sizes the expected value of the estimate equals the true parameter value.
  - The maximum likelihood estimate is efficient: the estimate has the smallest variance.
  - The maximum likelihood estimate is sufficient: it uses all the information in the observations (the x_i's).
  - The solution from the MLM is unique.
  - Bad news: we must know the correct probability distribution for the problem at hand!
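The Poisson result can be verified numerically. This sketch (an addition; the count data are made up) maximizes the log-likelihood with a bounded scalar optimizer and compares against the sample mean:

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import poisson

counts = np.array([4, 7, 5, 6, 3, 5])  # made-up Poisson-distributed counts

# Negative log-likelihood for the Poisson mean mu (minimizing -lnL maximizes lnL)
nll = lambda mu: -np.sum(poisson.logpmf(counts, mu))

result = minimize_scalar(nll, bounds=(0.1, 20.0), method='bounded')
print(result.x, counts.mean())  # the numerical maximum agrees with the sample mean
```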

Maximum Likelihood Fit of Data to a Function

- Suppose we have a set of n measurements (x_i, y_i \pm \sigma_i):
  - Assume each measurement error \sigma_i is the standard deviation of a Gaussian pdf.
  - Assume that for each measured value y_i there is an x_i which is known exactly.
  - Suppose we know the functional relationship between the y's and the x's:

      $y = f(x; \alpha, \beta, \ldots)$

    where \alpha, \beta, ... are parameters. The MLM gives us a method to determine \alpha, \beta, ... from our data.
- Example: fitting data points to a straight line, f(x; \alpha, \beta) = \alpha + \beta x. This leads to two linear equations with two unknowns.
  - The likelihood function is

      $L = \prod_{i=1}^{n} \frac{1}{\sigma_i\sqrt{2\pi}}\exp\left[-\frac{(y_i - \alpha - \beta x_i)^2}{2\sigma_i^2}\right]$

  - Find \alpha and \beta by maximizing the likelihood function L.
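Since ln L is a constant minus (1/2) Σ (y_i - α - βx_i)²/σ_i², maximizing L is the same as minimizing that sum of squares. A sketch of the fit by direct numerical minimization, with hypothetical data points and equal errors:

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical data points with equal errors sigma_i
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 7.8])
sigma = np.full_like(y, 0.2)

# Maximizing ln L is equivalent to minimizing sum((y - a - b*x)^2 / sigma^2)
chi2 = lambda p: np.sum((y - p[0] - p[1] * x) ** 2 / sigma ** 2)

fit = minimize(chi2, x0=[0.0, 1.0])  # starting guess for (alpha, beta)
print(fit.x)                         # best-fit intercept and slope
```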

- For the sake of simplicity assume that all \sigma's are the same (\sigma_1 = \sigma_2 = \cdots = \sigma):

    $\frac{\partial \ln L}{\partial \alpha} = \frac{1}{\sigma^2}\sum_{i=1}^{n}(y_i - \alpha - \beta x_i) = 0$

    $\frac{\partial \ln L}{\partial \beta} = \frac{1}{\sigma^2}\sum_{i=1}^{n} x_i (y_i - \alpha - \beta x_i) = 0$

  We now have two equations that are linear in the unknowns \alpha, \beta. In matrix form:

    $\begin{pmatrix} n & \sum x_i \\ \sum x_i & \sum x_i^2 \end{pmatrix}\begin{pmatrix} \alpha \\ \beta \end{pmatrix} = \begin{pmatrix} \sum y_i \\ \sum x_i y_i \end{pmatrix}$

  Solving gives

    $\alpha = \frac{\sum y_i \sum x_i^2 - \sum x_i \sum x_i y_i}{n\sum x_i^2 - \left(\sum x_i\right)^2}, \qquad \beta = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{n\sum x_i^2 - \left(\sum x_i\right)^2}$

  These equations correspond to the least-squares straight-line results in Taylor (Ch. 8).
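The 2x2 system can also be solved directly, which sidesteps the closed-form algebra. A sketch reusing the hypothetical data from above:

```python
import numpy as np

# Same hypothetical data as in the previous sketch
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 7.8])
n = len(x)

# Build and solve the 2x2 normal equations for (alpha, beta)
A = np.array([[n,       x.sum()],
              [x.sum(), (x * x).sum()]])
b = np.array([y.sum(), (x * y).sum()])
alpha, beta = np.linalg.solve(A, b)
print(alpha, beta)  # matches the closed-form expressions above
```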

- EXAMPLE:
  - A trolley moves along a track at constant speed. Suppose measurements of time t (seconds) vs. distance d (mm) were made. (The data table from the original slide is not reproduced here.) From the data, find the best value for the speed v of the trolley.
  - Our model of the motion of the trolley tells us that

      $d = d_0 + v t$

  - We want to find v, the slope (\beta) of the straight line describing the motion of the trolley, with d_0 as the intercept (\alpha).
  - We need to evaluate the sums listed in the formula above. Plugging the measured data into the expressions for \alpha and \beta gives the best estimate of the starting point, d_0 = 0.8 mm, and the best estimate of the speed, v.

- The fitted line "best" represents our data (the MLM fit to the data for d = d_0 + v t).
  - Not all the data points are "on" the line.
  - The line minimizes the sum of squares of the deviations \Delta_i between the line and our data points d_i:

      $\Delta_i = \text{data} - \text{prediction} = d_i - (d_0 + v t_i)$
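A closing sketch tying the pieces together: fit a line with the closed-form expressions and inspect the deviations. The trolley-style data below are hypothetical, since the original slide's numbers are not reproduced here.

```python
import numpy as np

def residuals(t, d, d0, v):
    """Deviations between the measured distances and the fitted line d = d0 + v*t."""
    return d - (d0 + v * t)

# Hypothetical trolley-style data (time in s, distance in mm)
t = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
d = np.array([1.9, 2.8, 4.2, 4.9, 6.1])

# Slope and intercept from the closed-form least-squares expressions derived above
n = len(t)
v  = (n * (t * d).sum() - t.sum() * d.sum()) / (n * (t * t).sum() - t.sum() ** 2)
d0 = d.mean() - v * t.mean()

delta = residuals(t, d, d0, v)
print(delta.sum())  # ~0: the least-squares line balances the deviations about zero
```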