Maximum likelihood estimation


Maximum likelihood estimation Michail Tsagris & Ioannis Tsamardinos

Histograms revisited Looks symmetric and unimodal.

Normal distribution $f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$

MLE of normal distribution Suppose we have collected some data, RNA expression measurements for example. We draw a histogram and see a nice bell-shaped distribution. How do we find the parameters $\mu$ and $\sigma^2$? Assuming that the sample comes from a normally distributed population, we estimate the parameter values that are most likely to have produced the data we observed, that is, the values which maximise the likelihood of observing such data.

MLE of normal distribution So we will calculate the values that maximise the likelihood of having observed these data. Denote the n observed values by $x_1, x_2, \ldots, x_n$. Each $x_i$ comes from a normal population with some mean $\mu$ and some variance $\sigma^2$. These two parameters are the same for every observation and, in addition, the observations are independent of each other (independent and identically distributed, iid).

MLE of normal distribution So, each $x_i$ can be plugged into the formula (probability density function) of the normal distribution: $f(x_i) = \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{(x_i-\mu)^2}{2\sigma^2}}$. Let us take the product of all the $f(x_i)$: $L(\mu, \sigma^2) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{(x_i-\mu)^2}{2\sigma^2}}$. The goal is to maximise this expression, but it is easier to maximise its logarithm. Because the logarithm is strictly increasing, the pair $(\mu, \sigma^2)$ that maximises $L$ also maximises $\log L$.

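MLE of normal distribution
$\ell(\mu, \sigma^2) = -\frac{n}{2}\log(2\pi\sigma^2) - \frac{\sum_{i=1}^{n}(x_i-\mu)^2}{2\sigma^2}$
$\frac{\partial \ell}{\partial \mu} = \frac{\sum_{i=1}^{n}(x_i-\mu)}{\sigma^2} = 0 \Rightarrow \hat{\mu} = \frac{\sum_{i=1}^{n} x_i}{n} = \bar{x}$
$\frac{\partial \ell}{\partial \sigma^2} = -\frac{n}{2\sigma^2} + \frac{\sum_{i=1}^{n}(x_i-\mu)^2}{2\sigma^4} = 0 \Rightarrow \hat{\sigma}^2 = \frac{\sum_{i=1}^{n}(x_i-\hat{\mu})^2}{n}$.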
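The closed-form result can be checked numerically. The following is a minimal sketch, not part of the original slides: the data are simulated (a hypothetical sample from a normal distribution with mean 5 and standard deviation 2) and the function names are illustrative. It computes the two estimates and confirms that moving either parameter away from them lowers the log-likelihood.

```python
import math
import random


def normal_log_likelihood(data, mu, sigma2):
    """Log-likelihood of iid normal data for a given mean and variance."""
    n = len(data)
    sum_sq = sum((x - mu) ** 2 for x in data)
    return -0.5 * n * math.log(2 * math.pi * sigma2) - sum_sq / (2 * sigma2)


def normal_mle(data):
    """Closed-form MLEs: sample mean, and variance with divisor n (not n - 1)."""
    n = len(data)
    mu_hat = sum(data) / n
    sigma2_hat = sum((x - mu_hat) ** 2 for x in data) / n
    return mu_hat, sigma2_hat


# Hypothetical data: 200 simulated values (assumption, not from the slides).
rng = random.Random(0)
data = [rng.gauss(5.0, 2.0) for _ in range(200)]

mu_hat, sigma2_hat = normal_mle(data)
best = normal_log_likelihood(data, mu_hat, sigma2_hat)

# Nudging either parameter away from the MLE lowers the log-likelihood.
for d_mu, d_s2 in [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.2), (0.0, -0.2)]:
    assert normal_log_likelihood(data, mu_hat + d_mu, sigma2_hat + d_s2) < best

print(f"mu_hat = {mu_hat:.3f}, sigma2_hat = {sigma2_hat:.3f}")
```

Note that the variance estimate divides by n, which is what the derivation gives; the more familiar sample variance divides by n - 1.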

MLE of mean, median and proportion So, the MLE of the mean is simply the sample mean of the data. The sample median serves as the MLE of the median. What about proportions? Suppose that a dose of 1 μL kills 5 of the 30 mice in a wet lab experiment. The estimated kill rate at this dose of this specific drug is 5/30 ≈ 0.167, or 16.67%.
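For the proportion, the same maximum likelihood idea can be illustrated with a short sketch. It assumes, as the slides do not state explicitly, that each mouse dies independently with the same probability p, so the log-likelihood is that of a binomial experiment. A simple grid search (an illustrative choice, not the authors' method) then lands on 5/30.

```python
import math


def binomial_log_likelihood(p, successes, n):
    """Log-likelihood of p, up to an additive constant that does not depend on p."""
    return successes * math.log(p) + (n - successes) * math.log(1 - p)


killed, mice = 5, 30

# Candidate values of p strictly between 0 and 1, in steps of 1/3000.
grid = [i / 3000 for i in range(1, 3000)]
p_hat = max(grid, key=lambda p: binomial_log_likelihood(p, killed, mice))

print(f"grid maximiser: {p_hat:.4f}, sample proportion: {killed / mice:.4f}")
```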

MLE of mean and median The sample mean can also be seen as the quantity θ that minimises the sum of squared differences $\sum_{i=1}^{n} (x_i-\theta)^2$. The sample median, on the other hand, is the quantity θ that minimises the sum of absolute differences $\sum_{i=1}^{n} |x_i-\theta|$.
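A small numerical check makes the two minimisation properties concrete. This is a sketch on hypothetical data (five values, one of them large), using a grid search over candidate θ values rather than any particular optimiser.

```python
import statistics


def sum_of_squares(data, theta):
    return sum((x - theta) ** 2 for x in data)


def sum_of_abs(data, theta):
    return sum(abs(x - theta) for x in data)


# Hypothetical data with one large value, so mean and median differ clearly.
data = [2.0, 3.0, 5.0, 7.0, 20.0]

# Candidate values of theta from 0.00 to 25.00 in steps of 0.01.
grid = [i / 100 for i in range(0, 2501)]

best_sq = min(grid, key=lambda t: sum_of_squares(data, t))
best_abs = min(grid, key=lambda t: sum_of_abs(data, t))

print(f"minimiser of squared differences:  {best_sq}  (mean   = {statistics.mean(data)})")
print(f"minimiser of absolute differences: {best_abs}  (median = {statistics.median(data)})")
```

The large value pulls the mean upwards but leaves the median unchanged, which is why the two minimisers differ here.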

Confidence intervals Suppose one calculates the proportion of mice killed at a given drug dose. Is it enough to present just a number? What else can one say? The uncertainty must be quantified via a range of most likely values, an interval of high confidence. The most common choice is the 95% confidence interval. 95% is the standard and most popular confidence level, but not the only one.


Confidence intervals We performed a study, found some summary statistics and constructed a 95% confidence interval for the mean or the proportion. This means that with 95% probability the true mean lies within the computed range. Wrong!!!

Confidence intervals We performed a study, found some summary statistics and constructed a 95% confidence interval for the mean or the proportion. But we only did the analysis once. In other words, the procedure that produced our interval captures the true parameter in 95% of repeated studies, so we expect our interval to be one of the 95% that include the true mean, proportion, or any parameter in general.

Confidence intervals We performed a study, found some summary statistics and constructed a 95% confidence interval for the mean or the proportion. If we were to repeat this analysis many times, we would expect about 95% of the resulting intervals to include the true mean. Correct!!!
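The repeated-sampling interpretation can be illustrated by simulation. The sketch below makes several assumptions that are not in the slides: the population is normal with a known true mean of 86 and standard deviation of 5, each study samples n = 31 values, and the interval uses the t critical value 2.042 (the tabulated 0.975 quantile for 30 degrees of freedom). It counts how often the interval contains the true mean.

```python
import math
import random
import statistics


def t_interval(sample, t_crit):
    """Interval mean +/- t_crit * s / sqrt(n), with s the sample standard deviation."""
    n = len(sample)
    mean = statistics.fmean(sample)
    s = statistics.stdev(sample)  # divisor n - 1
    half_width = t_crit * s / math.sqrt(n)
    return mean - half_width, mean + half_width


def coverage(true_mu, true_sigma, n, t_crit, repeats, seed=1):
    """Fraction of simulated studies whose interval contains the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repeats):
        sample = [rng.gauss(true_mu, true_sigma) for _ in range(n)]
        lower, upper = t_interval(sample, t_crit)
        if lower <= true_mu <= upper:
            hits += 1
    return hits / repeats


# All numbers below are assumptions chosen for illustration.
rate = coverage(true_mu=86.0, true_sigma=5.0, n=31, t_crit=2.042, repeats=2000)
print(f"fraction of intervals containing the true mean: {rate:.3f}")
```

Each individual interval either contains the true mean or it does not; the 95% describes the long-run behaviour of the procedure.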


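(Student’s) t distribution (William Gosset) We have spoken of the normal distribution. Let us now see the t distribution.
Normal density: $f(x; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$
t density: $f(x; v) = \frac{\Gamma\left(\frac{v+1}{2}\right)}{\sqrt{v\pi}\,\Gamma\left(\frac{v}{2}\right)} \left(1+\frac{x^2}{v}\right)^{-\frac{v+1}{2}}$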

(Student’s) t distribution (William Gosset) As v (the degrees of freedom) increases, the t distribution approaches the standard normal distribution: $t_v \to N(0, 1)$ as $v \to \infty$.
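The convergence can be seen by evaluating both densities directly. This is a minimal sketch using only the two density formulas; the grid of x values and the chosen degrees of freedom are arbitrary illustrative choices.

```python
import math


def normal_pdf(x, mu=0.0, sigma2=1.0):
    """Normal density with mean mu and variance sigma2."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)


def t_pdf(x, v):
    """Student's t density with v degrees of freedom.

    Works on the log scale (lgamma) so that large v does not overflow.
    """
    log_const = (
        math.lgamma((v + 1) / 2)
        - math.lgamma(v / 2)
        - 0.5 * math.log(v * math.pi)
    )
    return math.exp(log_const - (v + 1) / 2 * math.log1p(x * x / v))


# Largest gap between the two densities over x in [-4, 4], for growing v.
xs = [i / 10 for i in range(-40, 41)]
max_gaps = {
    v: max(abs(t_pdf(x, v) - normal_pdf(x)) for x in xs)
    for v in (1, 5, 30, 1000)
}

for v, gap in max_gaps.items():
    print(f"v = {v:>4}: largest density gap = {gap:.5f}")
```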

Population and sample revisited Greek letters indicate population parameters; Latin letters correspond to sample estimates. Example for the mean, variance and standard deviation: $\mu, \sigma^2, \sigma$ (population) and $\bar{x}, s^2, s$ (sample).

(1-α)% confidence interval for the mean Suppose we have estimated the average glucose concentration to be $\bar{x} = 86$ mg/dL with a variance equal to $s^2 = 25$ and we want to construct a 95% confidence interval for the true concentration. Our sample consists of $n = 31$ people. The 95% is called the confidence level.

(1-α)% confidence interval for the mean With $\bar{x} = 86$ mg/dL, $s^2 = 25$ and $n = 31$, the interval is $\left( \bar{x} - t_{1-\frac{\alpha}{2},\, n-1} \frac{s}{\sqrt{n}},\ \bar{x} + t_{1-\frac{\alpha}{2},\, n-1} \frac{s}{\sqrt{n}} \right)$. We have everything but the term $t_{1-\frac{\alpha}{2},\, n-1}$.

t distribution tables

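(1-α)% confidence interval for the mean $t_{1-\frac{\alpha}{2},\, n-1} = t_{1-\frac{0.05}{2},\, 31-1} = t_{0.975,\, 30} = 2.042$, so the interval is $\left( 86 - 2.042 \sqrt{\frac{25}{31}},\ 86 + 2.042 \sqrt{\frac{25}{31}} \right) = (84.166,\ 87.834)$.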
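The arithmetic for the glucose example can be reproduced in a few lines. This sketch assumes the sample size n = 31 stated earlier, hence 30 degrees of freedom, and takes the critical value 2.042 from a t table rather than computing it; the function name is illustrative.

```python
import math


def mean_confidence_interval(xbar, s2, n, t_crit):
    """Interval xbar +/- t_crit * sqrt(s2 / n) for a population mean."""
    half_width = t_crit * math.sqrt(s2 / n)
    return xbar - half_width, xbar + half_width


# Values from the glucose example; t_crit is the tabulated t_{0.975, 30}.
lower, upper = mean_confidence_interval(xbar=86.0, s2=25.0, n=31, t_crit=2.042)
print(f"({lower:.3f}, {upper:.3f})")  # (84.166, 87.834)
```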

(1-α)% confidence interval for the proportion In a sample of 132 women who smoke, 58 were found to have increased chances of getting breast cancer: $\hat{p} = \frac{58}{132} = 0.4394$, or 43.94%. The relevant 95% confidence interval is given by $\left( \hat{p} - Z_{1-\frac{\alpha}{2}} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}},\ \hat{p} + Z_{1-\frac{\alpha}{2}} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \right)$. What is $Z_{1-\frac{\alpha}{2}}$?

(1-α)% confidence interval for the proportion $Z_{1-\frac{\alpha}{2}} = Z_{1-\frac{0.05}{2}} = Z_{0.975} = 1.96$, so the interval is $\left( 0.4394 - 1.96 \sqrt{\frac{0.4394(1-0.4394)}{132}},\ 0.4394 + 1.96 \sqrt{\frac{0.4394(1-0.4394)}{132}} \right) = (0.3547,\ 0.5241)$, or (35.47%, 52.41%).
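The proportion interval can be computed the same way. In this sketch the normal quantile comes from Python's standard library (`statistics.NormalDist`) instead of a table, so it uses 1.959964 rather than the rounded 1.96; the function name and its `confidence` parameter are illustrative.

```python
import math
from statistics import NormalDist


def proportion_confidence_interval(successes, n, confidence=0.95):
    """Normal-approximation interval p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for 95%
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


# 58 of 132 women, as in the example.
lower, upper = proportion_confidence_interval(58, 132)
print(f"({lower:.4f}, {upper:.4f})")  # (0.3547, 0.5241)
```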