Chap 8: Estimation of Parameters & Fitting of Probability Distributions Section 8.1: INTRODUCTION The values of unknown parameters must be estimated before probability laws can be fitted to data.

Section 8.2: Fitting the Poisson Distribution to Emissions of Alpha Particles (classical example) Recall: the probability mass function of a Poisson random variable X is given by $P(X = k) = \dfrac{\lambda^k e^{-\lambda}}{k!}$ for $k = 0, 1, 2, \ldots$ From the observed data, we must estimate a value for the parameter $\lambda$.
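As a concrete sketch (the counts below are hypothetical, not the alpha-particle data), a natural estimate of $\lambda$, justified later in Sections 8.4 and 8.5, is the sample mean of the observed counts:

import numpy as np

# Hypothetical observed counts: number of particles recorded in each of
# 12 time intervals (NOT the data from the text).
counts = np.array([3, 1, 4, 2, 0, 3, 5, 2, 1, 4, 3, 2])

# Both the method of moments and maximum likelihood estimate lambda
# by the sample mean.
lam_hat = counts.mean()
print(f"estimated lambda: {lam_hat:.3f}")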

What if the experiment is repeated? The estimate of $\lambda$ will be viewed as a random variable, which has a probability dist’n referred to as its sampling distribution. The spread of the sampling distribution reflects the variability of the estimate. Chap 8 is about fitting the model to data; Chap 9 deals with testing such a fit.
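To make the sampling distribution concrete, here is a small simulation sketch (the true $\lambda$ and sample size are chosen purely for illustration): the experiment is repeated many times, $\lambda$ is re-estimated each time, and the spread of those estimates is compared with the theoretical standard error $\sqrt{\lambda/n}$.

import numpy as np

rng = np.random.default_rng(0)
true_lam, n = 3.0, 100   # illustrative true parameter and sample size

# Repeat the "experiment" 10,000 times; each run yields one estimate of
# lambda (the sample mean of n Poisson counts).
estimates = rng.poisson(true_lam, size=(10_000, n)).mean(axis=1)

# The spread of the estimates is the variability of the estimator.
print(f"simulated SE: {estimates.std():.4f}")
print(f"theoretical SE sqrt(lambda/n): {np.sqrt(true_lam / n):.4f}")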

Assessing Goodness of Fit (GOF): Example: fit a Poisson dist’n to counts (p. 240). Informally, GOF is assessed by comparing the observed (O) and expected (E) counts, grouped into 16 cells so that each expected count is at least 5. Formally, a measure of discrepancy such as Pearson’s chi-square statistic, $X^2 = \sum_{i=1}^{16} \dfrac{(O_i - E_i)^2}{E_i}$, is used to quantify the comparison of the O and E counts.
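A minimal sketch of the computation, using hypothetical grouped counts (not the counts from p. 240):

import numpy as np

# Hypothetical observed and expected counts for five cells
# (each expected count is at least 5, per the grouping rule).
O = np.array([10, 22, 31, 25, 12])
E = np.array([12.5, 20.0, 28.0, 26.5, 13.0])

x2 = np.sum((O - E) ** 2 / E)   # Pearson's chi-square statistic
print(f"X^2 = {x2:.2f}")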

Null dist’n: $X^2$ is a random variable (a function of the random counts) whose probability dist’n is called its null distribution. It can be shown that the null dist’n of $X^2$ is approximately the chi-square dist’n with degrees of freedom df = (no. of cells) − (no. of independent parameters fitted) − 1. Here, df = 16 (cells) − 1 (parameter $\lambda$) − 1 = 14. The larger the value of $X^2$, the worse the fit.

p-value: Figure 8.1 on page 242 gives a good feel for what a p-value is. The p-value measures the degree of evidence against the null statement that the model fits the data well, i.e., that the Poisson is the true model. The smaller the p-value, the worse the fit: there is more evidence against the model. A small p-value thus means rejecting the null, i.e., concluding that the model does NOT fit the data well. How small is small? Reject when p-value ≤ α, where α is the significance level of the test.
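A sketch of the p-value computation for a chi-square null dist’n with df = 14 (the statistic value below is illustrative):

from scipy.stats import chi2

x2, df = 20.5, 14                  # illustrative statistic; df from the example above
p_value = chi2.sf(x2, df)          # P(chi-square with 14 df >= observed X^2)
print(f"p-value = {p_value:.3f}")  # small p-value => evidence against the model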

8.3: Parameter Estimation: MOM & MLE Let the observed data be a random sample, i.e., a sequence $X_1, \ldots, X_n$ of i.i.d. random variables whose joint distribution depends on an unknown parameter $\theta$ (scalar or vector). An estimate $\hat{\theta}$ of $\theta$ will be a random variable, a function of the $X_i$, whose dist’n is known as its sampling dist’n. The standard deviation of the sampling dist’n is termed the standard error of the estimate.

8.4: The Method of Moments Definition: the $k$th (pop’n) moment of a random variable $X$ is $\mu_k = E(X^k)$; the $k$th (sample) moment is $\hat{\mu}_k = \frac{1}{n}\sum_{i=1}^{n} X_i^k$, and $\hat{\mu}_k$ is viewed as an estimate of $\mu_k$. Algorithm: MOM estimates parameter(s) by finding expressions for them in terms of the lowest possible (pop’n) moments and then substituting (sample) moments into the expressions.
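A sketch for the standard gamma case: with mean $\alpha/\lambda$ and variance $\alpha/\lambda^2$, matching the first two sample moments gives $\hat{\lambda} = \bar{x}/\hat{\sigma}^2$ and $\hat{\alpha} = \bar{x}^2/\hat{\sigma}^2$ (the sample below is simulated purely for illustration).

import numpy as np

rng = np.random.default_rng(1)
# Simulated gamma sample with shape alpha = 2 and rate lambda = 0.5
# (numpy parameterizes by scale = 1/rate).
x = rng.gamma(shape=2.0, scale=1 / 0.5, size=500)

xbar = x.mean()
s2 = x.var()                 # second central sample moment
lam_hat = xbar / s2          # method-of-moments estimates
alpha_hat = xbar ** 2 / s2
print(f"alpha_hat = {alpha_hat:.2f}, lambda_hat = {lam_hat:.2f}")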

8.5: The Method of Maximum Likelihood Algorithm: let $X_1, \ldots, X_n$ be a sequence of i.i.d. random variables with density or PMF $f(x \mid \theta)$. The likelihood function is $\mathrm{lik}(\theta) = \prod_{i=1}^{n} f(X_i \mid \theta)$. The MLE of $\theta$ is the value of $\theta$ that maximizes the likelihood function, or equivalently its natural logarithm (since the logarithm is a monotonic function). The log-likelihood function $\ell(\theta) = \sum_{i=1}^{n} \log f(X_i \mid \theta)$ is then maximized to get the MLE.
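A sketch of numerical maximization of a log-likelihood, for the Poisson case where the closed-form answer (the sample mean) is known and can be used to check the optimizer:

import numpy as np
from scipy.optimize import minimize_scalar
from scipy.special import gammaln

x = np.array([3, 1, 4, 2, 0, 3, 5, 2])   # hypothetical Poisson counts

def neg_log_lik(lam):
    # Poisson log-likelihood: sum_i [x_i log(lam) - lam - log(x_i!)],
    # negated because the optimizer minimizes.
    return -np.sum(x * np.log(lam) - lam - gammaln(x + 1))

res = minimize_scalar(neg_log_lik, bounds=(1e-6, 50), method="bounded")
print(f"numerical MLE: {res.x:.4f}, sample mean: {x.mean():.4f}")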

8.5.1: MLEs of Multinomial Cell Probabilities Suppose that $X_1, \ldots, X_m$, the counts in $m$ cells, follow a multinomial distribution with total count $n$ and cell probabilities $p_1, \ldots, p_m$. Caution: the marginal dist’n of each $X_i$ is binomial, BUT the $X_1, \ldots, X_m$ are not INDEPENDENT, i.e., their joint PMF is not the product of the marginal PMFs. The good news is that maximum likelihood still applies. Problem: estimate the $p_i$ from the $x_i$.

8.5.1a: MLEs of Multinomial Cell Probabilities (cont’d) Assuming $n$ is given, we wish to estimate $p_1, \ldots, p_m$. From the joint PMF, the log-likelihood becomes $\ell(p_1, \ldots, p_m) = \log n! - \sum_{i=1}^{m} \log x_i! + \sum_{i=1}^{m} x_i \log p_i$. Maximizing this log-likelihood subject to the constraint $\sum_{i=1}^{m} p_i = 1$ with a Lagrange multiplier yields $\hat{p}_i = x_i / n$.
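The derivation in brief, dropping the terms that do not involve the $p_i$:

\begin{align*}
L(p_1, \ldots, p_m, \lambda) &= \sum_{i=1}^{m} x_i \log p_i + \lambda \Big( \sum_{i=1}^{m} p_i - 1 \Big), \\
\frac{\partial L}{\partial p_i} = \frac{x_i}{p_i} + \lambda = 0 &\;\Rightarrow\; p_i = -\frac{x_i}{\lambda}, \\
\sum_{i=1}^{m} p_i = 1 \;\Rightarrow\; \lambda = -n &\;\Rightarrow\; \hat{p}_i = \frac{x_i}{n}.
\end{align*}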

8.5.1b: MLEs of Multinomial Cell Probabilities (cont’d) Deja vu: note that the sampling dist’n of each $\hat{p}_i = X_i / n$ is determined by the binomial dist’n of $X_i$. Hardy-Weinberg Equilibrium (GENETICS): here the multinomial cell probabilities are functions of another unknown parameter $\theta$, that is, $p_i = p_i(\theta)$. Read Example A.
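A sketch under the usual Hardy-Weinberg parameterization $p_1 = (1-\theta)^2$, $p_2 = 2\theta(1-\theta)$, $p_3 = \theta^2$ (assumed here; check Example A for the text's exact setup): maximizing the multinomial log-likelihood gives the closed form $\hat{\theta} = (x_2 + 2x_3) / (2n)$, applied below to made-up genotype counts.

import numpy as np

x = np.array([342, 500, 187])   # hypothetical genotype counts (e.g., AA, Aa, aa)
n = x.sum()

theta_hat = (x[1] + 2 * x[2]) / (2 * n)   # closed-form MLE of theta
p_hat = np.array([(1 - theta_hat) ** 2,
                  2 * theta_hat * (1 - theta_hat),
                  theta_hat ** 2])        # fitted cell probabilities
print(f"theta_hat = {theta_hat:.4f}, fitted probs = {np.round(p_hat, 4)}")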

8.5.2: Large Sample Theory for MLEs Let $\hat{\theta}$ be an estimate of a parameter $\theta$ based on a sample of size $n$. The variance of the sampling dist’n of many estimators decreases as the sample size $n$ increases. An estimate $\hat{\theta}$ is said to be a consistent estimate of a parameter $\theta$ if $\hat{\theta}$ approaches $\theta$ (in probability) as the sample size $n$ approaches infinity. Consistency is a limiting property; it does not require any particular behavior of the estimator for a finite sample size.

8.5.2: Large Sample Theory for MLEs (cont’d) Theorem: under appropriate smoothness conditions on $f$, the MLE $\hat{\theta}$ from an i.i.d. sample is consistent, and the probability dist’n of $\sqrt{n I(\theta_0)}\,(\hat{\theta} - \theta_0)$ tends to $N(0, 1)$. In other words, the large-sample distribution of the MLE is approximately normal with mean $\theta_0$ (the MLE is asymptotically unbiased) and asymptotic variance $1 / (n I(\theta_0))$, where the information about the parameter is $I(\theta) = E\Big[ \Big( \frac{\partial}{\partial \theta} \log f(X \mid \theta) \Big)^2 \Big]$.
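For the Poisson case $I(\lambda) = 1/\lambda$, so the theorem predicts that $\sqrt{n/\lambda_0}\,(\hat{\lambda} - \lambda_0)$ is approximately $N(0, 1)$; a quick simulation sketch (parameters chosen for illustration) checks this standardization:

import numpy as np

rng = np.random.default_rng(2)
lam0, n = 3.0, 200
# Poisson MLE = sample mean; compute it for 10,000 replicate samples.
mles = rng.poisson(lam0, size=(10_000, n)).mean(axis=1)

# Standardize by sqrt(n * I(lam0)) = sqrt(n / lam0); should be ~ N(0, 1).
z = np.sqrt(n / lam0) * (mles - lam0)
print(f"mean = {z.mean():.3f}, std = {z.std():.3f}")   # expect approx 0 and 1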

8.5.3: Confidence Intervals for MLEs Recall that a confidence interval (as seen in Chap 7) is a random interval containing the parameter of interest with some specified probability. Three (3) methods to get CIs for MLEs are: exact CIs; approximate CIs based on the large-sample theory of Section 8.5.2; bootstrap CIs.
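A sketch of the second method: by Section 8.5.2, an approximate $100(1-\alpha)\%$ CI is $\hat{\theta} \pm z_{\alpha/2} / \sqrt{n I(\hat{\theta})}$. For Poisson data (hypothetical counts below), $I(\lambda) = 1/\lambda$, so the standard error is approximately $\sqrt{\hat{\lambda}/n}$:

import numpy as np
from scipy.stats import norm

x = np.array([3, 1, 4, 2, 0, 3, 5, 2, 1, 4])   # hypothetical Poisson counts
lam_hat = x.mean()
se = np.sqrt(lam_hat / len(x))   # 1/sqrt(n*I(lam_hat)), with I(lam) = 1/lam
z = norm.ppf(0.975)              # for a 95% confidence level
print(f"approx 95% CI: ({lam_hat - z * se:.3f}, {lam_hat + z * se:.3f})")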

8.6: Efficiency & the Cramér-Rao Lower Bound Problem: given a variety of possible estimates, the best one to choose should have its sampling distribution highly concentrated about the true parameter $\theta_0$. Because of its analytic simplicity, the mean squared error $\mathrm{MSE} = E[(\hat{\theta} - \theta_0)^2] = \mathrm{Var}(\hat{\theta}) + \big( E(\hat{\theta}) - \theta_0 \big)^2$ will be used as the measure of such concentration.

8.6: Efficiency & the Cramér-Rao Lower Bound (cont’d) Unbiasedness means $E(\hat{\theta}) = \theta$. Definition: given two estimates $\hat{\theta}$ and $\tilde{\theta}$ of a parameter $\theta$, the efficiency of $\hat{\theta}$ relative to $\tilde{\theta}$ is defined to be $\mathrm{eff}(\hat{\theta}, \tilde{\theta}) = \mathrm{Var}(\tilde{\theta}) / \mathrm{Var}(\hat{\theta})$. Theorem (Cramér-Rao Inequality): under smoothness assumptions on the density $f(x \mid \theta)$ of the i.i.d. sequence $X_1, \ldots, X_n$, if $T = t(X_1, \ldots, X_n)$ is an unbiased estimate of $\theta$, we get the lower bound $\mathrm{Var}(T) \ge \dfrac{1}{n I(\theta)}$.
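A simulation sketch of relative efficiency (illustrative, not from the text): for normal data both the sample mean and the sample median are unbiased for $\mu$, and the ratio of their variances estimates the median's efficiency relative to the mean, which is about $2/\pi \approx 0.64$ asymptotically.

import numpy as np

rng = np.random.default_rng(3)
# 20,000 replicate samples of size 100 from N(0, 1).
samples = rng.normal(loc=0.0, scale=1.0, size=(20_000, 100))

var_mean = samples.mean(axis=1).var()         # variance of the sample mean
var_median = np.median(samples, axis=1).var() # variance of the sample median
print(f"eff(median, mean) = {var_mean / var_median:.3f}")   # approx 2/pi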

8.7: Sufficiency Is there a statistic $T(X_1, \ldots, X_n)$ containing all the information in the sample about the parameter $\theta$? If so, the original data may be reduced to this statistic without loss of information. Definition: a statistic $T$ is said to be sufficient for $\theta$ if the conditional dist’n of $X_1, \ldots, X_n$, given $T = t$, does not depend on $\theta$ for any value $t$. In other words, given the value of a sufficient statistic $T$, one can gain no more knowledge about the parameter from further investigation of the sample dist’n.

8.7.1: A Factorization Theorem How do we find a sufficient statistic? Theorem A: a necessary and sufficient condition for $T(X_1, \ldots, X_n)$ to be sufficient for a parameter $\theta$ is that the joint PDF or PMF factors in the form $f(x_1, \ldots, x_n \mid \theta) = g(T(x_1, \ldots, x_n), \theta)\, h(x_1, \ldots, x_n)$. Corollary A: if $T$ is sufficient for $\theta$, then the MLE is a function of $T$.
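A quick illustration for i.i.d. Poisson data, where the factorization exhibits $T = \sum_i X_i$ as sufficient for $\lambda$:

\[
f(x_1, \ldots, x_n \mid \lambda) = \prod_{i=1}^{n} \frac{\lambda^{x_i} e^{-\lambda}}{x_i!}
= \underbrace{\lambda^{\sum_i x_i}\, e^{-n\lambda}}_{g\left(\sum_i x_i,\; \lambda\right)} \cdot \underbrace{\prod_{i=1}^{n} \frac{1}{x_i!}}_{h(x_1, \ldots, x_n)}.
\]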

8.7.2: The Rao-Blackwell Theorem The following theorem gives a quantitative rationale for basing an estimator of a parameter $\theta$ on a sufficient statistic, if one exists. Theorem (Rao-Blackwell): let $\hat{\theta}$ be an estimator of $\theta$ with $E(\hat{\theta}^2) < \infty$ for all $\theta$. Suppose that $T$ is sufficient for $\theta$, and let $\tilde{\theta} = E(\hat{\theta} \mid T)$. Then, for all $\theta$, $E[(\tilde{\theta} - \theta)^2] \le E[(\hat{\theta} - \theta)^2]$. The inequality is strict unless $\tilde{\theta} = \hat{\theta}$.

8.8: Conclusion Some key ideas from Chap 7, such as sampling distributions and confidence intervals, were revisited. MOM and MLE were introduced, along with large-sample approximations to their sampling distributions. The theoretical concepts of efficiency, the Cramér-Rao lower bound, and sufficiency were discussed. Finally, some light was shed on parametric bootstrapping.