Parametric Inference

Properties of MLE: Consistency
True parameter: θ*. MLE from n samples: \hat{θ}_n.
Define M_n(θ) = (1/n) Σ_{i=1}^n log[ f(X_i; θ) / f(X_i; θ*) ], the sample distance between θ and θ*, and M(θ) = −D(θ*, θ), the true distance between θ and θ* (negative Kullback–Leibler divergence).
Condition 1: sup_θ | M_n(θ) − M(θ) | → 0 in probability, i.e., the sample distance converges asymptotically to the true distance, uniformly over the parameter space.
Condition 2: for every ε > 0, sup_{θ : |θ − θ*| ≥ ε} M(θ) < M(θ*), i.e., the model is identifiable.
Under these two conditions, \hat{θ}_n → θ* in probability.
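For illustration (not from the original slides), a minimal numpy sketch of consistency, assuming an Exponential(λ) model where the MLE has the closed form 1/x̄; the estimate visibly tightens around λ* = 2 as n grows:

```python
import numpy as np

rng = np.random.default_rng(0)
lam_true = 2.0  # true parameter lambda*

# For Exponential(lambda), the MLE is the reciprocal of the sample mean.
for n in [10, 100, 1000, 10000]:
    x = rng.exponential(scale=1.0 / lam_true, size=n)
    lam_mle = 1.0 / x.mean()
    print(f"n = {n:6d}   MLE = {lam_mle:.4f}")
```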

Properties of MLE: Equivariance
If \hat{θ}_n is the MLE of θ, then g(\hat{θ}_n) is the MLE of g(θ).
Condition: g is invertible, i.e., one-to-one and onto (see proof).
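A worked instance (an illustration added here, not on the slide): for Bernoulli(p) data the MLE of p is the sample mean, and the odds transformation g(p) = p/(1−p) is one-to-one on (0, 1), so equivariance gives the MLE of the odds directly:

```latex
\hat{p} = \bar{X}_n,
\qquad
\psi = g(p) = \frac{p}{1-p}
\quad\Longrightarrow\quad
\hat{\psi} = g(\hat{p}) = \frac{\bar{X}_n}{1-\bar{X}_n}.
```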

Properties of MLE: Asymptotic normality
(\hat{θ}_n − θ*) / se ⇝ N(0, 1), where se = sqrt(1 / (n I(θ*))) is the true standard error and I(θ*) is the Fisher information at the true parameter value θ*.
In practice θ* is unknown, so we use the approximate standard error \hat{se} = sqrt(1 / (n I(\hat{θ}_n))), where I(\hat{θ}_n) is the Fisher information at the MLE; the result still holds with \hat{se} in place of se.
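A minimal sketch of how asymptotic normality is used in practice (an added illustration, assuming a Bernoulli(p) model, where I(p) = 1/(p(1−p)) so \hat{se} = sqrt(\hat{p}(1−\hat{p})/n)), yielding the usual 95% Wald interval:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p_true = 500, 0.3
x = rng.binomial(1, p_true, size=n)

p_hat = x.mean()  # MLE of p
# Plug-in standard error: sqrt(1 / (n I(p_hat))) with I(p) = 1/(p(1-p)).
se_hat = np.sqrt(p_hat * (1.0 - p_hat) / n)

# Asymptotic normality gives the 95% Wald interval p_hat +/- 1.96 se_hat.
lo, hi = p_hat - 1.96 * se_hat, p_hat + 1.96 * se_hat
print(f"MLE = {p_hat:.3f}, 95% CI = ({lo:.3f}, {hi:.3f})")
```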

Fisher information
Define the score function: s(X; θ) = ∂/∂θ log f(X; θ), the rate of change of the log likelihood of X with respect to the parameter θ.
Fisher information at θ (one observation): I(θ) = Var_θ[s(X; θ)] = E_θ[s(X; θ)²] = −E_θ[∂²/∂θ² log f(X; θ)].
Fisher information at θ (n IID data points): I_n(θ) = n I(θ), a measure of the information carried by X1, X2, …, Xn about the model parameter θ.
Fact (Cramér–Rao bound): for any unbiased estimator \hat{θ} of θ, Var(\hat{θ}) ≥ 1 / I_n(θ), a lower bound on the variance of any unbiased estimator of θ.
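A short derivation making these definitions concrete (an added example, Bernoulli(p)):

```latex
% Bernoulli(p): f(x;p) = p^x (1-p)^{1-x}
s(X;p) = \frac{\partial}{\partial p}\log f(X;p) = \frac{X}{p} - \frac{1-X}{1-p},
\qquad
I(p) = \mathbb{E}_p\!\left[s(X;p)^2\right] = \frac{1}{p(1-p)}.

% Cramer-Rao: for any unbiased estimator \hat{p} from n IID draws,
\operatorname{Var}(\hat{p}) \;\ge\; \frac{1}{n\,I(p)} \;=\; \frac{p(1-p)}{n},
% which \bar{X}_n attains, so the sample mean is efficient.
```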

Parametric Bootstrap
Let τ be any statistic of X1, X2, …, Xn, and let τ1, …, τB be its bootstrap replicates.
Nonparametric bootstrap: each τb is computed from a sample Xb,1, Xb,2, …, Xb,n drawn from the empirical distribution \hat{F}_n.
Parametric bootstrap: each τb is computed from a sample Xb,1, Xb,2, …, Xb,n drawn from the fitted parametric distribution f(x; \hat{θ}), with \hat{θ} obtained by maximum likelihood or the method of moments.
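A minimal sketch contrasting the two bootstraps (an added illustration with hypothetical choices: exponential data, τ = sample median, B = 2000; the MLE of the exponential scale is the sample mean):

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.exponential(scale=2.0, size=100)  # observed data, scale treated as unknown
n, B = x.size, 2000
tau = np.median                           # the statistic of interest

# Nonparametric bootstrap: resample the observed data with replacement.
np_reps = np.array([tau(rng.choice(x, size=n, replace=True)) for _ in range(B)])

# Parametric bootstrap: sample from the fitted model Exponential(scale = x.mean()).
pb_reps = np.array([tau(rng.exponential(scale=x.mean(), size=n)) for _ in range(B)])

print(f"nonparametric se = {np_reps.std(ddof=1):.3f}")
print(f"parametric se    = {pb_reps.std(ddof=1):.3f}")
```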

Sufficient statistic
Any function T(X1, X2, …, Xn) of the data Xn is a statistic.
Definition 1: T is sufficient for θ if, whenever T(xn) = T(yn), the likelihood functions for the data sets xn and yn have the same shape, i.e., L(θ; xn) ∝ L(θ; yn) as functions of θ.
Recall that the likelihood function is specific to an observed data set xn!

Sufficient statistic
Intuitively, T is the connecting link between the data and the likelihood.
A sufficient statistic is not unique: for example, xn itself and T(xn) are both sufficient statistics.

Sufficient statistic
Definition 2: Factorization theorem. T is sufficient for θ if and only if the joint density factors as f(xn; θ) = g(T(xn), θ) h(xn).
Equivalently, the distribution of xn is conditionally independent of θ given T: f(xn | T; θ) does not depend on θ. This implies the first definition of sufficient statistic.
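A standard worked instance (added here; the slide does not spell it out): for Bernoulli(p) data the factorization is immediate with h(xn) = 1,

```latex
f(x^n; p) = \prod_{i=1}^{n} p^{x_i}(1-p)^{1-x_i}
          = \underbrace{p^{T}(1-p)^{\,n-T}}_{g(T,\,p)} \cdot \underbrace{1}_{h(x^n)},
\qquad T(x^n) = \sum_{i=1}^{n} x_i,
```

so T = Σ xi is sufficient for p. This also illustrates Definition 1: two data sets with the same sum have identical likelihood functions.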

Sufficient statistic
Minimal sufficient: T is minimal sufficient if it is sufficient and a function of every other sufficient statistic.
Equivalently, T is minimal sufficient if T(xn) = T(yn) ⟺ L(θ; xn) ∝ L(θ; yn). Recall that T is sufficient if T(xn) = T(yn) ⟹ L(θ; xn) ∝ L(θ; yn).
For example, for N(μ, 1) data the sum Σ Xi is minimal sufficient, whereas the full sample (X1, …, Xn) is sufficient but not minimal.

Sufficient statistic
Rao–Blackwell theorem: an estimator of θ should depend only on the sufficient statistic T; otherwise it can be improved by conditioning on T. For example, with Bernoulli data the crude estimator X1 is improved to E[X1 | Σ Xi] = \bar{X}n.
Exponential family of distributions:
One parameter θ: f(x; θ) = h(x) c(θ) exp{ w(θ) t(x) }.
Multiple parameters θ = (θ1, …, θk): f(x; θ) = h(x) c(θ) exp{ Σ_{j=1}^k wj(θ) tj(x) }.

Sufficient statistic
Exponential family: n IID random variables X1, X2, …, Xn have joint distribution f(xn; θ) = [Π_{i=1}^n h(xi)] c(θ)^n exp{ w(θ) Σ_{i=1}^n t(xi) }, so T(xn) = Σ t(Xi) is a sufficient statistic (Factorization theorem).
Examples include the Normal, Binomial, Poisson, and also the Exponential distribution.
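One of the named examples written out in the family's notation (added for concreteness): the Poisson(λ) pmf is

```latex
f(x;\lambda) = \frac{\lambda^{x} e^{-\lambda}}{x!}
             = \underbrace{\frac{1}{x!}}_{h(x)}\;
               \underbrace{e^{-\lambda}}_{c(\lambda)}\;
               \exp\{\, x \log\lambda \,\},
```

with w(λ) = log λ and t(x) = x, so for n IID observations T = Σ Xi is sufficient for λ.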

Iterative MLE
Start with an initial guess for the parameter(s) and obtain improved estimates in subsequent iterations until convergence. The initial parameter value could come from the method of moments estimator.
Newton–Raphson: an iterative technique for finding a local root of a function. Finding the MLE is equivalent to finding a root of the derivative of the log likelihood function.

Newton–Raphson
Taylor series expansion of ℓ′(θ) around the current parameter estimate θ^(j): ℓ′(θ) ≈ ℓ′(θ^(j)) + (θ − θ^(j)) ℓ″(θ^(j)).
For the MLE, ℓ′(θ) = 0. Solving for θ gives the update θ^(j+1) = θ^(j) − ℓ′(θ^(j)) / ℓ″(θ^(j)), which takes θ^(j) closer to the MLE at every iteration (near a well-behaved maximum).
Multi-parameter case: θ^(j+1) = θ^(j) − H⁻¹ ∇ℓ(θ^(j)), where ∇ℓ is the gradient and H is the Hessian of the log likelihood at θ^(j).
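A self-contained sketch of the update (an added illustration: Cauchy location MLE, which has no closed form; the sample median is used as the initial guess since the Cauchy has no finite moments, so the method-of-moments start suggested above is unavailable here):

```python
import numpy as np

rng = np.random.default_rng(3)
theta_true = 5.0
x = theta_true + rng.standard_cauchy(200)  # Cauchy(theta, 1) sample

def dloglik(theta):
    """First derivative of the Cauchy log likelihood, l'(theta)."""
    u = x - theta
    return np.sum(2.0 * u / (1.0 + u**2))

def d2loglik(theta):
    """Second derivative, l''(theta)."""
    u = x - theta
    return np.sum(2.0 * (u**2 - 1.0) / (1.0 + u**2) ** 2)

theta = np.median(x)  # robust initial guess
for _ in range(50):
    step = dloglik(theta) / d2loglik(theta)
    theta -= step       # theta_{j+1} = theta_j - l'(theta_j) / l''(theta_j)
    if abs(step) < 1e-10:
        break
print(f"MLE of the location: {theta:.4f}")
```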

Newton–Raphson
[Figure: one-parameter Newton–Raphson — each iteration follows the slope of the log likelihood derivative toward the MLE]

Expectation Maximization
An iterative MLE technique used in missing-data problems. Sometimes introducing missing data simplifies maximization of the log likelihood.
Two log likelihoods: complete data and incomplete data.
Two main steps: (1) compute the expectation of the complete-data log likelihood using the current parameters; (2) maximize that expectation over the parameter space to obtain new parameters.

Expectation Maximization
Observed (incomplete) data xn; missing data zn.
Incomplete-data log likelihood: ℓ(θ; xn) = log f(xn; θ).
Complete-data log likelihood: ℓc(θ; xn, zn) = log f(xn, zn; θ).
Expected complete-data log likelihood: Q(θ | θ^(j)) = E[ log f(xn, Zn; θ) | xn, θ^(j) ], the expectation taken over the missing data given the observed data and the current parameters.

Expectation Maximization: Algorithm
Start with an initial guess θ^(0) of the parameter value(s), then repeat steps 1 and 2 below for j = 0, 1, 2, ….
1. Expectation: compute Q(θ | θ^(j)) = E[ log f(xn, Zn; θ) | xn, θ^(j) ], where θ is the variable and θ^(j) is held constant.
2. Maximization: update the parameters by maximizing the above expectation over the parameter space: θ^(j+1) = argmax_θ Q(θ | θ^(j)).
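A minimal sketch of both steps (an added illustration, not from the slides: a two-component Gaussian mixture with known unit variances to keep the M-step closed-form; the missing data are the component labels Z):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(4)
# Simulated incomplete data: the component labels z are treated as missing.
z = rng.random(400) < 0.6
x = np.where(z, rng.normal(0.0, 1.0, 400), rng.normal(4.0, 1.0, 400))

pi, mu1, mu2 = 0.5, x.min(), x.max()  # initial guesses
for j in range(200):
    # E-step: responsibilities r_i = P(Z_i = 1 | x_i, current parameters),
    # which determine the expected complete-data log likelihood Q.
    p1 = pi * norm.pdf(x, mu1, 1.0)
    p2 = (1.0 - pi) * norm.pdf(x, mu2, 1.0)
    r = p1 / (p1 + p2)
    # M-step: maximize Q over (pi, mu1, mu2) -- weighted means in closed form.
    pi = r.mean()
    mu1 = np.sum(r * x) / np.sum(r)
    mu2 = np.sum((1.0 - r) * x) / np.sum(1.0 - r)

print(f"pi = {pi:.3f}, mu1 = {mu1:.3f}, mu2 = {mu2:.3f}")
```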

Expectation Maximization: Fact
Q(θ^(j+1) | θ^(j)) ≥ Q(θ^(j) | θ^(j)) implies ℓ(θ^(j+1); xn) ≥ ℓ(θ^(j); xn): the incomplete-data log likelihood increases at every iteration! After sufficiently many iterations, EM therefore converges to a local maximum (or stationary point) of the likelihood, which need not be the global MLE.
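The standard one-line argument behind this fact (added here as a sketch): decompose the incomplete-data log likelihood as

```latex
\ell(\theta) = Q(\theta \mid \theta^{(j)}) + H(\theta \mid \theta^{(j)}),
\qquad
H(\theta \mid \theta^{(j)}) = -\,\mathbb{E}\!\left[\log f(Z \mid x;\theta)\,\middle|\,x,\theta^{(j)}\right].

% The M-step gives Q(\theta^{(j+1)} \mid \theta^{(j)}) \ge Q(\theta^{(j)} \mid \theta^{(j)}),
% and H(\theta \mid \theta^{(j)}) - H(\theta^{(j)} \mid \theta^{(j)}) is a KL divergence,
% hence nonnegative, so \ell(\theta^{(j+1)}) \ge \ell(\theta^{(j)}).
```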

Thank you!