Parametric Inference
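Properties of MLE: Consistency
True parameter: $\theta^*$. MLE using n samples: $\hat{\theta}_n$.
Define
$M_n(\theta) = \frac{1}{n} \sum_{i=1}^{n} \log \frac{f(X_i; \theta)}{f(X_i; \theta^*)}$ (sample distance between θ and θ*)
$M(\theta) = E_{\theta^*}\left[ \log \frac{f(X; \theta)}{f(X; \theta^*)} \right] = -D(\theta^*, \theta)$ (true distance between θ and θ*: the negative Kullback-Leibler divergence, KLD)
Condition 1: $\sup_{\theta} |M_n(\theta) - M(\theta)| \xrightarrow{P} 0$ (asymptotic convergence of the sample distance to the true distance, uniformly over parameter values)
Condition 2: $\sup_{\theta : |\theta - \theta^*| \ge \epsilon} M(\theta) < M(\theta^*)$ for every $\epsilon > 0$ (the model is identifiable)
Under both conditions the MLE is consistent: $\hat{\theta}_n \xrightarrow{P} \theta^*$.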
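Consistency can be seen in a small simulation. The sketch below assumes an Exponential model with a hypothetical true rate of 2.0 (neither is from the slides) and prints how far the MLE, 1 / sample mean, lands from the true value as n grows.

```python
import random

def exponential_rate_mle(sample):
    """MLE of the rate of an Exponential distribution: 1 / sample mean."""
    return len(sample) / sum(sample)

random.seed(0)
true_rate = 2.0  # hypothetical true parameter theta*, chosen for illustration
errors = {}
for n in (10, 1_000, 100_000):
    sample = [random.expovariate(true_rate) for _ in range(n)]
    errors[n] = abs(exponential_rate_mle(sample) - true_rate)
    print(f"n={n:>6}  |MLE - true| = {errors[n]:.4f}")
```

The error for a single run need not shrink monotonically, but for large n it should be small with high probability.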
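Properties of MLE: Equivariance
If $\hat{\theta}_n$ is the MLE of θ, then $g(\hat{\theta}_n)$ is the MLE of $\tau = g(\theta)$.
Condition: g is invertible (see proof), that is, g is one-to-one and onto.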
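A quick numerical check of equivariance, as a sketch under assumed values: for Bernoulli data with 30 successes in 100 trials (hypothetical numbers), the MLE of the log-odds obtained by transforming the MLE of p should match the maximizer of the likelihood written directly in terms of the log-odds. The grid search is only for illustration.

```python
import math

def bernoulli_loglik(p, successes, n):
    """Log likelihood of n Bernoulli(p) trials with the given success count."""
    return successes * math.log(p) + (n - successes) * math.log(1 - p)

def loglik_in_logodds(psi, successes, n):
    """Same log likelihood, reparameterized by psi = log(p / (1 - p))."""
    p = 1 / (1 + math.exp(-psi))
    return bernoulli_loglik(p, successes, n)

successes, n = 30, 100                       # hypothetical data summary
p_hat = successes / n                        # MLE of p
psi_hat_equivariance = math.log(p_hat / (1 - p_hat))   # g(p_hat)

# Maximize directly over psi on a fine grid from -5 to 5 (step 0.001).
grid = [-5 + 0.001 * k for k in range(10001)]
psi_hat_grid = max(grid, key=lambda psi: loglik_in_logodds(psi, successes, n))

print(psi_hat_equivariance, psi_hat_grid)
```

The two values should agree up to the grid spacing.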
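Properties of MLE: Asymptotic normality
$\frac{\hat{\theta}_n - \theta}{\mathrm{se}} \rightsquigarrow N(0, 1)$, with true standard error $\mathrm{se} = \sqrt{1 / I_n(\theta)}$
$\frac{\hat{\theta}_n - \theta}{\widehat{\mathrm{se}}} \rightsquigarrow N(0, 1)$, with approximate standard error $\widehat{\mathrm{se}} = \sqrt{1 / I_n(\hat{\theta}_n)}$
$I_n(\theta)$: Fisher information at the true parameter value θ
$I_n(\hat{\theta}_n)$: Fisher information at the MLE parameter value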
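Asymptotic normality is what justifies intervals of the form MLE ± 1.96 × approximate standard error. The sketch below assumes an Exponential model with rate λ, for which the Fisher information of n samples is n / λ², and checks by simulation how often such an interval covers a hypothetical true rate. The sample size, rate, and number of trials are illustrative choices.

```python
import math
import random

def wald_interval(sample, z=1.96):
    """Approximate 95% interval for an Exponential rate from its MLE."""
    n = len(sample)
    rate_hat = n / sum(sample)          # MLE of the rate
    se_hat = rate_hat / math.sqrt(n)    # sqrt(1 / I_n(rate_hat)), I_n = n / rate^2
    return rate_hat - z * se_hat, rate_hat + z * se_hat

random.seed(1)
true_rate, n, trials = 2.0, 200, 2000   # hypothetical settings
hits = 0
for _ in range(trials):
    sample = [random.expovariate(true_rate) for _ in range(n)]
    lo, hi = wald_interval(sample)
    hits += lo <= true_rate <= hi
coverage = hits / trials
print(f"empirical coverage: {coverage:.3f}")
```

If the normal approximation is reasonable at this sample size, the empirical coverage should be close to 0.95.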
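Fisher information
Define the score function: $s(X; \theta) = \frac{\partial \log f(X; \theta)}{\partial \theta}$, the rate of change of the log likelihood of X w.r.t. the parameter θ.
Fisher information at θ: $I_n(\theta) = V_\theta\left( \sum_{i=1}^{n} s(X_i; \theta) \right) = n I(\theta)$, where $I(\theta) = E_\theta\left[ s(X; \theta)^2 \right] = -E_\theta\left[ \frac{\partial^2 \log f(X; \theta)}{\partial \theta^2} \right]$.
It measures the information carried by n IID data points X1, X2, …, Xn about the model parameter θ.
Fact (Cramér-Rao bound): $V_\theta(\hat{\theta}) \ge \frac{1}{I_n(\theta)}$, a lower bound on the variance of any unbiased estimator $\hat{\theta}$ of θ.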
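For a concrete check, the sketch below uses a Bernoulli(p) model (an assumed example, not from the slides), where every expectation can be computed exactly by summing over x in {0, 1}. It compares the two standard expressions for the Fisher information of one observation with the closed form 1 / (p(1 − p)), and confirms that the sample mean attains the Cramér-Rao bound.

```python
def score(x, p):
    """Derivative of log f(x; p) with respect to p for one Bernoulli(p) draw."""
    return x / p - (1 - x) / (1 - p)

def score_derivative(x, p):
    """Second derivative of log f(x; p) with respect to p."""
    return -x / p**2 - (1 - x) / (1 - p) ** 2

def expectation(fn, p):
    """Exact expectation of fn(X, p) for X ~ Bernoulli(p)."""
    return (1 - p) * fn(0, p) + p * fn(1, p)

p, n = 0.3, 50   # hypothetical parameter value and sample size

# The score has mean zero, so its variance equals E[score^2].
info_from_variance = expectation(lambda x, q: score(x, q) ** 2, p)
info_from_curvature = -expectation(score_derivative, p)
closed_form = 1 / (p * (1 - p))

# The sample mean is unbiased for p; its variance meets the bound 1 / (n I(p)).
var_sample_mean = p * (1 - p) / n
cramer_rao_bound = 1 / (n * closed_form)

print(info_from_variance, info_from_curvature, closed_form)
print(var_sample_mean, cramer_rao_bound)
```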
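Parametric Bootstrap
Let τ be any statistic of X1, X2, …, Xn, and let $\tau_b$, b = 1, …, B, be its bootstrap replicates.
Nonparametric bootstrap: each $\tau_b$ is computed using a sample $X_{b,1}, X_{b,2}, \ldots, X_{b,n} \sim \hat{F}_n$ (the empirical distribution).
Parametric bootstrap: each $\tau_b$ is computed using a sample $X_{b,1}, X_{b,2}, \ldots, X_{b,n} \sim f(x; \hat{\theta}_n)$ (the parametric distribution fitted by MLE or the method of moments).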
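The two schemes differ only in where the resamples come from. The following sketch assumes a Normal model and takes the sample median as the statistic τ (both are illustrative choices), then estimates the standard error of τ both ways. The helper name `bootstrap_se` is hypothetical.

```python
import random
import statistics

random.seed(2)
n, B = 100, 1000
data = [random.gauss(5.0, 2.0) for _ in range(n)]   # stand-in for observed data

# Normal MLEs: the sample mean and the 1/n standard deviation.
mu_hat = statistics.mean(data)
sigma_hat = statistics.pstdev(data)

def bootstrap_se(draw_sample, statistic, B):
    """Standard deviation of the statistic across B resampled data sets."""
    replicates = [statistic(draw_sample()) for _ in range(B)]
    return statistics.pstdev(replicates)

# Nonparametric: resample the data with replacement (empirical distribution).
se_nonparametric = bootstrap_se(
    lambda: random.choices(data, k=n), statistics.median, B)

# Parametric: draw fresh samples from the fitted Normal(mu_hat, sigma_hat).
se_parametric = bootstrap_se(
    lambda: [random.gauss(mu_hat, sigma_hat) for _ in range(n)],
    statistics.median, B)

print(se_nonparametric, se_parametric)
```

When the parametric model is adequate, both estimates should be of similar size; the parametric version leans on the model, while the nonparametric version does not.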
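Sufficient statistic
Any function of the data $X^n$: $T(X_1, X_2, \ldots, X_n)$ is a statistic.
Write $x^n \leftrightarrow y^n$ if $f(x^n; \theta) = c\, f(y^n; \theta)$ for some constant c that may depend on $x^n$ and $y^n$ but not on θ, i.e. the likelihood functions for data sets $x^n$ and $y^n$ have the same shape.
Definition 1: T is sufficient for θ if $T(x^n) = T(y^n) \implies x^n \leftrightarrow y^n$.
Recall that the likelihood function is specific to an observed data set $x^n$.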
Sufficient statistic
Intuitively, T is the connecting link between the data and the likelihood.
A sufficient statistic is not unique. For example, $x^n$ itself and $T(x^n)$ are both sufficient statistics.
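Sufficient statistic
Definition 2: T is sufficient for θ if the distribution of $x^n$ is conditionally independent of θ given T: $f(x^n \mid T(x^n) = t; \theta) = f(x^n \mid T(x^n) = t)$.
Factorization theorem: T is sufficient for θ if and only if $f(x^n; \theta) = h(x^n)\, g(T(x^n); \theta)$ for some functions g and h.
The factorization implies the first definition of a sufficient statistic: if $T(x^n) = T(y^n)$, then $f(x^n; \theta) / f(y^n; \theta) = h(x^n) / h(y^n)$, which does not depend on θ.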
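As an illustration, assume IID Poisson(λ) data, for which the sum of the observations is sufficient by the factorization theorem. In the sketch below, two made-up data sets with the same sum have log likelihoods that differ by a constant not involving λ, while a data set with a different sum does not.

```python
import math

def poisson_loglik(lam, xs):
    """Log likelihood of IID Poisson(lam) observations xs."""
    return sum(x * math.log(lam) - lam - math.lgamma(x + 1) for x in xs)

x = [1, 2, 3]
y = [0, 0, 6]   # different data, same T = sum = 6
z = [1, 1, 1]   # different T = 3

lams = (0.5, 1.0, 2.0, 7.5)
# Constant in lam: the two likelihoods have the same shape.
diffs_same_T = [poisson_loglik(lam, x) - poisson_loglik(lam, y) for lam in lams]
# Varies with lam: the likelihoods have different shapes.
diffs_other_T = [poisson_loglik(lam, x) - poisson_loglik(lam, z) for lam in lams]

print(diffs_same_T)
print(diffs_other_T)
```

Here the constant is log(6! / (1! 2! 3!)) = log 60, which comes from the h(x^n) factor alone.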
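Sufficient statistic: Minimal sufficient
T is minimal sufficient if it is a sufficient statistic and is a function of every other sufficient statistic.
Equivalently, T is minimal sufficient if $T(x^n) = T(y^n) \iff x^n \leftrightarrow y^n$, where $x^n \leftrightarrow y^n$ means the likelihood functions of the two data sets are proportional in θ.
Recall that T is sufficient if $T(x^n) = T(y^n) \implies x^n \leftrightarrow y^n$.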
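Sufficient statistic: Rao-Blackwell theorem
An estimator of θ should depend only on the sufficient statistic T; otherwise it can be improved.
If $\hat{\theta}$ is an estimator and $\tilde{\theta} = E(\hat{\theta} \mid T)$, then under squared error loss the risk satisfies $R(\theta, \tilde{\theta}) \le R(\theta, \hat{\theta})$.
Exponential family of distributions
One parameter θ: $f(x; \theta) = h(x) \exp\{ \eta(\theta) T(x) - B(\theta) \}$
Multiple parameters $\theta = (\theta_1, \ldots, \theta_k)$: $f(x; \theta) = h(x) \exp\{ \sum_{j=1}^{k} \eta_j(\theta) T_j(x) - B(\theta) \}$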
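Sufficient statistic: Exponential family
n IID random variables X1, X2, …, Xn have distribution $f(x^n; \theta) = \left[ \prod_{i=1}^{n} h(x_i) \right] \exp\{ \eta(\theta) \sum_{i=1}^{n} T(x_i) - n B(\theta) \}$.
Examples include the Normal, Binomial, Poisson, and Exponential distributions.
$\sum_{i=1}^{n} T(X_i)$ is a sufficient statistic (Factorization theorem).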
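To make the exponential family form concrete, the sketch below assumes the common parameterization f(x; θ) = h(x) exp{η(θ) T(x) − B(θ)} and writes the Exponential distribution with rate λ in it, taking η = −λ, T(x) = x, B = −log λ and h(x) = 1. It then checks, on made-up data, that the joint density depends on the sample only through the sum of T(x_i).

```python
import math

lam = 1.5                          # hypothetical rate parameter
eta, B = -lam, -math.log(lam)      # natural parameter and normalizing term

def T(x):
    """Sufficient statistic for a single observation."""
    return x

def h(x):
    """Base measure; constant for the Exponential distribution."""
    return 1.0

sample = [0.2, 1.1, 0.7, 2.4]      # made-up data
n = len(sample)

# Joint density computed directly from lam * exp(-lam * x).
direct = math.prod(lam * math.exp(-lam * x) for x in sample)

# Joint density via the exponential family form, using only sum T(x_i).
sum_T = sum(T(x) for x in sample)
via_family = math.prod(h(x) for x in sample) * math.exp(eta * sum_T - n * B)

print(direct, via_family)
```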
Iterative MLE: Newton-Raphson
Start with an initial guess for the parameter(s). Obtain improved estimates in subsequent iterations until convergence. The initial parameter value could come from the method of moments estimator.
Newton-Raphson is an iterative technique to find a local root of a function. Finding the MLE is equivalent to finding the root of the derivative of the log likelihood function.
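Newton-Raphson
Taylor series expansion of $\ell'(\theta)$ around the current parameter estimate $\theta^j$: $\ell'(\theta) \approx \ell'(\theta^j) + (\theta - \theta^j)\, \ell''(\theta^j)$
For the MLE, $\ell'(\hat{\theta}) = 0$, so $0 \approx \ell'(\theta^j) + (\hat{\theta} - \theta^j)\, \ell''(\theta^j)$.
Solving for $\hat{\theta}$ gives the update $\theta^{j+1} = \theta^j - \frac{\ell'(\theta^j)}{\ell''(\theta^j)}$, which takes $\theta^j$ closer to the MLE at every iteration, provided the starting value is close enough to it.
Multi-parameter case: $\theta^{j+1} = \theta^j - H^{-1} \ell'(\theta^j)$, where $\ell'(\theta^j)$ is the vector of first derivatives and H is the matrix of second derivatives of the log likelihood, both evaluated at $\theta^j$.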
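The one-parameter update, new estimate = current estimate − ℓ′ / ℓ″, can be sketched in a few lines. The example assumes an Exponential model with rate λ, chosen because its MLE has the closed form n / Σx against which the iteration can be checked. The data and the median-based starting value are made up for illustration.

```python
import math

def newton_raphson(grad, hess, theta0, tol=1e-10, max_iter=50):
    """Find a root of grad by iterating theta <- theta - grad / hess."""
    theta = theta0
    for _ in range(max_iter):
        step = grad(theta) / hess(theta)
        theta -= step
        if abs(step) < tol:
            break
    return theta

data = [0.5, 1.2, 0.3, 2.0, 0.8, 0.2, 1.0]   # made-up observations
n, total = len(data), sum(data)

# Exponential log likelihood: l(lam) = n log(lam) - lam * sum(x).
def loglik_grad(lam):
    return n / lam - total

def loglik_hess(lam):
    return -n / lam**2

# Crude starting value from the sample median (median of Exp(lam) is ln 2 / lam).
lam0 = math.log(2) / sorted(data)[n // 2]
lam_hat = newton_raphson(loglik_grad, loglik_hess, lam0)

print(lam_hat, n / total)   # iterative estimate vs closed-form MLE
```

Newton-Raphson is sensitive to the starting point: for this model a start above 2n / Σx would send the iteration to a negative rate, which is one reason a sensible initial estimate matters.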
Newton-Raphson
[Figure omitted; labels: Slope, MLE]
Expectation Maximization
An iterative MLE technique used in missing data problems. Sometimes introducing missing data simplifies maximizing the log likelihood.
Two log likelihoods: complete data and incomplete data.
Two main steps:
1. Compute the expectation of the complete data log likelihood using the current parameters.
2. Maximize the above over the parameter space to obtain new parameters.
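Expectation Maximization
Incomplete data log likelihood: $\ell(\theta) = \log f(x^n; \theta) = \log \int f(x^n, z^n; \theta)\, dz^n$
Complete data log likelihood: $\log f(x^n, z^n; \theta)$
Expected log likelihood: $Q(\theta \mid \theta^j) = E_{\theta^j}\left[ \log f(x^n, z^n; \theta) \mid x^n \right]$
Here $x^n$ is the observed data, $z^n$ is the missing data, and $\theta^j$ is the current parameter estimate.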
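Expectation Maximization: Algorithm
Start with an initial guess of the parameter value(s) $\theta^0$. Repeat steps 1 and 2 below for j = 0, 1, 2, ….
1. Expectation: Compute $Q(\theta \mid \theta^j) = E_{\theta^j}\left[ \log f(x^n, z^n; \theta) \mid x^n \right]$, in which θ is the variable and $\theta^j$ is held constant.
2. Maximization: Update the parameters by maximizing the above expectation over the parameter space: $\theta^{j+1} = \arg\max_{\theta} Q(\theta \mid \theta^j)$.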
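The two steps can be sketched for a two-component normal mixture, a standard example in which the missing data are the unobserved component labels. The model, the simulated data, the starting values, and the function names below are all illustrative assumptions, not taken from the slides. The incomplete-data log likelihood is recorded after every iteration so its behaviour can be inspected.

```python
import math
import random

def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def log_likelihood(data, w, mu, sigma):
    """Incomplete-data log likelihood of the two-component mixture."""
    return sum(math.log(w * normal_pdf(x, mu[0], sigma[0])
                        + (1 - w) * normal_pdf(x, mu[1], sigma[1])) for x in data)

def em_step(data, w, mu, sigma):
    # E-step: probability that each point came from component 0,
    # computed with the current parameters held fixed.
    r = []
    for x in data:
        a = w * normal_pdf(x, mu[0], sigma[0])
        b = (1 - w) * normal_pdf(x, mu[1], sigma[1])
        r.append(a / (a + b))
    # M-step: closed-form maximizers of the expected complete-data log likelihood.
    n0 = sum(r)
    n1 = len(data) - n0
    m0 = sum(ri * x for ri, x in zip(r, data)) / n0
    m1 = sum((1 - ri) * x for ri, x in zip(r, data)) / n1
    s0 = math.sqrt(sum(ri * (x - m0) ** 2 for ri, x in zip(r, data)) / n0)
    s1 = math.sqrt(sum((1 - ri) * (x - m1) ** 2 for ri, x in zip(r, data)) / n1)
    return n0 / len(data), (m0, m1), (s0, s1)

random.seed(3)
data = ([random.gauss(0.0, 1.0) for _ in range(300)]
        + [random.gauss(5.0, 1.0) for _ in range(200)])

# Crude initial guess: equal weights, means at the data extremes, unit spreads.
w, mu, sigma = 0.5, (min(data), max(data)), (1.0, 1.0)
history = [log_likelihood(data, w, mu, sigma)]
for _ in range(50):
    w, mu, sigma = em_step(data, w, mu, sigma)
    history.append(log_likelihood(data, w, mu, sigma))

print(w, mu, sigma)
print(history[0], history[-1])
```

With well-separated components, as in this simulated data, the estimates should settle near the generating values. With poorer starting values or overlapping components, EM may stop at a different local maximum.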
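Expectation Maximization: Fact
$\ell(\theta^{j+1}) \ge \ell(\theta^j)$ OR, equivalently, $L(\theta^{j+1}) \ge L(\theta^j)$
The incomplete data log likelihood never decreases from one iteration to the next.
The MLE can be reached after a sufficient number of iterations, although in general EM is only guaranteed to approach a local maximum or stationary point of the likelihood, so the result can depend on the initial guess.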
Thank you!