Maximum Likelihood & Missing data


Maximum Likelihood & Missing data Rachael Bedford Mplus: Longitudinal Analysis Workshop 23/06/2015

Overview
- Maximum Likelihood (ML)
- Maximum Likelihood: Robust Standard Errors (MLR)
- Missing data: MCAR, MAR, NMAR
- Full Information Maximum Likelihood (FIML)
- Maximum likelihood vs. multiple imputation
- Summary

Maximum Likelihood (ML) Estimation
- ML is an estimator (like ordinary least squares, OLS).
- Why is estimation important? The choice of estimator influences the quality and validity of the estimates.
- Most estimation techniques either minimise something (e.g. OLS minimises the sum of squared residuals) or maximise something (e.g. ML maximises the likelihood).
- Simulation

Maximum Likelihood (ML) Estimation
- Asymptotic consistency: as sample size increases, the estimator converges on the true value.
- Asymptotic normality: as sample size increases, the sampling distribution of the estimator approaches a normal distribution.
- Efficiency: small standard errors.
- Wald tests, standard errors (SE), modification indices
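The consistency and efficiency properties can be illustrated with the simplest ML estimator, the sample mean of normally distributed data. The sketch below is only an illustration: the population values (mean 100, SD 15), the sample sizes, and the function names are assumptions chosen for the example, not values from the workshop.

```python
import math
import random

def standard_error_of_mean(sigma, n):
    """Theoretical standard error of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

def simulated_sample_mean(mu, sigma, n, rng):
    """Draw n normal observations and return their mean (the ML estimate of mu)."""
    return sum(rng.gauss(mu, sigma) for _ in range(n)) / n

# Hypothetical population values, chosen only for illustration.
TRUE_MEAN, TRUE_SD = 100.0, 15.0

# The standard error halves each time the sample size is quadrupled.
sample_sizes = [25, 100, 400]
standard_errors = [standard_error_of_mean(TRUE_SD, n) for n in sample_sizes]

# Simulated estimates tend to sit closer to the true mean as n grows.
rng = random.Random(2015)
estimates = {n: simulated_sample_mean(TRUE_MEAN, TRUE_SD, n, rng)
             for n in (25, 400, 40000)}

for n, estimate in estimates.items():
    print(f"n={n:>6}: estimate={estimate:.3f}, error={estimate - TRUE_MEAN:+.3f}")
```

Any single small-sample run can land close to the true value by chance; consistency is a statement about what happens as n grows without bound.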

Maximum Likelihood
- What does ML do? ML identifies the population parameter values (e.g. the population mean) that are most 'likely' given, or most consistent with, the data (e.g. the sample mean). It uses the observed data to find the parameters with the highest likelihood (best fit).
- How does ML estimate parameters? By searching for parameter values under an assumed distribution for the data (typically the normal distribution).
- ML is also good for missing data.
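As a rough sketch of the search for the parameters with the highest likelihood, the example below evaluates the normal log-likelihood over a grid of candidate means and keeps the best one. The scores, the fixed SD of 15, and the integer grid are all made up for illustration; real software uses iterative optimisation rather than a grid.

```python
import math

def normal_loglik(data, mu, sigma):
    """Log-likelihood of the data under a normal(mu, sigma) model."""
    n = len(data)
    squared_dev = sum((x - mu) ** 2 for x in data)
    return -0.5 * n * math.log(2 * math.pi * sigma ** 2) - squared_dev / (2 * sigma ** 2)

# Hypothetical scores, used only to illustrate the search.
scores = [85, 90, 100, 105, 110, 120]
sigma = 15.0  # SD held fixed so that only the mean is searched over

# Try every integer candidate mean and keep the one with the highest likelihood.
candidates = range(80, 126)
best_mu = max(candidates, key=lambda mu: normal_loglik(scores, mu, sigma))

sample_mean = sum(scores) / len(scores)
print(best_mu, round(sample_mean, 2))
```

The best candidate on the grid is the integer closest to the sample mean, which is what theory predicts: for normal data the ML estimate of the mean is the sample mean.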

Maximum Likelihood Robust
- Robust Maximum Likelihood (MLR) still estimates parameters with the multivariate normal likelihood, BUT its standard errors are robust to non-normality such as kurtosis (the "peakedness" of the data).
- MLR in Mplus uses a sandwich estimator to give robust standard errors.
- E.g. use for Likert scale data.
- Mahalanobis distance: tests for multivariate outliers.
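The idea behind a sandwich estimator can be sketched for one parameter, the variance of a single variable. Under normality the ML standard error of the variance is sigma^2 * sqrt(2/n); the sandwich version replaces the normal-theory fourth moment with the observed one, giving sqrt((m4 - sigma^4)/n). This is a univariate illustration with made-up heavy-tailed data, not the Mplus implementation.

```python
import math

def variance_standard_errors(data):
    """Return (ML variance, normal-theory SE, sandwich SE) for one variable."""
    n = len(data)
    mean = sum(data) / n
    deviations = [x - mean for x in data]
    var_ml = sum(d ** 2 for d in deviations) / n        # ML variance (divides by n)
    fourth_moment = sum(d ** 4 for d in deviations) / n

    # Normal-theory SE: assumes the fourth moment equals 3 * variance^2.
    naive_se = var_ml * math.sqrt(2 / n)
    # Sandwich SE: uses the observed fourth moment instead.
    robust_se = math.sqrt((fourth_moment - var_ml ** 2) / n)
    return var_ml, naive_se, robust_se

# Hypothetical heavy-tailed (high-kurtosis) data.
heavy_tailed = [-6, -1, -1, 0, 0, 0, 0, 1, 1, 6]
var_ml, naive_se, robust_se = variance_standard_errors(heavy_tailed)
print(round(var_ml, 2), round(naive_se, 3), round(robust_se, 3))
```

With excess kurtosis the robust standard error is larger than the normal-theory one, so the normal-theory value would overstate precision.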

Full Information Maximum Likelihood
- ML can be applied to incomplete as well as complete data records, i.e. where data are missing in the response variables. This is called Full Information Maximum Likelihood (FIML).
- If data are missing at random (MAR), we can use FIML to estimate model parameters.
- Missing at random DOES NOT MEAN missing completely at random: missingness may still depend on observed variables.
- Effect of attrition in longitudinal data analysis (LDA)
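One common way to write the FIML idea is as a casewise log-likelihood under multivariate normality, in which each case contributes only through the variables it actually has observed. The notation is an assumption introduced for this sketch: y_i is the vector of observed values for case i, k_i is the number of observed variables for that case, and mu_i and Sigma_i are the corresponding elements of the model-implied mean vector and covariance matrix.

```latex
\log L(\mu, \Sigma) = \sum_{i=1}^{N} \left[ -\frac{k_i}{2}\log(2\pi)
  - \frac{1}{2}\log\lvert\Sigma_i\rvert
  - \frac{1}{2}(y_i - \mu_i)^{\top}\Sigma_i^{-1}(y_i - \mu_i) \right]
```

A case with complete data uses the full mean vector and covariance matrix; a case with missing values uses only the rows and columns for its observed variables, so no case has to be discarded.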

Missing Data
- Missing completely at random (MCAR): missingness does not depend on the values of either observed or latent variables. IF this holds, listwise deletion can be used.
- Missing at random (MAR) (confusing name!): missingness is related to observed variables, but not to latent variables or the missing values themselves.
- Not missing at random (NMAR; non-ignorable missingness): missingness can be related to observed variables and also to the missing values and latent variables.
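The three mechanisms can be compared in a small simulation. The sketch below assumes two standard normal variables correlated at 0.7 and deletes values of y in three different ways; the deletion rules, sample size, and seed are arbitrary choices for illustration. It then checks how far the complete-case (listwise deletion) mean of y drifts from the full-data mean.

```python
import math
import random

rng = random.Random(42)
N = 20000
SLOPE = 0.7  # assumed correlation between x and y

# x is always observed; y is the variable that will have missing values.
x = [rng.gauss(0, 1) for _ in range(N)]
y = [SLOPE * xi + math.sqrt(1 - SLOPE ** 2) * rng.gauss(0, 1) for xi in x]

def complete_case_mean(values, is_missing):
    """Mean of the values that remain after listwise deletion."""
    observed = [v for v, missing in zip(values, is_missing) if not missing]
    return sum(observed) / len(observed)

# MCAR: a coin flip decides missingness, unrelated to x or y.
mcar_flags = [rng.random() < 0.5 for _ in range(N)]
# MAR: missingness depends only on the observed variable x.
mar_flags = [xi < 0 for xi in x]
# NMAR: missingness depends on the value of y that goes missing.
nmar_flags = [yi < 0 for yi in y]

full_mean = sum(y) / N
mcar_mean = complete_case_mean(y, mcar_flags)
mar_mean = complete_case_mean(y, mar_flags)
nmar_mean = complete_case_mean(y, nmar_flags)

print(f"full={full_mean:.3f} MCAR={mcar_mean:.3f} MAR={mar_mean:.3f} NMAR={nmar_mean:.3f}")
```

In this setup listwise deletion stays close to the full-data mean only under MCAR. Under MAR the bias can in principle be corrected by using x, which is what FIML and multiple imputation do; under NMAR it cannot be corrected from the observed data alone.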

FIML missing (example from Craig Enders)
- A case with an IQ of 85 would likely have a performance rating of ~9.
- Based on this information, ML adjusts the job performance mean downward to account for the plausible (but missing) performance rating.
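For the special case of two normal variables where one (IQ) is fully observed and the other (job performance) is missing for some cases, the ML estimate of the incomplete variable's mean has a closed form: the complete-case mean plus the regression slope times the gap between the full-sample and complete-case means of the observed variable. The sketch below uses made-up scores, not the data from the Enders example, and the function name is hypothetical.

```python
def ml_mean_with_missing_y(x_all, y_partial):
    """ML mean of y for bivariate normal data where x is complete and y is
    missing (None) for some cases; assumes missingness is MAR given x."""
    pairs = [(x, y) for x, y in zip(x_all, y_partial) if y is not None]
    n_complete = len(pairs)
    x_bar_cc = sum(x for x, _ in pairs) / n_complete
    y_bar_cc = sum(y for _, y in pairs) / n_complete

    # Regression of y on x, estimated from the complete cases only.
    s_xy = sum((x - x_bar_cc) * (y - y_bar_cc) for x, y in pairs)
    s_xx = sum((x - x_bar_cc) ** 2 for x, _ in pairs)
    slope = s_xy / s_xx

    # Shift the complete-case mean by the slope times the gap in x means.
    x_bar_all = sum(x_all) / len(x_all)
    return y_bar_cc + slope * (x_bar_all - x_bar_cc), y_bar_cc

# Hypothetical data: performance ratings are missing for the lowest-IQ cases.
iq = [80, 85, 90, 95, 100, 105, 110, 115, 120, 125]
performance = [None, None, None, None, 10, 11, 13, 12, 14, 15]

ml_mean, complete_case_mean = ml_mean_with_missing_y(iq, performance)
print(round(ml_mean, 3), complete_case_mean)
```

Because the cases with missing ratings have low IQ, and IQ predicts performance, the ML mean is pulled below the complete-case mean of 12.5.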

Multiple imputation
Another approach: Multiple Imputation (MI)
- Multiple copies of the data set are generated, each with different estimates of the missing values.
- Analyses are performed on each of the imputed data sets, and the parameter estimates and standard errors are pooled into a single set of results.

Maximum Likelihood                                          | Multiple imputation
Estimates parameters; does NOT fill in missing values       | Does fill in the missing data
Usually best for continuous data                            | Best for categorical or item-level data

In a large sample, ML and MI give essentially the same results.
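The pooling step is usually done with Rubin's rules: the pooled estimate is the average of the per-data-set estimates, and the pooled variance combines the average within-imputation variance with the between-imputation variance. The sketch below shows only the pooling step; the estimates and standard errors are made-up numbers standing in for results from five imputed data sets.

```python
import math
from statistics import mean, variance

def pool_rubin(estimates, std_errors):
    """Pool estimates from m imputed data sets using Rubin's rules."""
    m = len(estimates)
    pooled_estimate = mean(estimates)
    within_var = mean([se ** 2 for se in std_errors])  # average sampling variance
    between_var = variance(estimates)                  # variance across imputations
    total_var = within_var + (1 + 1 / m) * between_var
    return pooled_estimate, total_var, math.sqrt(total_var)

# Hypothetical results from analysing five imputed data sets.
estimates = [10.2, 10.6, 10.4, 10.9, 10.4]
std_errors = [0.5, 0.5, 0.5, 0.5, 0.5]

pooled_estimate, total_var, pooled_se = pool_rubin(estimates, std_errors)
print(round(pooled_estimate, 3), round(total_var, 3), round(pooled_se, 3))
```

The pooled standard error is larger than any single data set's standard error because it also reflects the uncertainty about the missing values.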

References
- http://vkc.library.uu.nl/vkc/ms/SiteCollectionDocuments/Mplus-site/1%20nov%2011/Enders%20Pre%20Conference%20Workshop.pdf
- http://jonathantemplin.com/files/sem/sem13psyc948/sem13psyc948_lecture02.pdf
- Tech 5 (Mplus)

Considerations
- Recoverability: Is it possible to estimate what scores would have been if they were not missing?
- Bias: Are statistics (e.g. means, variances, and covariances/correlations) the same as what they would have been had there not been any missing data?
- Power: Do we have the same or similar rates of power (1 – Type II error rate) as we would without missing data?
- Item missing and parameter missing; power for a specific parameter