Multiple Imputation using SOLAS for Missing Data Analysis

Slides:



Advertisements
Similar presentations
The Simple Linear Regression Model Specification and Estimation Hill et al Chs 3 and 4.
Advertisements

Latent normal models for missing data Harvey Goldstein Centre for Multilevel Modelling University of Bristol.
Managerial Economics in a Global Economy
Kin 304 Regression Linear Regression Least Sum of Squares
INTRODUCTION TO MACHINE LEARNING Bayesian Estimation.
Uncertainty in fall time surrogate Prediction variance vs. data sensitivity – Non-uniform noise – Example Uncertainty in fall time data Bootstrapping.
Sampling: Final and Initial Sample Size Determination
Simple Linear Regression
CJT 765: Structural Equation Modeling Class 3: Data Screening: Fixing Distributional Problems, Missing Data, Measurement.
Université d’Ottawa / University of Ottawa 2001 Bio 4118 Applied Biostatistics L10.1 CorrelationCorrelation The underlying principle of correlation analysis.
The Simple Linear Regression Model: Specification and Estimation
Sampling Distributions
Multivariate Data Analysis Chapter 4 – Multiple Regression.
1 Inference About a Population Variance Sometimes we are interested in making inference about the variability of processes. Examples: –Investors use variance.
1 BA 555 Practical Business Analysis Review of Statistics Confidence Interval Estimation Hypothesis Testing Linear Regression Analysis Introduction Case.
Business Statistics - QBM117 Statistical inference for regression.
Simple Linear Regression and Correlation
Statistical Methods for Missing Data Roberta Harnett MAR 550 October 30, 2007.
Chapter 12 Section 1 Inference for Linear Regression.
Simple Linear Regression Analysis
Multiple imputation using ICE: A simulation study on a binary response Jochen Hardt Kai Görgen 6 th German Stata Meeting, Berlin June, 27 th 2008 Göteborg.
Marketing Research Aaker, Kumar, Day and Leone Tenth Edition
Hypothesis Testing in Linear Regression Analysis
Guide to Handling Missing Information Contacting researchers Algebraic recalculations, conversions and approximations Imputation method (substituting missing.
Introduction Multilevel Analysis
1 1 Slide © 2008 Thomson South-Western. All Rights Reserved Chapter 15 Multiple Regression n Multiple Regression Model n Least Squares Method n Multiple.
+ Chapter 12: Inference for Regression Inference for Linear Regression.
Stat 112: Notes 2 Today’s class: Section 3.3. –Full description of simple linear regression model. –Checking the assumptions of the simple linear regression.
© Copyright McGraw-Hill Correlation and Regression CHAPTER 10.
Chapter 13 Multiple Regression
SW 983 Missing Data Treatment Most of the slides presented here are from the Modern Missing Data Methods, 2011, 5 day course presented by the KUCRMDA,
The Simple Linear Regression Model: Specification and Estimation ECON 4550 Econometrics Memorial University of Newfoundland Adapted from Vera Tabakova’s.
A REVIEW By Chi-Ming Kam Surajit Ray April 23, 2001 April 23, 2001.
Section 7.2 P1 Means and Variances of Random Variables AP Statistics.
Copyright © 2011 by The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin Simple Linear Regression Analysis Chapter 13.
Tutorial I: Missing Value Analysis
1 1 Slide The Simple Linear Regression Model n Simple Linear Regression Model y =  0 +  1 x +  n Simple Linear Regression Equation E( y ) =  0 + 
- 1 - Preliminaries Multivariate normal model (section 3.6, Gelman) –For a multi-parameter vector y, multivariate normal distribution is where  is covariance.
OLS Regression What is it? Closely allied with correlation – interested in the strength of the linear relationship between two variables One variable is.
Statistics 350 Lecture 2. Today Last Day: Section Today: Section 1.6 Homework #1: Chapter 1 Problems (page 33-38): 2, 5, 6, 7, 22, 26, 33, 34,
Bias-Variance Analysis in Regression  True function is y = f(x) +  where  is normally distributed with zero mean and standard deviation .  Given a.
1 1 Slide © 2008 Thomson South-Western. All Rights Reserved Slides by JOHN LOUCKS St. Edward’s University.
Inference about the slope parameter and correlation
Simple Linear Regression
Inference: Conclusion with Confidence
Virtual COMSATS Inferential Statistics Lecture-26
Model Inference and Averaging
Kin 304 Regression Linear Regression Least Sum of Squares
Ch3: Model Building through Regression
CH 5: Multivariate Methods
BPK 304W Regression Linear Regression Least Sum of Squares
Multiple Imputation.
Quantitative Methods Simple Regression.
Multiple Imputation Using Stata
How to handle missing data values
Lecture Slides Elementary Statistics Thirteenth Edition
Review of Hypothesis Testing
Regression Models - Introduction
کارگاه حجم نمونه با نرم افزار G*Power
Sampling Distribution
Sampling Distribution
Section 10.1: Confidence Intervals
Sampling Distribution Models
Simple Linear Regression
Additional notes on random variables
Additional notes on random variables
Simple Linear Regression
Clinical prediction models
Linear Regression Analysis 5th edition Montgomery, Peck & Vining
Implementation of the Bayesian approach to imputation at SORS Zvone Klun and Rudi Seljak Statistical Office of the Republic of Slovenia Oslo, September.
Presentation transcript:

Multiple Imputation using SOLAS for Missing Data Analysis

• • • • • Mean Imputation X2 X2 X1 Relationships between variables are not kept. Will underestimate variance.

• • • • • Regression Imputation X2 X1 Substituting regression predictions will artificially inflate correlations.

Multiple Imputation Multiple Imputation is a technique that replaces each missing datum with a set of M>1 plausible values.

The M versions of the complete data are analysed by standard complete-data methods, and the results are combined using simple rules to yield estimates, standard errors, and p-values that formally incorporate missing data uncertainty. The variation among the M imputations reflects the uncertainty with which the missing values can be predicted from the observed ones.

SOLAS for Missing Data Analysis 3.2 There are two multiple imputation approaches implemented in SOLAS 3.2 Predictive model-based multiple imputation Propensity score-based multiple imputation In both, the multiple imputations are independent repetitions from a posterior predictive distribution for the missing data, i.e. their conditional distribution, given the observed data.

Predictive model-based approach: The predictive information contained in a user specified set of covariates is used to predict the missing values in the variables to be imputed. First the linear regression model is estimated from the observed data. Using this estimated model, a new linear regression model is randomly drawn from its Bayesian posterior distribution. This randomly drawn model is used to generate the imputations, which include random deviations from the model’s predictions. Drawing the model from its posterior distribution ensures that the extra uncertainty about the unknown true model is reflected. {See section 5.3 of Rubin, 1987}

Predictive model-based approach: X2 x x x x x x x x x x X1 x x x x x x x

Propensity score-based approach: Cases are grouped according to their probability of being missing, (i.e. propensity score) and then an approximate Bayesian bootstrap is applied to sample observed values that will be used to impute the missing values. {See Lavori, P., Dawson, R. and Shera, D., Statistics in Medicine, 14 (1995)}