Working with Missing Data

Missing data
Three general steps for analyzing missing data:
1. Identify patterns in, and reasons for, the missing data.
2. Understand the distribution of the missing data.
3. Decide on the best method for analysis.

Identify patterns/reasons for missing data
Understand your data:
- Are certain groups more likely to have missing values?
- Are certain responses more likely to be missing?
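One way to check whether certain groups are more likely to have missing values is to compare the missing rate across groups. The following is a minimal Python sketch, not part of the original slides; the records, the field names `group` and `response`, and the helper `missing_rate_by_group` are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical records (not from the slides); None marks a missing response.
records = [
    {"group": "A", "response": 45},
    {"group": "A", "response": None},
    {"group": "A", "response": 22},
    {"group": "B", "response": None},
    {"group": "B", "response": None},
    {"group": "B", "response": 34},
]


def missing_rate_by_group(rows, group_key, value_key):
    """Return {group: fraction of rows in that group whose value is missing}."""
    total = defaultdict(int)
    missing = defaultdict(int)
    for row in rows:
        total[row[group_key]] += 1
        if row[value_key] is None:
            missing[row[group_key]] += 1
    return {g: missing[g] / total[g] for g in total}


rates = missing_rate_by_group(records, "group", "response")
for group, rate in sorted(rates.items()):
    print(f"{group}: {rate:.0%} missing")
```

A large difference in missing rate between groups suggests that missingness depends on group membership, which matters when choosing an analysis method.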

Methods for analysis
- Deletion methods: listwise deletion
- Single imputation methods: mean/mode substitution, dummy variable method, single regression imputation
- Model-based methods: maximum likelihood, multiple imputation, others
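To make the difference between a deletion method and a single imputation method concrete, the sketch below applies listwise deletion and mean substitution to the same small variable. It is a Python illustration under stated assumptions, not the presenter's code: the values are invented, and `None` stands in for a missing observation.

```python
from statistics import mean, variance

# Hypothetical values (not from the slides); None marks a missing observation.
values = [10.0, None, 14.0, 18.0, None, 22.0]

# Listwise deletion: drop every case with a missing value.
observed = [v for v in values if v is not None]

# Mean substitution: replace each missing value with the observed mean.
observed_mean = mean(observed)
mean_imputed = [observed_mean if v is None else v for v in values]

print("listwise deletion:", len(observed), "cases, variance", variance(observed))
print("mean substitution:", len(mean_imputed), "cases, variance", variance(mean_imputed))
```

Mean substitution keeps all cases and leaves the mean unchanged, but it shrinks the variance because every imputed value sits exactly at the mean. That understated variability is one reason model-based methods are often preferred.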

Multiple imputation
- Impute: the missing data are "filled in" with imputed values drawn from a specified regression model. This step is repeated m times, producing a separate completed dataset each time.
- Analyze: the same analysis is performed within each of the m datasets.
- Pool: the m results are combined into a single estimate.
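The impute, analyze, and pool steps can be sketched in a few lines of Python. This is an illustrative sketch, not the Amelia algorithm and not the presenter's code. It assumes a single fully observed predictor `x`, a response `y` with missing values, a straight-line imputation model refit on a bootstrap resample for each imputation, and pooling by Rubin's rules. The data, the seed, and all function names are hypothetical.

```python
import random
from statistics import mean, variance

# Hypothetical data (not from the slides): x is complete, y has two missing values.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [2.1, 3.9, None, 8.2, 9.8, None, 14.1, 16.0, 18.2, 19.9]


def fit_line(pairs):
    """Least-squares fit of y = a + b*x; returns (a, b, residual_sd)."""
    px = [x for x, _ in pairs]
    py = [y for _, y in pairs]
    x_bar, y_bar = mean(px), mean(py)
    sxx = sum((x - x_bar) ** 2 for x in px)
    b = sum((x - x_bar) * (y - y_bar) for x, y in pairs) / sxx
    a = y_bar - b * x_bar
    rss = sum((y - (a + b * x)) ** 2 for x, y in pairs)
    return a, b, (rss / (len(pairs) - 2)) ** 0.5


def bootstrap(pairs, rng):
    """Resample observed cases, retrying until the line can be fitted."""
    while True:
        sample = [rng.choice(pairs) for _ in pairs]
        if len({x for x, _ in sample}) >= 3:
            return sample


def impute_once(xs, ys, rng):
    """Impute step: fill each missing y from a bootstrap-refitted line plus noise."""
    observed = [(x, y) for x, y in zip(xs, ys) if y is not None]
    a, b, sd = fit_line(bootstrap(observed, rng))
    return [a + b * x + rng.gauss(0, sd) if y is None else y
            for x, y in zip(xs, ys)]


def pool(estimates, variances):
    """Pool step (Rubin's rules): combined estimate and its total variance."""
    m = len(estimates)
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # variance of estimates across imputations
    return mean(estimates), within + (1 + 1 / m) * between


rng = random.Random(0)  # fixed seed so the sketch is repeatable
m = 5
completed = [impute_once(xs, ys, rng) for _ in range(m)]

# Analyze step: estimate the mean of y, and its variance, within each dataset.
estimates = [mean(d) for d in completed]
variances = [variance(d) / len(d) for d in completed]

pooled_mean, total_var = pool(estimates, variances)
print("pooled mean:", pooled_mean)
print("pooled standard error:", total_var ** 0.5)
```

The between-imputation term in `pool` is what separates multiple imputation from single imputation: it carries the uncertainty about the imputed values into the final standard error.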

Multiple imputation example
[Table: columns Plot, Rep, Treatment, Response; plots 1 to 12, with missing Response values marked NA]

R package for multiple imputation: Amelia (run from RStudio)