The European Statistical Training Programme (ESTP)

Chapter 4: Missing data mechanisms
Handbook: chapter 2
Missing data patterns
Missing data mechanisms

Missing data mechanisms
Missing data patterns: describe which values are observed and which values are missing. Different patterns require different methods to deal with the missing data.
Missing data mechanisms: describe the relationship between the missingness and the variables in the dataset.

Missing data patterns
Univariate missing data: Y represents a group of variables that is either completely observed or completely missing for each sample element.
Example: unit nonresponse.
[Figure: data matrix with sample elements 1, 2, ..., N as rows and columns X1, X2, ..., Xp, Y]

Missing data patterns
Monotone missing data: the data are ordered in such a way that if Yj is missing for a unit, then Yj+1, ..., Yp are missing as well.
Example: panel dropout, attrition.
[Figure: data matrix with sample elements 1, 2, ..., N as rows and columns Y1, Y2, Y3, ..., Yp]

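Missing data patterns
Arbitrary missing data: no structure or ordering in the missingness.
Example: item nonresponse.
[Figure: data matrix with sample elements 1, 2, ..., N as rows and columns Y1, Y2, Y3, ..., Yp; question marks ("?") appear in the matrix]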
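The three patterns above can be told apart mechanically. The following is a minimal sketch, not part of the original slides: the function name classify_pattern and the use of None to mark a missing value are assumptions made for illustration. The monotone check uses the columns in the order given and does not search for a reordering that would make the pattern monotone.

```python
def classify_pattern(rows):
    """Classify a missing data pattern; None marks a missing value.

    rows is a list of equally long lists (one list per sample element).
    Returns "complete", "univariate", "monotone" or "arbitrary".
    """
    # Missingness matrix: True where a value is missing.
    miss = [[value is None for value in row] for row in rows]
    p = len(miss[0])

    # Columns that contain at least one missing value.
    cols_with_missing = [j for j in range(p) if any(m[j] for m in miss)]
    if not cols_with_missing:
        return "complete"

    # Univariate: for every unit, the affected columns are either all
    # observed or all missing (for example, unit nonresponse).
    if all(len({m[j] for j in cols_with_missing}) == 1 for m in miss):
        return "univariate"

    # Monotone (in the given column order): once a value is missing,
    # every later value of that unit is missing too (for example, attrition).
    if all(all(m[j + 1] for j in range(p - 1) if m[j]) for m in miss):
        return "monotone"

    return "arbitrary"


# Made-up toy data sets, one per pattern.
unit_nonresponse = [[1, 10, 20], [2, None, None], [3, 11, 21]]
attrition = [[1, 2, 3], [1, 2, None], [1, None, None]]
item_nonresponse = [[1, None, 3], [None, 2, 3], [1, 2, None]]

for data in (unit_nonresponse, attrition, item_nonresponse):
    print(classify_pattern(data))
```

A univariate pattern is also monotone once the Y columns are placed last, which is why the univariate check runs first.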

Missing data mechanisms
Any analysis of data involving item or unit nonresponse requires some assumption about the missing data mechanism.
Partition Y into an observed and an unobserved part.
The distribution of the missingness is characterized by the conditional distribution of R given Y, where R indicates which values are observed.

Missing Completely At Random (MCAR)
The conditional distribution of R given Y does not depend on the data at all.
P(Y = missing) is unrelated to the missing values of Y and to other variables X.
Let X be a set of auxiliary variables, completely observed. Y is a target variable, partly missing. Z represents causes of missingness unrelated to X and Y.
Under MCAR, analysis with observed units only (complete case analysis) is still valid.
[Figure: MCAR diagram with nodes X, Z, Y, R]

Missing At Random (MAR)
The conditional distribution of the missingness depends on the observed data, but not on the missing values.
P(Y = missing) is unrelated to the missing values, after controlling for other variables X.
MAR = MCAR within classes of X.
Example: Y = income; X = property tax. Persons with high income may be less willing to reveal their income, but within classes of property tax, nonresponse on the income question is random. Income is then MAR: given property tax, the missingness does not depend on income.
[Figure: MAR diagram with nodes X, Z, Y, R]

Not Missing At Random (NMAR)
The distribution of the missingness cannot be simplified any further and depends on both the observed and the missing data.
[Figure: NMAR diagram with nodes X, Z, Y, R]
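The three mechanisms can be written compactly in terms of the conditional distribution of R. The notation below is an assumption, not taken from the slides: Y_obs and Y_mis denote the observed and unobserved parts of Y, and X is the set of completely observed auxiliary variables. It follows the usual textbook formulation.

```latex
\begin{align*}
\text{MCAR:} \quad & P(R \mid X, Y_{\text{obs}}, Y_{\text{mis}}) = P(R) \\
\text{MAR:}  \quad & P(R \mid X, Y_{\text{obs}}, Y_{\text{mis}}) = P(R \mid X, Y_{\text{obs}}) \\
\text{NMAR:} \quad & P(R \mid X, Y_{\text{obs}}, Y_{\text{mis}}) \text{ depends on } Y_{\text{mis}}
\end{align*}
```

Read from top to bottom, each line drops one simplification: MCAR removes all conditioning, MAR keeps only the observed quantities, and NMAR allows no reduction.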

Missing data mechanisms – An example
X = age, Y = work status.
If the probability of providing the work status is the same for all persons in the survey, regardless of their age or work status, the data are Missing Completely At Random (MCAR).
If the probability of providing the work status varies with the age of the respondent, but does not vary with the work status of respondents within an age group, the data are Missing At Random (MAR).
If the probability of providing the work status varies with the work status within each age group, the data are Not Missing At Random (NMAR).
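The age and work status example can be explored with a small simulation. This is a sketch under stated assumptions, not material from the course: the two age groups, the employment rates, the response probabilities and all function names are hypothetical and chosen only for illustration. For each mechanism it compares the complete case estimate of the share of working persons with an estimate computed within age classes and weighted by the class shares, which are known because X is completely observed.

```python
import random


def simulate_people(n, rng):
    """Draw (older, works) pairs: X = age group, Y = work status."""
    people = []
    for _ in range(n):
        older = rng.random() < 0.5
        # Hypothetical employment rates: 40% for older, 80% for younger persons.
        works = rng.random() < (0.4 if older else 0.8)
        people.append((older, works))
    return people


def draw_response(people, prob, rng):
    """Draw the response indicator R for each person from prob(older, works)."""
    return [rng.random() < prob(older, works) for older, works in people]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def complete_case(people, r):
    """Share of working persons among respondents only."""
    return mean(works for (_, works), observed in zip(people, r) if observed)


def within_age_classes(people, r):
    """Respondent means per age class, weighted by full-sample class shares."""
    estimate = 0.0
    for group in (False, True):
        share = mean(older == group for older, _ in people)
        class_mean = mean(
            works
            for (older, works), observed in zip(people, r)
            if observed and older == group
        )
        estimate += share * class_mean
    return estimate


rng = random.Random(1)
people = simulate_people(50_000, rng)
truth = mean(works for _, works in people)  # known only because data are simulated

# Hypothetical response probabilities for the three mechanisms.
mechanisms = {
    "MCAR": lambda older, works: 0.7,                    # same for everyone
    "MAR": lambda older, works: 0.9 if older else 0.5,   # depends on age only
    "NMAR": lambda older, works: 0.9 if works else 0.5,  # depends on work status
}

results = {}
for name, prob in mechanisms.items():
    r = draw_response(people, prob, rng)
    results[name] = (complete_case(people, r), within_age_classes(people, r))
    print(f"{name}: complete case {results[name][0]:.3f}, "
          f"within age classes {results[name][1]:.3f}, full data {truth:.3f}")
```

Under these assumed probabilities, the complete case estimate should stay close to the full-data value for MCAR, drift away from it for MAR while the estimate within age classes remains close, and be off for both estimators under NMAR. This mirrors the statement that MAR is MCAR within classes of X.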