Published by Grant Caldwell. Modified over 5 years ago.
1
The European Statistical Training Programme (ESTP)
2
Chapter 4: Missing data mechanisms
Handbook: chapter 2
Missing data patterns
Missing data mechanisms
3
Missing data mechanisms
Missing data patterns: describe which values are observed and which values are missing. Different patterns require different methods to deal with the missing data.
Missing data mechanisms: describe the relationship between the missingness and the variables in the dataset.
4
Missing data patterns Univariate missing data
Y represents a group of variables that is either completely observed or completely missing for each sample element.
Example: unit nonresponse.
[Figure: data matrix with units 1, 2, ..., N as rows and variables X1, ..., Xp, Y as columns]
5
Missing data patterns Monotone missing data
Data are ordered in such a way that if Yj is missing for a unit, then Yj+1, …, Yp are missing as well.
Example: panel dropout, attrition.
[Figure: data matrix with units 1, 2, ..., N as rows and variables Y1, Y2, Y3, …, Yp as columns]
6
Missing data patterns Arbitrary missing data
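No structure or ordering in the missingness.
Example: item nonresponse.
[Figure: data matrix with units 1, 2, ..., N as rows and variables Y1, Y2, Y3, …, Yp as columns; missing values (?) scattered across cells]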
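The three patterns above can be told apart mechanically. The following is a minimal sketch, not part of the original slides: it assumes the data are a list of rows, that None marks a missing value, and that the columns are already in the order Y1, ..., Yp. The function name classify_pattern is hypothetical.

```python
from typing import List, Optional

Row = List[Optional[float]]


def classify_pattern(rows: List[Row]) -> str:
    """Classify the missing data pattern of a rectangular dataset.

    Assumptions (illustrative only): None marks a missing value and
    the columns are given in the order Y1, ..., Yp.
    """
    # One boolean mask per unit: True where the value is missing.
    masks = [tuple(value is None for value in row) for row in rows]

    if not any(any(mask) for mask in masks):
        return "complete"

    # Columns that contain at least one missing value.
    incomplete = {j for mask in masks for j, miss in enumerate(mask) if miss}

    # Univariate: for every unit, the incomplete columns are either
    # all missing or all observed.
    if all(
        all(mask[j] for j in incomplete) or not any(mask[j] for j in incomplete)
        for mask in masks
    ):
        return "univariate"

    # Monotone: if Yj is missing, then Yj+1 is missing as well
    # (and therefore, by induction, every later column).
    if all(
        all(mask[j + 1] for j in range(len(mask) - 1) if mask[j])
        for mask in masks
    ):
        return "monotone"

    return "arbitrary"


unit_nonresponse = [[1.0, 2.0, 3.0], [1.0, None, None], [2.0, 5.0, 6.0]]
panel_dropout = [[1.0, 2.0, 3.0], [1.0, 2.0, None], [1.0, None, None]]
item_nonresponse = [[1.0, None, 3.0], [None, 2.0, 3.0], [1.0, 2.0, None]]

for data in (unit_nonresponse, panel_dropout, item_nonresponse):
    print(classify_pattern(data))
```

The monotone check depends on the column order. A dataset that looks arbitrary may become monotone after the columns are reordered, which this sketch does not attempt.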
7
Missing data mechanisms
Any analysis of data involving item or unit nonresponse requires some assumption about the missing data mechanism.
Partition Y into an observed and an unobserved part.
The distribution of missingness is characterized by the conditional distribution of R given Y, where R indicates which values of Y are observed and which are missing.
8
Missing Completely At Random (MCAR)
The conditional distribution of R given Y does not depend on the data at all. P(Y = missing) is unrelated to the missing values of Y and to other variables X.
Let X be a set of auxiliary variables, completely observed. Y is a target variable, partly missing. Z represents causes of missingness unrelated to X and Y.
Under MCAR, analysis with observed units only (complete case analysis) is still valid.
[Diagram: variables X, Z, Y and missingness indicator R]
9
Missing At Random (MAR)
The conditional distribution of missingness depends on the observed data, but not on the missing values. P(Y = missing) is unrelated to the missing values after controlling for other variables X.
MAR = MCAR within classes of X.
Example: Y = Income; X = Property tax. Persons with high income may be less willing to reveal income. But within classes of property tax, nonresponse on the income question is random. Income is then MAR: given property tax, the missingness does not depend on income.
[Diagram: variables X, Z, Y and missingness indicator R]
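The income example can be illustrated with a small simulation. This is a sketch under invented assumptions, not data from the course: two property tax classes, normally distributed incomes, and response probabilities of 0.4 and 0.9 that depend only on the tax class. All function names and numbers are hypothetical.

```python
import random
from statistics import mean


def simulate_income_survey(n=20000, seed=1):
    """Simulate (high_tax, income, responded) records under MAR.

    All numbers are invented for illustration.
    """
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        high_tax = rng.random() < 0.5                        # X: property tax class
        income = rng.gauss(60.0 if high_tax else 30.0, 5.0)  # Y: income
        # Response depends on X only, not on income within a tax class.
        p_respond = 0.4 if high_tax else 0.9
        responded = rng.random() < p_respond
        records.append((high_tax, income, responded))
    return records


def estimates(records):
    """Return (true mean, complete case mean, class-weighted mean)."""
    true_mean = mean(y for _, y, _ in records)
    complete_case = mean(y for _, y, r in records if r)
    weighted = 0.0
    for tax_class in (False, True):
        in_class = [(y, r) for x, y, r in records if x == tax_class]
        share = len(in_class) / len(records)
        # Within a class the respondents are a random subsample,
        # so their mean estimates the class mean.
        weighted += share * mean(y for y, r in in_class if r)
    return true_mean, complete_case, weighted


true_mean, complete_case, weighted = estimates(simulate_income_survey())
print(f"true mean:          {true_mean:.2f}")
print(f"complete case mean: {complete_case:.2f}")
print(f"class-weighted:     {weighted:.2f}")
```

In this setup the complete case mean tends to fall below the true mean, because the high tax class, which has the higher incomes, responds less often. Averaging respondent means within tax classes and weighting by class size recovers the true mean closely, which is what "MCAR within classes of X" implies.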
10
Not Missing At Random (NMAR)
The distribution of the missingness cannot be simplified any further and depends on both the observed and the missing data.
[Diagram: variables X, Z, Y and missingness indicator R]
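The formulas that accompanied the three definitions were not preserved in this transcript. In the standard notation for missing data mechanisms they are usually written as below. The subscripts obs and mis for the observed and unobserved parts of Y are an assumption about the notation, and any fully observed auxiliary variables X count as part of the observed data.

```latex
\begin{align*}
\text{MCAR:}\quad & P(R \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}) = P(R) \\
\text{MAR:}\quad  & P(R \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}) = P(R \mid Y_{\mathrm{obs}}) \\
\text{NMAR:}\quad & P(R \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}) \text{ cannot be simplified further}
\end{align*}
```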
11
Missing data mechanisms – An example
X = Age, Y = Work status
If the probability of providing the work status is the same for all persons in the survey, regardless of their age or work status, the data are Missing Completely At Random (MCAR).
If the probability of providing the work status varies according to the age of the respondent, but does not vary according to the work status of respondents within an age group, then the data are Missing At Random (MAR).
If the probability of providing the work status varies according to the work status within each age group, the data are Not Missing At Random (NMAR).
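The three cases in this example can be made concrete with a simulation. This is a sketch only: the response probabilities, age groups and employment rates are invented, and the function names are hypothetical. It reports the response rate in each (age group, work status) cell. That is possible here only because the simulation knows the work status of nonrespondents, which a real survey does not.

```python
import random
from collections import defaultdict


def p_respond(mechanism, age_group, employed):
    """Hypothetical probability of providing the work status."""
    if mechanism == "MCAR":
        return 0.7                                   # same for everyone
    if mechanism == "MAR":
        return 0.9 if age_group == "young" else 0.5  # depends on age only
    if mechanism == "NMAR":
        base = 0.8 if age_group == "young" else 0.6
        return base + (0.1 if employed else -0.2)    # depends on work status too
    raise ValueError(f"unknown mechanism: {mechanism}")


def response_rates(mechanism, n=40000, seed=7):
    """Return the response rate per (age_group, employed) cell."""
    rng = random.Random(seed)
    counts = defaultdict(lambda: [0, 0])  # cell -> [respondents, units]
    for _ in range(n):
        age_group = "young" if rng.random() < 0.5 else "old"
        employed = rng.random() < (0.8 if age_group == "young" else 0.4)
        responded = rng.random() < p_respond(mechanism, age_group, employed)
        cell = counts[(age_group, employed)]
        cell[0] += responded
        cell[1] += 1
    return {key: resp / total for key, (resp, total) in counts.items()}


for mechanism in ("MCAR", "MAR", "NMAR"):
    print(mechanism)
    for (age_group, employed), rate in sorted(response_rates(mechanism).items()):
        print(f"  {age_group:5s} employed={employed!s:5s} response rate={rate:.2f}")
```

Under MCAR the four rates should be roughly equal. Under MAR they should differ between age groups but not by work status within an age group. Under NMAR they should differ by work status even within an age group.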