Published by Adela Fisher. Modified over 9 years ago.
2
PEAS workshop 2
Non-response and what to do about it
Gillian Raab, Professor of Applied Statistics, Napier University
3
What do we mean by non-response?
Unit non-response
Item non-response
Start with the first of these
A unit non-respondent is someone we tried to include in the survey but from whom we obtained no response at all
We may or may not know anything about them, or even whether they exist
4
What is an acceptable response rate?
99% 90% 80% 70% 50% 40% 30% 20%
It depends who you are
It depends on why the response is poor
It depends on whether non-responders are like responders
5
An example
A postal survey on attitudes to racial discrimination got a 45% response rate. Two possible explanations:
1. Half of the letters were lost by the post office, but most of the others replied
2. No letters were lost, but a qualitative study after the survey revealed that many people did not reply because they were hostile to immigrant groups
6
Types of ‘missingness’
In the first example the missing people might not be thought to be different from the others
– Missing Completely at Random (MCAR)
In the second one the missing people would be likely to have quite different views
– Missing Not at Random (MNAR)
7
An intermediate position
Missing at Random (MAR)
– Assumes that, within groups we can identify in the survey, the missing people are just like the ones who reply
The methods that survey researchers use all make this assumption
– But you need good information about those who don’t respond
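The three types of missingness can be written more formally. The notation here is an assumption added for illustration and is not on the slides: R = 1 means the person responded, Y is the survey answer, and X is the group information known for everyone (for example age group and sex).

```latex
\begin{align*}
\text{MCAR:}\quad & P(R = 1 \mid X, Y) = P(R = 1) \\
\text{MAR:}\quad  & P(R = 1 \mid X, Y) = P(R = 1 \mid X) \\
\text{MNAR:}\quad & P(R = 1 \mid X, Y) \text{ still depends on } Y \text{ after conditioning on } X
\end{align*}
```

Under MAR, responders and non-responders within the same X group have the same distribution of Y, which is what justifies weighting within groups.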
8
Survey non-response is a world-wide problem
[Figure: refusal rates in major US surveys]
9
[Figure: non-contact rates. Source: Atrostic et al., Journal of Official Statistics]
10
So doing something about it has become important
The most commonly used method for unit non-response is weighting
Non-response weights can be calculated
– From data available on the sampling frame
– From another source of data for the population
If it is the latter, it is often called POST-STRATIFICATION
11
An example – Ayr and Arran Health Survey
Postal survey based on the CHI (Community Health Index)
Response rate about 50%
Can’t be sure of the response rate because ‘dead wood’ was not properly accounted for
Population data available for data zones by 5-year age and sex groups
12
How to do it – simple case
Age/sex groups only
Make a table by age group and sex for the Census data and for the survey
Use reasonably sized groups (>50)
Calculate the ratio of sample numbers to population numbers in each group (overall 1.5%, or 0.015)
The inverse of this ratio becomes the grossing-up weight
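The calculation above can be sketched in a few lines of Python. This is a minimal illustration, not the survey's actual procedure: the cell counts are made up, and the function name `grossing_up_weights` and the `min_cell` check are assumptions standing in for the "reasonable size groups" advice.

```python
# Hypothetical counts for illustration only; NOT the Ayr and Arran figures.
population = {  # Census counts by (age group, sex)
    ("16-24", "M"): 4000, ("16-24", "F"): 3800,
    ("25-34", "M"): 5000, ("25-34", "F"): 5200,
}
respondents = {  # survey respondents in the same cells
    ("16-24", "M"): 60, ("16-24", "F"): 57,
    ("25-34", "M"): 75, ("25-34", "F"): 104,
}

def grossing_up_weights(population, respondents, min_cell=50):
    """Return one weight per cell: population count / respondent count.

    This is the inverse of the ratio of sample numbers to population.
    min_cell is an assumed threshold; small cells should be merged first.
    """
    weights = {}
    for cell, n_pop in population.items():
        n_resp = respondents.get(cell, 0)
        if n_resp < min_cell:
            raise ValueError(f"cell {cell} has only {n_resp} respondents; merge it")
        weights[cell] = n_pop / n_resp
    return weights

weights = grossing_up_weights(population, respondents)
# Weighted respondent counts reproduce the population total.
weighted_total = sum(weights[c] * respondents[c] for c in respondents)
```

Each respondent in a cell carries that cell's weight, so groups with poorer response get larger weights.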
13
Why such extreme weights here?
The CHI is not a perfect sampling frame
It has dead people on it and people who have moved away
We think that non-contacts were replaced
We did have some data on all addresses used
15
Item non-response
Ignore cases with missing data
Becomes problematic in regression models
Use imputation to replace the missing values
– Informed imputation
– Hot deck imputation
– Model based imputation (can be multiple)
16
Informed imputation
Mainly used for sub-items when a total is needed
– E.g. income, housing costs
Often requires detailed examination of cases
– E.g. finding benefit entitlement
– Costs of a particular repair
Survey specific
17
Hot deck imputation
Often used in census data
Can be used for both unit and item non-response
For unit non-response, a missing case is replaced with another one that matches on whatever data are available
For item non-response, another case (a donor) is selected that is similar to the case with the missing item on the other things that are measured
Can get very messy and difficult, and can lead to impossible combinations such as pregnant men
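A minimal sketch of hot deck imputation for a single item, assuming donors are drawn at random from cases that match on a chosen set of variables. The function name `hot_deck_impute`, the record layout and the data are hypothetical; real hot deck procedures are usually more elaborate.

```python
import random

def hot_deck_impute(records, item, match_on, rng=None):
    """Fill missing `item` values (None) from a random donor in the same class."""
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is reproducible
    # Collect donor values from complete cases, grouped by the matching variables.
    donors = {}
    for rec in records:
        if rec[item] is not None:
            key = tuple(rec[v] for v in match_on)
            donors.setdefault(key, []).append(rec[item])
    completed = []
    for rec in records:
        new = dict(rec)  # copy, so the original records are left untouched
        if new[item] is None:
            pool = donors.get(tuple(new[v] for v in match_on))
            if pool:  # stays missing if no donor matches
                new[item] = rng.choice(pool)
        completed.append(new)
    return completed

# Made-up records for illustration only.
records = [
    {"sex": "F", "age_group": "25-34", "pregnant": "yes"},
    {"sex": "F", "age_group": "25-34", "pregnant": "no"},
    {"sex": "M", "age_group": "25-34", "pregnant": "no"},
    {"sex": "M", "age_group": "25-34", "pregnant": None},
    {"sex": "F", "age_group": "25-34", "pregnant": None},
]
completed = hot_deck_impute(records, "pregnant", match_on=("sex", "age_group"))
```

Matching on sex is what keeps the male case from receiving a "yes" donor here; dropping it from `match_on` would allow exactly the kind of impossible combination mentioned above.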
18
Model based imputation
Assumes some statistical model for the data
– For example, a multivariate normal distribution
Start by replacing missing values by their means
Fit the model, and then replace the missing values with a sample from their predictive distribution given the data
Do this repeatedly until the pattern stabilises
You then have a complete data set to work with
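The loop described above can be sketched for the simplest case: one fully observed variable x and one variable y with gaps, assuming y given x is normal around a straight line. This is a simplified illustration only. The function name `impute_y_given_x` and the data are hypothetical, and a full implementation would also draw the model parameters afresh at each step rather than plugging in their estimates.

```python
import random
import statistics

def impute_y_given_x(x, y, n_iter=20, seed=0):
    """Iteratively impute missing y (None) from a normal linear model on x."""
    rng = random.Random(seed)
    missing = [i for i, v in enumerate(y) if v is None]
    observed_mean = statistics.fmean(v for v in y if v is not None)
    # Step 1: start by replacing missing values with the observed mean.
    y_fill = [observed_mean if v is None else v for v in y]
    for _ in range(n_iter):
        # Step 2: fit the model (least-squares line) to the completed data.
        mx, my = statistics.fmean(x), statistics.fmean(y_fill)
        sxx = sum((a - mx) ** 2 for a in x)
        sxy = sum((a - mx) * (b - my) for a, b in zip(x, y_fill))
        slope = sxy / sxx
        intercept = my - slope * mx
        resid = [b - (intercept + slope * a) for a, b in zip(x, y_fill)]
        sd = (sum(r * r for r in resid) / (len(x) - 2)) ** 0.5
        # Step 3: redraw each missing value from its predictive distribution.
        for i in missing:
            y_fill[i] = rng.gauss(intercept + slope * x[i], sd)
    return y_fill

# Made-up data for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.1, 3.9, None, 8.2, 9.8, None, 14.1, 16.0]
completed_y = impute_y_given_x(x, y)
```

Running it with different `seed` values gives different completed data sets, which is the starting point for multiple imputation.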
19
It works surprisingly well
Even when the data are categories
Just analysing the data as they are would give misleading precision
But there is an easy adjustment that can be made by running more than one imputation (usually 5) and adding in a bit for the variation between them
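The adjustment is usually done with the combining rules attributed to Rubin: average the estimates, then add the between-imputation variance, inflated by (1 + 1/m), to the average within-imputation variance. A minimal sketch follows; the function name `pool_estimates` and the numbers are made up for illustration.

```python
import statistics

def pool_estimates(estimates, variances):
    """Combine m imputed-data estimates and their squared standard errors."""
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need matching estimates and variances from m >= 2 imputations")
    pooled = statistics.fmean(estimates)
    within = statistics.fmean(variances)       # average within-imputation variance
    between = statistics.variance(estimates)   # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between     # the "bit added" for between-imputation variation
    return pooled, total ** 0.5

# Hypothetical results from 5 imputed data sets.
estimates = [10.0, 11.0, 9.0, 10.5, 9.5]
variances = [1.0, 1.0, 1.0, 1.0, 1.0]
pooled, pooled_se = pool_estimates(estimates, variances)
```

With these numbers the pooled standard error is larger than the standard error of 1.0 from any single completed data set, which reflects the extra uncertainty due to the missing values.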
20
It is accessible
Theory and practice have been developed by Don Rubin and Joe Schafer
Implemented in several programs
– Including SAS PROC MI
Once you have the multiple data sets they can be analysed with PROC MIANALYZE
21
Summary
Unit non-response
– Weighting
– Hot deck imputation
Item non-response
– Use available cases
– Use imputation
– Only time for a sketch of the latter