The European Statistical Training Programme (ESTP)

Chapter 6: Measures of representativity
Handbook: chapter 7
- Why measures of representativity?
- Representative response
- R-indicators
- Partial R-indicators
- Examples
- Alternative indicators

Why measures of representativity?
The impact of nonresponse on survey quality
- Bias is potentially increased
- Variance is increased
How to get an indication of the impact?
- Response rates
- Measures of representativity

Response rates can be poor indicators
- There are various examples in practice where efforts to increase response rates led to increased nonresponse bias.
- The response rate only bounds the maximal impact of nonresponse under the worst-case scenario.
Example
Variable                      Response mean  Estimate  After 1 month  After 2 months
More than 12 hours employed   48.6%          50.4%     49.6%          50.6%
Owns a house                  63.0%          63.3%     59.1%          59.4%
Owns a pc                     59.6%          59.8%     57.3%          57.2%
Social allowance              10.5%          10.4%     11.6%          11.4%
Is non-native                 12.9%          12.5%     14.6%          14.4%

How can higher response rates lead to an increased bias?
- Fixed response model
- Random response model
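The question above can be illustrated with a small numerical sketch under the random response model, in which every unit responds with its own propensity. The population, the two groups, the propensities and the helper `expected_respondent_mean` are all made-up illustrations, not values from the course material. The expected respondent mean is approximated here by the propensity-weighted mean, which is reasonable for large samples.

```python
import numpy as np

def expected_respondent_mean(y, rho):
    """Approximate expected mean of y among respondents when unit i responds with probability rho[i]."""
    y = np.asarray(y, dtype=float)
    rho = np.asarray(rho, dtype=float)
    return float(np.sum(rho * y) / np.sum(rho))

# Hypothetical population: two equally sized groups with different values of y.
y = np.array([0.4] * 500 + [0.6] * 500)
true_mean = y.mean()

# Scenario 1: everybody responds with propensity 0.5.
rho_before = np.full(1000, 0.5)
# Scenario 2: extra fieldwork effort raises the propensity in the second group only.
rho_after = np.array([0.5] * 500 + [0.9] * 500)

rate_before, rate_after = rho_before.mean(), rho_after.mean()
bias_before = expected_respondent_mean(y, rho_before) - true_mean
bias_after = expected_respondent_mean(y, rho_after) - true_mean

print(f"response rate {rate_before:.2f} -> bias {bias_before:+.4f}")
print(f"response rate {rate_after:.2f} -> bias {bias_after:+.4f}")
```

In this toy setting the response rate goes up while the response propensities become less equal, so the bias grows with the response rate.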

Need for indicators that measure “representativity”
- To use as counterparts of response rates
- To enable comparative research over time or across surveys
- To get insight into the quality of the data collection
- To monitor survey fieldwork
- To use in the allocation of fieldwork efforts → responsive designs
What is representativity?

Representative response
How to define representativity?
- A mathematically rigorous definition is needed.
- Adopt the random response model, i.e. every unit i responds with a response propensity ρ_i.
- A response mechanism is strongly representative if all response propensities are identical, ρ_i = ρ for all i.
- The definition relates to Missing Completely At Random (MCAR) for all possible survey items.
- If the response mechanism is strongly representative, then the nonresponse bias is zero for any survey item.
Could the strong definition be the basis for indicators?
- No, the definition cannot be tested!

Representative response
How to define representativity?
- Representativity can only be investigated with respect to available auxiliary information.
- A response mechanism is (weakly) representative for an auxiliary variable X if the average response propensity is the same within all classes of X.
Could the weak definition be the basis for indicators?
- Yes, the definition can be tested using statistics!
- Estimate individual response propensities with a multivariate model incorporating the available auxiliary information.

R-indicators
Desirable properties
- Easy to interpret
- Based on available registry data and survey data only
- Relevant, i.e. effective in improving response
- Allow for analysis at different levels of detail
A natural candidate is the (subgroup) response rate, but:
- The response rate only limits the maximal impact of nonresponse
- Response rates do not reflect the relative size of subgroups
- Response rates have no meaning at the variable level
- Response rates are univariate

R-indicators
- The R-indicator is based on the variation in individual response propensities.
- It can be shown that the R-indicator follows from the Euclidean distance to weakly representative response.
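The definition commonly used in the R-indicator literature can be written as below. The notation is assumed here: N is the population size, ρ_i the response propensity of unit i and ρ̄ the mean propensity.

```latex
S(\rho) = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}\left(\rho_i - \bar{\rho}\right)^2},
\qquad
R(\rho) = 1 - 2\,S(\rho)
```

With this scaling, R(ρ) = 1 when all propensities are equal, and values closer to 0 indicate larger variation in propensities.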

R-indicators
Measure: how to estimate in practice?
- Scaled population standard deviation of response probabilities
- Scaled sample standard deviation of response probabilities
- Scaled sample standard deviation of estimated response probabilities
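The last of these three steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the course's implementation. Propensities are estimated by the response rate within each class of a single auxiliary variable, as a simple stand-in for the multivariate model mentioned earlier. The R-indicator is taken to be 1 minus twice the design-weighted standard deviation of the estimated propensities. The function names, the sample and the class labels are hypothetical.

```python
import numpy as np

def stratum_propensities(strata, responded):
    """Estimate each unit's propensity by the response rate of its class of X."""
    strata = np.asarray(strata)
    responded = np.asarray(responded, dtype=float)
    rho_hat = np.empty(len(strata), dtype=float)
    for label in np.unique(strata):
        mask = strata == label
        rho_hat[mask] = responded[mask].mean()
    return rho_hat

def r_indicator(rho_hat, design_weights=None):
    """R = 1 - 2 * S(rho_hat), using a design-weighted standard deviation (assumed form)."""
    rho_hat = np.asarray(rho_hat, dtype=float)
    if design_weights is None:
        d = np.ones_like(rho_hat)  # equal weights if none are supplied
    else:
        d = np.asarray(design_weights, dtype=float)
    n_pop = d.sum()  # estimated population size
    mean = np.sum(d * rho_hat) / n_pop
    variance = np.sum(d * (rho_hat - mean) ** 2) / (n_pop - 1)
    return float(1.0 - 2.0 * np.sqrt(variance))

# Hypothetical sample of 10 units in two classes of X (1 = responded, 0 = did not).
strata = np.array(["young"] * 5 + ["old"] * 5)
responded = np.array([1, 0, 0, 0, 0, 1, 1, 1, 1, 0])

rho_hat = stratum_propensities(strata, responded)
print("estimated propensities:", rho_hat)
print("R-indicator:", round(r_indicator(rho_hat), 3))
```

In practice the propensities would come from a model such as logistic regression on several auxiliary variables; the resulting R-indicator always depends on the choice of X.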

R-indicators
Nonresponse bias of the Horvitz-Thompson estimator
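A sketch of the usual large-sample result under the random response model, in assumed notation: ȳ_r is the Horvitz-Thompson-type response mean, ρ̄ the mean response propensity, S(·) a population standard deviation and R(ρ) = 1 − 2S(ρ).

```latex
B(\bar{y}_r) \approx \frac{\operatorname{Cov}(y,\rho)}{\bar{\rho}},
\qquad
\left| B(\bar{y}_r) \right| \le \frac{S(\rho)\,S(y)}{\bar{\rho}}
= \frac{\bigl(1 - R(\rho)\bigr)\,S(y)}{2\,\bar{\rho}}
```

The inequality follows from the Cauchy-Schwarz bound on the covariance. It shows why the maximal bias depends on both the response rate and the R-indicator.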

R-indicators
Bounding R-indicators: response-representativity plots
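Assuming such plots follow the usual construction, with the response rate ρ̄ on one axis and the R-indicator on the other, two bounds are typically drawn. The notation is assumed: R(ρ) = 1 − 2S(ρ), B is the nonresponse bias of the response mean, S(y) the population standard deviation of y, and γ a hypothetical tolerance for the standardised bias |B|/S(y).

```latex
R(\rho) \ge 1 - 2\sqrt{\bar{\rho}\,(1-\bar{\rho})} \quad \text{(for large } N\text{)},
\qquad
R(\rho) \ge 1 - 2\,\gamma\,\bar{\rho}
\;\Longrightarrow\;
\frac{|B|}{S(y)} \le \gamma
```

The first expression is the lowest value R can take at a given response rate; the second gives, for each response rate, the R-value needed to keep the maximal standardised bias below γ.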

R-indicators
Examples of response rates and R-indicators (including three curves)

Example 1 – Various EU surveys
X = gender, age, urbanization
Survey                           Sample size  Response rate  R-indicator
Health Survey 2005 (Holland)     15,411       67.3%          0.832
ESS 2006 (Belgium)               2,927        61.4%          0.807
ESS (Norway)                     2,673        65.6%          0.762
Level of Living 2004 (Norway)    4,837        69.1%          0.872
LFS Quarter 3 – 2007 (Slovenia)  2,219        70.1%          0.854
LFS Quarter 4 – 2007 (Slovenia)  2,215        69.3%

Example 2 – Dutch Survey on Informal Economy
X = age, house value, ethnicity, type of household, employment, urban
Response group     Response rate  Representativity measure R  Confidence interval  Maximal bias
Face-to-face       56.7%          77.8%                       74.4% - ...%         10.2%
Web/paper          33.9%          86.3%                       83.1% - ...%         11.2%
Web/paper + phone  49.0%          79.3%                       75.6% - ...%         11.3%

Example 3 - VAT in time
X = wages(t), NACE, VAT(t-12)

Partial R-indicators
Partial or conditional representativity
- What characteristics relate to deviation from representativity?
- What groups need special attention during data collection?
- Define indicators that measure conditional or partial representativity

Partial R-indicators
- Definition: Response is representative with respect to X if the response propensities are constant over the categories of X.
- Definition: Response is conditionally representative with respect to X given Z if the response propensities are constant over the categories of X within the strata formed by Z.
- R-indicator: the variation in response propensities
- Idea: decompose the response propensity variance into between and within variance

Partial R-indicators
- Partial R-indicators decompose the R-indicator based on the impact of single variables:
  total variance = between variance + within variance
- Unconditional partial R-indicator for a single variable Z: the between variance of response propensities over the categories of Z
- Conditional partial R-indicator for a single variable Z given X: the within variation in response propensities given a stratification on X
- Both types of indicators should ideally be close to 0; they allow for monitoring of data collection and for resource allocation
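The variance decomposition can be sketched as follows. This is a simplified illustration under assumptions: variances use a 1/N denominator, the partial indicators are reported as square roots of the between and within components, and the function names, propensities and category labels are hypothetical.

```python
import numpy as np

def between_within(rho, groups):
    """Split the variance of rho (1/N denominator) into between- and within-group parts."""
    rho = np.asarray(rho, dtype=float)
    groups = np.asarray(groups)
    overall = rho.mean()
    between = 0.0
    within = 0.0
    for label in np.unique(groups):
        r = rho[groups == label]
        share = len(r) / len(rho)  # relative size of the group
        between += share * (r.mean() - overall) ** 2
        within += share * np.mean((r - r.mean()) ** 2)
    return between, within

def partial_unconditional(rho, z):
    """Square root of the between-category variance of rho over the categories of Z."""
    return float(np.sqrt(between_within(rho, z)[0]))

def partial_conditional(rho, x_cells):
    """Square root of the variance of rho that remains within the cells formed by X."""
    return float(np.sqrt(between_within(rho, x_cells)[1]))

# Hypothetical estimated propensities for 8 units, with Z = urbanicity and X = age class.
rho = np.array([0.3, 0.4, 0.5, 0.6, 0.6, 0.7, 0.8, 0.9])
z = np.array(["rural"] * 4 + ["urban"] * 4)
x = np.array(["young", "old"] * 4)

b, w = between_within(rho, z)
print("total variance:", np.var(rho), "between + within:", b + w)
print("unconditional partial indicator for Z:", partial_unconditional(rho, z))
print("conditional partial indicator for Z given X:", partial_conditional(rho, x))
```

Because the group shares act as weights, large subgroups with deviating response propensities contribute more than small ones, which plain subgroup response rates do not show.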

Example 1 - STS Retail 2007
X = VAT(t-12) × Size

Example 2 – Monitoring response to SCS 2005
Unconditional (univariate) and conditional (multivariate) partial R-indicators for eligibility, contact, cooperation and overall response

Alternative indicators
Särndal and Lundström (2010):
- Response is balanced for X if the response means for X are equal to the sample means.
- Indicator: coefficient of variation of adjustment weights.
- The indicator is proportional to the maximal bias for large sample sizes and depends on the choice of X.
Wagner (2008) / Andridge & Little (2011):
- No explicit definition of representative response.
- Approach taken from multiple imputation.
- Indicator: Fraction of Missing Information.
- The indicator depends on the target variable Y and the choice of X.
- If the relation between X and Y is perfect, the indicator equals zero; if there is no relation, it equals the nonresponse rate.
