ESDS meeting, 9th September 2005. P|E|A|S: Practical Exemplars on the Analysis of Surveys. Web site to help people analyse surveys, supported by the ESRC.


P|E|A|S: Practical Exemplars on the Analysis of Surveys
  – Web site to help people analyse surveys
  – Supported by the ESRC research methods programme
  – Authors:
      Gillian Raab, Napier University
      Susan Purdon, National Centre for Social Research
      Kathy Buckner, Napier University
      Iona Waterston, Web designer

Summary of this presentation
Background to the project
  – Our starting point and basic principles
  – Important concepts in survey design and analysis
  – Software for survey analysis
  – Approaches to missing data
What we have learned from the project
  – Survey methods
  – Survey software
  – Missing data challenges
Questions

Starting points (1)
Survey data has special features that need to be considered in the analysis
There is an enormous academic literature on survey analysis
Universities in the UK have less expertise in survey analysis than in North America or Europe
Most of the expertise lies in survey organisations

Starting points (2)
The ESRC makes lots of data available via their survey archive, lots of it from Scotland
  – Scottish Health Survey
  – Scottish Household Survey
This investment is to encourage use by, e.g.
  – University researchers
  – Government departments
  – Local authorities
  – Voluntary organisations
But there is limited expertise on how best to analyse survey data

Starting points (3)
Basic statistical theory for analysing sample surveys was developed from the 1950s to the 1970s
  – Cochran, Kish, Rao
The methods calculate confidence intervals and standard errors that take account of the survey design
But none of this methodology found its way into commonly used statistical packages until very recently
  – STATA version 8?
  – SAS version 8 onwards
  – SPSS version 12 onwards
  – Splus/R survey packages in the last two years
More recent methods are also available (especially in STATA and R)

Basic principles of what we present on P|E|A|S
To illustrate how to use these new survey procedures effectively
To help you to use them on your own data
To use them to see how effective the design of the survey has been in getting accurate and precise estimates
Like driving a car
  – We don't expect you to understand all the details of how it works
  – But you do need to know the general principles
  – How to use the controls effectively
  – What regular checks you should be doing
  – What roads you should not be driving down

Survey features
Based on current UK practice by ONS and survey organisations
  – Weighting
  – Clustering
  – Stratification
Each of these has an impact on the results you get from analysing a survey.
Only weighting will affect the estimates
But all three will affect the standard errors

Weighting can make a large difference to answers
Smoking rates from the 1998 Scottish Health Survey (ex3)

Weighting
Why do we do it / need to do it?
  – To make the sample match the population
Because of selection as part of the design
  – Different sampling fractions in different areas
  – Selection of one adult per household
To adjust for non-response
How does it affect the precision of estimates?
  – It depends on both the weights and the data being analysed
  – It can help or hurt
  – If the weights are not related to the data being analysed then it will hurt to have unequal weights
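
The two points above (weights change the estimate, and unequal weights cost precision when they are unrelated to the outcome) can be sketched in a few lines of Python. This is only an illustration: the respondents and weights are invented, and the precision loss uses Kish's approximation n * sum(w^2) / (sum w)^2, which is one common rule of thumb rather than anything taken from the P|E|A|S exemplars.

```python
def weighted_mean(values, weights):
    """Weighted mean: each respondent counts in proportion to their weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def kish_weighting_deff(weights):
    """Kish's approximate design effect from unequal weights alone.

    Assumes the weights are unrelated to the variable being analysed.
    Equals 1.0 when all weights are equal and grows as they vary more.
    """
    n = len(weights)
    return n * sum(w * w for w in weights) / sum(weights) ** 2


# Hypothetical respondents (not survey data): 1 = smoker, 0 = non-smoker.
smoker = [1, 1, 0, 0, 0, 1]
weight = [1.0, 1.0, 1.0, 3.0, 3.0, 3.0]

unweighted_rate = sum(smoker) / len(smoker)
weighted_rate = weighted_mean(smoker, weight)
deff_from_weights = kish_weighting_deff(weight)

print(f"unweighted rate: {unweighted_rate:.3f}")
print(f"weighted rate:   {weighted_rate:.3f}")
print(f"approx. design effect from weighting: {deff_from_weights:.2f}")
```

With these made-up numbers the weighted rate is lower than the unweighted one, and the approximate design effect is above 1, i.e. the unequal weights reduce precision.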

Effect of weighting on standard errors (ex4)
WERS 98: a survey of workplaces run by the DWP
Stratified by workplace size
Sampling fractions much larger in strata of large workplaces
This often helps if we want to estimate something like the total number of employees with disabilities
But for the proportion of workplaces with an equal opportunities policy it hurts

Stratification
Divide up the sampling frame into strata (e.g. region, type of area)
Take a sample of a fixed number of units from each stratum
Stratification can be either proportionate or disproportionate
Proportionate stratification means that the sample will match the population BETTER than would be expected by chance
So proportionate stratification improves precision
If it is disproportionate, weights will be needed to estimate population totals
Disproportionate stratification may help or hurt precision
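
A minimal sketch of why disproportionate stratification needs weights while proportionate stratification does not. The stratum names and sizes are hypothetical, and the weight is taken to be the usual inverse sampling fraction (stratum population divided by stratum sample size); real designs would also round the allocation to whole units.

```python
def proportionate_allocation(pop_sizes, n_total):
    """Share out n_total across strata in proportion to stratum population."""
    pop_total = sum(pop_sizes.values())
    return {h: n_total * size / pop_total for h, size in pop_sizes.items()}


def design_weights(pop_sizes, sample_sizes):
    """Weight for each stratum = population size / sample size."""
    return {h: pop_sizes[h] / sample_sizes[h] for h in pop_sizes}


# Hypothetical sampling frame with two strata.
population = {"urban": 6000, "rural": 4000}

proportionate = proportionate_allocation(population, 100)
disproportionate = {"urban": 50, "rural": 50}  # equal take per stratum

weights_proportionate = design_weights(population, proportionate)
weights_disproportionate = design_weights(population, disproportionate)

print(weights_proportionate)
print(weights_disproportionate)
```

Under the proportionate allocation every unit gets the same weight, so the sample is self-weighting; under the equal split the two strata get different weights, which must be used to estimate population totals.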

Clustering
Multi-stage designs are very common in government surveys
Stage 1: first a sample of clusters (e.g. post-code sectors)
Stage 2: then a sample of households within each cluster
If clusters are selected with probability proportional to size and a fixed sample size is taken within each cluster, then no weighting is required
Clustering almost always makes survey estimates less precise
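
The claim that probability-proportional-to-size selection with a fixed take needs no weighting can be checked with a little arithmetic. The sketch below assumes a stage 1 selection probability of (number of clusters × cluster size / total households) and invented cluster sizes; exact fractions are used so the cancellation is visible.

```python
from fractions import Fraction


def household_selection_probability(cluster_size, total_households,
                                    n_clusters, take):
    """Overall chance that one household is selected in a two-stage design."""
    # Stage 1: cluster chosen with probability proportional to its size.
    stage1 = Fraction(n_clusters * cluster_size, total_households)
    # Stage 2: a fixed number of households taken within the chosen cluster.
    stage2 = Fraction(take, cluster_size)
    return stage1 * stage2


# Hypothetical design: 20 clusters, 25 households per cluster,
# from a frame of 100,000 households.
cluster_sizes = [500, 1000, 2500]
probabilities = [
    household_selection_probability(size, 100_000, 20, 25)
    for size in cluster_sizes
]

for size, p in zip(cluster_sizes, probabilities):
    print(f"cluster of {size} households: selection probability {p}")
```

The cluster size cancels between the two stages, so every household has the same overall chance of selection whatever the size of its cluster.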

Design effects (1)
A design effect is a ratio that compares the variance of a survey estimate with the variance that would have been achieved from a simple, unclustered, unweighted, unstratified random sample of the same size from the same population.
A large design effect is bad
A design effect of 2 means that your effective sample size is only half the number of responses you have achieved
Means, proportions, differences between groups, regression coefficients and hazard ratios all have design effects, and chi-squared tests need adjustment by design effects
Design effects are often quite different for subgroups of a sample, and often not so bad
Design effects for differences between groups are often very different from those for the overall mean, and often much better
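
As a rough illustration of the definition, the design effect can be computed as the ratio of two variances, and the effective sample size as the achieved sample size divided by the design effect. The function names and numbers below are made up for the example.

```python
def design_effect(var_design, var_srs):
    """Variance under the actual design / variance under simple random sampling."""
    return var_design / var_srs


def effective_sample_size(n_achieved, deff):
    """Size of simple random sample that would give the same precision."""
    return n_achieved / deff


# Hypothetical variances of the same estimate under the two designs.
deff = design_effect(var_design=0.0004, var_srs=0.0002)
n_eff = effective_sample_size(n_achieved=2000, deff=deff)

print(f"design effect: {deff:.2f}")
print(f"effective sample size: {n_eff:.0f} out of 2000 responses")
```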

Design effects (2)
Many surveys publish tables of design effects or design factors for key variables, but rarely more than a page of them and almost never for things like differences
The design factor is just the square root of the design effect
The idea is that you can just do an ordinary analysis and multiply your standard error by the design factor. This was for the pre-survey-software days
On balance it probably gave standard errors that were too large for a lot of analyses, since people would try to play safe by taking the biggest design effect in the table.
We don't need to use design effects like that if we use the correct software
But they are a measure of how well the design has worked to get good answers
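
The pre-survey-software shortcut described above looks like this in outline: take the ordinary standard error and multiply it by the design factor. The proportion, sample size and published design effect here are hypothetical, and the ordinary standard error is the usual simple-random-sample formula for a proportion.

```python
import math


def srs_standard_error_of_proportion(p, n):
    """Ordinary standard error of a proportion, ignoring the survey design."""
    return math.sqrt(p * (1 - p) / n)


def design_factor(deff):
    """Design factor = square root of the design effect."""
    return math.sqrt(deff)


def adjusted_standard_error(se_srs, deff):
    """Ordinary standard error inflated by the design factor."""
    return se_srs * design_factor(deff)


# Hypothetical estimate: p = 0.5 from n = 400, published design effect 1.44.
se_ordinary = srs_standard_error_of_proportion(0.5, 400)
se_adjusted = adjusted_standard_error(se_ordinary, 1.44)

print(f"ordinary standard error: {se_ordinary:.4f}")
print(f"design factor:           {design_factor(1.44):.2f}")
print(f"adjusted standard error: {se_adjusted:.4f}")
```

Survey software makes this shortcut unnecessary, because it computes the standard error for each estimate directly from the design.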

To summarise
To get unbiased estimates you need to use survey weights.
To get correct standard errors you need to take into account the survey design, in particular weighting, clustering and stratification.
We can now do this with standard software using survey methods
Survey analysis software can also compare groups, carry out regression analyses, etc.

Software for survey analysis
You need a package that will allow for the survey design
Specialist packages (SUDAAN, WESVAR) have been in use for many years
STATA was the first general package with survey methods
SAS, SPSS (add-on) and Splus/R all now have them
Different ways of describing the survey design
And different capabilities in
  – Variety of methods
  – What feedback they give you about what you have done
  – Warning you when things are not going right
Latest versions of all four packages will cover almost everything you would expect

Non-response
An increasing problem for survey researchers
From Alasdair Crockett: Weighting the Social Surveys (ESDS web site)

Two ways of dealing with it
Post-stratification
  – Re-weighting the sample so as to match population totals
  – Gets a new set of weights
Imputation
  – Fills in the missing values
  – Different procedures available
  – Used in censuses (one number)
  – And most often in longitudinal surveys

Post-stratification
Only as good as the totals you are using for the population
Will only correct non-response bias if the difference between responders and non-responders is explained by the post-stratification factors
Census survey-link scheme informs us about this
It has the potential to improve precision (see slide 12 if time)
Survey firms and ONS are reluctant to use it because it may interrupt time series
But post-stratification of old survey data is also a possibility
Some survey packages will do it for you (R/Splus, STATA add-on package for version 8, SAS Calmar macro)
Analysing a survey to take account of post-stratification needs extra tricks (Splus/R and STATA provide them)
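
A bare-bones sketch of what post-stratification does to the weights: within each post-stratum the existing weights are scaled so that they add up to the known population total. The groups and totals are invented, and this is not the algorithm of any particular package; it also does nothing about the extra steps needed to get correct standard errors afterwards.

```python
from collections import defaultdict


def post_stratify(groups, weights, population_totals):
    """Rescale weights so each group's weights sum to its population total."""
    weighted_totals = defaultdict(float)
    for g, w in zip(groups, weights):
        weighted_totals[g] += w
    return [
        w * population_totals[g] / weighted_totals[g]
        for g, w in zip(groups, weights)
    ]


# Hypothetical responders: women responded more often than men.
groups = ["men", "men", "women", "women", "women"]
weights = [1.0, 1.0, 1.0, 1.0, 1.0]
population_totals = {"men": 50, "women": 50}  # assumed known, e.g. from a census

new_weights = post_stratify(groups, weights, population_totals)

for g, w in zip(groups, new_weights):
    print(f"{g}: weight {w:.2f}")
```

Each man ends up with a larger weight than each woman, because fewer men responded for the same population total. The result is only as good as the population totals supplied.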

Imputation
Most often carried out by census takers using detailed information from the census forms
Usually picking up data from other similar individual households or household members
More recently model-based methods have become popular (books by Little and Rubin, and by Schafer, are Bibles)
  – Very large literature on this now
  – And many sets of recommendations
  – e.g. make your imputation model large
  – Carry out multiple imputations and combine estimates; software to do this is available in Splus/R, STATA and SAS
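
The "combine estimates" step is usually done with Rubin's rules: average the estimates from the imputed data sets, then add the average within-imputation variance to the between-imputation variance inflated by (1 + 1/m). The sketch below assumes that rule and uses invented estimates; it covers only the combining step, not the imputation itself.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine results from m imputed data sets using Rubin's rules.

    Returns the pooled estimate and its total variance.
    """
    m = len(estimates)
    pooled_estimate = mean(estimates)
    within = mean(variances)       # average sampling variance
    between = variance(estimates)  # variance between imputations (m - 1 divisor)
    total_variance = within + (1 + 1 / m) * between
    return pooled_estimate, total_variance


# Hypothetical results from three imputed data sets.
estimates = [10.0, 12.0, 11.0]
variances = [4.0, 4.0, 4.0]

pooled_estimate, total_variance = pool_rubin(estimates, variances)

print(f"pooled estimate: {pooled_estimate:.2f}")
print(f"total variance:  {total_variance:.3f}")
```

The total variance is larger than the variance from any single imputed data set, which reflects the extra uncertainty from not knowing the missing values.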

Our experience
Working with data from the Edinburgh Study of Youth Transitions and Crime (Exemplar 6)
  – It is tricky to get imputation models right for real data
  – Things can go horribly wrong, especially if models are too big for the data
  – It's important to check things out
  – Choice of variables is more important than choice of models
  – We still have a lot to learn about this
  – Need to try these methods out on real data, not just simulated data

What we (I) have learned
There is a lot more to know about survey design and analysis, and new methods that need to be made available
The literature still does not provide definitive answers to some questions
But a lot of ground rules are well known
Survey software is developing and improving fast
It will do so even more if more people use it and feed back to the providers
Non-response remains an important problem
The jury is out as to whether and when post-stratification weighting, imputation or neither is the best approach to deal with non-response