
1 Horses for courses: Why different QT methods uncover different findings and implications for selecting methods
In recent decades there has been an increase in the Question Testing (QT) methods used to assess the quality of survey questions and processes. There is a growing body of literature on different question testing techniques. Despite this, little is known about how effective each QT method is, the kinds of problems different methods are likely to find, and what can be gained when they are combined. In this presentation I will report preliminary findings from a review of question testing studies carried out by the Questionnaire Development and Testing (QDT) Hub. The purpose of this review was to gain further insight into the efficacy of different question testing methods and when they should be used. ESRA 2013

2 Background (Section 1)

3 Research questions
What types of problem are detected using different QT methods?
Do different QT methods ever result in contradictory findings or inconsistent recommendations within a study? If so, why do these contradictory findings occur?
What are the implications for selecting a QT method?
In recent decades there has been an increase in the number of Question Testing (QT) methodologies used to assess the quality of survey questions. Various QT methods are now routinely adopted by survey agencies, including stakeholder focus groups, cognitive interviews, field tests, experiments and the validation of data against external sources. Whilst some scholars have written about the different QT methods available (e.g. Madans et al., 2011), little advice is available on how effective each QT method is in different circumstances, what types of problem different QT methods find (or fail to find), or how best to combine the different methods. This paper presents preliminary findings from a review of a selection of studies, carried out by the Questionnaire Development and Testing (QDT) Hub at the National Centre for Social Research, that used a combination of different pre-testing methods. The meta-analysis focuses on the kinds of data each method has provided and what this tells us about how such methods can be combined. This might give an insight into how to choose and combine QT methods.

4 Methodology (Section 2)

5 Qualitative Meta-analysis
Seven studies were identified that used 3+ QT methods:
Questions on social care
Questions on gender identity
Questions on receipt of state benefits
Questions on extremism
Questions on living with a disability
Travel diary testing
Collecting contact details/addresses
We decided to review our past studies. I will not go into the details of each study, as that would take up the whole slot. The studies vary in terms of:
Topic area
Types of question (factual/behavioural/attitudinal)
Data collection method (questions/diary)
Development stage (designing and testing new questions/testing existing survey questions)
Testing aims (specific wording/administration)

6 QT Methods used in the studies
Qualitative:
Respondent focus groups
Cognitive interviews
Interviewer feedback
Pilot/survey review focus groups
Expert panels: data commissioners, subject/policy experts, translators, survey administrators, survey methodologists
Quantitative:
Analysis of pilot data: non-response, comparison with existing data
Experiments: coding of formatting errors in old vs. new questions
Secondary analysis
Validation: data linkage/admin data address testing
Note the different audiences in the qualitative methods; note that the quantitative analysis is done on a small scale on pilot data, or on a larger scale on survey data.
QT reports and other documentation for each study were reviewed. Qualitative summaries of findings were entered into a matrix (a minimal sketch of this follows below). We compared findings within studies and then across studies, looking for patterns in findings between QT methods and discrepancies in findings within a study.
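To make the matrix approach concrete, here is a minimal Python sketch of how such a findings matrix can be represented and queried. The study names, method names and findings below are invented placeholders, not the actual QDT Hub data.

```python
# A minimal sketch of the findings matrix described above: rows are studies,
# columns are QT methods, and each cell holds a short qualitative summary.
# Study names, method names and findings are invented placeholders.

findings = {
    "Questions on social care": {
        "Cognitive interviews": "key term misunderstood by carers",
        "Interviewer feedback": "respondents asked what the key term meant",
        "Pilot analysis": "high item non-response on follow-up question",
    },
    "Travel diary testing": {
        "Cognitive interviews": "diary layout unclear",
        "Pilot analysis": "formatting errors in recorded trip times",
    },
}

def across_studies(findings, method):
    """Cross-study view: what one QT method found in every study."""
    return {study: cells[method]
            for study, cells in findings.items() if method in cells}

def within_study(findings, study):
    """Within-study view: what every QT method found in one study."""
    return findings[study]

# Pattern: do different methods converge on the same problem?
print(across_studies(findings, "Cognitive interviews"))
# Discrepancy: do methods disagree within a single study?
print(within_study(findings, "Questions on social care"))
```

Reading down a column surfaces patterns between QT methods; reading across a row surfaces discrepancies within a study.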

7 Key findings (Section 3)

8 Different methods have different aims
We found it difficult to make like-for-like comparisons of question performance using different methods: different issues are detected because each method is trying to find out different things.

9 Different aims at different stages
Question development cycle: 1. Pre-survey: Scoping → Design question → Evaluation (open) → Question refinement. 2. Post-survey: Diagnostic → Evaluation (explanatory).
Individual QT methods are used in different ways depending on the stage of question development. A QT method may work well in one development stage but not another.
Scoping: What questions should be asked? How should question concepts be broken down? How should questions be administered?
Evaluation (open): Is the new question working as intended? Refine and re-test.
Diagnostic: The survey is completed; what problems, if any, are occurring with the question data?
Evaluation (explanatory): Why are problems occurring? How can they be fixed?
Open evaluation is more of a 'blank slate', whereas explanatory evaluation is more focussed.

10 Overlap in findings using qualitative methods
Certain findings were global across focus groups, cognitive testing, interviewer feedback and expert panels:
Simplify language
Remove superfluous words
Gaps in respondent knowledge: provide information on purpose and benefits
It didn't matter which type of qualitative method was used (focus groups, cognitive testing, expert reviews or interviewer reviews); these types of problem could be detected by any of them. Could some of these have been identified prior to using any QT methods?

11 But… wording problems can go undetected in focus groups
Fewer types of wording issue were uncovered in focus groups compared to cognitive interviewing:
Poor comprehension can be hidden in a focus-group setting
Problems with question length/order are not always detected in focus groups

12 But… findings on administration differ depending on audience
Members of the general public may request a 'non-standardised' approach
Advice from interviewers or survey practitioners is more feasible
In evaluation, respondents can have unrealistic expectations: they want interviewer matching, completely open questions, a qualitative approach, and advance warning of all questions. Mention translation here.

13 Cognitive interviewing
More wording problems were detected in cognitive interviewing than in the other qualitative methods:
Question wording: specific phrases, clarifications and ambiguities not picked up elsewhere
Content and order of answer categories: overlap between categories, multi-codes given where single codes were expected, problems with agree/disagree scales
Visual presentation: placement of instructions, routing, navigation
But… willingness in cognitive interviews does not always match willingness in the pilot. Sampling? Context/interviewer effects?
The disparity in providing an email address was even higher (81% in the cognitive interviews, R=69, vs. 36% in the pilot). This occurred even though the cognitive test was much longer (cognitive interview: approx. one hour; Omnibus pilot: approx. 20 minutes). A sketch of how such a gap could be checked formally follows below.
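A gap like this can be tested with a two-proportion z-test. The sketch below makes two assumptions the slide does not confirm: that R=69 is the cognitive-interview sample size, and a notional pilot sample size of 200, since only percentages are reported.

```python
# Two-proportion z-test for the email-address willingness gap reported
# above. ASSUMPTIONS: R=69 is read as the cognitive-interview sample size;
# the pilot sample size (200) is an invented placeholder.
from math import sqrt, erf

def two_proportion_ztest(x1, n1, x2, n2):
    """Two-sided z-test for the difference between two proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)                         # pooled proportion
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))   # standard error
    z = (p1 - p2) / se
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))  # two-sided p
    return z, p_value

# 81% of 69 cognitive-interview respondents vs. 36% of a notional 200
# pilot respondents agreeing to provide an email address.
z, p = two_proportion_ztest(round(0.81 * 69), 69, round(0.36 * 200), 200)
print(f"z = {z:.2f}, p = {p:.4f}")
```

Under these assumptions the gap is far too large to be sampling noise, which is why context and interviewer effects are the more plausible explanations.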

14 Quantitative methods
Quantitative methods are more useful at assessing respondent willingness:
Item response/drop-off rates/validation
Characteristics of those who don't respond
Can quantify other data quality issues:
Number of respondents answering in incorrect formats
Consistency with other data sources
Piloting collects other types of information not found elsewhere: length of survey, paradata
Can detect issues but may not provide evidence of why issues are occurring, so triangulation with qualitative findings is needed (a sketch of these checks follows below)
Give the example of the NTS travel diary. As might be anticipated, the quantitative evaluations found different types of problem.
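Here is a minimal, hypothetical sketch of the pilot-data checks listed above: item non-response rates per question and a crude per-respondent drop-off point. The column names and data are invented for illustration; they are not from any of the studies reviewed.

```python
# Hypothetical pilot-data checks: item non-response per question and a
# crude drop-off point. Column names and data are invented placeholders.
import pandas as pd

pilot = pd.DataFrame({
    "q1_age":     [34, 51, None, 29, 62],
    "q2_benefit": ["yes", None, None, "no", "yes"],
    "q3_email":   [None, None, None, "a@b.com", None],
})

# Item non-response rate per question (fraction of missing answers).
print(pilot.isna().mean())

# Crude drop-off point: the last question each respondent answered,
# assuming columns are in administration order.
answered = pilot.notna()
last_answered = answered.apply(
    lambda row: row[row].index[-1] if row.any() else None, axis=1)
print(last_answered.value_counts())
```

In practice such rates would then be triangulated against the qualitative findings to explain why particular items are skipped or abandoned.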

15 Interviewer feedback under-utilised?
There was a high degree of overlap between interviewer feedback and cognitive testing in a repeated survey review:
Where respondents ask for clarification
Reasons for asking for clarification
Language used by respondents
Respondent willingness/drop-off triggers
FRS interviewer focus groups. Interviewer feedback can be part of a pilot or a review of an existing survey, and is useful in explanatory evaluations.

16 Reflections and future directions (Section 4)
Just a few snippets of top-level findings; please ask me questions if you would like further detail.

17 Conclusions
The combination of QT methods to use is dependent on the aims and the development stage.
Respondent focus groups and expert reviews are useful in the scoping stage, e.g. to define areas of interest, break down concepts and find pragmatic solutions.
Quantitative analysis of survey data is required for the diagnostic stage, e.g. non-response, validation.
Multiple methods can be used at the evaluation stages, given clear aims and a rationale: the method selected should match the aim of the testing.
I will talk more about the evaluation stages now…

18 Future directions
Direct comparisons were difficult as this was not an experiment:
Changes to questions were made between methods
Only one study compared differences in data pre-evaluation and post-evaluation; the others only piloted the refined questions (not before and after)
Recommendations:
Experiments testing identical questions with different QT methods
Greater experimental comparison of questions pre-testing and post-testing
We are comparing apples and pears (statistics and qualitative data). Pragmatically, we don't want to test questions we think won't work. We need to demonstrate evidence of the impact of QT.

19 Thank you
If you want further information or would like to contact the author:
Jo d'Ardenne, Senior Survey Methodologist
T. (+44) E.
Visit us online:

