Using survey data collection as a tool for improving the survey process Silvia Biffignandi, Antonio Laureti Giulio Perani University of Bergamo Istat Istat EESW11,European Establishment Statistics Workshop, September 2011, Neuchatel, Swiss
Outline Introduction Survey Questionnaire Preliminary results Research in progress
Introduction: Paradata and web surveys Web surveys allow for the collection of paradata during web questionnaire completion; paradata are data generated during the fieldwork of the survey. Two types of paradata: server-side paradata are collected by software tools running at the server. They relate mainly to the questionnaire compilation process. This data are contained in the so called logfiles. client-side paradata describe how the respondents are answering the questions (order, questions skipped, keys that have been pressed and so on). (Biffignandi and Bethlehem, 2012; Biffignandi, 2010) Using survey data collection as a tool for improving the survey process EESW11,European Establishment Statistics Workshop, September 2011, Neuchatel, Swiss
Analysis of paradata for: questionnaire improvement (identification of the most difficult questions in the survey form, of some missing (or too restrictive) checks ) quantification of the burden on respondents (from the degree of ready availability of R&D data in enterprises, to the time needed to fill in the questionnaire ) timeliness? costs? identification of the characteristics of late respondents. Using survey data collection as a tool for improving the survey process Introduction: Tasks of paradata analysis EESW11,European Establishment Statistics Workshop, September 2011, Neuchatel, Swiss
The survey business R&D activities yearly, census target population: enterprises known, or assumed, to be R&D performers many administrative sources to identify the target population (20000 “potential R&D performers” ) (previous Istat R&D surveys, other Istat business surveys with R&D-related questions, ASIA business register, fiscal data, Italian Register of R&D performing institutions,data on national and EU funding to research projects, patent databases, private business reports) no a cut-off point in term of size, i.e. considered all enterprises (including micro- enterprises, provided that they are employing at least one researcher) reference year 2008; overall response rate 55% Using survey data collection as a tool for improving the survey process EESW11,European Establishment Statistics Workshop, September 2011, Neuchatel, Swiss
Future R&D performers (who are just asked to report about their future investment plans) Potential R&D performers Questionnaire Actual R&D performers (those who have to provide extensive information on the R&D activities undertaken in the reference year) Actual R&D performers (those who have to provide extensive information on the R&D activities undertaken in the reference year) Using survey data collection as a tool for improving the survey process EESW11,European Establishment Statistics Workshop, September 2011, Neuchatel, Swiss Non-R&D performers (no data requested)
R&D personnel Actual R&D performers Actual R&D performers Questionnaire R&D expenditures R&D expenditures Using survey data collection as a tool for improving the survey process EESW11,European Establishment Statistics Workshop, September 2011, Neuchatel, Swiss Qualitative questions on the R&D projects
Web survey administration Advanced web questionnaire design questionnaire mainly based on previous paper questionnaire innovative software architecture : two physically distinct components electronic form: delivered via Web on the remote PC of the respondent checking tool; resident in a Web-server in the Istat’s premises. This is a tool for the identification of the respondent and of the correct sequencing of the questionnaire’s provision. Using survey data collection as a tool for improving the survey process EESW11,European Establishment Statistics Workshop, September 2011, Neuchatel, Swiss
Available paradata Using survey data collection as a tool for improving the survey process EESW11,European Establishment Statistics Workshop, September 2011, Neuchatel, Swiss Timing of use: Entry and exit date. Days in which the tool has been accessed. Hours of use during the day. Intensity of use: Number and typology of actions (events). Processing time associated to each single action. Number of actions per unit of time (e.g. per day or fraction of day). Outcome of the compilation process. Compilation routes: Sequencing in accessing the questions. Changes to previously compiled questions. Generation of errors: Number and typology of errors. Questions (or groups of questions) mostly affected by errors.
Definitions “ events “ events”: each time a respondents is interacting with the electronic form (basically, by typing a figure or a word, or even, by scrolling down the form itself). We can identify each single “event” by its nature and by the time when it took place, as well as by its duration in time (usually, from a fraction to a few seconds). Events could be either “correct” (according to the rationale behind the structure of the questionnaire) or leading to the generation of “errors”. The questionnaire is “completed” only when all “errors” are properly fixed, the generation of “errors” will have as a result the need for new events aimed at correcting them. In the process of numbering the errors produced by each respondent, only the first appearance of an error type was taken into consideration, even in the event that the same error type would have been repeated in more than one session. Using survey data collection as a tool for improving the survey process EESW11,European Establishment Statistics Workshop, September 2011, Neuchatel, Swiss
Definitions Each event – or a group of events – is associated to an “access”. “access”: itself to the questionnaire cannot be identified in a straightforward way as “log-ins” and “log-outs” are not recorded in the system. We know that a respondents is (was) connected to the server because of the activity carried out on the questionnaire. An access without any activity on the questionnaire will not be taken into consideration. An access has to be qualified in terms of time. Conventionally, all events taking place within one hour (or three, or six hours) could form an “access”. “daily accesses” Only “daily accesses” are taken into consideration in this study, i.e. an “access” will be equivalent to the set of all events having taken place in a day. Using survey data collection as a tool for improving the survey process EESW11,European Establishment Statistics Workshop, September 2011, Neuchatel, Swiss
Using survey data collection as a tool for improving the survey process EESW11,European Establishment Statistics Workshop, September 2011, Neuchatel, Swiss Results. Results. Business R&D survey 2008 data collection. Paradata indicators: Number of respondents*, daily accesses and errors. Paradata indicators Actual R&D performers Future R&D performers Non R&D performers Respondents who have accessed the questionnaire without completing it Total N. of respondents* Number of events Average n. events per respondent 236,17 40,83 28,16 16,28 154,49 Number of daily accesses Average n. accesses per respondent 1,72 1,38 1,25 1,21 1,54 Average number of events per access 137,39 29,49 22,50 13,49 100,28 Number of errors** Average number of errors per respondent 18,57 3,21 1,71 1,44 12,03
Using survey data collection as a tool for improving the survey process EESW11,European Establishment Statistics Workshop, September 2011, Neuchatel, Swiss Results
Type of errors. Using survey data collection as a tool for improving the survey process EESW11,European Establishment Statistics Workshop, September 2011, Neuchatel, Swiss
Daily accesses EESW11,European Establishment Statistics Workshop, September 2011, Neuchatel, Swiss
Final delivery of questionnaire EESW11,European Establishment Statistics Workshop, September 2011, Neuchatel, Swiss
Actual R&D performers: events accesses and errors Using survey data collection as a tool for improving the survey process EESW11,European Establishment Statistics Workshop, September 2011, Neuchatel, Swiss
Research in progress EESW11,European Establishment Statistics Workshop, September 2011, Neuchatel, Swiss deeper analyses on delivering time and on sectoral break-down more detailed cross checking of the errors and mapping data processing of the recent R&D survey edition conclusions on questionnaire improvement conclusions/suggestions for more efficient follow up strategies
EESW11,European Establishment Statistics Workshop, September 2011, Neuchatel, Swiss Suggestions? ….. Thank you for your attention! Thank you for your attention!