Survey Training Pack Session 2 – Data Analysis Plan
Before questionnaire?
Activities:
– Questionnaire -> Data Collection -> Analysis -> Results
Planning:
– Objectives -> Analysis Plan -> Questionnaire
Plan what you want to produce to ensure you collect the right data
Link to objectives
“Information presented as a result of the study must correspond to the information needs specified in the objectives”
Revisit objectives as the Analysis Plan is developed
Iterative process, not a simple sequence of steps
Why is it so important?
It is a good way to make sure that your tool captures all of the information/indicators that you need
It also helps you identify your specialised technical assistance needs for sampling and analysis
Data Analysis Plan:
– What do I need to know?
– How does the information need to be presented?
– What data points do I therefore need?
– Does the information help me respond to my original questions?
What is a data analysis plan?
A collection of ALL the tables and graphs required to meet study objectives (i.e. indicators), which ideally provide a good representation of the data collection tool
What is a data analysis plan?
A data analysis plan includes the following elements:
– Layout of the tables; detailed specification of the graphs required
– Numerator, denominator and unit of analysis as well as the unit of measurement for each table
– Data variables that need to be created at the stage of the analysis to produce specified output tables/graphs
– What percentage calculations are most useful? (i.e. row and column percentage)
– The need for statistical analysis that is more advanced than basic descriptive analysis (frequencies, means, medians, percentages)
– Confidence intervals required (generally 95% confidence intervals)
How is it structured?
Your information requirements (i.e. indicators) inform the structure of a data analysis plan, not the structure of the questionnaire
Ideally, a DAP is structured in this way:
– A list of your study objectives and indicators;
– A few tables that describe your sample;
– Specific ESSENTIAL tables that address all the indicators you need to measure;
– These tables are structured in the way in which you need to report the data;
– Sometimes the tables follow the structure of the questionnaire, grouped by theme
How do you check your DAP?
Test the link between your analysis and your study objectives: which table(s) allow you to report against the indicators that this survey is intended to measure?
Are there any output tables in your analysis which are not relevant to these indicators? If so, what purpose do these tables serve?
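One way to run this check is to record, for each planned table, the indicators it reports against, and then look for gaps in both directions. The sketch below is illustrative only: the indicator names and table labels are hypothetical placeholders, not taken from the rice survey plan.

```python
# Hypothetical indicators and tables, for illustration only.
indicators = {"adoption_rate", "average_yield", "phone_ownership"}

# Each planned output table lists the indicator(s) it reports against.
tables = {
    "Table A": {"adoption_rate"},
    "Table B": {"average_yield"},
    "Table C": set(),  # no linked indicator: what purpose does it serve?
}

def check_plan(indicators, tables):
    """Return (indicators with no table, tables with no relevant indicator)."""
    covered = set().union(*tables.values()) if tables else set()
    missing = sorted(indicators - covered)
    unlinked = sorted(
        name for name, linked in tables.items() if not (linked & indicators)
    )
    return missing, unlinked

missing, unlinked = check_plan(indicators, tables)
print("Indicators without a table:", missing)
print("Tables not linked to an indicator:", unlinked)
```

An indicator without a table points to a gap in the plan; a table without an indicator is a candidate for removal unless it serves another stated purpose, such as describing the sample.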
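Percentage calculation
Table 74: Percentage of RICE farmers who own mobile phones disaggregated by land potential, sex of respondent and type of training
Columns: type of training (PLAR, MAtT) and All Trainees, each split by sex of respondent.
Land potential | PLAR Male | PLAR Female | PLAR Total | MAtT Male | MAtT Female | MAtT Total | All Male | All Female | All Total
1 | % | % | % | % | % | % | % | % | %
2 | % | % | % | % | % | % | % | % | %
3 | % | % | % | % | % | % | % | % | %
4 | % | % | % | % | % | % | % | % | %
5 | % | % | % | % | % | % | % | % | %
All land types | N (%) | N (%) | N (%) | N (%) | N (%) | N (%) | N (%) | N (%) | N (100%)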
Example with dummy data
Table 74: Percentage of RICE farmers who own mobile phones disaggregated by land potential, sex of respondent and type of training
Row percentages (M = male, F = female, T = total):
Land potential | PLAR M | PLAR F | PLAR T | MAtT M | MAtT F | MAtT T | All Trainees M | All Trainees F | All Trainees T
1 | 19 (90.5%) | 2 (9.5%) | 21 (100%) | 14 (87.5%) | 2 (12.5%) | 16 (100%) | 33 (89.2%) | 4 (10.8%) | 37 (100%)
All land types | 42 (80.8%) | 10 (19.2%) | 52 (100%) | 82 (84.5%) | 15 (15.5%) | 97 (100%) | 124 (83.2%) | 25 (16.8%) | 149 (100%)
Example with dummy data
Table 74: Percentage of RICE farmers who own mobile phones disaggregated by land potential, sex of respondent and type of training
Column percentages (M = male, F = female, T = total):
Land potential | PLAR M | PLAR F | PLAR T | MAtT M | MAtT F | MAtT T | All Trainees M | All Trainees F | All Trainees T
1 | 19 (45.2%) | 2 (20.0%) | 21 (40.4%) | 14 (17.1%) | 2 (13.3%) | 16 (16.5%) | 33 (26.6%) | 4 (16.0%) | 37 (24.8%)
All land types | 42 (100%) | 10 (100%) | 52 (100%) | 82 (100%) | 15 (100%) | 97 (100%) | 124 (100%) | 25 (100%) | 149 (100%)
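The difference between the two calculations is only the choice of denominator. As a minimal sketch, the code below recomputes both kinds of percentage for the PLAR group from the dummy counts (land potential 1 and the "All land types" totals); the function names are illustrative, not part of any survey tool.

```python
# Counts for the PLAR group, copied from the dummy data (M = male, F = female).
counts = {
    "1": {"M": 19, "F": 2},
    "All land types": {"M": 42, "F": 10},
}

def pct(numerator, denominator):
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)

def row_percentages(row):
    """Denominator = row total (all respondents in that land category)."""
    total = sum(row.values())
    return {sex: pct(n, total) for sex, n in row.items()}

def column_percentages(row, column_totals):
    """Denominator = column total (all respondents of that sex)."""
    return {sex: pct(n, column_totals[sex]) for sex, n in row.items()}

row_pct = row_percentages(counts["1"])
col_pct = column_percentages(counts["1"], counts["All land types"])
print(row_pct)  # {'M': 90.5, 'F': 9.5}
print(col_pct)  # {'M': 45.2, 'F': 20.0}
```

Row percentages describe the sex split within a land category; column percentages describe how each sex is distributed across land categories. Which one is useful depends on the comparison the indicator calls for.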
Exercise
Part 1: review analysis plan for rice survey
– For each table, determine if it is needed. If so, how is this table linked to the study objectives?
– For needed tables, indicate the denominator.
Part 2: survey quality assessment framework
– Which questions can you answer with respect to the rice survey?
Feedback on exercise
Plenary session on part 1 and 2
Part 1: review analysis plan for rice survey
– The University of Reading has reviewed the data analysis plan and provided comments
– A proposed revised data analysis plan is included
Part 2: survey quality assessment framework
– Which questions can you answer with respect to the rice survey?
What is an output table?
A table that presents data/information that has been collected using a quantitative data collection process and processed/analysed using statistical tools
The purpose of an output table is to organise data/information and facilitate its interpretation
Therefore, the way in which you structure the output table is quite important
What is its structure?
An output table includes the following elements:
– A title and associated subtitles that describe the contents of the table (i.e. unit of analysis, numerator, denominator, unit of measurement and disaggregation)
– The number of cases (N/n) that are relevant to the table as a whole or sections of the table
– Estimates (i.e. percentages, mean, median or ratio) that are relevant to sections of the table
[Figure: Good example / Bad example]
What do you notice about this table?
What about disaggregation?
Disaggregation is the process of breaking down and analysing data by subcategories to detect differences and similarities for a given criterion (e.g. sex of rice or sesame trainee)
If you choose to disaggregate the data, it means that (1) you are interested in comparing subcategories; or (2) you expect a difference between given subcategories
Provide an example of a table in the 2013 rice analysis which is disaggregated because:
– We are interested in comparing subcategories
– We expect a difference between given subcategories
Word of caution
Disaggregation is most meaningful when your levels of precision are similar across your categories
When the number of cases is small, comparisons drawn may be inconclusive owing to low levels of precision
In this example, it is difficult to compare both groups because the data is not precise enough.
[Figure panels: MEN, WOMEN]
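The effect of a small number of cases on precision can be seen by computing a 95% confidence interval for the same observed percentage at two group sizes. This is a rough sketch under stated assumptions: it uses the simple normal approximation for a proportion, ignores any survey design effect (clustering, weighting), and the counts are hypothetical rather than taken from the rice survey.

```python
import math

def proportion_ci(successes, n, z=1.96):
    """Approximate 95% CI for a proportion (normal approximation)."""
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    # Clamp to the valid range for a proportion.
    return max(0.0, p - half_width), min(1.0, p + half_width)

# Hypothetical counts: same observed percentage (50%), different group sizes.
small = proportion_ci(10, 20)
large = proportion_ci(200, 400)
print("n = 20: ", [round(100 * x, 1) for x in small])
print("n = 400:", [round(100 * x, 1) for x in large])
```

With 20 cases the interval runs from roughly 28% to 72%; with 400 cases it narrows to roughly 45% to 55%. A difference between two subgroups is hard to interpret when one of them has an interval as wide as the first.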
Producing an output table
Develop an output table for the following indicators:
– % of directly trained rice and sesame producers adopting at least 3 CRSP(T) promoted technologies during the last season, disaggregated by type of crop
– Average yield for rice and sesame during the last season, disaggregated by type of crop and beneficiary status
Please specify the unit of analysis, denominator, numerator and disaggregation
For each table, how would you calculate the %/mean?
Please let us know if you think confidence intervals should be included.
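As a starting point for the calculation question, the sketch below shows one possible way to compute a disaggregated percentage and a disaggregated mean. Everything in it is an assumption made for illustration: the records are invented, the field names (`crop`, `status`, `n_tech`, `yield`) and the status values (`direct`, `indirect`) are hypothetical, and the yield unit is left unspecified.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records, one per producer (the assumed unit of analysis).
records = [
    {"crop": "rice", "status": "direct", "n_tech": 4, "yield": 2.0},
    {"crop": "rice", "status": "direct", "n_tech": 1, "yield": 1.0},
    {"crop": "rice", "status": "indirect", "n_tech": 0, "yield": 1.5},
    {"crop": "sesame", "status": "direct", "n_tech": 3, "yield": 0.5},
    {"crop": "sesame", "status": "direct", "n_tech": 3, "yield": 1.0},
    {"crop": "sesame", "status": "indirect", "n_tech": 2, "yield": 0.25},
]

def adoption_by_crop(records, threshold=3):
    """Per crop: (numerator, denominator, %) among directly trained producers."""
    groups = defaultdict(list)
    for r in records:
        if r["status"] == "direct":  # denominator: directly trained only
            groups[r["crop"]].append(r["n_tech"] >= threshold)
    return {
        crop: (sum(flags), len(flags), round(100 * sum(flags) / len(flags), 1))
        for crop, flags in groups.items()
    }

def mean_yield(records):
    """Per (crop, status): (number of cases, mean yield)."""
    groups = defaultdict(list)
    for r in records:
        groups[(r["crop"], r["status"])].append(r["yield"])
    return {key: (len(values), mean(values)) for key, values in groups.items()}

adoption = adoption_by_crop(records)
yields = mean_yield(records)
print(adoption)
print(yields)
```

Returning the numerator and denominator alongside each estimate keeps the number of cases visible, which is what a reader needs in order to judge whether a comparison between subgroups is precise enough.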
Summary
– Create the data analysis plan before the questionnaire
– Link the analysis plan to study objectives/indicators
– Analysis plan and study objectives/indicators guide the development of the questionnaire