Published by Luke Norris. Modified over 8 years ago.
1
Recap of data analysis and procedures
Food Security Indicators Training, Bangkok, 12-17 January 2009
2
Objective: to provide a quick overview of the topics covered during the previous trainings and in the pre-training material, and to refresh our memory of key coefficients and procedures.
3
Data management in SPSS
- Import/export data into/from SPSS
- Merge datasets
- Label variables / values
- Clean datasets
- Recode, Compute
- Sort and Select cases
4
Data analysis in SPSS
- Compute descriptives (mean, median, etc.)
- Run frequencies, crosstabulations, multiple response analysis, etc.
- Compare means, run T-tests and ANOVA
- Analyse associations (r, regression, Chi-square)
5
Basic statistics
Two types of variables:
1. Continuous: take numeric values (e.g., household size)
2. Categorical: categories are denoted by numbers that have no numeric meaning (e.g., ethnic group)
The type of analysis depends upon the nature of the variables.
6
Continuous Variables: descriptives
- Mean: the sum of all the values divided by the number of cases. A measure of central tendency.
- Median: the middle value of a set of observations ranked in order. A measure of central tendency.
- Mode: the value that occurs most frequently.
- Standard deviation: roughly the average distance of the observations from the mean of the distribution. A measure of the heterogeneity (spread) of the distribution.
7
Continuous Variables: descriptives
Looking at the age data from 10 individuals…
Individual: 1 2 3 4 5 6 7 8 9 10
Age: 12 19 23 26 28 28 28 34 36 38
- What is the range? 12 to 38
- What is the mean? 27.2
- What is the median? 28
- What is the mode? 28
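The answers on this slide can be checked with a minimal Python sketch using only the standard library. Eight of the ten ages survive in the extracted table; the sixth and seventh values are assumed here to be 28, which is the only choice consistent with the stated mean (27.2), median (28) and mode (28).

```python
import statistics

# Ages of the 10 individuals, in ranked order.
# Assumption: the 6th and 7th values (missing from the extracted table) are 28,
# the only values consistent with the mean, median and mode given on the slide.
ages = [12, 19, 23, 26, 28, 28, 28, 34, 36, 38]

age_range = (min(ages), max(ages))      # smallest and largest observation
mean_age = statistics.mean(ages)        # sum of values / number of cases
median_age = statistics.median(ages)    # middle of the ranked values
mode_age = statistics.mode(ages)        # most frequent value
sd_age = statistics.stdev(ages)         # sample standard deviation

print("range:", age_range)
print("mean:", mean_age)
print("median:", median_age)
print("mode:", mode_age)
print("standard deviation:", round(sd_age, 2))
```

Note that `statistics.stdev` uses the sample formula (dividing by n - 1); `statistics.pstdev` would give the population version.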
8
Categorical Variables: descriptives
- Simple frequencies: the distribution of a variable.
- Cross-tabulations: the distribution of a variable within the categories of another variable (e.g., the same total, but also the distribution for each region).
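As an illustration of the difference between a simple frequency and a cross-tabulation, here is a small Python sketch. The household records, region names and crop names are hypothetical and are not taken from the training data.

```python
from collections import Counter

# Hypothetical records: (region, main crop) for six households.
households = [
    ("North", "maize"), ("North", "rice"), ("North", "rice"),
    ("South", "maize"), ("South", "maize"), ("South", "rice"),
]

# Simple frequency: distribution of the main crop over all households.
freq = Counter(crop for _, crop in households)

# Cross-tabulation: distribution of the main crop within each region.
crosstab = Counter(households)
region_totals = Counter(region for region, _ in households)

def row_percent(region, crop):
    """Percentage of households in `region` whose main crop is `crop`."""
    return 100 * crosstab[(region, crop)] / region_totals[region]

print(dict(freq))
for region in sorted(region_totals):
    for crop in sorted(freq):
        print(region, crop, round(row_percent(region, crop), 1))
```

The overall totals are identical for both crops, yet the regional breakdown differs, which is exactly what the cross-tabulation is meant to reveal.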
9
Categorical Variables: multiple responses
When respondents give more than one answer (e.g., report the 3 main crops cultivated), we can analyse the responses as a set, using:
1. Percentages based on responses, or
2. Percentages based on cases
10
Categorical Variables: multiple responses
The percentage based on cases (HHs) tells us the prevalence (%) of HHs that cultivate a specific crop, regardless of the order in which it was reported. The denominator is the number of households. E.g., 100% of the HHs cultivate maize (3/3*100).
11
Categorical Variables: multiple responses
The percentage based on responses (crops) compares one crop against all the cultivated crops. Here the denominator is the total number of crops reported. E.g., maize represents 30% of the cultivated crops (3/10*100).
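The two denominators can be compared in a short Python sketch. The household answers below are hypothetical; they are chosen only so that maize is reported by 3 of 3 households and accounts for 3 of 10 responses, as in the slide examples (which requires one household to list four crops).

```python
from collections import Counter

# Hypothetical multiple-response data: crops reported by each household.
responses = {
    "HH1": ["maize", "rice", "beans"],
    "HH2": ["maize", "cassava", "rice"],
    "HH3": ["maize", "beans", "cassava", "sorghum"],
}

# Count each crop once per household, disregarding the order of mention.
counts = Counter(crop for crops in responses.values() for crop in set(crops))

n_cases = len(responses)            # denominator 1: households
n_responses = sum(counts.values())  # denominator 2: all reported crops

pct_of_cases = {crop: 100 * n / n_cases for crop, n in counts.items()}
pct_of_responses = {crop: 100 * n / n_responses for crop, n in counts.items()}

print("maize, % of cases:", pct_of_cases["maize"])
print("maize, % of responses:", pct_of_responses["maize"])
```

Percentages of cases can sum to more than 100 across crops, while percentages of responses always sum to 100.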
12
Significance tests
Is the relationship observed by chance, or because there actually is a relationship between the variables?
- (Independent) T-test: to see whether two means are different and whether the difference is statistically significant (i.e., exists also in the population).
- ANOVA (and post-hoc tests): to see whether there are statistically significant differences between the means computed on 3 (or more) groups, and which means differ.
- Chi-square: to see whether there is a statistically significant association between two categorical variables. It does not tell us how strong the association is!
13
T-test: assumptions
The independent T-test works well if:
- the variable being compared is continuous;
- the groups to compare are composed of different people within each group;
- the variable’s values are normally distributed;
- there is the same level of homogeneity (similar variance) in the 2 groups.
14
T-test: SPSS procedure
- Drag the variables into the proper boxes
- Define the values for the independent variable
15
T-test: SPSS output
Look at Levene’s Test:
- If the Sig. value of the test is less than .05, the groups have different variances. Read the row “Equal variances not assumed”.
- If the Sig. value of the test is greater than .05, read the row labelled “Equal variances assumed”.
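To show what the two rows of the output correspond to, the following Python sketch computes the t statistic and degrees of freedom both ways: with a pooled variance (equal variances assumed) and with Welch's correction (equal variances not assumed). The function name and the two groups are hypothetical, and p-values are left out because the standard library has no t distribution; a statistics package would normally supply them.

```python
import math
import statistics

def t_statistics(group_a, group_b):
    """Return (t, df) under both variance assumptions for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance of group A
    var_b = statistics.variance(group_b)  # sample variance of group B
    diff = statistics.mean(group_a) - statistics.mean(group_b)

    # Row "Equal variances assumed": pooled variance, df = n_a + n_b - 2.
    df_pooled = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df_pooled
    t_pooled = diff / math.sqrt(pooled_var * (1 / n_a + 1 / n_b))

    # Row "Equal variances not assumed": Welch's t with adjusted df.
    se2_a, se2_b = var_a / n_a, var_b / n_b
    t_welch = diff / math.sqrt(se2_a + se2_b)
    df_welch = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return {"equal_var": (t_pooled, df_pooled), "unequal_var": (t_welch, df_welch)}

# Hypothetical scores for two independent groups of households.
group_a = [2, 4, 6, 8]
group_b = [1, 2, 3, 4]
result = t_statistics(group_a, group_b)
print(result)
```

With equal group sizes the two t values coincide and only the degrees of freedom differ; with unequal sizes and variances the two rows can lead to different conclusions, which is why Levene's test is checked first.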
16
ANOVA: SPSS procedure
1. Analyze; Compare Means; One-Way ANOVA
2. Drag the independent and dependent variables into the proper boxes
3. Ask for the descriptives
4. Click on OK
17
ANOVA: SPSS output
Along with the mean for each group, ANOVA produces the F-statistic. It tells us whether there are differences between the means. It does not tell us which means are different. Look at the F value and at the Sig. level.
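The F-statistic is the ratio of the between-group to the within-group mean square. A minimal Python sketch of that calculation follows; the function name and the three groups are hypothetical, and the Sig. level is not computed because it requires the F distribution.

```python
import statistics

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of values."""
    all_values = [value for group in groups for value in group]
    grand_mean = statistics.mean(all_values)
    k, n = len(groups), len(all_values)

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(group) * (statistics.mean(group) - grand_mean) ** 2 for group in groups
    )
    # Variation of the observations around their own group mean.
    ss_within = sum(
        (value - statistics.mean(group)) ** 2 for group in groups for value in group
    )

    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical values for three groups.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_between, df_within = one_way_anova_f(groups)
print(f_stat, df_between, df_within)
```

A large F only says that at least one mean differs from the others; identifying which pairs differ is the job of the post-hoc tests.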
18
Pairwise comparisons: SPSS output
Once you have decided which post-hoc test is appropriate:
- Look at the column “mean difference” to know the difference between each pair of means.
- Look at the column Sig.: if the value is less than .05, then the two means in that pair are significantly different.
19
Chi-square: SPSS output
Look at the row labelled ‘Sig.’
- If it is higher than 0.05 → the association is not statistically significant
- If it is lower than 0.05 → the association is statistically significant
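The statistic behind that Sig. value compares observed counts with the counts expected if the two variables were unrelated. A sketch in Python, with a hypothetical function name and a hypothetical 2x2 table, is shown below. Instead of a p-value it uses the standard critical value of 3.841 for 1 degree of freedom at the 0.05 level.

```python
def chi_square(table):
    """Return (Pearson chi-square, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Hypothetical 2x2 table: rows are two regions, columns are two categories.
table = [[10, 20], [20, 10]]
chi2, df = chi_square(table)

CRITICAL_VALUE_DF1 = 3.841  # chi-square critical value for df = 1, alpha = 0.05
significant = chi2 > CRITICAL_VALUE_DF1
print(round(chi2, 3), df, significant)
```

As the slides note for this test, a significant result says nothing about how strong the association is.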
20
Association/causality
Is there a relationship between two variables?
- Correlation: to measure the association between two continuous variables.
  Pearson’s r = -1 → perfect negative relationship
  Pearson’s r = 1 → perfect positive relationship
  Pearson’s r = 0 → no linear relationship at all
- Regression analysis: to measure how a one-unit change in an independent continuous variable affects the value of the dependent continuous variable.
  Regression equation: Y = a + b x, where a is the intercept and b is the slope.
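The three reference values of Pearson's r can be reproduced with a few lines of Python. This is a sketch with a hypothetical function name and made-up data, not output from the training dataset.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sums of cross-products and of squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

x = [1, 2, 3, 4]
r_positive = pearson_r(x, [2, 4, 6, 8])    # y rises with x
r_negative = pearson_r(x, [8, 6, 4, 2])    # y falls as x rises
r_none = pearson_r(x, [1, -1, -1, 1])      # no linear relationship

print(r_positive, r_negative, r_none)
```

The last case shows that r = 0 rules out only a linear relationship: the y values follow a clear U-shaped pattern that r does not detect.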
21
Correlation: SPSS output
22
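Simple linear regression: SPSS output
Y = FCS; a = 38.482; b = 14.101; x = wealth index
Using this output, we can use the regression equation (Y = a + b x) to measure the FCS change for each one-unit change of the wealth index: FCS = 38.482 + 14.101 × wealth index, so the predicted FCS rises by 14.101 for each one-unit increase in the wealth index.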
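Plugging the two coefficients from the slide into the regression equation gives a predicted FCS for any wealth index value. The short Python sketch below does this; the function name and the wealth index values are illustrative assumptions.

```python
# Coefficients reported on the slide.
A = 38.482  # intercept: predicted FCS when the wealth index is 0
B = 14.101  # slope: change in predicted FCS per one-unit change in wealth index

def predicted_fcs(wealth_index):
    """Predicted FCS from the regression equation Y = a + b x."""
    return A + B * wealth_index

# Illustrative wealth index values (not taken from the dataset).
for wealth_index in (-1.0, 0.0, 1.0):
    print(wealth_index, round(predicted_fcs(wealth_index), 3))
```

Moving from a wealth index of 0 to 1 raises the predicted score from 38.482 to 52.583, which is the slope of 14.101.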
23
What’s next? Having good data analysis skills is a good starting point, but... one of the objectives of the training will be to apply your data analysis skills to quantitative food security analysis.