Some statistics questions answered: "Well, am I significantly more popular than Gordon Brown?"
Requests: How to report statistical results in APA format. How to calculate SD and SE by hand. An explanation of levels of measurement. How to calculate t-tests, and how to interpret ANOVA output.
How to report statistical results: Ultimate authority: American Psychological Association. (2010). Publication Manual of the American Psychological Association (6th ed.). 1. Descriptive statistics: Mean and Standard Deviation: Report numbers to two decimal places; exclude unnecessary zeroes. "The children's reading test performance was fairly poor (M = 45.63, SD = 12.28)". "The children's mean reading test score was 45.63 (SD = 12.28)". Percentages: No decimal places. "Almost all (97%) of the sample liked Sooty".
2. Statistical test results: Varies between journals to some extent. Essentials - name of test; test statistic; d.f. or N (depending on test); probability level. Chi-Square: χ²(2, N = 17) = 8.85, p < .05. Pearson's correlation: r(55) = .49, p < .01. Spearman's correlation: rs(95) = .36, p < .02. T-tests: t(54) = 5.43, p < .001. ANOVA: F(1, 145) = 5.43, p = .02 (between-groups df, within-groups df). Mann-Whitney: SPSS converts U into a z score, so: z = 1.97, p < .05. By hand: U(N = 17) = 4.00, p < .005. Wilcoxon: again, SPSS converts the result into a z score. Kruskal-Wallis: SPSS converts H into a Chi-Square value, so χ²(2, N = 17) = 10.58, p < .01. By hand, with small Ns, report H.
Standard deviation and standard error: Standard deviation (SD): a measure of how much scores are spread out around their mean. The bigger the SD, the less representative the mean is of the set of scores from which it was calculated. 1. Find the mean. 2. Find difference between each score and the mean. 3. Square the differences. 4. Add them up. 5. Divide by number of scores. 6. Take the square root of the result. In practice - use the "square root" function on your calculator!
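The six steps above can be sketched in Python. This is only an illustration: the function name and the example scores are made up, and it divides by the number of scores exactly as the steps say (an estimate of a population SD from a sample would divide by N - 1 instead).

```python
import math

def standard_deviation(scores):
    """SD calculated by hand, following the six steps (divides by N)."""
    mean = sum(scores) / len(scores)             # 1. find the mean
    differences = [x - mean for x in scores]     # 2. difference from the mean
    squared = [d ** 2 for d in differences]      # 3. square the differences
    total = sum(squared)                         # 4. add them up
    variance = total / len(scores)               # 5. divide by number of scores
    return math.sqrt(variance)                   # 6. take the square root

example_scores = [2, 4, 4, 4, 5, 5, 7, 9]        # made-up data, mean = 5
print(standard_deviation(example_scores))         # 2.0
```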
Standard deviation and standard error: Standard error of the mean (SE): a measure of how much sample means are spread out around the population mean. The bigger the SE, the less confident we can be that the sample mean truly reflects the population mean. 1. Find the standard deviation. 2. Divide it by the square root of the number of scores. (Figure: the means of different samples, from successive attempts, scattered around the actual population mean.)
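The two steps for the standard error can be sketched the same way. The function name and scores are made up for illustration, and the sketch assumes the SD that divides by the number of scores (`statistics.pstdev`); many textbooks use the N - 1 version (`statistics.stdev`) here instead, which gives a slightly larger SE.

```python
import math
import statistics

def standard_error(scores):
    """SE of the mean: the SD divided by the square root of N."""
    sd = statistics.pstdev(scores)        # 1. find the standard deviation
    return sd / math.sqrt(len(scores))    # 2. divide by sqrt(number of scores)

example_scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data, SD = 2
print(round(standard_error(example_scores), 2))
```

With the same SD, a larger sample gives a smaller SE, which is why bigger samples give more trustworthy means.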
Tests you need to be able to calculate by hand (for section 2 of the exam): Chi-Square test of association, Chi-Square goodness of fit; Pearson's r, Spearman's rho; Wilcoxon, Mann-Whitney; Friedman's, Kruskal-Wallis; Repeated-measures t-test, Independent-measures t-test. Tests whose output you should be able to understand: One-way independent-measures ANOVA; One-way repeated-measures ANOVA.
Levels of measurement: Nominal (categorical) data: Each participant contributes to the frequency for a category, e.g. voting: Conservative, Lib-Dem, Labour, Green; handedness: left- or right-handed. All we know is how many people fall into each category - head-counts. Ordinal data: In principle, all we can do is place scores in order of magnitude. In Psychology: ratings, e.g. attitude scales, ratings of pain, attractiveness. Interval/ratio data: Share this property - there are equal intervals throughout the measuring scale. Often measures of physical properties - time, length, weight. In Psychology: reaction times, number correct, number of errors.
Levels of measurement: Difference between interval and ratio scales: Relatively unimportant for psychology. Ratio scale - true zero point on scale (representing an absence of the property being measured), e.g. time, accuracy. Interval scale - zero point is arbitrary, e.g. Fahrenheit/Centigrade temperature scales, calendar years. Christian calendar: "0" is an arbitrary point. Each unit (year) is equal, but "2000" is NOT twice as much "time" as "1000"! (Timeline: 2000 BC, 1000 BC, 500 BC, 0, 500 AD, 1000 AD, 2000 AD.)
Analysis of Variance: One-way independent-measures ANOVA: Used to compare the means of three or more groups representing different levels of one independent variable. Each participant does one condition only. e.g. effects of age (young, middle-aged, old) on bladder-control (DV = length of time before needing the toilet). One-way repeated-measures ANOVA: Used to compare the means of three or more conditions representing different levels of one independent variable. Each participant does all conditions. e.g. effects of drug (terfenadine, cetirizine, piriton) on hayfever (DV = number of sneezes in an hour).
Output for a one-way independent-measures ANOVA: F-ratio = (Mean Squares for between-groups variation) / (Mean Squares for within-groups variation). The bigger the F-ratio, the bigger the differences between the groups, compared to the differences within the groups. Need to take into account the number of groups and the number of participants. d.f. are the between-groups d.f. (number of groups minus 1) and the within-groups d.f. (number of participants in group A minus 1, plus the number of participants in group B minus 1, etc.). F(3, 26) = 19.50, p < .01.
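To see where the F-ratio and its two d.f. come from, here is a minimal sketch of the calculation. The function name and the three small groups are invented for illustration; in practice SPSS (or similar) also gives the p value, which this sketch does not compute.

```python
def one_way_independent_anova(groups):
    """Return (F, between-groups d.f., within-groups d.f.) for a list of groups."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)
    group_means = [sum(g) / len(g) for g in groups]

    # Between-groups variation: how far each group mean is from the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within-groups variation: how far each score is from its own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between = len(groups) - 1                  # number of groups minus 1
    df_within = sum(len(g) - 1 for g in groups)   # (n_A - 1) + (n_B - 1) + ...

    ms_between = ss_between / df_between          # Mean Squares = SS / d.f.
    ms_within = ss_within / df_within
    return ms_between / ms_within, df_between, df_within

# Made-up data: three groups of three participants each.
f_ratio, df_b, df_w = one_way_independent_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f"F({df_b}, {df_w}) = {f_ratio:.2f}")       # F(2, 6) = 13.00
```

By the same d.f. rules, four groups with 30 participants in total would give d.f. of 3 and 26, as in F(3, 26).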
Output for a one-way repeated-measures ANOVA: F-ratio = (Mean Squares for between-conditions variation) / (Mean Squares for error variation). Because each participant does all conditions, consistent differences between participants are removed before the error variation is calculated. The bigger the F-ratio, the bigger the differences between the conditions, compared to the remaining unexplained variation. Need to take into account the number of conditions and the number of participants. d.f. are the between-conditions d.f. (number of conditions minus 1) and the error d.f. (number of conditions minus 1, multiplied by number of participants minus 1). e.g. 3 conditions and 20 participants give d.f. of 2 and 38: F(2, 38) = 35.89, p < .01.
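A rough sketch of the repeated-measures calculation follows. It assumes the usual textbook partitioning, in which variation between participants is subtracted out and the error d.f. are (number of conditions - 1) x (number of participants - 1). The function name and the data are made up, and no p value or sphericity correction is computed.

```python
def one_way_repeated_anova(scores):
    """scores[i][j] is participant i's score in condition j.

    Returns (F, between-conditions d.f., error d.f.).
    """
    n = len(scores)        # number of participants
    k = len(scores[0])     # number of conditions
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    condition_means = [sum(row[j] for row in scores) / n for j in range(k)]
    participant_means = [sum(row) / k for row in scores]

    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_conditions = n * sum((m - grand_mean) ** 2 for m in condition_means)
    ss_participants = k * sum((m - grand_mean) ** 2 for m in participant_means)
    # Error = what is left once conditions and participants are accounted for.
    ss_error = ss_total - ss_conditions - ss_participants

    df_conditions = k - 1
    df_error = (k - 1) * (n - 1)
    f_ratio = (ss_conditions / df_conditions) / (ss_error / df_error)
    return f_ratio, df_conditions, df_error

# Made-up data: three participants, each tested in three conditions.
f_ratio, df_c, df_e = one_way_repeated_anova([[1, 2, 3], [2, 3, 7], [3, 4, 5]])
print(f"F({df_c}, {df_e}) = {f_ratio:.2f}")       # F(2, 4) = 7.00
```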
Independent-measures t-test: The t value represents the size of the difference between two means (essentially a z-score for small sample sizes). The bigger the value of t, the more confident we can be that the difference between the means is "real", i.e. that it has not occurred just by chance. t is the difference between two means, divided by an estimate of how much this difference is likely to vary from occasion to occasion (estimated standard error of the difference between means). Repeated-measures t-test: Same logic, except that we can capitalise on the fact that the same people did both conditions (and hence random variation in performance is likely to be less).
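Both versions of t can be sketched as "difference between means, divided by its estimated standard error". The function names and scores below are invented for illustration; the independent-measures version assumes the standard pooled-variance formula, and neither function looks up the p value.

```python
import math
import statistics

def independent_t(group_a, group_b):
    """Independent-measures t (pooled variance). Returns (t, d.f.)."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pool the two sample variances (each uses N - 1), weighted by their d.f.
    pooled_var = ((n_a - 1) * statistics.variance(group_a)
                  + (n_b - 1) * statistics.variance(group_b)) / df
    se_difference = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / se_difference, df

def repeated_t(condition_a, condition_b):
    """Repeated-measures t, based on each person's difference score."""
    differences = [a - b for a, b in zip(condition_a, condition_b)]
    n = len(differences)
    se_difference = statistics.stdev(differences) / math.sqrt(n)
    return statistics.mean(differences) / se_difference, n - 1

a = [1, 2, 3, 4, 5]       # made-up scores
b = [2, 4, 6, 8, 10]
print(independent_t(a, b))  # t is about -1.90 with 8 d.f.
print(repeated_t(a, b))     # t is about -4.24 with 4 d.f.
```

The same numbers give a larger t when treated as repeated measures, because the consistent differences between people no longer count as random variation.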
When do you use a t-test, and when do you use a correlation? Depends on whether you have an experimental or correlational design: t-test: Two groups or conditions (two levels of one IV) Looking for differences between them on a single DV. Does alcohol affect driving performance? IV "alcohol dosage" (sober versus inebriated). DV number of crashes. Correlation: two different DVs (alcohol consumption and number of crashes) looking for a relationship between them.
Answers are shown to two decimal places (e.g. 2.56, 31.95 etc.). Should we round numbers during the calculations, or only round our answer? Round only at the very end, otherwise rounding errors can accumulate to make your final answer inaccurate.
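A tiny illustration of how early rounding can change an answer. The three values are made up, and `Decimal` is used only so that the example rounds halves the way a calculator would.

```python
from decimal import Decimal, ROUND_HALF_UP

def round2(value):
    """Round a Decimal to two decimal places."""
    return value.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

values = [Decimal("1.234"), Decimal("2.344"), Decimal("3.454")]  # made-up numbers

rounded_at_end = round2(sum(values))            # add first, round once
rounded_early = sum(round2(v) for v in values)  # round each value, then add

print(rounded_at_end)   # 7.03
print(rounded_early)    # 7.02
```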
Conservative 10,706,647 (36% of the total vote). Labour 8,604,358 (29% of the total vote). Liberal Democrat 6,827,938 (23% of the total vote). Chi-squared goodness of fit test: Expected frequency = 26,138,943 divided by 3 = 8,712,981 for each party. (Table: observed frequencies and observed - expected for Conservative, Labour and Lib-Dem.) χ²(2, N = 26,138,943) = 865,363, p < .05 (and then some...)
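The goodness of fit calculation for these vote counts can be checked with a short sketch. The variable names are just for illustration; the expected frequency assumes, as above, that the votes for the three parties are split equally.

```python
observed = {
    "Conservative": 10_706_647,
    "Labour": 8_604_358,
    "Liberal Democrat": 6_827_938,
}

total = sum(observed.values())        # 26,138,943 votes across the three parties
expected = total / len(observed)      # equal split: 8,712,981 each

# Chi-Square = sum of (observed - expected) squared, divided by expected.
chi_square = sum((o - expected) ** 2 / expected for o in observed.values())
df = len(observed) - 1                # number of categories minus 1

print(f"Chi-Square({df}, N = {total:,}) = {round(chi_square):,}")
```

With a sample this large, even small departures from the expected frequencies produce a huge Chi-Square value.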