Hypothesis test flow chart (START HERE)
Measurement scale:
  Frequency data: number of variables?
    1: basic $\chi^2$ test (19.5), Table I
    2: $\chi^2$ test for independence (19.9), Table I
  Correlation (r): number of correlations?
    1: Test $H_0: \rho = 0$ (17.2), Table G
    2: Test $H_0: \rho_1 = \rho_2$ (17.4), Tables H and A
  Means: number of means?
    1: Do you know $\sigma$?
      Yes: z-test (13.1), Table A
      No: t-test (13.14), Table D
    2: Independent samples?
      Yes: Test $H_0: \mu_1 = \mu_2$ (15.6), Table D
      No: Test $H_0: \mu_D = 0$ (16.4), Table D
    More than 2: number of factors?
      1: 1-way ANOVA, Ch 20, Table E
      2: 2-way ANOVA, Ch 21, Table E
Chapter 18: Testing for differences among three or more groups: One-way Analysis of Variance (ANOVA)
Suppose you wanted to compare the results of three tests (A, B and C) to see if there was any difference in difficulty. To test this, you randomly sample ten scores from each of the three populations of test scores. How would you test to see if there was any difference across the mean scores for these three tests? The first step is obvious: calculate the mean for each of the three samples of 10 scores. But then what? You could run a two-sample t-test on each of the pairs (A vs. B, A vs. C and B vs. C).
[Table: the ten sampled scores for tests A, B and C, with the mean of each column]
Note: we'll be skipping several sections from the book, including 18.10, 18.12, 18.13, 18.14, 1.15 and 18.16.
You could run a two-sample t-test on each of the pairs (A vs. B, A vs. C and B vs. C). There are two problems with this:
1) The three tests wouldn't be truly independent of each other, since they contain common values, and
2) We run into the problem of making multiple comparisons: if we use an $\alpha$ value of .05, the probability of obtaining at least one significant comparison by chance is $1-(1-.05)^3$, or about .14.
So how do we test the null hypothesis $H_0: \mu_A = \mu_B = \mu_C$?
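The multiple-comparisons arithmetic above can be checked with a short sketch. The function name is hypothetical, and the formula treats the comparisons as independent, which (as point 1 notes) is only an approximation for pairwise t-tests that share data.

```python
def familywise_error_rate(alpha: float, n_comparisons: int) -> float:
    """Chance of at least one false positive across independent tests,
    each run at significance level alpha."""
    return 1 - (1 - alpha) ** n_comparisons


# Three pairwise comparisons (A vs. B, A vs. C, B vs. C) at alpha = .05
rate = familywise_error_rate(0.05, 3)
print(f"{rate:.4f}")  # prints 0.1426
```

With more groups the number of pairs grows quickly, so the inflation gets worse; that is part of the motivation for a single overall test.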
In the 1920s Sir Ronald Fisher developed a method called 'Analysis of Variance' or ANOVA to test hypotheses like this. The trick is to look at the amount of variability between the means. So far in this class, we've usually talked about variability in terms of standard deviations. ANOVA focuses on variances instead; a variance is (of course) the square of a standard deviation. The intuition is the same. The variance of these three mean scores (81, 72 and 75, shown rounded) is 22.5. Intuitively, you can see that if the variance of the mean scores is 'large', then we should reject $H_0$. But what do we compare this number 22.5 to?
The variance of the three mean scores is 22.5. How 'large' is 22.5? Suppose we knew the standard deviation of the population of scores ($\sigma$). If the null hypothesis is true, then all scores across all three columns are drawn from a population with standard deviation $\sigma$. It follows that the mean of n scores should be drawn from a population with standard deviation:
$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$
With a little algebra:
$$\sigma^2 = n\,\sigma_{\bar{X}}^2$$
This means multiplying the variance of the means by n gives us an estimate of the variance of the population.
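The claim that n times the variance of the sample means estimates the population variance can be illustrated with a small simulation. This is only a sketch: the population mean and standard deviation below are made-up values, and a normal population is assumed for convenience.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

MU = 75.0          # hypothetical population mean
SIGMA = 8.0        # hypothetical population standard deviation
N = 10             # scores per sample, as in the example
N_SAMPLES = 20000  # number of simulated samples

# Draw many samples of N scores and record each sample's mean.
sample_means = []
for _ in range(N_SAMPLES):
    scores = [random.gauss(MU, SIGMA) for _ in range(N)]
    sample_means.append(statistics.fmean(scores))

# n x (variance of the means) should land close to sigma squared.
estimate = N * statistics.variance(sample_means)
print(f"sigma^2 = {SIGMA ** 2:.1f}, n x variance of means = {estimate:.1f}")
```

With only three means, as in the example, the same estimate is far noisier, which is why it has to be judged against a reference distribution rather than by eye.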
Multiplying the variance of the means by n gives us an estimate of the variance of the population. For our example, $n \times 22.5 = 10 \times 22.5 = 225$. We typically don't know what $\sigma^2$ is. But like we do for t-tests, we can use the variance within our samples to estimate it. The variances of the 10 numbers in each column (61, 94, and 55) should each provide an estimate of $\sigma^2$. We can combine these three estimates of $\sigma^2$ by taking their average, which is $(61 + 94 + 55)/3 = 70$.
Variances: 61, 94, 55
Mean of variances: 70
n x variance of means: 225
If $H_0: \mu_A = \mu_B = \mu_C$ is true, we now have two separate estimates of the variance of the population ($\sigma^2$). One is n times the variance of the column means. The other is the mean of the variances of each column. If $H_0$ is true, then these two numbers should be, on average, the same, since they're both estimates of the same thing ($\sigma^2$). For our example, these two numbers (225 and 70) seem quite different. Remember our intuition that a large variance of the means should be evidence against $H_0$. Now we have something to compare it to: 225 seems large compared to 70.
When conducting an ANOVA, we compute the ratio of these two estimates of $\sigma^2$. This ratio is called the 'F statistic'. For our example, the ratio of the two estimates (225 and 70, shown rounded) is F = 3.23.
If $H_0$ is true, then the value of F should be around 1. If $H_0$ is not true, then F should tend to be greater than 1. We determine how large F must be to reject $H_0$ by looking up $F_{crit}$ in Table E. F distributions depend on two separate degrees of freedom: one for the numerator and one for the denominator. df for the numerator is k-1, where k is the number of columns or 'treatments'. For our example, df is 3-1 = 2. df for the denominator is N-k, where N is the total number of scores. In our case, df is 30-3 = 27.
Mean of variances: 70
n x variance of means: 225
Ratio (F): 3.23
$F_{crit}$ for $\alpha$ = .05 and df's of 2 and 27 is 3.35. Since $F_{obs}$ = 3.23 is less than $F_{crit}$, we fail to reject $H_0$. We cannot conclude that the exam scores come from populations with different means.
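The steps so far can be sketched from the summary numbers quoted on the slides (column variances 61, 94 and 55; variance of the means 22.5; n = 10). The variable names are illustrative. Because these inputs are rounded, the ratio computed here comes out near 3.21 rather than exactly the 3.23 reported above.

```python
from statistics import fmean

n = 10                           # scores per group
column_variances = [61, 94, 55]  # variance within each column
variance_of_means = 22.5         # variance of the three column means

k = len(column_variances)        # number of groups

between_estimate = n * variance_of_means   # n x variance of means
within_estimate = fmean(column_variances)  # mean of variances
f_ratio = between_estimate / within_estimate

df_between = k - 1      # numerator degrees of freedom
df_within = n * k - k   # denominator degrees of freedom (N - k)

print(f"between = {between_estimate:.0f}, within = {within_estimate:.0f}")
print(f"F = {f_ratio:.2f} with df = ({df_between}, {df_within})")
```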
Instead of finding $F_{crit}$ in Table E, we could have calculated the p-value using our F-calculator. Reporting p-values is standard. Our p-value for F = 3.23 with 2 and 27 degrees of freedom is p = .0552. Since our p-value is greater than .05, we fail to reject $H_0$.
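As a rough cross-check of the p-value, the upper tail of an F distribution has a simple closed form in the special case where the numerator df is 2: $P(F > f) = (1 + 2f/df_w)^{-df_w/2}$. The sketch below uses that identity; it is not the course's F-calculator, the function name is hypothetical, and it does not apply to other numerator df.

```python
def f_upper_tail_numerator_df_2(f_value: float, df_within: int) -> float:
    """P(F > f_value) for an F distribution with 2 numerator df.

    Valid only when the numerator df is exactly 2.
    """
    return (1 + 2 * f_value / df_within) ** (-df_within / 2)


p_value = f_upper_tail_numerator_df_2(3.23, 27)
print(f"p = {p_value:.4f}")
```

The result agrees with the reported p = .0552 to four decimal places. For general degrees of freedom, a statistics library or the F table is the practical route.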
Example: Consider the following samples of n = 12 scores drawn from each of k = 5 groups. Use an ANOVA to test the hypothesis that the means of the populations that these 5 groups were drawn from are different.
[Table: 12 scores for each of the 5 groups, with the mean and variance of each column]
Answer: The 5 means and variances are calculated below, along with n x variance of means and the mean of variances.
Mean of variances: 93
n x variance of means: 1429
Ratio (F): 15.32
Our resulting F statistic is 15.32. Our two dfs are k-1 = 4 (numerator) and 60-5 = 55 (denominator). Table E shows that $F_{crit}$ for 4 and 55 is 2.54. $F_{obs} > F_{crit}$, so we reject $H_0$.
What does the probability distribution $F(df_b, df_w)$ look like?
[Figure: grid of F distributions, one panel per pair of degrees of freedom: F(2,5), F(2,10), F(2,50), F(2,100), F(10,5), F(10,10), F(10,50), F(10,100), F(50,5), F(50,10), F(50,50), F(50,100)]
For a typical ANOVA, the number of scores in each group may be different, but the intuition is the same: compute F, which is the ratio of the variance between groups (based on the group means) to the variance within groups (based on the group variances). Formally, the variance is divided up the following way: given a table of k groups, each containing $n_i$ scores (i = 1, 2, …, k), we can represent the deviation of a given score X from the mean of all scores (called the grand mean, $\bar{\bar{X}}$) as:
$$(X - \bar{\bar{X}}) = (X - \bar{X}) + (\bar{X} - \bar{\bar{X}})$$
where $X - \bar{\bar{X}}$ is the deviation of X from the grand mean, $X - \bar{X}$ is the deviation of X from the mean of its group, and $\bar{X} - \bar{\bar{X}}$ is the deviation of the mean of the group from the grand mean.
The total sum of squares can be partitioned into two numbers:
Total sum of squares: $SS_{total} = \sum (X - \bar{\bar{X}})^2$, summed over all scores
Within-groups sum of squares: $SS_{within} = \sum_{i=1}^{k} \sum_{X \in \text{group } i} (X - \bar{X}_i)^2$
Between-groups sum of squares: $SS_{between} = \sum_{i=1}^{k} n_i (\bar{X}_i - \bar{\bar{X}})^2$
Here $\bar{X}_i$ is the mean of group i and $\bar{\bar{X}}$ is the grand mean.
$SS_{between}$ (or $SS_{bet}$) is a measure of the variability between groups. It is used as the numerator in our F-tests. The variance between groups is calculated by dividing $SS_{bet}$ by its degrees of freedom, $df_{bet} = k-1$:
$$s^2_{bet} = SS_{bet}/df_{bet}$$
This is another estimate of $\sigma^2$ if $H_0$ is true. It is essentially n times the variance of the means. If $H_0$ is not true, then $s^2_{bet}$ is an estimate of $\sigma^2$ plus any 'treatment effect' that would add to a difference between the means.
$SS_{within}$ (or $SS_w$) is a measure of the variability within each group. It is used as the denominator in all F-tests. The variance within each group is calculated by dividing $SS_{within}$ by its degrees of freedom, $df_w = n_{total} - k$:
$$s^2_w = SS_w/df_w$$
This is an estimate of $\sigma^2$. It is essentially the mean of the variances within each group. (It is exactly the mean of variances if our sample sizes are all the same.)
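The parenthetical claim about equal sample sizes can be checked numerically. The sketch below uses small made-up data sets (not from the course) and a hypothetical helper name: with equal group sizes the within-groups variance matches the mean of the group variances, and with unequal sizes the two differ.

```python
from statistics import fmean, variance


def within_group_variance(groups):
    """s^2_w = SS_within / (n_total - k)."""
    ss_within = 0.0
    for group in groups:
        group_mean = fmean(group)
        ss_within += sum((x - group_mean) ** 2 for x in group)
    df_within = sum(len(group) for group in groups) - len(groups)
    return ss_within / df_within


# Made-up scores, for illustration only.
equal_sizes = [[1, 2, 6], [2, 3, 4], [5, 6, 10]]
unequal_sizes = [[1, 2, 6], [2, 3, 4, 3], [5, 6, 10]]

for label, groups in [("equal n", equal_sizes), ("unequal n", unequal_sizes)]:
    s2_w = within_group_variance(groups)
    mean_of_variances = fmean(variance(group) for group in groups)
    print(f"{label}: s^2_w = {s2_w:.3f}, mean of variances = {mean_of_variances:.3f}")
```

With unequal sizes, $s^2_w$ weights each group's variance by its degrees of freedom, so larger groups count for more.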
The F ratio is calculated by dividing up the sums of squares and df into 'between' and 'within':
$SS_{total} = SS_{within} + SS_{between}$
$df_{total} = df_{within} + df_{between}$, where $df_{total} = n_{total} - 1$, $df_{within} = n_{total} - k$ and $df_{between} = k - 1$
Variances are then calculated by dividing SS by df:
$s^2_{between} = SS_{between}/df_{between}$
$s^2_{within} = SS_{within}/df_{within}$
F is the ratio of the variances between and within:
$F = s^2_{between}/s^2_{within}$
Finally, the F ratio is the ratio of $s^2_{bet}$ to $s^2_w$: $F = s^2_{bet}/s^2_w$. We can write all these calculated values in a summary table like this:
Source | SS | df | s^2 | F
Between | SS_bet | k-1 | s^2_bet = SS_bet/df_bet | s^2_bet/s^2_w
Within | SS_w | n_total-k | s^2_w = SS_w/df_w |
Total | SS_total | n_total-1 | |
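The summary table can be produced by a short routine. This is a minimal sketch, assuming the usual definitions of the sums of squares; the function name, the dictionary keys and the demo scores are all made up for illustration, and groups may have different sizes.

```python
from statistics import fmean


def one_way_anova(groups):
    """Return the entries of a one-way ANOVA summary table."""
    all_scores = [x for group in groups for x in group]
    grand_mean = fmean(all_scores)
    k = len(groups)
    n_total = len(all_scores)

    # Between: group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Within: each score around its own group mean.
    ss_within = sum((x - fmean(g)) ** 2 for g in groups for x in g)
    # Total: each score around the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for x in all_scores)

    df_between, df_within = k - 1, n_total - k
    s2_between = ss_between / df_between
    s2_within = ss_within / df_within
    return {
        "ss_between": ss_between, "df_between": df_between, "s2_between": s2_between,
        "ss_within": ss_within, "df_within": df_within, "s2_within": s2_within,
        "ss_total": ss_total, "df_total": n_total - 1,
        "F": s2_between / s2_within,
    }


# Made-up scores for three groups.
table = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
for name, value in table.items():
    print(f"{name}: {value:.2f}")
```

For the demo data the partition can be verified by hand: $SS_{between} = 26$ and $SS_{within} = 6$ add up to $SS_{total} = 32$, and F = 13/1 = 13.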
Calculating $SS_{total}$
[Worked calculation of the grand mean and $SS_{total}$ for the example with k = 5 groups; the numeric values are not preserved in this copy]
Source | SS | df | s^2 | F
Between | | | |
Within | | | |
Total | | | |
Calculating $SS_{bet}$ and $s^2_{bet}$
Source | SS | df | s^2 | F
Between | | k-1 = 4 | 1429 |
Within | | | |
Total | | | |
Calculating $SS_w$ and $s^2_w$
Source | SS | df | s^2 | F
Between | | k-1 = 4 | 1429 |
Within | 5130 | 12x5-5 = 55 | 93 |
Total | | | |
Calculating F
Source | SS | df | s^2 | F
Between | | k-1 = 4 | 1429 | 15.32
Within | 5130 | 12x5-5 = 55 | 93 |
Total | | | |
$F_{crit}$ with dfs of 4 and 55 and $\alpha$ = .05 is 2.54. Our decision is to reject $H_0$ since 15.32 > 2.54.
Example: Female students in this class were asked how much they exercised, given the choices:
A. Just a little
B. A fair amount
C. Very much
Is there a significant difference in the heights of students across these three categories? (Use $\alpha$ = .05) In other words, test $H_0: \mu_A = \mu_B = \mu_C$.
Summary statistics are:
[Table: n, Mean and SS for groups A, B, C and Total; the values are not preserved in this copy]
Means and standard errors for '1-way' ANOVAs can be plotted as bar graphs with error bars indicating ±1 standard error of the mean.
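The length of each error bar can be computed from a group's n and SS. The sketch below assumes that SS in the summary table means the sum of squared deviations of a group's scores from that group's mean; the function name and the example numbers are hypothetical, not the class data.

```python
import math


def standard_error_of_mean(n: int, ss: float) -> float:
    """Standard error of the mean from a group's size and sum of squares.

    Assumes ss is the sum of squared deviations from the group mean.
    """
    sample_variance = ss / (n - 1)
    return math.sqrt(sample_variance / n)


# Hypothetical summary statistics: group label -> (n, SS)
summary = {"A": (4, 12.0), "B": (9, 72.0)}

for label, (n, ss) in summary.items():
    sem = standard_error_of_mean(n, ss)
    print(f"group {label}: error bar = +/- {sem:.2f}")
```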
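Filling in the table for the ANOVA: $SS_w$ and $s^2_w$
$SS_w$ is the sum of the SS values of the three groups: $SS_w = SS_A + SS_B + SS_C$.
Source | SS | df | s^2 | F
Between | | | |
Within | | n_total-k = 66 | 9.3 |
Total | | n_total-1 = 68 | |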
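Filling in the table for the ANOVA: there are two ways of calculating $SS_{bet}$. The first is to compute it directly from the group sizes, the group means and the grand mean: $SS_{bet} = \sum n_i (\bar{X}_i - \bar{\bar{X}})^2$.
Source | SS | df | s^2 | F
Between | | k-1 = 2 | |
Within | | n_total-k = 66 | 9.3 |
Total | | n_total-1 = 68 | |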
Filling in the table for the ANOVA: there are two ways of calculating $SS_{bet}$. Or, use the fact that $SS_{total} = SS_{within} + SS_{between}$, so $SS_{between} = SS_{total} - SS_{within}$.
Source | SS | df | s^2 | F
Between | | k-1 = 2 | 27.1 |
Within | | n_total-k = 66 | 9.3 |
Total | | n_total-1 = 68 | |
Filling in the table for the ANOVA: F
Source | SS | df | s^2 | F
Between | | k-1 = 2 | 27.1 |
Within | | n_total-k = 66 | 9.3 |
Total | | n_total-1 = 68 | |
$F_{crit}$ for dfs of 2 and 66 and $\alpha$ = .05 is 3.14. Since $F_{crit}$ is greater than our observed value of F, we fail to reject $H_0$ and conclude that the female students' average height does not significantly differ across the amount of exercise they get.
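The critical value quoted above can be reproduced without a table in the special case of 2 numerator df, where the upper tail of the F distribution is $(1 + 2f/df_w)^{-df_w/2}$; setting this equal to $\alpha$ and solving for f gives the formula below. This is a sketch for that special case only, and the function name is hypothetical.

```python
def f_crit_numerator_df_2(alpha: float, df_within: int) -> float:
    """Critical F value for 2 numerator df and df_within denominator df.

    Solves (1 + 2 f / df_within) ** (-df_within / 2) = alpha for f.
    Valid only when the numerator df is exactly 2.
    """
    return (df_within / 2) * (alpha ** (-2 / df_within) - 1)


print(f"F crit (2, 66) = {f_crit_numerator_df_2(0.05, 66):.2f}")  # prints 3.14
print(f"F crit (2, 27) = {f_crit_numerator_df_2(0.05, 27):.2f}")  # prints 3.35
```

Both values match the critical values used for the two three-group examples in this chapter.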