One-Way Analysis of Variance


2 One-Way Analysis of Variance

3 Recapitulation
1. Comparing differences among three or more subsamples requires a different statistical test than either z-tests or t-tests.
2. The solution is to perform an analysis of variance (ANOVA).
3. ANOVA involves the comparison of two estimates for the population variance.
4. One variance estimate captures only the random differences among sampled units; the other captures these random differences plus the effects of being in the different subsamples.
5. The ratio between the two estimated variances is evaluated using the F-statistic sampling distributions.

4 Recapitulation (continued)
6. ANOVA is based on the general linear model.
7. The general linear model is $Y_{ij} = \mu + \beta_j X_{ij} + \varepsilon_{ij}$, where $X_{ij}$ is the subgroup difference and $\beta_j$ is a constant estimating its effect on $Y_{ij}$.
8. When subgroup differences do not exist, $\beta_j = 0.0$.
9. The null hypothesis is $H_0: \mu_1 = \mu_2 = \mu_3 = \dots = \mu_J$.

5 As an example, consider an experiment on worker productivity in an introductory psychology class. Thirty students were randomly selected for the experiment from PSYCH 100 and randomly assigned to one of three subgroups. The productivity measure ($Y_{ij}$) was the number of puzzles that these students solved in a fixed period of time. The three experimental conditions (treatments, $X_{ij}$) were:
1. left alone to solve puzzles;
2. solving puzzles in the presence of the other nine group members (so that each subject could observe her or his own rate of puzzle solving); and
3. solving puzzles in the presence of other subjects AND in the presence of a monitor, meant to simulate a supervisor.
The results look like this:

6
---------------------------------------------------------------
            Not Monitored,    Not Monitored,
Subject     Alone             Together          Monitored
(i)         Y(i,1)            Y(i,2)            Y(i,3)
---------------------------------------------------------------
 1          13                 9                 8
 2          14                11                 6
 3          10                10                 9
 4          11                 8                 7
 5          12                10                 8
 6          10                12                10
 7          12                11                 8
 8          12                10                 9
 9          13                 9                 6
10          11                10                11
---------------------------------------------------------------
N           10                10                10
Sum         118               100               82
Mean        11.8              10.0              8.2
---------------------------------------------------------------
Grand mean = 10.0
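The summary rows of the table can be checked with a few lines of Python. This is only a sketch: the dictionary keys and variable names are labels chosen here for illustration, and the scores are transcribed from the table.

```python
# Puzzle counts transcribed from the table (10 subjects per condition).
groups = {
    "alone":     [13, 14, 10, 11, 12, 10, 12, 12, 13, 11],
    "together":  [9, 11, 10, 8, 10, 12, 11, 10, 9, 10],
    "monitored": [8, 6, 9, 7, 8, 10, 8, 9, 6, 11],
}

# Subgroup size, sum, and mean for each condition.
for name, scores in groups.items():
    n = len(scores)
    total = sum(scores)
    print(f"{name:<10} N = {n}  sum = {total}  mean = {total / n:.1f}")

# Grand mean over all 30 scores.
all_scores = [y for scores in groups.values() for y in scores]
grand_mean = sum(all_scores) / len(all_scores)
print(f"grand mean = {grand_mean:.1f}")
```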

7 Our hypothesis ($H_1$) is that working conditions affect worker performance in ways that we do not fully understand:

$H_1: \mu_1 \neq \mu_2 \neq \mu_3$

Our null hypothesis ($H_0$) is that worker performance is unaffected by working conditions:

$H_0: \mu_1 = \mu_2 = \mu_3$

Since a comparison of THREE subgroup means is required, t-tests are inappropriate. The approach known generically as the analysis of variance must be used.


9 First we calculate the total sum of squares. We begin with the first score in the first group and continue through the 30th score in the third group, as follows:

$$
\begin{aligned}
SS_{Total} &= (13 - 10.0)^2 + (14 - 10.0)^2 + \dots + (11 - 10.0)^2 \\
&\quad + (9 - 10.0)^2 + \dots + (10 - 10.0)^2 \\
&\quad + (8 - 10.0)^2 + \dots + (11 - 10.0)^2 \\
&= 116
\end{aligned}
$$
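As a check on the arithmetic, the same sum can be computed directly. This is a minimal sketch; the variable names are illustrative and the scores are those of the puzzle-solving table.

```python
# Scores from the puzzle-solving table, one list per condition
# (alone, together, monitored).
groups = [
    [13, 14, 10, 11, 12, 10, 12, 12, 13, 11],
    [9, 11, 10, 8, 10, 12, 11, 10, 9, 10],
    [8, 6, 9, 7, 8, 10, 8, 9, 6, 11],
]

scores = [y for group in groups for y in group]   # all 30 scores
grand_mean = sum(scores) / len(scores)            # 300 / 30

# Squared deviation of every score from the grand mean, summed.
ss_total = sum((y - grand_mean) ** 2 for y in scores)
print(ss_total)  # 116.0
```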

10 Next we calculate the sum of squares between. For the first of the three subgroups, we find the difference between the group mean and the grand mean, square that difference, then multiply it by the size of the subgroup. We do the same for the other two subgroups, then sum the three products:

$$
SS_{Between} = 10(11.8 - 10.0)^2 + 10(10.0 - 10.0)^2 + 10(8.2 - 10.0)^2 = 64.8
$$
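The same three products can be accumulated in a loop. Again this is only a sketch with illustrative variable names, using the scores from the puzzle-solving table.

```python
groups = [
    [13, 14, 10, 11, 12, 10, 12, 12, 13, 11],   # alone
    [9, 11, 10, 8, 10, 12, 11, 10, 9, 10],      # together
    [8, 6, 9, 7, 8, 10, 8, 9, 6, 11],           # monitored
]

scores = [y for group in groups for y in group]
grand_mean = sum(scores) / len(scores)

# For each subgroup: n_j * (group mean - grand mean)^2, then sum.
ss_between = 0.0
for group in groups:
    n = len(group)
    group_mean = sum(group) / n
    ss_between += n * (group_mean - grand_mean) ** 2

print(round(ss_between, 2))  # 64.8
```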


12 Finally, we calculate the sum of squares within. This means that we find the squared difference between each of the ten scores in the first group and the mean for the first group, then the squared difference between the ten scores in the second group and the mean for the SECOND group, then the squared difference between the ten scores in the third group and the mean for the THIRD group, and finally add all 30 squared differences together:

$$
\begin{aligned}
SS_{Within} &= (13 - 11.8)^2 + (14 - 11.8)^2 + \dots + (11 - 11.8)^2 \\
&\quad + (9 - 10.0)^2 + \dots + (10 - 10.0)^2 \\
&\quad + (8 - 8.2)^2 + \dots + (11 - 8.2)^2 \\
&= 51.2
\end{aligned}
$$
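A short sketch of the same calculation, where each score is compared with the mean of its OWN group (variable names are illustrative):

```python
groups = [
    [13, 14, 10, 11, 12, 10, 12, 12, 13, 11],   # alone
    [9, 11, 10, 8, 10, 12, 11, 10, 9, 10],      # together
    [8, 6, 9, 7, 8, 10, 8, 9, 6, 11],           # monitored
]

ss_within = 0.0
for group in groups:
    group_mean = sum(group) / len(group)
    # Squared deviations from this group's own mean.
    ss_within += sum((y - group_mean) ** 2 for y in group)

print(round(ss_within, 2))  # 51.2
```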

13 To check our calculations, remember the identity:

$$
SS_{Total} = SS_{Between} + SS_{Within}, \qquad 116 = 64.8 + 51.2
$$

Next, we need the degrees of freedom. Total degrees of freedom is simply the number of cases less one, $N - 1$. Here there are 30 cases, so there are 29 total degrees of freedom.

For degrees of freedom between, the three subgroup means are treated as scores, so there are $J - 1$ degrees of freedom across the $J$ subgroups; here $3 - 1$ gives 2 degrees of freedom between.

Finally, degrees of freedom within is $N - J$: we lose one degree of freedom for each subgroup mean that is estimated. Here we have three subgroups, giving $30 - 3$, or 27, degrees of freedom within.
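Both the sum-of-squares identity and the degrees of freedom can be verified in a few lines. This is a sketch under the assumption that the scores are those of the puzzle-solving table; the names are illustrative.

```python
from math import isclose

groups = [
    [13, 14, 10, 11, 12, 10, 12, 12, 13, 11],   # alone
    [9, 11, 10, 8, 10, 12, 11, 10, 9, 10],      # together
    [8, 6, 9, 7, 8, 10, 8, 9, 6, 11],           # monitored
]
scores = [y for g in groups for y in g]
grand_mean = sum(scores) / len(scores)
group_means = [sum(g) / len(g) for g in groups]

ss_total = sum((y - grand_mean) ** 2 for y in scores)
ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
ss_within = sum((y - m) ** 2 for g, m in zip(groups, group_means) for y in g)

# The partition must hold (up to floating-point rounding).
print(isclose(ss_total, ss_between + ss_within))

n_cases, n_groups = len(scores), len(groups)
df_total = n_cases - 1            # N - 1
df_between = n_groups - 1         # J - 1
df_within = n_cases - n_groups    # N - J
print(df_total, df_between, df_within)
```

Note that the degrees of freedom partition in the same way as the sums of squares: 29 = 2 + 27.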

14 Analysis of variance results by convention are reported in what is called an "ANOVA summary table":

----------------------------------------------------------
Source             SS        df    Mean Square    F
----------------------------------------------------------
Between Groups      64.80     2    32.40          17.05
Within Groups       51.20    27     1.90
Total              116.00    29
----------------------------------------------------------
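The mean squares and the F-ratio follow from the sums of squares and degrees of freedom. The sketch below (illustrative names) also shows why the hand-calculated F of 17.05 differs slightly from the 17.09 that software reports: the hand calculation divides by the mean square within after rounding it to 1.90.

```python
ss_between, ss_within = 64.8, 51.2
n_cases, n_groups = 30, 3

df_between = n_groups - 1
df_within = n_cases - n_groups

# Each mean square is a sum of squares divided by its degrees of freedom.
ms_between = ss_between / df_between
ms_within = ss_within / df_within

f_ratio = ms_between / ms_within                # unrounded
f_rounded = ms_between / round(ms_within, 2)    # MS within rounded to 1.90

print(f"MS between = {ms_between:.2f}")
print(f"MS within  = {ms_within:.6f}")
print(f"F = {f_ratio:.2f}")
print(f"F with MS within rounded to 1.90 = {f_rounded:.2f}")
```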

15 We perform a significance test in the usual way: select alpha, locate the appropriate sampling distribution, find the critical value, and compare it to the observed value of the F-statistic. With $\alpha = 0.05$, we turn to the table of critical values of F in Appendix 3, p. 544. In this example we have 2 and 27 degrees of freedom. The table has degrees of freedom between as COLUMN headings ($n_1$) and degrees of freedom within as ROW headings ($n_2$). In column 2, row 27 we find the critical value to be 3.35. Since our F-value, 17.05, is GREATER than 3.35, it lies well inside the region of rejection, and we REJECT the null hypothesis at the 0.05 level. Substantively, we infer that the conditions under which one performs a task DO have an effect on performance.
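The tabled critical value can be reproduced without a table in this particular case. The sketch below relies on a special property of the F distribution: when the numerator has exactly 2 degrees of freedom, the upper-tail probability has the closed form $P(F > f) = (1 + 2f/n_2)^{-n_2/2}$. The function names are hypothetical, and the shortcut does NOT apply to other numerator degrees of freedom, where a table or statistical library is needed.

```python
def f_upper_tail_df1_2(f_value, df_within):
    """P(F > f_value) for an F distribution with 2 and df_within df."""
    return (1.0 + 2.0 * f_value / df_within) ** (-df_within / 2.0)


def f_critical_df1_2(alpha, df_within):
    """The value f with P(F > f) = alpha, for 2 and df_within df."""
    return (df_within / 2.0) * (alpha ** (-2.0 / df_within) - 1.0)


df_within = 27
f_observed = (64.8 / 2) / (51.2 / 27)     # MS between / MS within

critical = f_critical_df1_2(0.05, df_within)
p_value = f_upper_tail_df1_2(f_observed, df_within)

print(f"critical F(2, 27) at alpha = 0.05: {critical:.2f}")
print(f"observed F = {f_observed:.2f}; reject H0: {f_observed > critical}")
print(f"p-value below 0.0001: {p_value < 0.0001}")
```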


17 The F-test is a significance test, an inferential statistic. It tells us only whether or not exposure to the treatment variable has measurable consequences that are different from chance. It does NOT tell us about the strength of association between the treatment ($X_{ij}$) and the dependent variable ($Y_{ij}$). For this we need a measure of association.

The sum of squares BETWEEN represents the variance attributable to the treatment variable, $X_{ij}$. The TOTAL sum of squares expresses the total amount of variance in the dependent variable, $Y_{ij}$, that is, the total variance "to be explained" statistically. The ratio of the two is a straightforward description of the proportion of variance in $Y_{ij}$ accounted for by its association with $X_{ij}$. Statistically this is called R-square.

18 From the example above, the sum of squares between is 64.80 and the total sum of squares is 116.00. Thus, R-square is:

$$
R^2 = \frac{SS_{Between}}{SS_{Total}} = \frac{64.80}{116.00} = 0.56
$$

The F-test tells us that treatment categories (working conditions) differ in ways that cannot be explained as chance. R-square tells us that 56 percent of the variation in task performance is associated with differences in working conditions.
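A minimal sketch of the R-square calculation (illustrative names). It also shows the equivalent form based on the unexplained share, which follows from the identity Total SS = Between SS + Within SS.

```python
ss_between = 64.80
ss_within = 51.20
ss_total = 116.00

# Proportion of total variation associated with the treatment.
r_square = ss_between / ss_total

# Equivalent form: one minus the unexplained (within-group) share.
r_square_alt = 1 - ss_within / ss_total

print(f"R-square = {r_square:.4f}")
print(f"1 - SS within / SS total = {r_square_alt:.4f}")
```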

19 Knowing that the treatment variable has a statistically significant effect does not tell us WHICH specific treatment category or categories have greater impact than others. In our example, we know only that AT LEAST ONE of the puzzle-solving conditions differs from one (or both) of the remaining two, but we do not know which. In other words, we do not know which of the following alternative hypotheses is (are) supported:

$\mu_1 \neq \mu_2 = \mu_3$
$\mu_1 = \mu_2 \neq \mu_3$
$\mu_1 \neq \mu_3 = \mu_2$
$\mu_1 \neq \mu_2 \neq \mu_3$

We need a way to statistically compare the subgroups.

20 There are two strategies: comparisons explicitly planned in advance are called a priori tests; those performed after an initial ANOVA are called post hoc comparison tests. Of the latter, we will use only the method known as the Scheffé test. The Scheffé method creates a “threshold” for comparing subgroup means (once an ANOVA null hypothesis has been rejected) called the “minimum significant difference.” Differences between two subgroup means that exceed this minimum significant difference are statistically significant; that is, their difference appears to be “real” rather than due to chance. The algorithm is in Sirkin (1999), p. 333.

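21 Two subsample means differ significantly when

$$
\left|\bar{Y}_j - \bar{Y}_{j+1}\right| > \sqrt{(df_{Between})\,(F_{\alpha})\,(MS_{Within})\left(\frac{1}{n_j} + \frac{1}{n_{j+1}}\right)}
$$

where
$\bar{Y}_j$ and $\bar{Y}_{j+1}$ are the subsample means being compared,
$df_{Between}$ is the degrees of freedom between in the ANOVA,
$F_{\alpha}$ is the critical value of F at the significance level ($\alpha$) chosen for the comparison,
$MS_{Within}$ is the ANOVA mean square within, and
$n_j$ and $n_{j+1}$ are the sizes of the two subsamples being compared.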

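22 In the puzzle-solving example,

$\bar{Y}_1 = 11.8$, $\bar{Y}_2 = 10.0$, and $\bar{Y}_3 = 8.2$
$df_{Between} = 2$
$F_{\alpha} = 2.51$ ($\alpha = .10$, df = 2, 27)
$MS_{Within} = 1.90$
and $n_1 = n_2 = n_3 = 10$

Hence,

$$
\sqrt{(2)(2.51)(1.90)\left(\frac{1}{10} + \frac{1}{10}\right)} = \sqrt{1.9076} = 1.381
$$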

23 The value 1.381 is the “minimum significant difference,” the “threshold” we use to compare subsample means with $\alpha$ set at 0.10. [Sirkin (1999) contains no 0.10 F table.] Here is how Sirkin would organize our comparison tests:

------------------------------------------------------------------------
H0           Absolute difference in means      Critical value   Conclusion
------------------------------------------------------------------------
μ1 = μ2      |11.8 - 10.0| = 1.80           >  1.381            Reject H0
μ2 = μ3      |10.0 - 8.2|  = 1.80           >  1.381            Reject H0
μ1 = μ3      |11.8 - 8.2|  = 3.60           >  1.381            Reject H0
------------------------------------------------------------------------
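The three comparisons can be sketched in Python. The minimum significant difference is computed here as the square root of df Between × F × MS Within × (1/n_j + 1/n_j+1); that form is an assumption on our part, chosen because it uses exactly the quantities listed for the Scheffé method and reproduces the 1.381 threshold. The function and variable names are illustrative.

```python
from itertools import combinations
from math import sqrt

means = {"alone": 11.8, "together": 10.0, "monitored": 8.2}
sizes = {"alone": 10, "together": 10, "monitored": 10}

df_between = 2
f_critical = 2.51    # critical F at alpha = 0.10 with 2 and 27 df
ms_within = 1.90


def min_sig_diff(n_a, n_b):
    """Scheffe threshold for comparing two subsample means (assumed form)."""
    return sqrt(df_between * f_critical * ms_within * (1 / n_a + 1 / n_b))


# Compare every pair of subsample means against the threshold.
for a, b in combinations(means, 2):
    msd = min_sig_diff(sizes[a], sizes[b])
    diff = abs(means[a] - means[b])
    verdict = "reject H0" if diff > msd else "do not reject H0"
    print(f"{a} vs. {b}: |diff| = {diff:.2f}, MSD = {msd:.3f}, {verdict}")
```

Because the threshold depends on the two subsample sizes, it would have to be recomputed for each pair if the groups were of unequal size; here all three pairs share the same value.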

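24 Sample SAS Program
Puzzle-Solving Example

libname old 'a:\';
libname library 'a:\';
options nodate nonumber ps=66;

/* One-way ANOVA of puzzles solved by experimental setting */
proc glm data=old.example;
   class setting;
   model puzzles = setting;
   /* Scheffe post hoc comparisons at alpha = 0.10 */
   means setting / scheffe alpha = 0.1;
   /* CONTRAST coefficients are applied in the CLASS level order shown  */
   /* in the Class Level Information output (alone, monitor, together), */
   /* so each label should be checked against that order.               */
   contrast 'Alone vs. Together' setting 1 -1 0;
   contrast 'Alone vs. Monitor' setting 1 0 -1;
   contrast 'Together vs. Monitor' setting 0 1 -1;
   contrast 'Alone vs. Others' setting 2 -1 -1;
   contrast 'Together vs. Others' setting -1 2 -1;
   title1 'ANOVA With Comparison Tests';
run;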

25 ANOVA With Comparison Tests
General Linear Models Procedure

Class Level Information

Class      Levels    Values
SETTING    3         (1) alone (2) monitor (3) together

Number of observations in data set = 30

26 ANOVA With Comparison Tests
General Linear Models Procedure

Dependent Variable: PUZZLES

                              Sum of           Mean
Source             DF        Squares         Square    F Value    Pr > F
Model               2    64.80000000    32.40000000      17.09    0.0001
Error              27    51.20000000     1.89629630
Corrected Total    29   116.00000000

R-Square        C.V.     Root MSE    PUZZLES Mean
0.558621    13.77061    1.3770607       10.000000

Source     DF      Type I SS    Mean Square    F Value    Pr > F
SETTING     2    64.80000000    32.40000000      17.09    0.0001

Source     DF    Type III SS    Mean Square    F Value    Pr > F
SETTING     2    64.80000000    32.40000000      17.09    0.0001

27 ANOVA With Comparison Tests
General Linear Models Procedure

Scheffe's test for variable: PUZZLES

NOTE: This test controls the type I experimentwise error rate but generally has a higher type II error rate than REGWF for all pairwise comparisons

Alpha = 0.1   df = 27   MSE = 1.896296
Critical Value of F = 2.51061
Minimum Significant Difference = 1.38

Means with the same letter are not significantly different.

Scheffe Grouping       Mean     N    SETTING
               A    11.8000    10    alone
               B    10.0000    10    together
               C     8.2000    10    monitor

28 ANOVA With Comparison Tests
General Linear Models Procedure

Scheffe's test for variable: PUZZLES

NOTE: This test controls the type I experimentwise error rate but generally has a higher type II error rate than REGWF for all pairwise comparisons

Alpha = 0.1   df = 27   MSE = 1.896296
Critical Value of F = 2.51061
Minimum Significant Difference = 1.38

Means with the same letter are not significantly different.

Scheffe Grouping       Mean     N    SETTING
               A    11.8000    10    alone
               B    10.0000    10    together
               B
               B     8.2000    10    monitor

29 ANOVA With Comparison Tests
General Linear Models Procedure

Scheffe's test for variable: PUZZLES

NOTE: This test controls the type I experimentwise error rate but generally has a higher type II error rate than REGWF for all pairwise comparisons

Alpha = 0.1   df = 27   MSE = 1.896296
Critical Value of F = 2.51061
Minimum Significant Difference = 1.38

Means with the same letter are not significantly different.

Scheffe Grouping       Mean     N    SETTING
               A    11.8000    10    alone
               A
               A    10.0000    10    together
               A
               A     8.2000    10    monitor

30 ANOVA With Comparison Tests
General Linear Models Procedure

Dependent Variable: PUZZLES

Contrast                DF    Contrast SS    Mean Square    F Value    Pr > F
Alone vs. Together       1    64.80000000    64.80000000      34.17    0.0001
Alone vs. Monitor        1    16.20000000    16.20000000       8.54    0.0069
Together vs. Monitor     1    16.20000000    16.20000000       8.54    0.0069
Alone vs. Others         1    48.60000000    48.60000000      25.63    0.0001
Together vs. Others      1    48.60000000    48.60000000      25.63    0.0001
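The contrast sums of squares can be reproduced by hand with the standard single-degree-of-freedom formula, SS = (Σ c_j × mean_j)² / Σ(c_j² / n_j). The sketch below is not the SAS algorithm itself; it assumes the coefficients are applied to the subgroup means in the class-level order that SAS reports (alone, monitor, together). Under that assumption every value matches the output, which suggests that the contrast labelled 'Alone vs. Together' actually compares alone with monitor. If the coefficients were applied in the column order of the data table instead, 'Together vs. Others' would have a sum of squares of 0 rather than 48.60.

```python
# Subgroup means and sizes in the assumed SAS level order:
# alone, monitor, together.
means = [11.8, 8.2, 10.0]
sizes = [10, 10, 10]
ms_within = 51.2 / 27     # error mean square from the ANOVA


def contrast_ss(coefficients):
    """Sum of squares for a single-df contrast among subgroup means."""
    estimate = sum(c * m for c, m in zip(coefficients, means))
    scale = sum(c ** 2 / n for c, n in zip(coefficients, sizes))
    return estimate ** 2 / scale


# Labels and coefficients exactly as written in the SAS program.
contrasts = {
    "Alone vs. Together":   [1, -1, 0],
    "Alone vs. Monitor":    [1, 0, -1],
    "Together vs. Monitor": [0, 1, -1],
    "Alone vs. Others":     [2, -1, -1],
    "Together vs. Others":  [-1, 2, -1],
}

for label, coefs in contrasts.items():
    ss = contrast_ss(coefs)
    print(f"{label:<22} SS = {ss:6.2f}   F = {ss / ms_within:5.2f}")
```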


32 One-Way Analysis of Variance Exercise

Four groups of randomly selected and randomly assigned students were taught a basic course in statistics by four different methods. A standardized test was given at the end of the semester to all four groups. Evaluate the differences in teaching approaches using the Analysis of Variance. Assume that $\alpha = 0.05$, and use the F distribution (Appendix 3, p. 544).

Group 1    Group 2    Group 3    Group 4
   20         15         22         19
   22         18         21         23
   21         20         24         20
   20         18         25         18
   19         19         24         15

1. Expressed symbolically, what is the null hypothesis? ______________
2. What is the value of the sum of squares between? ______________
3. What is the value of the sum of squares within? ______________
4. How many degrees of freedom between? ______________
5. How many degrees of freedom within? ______________
6. What is the value of the mean square between? ______________
7. What is the value of the mean square within? ______________
8. What is the value of the F-ratio? ______________
9. What is the critical value of F? ______________
10. Do you reject the null hypothesis? ______________

33 One-Way Analysis of Variance Exercise Answers

1. Expressed symbolically, what is the null hypothesis? $\mu_1 = \mu_2 = \mu_3 = \mu_4$
2. What is the value of the sum of squares between? 76.55
3. What is the value of the sum of squares within? 64.00
4. How many degrees of freedom between? 3
5. How many degrees of freedom within? 16
6. What is the value of the mean square between? 25.517
7. What is the value of the mean square within? 4.000
8. What is the value of the F-ratio? 6.379
9. What is the critical value of F? 3.24
10. Do you reject the null hypothesis? Yes, Reject
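The exercise answers can be checked with a small helper that repeats the steps of the puzzle-solving example. This is a sketch only: `one_way_anova` is a hypothetical function name, and the critical value 3.24 is taken from the answers rather than computed.

```python
def one_way_anova(groups):
    """Return the entries of a one-way ANOVA summary table as a dict."""
    scores = [y for g in groups for y in g]
    grand_mean = sum(scores) / len(scores)
    group_means = [sum(g) / len(g) for g in groups]

    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((y - m) ** 2
                    for g, m in zip(groups, group_means) for y in g)

    df_between = len(groups) - 1            # J - 1
    df_within = len(scores) - len(groups)   # N - J
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between, "ss_within": ss_within,
        "df_between": df_between, "df_within": df_within,
        "ms_between": ms_between, "ms_within": ms_within,
        "f_ratio": ms_between / ms_within,
    }


# Test scores for the four teaching methods, one list per group.
exercise = [
    [20, 22, 21, 20, 19],   # Group 1
    [15, 18, 20, 18, 19],   # Group 2
    [22, 21, 24, 25, 24],   # Group 3
    [19, 23, 20, 18, 15],   # Group 4
]

result = one_way_anova(exercise)
for name, value in result.items():
    print(f"{name:<11} {value:.3f}")

f_critical = 3.24   # from the answers: alpha = 0.05, df = 3 and 16
print("reject H0:", result["f_ratio"] > f_critical)
```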

