What if... You were asked to determine whether psychology and sociology majors have significantly different class attendance (i.e., the number of days a person misses class)? You would simply do a two-tailed, two-sample t-test. Easy!
But what if... You were asked to determine whether psychology, sociology, and biology majors have significantly different class attendance? You can’t do a two-sample t-test: you have three samples, and there is no such thing as a three-sample t-test!
One-Way ANOVA: ANOVA = Analysis of Variance. This is a technique used to analyze the results of an experiment when you have more than two groups.
Example: You measure the number of days 7 psychology majors, 7 sociology majors, and 7 biology majors are absent from class. You wonder whether the average numbers of days absent in these three groups are significantly different from one another.
Hypothesis: Alternative hypothesis (H1). H1: The three population means are not all equal. This can happen in several ways: $\mu_{socio} \neq \mu_{bio}$, $\mu_{socio} \neq \mu_{psych}$, $\mu_{psych} \neq \mu_{bio}$, or all three means differ. Notice: H1 does not say where the difference is!
Hypothesis: Null hypothesis (H0). H0: $\mu_{psych} = \mu_{socio} = \mu_{bio}$. In other words, all three means are equal to one another (i.e., no difference between the means).
Results (mean days absent): $\bar{X}_{socio} = 3.00$, $\bar{X}_{psych} = 2.00$, $\bar{X}_{bio} = 1.00$
Logic: The same as for t-tests. 1) Calculate a variance ratio (called an F; like t-observed) 2) Find a critical value 3) See if the F value falls in the critical area
Between and Within Group Variability: Two types of variability. Between / Treatment: the differences between the mean scores of the three groups. The more different these means are, the more variability!
Between Variability: Compute $S^2$ on the means. For $\bar{X}$ = 3.00, 2.00, 1.00, $S^2 = 1$. Add 5 to the first mean, so that $\bar{X}$ = 8.00, 2.00, 1.00, and $S^2 = 14.33$.
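The two between-group variances can be checked in a few lines. This is a minimal sketch using only the group means quoted on the slides; the variable names are illustrative and not part of the original deck.

```python
from statistics import variance

# Group means taken from the slides.
means_close = [3.00, 2.00, 1.00]
means_spread = [8.00, 2.00, 1.00]  # first mean shifted up by 5

# statistics.variance uses the n - 1 denominator (sample variance).
s2_close = variance(means_close)
s2_spread = variance(means_spread)

print(round(s2_close, 2))
print(round(s2_spread, 2))
```

Moving one mean away from the others is enough to raise the variance of the means sharply, which is the point of the comparison.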
Between Group Variability What causes this variability to increase? 1) Effect of the variable (college major) 2) Sampling error
Between and Within Group Variability: Two types of variability. Within / Error: the variability of the scores within each group.
Within Variability: Compute $S^2$ within each group. $S^2 = .67$ (group with $\bar{X} = 3.00$), $S^2 = 1.67$ (group with $\bar{X} = 2.00$), $S^2 = .67$ (group with $\bar{X} = 1.00$)
Within Group Variability What causes this variability to increase? 1) Sampling error
Between and Within Group Variability: The variance ratio is between-group variability divided by within-group variability, that is, (sampling error + effect of variable) / (sampling error). Thus, if the null hypothesis were true (no effect of the variable), this ratio would be about 1.00. If the null hypothesis were not true, this ratio would tend to be greater than 1.00.
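Calculating this Variance Ratio: $F = \frac{MS_{between}}{MS_{within}}$, where $MS_{between} = \frac{SS_{between}}{df_{between}}$ and $MS_{within} = \frac{SS_{within}}{df_{within}}$. You need the degrees of freedom and the sums of squares.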
Degrees of Freedom (k = number of groups, N = total number of observations)
dfbetween = k - 1 = 3 - 1 = 2
dfwithin = N - k = 21 - 3 = 18
dftotal = N - 1 = 21 - 1 = 20
dftotal = dfbetween + dfwithin: 20 = 2 + 18
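The three degrees-of-freedom rules can be sketched as a small helper. The function name `anova_df` is hypothetical and chosen only for illustration.

```python
def anova_df(k, n_total):
    """Return (df_between, df_within, df_total) for a one-way ANOVA.

    k is the number of groups, n_total the total number of observations.
    """
    df_between = k - 1
    df_within = n_total - k
    df_total = n_total - 1
    # The two parts must add up to the total.
    assert df_total == df_between + df_within
    return df_between, df_within, df_total


# Attendance example: 3 majors, 21 students in total.
print(anova_df(3, 21))
```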
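Sum of Squares: SSBetween, SSWithin, SStotal, where SStotal = SSBetween + SSWithin
$SS_{total} = \sum X^2 - \frac{(\sum X)^2}{N}$
$SS_{within} = \sum X^2 - \frac{\sum T_j^2}{n}$
$SS_{between} = \frac{\sum T_j^2}{n} - \frac{(\sum X)^2}{N}$
Ingredients: $\sum X$, $\sum X^2$, $\sum T_j^2$, N, n. To calculate the SS, collect these ingredients first.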
Ingredients (s = sociology, p = psychology, b = biology)
$\sum X$: $\sum X_s = 21$, $\sum X_p = 14$, $\sum X_b = 7$, so $\sum X = 42$
$\sum X^2$: $\sum X^2_s = 67$, $\sum X^2_p = 38$, $\sum X^2_b = 11$, so $\sum X^2 = 116$
$T^2 = (\sum X)^2$ for each group: $T^2_s = 441$, $T^2_p = 196$, $T^2_b = 49$, so $\sum T_j^2 = 686$
N = 21 (total number of observations), n = 7 (observations per group)
Calculate SS (using $\sum X = 42$, $\sum X^2 = 116$, $\sum T_j^2 = 686$, N = 21, n = 7)
$SS_{total} = \sum X^2 - \frac{(\sum X)^2}{N} = 116 - \frac{42^2}{21} = 116 - 84 = 32$
$SS_{within} = \sum X^2 - \frac{\sum T_j^2}{n} = 116 - \frac{686}{7} = 116 - 98 = 18$
$SS_{between} = \frac{\sum T_j^2}{n} - \frac{(\sum X)^2}{N} = \frac{686}{7} - \frac{42^2}{21} = 98 - 84 = 14$
Sum of Squares: SSBetween = 14, SSWithin = 18, SStotal = 32. Check: SStotal = SSBetween + SSWithin, and 32 = 14 + 18.
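The sums of squares can be recomputed from the per-group sums alone. The sketch below uses the group totals quoted on the slides and the usual computational formulas; the dictionary layout and variable names are illustrative choices, not something the deck prescribes.

```python
# Per-group sums from the slides: sum of scores and sum of squared scores.
sum_x = {"sociology": 21, "psychology": 14, "biology": 7}
sum_x2 = {"sociology": 67, "psychology": 38, "biology": 11}
n = 7                    # observations per group (equal n)
N = n * len(sum_x)       # total observations

grand_sum = sum(sum_x.values())              # sum of all scores
grand_sum_sq = sum(sum_x2.values())          # sum of all squared scores
sum_t2 = sum(t ** 2 for t in sum_x.values()) # sum of squared group totals

correction = grand_sum ** 2 / N              # (sum X)^2 / N

ss_total = grand_sum_sq - correction
ss_within = grand_sum_sq - sum_t2 / n
ss_between = sum_t2 / n - correction

print(ss_between, ss_within, ss_total)
```

Because SStotal is the sum of the other two, computing all three independently gives a quick arithmetic check.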
Calculating the F value
$MS_{between} = \frac{SS_{between}}{df_{between}} = \frac{14}{2} = 7$
$MS_{within} = \frac{SS_{within}}{df_{within}} = \frac{18}{18} = 1$
$F = \frac{MS_{between}}{MS_{within}} = \frac{7}{1} = 7$
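The same arithmetic as a short function: each mean square is a sum of squares divided by its degrees of freedom, and F is their ratio. The name `f_ratio` is hypothetical.

```python
def f_ratio(ss_between, ss_within, df_between, df_within):
    """Return (MS_between, MS_within, F) for a one-way ANOVA."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within


# Values from the attendance example.
ms_b, ms_w, f_value = f_ratio(14, 18, 2, 18)
print(ms_b, ms_w, f_value)
```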
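How to write it out
Source | SS | df | MS | F
Between | 14 | 2 | 7.00 | 7.00
Within | 18 | 18 | 1.00 |
Total | 32 | 20 | |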
Significance Is an F value of 7.0 significant at the .05 level? To find out you need to know both df
Degrees of Freedom: dfbetween = k - 1 = 3 - 1 = 2 (k = number of groups); dfwithin = N - k = 21 - 3 = 18 (N = total number of observations). Use the F table: dfbetween is the numerator df and dfwithin is the denominator df. Write these in the table.
Critical F Value: F crit (2, 18) = 3.55 at the .05 level. The nice thing about the F distribution is that the test is always one-tailed: the critical area is in the upper tail only.
Decision: If the F value is greater than F critical, reject H0 and accept H1. If the F value is less than or equal to F critical, fail to reject H0.
Current Example F value = 7.00 F critical = 3.55 Thus, reject H0, and accept H1
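The critical value can also be checked without a table in this particular case. The sketch below relies on a closed form that holds only when the numerator df is 2 (as it is with three groups): the upper-tail probability of F is $(1 + 2F/df_{within})^{-df_{within}/2}$. For any other numerator df a table or statistics package is needed. The function names are hypothetical.

```python
def f_upper_tail_df2(f, df_within):
    """P(F > f) for an F distribution with 2 numerator df (special case only)."""
    return (1 + 2 * f / df_within) ** (-df_within / 2)


def f_critical_df2(alpha, df_within):
    """Critical F for 2 numerator df, found by inverting the tail formula."""
    return (df_within / 2) * (alpha ** (-2 / df_within) - 1)


f_crit = f_critical_df2(0.05, 18)      # compare with the table value 3.55
p_value = f_upper_tail_df2(7.0, 18)    # tail probability of the observed F

reject_h0 = 7.0 > f_crit
print(round(f_crit, 2), reject_h0)
```

Comparing the observed F with the critical value and comparing the tail probability with .05 always lead to the same decision.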
Alternative hypothesis (H1): H1: The three population means are not all equal. In other words, psychology, sociology, and biology majors do not all have equal class attendance. Notice: H1 does not say where the difference is!
How to write it out
SPSS
Conceptual Understanding: Complete the above table for an ANOVA having 3 levels of the independent variable and n = 20. Test for significance at .05.
Answer: With k = 3 and N = 60, dfbetween = 2 and dfwithin = 57. Most F tables skip df = 57; the neighboring rows give Fcrit = 3.18 (df = 50) and Fcrit = 3.15 (df = 60), so Fcrit (2, 57) is about 3.16.
Conceptual Understanding: Distinguish between between-group variability and within-group variability.
Answer: Between concerns the differences between the mean scores in the various groups. Within concerns the variability of scores within each group.
Between and Within Group Variability: between-group variability / within-group variability = (sampling error + effect of variable) / (sampling error)
Conceptual Understanding: Under what circumstances will the F ratio, over the long run, approach 1.00? Under what circumstances will the F ratio be greater than 1.00?
Answer: The F ratio will approach 1.00 when the null hypothesis is true. The F ratio will tend to be greater than 1.00 when the null hypothesis is not true.
Conceptual Understanding: Without computing the SS within, what must its value be? Why?
Answer: The SS within is 0. All the scores within a group are the same (i.e., there is NO variability within groups).
Example: Freshmen, sophomores, juniors, and seniors. Measure: happiness (1-100).
ANOVA: The traditional F test just tells you that not all the means are equal. It does not tell you which means are different from other means.
Why not do t-tests for all pairs? Freshman vs. sophomore; freshman vs. junior; freshman vs. senior; sophomore vs. junior; sophomore vs. senior; junior vs. senior.
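Problem: The probability of making at least one Type 1 error increases with the number of comparisons. Maximum value = number of comparisons × .05. With four groups there are 6 comparisons: 6 × .05 = .30. What if there were more than four groups?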
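How quickly the problem grows can be sketched as follows. The upper bound (comparisons × alpha) is the figure used on the slide. The second figure, 1 - (1 - alpha)^c, is added here as an assumption: it would apply only if the comparisons were independent, which pairwise tests on the same groups are not, so treat it as a rough guide. The function name is hypothetical.

```python
from math import comb


def familywise_error(n_groups, alpha=0.05):
    """Return (number of pairwise comparisons, upper bound, independent-case rate)."""
    n_comparisons = comb(n_groups, 2)           # all pairs of groups
    upper_bound = min(1.0, n_comparisons * alpha)
    # Assumes independent comparisons; an approximation for pairwise t-tests.
    if_independent = 1 - (1 - alpha) ** n_comparisons
    return n_comparisons, upper_bound, if_independent


for groups in (3, 4, 5, 6):
    c, bound, approx = familywise_error(groups)
    print(groups, c, round(bound, 2), round(approx, 3))
```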
Chapter 12: A Priori and Post Hoc Comparisons. Topics: multiple t-tests; linear contrasts; orthogonal contrasts; trend analysis; Bonferroni t; Fisher's Least Significant Difference; Studentized Range Statistic; Dunnett’s Test.
Multiple t-tests: Good if you have just a couple of planned comparisons. Do a normal t-test, but use the other groups to help estimate your error term. This helps increase your df.
Remember
Note
Proof
Candy | Gender
5.00 | 1.00
4.00 | 1.00
7.00 | 1.00
6.00 | 1.00
1.00 | 2.00
2.00 | 2.00
3.00 | 2.00
4.00 | 2.00
t = 2.667 / .641 = 4.16
Also, when F has 1 df between (two groups), $F = t^2$
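With two groups, the pooled variance from a t-test equals MS within from an ANOVA, and the squared t equals F. The sketch below checks both on made-up scores; the data and variable names are hypothetical and are not the values from the slides.

```python
from statistics import mean, variance

# Hypothetical scores for two equal-sized groups (not from the slides).
group_a = [10.0, 12.0, 11.0, 15.0]
group_b = [8.0, 9.0, 7.0, 12.0]
n = len(group_a)

# Pooled-variance t-test (equal n, so the pooled variance is a simple average).
pooled_var = (variance(group_a) + variance(group_b)) / 2
t = (mean(group_a) - mean(group_b)) / (2 * pooled_var / n) ** 0.5

# One-way ANOVA on the same two groups.
grand_mean = mean(group_a + group_b)
ss_between = n * ((mean(group_a) - grand_mean) ** 2
                  + (mean(group_b) - grand_mean) ** 2)
ss_within = (sum((x - mean(group_a)) ** 2 for x in group_a)
             + sum((x - mean(group_b)) ** 2 for x in group_b))
ms_between = ss_between / 1             # df between = k - 1 = 1
ms_within = ss_within / (2 * n - 2)     # df within = N - k
f = ms_between / ms_within

print(round(t ** 2, 4), round(f, 4))    # the two values agree
```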
Within Variability Within variability of all the groups represents “error” You can therefore get a better estimate of error by using all of the groups in your ANOVA when computing a t-value
Note: This formula is for equal n
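A minimal sketch of a t-test that borrows its error term from the whole ANOVA, assuming the standard equal-n form $t = \frac{\bar{X}_i - \bar{X}_j}{\sqrt{2 \, MS_{within} / n}}$, evaluated on df within. The means, MS within, and n below are hypothetical placeholders, chosen so that four groups of six give the 20 df used in the happiness example; only the critical value 2.086 comes from the slides.

```python
from math import sqrt


def t_with_ms_within(mean_i, mean_j, ms_within, n):
    """t for two group means using MS within as the error term (equal n)."""
    return (mean_i - mean_j) / sqrt(2 * ms_within / n)


T_CRIT_20_DF = 2.086  # two-tailed .05 critical value for 20 df (from the slides)

# Hypothetical values: 4 groups of n = 6 gives df within = 24 - 4 = 20.
t_obs = t_with_ms_within(mean_i=70.0, mean_j=62.0, ms_within=24.0, n=6)
significant = abs(t_obs) > T_CRIT_20_DF

print(round(t_obs, 3), significant)
```

Using MS within from all groups, rather than pooling only the two groups being compared, is what raises the df for the test.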
Hyp 1: Juniors and seniors will have different levels of happiness. Hyp 2: Seniors and freshmen will have different levels of happiness.
Hyp 1: Juniors and seniors will have different levels of happiness. t crit (20 df) = 2.086. Result: juniors and seniors do have significantly different levels of happiness.
Hyp 2: Seniors and freshmen will have different levels of happiness. t crit (20 df) = 2.086. Result: freshmen and seniors do not have significantly different levels of happiness.
Hyp 1: Juniors and Sophomores will have different levels of happiness Hyp 2: Seniors and Sophomores will have different levels of happiness PRACTICE!