Psy B07 Chapter 1Slide 1 ANALYSIS OF VARIANCE
Psy B07 Chapter 1Slide 2 t-test refresher In chapter 7 we talked about analyses that could be conducted to test whether pairs of means were significantly different. For example, consider an experiment in which we are testing whether using caffeine improves final marks on an exam. We might have two groups, one group (say 12 subjects) who is given normal coffee while they study, another group (say also 12 subjects) who is given the same amount of decaffeinated coffee.
Psy B07 Chapter 1Slide 3 t-test refresher We could now look at the exam marks for those students and compare the means of the two groups using a “between-subjects” (or independent samples) t-test.
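As a rough sketch of that comparison, the pooled-variance t statistic can be computed as below. The marks are made up for illustration (four per group rather than twelve), and `independent_t` is a hypothetical helper name, not something from the course materials.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical exam marks (made up for illustration, not the slide data).
caffeine = [70, 75, 80, 75]
decaf = [60, 65, 70, 65]

def independent_t(a, b):
    """Pooled-variance t statistic and its degrees of freedom."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    # Pool the two sample variances, weighting each by its own df.
    pooled = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df
    # Standard error of the difference between the two means.
    se = sqrt(pooled * (1 / n_a + 1 / n_b))
    return (mean(a) - mean(b)) / se, df

t, df = independent_t(caffeine, decaf)
print(f"t({df}) = {t:.2f}")
```

The numerator is the difference between the groups and the denominator is built from the variability within the groups, which is the comparison the rest of the chapter generalizes.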
Psy B07 Chapter 1Slide 5 t-test refresher The critical point of the previous example is the following: The basic logic for testing whether or not two means are different is to compare the size of the differences between the groups (which we assume are due to caffeine) with the size of the differences within the groups (which we assume are due to random variation, or error).
Psy B07 Chapter 1Slide 6 t-test refresher This exact logic underlies virtually all statistical tests, including analysis of variance, an analysis that allows us to compare multiple means simultaneously.
Psy B07 Chapter 1Slide 7 Analysis of Variance (ANOVA) – the why? The purpose of analysis of variance is to let us ask whether means are different when we have more than just two means (or, said another way, when our variable has more than two levels). In the caffeine study, for example, we were interested in only one variable (caffeine) and we examined two levels of that variable: no caffeine versus some caffeine. Alternatively, we might want to test different dosages of caffeine, where each dosage would now be considered a “level” of caffeine.
Psy B07 Chapter 1Slide 8 Analysis of Variance (ANOVA) – the why? As you’ll see in PsyC08, as you learn about more complicated ANOVAs (and the experimental designs associated with them) we may even be interested in multiple variables, each of which may have more than two levels. For example, we might want to simultaneously consider the effect of caffeine (perhaps several different dose levels) and gender (generally just two levels) on test performance.
Psy B07 Chapter 1Slide 9 Analysis of Variance (ANOVA) – the what? The critical question is: is the variance between the groups sufficiently larger than the variance within the groups to allow us to conclude that the between-group differences are more than just random variation?
Psy B07 Chapter 1Slide 13 Analysis of Variance (ANOVA) – the how? The textbook presents the logic in a more verbal/statistical manner, and it can’t hurt to think of this in as many different ways as possible. So, in that style: let’s say we were interested in testing three doses of caffeine: none, moderate, and high.
Psy B07 Chapter 1Slide 14 Analysis of Variance (ANOVA) – the how? First of all, use of analysis of variance assumes that these groups have (1) data that are approximately normally distributed, (2) approximately equal variances, and (3) observations that are independent of one another. Given the first two assumptions, only the means can differ across the groups. Thus, if the variable we are interested in is having an effect on performance, we assume it will do so by affecting the mean performance level.
Psy B07 Chapter 1Slide 16 Analysis of Variance (ANOVA) – the how? [Table: mean, variance ($s^2$), and standard deviation ($s$) for each group]
Psy B07 Chapter 1Slide 17 Analysis of Variance (ANOVA) – the how? From these data, we can generate two estimates of the population variance $\sigma^2$. “Error” estimate ($\sigma^2_e$): One estimate we can generate makes no assumptions about whether the null hypothesis is true or false. Specifically, the variance within each group provides an estimate of $\sigma^2_e$.
Psy B07 Chapter 1Slide 18 Analysis of Variance (ANOVA) – the how? Given the assumption of equal variances, each group variance provides an estimate of $\sigma^2$, so our best estimate of $\sigma^2$ is the mean of the group variances. This estimate of the population variance is sometimes called the mean squared error (MS error) or the mean squared within (MS within).
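A minimal sketch of this estimate, assuming equal group sizes. The scores are hypothetical (three doses, four subjects each) and are not the data from the slides.

```python
from statistics import mean, variance

# Hypothetical scores for three caffeine doses (illustrative only).
groups = {
    "none": [60, 65, 70, 65],
    "moderate": [70, 75, 80, 75],
    "high": [66, 70, 74, 70],
}

# Each within-group sample variance is a separate estimate of sigma^2.
group_variances = [variance(scores) for scores in groups.values()]

# With equal group sizes, MS error is simply the mean of those variances.
ms_error = mean(group_variances)
print(f"MS error = {ms_error:.2f}")
```

Nothing here depends on whether the group means differ, which is why this estimate is safe to use whether or not the null hypothesis is true.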
Psy B07 Chapter 1Slide 19 Analysis of Variance (ANOVA) – the how? Treatment estimate ($\sigma^2_\tau$): Alternatively, if we assume the null hypothesis is true (i.e., that there is no difference between the groups), then another way to estimate the population variance is to use the variance of the means across the groups. By the central limit theorem, the variance of our sample means estimates the population variance divided by n, where n equals the number of subjects in each group: $s^2_{\bar{X}} = \sigma^2 / n$.
Psy B07 Chapter 1Slide 20 Analysis of Variance (ANOVA) – the how? Therefore, employing some algebra: $\sigma^2 = n \, s^2_{\bar{X}}$, where $s^2_{\bar{X}}$ is the variance of the group means. This estimate is also called the mean squared treatment (MS treat) or mean squared between (MS between).
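The same idea as a short sketch: take the variance of the group means and multiply by n. The scores are hypothetical and the group sizes are assumed equal.

```python
from statistics import mean, variance

# Hypothetical scores for three caffeine doses (illustrative only).
groups = [
    [60, 65, 70, 65],   # none
    [70, 75, 80, 75],   # moderate
    [66, 70, 74, 70],   # high
]
n = len(groups[0])  # subjects per group (equal n assumed)

group_means = [mean(g) for g in groups]

# Under the null hypothesis the variance of the means estimates sigma^2 / n,
# so multiplying by n turns it back into an estimate of sigma^2.
ms_treat = n * variance(group_means)
print(f"MS treat = {ms_treat:.2f}")
```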
Psy B07 Chapter 1Slide 21 Analysis of Variance (ANOVA) – the how? OK, so if the null hypothesis really is true and there is no difference between the groups, then these two estimates will, apart from sampling error, be the same: $\sigma^2_\tau = \sigma^2_e$. However, if the treatment is having an effect, this will inflate $\sigma^2_\tau$, as it will reflect not only variance due to random variation but also variance due to the treatment (or variable).
Psy B07 Chapter 1Slide 22 Analysis of Variance (ANOVA) – the how? The treatment will not affect $\sigma^2_e$. Therefore, by comparing these two estimates of the population variance, we can assess whether the treatment is having an effect:
$$\frac{\text{Measure of Chance Variance} + \text{Treatment Effect}}{\text{Measure of Chance Variance Only}}$$
Psy B07 Chapter 1Slide 23 Analysis of Variance (ANOVA) – the how?
1) Calculate SS error, SS treat, and SS total.
2) Calculate df error, df treat, and df total.
3) Divide each SS by its relevant df to arrive at MS error and MS treat (and MS total).
4) Divide MS treat by MS error to get the F-ratio, which we then use for hypothesis testing.
Psy B07 Chapter 1Slide 24 Sums of Squares The sum of squares is simply the sum of the squared deviations of observations from some mean: $SS = \sum (X - \bar{X})^2$. OK, so rather than directly calculating MS error and MS treat (which are actually estimates of the variance within and between groups), we first calculate SS error and SS treat.
Psy B07 Chapter 1Slide 26 SS error To calculate SS error, we subtract the mean of each condition from each score in that condition, square the differences, and add them up to get a sum of squares for each group. We then add up the sums of squares of all the groups.
Psy B07 Chapter 1Slide 27 SS error There is a different way of doing this. First, calculate $\sum X^2$ for each group. For example, for Group 1, the $\sum X^2$ would equal ( … ) = Once we have them, we then calculate the sum of squares for each group using the computational formula: $SS = \sum X^2 - (\sum X)^2 / n$.
Psy B07 Chapter 1Slide 28 SS error For example, for Group 1, the math would be: To get SS error, we then add up the sums of squares of the three groups: $SS_{\text{error}} = SS_1 + SS_2 + SS_3$.
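Both routes to SS error can be sketched as follows. The scores are hypothetical, the function names are illustrative, and the computational version assumes the usual formula $SS = \sum X^2 - (\sum X)^2 / n$.

```python
# Hypothetical scores for three caffeine doses (illustrative only).
groups = [
    [60, 65, 70, 65],   # none
    [70, 75, 80, 75],   # moderate
    [66, 70, 74, 70],   # high
]

def ss_definitional(scores):
    """Sum of squared deviations from the group mean."""
    m = sum(scores) / len(scores)
    return sum((x - m) ** 2 for x in scores)

def ss_computational(scores):
    """Same quantity from the sum of X and the sum of X squared."""
    return sum(x * x for x in scores) - sum(scores) ** 2 / len(scores)

# SS error is the sum of the within-group sums of squares.
ss_error_def = sum(ss_definitional(g) for g in groups)
ss_error = sum(ss_computational(g) for g in groups)
print(ss_error_def, ss_error)
```

The two versions agree; the computational one simply avoids subtracting the mean from every score by hand.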
Psy B07 Chapter 1Slide 29 SS treat To calculate SS treat we subtract the grand mean from each group mean, square the differences, sum them up, and multiply by n.
Psy B07 Chapter 1Slide 30 SS treat Again, there is a different way of doing this. Basically, all we need are our three means and the squares of those means. We then calculate the sum of the means, and the sum of the squared means:
Psy B07 Chapter 1Slide 31 SS treat Now we can calculate the SS of the means using a formula similar to the one before: $SS_{\bar{X}} = \sum \bar{X}^2 - (\sum \bar{X})^2 / k$, where k is the number of groups. Once again, because we are dealing with means and not observations, we need to multiply this number by the n that went into each mean to get the real SS treat: SS treat = 12(39.81) = 477.72
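A short sketch of SS treat computed both ways, from deviations of the group means and from the sums of the means. The scores are hypothetical and equal group sizes are assumed, so the grand mean is just the mean of the group means.

```python
# Hypothetical scores for three caffeine doses (illustrative only).
groups = [
    [60, 65, 70, 65],   # none
    [70, 75, 80, 75],   # moderate
    [66, 70, 74, 70],   # high
]
n = len(groups[0])   # subjects per group (equal n assumed)
k = len(groups)      # number of groups

means = [sum(g) / n for g in groups]
grand_mean = sum(means) / k  # valid because every group has the same n

# Definitional route: squared deviations of the means, times n.
ss_treat_def = n * sum((m - grand_mean) ** 2 for m in means)

# Computational route: sum of squared means and sum of means, times n.
ss_treat = n * (sum(m * m for m in means) - sum(means) ** 2 / k)
print(ss_treat_def, ss_treat)
```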
Psy B07 Chapter 1Slide 32 SS total The sum of squares total is simply the sum of squares of all of the data points, ignoring the fact that there are separate groups at all. To calculate it, subtract the grand mean from every score, square the differences, and add them up
Psy B07 Chapter 1Slide 33 SS total Surprise, surprise – there is another way of calculating this as well. Here you will need the sum of all the data points, and the sum of all the squared data points. An easy way to get these is to just add up the $\sum X$ and the $\sum X^2$ for the groups: $\sum X = \sum X_1 + \sum X_2 + \sum X_3 = 2602$ and $\sum X^2 = \sum X_1^2 + \sum X_2^2 + \sum X_3^2$.
Psy B07 Chapter 1Slide 34 SS total Then, again using a version of the old SS formula: $SS_{\text{total}} = \sum X^2 - (\sum X)^2 / N$, where N is the total number of scores. If all is right in the world, then SS total should equal SS error + SS treat. For us, it does.
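The additivity check can be sketched like this, again with hypothetical scores and equal group sizes. The helper `ss` is an illustrative name for the definitional sum of squares.

```python
# Hypothetical scores for three caffeine doses (illustrative only).
groups = [
    [60, 65, 70, 65],   # none
    [70, 75, 80, 75],   # moderate
    [66, 70, 74, 70],   # high
]
n = len(groups[0])

def ss(values):
    """Sum of squared deviations of values from their own mean."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

# SS total ignores group membership: pool every score together.
scores = [x for g in groups for x in g]
N = len(scores)
ss_total = sum(x * x for x in scores) - sum(scores) ** 2 / N

# The two components computed separately.
ss_error = sum(ss(g) for g in groups)
ss_treat = n * ss([sum(g) / n for g in groups])

print(ss_total, ss_error + ss_treat)
```

If the two printed numbers differ by more than rounding error, one of the sums of squares has been miscalculated.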
Psy B07 Chapter 1Slide 35 df OK, so now that we have our three sums of squares, step two is to figure out the appropriate degrees of freedom for each. Here are the formulae:
df error = k(n-1)
df treat = k-1
df total = N-1
where k = the number of groups, n = the number of subjects within each group, and N = the total number of subjects.
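For the caffeine example (k = 3 doses, n = 12 subjects per dose), these formulae work out as in the sketch below. The function name is illustrative, and equal group sizes are assumed so that N = kn.

```python
def degrees_of_freedom(k, n):
    """Degrees of freedom for a one-way ANOVA with k groups of n subjects."""
    N = k * n  # total number of subjects (equal group sizes assumed)
    return {"treat": k - 1, "error": k * (n - 1), "total": N - 1}

df = degrees_of_freedom(k=3, n=12)
print(df)

# The treatment and error df always add up to the total df.
assert df["treat"] + df["error"] == df["total"]
```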
Psy B07 Chapter 1Slide 36 From SS to MS to F MS estimates for treatment and within are calculated by dividing the appropriate sum of squares by its associated degrees of freedom. We then compute an F-ratio by dividing the MS treat by the MS error. Finally, we place all these values in a Source Table that clearly shows all the steps leading up to the final F value.
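One way to sketch the step from SS to MS to F, and the layout of a source table. The sums of squares passed in are hypothetical values for three groups of four, not the slide data, and `source_table` is an illustrative name.

```python
def source_table(ss_treat, ss_error, k, n):
    """Return rows of (source, SS, df, MS, F) for a one-way ANOVA."""
    df_treat, df_error = k - 1, k * (n - 1)
    ms_treat = ss_treat / df_treat
    ms_error = ss_error / df_error
    f_ratio = ms_treat / ms_error
    return [
        ("Treatment", ss_treat, df_treat, ms_treat, f_ratio),
        ("Error", ss_error, df_error, ms_error, None),
        ("Total", ss_treat + ss_error, df_treat + df_error, None, None),
    ]

# Hypothetical sums of squares for k = 3 groups of n = 4 (illustrative only).
rows = source_table(ss_treat=200.0, ss_error=132.0, k=3, n=4)

print(f"{'Source':<10}{'SS':>8}{'df':>4}{'MS':>8}{'F':>7}")
for source, ss, df, ms, f in rows:
    ms_text = "" if ms is None else f"{ms:.2f}"
    f_text = "" if f is None else f"{f:.2f}"
    print(f"{source:<10}{ss:>8.2f}{df:>4}{ms_text:>8}{f_text:>7}")
```

Only the treatment row carries an F value, because F is the ratio of the treatment MS to the error MS.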
Psy B07 Chapter 1Slide 37 ANOVA source table The source table for our data would look like this: OK, now what?
Psy B07 Chapter 1Slide 38 Hypothesis Testing Now we are finally ready to get back to the notion of hypothesis testing. That is, we are now ready to answer the following question: If there is really no effect of caffeine on performance, what is the probability of observing an F-ratio as large as the one we obtained? More specifically, is that probability less than our chosen level of alpha (e.g., .05)?
Psy B07 Chapter 1Slide 39 Sampling distribution of F How do we arrive at the probability of observing some specific F value? Recall our example when we created 3 groups by randomly sampling individuals from the same population and asking them for some piece of data (e.g. age). In this case, the null hypothesis should be true … the means of the three groups should only vary as a result of chance (or error) variation
Psy B07 Chapter 1Slide 40 Sampling distribution of F If we perform an analysis of variance on these data, the F value should be about 1. However, it will not be exactly 1; rather, over repeated samples there will be a distribution of F values with a mean close to 1 and some variance around that mean. This distribution is termed the F distribution, and its exact shape varies as a function of df treat and df error. The important point here is that for any given degrees of freedom, the function can be mathematically specified, allowing one to perform calculus and, therefore, to find the probabilities of certain values.
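The sampling distribution can also be approximated by brute force rather than calculus. The sketch below repeatedly draws three groups of 12 from one normal population (so the null hypothesis is true by construction) and collects the resulting F values. The population mean and SD, the seed, and the number of replications are arbitrary choices for illustration.

```python
import random
from statistics import mean, variance

def f_ratio(groups):
    """One-way ANOVA F for equal-sized groups."""
    n = len(groups[0])
    ms_treat = n * variance([mean(g) for g in groups])
    ms_error = mean([variance(g) for g in groups])
    return ms_treat / ms_error

rng = random.Random(0)       # fixed seed so the run is repeatable
k, n = 3, 12                 # matches df treat = 2 and df error = 33
f_values = []
for _ in range(4000):
    # All groups come from the same population: the null is true.
    groups = [[rng.gauss(100, 15) for _ in range(n)] for _ in range(k)]
    f_values.append(f_ratio(groups))

f_values.sort()
mean_f = mean(f_values)
cutoff_95 = f_values[int(0.95 * len(f_values))]  # empirical 95th percentile
print(f"mean F = {mean_f:.2f}, 95th percentile = {cutoff_95:.2f}")
```

In theory the mean of an F distribution is df error / (df error - 2), which is slightly above 1 for finite df error, so a simulated mean a little over 1 is expected rather than a sign of a mistake.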
Psy B07 Chapter 1Slide 41 Hypothesis Testing All we really want to know is whether the F we have obtained in our analysis is significantly larger than we would expect by chance. That is, we want to know whether it falls within the extreme “high” 5% of the chance distribution. Thus, all we really need to know is the critical F value that “cuts off” the extreme 5% of the distribution. If our obtained F is larger than the critical F, we know it is in the “rejection region” and, therefore, that the probability of obtaining an F that large is less than 5%.
Psy B07 Chapter 1Slide 42 Finishing the example From the table, F crit (2,33) = 3.32. Since F obt (3.99) > F crit (3.32), we reject the null hypothesis.
[Figure: the F distribution, with its mean (= 1) and F crit (= 3.32) marked]
Psy B07 Chapter 1Slide 43 Finishing the example One thing to keep in mind – all an ANOVA (significant) tells you is that there is a difference between the means. You can’t tell where exactly this difference lies just yet. That’s in chapter 12 – and PsyC08
Psy B07 Chapter 1Slide 44 Violation of Assumptions The textbook discusses this issue in detail and offers a couple of solutions (including some really nasty formulae) for what to do when the variances of the groups are not homogeneous. What I want you to know is the following:
1) If the biggest variance is more than 4 times larger than the smallest variance, you may have a problem.
2) There are things that you can do to calculate an F if the variances are heterogeneous.
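The rule of thumb in point 1 is easy to check mechanically. A minimal sketch with hypothetical scores follows; the function name and the threshold argument are illustrative, with the default of 4 taken from the rule above.

```python
from statistics import variance

def variance_ratio(groups):
    """Largest group variance divided by the smallest."""
    variances = [variance(g) for g in groups]
    return max(variances) / min(variances)

def may_violate_homogeneity(groups, threshold=4.0):
    """Flag the data if the variance ratio exceeds the rule-of-thumb threshold."""
    return variance_ratio(groups) > threshold

# Hypothetical scores for three caffeine doses (illustrative only).
groups = [
    [60, 65, 70, 65],   # none
    [70, 75, 80, 75],   # moderate
    [66, 70, 74, 70],   # high
]

ratio = variance_ratio(groups)
flagged = may_violate_homogeneity(groups)
print(f"ratio = {ratio:.2f}, possible problem: {flagged}")
```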
Psy B07 Chapter 1Slide 45 The Structural Model Let’s assume that the average height of all people is 5’7”. Let’s also assume that males tend to be 2” taller than females, on average. Given this, I can describe anyone’s height using three components: 1) the mean height of all people, 2) the component due to sex, and 3) individual contributions. My height is about 6’0”. I can break this down into: 5’7” + 2” + 3”.
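Psy B07 Chapter 1Slide 46 The Structural Model In more general terms, we can write the model out like this: $X_{ij} = \mu + \tau_j + \varepsilon_{ij}$, where $\mu$ is the grand mean, $\tau_j$ is the effect of being in group j, and $\varepsilon_{ij}$ is the individual (error) component for person i in group j.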
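To make the structural model concrete, the sketch below splits every score into three parts, assuming the usual one-way form $X_{ij} = \mu + \tau_j + \varepsilon_{ij}$ with the sample grand mean, group means, and residuals standing in for the population quantities. The scores are hypothetical and the group sizes are equal.

```python
# Hypothetical scores for three caffeine doses (illustrative only).
groups = {
    "none": [60, 65, 70, 65],
    "moderate": [70, 75, 80, 75],
    "high": [66, 70, 74, 70],
}

all_scores = [x for scores in groups.values() for x in scores]
grand_mean = sum(all_scores) / len(all_scores)   # estimate of mu

decomposition = []
for name, scores in groups.items():
    group_mean = sum(scores) / len(scores)
    effect = group_mean - grand_mean             # estimate of tau_j
    for x in scores:
        residual = x - group_mean                # estimate of epsilon_ij
        decomposition.append((name, x, grand_mean, effect, residual))

# Every score is recovered exactly as the sum of its three parts.
for name, x, mu, tau, eps in decomposition:
    assert abs(x - (mu + tau + eps)) < 1e-9

print(decomposition[0])
```

Squaring and summing the effect column gives SS treat, and doing the same with the residual column gives SS error, which is how the structural model connects back to the sums of squares.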