Chapter 9: The analysis of variance for simple experiments (single factor, unrelated groups designs).
Overview of experimental research
Groups start off similar to each other on every possible measure. During the experiment, the groups are TREATED DIFFERENTLY. Responses thought to be affected by the different treatments are then measured. If the group means become more different than they should be if only random sampling fluctuation were at work, the differences may have been caused, in part, by the different ways the groups were treated. Determining whether the differences between group means result simply from sampling fluctuation or are (probably) due in part to the treatment differences is the job of the statistical analysis.
Null hypothesis
Null hypothesis: The different ways the groups are treated have no effects or all have the same effect.
– Corollary: Any differences among the group means at the end of the experiment continue to reflect the same thing as they did before the experiment: random sampling fluctuation.
Experimental hypothesis
Experimental hypothesis: The different ways the groups are treated affect the groups differently.
– Corollary: At the end of the experiment, the group means differ too much for their differences to be explained by random sampling fluctuation. Being treated differently causes the groups to become more different. That is, the group means will be farther apart.
Groups will always differ simply because of sampling fluctuation caused by ID and MP
In this course, we will study the analysis of experiments in which groups treated differently are measured once and compared to each other. Groups differ from each other because they are composed of different people (ID) and because there are always measurement problems (MP). Just by chance, if there are three groups, one will have the highest mean, another the lowest mean, and the third will fall in the middle. Sigma², as estimated by MS_W, is our basic measure of the effects of ID and MP.
The logic of the F test
We are asking whether the difference between the treatment group means at the end of the experiment is solely the result of random sampling fluctuation (H0) or of random sampling fluctuation plus the effects of treating the groups differently (H1). So we need two measures. One measure should only be able to reflect random sampling fluctuation. The other should be able to reflect random sampling fluctuation plus the effects of the IV.
Our two estimates of sigma²
One way to estimate sigma² is to find the difference between each score and its group mean, then square and sum those differences. This yields the sum of squares within groups (SS_W). To estimate sigma², you divide SS_W by the degrees of freedom within groups (df_W = n - k). This estimate of sigma² is called the mean square within groups, MS_W. You have been calculating it since Chapter 5.
The other way to estimate sigma² is to find the difference between each participant's group mean and the overall mean, then square and sum those differences. This yields the sum of squares between the group and grand means (SS_B). To estimate sigma², you divide SS_B by the degrees of freedom between groups (df_B = k - 1). The result is called the mean square between groups, MS_B. It is new.
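Those two estimates can be computed directly from the definitions above. Here is a minimal sketch in Python (the function name and the NumPy dependency are choices of this sketch, not from the slides):

```python
import numpy as np

def mean_squares(groups):
    """Return (MS_W, MS_B) for a list of unrelated groups of scores."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    all_scores = np.concatenate(groups)
    grand_mean = all_scores.mean()
    n, k = len(all_scores), len(groups)

    # SS_W: each score's squared distance from its own group mean
    ss_w = sum(((g - g.mean()) ** 2).sum() for g in groups)
    # SS_B: each participant's group mean's squared distance from the grand mean
    ss_b = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)

    ms_w = ss_w / (n - k)   # df_W = n - k
    ms_b = ss_b / (k - 1)   # df_B = k - 1
    return ms_w, ms_b
```

With the stress-experiment scores tabled later in the chapter, mean_squares([[8, 10, 12], [12, 12, 15], [13, 17, 18]]) returns roughly (4.67, 27.0).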
What is indexed by sigma² and its best estimate, MS_W
Sigma² indexes random sampling fluctuation. Its best estimate is MS_W. Remember that everyone in a treatment group is treated the same way. Thus, any differences among the scores within each group, the basis of MS_W, can only reflect ID + MP. Thus MS_W cannot reflect any effects of the independent variable. Because it can only index random individual differences and measurement problems, MS_W is always your best estimate of the effects of random sampling fluctuation as indexed by sigma², the population variance. We will use MS_W as the denominator in the F ratio.
What is indexed by the mean square between groups (MS_B)?
Group means will differ from each other and from mu because they are made up of different people and because there are always measurement problems. This is the basis of random sampling fluctuation. However, since we treat the groups differently, the distance between each group's mean and the overall mean can also reflect the effects of the independent variable (as well as the effects of random individual differences and random measurement problems).
What is indexed by the mean square between groups (MS_B)?
Thus MS_B reflects individual differences, measurement problems, and the possible effect of the independent variable (MS_B = ID + MP + (?)IV). If treating the groups differently (the IV) has an effect, it will push the group means apart. When that happens, MS_B reflects three sources of variation (ID + MP + IV), not just ID + MP. In that case, MS_B will overestimate sigma² and be larger than MS_W. In the single factor analysis of variance (Chapter 9), MS_B will serve as the numerator in the F ratio.
Testing the Null Hypothesis (H0)
H0 says that the IV has no effect. If H0 is true, groups differ from each other and from the overall mean only because of sampling fluctuation based on random individual differences and measurement problems (ID + MP). These are the same things that make scores differ from their own group means. So, according to H0, MS_B and MS_W are two ways of measuring the same thing (ID + MP) and are both good estimates of sigma². Two measurements of the same thing should be about equal to each other, and a ratio between them should be about equal to 1.00.
In simple experiments (Ch. 9), the ratio between MS_B and MS_W is the Fisher or F ratio. In simple experiments, F = MS_B / MS_W. H0 says F should be about 1.00.
The Experimental Hypothesis (H1)
The experimental hypothesis says that the groups' means will be made different from each other (pushed apart) by the IV, the independent variable (as well as by random individual differences and measurement problems). If the means are pushed apart, MS_B will increase, reflecting the effects of the independent variable (as well as of the random factors). MS_W will not. So MS_B will be larger than MS_W. Therefore, H1 suggests that an F ratio comparing MS_B to MS_W should be larger than 1.00.
As usual, we set up 95% confidence intervals around the prediction of the null. In Ch. 9, the ratio MS_B / MS_W is called the F ratio. If the F ratio is about 1.00, the prediction of the null is correct. It is rare for the F ratio to be exactly 1.00. At some point, the ratio gets too different from 1.00 to be consistent with the null. We are only interested in the case where the ratio is greater than 1.00, which indicates that the means are farther apart than the null suggests. The F table tells us when the difference among the means is too large to be explained as reflecting only ID + MP, sampling fluctuation alone.
So, to conduct an F test in a simple experiment
You compute MS_W and MS_B. F = MS_B / MS_W. You compare the result to the F table. If your ratio is less than the critical value of F found in the table, your data are nonsignificant, consistent with H0. If your ratio equals or exceeds the critical value in the F table, your data are significant. You must reject H0 and extrapolate your results to the population from which your samples were drawn.
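As a sketch of that decision rule in code (assuming SciPy is available; the function name and the alpha argument are illustrative, not from the text):

```python
from scipy.stats import f

def f_test(ms_b, ms_w, df_b, df_w, alpha=0.05):
    """Compare an observed F ratio with the critical value of the F distribution."""
    f_obs = ms_b / ms_w
    f_crit = f.ppf(1 - alpha, df_b, df_w)   # the same value you would read from the F table
    significant = f_obs >= f_crit           # reject H0 only if the ratio reaches the critical value
    return f_obs, f_crit, significant
```

The comparison is one-tailed: only ratios well above 1.00 count as evidence against the null, which is why the critical value comes from the upper tail of the F distribution.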
REMEMBER, YOU NEVER TEST THE EXPERIMENTAL HYPOTHESIS STATISTICALLY. You can only examine the data in light of the experimental hypothesis after rejecting the null. Good research design makes the experimental hypothesis the only reasonable alternative to the null. Accepting the experimental hypothesis is based on good research design and logic, not statistical tests.
What comes next
We need to do two things:
– Learn to compute the F and t tests
– Review the logic we've just gone through so that you fully understand it
We will do them in that order, first analyzing data from typical single factor, unrelated groups experiments. Then we will review the concepts.
Analyzing the results of an experiment
An experiment
Population: Male, self-selected "social drinkers"
Number of participants (9) and groups (3)
Design: Single factor, unrelated groups
Independent variable: Stress
– Level 1: No stress: Participants labeled 1.1, 1.2, 1.3
– Level 2: Moderate stress: Participants 2.1, 2.2, 2.3
– Level 3: High stress: Participants 3.1, 3.2, 3.3
Dependent variable: Ounces of alcoholic beverage consumed
H0: Stress does not affect alcohol consumption.
H1: Stress will cause increased alcohol consumption.
Computing MS_W and MS_B

Participant     1.1  1.2  1.3   2.1  2.2  2.3   3.1  3.2  3.3
Score             8   10   12    12   12   15    13   17   18
Group mean       10   10   10    13   13   13    16   16   16   (grand mean = 13)

Within-group deviation (score - group mean):
                 -2    0    2    -1   -1    2    -3    1    2
Squared:          4    0    4     1    1    4     9    1    4   -> SS_W = 28

Between-group deviation (group mean - grand mean):
                 -3   -3   -3     0    0    0     3    3    3
Squared:          9    9    9     0    0    0     9    9    9   -> SS_B = 54
ANOVA summary table

Source                          SS   df   MS      F      p
Between groups (Stress level)   54    2   27.00   5.78   ?
Within groups (Error)           28    6    4.67

Divide MS_B by MS_W to calculate F. We need to look at the F table to determine significance.
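As a check on the hand computation, SciPy's one-way ANOVA reproduces the same F ratio (a sketch; the scores are the ones tabled above):

```python
from scipy.stats import f_oneway

no_stress = [8, 10, 12]
moderate  = [12, 12, 15]
high      = [13, 17, 18]

result = f_oneway(no_stress, moderate, high)
print(round(result.statistic, 2))   # 5.79 (the table's 5.78 comes from rounding MS_W to 4.67 first)
print(round(result.pvalue, 3))      # a little below .05
```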
Using the F table
To use the F table, you must specify the degrees of freedom (df) for the numerator and denominator of the F ratio. In both Ch. 9 and Ch. 10, the denominator is MS_W. As you know, df_W = n - k. In Ch. 9, the numerator is MS_B, and df_B = k - 1.
F table (each cell: upper value = critical value at alpha = .05, lower value = critical value at alpha = .01)

                        Degrees of freedom in numerator
df in denominator       1      2      3      4      5      6      7      8
 3   .05            10.13   9.55   9.28   9.12   9.01   8.94   8.88   8.84
     .01            34.12  30.82  29.46  28.71  28.24  27.91  27.67  27.49
 4   .05             7.71   6.94   6.59   6.39   6.26   6.16   6.09   6.04
     .01            21.20  18.00  16.69  15.98  15.52  15.21  14.98  14.80
 5   .05             6.61   5.79   5.41   5.19   5.05   4.95   4.88   4.82
     .01            16.26  13.27  12.06  11.39  10.97  10.67  10.45  10.27
 6   .05             5.99   5.14   4.76   4.53   4.39   4.28   4.21   4.15
     .01            13.74  10.92   9.78   9.15   8.75   8.47   8.26   8.10
 7   .05             5.59   4.74   4.35   4.12   3.97   3.87   3.79   3.73
     .01            12.25   9.55   8.45   7.85   7.46   7.19   7.00   6.84
 8   .05             5.32   4.46   4.07   3.84   3.69   3.58   3.50   3.44
     .01            11.26   8.65   7.59   7.01   6.63   6.37   6.19   6.03
 9   .05             5.12   4.26   3.86   3.63   3.48   3.37   3.29   3.23
     .01            10.56   8.02   6.99   6.42   6.06   5.80   5.62   5.47
F table, continued, for larger denominator df (upper value = .05, lower value = .01)

                        Degrees of freedom in numerator
df in denominator       1      2      3      4      5      6      7      8
 36  .05             4.41   3.26   2.86   2.63   2.48   2.36   2.28   2.21
     .01             7.39   5.25   4.38   3.89   3.58   3.35   3.18   3.04
 40  .05             4.08   3.23   2.84   2.61   2.45   2.34   2.26   2.19
     .01             7.08   4.98   4.13   3.65   3.34   3.12   2.95   2.82
 60  .05             4.00   3.15   2.76   2.52   2.37   2.25   2.17   2.10
     .01             7.08   4.98   4.13   3.65   3.34   3.12   2.95   2.82
 100 .05             3.94   3.09   2.70   2.46   2.30   2.19   2.10   2.03
     .01             6.90   4.82   3.98   3.51   3.20   2.99   2.82   2.69
 400 .05             3.86   3.02   2.62   2.39   2.23   2.12   2.03   1.96
     .01             6.70   4.66   3.83   3.36   3.06   2.85   2.69   2.55
 ∞   .05             3.84   2.99   2.60   2.37   2.21   2.09   2.01   1.94
     .01             6.64   4.60   3.78   3.32   3.02   2.80   2.64   2.51
(F table as above.) The columns, the degrees of freedom in the numerator, are related to the number of different treatment groups. They go with the mean square between groups: df_B = k - 1.
(F table as above.) The rows, the degrees of freedom in the denominator, are related to the number of subjects. They go with the mean square within groups: df_W = n - k.
(F table as above.) The critical values in the top row of each pair are for alpha = .05.
(F table as above.) The critical values in the bottom row of each pair are for bragging rights (p < .01).
(F table as above.) In an experiment with 3 treatment groups, we have 2 df between groups (k - 1). If we have 9 subjects and 3 groups, we have 6 df within groups (n - k). The critical value with 2 and 6 df is 5.14. Since F is the ratio of MS_B to MS_W, the variance estimate between groups must be at least 5.14 times as large as the variance estimate within groups. If we find an F ratio of 5.14 or larger, we reject the null hypothesis and declare that there is a treatment effect, significant at the .05 alpha level.
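The same critical values can be pulled from the F distribution directly rather than read from the printed table (a sketch, assuming SciPy):

```python
from scipy.stats import f

print(round(f.ppf(0.95, 2, 6), 2))   # 5.14  -> .05 critical value with 2 and 6 df
print(round(f.ppf(0.99, 2, 6), 2))   # 10.92 -> .01 critical value with 2 and 6 df
```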
Back to the ANOVA summary table: F = MS_B / MS_W = 27.00 / 4.67 = 5.78, with 2 df between groups and 6 df within groups. Is that significant?
(F table as above.) With 2 and 6 df, the critical values are 5.14 at alpha = .05 and 10.92 at alpha = .01. Our F of 5.78 is as large as or larger than the critical value at the .05 alpha level (5.14), so it is statistically significant. It does not equal or exceed the critical value for .01 (10.92). We report F(2,6) = 5.78, p < .05.
Critical values in the F table
The critical values in the F table depend on how good MS_B and MS_W are as estimates of sigma². The better the estimates, the closer to 1.00 the null must predict that their ratio will fall. What makes estimates better? DEGREES OF FREEDOM. Each degree of freedom corrects the sample statistic back toward its population parameter. Thus, the more degrees of freedom for MS_W and MS_B, the closer the critical value of F will be to 1.00.
(F table as above.) Notice the usual effect of higher df: the more df_B and df_W, the better our estimates of sigma², and the closer F should be to 1.00 when the null is true. H0 says that F should be about 1.00.
Remember the pattern of critical values in the F table
If the null is true, as df increase, each mean square becomes a better estimate of sigma², and the null must predict an F ratio closer and closer to 1.00. Whether an F ratio is significant depends on df_W and df_B as well as on the size of the ratio.
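You can see that pattern by asking the F distribution for the .05 critical value at increasingly large df (a sketch, assuming SciPy; the equal numerator and denominator df are just for illustration):

```python
from scipy.stats import f

for df in (5, 20, 100, 1000):
    # .05 critical value with df degrees of freedom in both numerator and denominator
    print(df, round(f.ppf(0.95, df, df), 2))
# the critical value shrinks toward 1.00 as the degrees of freedom grow
```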
The F test and the t test
Another experiment
Population: Female, moderately depressed outpatients
Number of participants (10) and groups (2)
Design: Single factor, unrelated groups
Independent variable: Dose of a new drug, Feelbetter
– Level 1: Placebo (n = 5): Participants 1.1 – 1.5
– Level 2: Moderate dose (n = 5): Participants 2.1 – 2.5
Dependent variable: HAM-D scores. Lower scores are better.
H0: Feelbetter does no more good than placebo.
H1: The average responses of the two groups will differ more than sampling fluctuation alone can explain, because Feelbetter effectively helps depression.
The results: The five participants who received the placebo scored 18, 21, 22, 30, and 24. The five participants who received Feelbetter scored 11, 15, 13, 17, and 19. Obviously, the Feelbetter group had lower HAM-D scores. But is that attributable to sampling fluctuation, or are the groups too dissimilar for that explanation to cover the data? We need to compute F, where F = MS_B / MS_W.
Here is the computation of MS_W and MS_B. Next, set up an ANOVA summary table, compute F, determine significance, and write the results properly.

Participant     1.1  1.2  1.3  1.4  1.5   2.1  2.2  2.3  2.4  2.5
Score            18   21   22   30   24    11   15   13   17   19
Group mean       23   23   23   23   23    15   15   15   15   15   (grand mean = 19)

Within-group deviation (score - group mean):
                 -5   -2   -1    7    1    -4    0   -2    2    4
Squared:         25    4    1   49    1    16    0    4    4   16   -> SS_W = 120

Between-group deviation (group mean - grand mean):
                  4    4    4    4    4    -4   -4   -4   -4   -4
Squared:         16   16   16   16   16    16   16   16   16   16   -> SS_B = 160
ANOVA summary table

Source                       SS    df   MS       F       p
Between groups (Drug dose)   160    1   160.00   10.67   ?
Within groups (Error)        120    8    15.00

Divide MS_B by MS_W to calculate F. We need to look at the F table to determine significance.
(F table as above. The relevant entry is 1 df in the numerator and 8 df in the denominator: 5.32 at alpha = .05 and 11.26 at alpha = .01.)
ANOVA summary table

Source                       SS    df   MS       F       p
Between groups (Drug dose)   160    1   160.00   10.67   .05
Within groups (Error)        120    8    15.00

F(1,8) = 10.67, p < .05.
Now let us use the same data to do a t test (t for 2, F for more). One way to do a t test is to compare estimates of sigma (rather than of sigma²). So we take the square roots of MS_B and MS_W to get s_B and s_W, find their ratio, and compare it to the t table.
t test summary table

Source                       SS    df   s       t       p
Between groups (Drug dose)   160    1   12.65   3.266   ?
Within groups (Error)        120    8    3.87

We need to look at the t table to determine significance.
t table (two-tailed critical values)

df        1       2       3       4       5       6       7       8
.05   12.706   4.303   3.182   2.776   2.571   2.447   2.365   2.306
.01   63.657   9.925   5.841   4.604   4.032   3.707   3.499   3.355

df        9      10      11      12      13      14      15      16
.05    2.262   2.228   2.201   2.179   2.160   2.145   2.131   2.120
.01    3.250   3.169   3.106   3.055   3.012   2.997   2.947   2.921

df       17      18      19      20      21      22      23      24
.05    2.110   2.101   2.093   2.086   2.080   2.074   2.069   2.064
.01    2.898   2.878   2.861   2.845   2.831   2.819   2.807   2.797

df       25      26      27      28      29      30      40      60
.05    2.060   2.056   2.052   2.048   2.045   2.042   2.021   2.000
.01    2.787   2.779   2.771   2.763   2.756   2.750   2.704   2.660

df      100     200     500    1000    2000   10000
.05    1.984   1.972   1.965   1.962   1.961   1.960
.01    2.626   2.601   2.586   2.581   2.578   2.576
Example 2: t test summary table

Source                       SS    df   s       t       p
Between groups (Drug dose)   160    1   12.65   3.266   .05
Within groups (Error)        120    8    3.87

With two groups, t is the square root of F. F and t always give you the same level of significance. t(8) = 3.266, p < .05.
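A sketch of the same comparison in code (assuming SciPy; the scores are the ones given above):

```python
from scipy.stats import ttest_ind, f_oneway

placebo    = [18, 21, 22, 30, 24]
feelbetter = [11, 15, 13, 17, 19]

t_res = ttest_ind(placebo, feelbetter)   # unrelated-groups t test (equal variances assumed)
f_res = f_oneway(placebo, feelbetter)    # single-factor ANOVA on the same two groups

print(round(t_res.statistic, 3))         # about 3.266
print(round(t_res.statistic ** 2, 2))    # about 10.67, the same as F
print(round(f_res.statistic, 2))         # about 10.67
```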
The t Test: t for 2, F for More
The t test is a special case of the F ratio. If there are only two levels (groups) of the independent variable, then t² = F; equivalently, t is the square root of F.
t table and F table when there are only two groups: df in the numerator is always 1
Relationship between the t and F tables
Because there is always 1 df between groups, the t table is organized only by degrees of freedom within groups (df_W). By the way, the values in the t table are the square roots of the values in the first column of the F table.
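That relationship between the two tables is easy to verify (a sketch, assuming SciPy):

```python
from scipy.stats import t, f

for df_w in (3, 4, 5, 8):
    t_crit = t.ppf(0.975, df_w)    # two-tailed .05 critical value from the t table
    f_crit = f.ppf(0.95, 1, df_w)  # .05 critical value from the first column of the F table
    print(df_w, round(t_crit, 3), round(t_crit ** 2, 2), round(f_crit, 2))
# t_crit squared equals f_crit: 3.182^2 = 10.13, 2.776^2 = 7.71, 2.571^2 = 6.61, 2.306^2 = 5.32
```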
(t table from Chapter 6, as above.) Let's look at these values.
t table (from Chapter 6):
df     .05      .01
 3    3.182    5.841
 4    2.776    4.604
 5    2.571    4.032

F table, first column (numerator df = 1):
df in denominator    .05      .01
 3                  10.13    34.12
 4                   7.71    21.20
 5                   6.61    16.26
Back to the basic concepts
It would be good to understand the logic behind the F test well enough that you could explain how the test is created by the logic of the situation. So we will review the concepts now.
Let’s take that one point at a time. At the beginning of an experiment: Participants are randomly selected from a population. Then they are randomly assigned to treatment groups. Thus, at the beginning of the study, each treatment group is a random (sub)sample from a specific population.
Groups start off much the same in every possible way
Since each treatment group is a random sample from the population, each group's mean and variance will be similar to those of the population. That is, each group's mean will be a best estimate of mu, the population mean. And the spread of scores around each group's mean will yield a best estimate of sigma² and sigma.
So: At the beginning of an experiment, the treatment groups differ only because of random sampling fluctuation. When there are different people in each group, the random sampling fluctuation is caused by (1) random individual differences and (2) random measurement problems.
Sampling fluctuation is the product of the inherent variability of the data. That is what is indexed by sigma², the average squared distance of scores from the population mean, mu.
To summarize: Since the group means and variances of random samples will be similar to those of the population, they will be similar to each other. This is true for any and all things you can measure. The only differences among the groups at the beginning of the study, on any and all measures, will be the (mostly minor) differences associated with random sampling fluctuation, caused by the fact that there are different people in each group and that there are always random measurement problems (ID + MP).
The ultimate question
If we then treat the groups differently, will the treatments make the groups more different from each other at the end of the experiment than if only sampling fluctuation created their differences?
In the simplest experiments (Ch. 9)
In the simplest experiments, the groups are exposed to treatments that vary on a single dimension. The dimension on which the treatments of the groups vary is called the independent variable. We call the specific ways the groups are treated the "levels of the independent variable."
The independent variable
An independent variable can be any preplanned difference in the way groups are treated. Which kind of difference you choose relates to the experimental hypothesis, H1. For example, if you think you have a new medication for bipolar disorder, you would compare the effect of various doses of the new drug to placebo in a random sample of bipolar patients. Thus, the groups would differ in terms of the dose of drug. Proper experimental design would ensure that the difference in dose received is the only way the groups are systematically treated differently from each other.
Why is it called the "independent variable"?
Remember, we call the different treatments the "levels" of the independent variable. Who gets which level is random. It is determined solely by the group to which a participant is randomly assigned. So, any difference in the way a person is treated during the experiment is unrelated to, or "independent of," the infinite number of pre-existing differences that precluded causal statements in correlational research.
The dependent variable
Relevant responses (called dependent variables) are then measured to see whether the independent variable caused differences among the treatment conditions beyond those expected given ordinary sampling fluctuation. That is, we want to see whether responses are related to (dependent on) the different levels of the independent variable to which the treatment groups were exposed.
Differences after the experiment among group means on the dependent variable may well be simple sampling fluctuation!
The groups will always differ somewhat from each other on anything you measure due to sampling fluctuation. With 3 groups, one will score highest, one lowest, and one in the middle just by chance. With four groups, one will score highest, one lowest, with two in the middle, one higher than the other. Etc. So the simple fact that the groups differ somewhat is not enough to determine that the independent variable, the different ways the groups were treated, caused the differences. We have to determine whether the groups are more different than they should be if only sampling fluctuation is at work.
H0 & H1: If one is wrong, the other must be right.
Either the independent variable would cause differences in responses (the dependent variable) in the population as a whole or it would not.
H0: The different conditions embodied by the independent variable would have NO EFFECT if administered to the whole population.
H1: The different conditions embodied by the independent variable would produce different responses if administered to the whole population.
H0 & H1 AGAIN: If one is wrong, the other must be right.
Either the independent variable would cause differences in responses (the dependent variable) in the population as a whole or it would not.
H0: The different conditions embodied by the levels of the independent variable would have the same effects as each other on the whole population. Any differences among the means of the treatment groups at the end of the experiment are purely the result of random sampling fluctuation.
H1: The different conditions embodied by the levels of the independent variable would produce different responses if administered to the whole population.
Generalizing from samples to the population they represent
The population can be expected to respond to the different levels of the IV similarly to the samples
Remember, random samples are representative of the population from which they are drawn. If the different levels of the independent variable cause the groups to differ (more than they would from simple sampling fluctuation), the same thing should be true for the rest of the population.
For example: Say a new psychotherapy causes a random sample of anxious patients to become less anxious in comparison to treatment groups given more conventional approaches or pill placebo. Then, we would expect all anxious patients to respond better to the new treatment than to the ones to which it was compared.
However!
As in the case of correlation, we don't want to toss out treatments that we know work JUST because the new treatment happens to do better in an experiment. We would want to be sure that any differences among the treatment groups are not just a chance finding based on random sampling fluctuation.
The Null Hypothesis
The null hypothesis (H0) states that the only reason the treatment group means are different is sampling fluctuation. It says that the independent variable causes no systematic differences among the groups.
A corollary: Try the experiment again and a different group will score highest, another lowest. If that is so, you should not generalize from which group in your study scored highest or lowest to the population from which the samples were drawn. Your best prediction remains that everyone will score at the mean on the dependent variable, and that treatment condition will not predict response.
THIS QUESTION IS IMPORTANT!
If the null hypothesis (H0) is true, what is the relationship between the way the treatment groups respond to different levels of the IV and the way the whole population can be expected to respond to different levels of the independent variable?
THIS QUESTION IS IMPORTANT!
IF H0 IS TRUE, the way the treatment groups respond to the independent variable has nothing to do with the way the population can be expected to respond.
Notice that people often respond to random sampling fluctuation as if something were causing a difference. People take all kinds of food supplements because they believe the supplements (e.g., Echinacea) will make colds go away more quickly. If you tried it and it worked, wouldn't you tell your friends? Wouldn't you try it again with your next cold? Having recovered quickly after taking something provides the evidence. After all, it's what happened to you!
But did the food supplement really make a difference? When carefully tested, NO food supplement has proved to shorten colds. The mistake lies in taking random variation in the duration of a cold as evidence that the Echinacea (or whatever) did something beneficial. That's OK if it is just your pocketbook that is affected. But what if you were an FDA scientist? Wouldn't people expect better evidence of efficacy before they gave the food supplement company an enormous amount of their money?
Another example of what happens if no one tests the null
Drugs have to be tested by the FDA for safety and efficacy. They must be shown to be safe and better than placebo. Food supplements don't. NEITHER DOES SURGERY. New operations require only that a qualified surgeon and a hospital board think a new procedure is PLAUSIBLE. So, the surgeon must only persuade a hospital board that an operation should be beneficial for it to be employed.
Angioplasty as a placebo!
Heart attacks occur when the blood vessels that feed the heart with oxygen and fuel are blocked by cholesterol-based plaque that forms on the artery walls. As people age, the coronary arteries are often narrowed by the plaque on the artery walls. So surgeons open up the coronary arteries with tiny balloons and keep them open with a stent, a little mesh screen. This year, over 900,000 Americans will have stents implanted in a procedure known as angioplasty.
You must ask, again and again, what is the evidence?
Implanting stents causes about 4% of people to have minor heart attacks. That's over 35,000 heart-injuring operations this year. No one ever proved that implanting stents lowered the risk of heart attacks. It just made sense.
You must ask, again and again, what is the evidence? Did they test the null hypothesis?
The plausible-sounding idea behind the procedure is almost certainly not true. The places where the plaque breaks off and really blocks blood vessels are the areas that are still soft, not yet firmly calcified. In calcified areas, the plaque just sits there, narrowing the vessels but doing little harm. The stents are placed in the calcified areas. That solves nothing. By the way, drastically lowering blood cholesterol with drugs called "statins" seemingly works a lot better.
Comparisons with proper control groups are the answer
But, as you will learn when you study experimental design, when real-world problems are studied, suitable control groups are not the easiest thing in the world to obtain, for a variety of both scientific and political reasons.
We call rejecting a true null hypothesis a “Type 1 Error.” The first rule in science is “Do not increase error.” Scientists don’t like to say something will make a difference when it isn’t true. So, before we toss away proven treatments, approve a new treatment, or say that something will cause illness or health, we want to be fairly sure that we are not just responding to sampling fluctuation.
The scientist's answer: test the null hypothesis
So, as we did with correlation and regression, we assume that everything is equal, that all treatments have the same effect, unless we can prove otherwise. The null hypothesis says that the treatments do not systematically differ; one is as good as another. As usual, we test the null hypothesis by asking it to make a prediction and then establishing a range of results for the test statistic consistent with that prediction. As usual, that range is a 95% CI for the test statistic.