Published by Morgan Bradley. Modified over 9 years ago.
1
Experimental Design STAT E-150 Statistical Methods
2
The design of an experiment typically takes place before the data is collected. It is the design of the experiment that determines the model.
Some basic vocabulary:
The individuals used in the experiment are called experimental units. When the units are human beings, they are called subjects.
A specific experimental condition applied to the units is called a treatment.
The explanatory variables in the experiment are often called factors. A specific value of a factor is called a level of the factor.
A response variable is a measure of the outcome of the experiment.
3
The design of an experiment first describes the response variable(s), the factor(s), and the layout of the treatments.
Here is an example: What are the effects of repeated exposure to an advertising message? In an experiment designed to investigate this question, undergraduate students viewed a 40-minute television program that included ads for a digital camera. Some students saw a 30-second commercial; others saw a 90-second version. The same commercial was shown either one, three, or five times during the program. After viewing, all of the students answered questions about their recall of the ad, their attitude toward the digital camera, and their intention to purchase it.
4
Who were the subjects? The subjects were undergraduate students.
What are the factors of the experiment? The factors are the length of the commercial and the number of repetitions.
What are the levels of each factor? The length of the commercial has two levels: 30 seconds and 90 seconds. The number of repetitions has three levels: 1, 3, and 5 repetitions.
What is (are) the response variable(s)? The response variables are the recall of the ad, the attitude toward the camera, and the intention to purchase the camera.
12
Here is a diagram of this experimental design, with the six treatments numbered 1 through 6:
                     Factor B: Number of Repetitions
Factor A: Length     1 time     3 times     5 times
30 sec.              1          2           3
90 sec.              4          5           6
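The layout in the diagram can be generated by crossing the levels of the two factors. The following is a minimal sketch, assuming the treatments are numbered row by row as in the diagram; the variable names are illustrative only.

```python
from itertools import product

# Factor levels taken from the advertising example.
lengths = ["30 sec.", "90 sec."]   # Factor A: length of the commercial
repetitions = [1, 3, 5]            # Factor B: number of repetitions

# Each treatment is one combination of a level of A with a level of B.
# Numbering row by row is an assumption that matches the diagram.
treatments = {
    number: combo
    for number, combo in enumerate(product(lengths, repetitions), start=1)
}

for number, (length, reps) in treatments.items():
    print(f"Treatment {number}: {length} commercial shown {reps} time(s)")
```

With two levels of one factor and three of the other, the crossing gives 2 x 3 = 6 treatments.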
13
The second step of the design is to define how experimental units will be assigned to treatments. Comparison of the effects of treatments can only be valid if the treatments are applied to similar groups of experimental units. This requires at least two groups; one is often a control group.
14
The most important principle of statistical design is randomization, which involves using chance to select the sample from the population and to assign subjects to treatments. This assignment, then, does not depend on any characteristic of the subjects and does not rely on the judgment of the experimenter in any way. This protects against bias that favors certain outcomes, allows us to draw cause-and-effect conclusions, and provides a justification for using a probability model.
15
Does talking on a hands-free cell phone distract drivers? Undergraduate students "drove" in a driving simulator equipped with a hands-free cell phone. How quickly does the student respond when the car ahead brakes suddenly? Twenty students (the control group) simply drove. Another twenty students (the experimental group) talked on the phone while driving.
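Random assignment for a two-group experiment like this one can be sketched in a few lines. The subject labels 1 to 40 and the seed are hypothetical; the slides do not say how the students were identified or how the randomization was carried out.

```python
import random

def assign_groups(subjects, group_size, seed=None):
    """Randomly split subjects into a control group and an experimental group."""
    rng = random.Random(seed)      # the seed is only for reproducibility
    shuffled = list(subjects)
    rng.shuffle(shuffled)          # chance alone decides the order
    control = shuffled[:group_size]
    experimental = shuffled[group_size:]
    return control, experimental

# Hypothetical labels for the 40 students in the driving study.
students = list(range(1, 41))
control, experimental = assign_groups(students, group_size=20, seed=150)

print("Control (drive only):", sorted(control))
print("Experimental (drive and talk):", sorted(experimental))
```

Because the split depends only on the shuffle, it cannot depend on any characteristic of the subjects or on the experimenter's judgment.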
16
Randomization produces groups of subjects that we expect to be similar in all respects before the treatment is applied. Here is a diagram of the design of this experiment:
[Diagram: design of the cell phone experiment]
17
Comparative design (comparing the effects of several treatments) helps ensure that influences other than the treatment (in this case, the cell phone) operate equally on all groups. Any differences, then, must be due either to the treatment or to chance (i.e., the random assignment). Recall that a small p-value is an indication that it is unlikely that the results we see are due to chance alone.
18
Randomization F-Test (One-Way ANOVA)
1. Observed F: Compute the value of F as usual (F_Obs).
2. Randomization distribution:
   a. Rerandomize: create a random reordering of the data.
   b. Compute the value of F for the rerandomized data (F_Rand).
   c. Repeat this many times and record the values of F_Rand.
3. Find the proportion of values of F_Rand that are greater than or equal to F_Obs. This proportion is an estimate of the p-value.
Both the F-test and the Randomization F-test use a quantitative response variable and one categorical predictor, and both test whether the observed differences in group means are too large to be due to chance alone.
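The three steps above can be sketched as follows. This is an illustration under stated assumptions, not the course's own implementation: the three groups of data are hypothetical, and the number of rerandomizations (5000) is an arbitrary choice.

```python
import random

def f_statistic(groups):
    """One-way ANOVA F statistic for a list of groups (lists of numbers)."""
    all_values = [x for g in groups for x in g]
    n_total, k = len(all_values), len(groups)
    grand_mean = sum(all_values) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

def randomization_f_test(groups, n_rerandomizations=5000, seed=None):
    """Return (F_Obs, estimated p-value) following steps 1-3."""
    rng = random.Random(seed)
    f_obs = f_statistic(groups)                 # step 1: observed F
    pooled = [x for g in groups for x in g]
    sizes = [len(g) for g in groups]
    count = 0
    for _ in range(n_rerandomizations):         # step 2c: repeat many times
        rng.shuffle(pooled)                     # step 2a: random reordering
        start, shuffled_groups = 0, []
        for size in sizes:                      # keep the original group sizes
            shuffled_groups.append(pooled[start:start + size])
            start += size
        if f_statistic(shuffled_groups) >= f_obs:   # step 2b, then step 3
            count += 1
    return f_obs, count / n_rerandomizations

# Hypothetical responses for three treatment groups.
groups = [[12, 15, 11, 14, 13], [16, 18, 17, 15, 19], [22, 20, 23, 21, 24]]
f_obs, p_value = randomization_f_test(groups, seed=1)
print(f"F_Obs = {f_obs:.3f}, estimated p-value = {p_value:.4f}")
```

The estimated p-value changes slightly from run to run unless a seed is fixed, since it is based on a random sample of reorderings.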
19
Repeated Measures Analysis of Variance
In this analysis, the groups are not independent. Instead, we have one sample of subjects and take multiple, or repeated, measurements on each subject.
20
Here's an example: Consider this data for four subjects over three treatments, where the response variable is the number of trials to success on some task:
               Treatment
Subject     1      2      3      Mean
1           2      4      7       4.33
2          10     12     13      11.67
3          22     29     30      27.00
4          30     31     34      31.67
Mean       16     19     21      18.67
If you look at the treatment means, you see a very slight difference. But look also at the subject means: It is apparent that Subject 1 learns quickly under all conditions, and that Subjects 3 and 4 learn very slowly. These differences among the subjects are responsible for the differences within treatments, but they really have nothing to do with the treatment effect.
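To see how much of the variation in this table comes from the subjects, the total sum of squares can be split into treatment, subject, and error parts. The sketch below uses the twelve values from the table; the comparison with an independent-groups F ratio is added here only for illustration and is not part of the slides.

```python
# Rows are subjects, columns are treatments 1-3 (values from the table).
data = [
    [2, 4, 7],
    [10, 12, 13],
    [22, 29, 30],
    [30, 31, 34],
]
n, k = len(data), len(data[0])          # 4 subjects, 3 treatments

grand_mean = sum(sum(row) for row in data) / (n * k)
treatment_means = [sum(row[j] for row in data) / n for j in range(k)]
subject_means = [sum(row) / k for row in data]

ss_total = sum((x - grand_mean) ** 2 for row in data for x in row)
ss_treatment = n * sum((m - grand_mean) ** 2 for m in treatment_means)
ss_subjects = k * sum((m - grand_mean) ** 2 for m in subject_means)
ss_error = ss_total - ss_treatment - ss_subjects

# Repeated measures F: subject-to-subject variation is removed from the error.
f_repeated = (ss_treatment / (k - 1)) / (ss_error / ((n - 1) * (k - 1)))

# For comparison only: F if the columns were wrongly treated as independent groups.
ss_within = ss_total - ss_treatment
f_independent = (ss_treatment / (k - 1)) / (ss_within / (n * k - k))

print(f"SS subjects = {ss_subjects:.2f}, SS error = {ss_error:.2f}")
print(f"F (repeated measures) = {f_repeated:.2f}")
print(f"F (independent groups) = {f_independent:.2f}")
```

Almost all of the variation within treatments is subject-to-subject variation, so removing it leaves a much smaller error term against which to judge the treatment effect.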
21
A repeated measure is a multivariate response in which the same variable is measured on each subject several times, possibly under different conditions. The hypotheses are the same as they are for a one-way ANOVA. The analysis is also the same: first a test for the equality of the means, and then appropriate tests to investigate any differences that are found.
22
The assumptions are:
1. Independent observations within each treatment
2. Normal populations within each treatment
3. Equal population variances within each treatment
4. Sphericity
Note that there is an additional assumption of sphericity, which assumes that the differences between every pair of treatment levels have equal variances. Mauchly's test for sphericity is commonly used to test this assumption.
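What sphericity refers to can be seen by computing the variance of the differences for each pair of treatment levels. The sketch below reuses the four-subject table from the earlier example purely as an illustration; with so few subjects it says nothing reliable about the population, and it is not a substitute for Mauchly's test.

```python
from itertools import combinations
from statistics import variance

# Rows are subjects, columns are treatments 1-3 (four-subject example).
data = [
    [2, 4, 7],
    [10, 12, 13],
    [22, 29, 30],
    [30, 31, 34],
]
k = len(data[0])

# Sample variance of the within-subject differences for each pair of treatments.
diff_variances = {}
for a, b in combinations(range(k), 2):
    diffs = [row[a] - row[b] for row in data]
    diff_variances[(a + 1, b + 1)] = variance(diffs)

for (a, b), v in diff_variances.items():
    print(f"Treatment {a} - Treatment {b}: variance of differences = {v:.2f}")
```

Sphericity is an assumption about the population versions of these three variances being equal.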
23
Example 1: A study was conducted to examine whether the anxiety a person experiences affects performance on a learning task. Subjects with varying levels of anxiety performed a learning task across a series of four trials and the number of errors made was recorded. Does the number of errors made by subjects change significantly across the four learning trials?
H0: μ_trial1 = μ_trial2 = μ_trial3 = μ_trial4
Ha: the means are not all equal
24
We will use statistical tests to assess whether the assumptions are met. First, we can use the Kolmogorov-Smirnov results in the Tests of Normality table to assess normality as we did earlier; since all p-values are greater than .05, we do not reject the null hypothesis of normality.
Tests of Normality
            Kolmogorov-Smirnov(a)           Shapiro-Wilk
            Statistic   df    Sig.          Statistic   df    Sig.
Trial 1     .238        12    .060          .890        12    .117
Trial 2     .169        12    .200*         .940        12    .495
Trial 3     .182        12    .200*         .947        12    .595
Trial 4     .201        12    .193          .885        12    .100
a. Lilliefors Significance Correction
*. This is a lower bound of the true significance.
25
We will use Mauchly's test to test the hypothesis that the variances of the differences between conditions are equal. Since p = .112, we cannot reject the null hypothesis: there is no significant evidence that the variances of the differences are unequal, so we treat the condition of sphericity as met. Since the conditions for the test have been met, we can continue.
Mauchly's Test of Sphericity(b)
Measure: MEASURE_1
Within Subjects Effect    Mauchly's W    Approx. Chi-Square    df    Sig.
factor1                   .398           8.957                 5     .112
Epsilon(a)                Greenhouse-Geisser    Huynh-Feldt    Lower-bound
factor1                   .622                  .744           .333
Tests the null hypothesis that the error covariance matrix of the orthonormalized transformed dependent variables is proportional to an identity matrix.
a. May be used to adjust the degrees of freedom for the averaged tests of significance. Corrected tests are displayed in the Tests of Within-Subjects Effects table.
b. Design: Intercept. Within Subjects Design: factor1
26
We will consider the results in the first row of this table, where sphericity is assumed. The results (F = 127.561, p < .001) indicate that we can reject the null hypothesis and conclude that the number of errors made by subjects did change significantly across the four learning trials.
Tests of Within-Subjects Effects
Measure: MEASURE_1
Source                                 Type III Sum of Squares    df        Mean Square    F          Sig.
factor1          Sphericity Assumed    991.500                    3         330.500        127.561    .000
                 Greenhouse-Geisser    991.500                    1.865     531.709        127.561    .000
                 Huynh-Feldt           991.500                    2.231     444.504        127.561    .000
                 Lower-bound           991.500                    1.000     991.500        127.561    .000
Error(factor1)   Sphericity Assumed    85.500                     33        2.591
                 Greenhouse-Geisser    85.500                     20.512    4.168
                 Huynh-Feldt           85.500                     24.536    3.485
                 Lower-bound           85.500                     11.000    7.773
27
Example 2: This data was collected in a study investigating factors associated with the risk of developing high blood pressure, or hypertension. The subjects were all college students. We would like to see if there is evidence that diastolic blood pressure changes significantly during three different conditions: resting, doing mental arithmetic, and immersing a hand in cold water. The null hypothesis is that blood pressure does not change under these stressors:
H0: μ_rest = μ_arith = μ_cold
Ha: the means are not all equal
28
First we can see what the descriptive statistics for this data suggest. It appears that the means are different for the three stressors, but we will need to analyze the data to see if the difference is significant.
Descriptive Statistics
                                   Mean       Std. Deviation    N
diastolic bp rest                  67.6457    7.45186           175
diastolic bp mental arithmetic     73.7771    9.42046           175
diastolic bp cold pressor          77.9886    14.63816          175
29
First we will check the conditions for this test. We can use the Kolmogorov-Smirnov results in the Tests of Normality table to assess normality as we did earlier; since p = .200 for each condition, we do not reject the null hypothesis of normality. (The Shapiro-Wilk result for the cold pressor condition, p = .017, is less favorable.)
Tests of Normality
                                   Kolmogorov-Smirnov(a)           Shapiro-Wilk
                                   Statistic   df     Sig.         Statistic   df     Sig.
diastolic bp rest                  .046        175    .200*        .992        175    .460
diastolic bp mental arithmetic     .036        175    .200*        .997        175    .985
diastolic bp cold pressor          .056        175    .200*        .981        175    .017
a. Lilliefors Significance Correction
*. This is a lower bound of the true significance.
30
However, the results of Mauchly's test are significant (p < .001), and we conclude that the condition of sphericity is not met. We can still continue with this analysis when the assumption of sphericity is violated, provided we use a test that adjusts the degrees of freedom.
Mauchly's Test of Sphericity(b)
Measure: MEASURE_1
Within Subjects Effect    Mauchly's W    Approx. Chi-Square    df    Sig.
stressor                  .654           73.415                2     .000
Epsilon(a)                Greenhouse-Geisser    Huynh-Feldt    Lower-bound
stressor                  .743                  .748           .500
Tests the null hypothesis that the error covariance matrix of the orthonormalized transformed dependent variables is proportional to an identity matrix.
a. May be used to adjust the degrees of freedom for the averaged tests of significance. Corrected tests are displayed in the Tests of Within-Subjects Effects table.
b. Design: Intercept. Within Subjects Design: stressor
31
The table below includes additional information that can be used in this situation. You can use the results in the first row (Sphericity Assumed) if the sphericity assumption is met, as we did in the first example; however, this is not the case in this analysis. The next rows show results for other tests where adjustments to the degrees of freedom have been made because the sphericity assumption was violated.
Tests of Within-Subjects Effects
Measure: MEASURE_1
Source                                  Type III Sum of Squares    df         Mean Square    F         Sig.
stressor          Sphericity Assumed    9467.806                   2          4733.903       73.754    .000
                  Greenhouse-Geisser    9467.806                   1.486      6370.949       73.754    .000
                  Huynh-Feldt           9467.806                   1.496      6329.115       73.754    .000
                  Lower-bound           9467.806                   1.000      9467.806       73.754    .000
Error(stressor)   Sphericity Assumed    22336.396                  348        64.185
                  Greenhouse-Geisser    22336.396                  258.580    86.381
                  Huynh-Feldt           22336.396                  260.289    85.814
                  Lower-bound           22336.396                  174.000    128.370
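The adjusted degrees of freedom in the table are the unadjusted ones multiplied by the epsilon value reported with Mauchly's test. The sketch below reproduces them from the sums of squares and the epsilons in the SPSS output for this example; the function name is illustrative, and because the epsilons are rounded to three decimals the error degrees of freedom differ slightly from the printed ones.

```python
def corrected_df(epsilon, n_subjects, n_levels):
    """Degrees of freedom for the within-subjects F test, scaled by epsilon."""
    df_effect = epsilon * (n_levels - 1)
    df_error = epsilon * (n_levels - 1) * (n_subjects - 1)
    return df_effect, df_error

# Values read from the SPSS output for Example 2.
ss_effect, ss_error = 9467.806, 22336.396
n_subjects, n_levels = 175, 3
epsilons = {
    "Sphericity Assumed": 1.0,
    "Greenhouse-Geisser": 0.743,   # rounded value from the output
    "Huynh-Feldt": 0.748,          # rounded value from the output
    "Lower-bound": 0.500,
}

results = {}
for name, eps in epsilons.items():
    df_effect, df_error = corrected_df(eps, n_subjects, n_levels)
    f_value = (ss_effect / df_effect) / (ss_error / df_error)
    results[name] = (df_effect, df_error, f_value)
    print(f"{name:20s} df = ({df_effect:.3f}, {df_error:.3f})  F = {f_value:.3f}")
```

The F ratio is the same in every row because epsilon scales both degrees of freedom by the same factor; what changes is the F distribution used to obtain the p-value.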
32
The Greenhouse-Geisser test is commonly used in this situation. The results in the Greenhouse-Geisser row of the Tests of Within-Subjects Effects table are significant (F = 73.754, p < .001), and so we reject the null hypothesis. We conclude that diastolic blood pressure changes significantly during the various mental and physical stressors investigated in this study.
33
But where are these differences? We can compare the means using the Bonferroni method. These results indicate that there are significant differences between every pair of stressors: all p-values are reported as .000, and none of the confidence intervals contains 0. (Comparing the mean differences with the descriptive statistics shows that stressor 1 is rest, 2 is mental arithmetic, and 3 is cold pressor.)
Pairwise Comparisons
Measure: MEASURE_1
                                                                                 95% Confidence Interval for Difference(a)
(I) stressor    (J) stressor    Mean Difference (I-J)    Std. Error    Sig.(a)    Lower Bound    Upper Bound
1               2               -6.131*                  .551          .000       -7.463         -4.800
1               3               -10.343*                 .960          .000       -12.664        -8.022
2               1               6.131*                   .551          .000       4.800          7.463
2               3               -4.211*                  .988          .000       -6.599         -1.824
3               1               10.343*                  .960          .000       8.022          12.664
3               2               4.211*                   .988          .000       1.824          6.599
Based on estimated marginal means
*. The mean difference is significant at the .05 level.
a. Adjustment for multiple comparisons: Bonferroni.
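The Bonferroni adjustment itself is simple: each unadjusted p-value is multiplied by the number of comparisons and capped at 1. A minimal sketch follows, using hypothetical unadjusted p-values, since the SPSS output reports only the adjusted values (all rounded to .000).

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Hypothetical unadjusted p-values for three pairwise comparisons.
raw_p = [0.004, 0.020, 0.300]
adjusted_p = bonferroni(raw_p)

alpha = 0.05
significant = [p < alpha for p in adjusted_p]
for raw, adj, sig in zip(raw_p, adjusted_p, significant):
    print(f"raw p = {raw:.3f}  adjusted p = {adj:.3f}  significant: {sig}")
```

In this hypothetical set, the second comparison would be significant without the adjustment but is not after it, which is the protection against multiple comparisons that the method provides.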
34
You may also choose to create a graph that illustrates and supports the analysis. This graph supports the conclusion that the means of the three conditions are different:
35
Another way to investigate differences between pairs of groups is by performing appropriate paired-sample t-tests. For example, you can compare the subjects' diastolic blood pressure in two of the conditions: resting, and after immersing a hand in cold water. Note that in this test, we are testing the mean difference between the groups, not the difference of the means.
Paired sample t-tests are used to compare two means when the samples are not independent, so that a value in one sample can be paired with a corresponding value in the second sample. For example, a subject's resting diastolic blood pressure can be paired with the same subject's diastolic blood pressure after submerging a hand in cold water.
36
Assumptions and Conditions
Paired data assumption: The data must be paired.
Independence assumption: The groups are not independent, but the differences are independent.
Randomization condition: Pairs are randomly chosen, or treatments are assigned randomly to the members of each pair.
Normal population assumption: The population of differences is nearly normal. (Check with a histogram and/or a Normal probability plot of the differences.)
37
When the conditions are met and the null hypothesis is true, we can model the sampling distribution of the statistic t = (d̄ - Δ0) / (s_d / √n) with a Student's t-model with n - 1 degrees of freedom, where d̄ and s_d are the mean and standard deviation of the pairwise differences and n is the number of pairs. We test the hypothesis H0: μ_d = Δ0, where the d's are pairwise differences and Δ0 is almost always 0.
38
H0: μ_d = 0
Ha: μ_d ≠ 0
Here are the results: We will reject the null hypothesis (t = -10.590, p < .001) and conclude that there is a difference between resting diastolic blood pressure and diastolic blood pressure after immersing a hand in cold water.
Paired Samples Test
Pair 1: diastolic bp rest - diastolic bp cold pressor
Paired Differences
   Mean                                                 -10.20251
   Std. Deviation                                       12.88920
   Std. Error Mean                                      .96338
   95% Confidence Interval of the Difference, Lower     -12.10364
   95% Confidence Interval of the Difference, Upper     -8.30139
t                                                       -10.590
df                                                      178
Sig. (2-tailed)                                         .000
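The t statistic in the output can be reproduced from the summary values. The sketch below assumes n = 179 pairs, inferred from the reported 178 degrees of freedom; the function names are illustrative.

```python
import math
import statistics

def paired_t(first, second):
    """Paired t statistic from two lists of measurements on the same subjects."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)   # standard error of the mean difference
    return mean_d / se, n - 1                      # t statistic and degrees of freedom

def paired_t_from_summary(mean_d, sd_d, n):
    """Same statistic, computed from the mean and SD of the differences."""
    se = sd_d / math.sqrt(n)
    return mean_d / se, se

# Summary values from the Paired Samples Test table; n = df + 1 = 179 (assumed).
t_value, se = paired_t_from_summary(mean_d=-10.20251, sd_d=12.88920, n=179)
print(f"standard error = {se:.5f}, t = {t_value:.3f}")

# Tiny hypothetical data set, only to show the raw-data version.
rest = [66, 70, 72, 65, 68]
cold = [75, 78, 74, 80, 79]
t_small, df_small = paired_t(rest, cold)
print(f"hypothetical data: t = {t_small:.3f} with {df_small} df")
```

The test works on the single column of differences, which is why the degrees of freedom are the number of pairs minus one rather than the total number of measurements minus two.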
39
SPSS Instructions for Repeated Measures ANOVA
1. Choose > Analyze > Descriptive Statistics > Explore
In the Explore dialog box, choose the columns you wish to include in your analysis, and choose Plots as the display. Then click on Plots.
40
In the Explore Plots dialog box, select Normality plots with tests, deselect Stem-and-leaf, and select None under Boxplots. Click on Continue and then click on OK in the main dialog box.
41
2. Choose > Analyze > General Linear Model > Repeated Measures
Enter an appropriate Within-Subject Factor name; in this case it is stressor. Type in the number of levels, then click Add. You will see the results in the next box.
42
Now click on Define. You will see the Repeated Measures dialog box. Select the columns containing data for the factor levels in order by highlighting the column name and clicking on the arrow that will move the name to the list of variables. Then click on Plots.
43
In the Repeated Measures: Profile Plots dialog box, click on the factor and move it to the Horizontal Axis. Then click on Add to add this selection to the Plots: box. Click on Continue.
45
SPSS Instructions for Paired Sample t-tests
Choose > Analyze > Compare Means > Paired-Samples t-test
Indicate the columns you wish to compare, and then click on OK.