January 7, 2009 - morning session 1 Statistics Micro Mini Multi-factor ANOVA January 5-9, 2008 Beth Ayers.


1 January 7, 2009 - morning session 1 Statistics Micro Mini Multi-factor ANOVA January 5-9, 2008 Beth Ayers

2 Thursday Sessions
ANOVA
‒ One-way ANOVA
‒ Two-way ANOVA
‒ ANCOVA
‒ Within-subject
‒ Between-subject
‒ Repeated measures
‒ MANOVA
‒ etc.

3 What is ANOVA?
ANalysis Of VAriance
‒ Partitions the observed variance based on explanatory variables
‒ Compares the partitions to test the significance of the explanatory variables

4 Some Terminology
Between-subjects design – each subject participates in one and only one group
Within-subjects design – the same group of subjects serves in more than one treatment
‒ Subject is now a factor
Mixed design – a study which has both between- and within-subject factors
Repeated measures – general term for any study in which multiple measurements are taken on the same subject
‒ Can be either multiple treatments or several measurements over time

5 ANOVA
Uses variances and variance-like quantities to study the equality or non-equality of population means
So, although it is called analysis of variance, we are actually analyzing means, not variances
There are other methods which analyze the variances between groups

6 ANOVA
Typical exploratory analysis includes
‒ Tabulation of the number of subjects in each experimental group
‒ Side-by-side box plots
‒ Statistics about each group
‒ At least the mean and standard deviation; can include the 5-number summary and information on skewness
‒ Table of means for each experimental group

7 Notation
If we have k groups, denote the means of the groups as:
‒ μ_1, μ_2, ..., μ_k
Student i in group j has observation
‒ y_ij = μ_j + ε_ij
‒ Where the ε_ij are independent and distributed N(0, σ²)
‒ Can combine this and say subjects from group j have distribution N(μ_j, σ²)
With random assignment, the sample mean for any treatment group is representative of the population mean for that group

8 Assumptions
1. The errors ε_ij are normally distributed
2. Across the conditions, the errors have equal spread. Often referred to as equal variances.
‒ Rule of thumb: the assumption is met if the largest variance is less than twice the smallest variance
‒ If the variances are unequal, a correction is needed! This is usually α/2.
3. The errors are independent of each other
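The rule of thumb for equal variances is easy to check mechanically. The sketch below is only an illustration: the function name is made up for this example, and the four variances are the group variances reported later in the typing-speed example.

```python
def variances_roughly_equal(variances):
    """Rule of thumb: the largest variance is less than twice the smallest."""
    return max(variances) < 2 * min(variances)


# Group variances from the typing-speed example (Biology, Business, English, Mathematics).
group_variances = [24.7, 25.4, 38.8, 20.1]
print(variances_roughly_equal(group_variances))  # 38.8 < 2 * 20.1, so True
```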

9 Checking the assumptions
Use the residuals, which are the estimates of the ε_ij
1. Normality: look at a normal probability plot
2. Equal variance: look at a residual versus fitted plot
3. Independence: hard to check, often assumed from the study design
For mild violations of the assumptions, there are options for correction
When the assumptions are not met – the p-value is simply wrong!!

10 One-way ANOVA
One-way ANOVA is used when
‒ Only testing the effect of one explanatory variable
‒ Each subject has only one treatment or condition
‒ Thus a between-subjects design
Used to test for differences among two or more independent groups
Gives the same results as the two-sample t-test if the explanatory variable has 2 levels

11 Hypothesis Testing
H_0: μ_1 = μ_2 = ... = μ_k
H_1: the μ's are not all equal
The alternative hypothesis H_1: μ_1 ≠ … ≠ μ_k is wrong!
The null hypothesis is called the overall null and is the hypothesis tested by ANOVA
If the overall null is rejected, we must do more specific hypothesis testing to determine which means are different, often referred to as contrasts

12 Terminology
The sample variance is the sum of the squared deviations from the mean (SS) divided by the degrees of freedom (df)
A mean square (MS) is a variance-like quantity calculated as SS/df
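As a small illustration of "MS = SS/df", the following sketch computes a sum of squares and its mean square for a made-up sample and compares it with the sample variance from Python's standard library. The data and the helper name are hypothetical.

```python
import statistics


def sum_of_squares(values):
    """Sum of squared deviations from the sample mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


sample = [2.0, 4.0, 4.0, 6.0]   # hypothetical data
ss = sum_of_squares(sample)     # squared deviations: 4 + 0 + 0 + 4
df = len(sample) - 1            # degrees of freedom for a single sample
mean_square = ss / df

# The mean square of a single sample is exactly its sample variance.
print(ss, df, mean_square, statistics.variance(sample))
```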

13 One-way ANOVA
In one-way ANOVA we work with two mean square quantities
‒ MS_within – the mean square within-groups
‒ MS_between – the mean square between-groups

14 Within vs. Between

15 One-way ANOVA
For each individual group we have [equation not preserved in the transcript]
So the estimate of MS_within is [equation not preserved in the transcript]
And the estimate of MS_between is [equation not preserved in the transcript]
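The formulas on this slide did not survive in the transcript. For reference, the standard textbook definitions are given below; they may differ in notation from what the slide showed. The symbols n_j (size of group j), \bar{y}_j (mean of group j), \bar{y} (grand mean) and N (total sample size) are assumptions introduced here, chosen to agree with the degrees of freedom k - 1 and N - k used elsewhere in the slides.

```latex
% Sample variance within group j
s_j^2 = \frac{\sum_{i=1}^{n_j} \left( y_{ij} - \bar{y}_j \right)^2}{n_j - 1}

% Pooled within-group mean square, df_within = N - k
MS_{within} = \frac{\sum_{j=1}^{k} (n_j - 1)\, s_j^2}{N - k}

% Between-group mean square, df_between = k - 1
MS_{between} = \frac{\sum_{j=1}^{k} n_j \left( \bar{y}_j - \bar{y} \right)^2}{k - 1}
```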

16 Mean Squares
What do these values mean?
MS_within is considered a true estimate of σ² that is unaffected by whether the null or alternative hypothesis is true
MS_between is considered a good estimate of σ² only when the null hypothesis is true
‒ If the alternative is true, values of MS_between tend to be inflated
Thus, we can look at the ratio of the two mean square values to evaluate the null hypothesis

17 Testing the Hypothesis
The F-test looks at the variation among the group means relative to the variation within the groups
The F-statistic tends to be larger if the alternative hypothesis is true than if the null hypothesis is true
Under the null hypothesis, the test statistic F has an F(k-1, N-k) distribution

18 What does the F ratio tell us?
F = MS_between / MS_within
The denominator is always an estimate of σ² (under both the null and alternative hypotheses)
The numerator is either another estimate of σ² (under the null) or is inflated (under the alternative)
If the null is true, values of F are close to 1
If the alternative is true, values of F are larger
How large F must be to count as "large" depends on the degrees of freedom
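To make the F ratio concrete, here is a minimal sketch that computes the sums of squares, mean squares, and F statistic from raw data. The function name and the three tiny groups are hypothetical; a statistics package would also report the p-value from the F(k-1, N-k) distribution, which this sketch does not attempt.

```python
def one_way_anova(groups):
    """Return SS, df, MS and F for a one-way between-subjects design."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Between: group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within: observations around their own group mean.
    ss_within = sum((y - m) ** 2
                    for g, m in zip(groups, group_means) for y in g)

    df_between, df_within = k - 1, n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between, "ss_within": ss_within,
        "df_between": df_between, "df_within": df_within,
        "ms_between": ms_between, "ms_within": ms_within,
        "f": ms_between / ms_within,
    }


# Hypothetical data: three groups of three observations each.
result = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(result["f"])  # far above 1, pointing away from the overall null
```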

19 The ANOVA table
When running an ANOVA, statistical packages will return an ANOVA table summarizing the SS, MS, df, F-statistic, and p-value
Source | SS | df | MS | F | Sig
Group (Treatment, between) | SS_between | df_between | MS_between | MS_between / MS_within | P-value
Residual (Error, within) | SS_within | df_within | MS_within | |
Total | SS_between + SS_within | df_between + df_within | | |

20 Example
Suppose we want to know if typing speed varies across majors
Use 4 majors – Biology, Business, English, and Mathematics
H_0: typing speed is the same for students of all majors
‒ H_0: μ_Bio = μ_Business = μ_Eng = μ_Math
H_1: typing speed varies across the majors
‒ H_1: at least one of the means is different

21 Box plots

22 Summary
Major | n_i | Mean | Variance
Biology | 25 | 45.3 | 24.7
Business | 25 | 47.6 | 25.4
English | 25 | 55.6 | 38.8
Mathematics | 25 | 45.1 | 20.1
The largest variance is less than twice the smallest variance (38.8 < 2 · 20.1 = 40.2).
Use α = 0.05.
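The summary table alone is enough to approximate the ANOVA quantities. The sketch below rebuilds the mean squares and F ratio from the group sizes, means, and variances. Because the table values are rounded to one decimal, the results should only approximate what a statistics package reports from the raw data. Variable names are illustrative.

```python
sizes = [25, 25, 25, 25]               # Biology, Business, English, Mathematics
means = [45.3, 47.6, 55.6, 45.1]
variances = [24.7, 25.4, 38.8, 20.1]

k = len(sizes)
n_total = sum(sizes)
grand_mean = sum(n * m for n, m in zip(sizes, means)) / n_total

# SS_between: group means around the grand mean, weighted by group size.
ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))
# SS_within: each group variance times its own degrees of freedom.
ss_within = sum((n - 1) * v for n, v in zip(sizes, variances))

ms_between = ss_between / (k - 1)
ms_within = ss_within / (n_total - k)
f_stat = ms_between / ms_within

print(round(ms_between, 2), round(ms_within, 2), round(f_stat, 2))
```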

23 Degrees of Freedom
How many groups do we have?
What is the sample size?
Using these values:
‒ What is df_within?
‒ What is df_between?

24 Degrees of Freedom
How many groups do we have?
‒ There are k = 4 groups – Biology, English, Business, and Mathematics
What is the sample size?
‒ There are N = 100 students
Using these values,
‒ What is df_between?
‒ k – 1 = 4 – 1 = 3
‒ What is df_within?
‒ N – k = 100 – 4 = 96

25 Sample Output
Source | SS | df | MS | F | Sig
Group (Treatment, between) | 1807.49 | 3 | 602.50 | 22.091 | 0.000
Residual (Error, within) | 2618.20 | 96 | 27.17 | |
Total | 4425.69 | 99 | | |
Our estimate of σ² is 27.17
The numerator MS = 602.5 and appears to be highly inflated
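The relationships between the columns of the table (MS = SS/df, F = MS_between/MS_within, and the Total row as the sum of the other two) can be checked directly from the printed SS and df values. This is only a sketch of the arithmetic. Note that 2618.20/96 evaluates to roughly 27.27, which is the value that reproduces the reported F of 22.091, so the printed 27.17 may be a rounding or transcription slip.

```python
ss_between, df_between = 1807.49, 3
ss_within, df_within = 2618.20, 96

ms_between = ss_between / df_between   # mean square for Group
ms_within = ss_within / df_within      # mean square for Residual, the estimate of sigma^2
f_stat = ms_between / ms_within

ss_total = ss_between + ss_within      # Total row
df_total = df_between + df_within

print(round(ms_between, 2), round(ms_within, 2), round(f_stat, 3))
print(round(ss_total, 2), df_total)
```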

26 Results
F-statistic = 22.1
P-value: < 0.0005
Conclusion – the average words per minute differs for at least one of the majors
To make stronger statements we need to do further testing

27 Checking the assumptions

28 Further Analysis
If H_0 is rejected, we conclude that not all the μ's are equal
Would like to make statements about where there are differences
Can use planned or unplanned comparisons (or contrasts)
‒ Planned comparisons are interesting comparisons decided on before analysis
‒ Unplanned comparisons occur after seeing the results
‒ Be careful not to go fishing for results

29 Contrasts
A simple contrast hypothesis compares two population means
‒ H_0: μ_1 = μ_5
A complex contrast hypothesis has multiple population means on either side
‒ H_0: (μ_1 + μ_2) / 2 = μ_3
‒ H_0: (μ_1 + μ_2) / 2 = (μ_3 + μ_4 + μ_5) / 3
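Each of these hypotheses can be written as a set of coefficients that sum to zero, with the contrast estimated by applying the coefficients to the group means. The sketch below encodes the three contrasts above. The group means are hypothetical, and the function name is made up for illustration; a real analysis would also need a standard error to test each contrast.

```python
from fractions import Fraction


def contrast_estimate(coefficients, means):
    """Weighted sum of group means; valid contrast coefficients sum to zero."""
    if sum(coefficients) != 0:
        raise ValueError("contrast coefficients must sum to zero")
    return sum(c * m for c, m in zip(coefficients, means))


half, third = Fraction(1, 2), Fraction(1, 3)

simple = [1, 0, 0, 0, -1]                         # mu_1 = mu_5
complex_a = [half, half, -1, 0, 0]                # (mu_1 + mu_2)/2 = mu_3
complex_b = [half, half, -third, -third, -third]  # (mu_1 + mu_2)/2 = (mu_3 + mu_4 + mu_5)/3

hypothetical_means = [10, 14, 11, 9, 13]          # made-up sample means for five groups

for coeffs in (simple, complex_a, complex_b):
    print(contrast_estimate(coeffs, hypothetical_means))
```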

30 Planned Comparisons
Most statistical packages allow you to enter custom planned contrast hypotheses
The p-values are only valid under strict conditions
‒ The conditions maintain the Type-1 error rate across the whole experiment
Computer packages assume that you have checked the assumptions of the ANOVA test

31 Conditions for Planned Comparisons
Contrasts are selected before looking at the results; they are planned – not post-hoc
Must be ignored if the overall null is not rejected!
Each contrast is based on independent information from other contrasts
The number of planned comparisons must not be more than the corresponding degrees of freedom (k-1 in one-way ANOVA)

32 Unplanned Comparisons
What if we notice a possibly interesting difference when looking at the results?
Can do comparisons but need to adjust the α-level to control for Type-1 error
One common method is to use Tukey's simultaneous confidence intervals to compare any and all pairs of group population means
‒ This procedure takes multiple comparisons into consideration to preserve the α level

33 Other Options
Bonferroni correction for the number of comparisons done
Dunnett's tests
Scheffe procedure
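Of these options, the Bonferroni correction is the simplest to compute by hand: divide the overall α by the number of comparisons. The sketch below assumes all pairwise comparisons among k = 4 groups, as in the typing-speed example; the function name is illustrative.

```python
from math import comb


def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance level under the Bonferroni correction."""
    return alpha / n_comparisons


k = 4                      # number of groups
n_pairs = comb(k, 2)       # number of distinct pairwise comparisons
per_test_alpha = bonferroni_alpha(0.05, n_pairs)

print(n_pairs, per_test_alpha)
```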

34 Tukey's Multiple Comparisons for previous example

35 Conclusions
In the table on the previous page,
‒ 1 = Biology, 2 = Business, 3 = English, 4 = Mathematics
Biology, Business, and Mathematics are all significantly different from English
There are no other significant differences

36 Additional sample output
Below is the same output from a different software package

37 Comparison to Regression
Sample regression output
‒ Which major is our baseline?

38 Comparison to Regression
F-statistic = 22.1, p-value < 0.0005
‒ This is the same F-statistic and p-value as the ANOVA on slide 25
At least one of the explanatory variables is important – this corresponds to the rejection of the null: at least one of the means differs

39 Comparison to Regression
Note that Biology is the baseline and 45.3 is the mean WPM for Biology students
Note that Business and Mathematics are not significant
Agrees with the post-hoc comparisons: neither Business nor Mathematics is significantly different from Biology, but English is
To make further conclusions we will need to look at multiple comparisons, such as the previous Tukey intervals
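With a single categorical predictor and indicator (dummy) coding, the least-squares intercept equals the baseline group mean and each coefficient equals that group's mean minus the baseline mean. The sketch below derives those values from the rounded group means in the summary table, so they may differ slightly from the regression output shown on the slide, which is not preserved in this transcript.

```python
group_means = {
    "Biology": 45.3,
    "Business": 47.6,
    "English": 55.6,
    "Mathematics": 45.1,
}
baseline = "Biology"

# Intercept is the baseline mean; each coefficient is a difference from baseline.
intercept = group_means[baseline]
coefficients = {
    major: mean - intercept
    for major, mean in group_means.items()
    if major != baseline
}

print(intercept)
for major, coef in coefficients.items():
    print(major, round(coef, 1))
```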

40 Regression
The conclusions about the overall null hypothesis will be the same
In regression we can make statements comparing groups to the baseline
To make more conclusive statements we will need to do more analysis
ANOVA with either planned or post-hoc comparisons will do the same thing and is often easier

41 One-way ANOVA Power
Two different SAT prep courses charge $1200 for a two-month course.
An (unethical) experiment would be to randomize students into one of the two courses or no course
What information is needed to calculate power for this one-way ANOVA?
‒ Sample size
‒ Within-group variance (σ²)
‒ Estimated or minimally interesting outcome means for each group

42 Estimate of σ²
Based on previous years, we know that 95% of the student scores on SATs fall between 900 and 1500
σ = (1500 - 900)/4 = 150
σ² = 150² = 22500
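The estimate above uses the fact that, for a roughly normal distribution, about 95% of values fall within two standard deviations of the mean, so the 95% range spans about four standard deviations. A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def sd_from_95_percent_range(low, high):
    """Approximate sigma as one quarter of the range covering about 95% of values."""
    return (high - low) / 4


sigma = sd_from_95_percent_range(900, 1500)
variance = sigma ** 2

print(sigma, variance)
```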

43 Minimally interesting outcome
What is the minimal average benefit, in points gained, that would justify the program?
‒ The minimally interesting outcome is based on previous knowledge
For this example we'll try several different values

44 sd[treatment]
Different applets will define things slightly differently. Find an applet you understand.
The applet I will show you requires sd[treatment]. From their definition this is calculated as [equation not preserved in the transcript]
‒ Where μ_i is the i-th group mean
‒ k = the number of groups
Ready to go to power applet

45 Calculating the power
Let σ = 150, n = 50, effect = 50 points
‒ Power = 0.3811
Let σ = 150, n = 100, effect = 50 points
‒ Power = 0.6772
Let σ = 150, n = 50, effect = 100 points
‒ Power = 0.9367
Let σ = 150, n = 50, effect = 25 points
‒ Power = 0.1245

46 Calculating the power
Let σ = 100, n = 50, effect = 50 points
‒ Power = 0.7276
Let σ = 100, n = 100, effect = 50 points
‒ Power = 0.9622
Let σ = 100, n = 50, effect = 100 points
‒ Power = 0.997
Let σ = 100, n = 50, effect = 25 points
‒ Power = 0.2294

47 Moving past One-way ANOVA
What if we have two categorical explanatory variables?
What if we have categorical and quantitative explanatory variables?
What if subjects have more than one treatment?
What if there is more than one response variable?
And many other combinations…

48 Two-way ANOVA
Suppose we now have two categorical explanatory variables
Is there a significant X_1 effect?
Is there a significant X_2 effect?
Are there significant interaction effects?
If X_1 has k levels and X_2 has m levels, then the analysis is often referred to as a "k by m ANOVA" or "k x m ANOVA"

49 Terminology
If the interaction is significant, the model is called an interaction model
If the interaction is not significant, the model is called an additive model
Explanatory variables are often referred to as factors

50 Assumptions
The assumptions are the same as in one-way ANOVA
1. The errors ε_ij are normally distributed
2. Across the conditions, the errors have equal spread. Often referred to as equal variances.
3. The errors are independent of each other

51 Two-way ANOVA
Two-way (or multi-way) ANOVA is an appropriate analysis method for a study with a quantitative outcome and two (or more) categorical explanatory variables.
The assumptions are normality, equal variance, and independent errors.

52 Results
Results are again displayed in an ANOVA table
It will have one line for each term in the model. For a model with two factors, we will have one line for each factor and one line for the interaction.
We will also have a line for the error and the total. See next page.

53 The ANOVA table
Source | SS | df | MS | F | Sig
Factor 1 | | k-1 | | |
Factor 2 | | m-1 | | |
Interaction | | (k-1)(m-1) | | |
Error | | N-km | * | |
Total | | N-1 | | |
The MS(error), denoted by * in the above table, is the true estimate of σ²
The MS in each row is that row's SS/df
The F-statistic is the MS/MS(error)
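The degrees of freedom in the table can be tabulated for any k by m design. The sketch below uses a hypothetical helper and, as an example, a 4 by 2 design with N = 100 subjects. The four component rows should add up to the total N - 1.

```python
def two_way_df(k, m, n_total):
    """Degrees of freedom for each row of a k-by-m ANOVA table with interaction."""
    df = {
        "factor1": k - 1,
        "factor2": m - 1,
        "interaction": (k - 1) * (m - 1),
        "error": n_total - k * m,
    }
    df["total"] = n_total - 1
    return df


df_table = two_way_df(k=4, m=2, n_total=100)
for row, value in df_table.items():
    print(row, value)
```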

54 Exploratory Analysis
Table of means
Interaction or profile plots
‒ An interaction plot is a way to look at outcome means for two factors simultaneously
‒ A plot with parallel lines suggests an additive model
‒ A plot with non-parallel lines suggests an interaction model
‒ Note that an interaction plot should NOT be the deciding factor in whether or not to run an interaction model

55 Example
Continuing with the previous example, suppose we'd like to add gender as an explanatory variable
X_1: Major – 4 levels
X_2: Gender – 2 levels
Response: words per minute typed
We will fit a 4 by 2 ANOVA

56 Table of Means and Counts
Means:
Major | Male | Female | Overall
Biology | 45.5 | 45.2 | 45.4
Business | 48.6 | 46.9 | 47.6
English | 55.3 | 55.9 | 55.6
Mathematics | 45.6 | 44.6 | 45.1
Overall | 48.9 | 47.9 | 48.4
Counts:
Major | Male | Female
Biology | 14 | 11
Business | 10 | 15
English | 14 | 11
Mathematics | 12 | 13
Note, this table should also include the standard error of each of the means.
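Because the cells have unequal counts, the "Overall" entries are count-weighted averages of the cell means rather than simple averages. The sketch below recomputes the margins from the cell means and counts in the table. Variable and function names are illustrative.

```python
majors = ["Biology", "Business", "English", "Mathematics"]
cell_means = {
    "Male": [45.5, 48.6, 55.3, 45.6],
    "Female": [45.2, 46.9, 55.9, 44.6],
}
cell_counts = {
    "Male": [14, 10, 14, 12],
    "Female": [11, 15, 11, 13],
}


def weighted_mean(means, counts):
    """Average of cell means, weighted by the number of subjects in each cell."""
    return sum(m * n for m, n in zip(means, counts)) / sum(counts)


# Row margins: one overall mean per major, pooling both genders.
major_means = {
    major: weighted_mean(
        [cell_means[g][i] for g in cell_means],
        [cell_counts[g][i] for g in cell_counts],
    )
    for i, major in enumerate(majors)
}
# Column margins: one overall mean per gender, pooling all majors.
gender_means = {g: weighted_mean(cell_means[g], cell_counts[g]) for g in cell_means}

for name, value in {**major_means, **gender_means}.items():
    print(name, round(value, 1))
```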

57 Interaction plots

58 Interaction plots
There are two ways to do an interaction plot. Both are legitimate. Ease of interpretation is the final criterion for which to do.
If one explanatory variable has more levels than the other, interpretation is often easier if the explanatory variable with more levels defines the x-axis
If one explanatory variable is quantitative but has been categorized and the other is categorical, interpretation is often easier if the categorized quantitative variable defines the x-axis. Example: age, 20-29, 30-39, 40-49, etc.

59 Results
Typical output: the last column contains the p-values
‒ Always check the interaction first!
‒ If the interaction is not significant, rerun without it

60 Results
Updated results
Now we can interpret the main effects. We can see that major is significant but that gender is not.

61 Checking the assumptions

62 Notes
If the interaction is significant, do not check the main effects. The main effects should always be kept if the interaction is significant.
Note that due to the groups of students, you will see vertical lines in the residual versus predicted plot. This is due to the fact that all students with a particular combination of the factors will have the same predicted value.

63 Example 2
Using the same variables, let's look at a different outcome

64 Table of Means Example 2
Major | Male | Female | Overall
Biology | 37.9 | 45.8 | 41.2
Business | 39.9 | 45.0 | 43.0
English | 45.3 | 60.0 | 51.8
Mathematics | 41.8 | 50.0 | 46.1
Overall | 41.3 | 49.8 | 51.2

65 Typical SPSS Exploratory Analysis

66 Interaction plots Example 2

67 Results Example 2
Results
Note that the interaction is significant
‒ In this case both main effects are also significant; however, since the interaction is significant we would keep them even if they were not

68 Example 2

69 Example 2

70 Example 3
Again, using the same variables, let's look at a different outcome

71 Table of Means Example 3
Major | Male | Female | Overall
Biology | 47.9 | 47.2 | 47.6
Business | 50.2 | 48.1 | 49.0
English | 54.8 | 62.1 | 58.1
Mathematics | 52.0 | 48.4 | 50.1
Overall | 51.3 | 51.1 | 58.0

72 Interaction Plots Example 3

73 Results Example 3
Results
In this case, the interaction and major are significant, but gender is not. Since the interaction is significant, leave gender in the model.

74 Example 3

75 Example – Ginkgo for Memory
A study was performed to test the memory effects of the herbal medicine Ginkgo biloba in healthy people.
Subjects received a daily dosage (placebo, 120mg, 250mg) for two months.
Subjects also received one of two types of mnemonic training.
All subjects were given a memory test before the study and again at the end.
The response variable is the difference (after – before) in memory test scores.
There were 18 subjects randomly assigned to each combination of levels.

76 Exploratory Analysis

77 Exploratory Analysis

78 SPSS ANOVA output
Conclusions?

79 ANOVA output
Conclusions?

80 Estimated Profile Plot

81 Post-hoc Comparisons
Since there are only two levels of training and there is a significant training effect, we don't need multiple comparisons for training

82 Residual plot
No problems

83 Further Analysis
If there had been an interaction, we could create a table indicating which differences were significant

84 ANCOVA
Analysis of Covariance
‒ At least one quantitative and one categorical explanatory variable
‒ In general, the main interest is the effect of the categorical variable, and the quantitative variable is considered to be a control variable
‒ It is a blending of regression and ANOVA
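One common way to write an ANCOVA model with a two-level categorical variable and one quantitative covariate is shown below. This is a sketch under stated assumptions rather than the notation used on the slides: g_i (a 0/1 group indicator), x_i (the covariate), and the β coefficients are symbols introduced here for illustration.

```latex
% ANCOVA with interaction: each group has its own intercept and slope
y_i = \beta_0 + \beta_1 g_i + \beta_2 x_i + \beta_3 (g_i x_i) + \varepsilon_i,
\qquad \varepsilon_i \sim N(0, \sigma^2)

% Additive ANCOVA (interaction dropped): parallel lines, shifted by \beta_1
y_i = \beta_0 + \beta_1 g_i + \beta_2 x_i + \varepsilon_i
```

In the additive form, β_1 is the difference between the two groups at any fixed value of the covariate, which is usually the quantity of main interest.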

85 Example
Suppose that we have two different math tutors and would like to compare performance on the final math test
We also have time on tutor and would like to use that as another explanatory variable

86 Exploratory Analysis

87 Compare Regression and ANCOVA
Regression
ANCOVA

88 Compare Regression and ANOVA
Note that the p-value for the interaction is the same in both models
The interaction is not significant, so drop it and rerun

89 Compare Regression and ANOVA
Regression
ANCOVA

