1
Copyright (c) Bani K. Mallick. STAT 651 Lecture #13
2
Topics in Lecture #13: Multiple comparisons, especially Fisher’s Least Significant Difference. Residuals as a means of checking the normality assumption.
3
Book Sections Covered in Lecture #13: Chapter 8.4 (Residuals); Chapter 9.4 (Fisher’s); Chapter 9.1 (the idea of multiple comparisons).
4
Lecture 12 Review: ANOVA. Suppose we form three populations on the basis of body mass index (BMI), with cutoffs including BMI = 28. We want to know whether the three populations have the same mean caloric intake, or whether their food composition differs.
5
Lecture 12 Review: ANOVA. One procedure that is often followed is to do a preliminary test to see whether there are any differences among the populations. Then, once you conclude that some differences exist, you allow somewhat more informality in deciding where those differences manifest themselves. The first step is the ANOVA F-test.
6
Lecture 12 Review: ANOVA. The distance of the data to the overall mean is the (Corrected) Total Sum of Squares, $TSS = \sum_{i=1}^{t}\sum_{j=1}^{n_i}(y_{ij}-\bar{y}_{..})^2$. This has $n_T - 1$ degrees of freedom, where $n_T$ is the total number of observations.
7
Lecture 12 Review: ANOVA. The sum of squares between groups (Corrected Model) is $SSB = \sum_{i=1}^{t} n_i(\bar{y}_{i.}-\bar{y}_{..})^2$. It has $t-1$ degrees of freedom, so the number of populations is the degrees of freedom between groups + 1.
8
Lecture 12 Review: ANOVA. The distance of the observations to their sample means is $SSE = \sum_{i=1}^{t}\sum_{j=1}^{n_i}(y_{ij}-\bar{y}_{i.})^2$. This is the Sum of Squares for Error. It has $n_T - t$ degrees of freedom.
9
Lecture 12 Review: ANOVA. Next comes the F-statistic. It is the ratio of the mean square for the corrected model to the mean square for error, $F = \text{MS(Corrected Model)} / \text{MSE}$, where each mean square is a sum of squares divided by its degrees of freedom. Large values indicate rejection of the null hypothesis.
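The review above can be sketched as a short calculation. This is a minimal illustration on made-up numbers, not the lecture's data; the function name `one_way_anova` and the sample values are assumptions introduced here.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (TSS, SSB, SSE, F) for a list of samples, one list per population."""
    all_obs = [y for g in groups for y in g]
    n_total, t = len(all_obs), len(groups)
    grand_mean = mean(all_obs)
    group_means = [mean(g) for g in groups]

    # Distance of every observation to the overall mean
    tss = sum((y - grand_mean) ** 2 for y in all_obs)
    # Distance of each group mean to the overall mean, weighted by group size
    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Distance of every observation to its own group mean
    sse = sum((y - m) ** 2 for g, m in zip(groups, group_means) for y in g)

    ms_between = ssb / (t - 1)      # t - 1 degrees of freedom
    ms_error = sse / (n_total - t)  # n_T - t degrees of freedom
    return tss, ssb, sse, ms_between / ms_error

# Hypothetical data: three populations, three observations each
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
tss, ssb, sse, f_stat = one_way_anova(groups)
print(tss, ssb, sse, f_stat)
```

The three sums of squares satisfy TSS = SSB + SSE, which is a quick check that the decomposition was computed correctly.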
10
Lecture 12 Review: ANOVA. The F-statistic is compared to the F-distribution with $t-1$ and $n_T - t$ degrees of freedom. See Table 8, which lists the cutoff points in terms of $\alpha$. If the F-statistic exceeds the cutoff, you reject the hypothesis of equality of all the means. SPSS gives you the p-value (significance level) for this test.
11
Lecture 12 Review: ANOVA. The F-statistic is compared to the F-distribution with $df_1 = t-1$ and $df_2 = n_T - t$ degrees of freedom. For example, if you have 3 populations and 6 observations for each population, then there are 18 total observations. The degrees of freedom are 2 and 15. If you want a Type I error of 5%, look at $df_1 = 2$, $df_2 = 15$, $\alpha = 0.05$ to get a critical value of 3.68: try this out!
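The degrees-of-freedom bookkeeping in this example is easy to check. A small sketch; the helper name `anova_df` is hypothetical:

```python
def anova_df(group_sizes):
    """Return (df1, df2) for the one-way ANOVA F-test."""
    t = len(group_sizes)        # number of populations
    n_total = sum(group_sizes)  # n_T, total number of observations
    return t - 1, n_total - t

# 3 populations with 6 observations each, as in the slide's example
df1, df2 = anova_df([6, 6, 6])
print(df1, df2)  # 2 15
```

The critical value itself still comes from the F table (3.68 for these degrees of freedom at 5%).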
12
Lecture 12 Review: ANOVA. If the populations have a common variance $\sigma^2$, the mean squared error (MSE) estimates it. You take the square root of the MSE to estimate $\sigma$.
13
Lecture 12 Review: ANOVA. The critical value for an F-test with 2 and 181 df at Type I error 0.05 is about 3.05. Hence F > 3.05, so the p-value is < 0.05.
14
ANOVA in SPSS. “Analyze”, “General Linear Model”, “Univariate”. “Fixed factor” = the variable defining the populations. Always “Save” unstandardized residuals. “Posthoc”: move the factor to the right and click on LSD.
15
ANOVA Table
16
Fisher’s Least Significant Difference (LSD). Suppose that we determine that there are at least some differences among t population means. Fisher’s Least Significant Difference is one way to tell which ones are different. The main reason to use it is convenience: all comparisons can be done with the click of a mouse. It does not guarantee longer or shorter confidence intervals.
17
Fisher’s Least Significant Difference (LSD). For example, suppose there are t = 3 populations. The null hypothesis is $H_0: \mu_1 = \mu_2 = \mu_3$. The alternative is $H_a$: not all of the means are equal. But this does not tell you which populations are different, only that some are.
18
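Fisher’s Least Significant Difference (LSD). The null hypothesis is $H_0: \mu_1 = \mu_2 = \mu_3$. The alternative is $H_a$: not all of the means are equal. There are 4 possibilities under the alternative: $\mu_1 = \mu_2 \neq \mu_3$; $\mu_1 = \mu_3 \neq \mu_2$; $\mu_2 = \mu_3 \neq \mu_1$; or all three means differ. Fisher’s LSD is a way of getting this directly.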
19
Fisher’s LSD. We have done an ANOVA, and now we want to compare two specific populations. Fisher’s LSD differs from our usual 2-population comparisons in two features. The degrees of freedom: $n_T - t$, not $n_1 + n_2 - 2$. The pooled standard deviation: the square root of $MSE = SSE/(n_T - t)$, not $s_P$.
20
Review: Comparing Two Populations. If you can reasonably believe that the population sd’s are nearly equal, it is customary to pick the equal variance assumption and estimate the common standard deviation by $s_P = \sqrt{\dfrac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}}$.
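The usual pooled estimate weights each sample variance by its degrees of freedom. A minimal sketch, with hypothetical sample sizes and standard deviations that are not taken from the lecture; the function name `pooled_sd` is also an assumption:

```python
import math

def pooled_sd(n1, s1, n2, s2):
    """Pooled standard deviation for two samples under the equal-variance assumption."""
    numerator = (n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2
    return math.sqrt(numerator / (n1 + n2 - 2))

# Hypothetical inputs: two samples of size 10 with sd 3.0 and 4.0
s_p = pooled_sd(10, 3.0, 10, 4.0)
print(round(s_p, 4))
```

With equal sample sizes the pooled variance is just the average of the two sample variances, here (9 + 16)/2 = 12.5.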
21
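Comparing Two Populations: Usual and Fisher LSD. Usual: $\bar{y}_1 - \bar{y}_2 \pm t_{\alpha/2}(n_1+n_2-2)\, s_P \sqrt{1/n_1 + 1/n_2}$. Fisher: $\bar{y}_1 - \bar{y}_2 \pm t_{\alpha/2}(n_T-t)\, \sqrt{MSE}\, \sqrt{1/n_1 + 1/n_2}$.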
22
ROS Data. ROS data has three groups: Fish oil diet, Fish-like oil diet, and Corn Oil. We want to compare their responses to butyrate.
23
ANOVA: ROS data, log scale. What do you see?
24
ANOVA: ROS data, log scale. What do you see? Maybe different variances, but sample sizes are small.
25
ANOVA: ROS data, log scale. No major changes in means?
26
ANOVA. ROS data has three groups: Fish oil diet, Fish-like oil diet, and Corn Oil. What was the total sample size? n = 30

Tests of Between-Subjects Effects
Dependent Variable: log(Butyrate) - log(Control)

Source             Type III Sum of Squares   df   Mean Square   F        Sig.
Corrected Model    5.188E-02 (a)             2    2.594E-02     .203     .818
Intercept          5.957                     1    5.957         46.542   .000
DIETGRP            5.188E-02                 2    2.594E-02     .203     .818
Error              3.456                     27   .128
Total              9.465                     30
Corrected Total    3.508                     29

a. R Squared = .015 (Adjusted R Squared = -.058)
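The entries of this ANOVA table are tied together by simple arithmetic, which can be verified from the printed sums of squares and degrees of freedom. A sketch using only the numbers shown in the SPSS output; the variable names are illustrative:

```python
# Values read from the SPSS table for the ROS data
ss_model, df_model = 5.188e-02, 2
ss_error, df_error = 3.456, 27
ss_corrected_total, df_corrected_total = 3.508, 29

ms_model = ss_model / df_model  # mean square = sum of squares / df
ms_error = ss_error / df_error  # this is the MSE
f_stat = ms_model / ms_error    # F-statistic

# R squared: share of the corrected total explained by the model
r_squared = ss_model / ss_corrected_total
# Adjusted R squared penalizes for the degrees of freedom used
adj_r_squared = 1 - (1 - r_squared) * df_corrected_total / df_error

print(round(ms_error, 3), round(f_stat, 3), round(r_squared, 3), round(adj_r_squared, 3))
```

The corrected total df of 29 is also where the answer n = 30 comes from, since the corrected total has n - 1 degrees of freedom.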
27
ANOVA. ROS data: any evidence that the population means are different in their change after butyrate exposure?
28
ANOVA. ROS data: any evidence that the population means are different in their change after butyrate exposure? No, the p-value is 0.818! This matches the box plots.
29
ROS Data: Testing for Normality in ANOVA. I use the General Linear Model to define these residuals. Form the residuals, which are simply the differences of the data with their group sample mean. Then do a q-q plot. Useful if you have many groups with a small number of observations per group.
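The residual and q-q steps can be sketched outside SPSS. This is an illustration on hypothetical data, not the ROS measurements; the plotting positions (i - 0.5)/n are one common convention and an assumption here, and statistical packages may use a different one.

```python
from statistics import NormalDist, mean

def anova_residuals(groups):
    """Each observation minus its own group's sample mean."""
    return [y - mean(g) for g in groups for y in g]

def qq_points(residuals):
    """Pair sorted residuals with standard normal quantiles."""
    n = len(residuals)
    ordered = sorted(residuals)
    # ASSUMPTION: plotting positions (i - 0.5) / n
    theoretical = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))

# Hypothetical data: three groups with three observations each
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
residuals = anova_residuals(groups)
points = qq_points(residuals)
for normal_quantile, residual in points:
    print(round(normal_quantile, 3), residual)
```

If the residuals are roughly normal, the printed pairs fall close to a straight line when plotted. Pooling residuals across groups is what makes this workable when each group is small.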
30
ANOVA. Here is the Q-Q plot. How’s it look?
31
ROS Data: Testing for Normality in ANOVA. Illustrate saving residuals: “general linear model”, “univariate”, “save” (select “unstandardized” to create the residual variable). Illustrate a q-q plot on the residuals. Illustrate editing a chart object to change titles and the like.
32
ROS Data: Fisher’s LSD. Note how all p-values are > 0.10.

Multiple Comparisons
Dependent Variable: log(Butyrate) - log(Control)
LSD

(I) Diet Group   (J) Diet Group   Mean Difference (I-J)   Std. Error   Sig.   95% CI Lower Bound   95% CI Upper Bound
FAEE oil diet    Fish oil diet    6.825E-02               .1600        .673   -.2600               .3965
FAEE oil diet    Corn oil diet    9.960E-02               .1600        .539   -.2287               .4279
Fish oil diet    FAEE oil diet    -6.8255E-02             .1600        .673   -.3965               .2600
Fish oil diet    Corn oil diet    3.135E-02               .1600        .846   -.2969               .3596
Corn oil diet    FAEE oil diet    -9.9605E-02             .1600        .539   -.4279               .2287
Corn oil diet    Fish oil diet    -3.1350E-02             .1600        .846   -.3596               .2969

Based on observed means.
33
ROS Data: Compare Fish to Corn oil (same Multiple Comparisons table as slide 32). Mean for fish - mean for corn =
34
ROS Data: Compare Fish to Corn oil (same Multiple Comparisons table as slide 32). Mean for fish - mean for corn = 0.03135. Standard error =
35
ROS Data: Compare Fish to Corn oil (same Multiple Comparisons table as slide 32). Mean for fish - mean for corn = 0.03135. Standard error = 0.1600. CI (95%) =
36
ROS Data: Compare Fish to Corn oil (same Multiple Comparisons table as slide 32). Mean for fish - mean for corn = 0.03135. Standard error = 0.1600. CI (95%) = -0.2969 to 0.3596.
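The standard error and interval for the fish versus corn comparison can be reproduced from the ANOVA output. A sketch under two labeled assumptions: that each diet group has 10 observations (inferred from n = 30 and the identical standard errors, not stated on the slides), and that the t-table cutoff for 27 degrees of freedom is about 2.0518.

```python
import math

mse = 0.128           # Mean Square for Error from the ANOVA table
n_fish = n_corn = 10  # ASSUMPTION: equal group sizes, 30 observations / 3 diets
t_crit = 2.0518       # ASSUMPTION: t-table value, upper 2.5% point, 27 df

diff = 3.135e-02      # mean for fish minus mean for corn, from the LSD table

# Fisher's LSD uses sqrt(MSE) in place of the two-sample pooled sd
se = math.sqrt(mse * (1 / n_fish + 1 / n_corn))
lower = diff - t_crit * se
upper = diff + t_crit * se

print(round(se, 4), round(lower, 4), round(upper, 4))
```

The interval contains zero, which agrees with the p-value of 0.846 for this comparison.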
37
Concho Water Snake Illustration. A numerical example will help illustrate this idea. I’ll consider comparing tail lengths of female Concho Water Snakes with age classes 2, 3, and 4. Sample sizes: Sample sd: Sample means:
38
Female Concho Water Snakes, Ages 2-4, Tail Length
39
Female Concho Water Snakes, Ages 2-4, Tail Length
40
Female Concho Water Snakes, Ages 2-4, Tail Length: are they different in population means?
41
Concho Water Snake Example

Multiple Comparisons
Dependent Variable: Tail Length
LSD

(I) Age   (J) Age   Mean Difference (I-J)   Std. Error   Sig.   95% CI Lower Bound   95% CI Upper Bound
2.00      3.00      -19.4171*               5.3907       .001   -30.3724             -8.4618
2.00      4.00      -40.8485*               6.2616       .000   -53.5736             -28.1233
3.00      2.00      19.4171*                5.3907       .001   8.4618               30.3724
3.00      4.00      -21.4314*               5.7429       .001   -33.1023             -9.7604
4.00      2.00      40.8485*                6.2616       .000   28.1233              53.5736
4.00      3.00      21.4314*                5.7429       .001   9.7604               33.1023

Based on observed means.
*. The mean difference is significant at the .05 level.
42
Concho Water Snake Illustration: Hand Calculations. Sample size factor for comparing age groups i and j: $\sqrt{1/n_i + 1/n_j}$. Sample mean difference: $\bar{y}_{i.} - \bar{y}_{j.}$.
43
Concho Water Snake Illustration. $n_T - t = 34$ degrees of freedom for error, MSE = 194.08, $\alpha = 0.05$. The resulting 95% interval for age 4 versus age 3 is 9.76 to 33.10: compare with output.
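The hand calculation can be repeated for all three age comparisons using the mean differences and standard errors printed in the SPSS output. A sketch assuming a t-table cutoff of about 2.0322 for 34 degrees of freedom; the dictionary layout and names are illustrative only.

```python
t_crit = 2.0322  # ASSUMPTION: t-table value, upper 2.5% point, 34 df

# (older age, younger age): (mean difference, standard error) from the LSD output
comparisons = {
    (3, 2): (19.4171, 5.3907),
    (4, 2): (40.8485, 6.2616),
    (4, 3): (21.4314, 5.7429),
}

results = {}
for pair, (diff, se) in comparisons.items():
    lsd = t_crit * se  # least significant difference for this pair
    results[pair] = {
        "lsd": lsd,
        "significant": abs(diff) > lsd,  # declare a difference if |diff| exceeds the LSD
        "ci": (diff - lsd, diff + lsd),
    }

for pair, r in results.items():
    print(pair, round(r["lsd"], 2), r["significant"],
          round(r["ci"][0], 2), round(r["ci"][1], 2))
```

A pair is declared different exactly when its 95% interval excludes zero, so the significance flag and the interval always tell the same story.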
44
Female Concho Water Snakes, Ages 2-4, Tail Length. We need a method that allows for non-normal data!