Formula for Linear Regression
y = bx + a
- y: the variable plotted on the vertical axis
- x: the variable plotted on the horizontal axis
- b: the slope, or the change in y for every unit change in x
- a: the y-intercept, or the value of y when x = 0

Interpretation of parameters
- The regression slope is the average change in Y when X increases by 1 unit.
- The intercept is the predicted value of Y when X = 0.
- If the slope = 0, then X does not help in predicting Y (linearly).

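A minimal sketch of how b and a in y = bx + a could be estimated from data. It assumes ordinary least squares, which the slides do not name, and the function name `fit_line` and the sample data are hypothetical.

```python
from statistics import mean

def fit_line(xs, ys):
    """Least-squares estimates of slope b and intercept a for y = b*x + a."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of cross-products and sum of squares about the means
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx            # change in y per unit change in x
    a = y_bar - b * x_bar    # predicted y when x = 0
    return b, a

# Hypothetical data lying exactly on y = 2x + 1
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]
b, a = fit_line(xs, ys)
print(f"slope b = {b:.2f}, intercept a = {a:.2f}")
```

With a slope of 2, each 1-unit increase in x raises the predicted y by 2; the intercept is the prediction at x = 0.
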
General ANOVA Setting
- Comparisons of 2 or more means
- Investigator controls one or more independent variables
  - Called factors (or treatment variables)
  - Each factor contains two or more levels (or groups or categories/classifications)
- Observe effects on the dependent variable
  - Response to levels of the independent variable
- Experimental design: the plan used to collect the data

Logic of ANOVA
- Each observation differs from the grand (total sample) mean by some amount.
- There are two sources of variation about the grand mean:
  - 1) That due to the treatment or independent variable
  - 2) That which is unexplained by our treatment

One-Way Analysis of Variance
- Evaluate the difference among the means of two or more groups
- Examples:
  - Accident rates for 1st, 2nd, and 3rd shift
  - Expected mileage for five brands of tires
- Assumptions
  - Populations are normally distributed
  - Populations have equal variances
  - Samples are randomly and independently drawn

Hypotheses of One-Way ANOVA
- H0: μ1 = μ2 = … = μc
  - All population means are equal
  - i.e., no treatment effect (no variation in means among groups)
- H1: Not all of the population means are equal
  - At least one population mean is different
  - i.e., there is a treatment effect
  - Does not mean that all population means are different (some pairs may be the same)
- ANOVA does not tell you where the difference lies. For this reason, you need another test, either the Scheffé or the Tukey post-ANOVA test.

One-Factor ANOVA
- All means are the same: the null hypothesis is true (no treatment effect)
- At least one mean is different: the null hypothesis is NOT true (treatment effect is present)

Partitioning the Variation
Total variation can be split into two parts: SST = SSA + SSW
- SST = Total Sum of Squares (total variation)
- SSA = Sum of Squares Among Groups (among-group variation)
- SSW = Sum of Squares Within Groups (within-group variation)

Partitioning the Variation: Definitions
- Total variation (SST) = the aggregate dispersion of the individual data values across the various factor levels
- Within-group variation (SSW) = dispersion that exists among the data values within a particular factor level
- Among-group variation (SSA) = dispersion between the factor sample means
- SST = SSA + SSW

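The partition can be checked numerically. The sketch below is illustrative only: the function name `partition_variation` and the three small groups are hypothetical, and the formulas follow the verbal definitions above (squared deviations from the grand mean, from the group means, and of group means from the grand mean).

```python
def partition_variation(groups):
    """Return (SST, SSA, SSW) for a list of groups of observations."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # Total variation: every observation against the grand mean
    sst = sum((x - grand_mean) ** 2 for x in all_values)
    # Among-group variation: group means against the grand mean, weighted by group size
    ssa = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Within-group variation: every observation against its own group mean
    ssw = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    return sst, ssa, ssw

# Hypothetical data: three factor levels, three observations each
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
sst, ssa, ssw = partition_variation(groups)
print(f"SST = {sst:.2f}, SSA = {ssa:.2f}, SSW = {ssw:.2f}")
print("SST equals SSA + SSW:", abs(sst - (ssa + ssw)) < 1e-9)
```
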
Partition of Total Variation
Total Variation (SST) = Variation Due to Factor (SSA) + Variation Due to Random Sampling (SSW)
d.f.: (n - 1) = (c - 1) + (n - c)
- SSA is commonly referred to as: Sum of Squares Between, Sum of Squares Among, Sum of Squares Explained, Among-Groups Variation
- SSW is commonly referred to as: Sum of Squares Within, Sum of Squares Error, Sum of Squares Unexplained, Within-Group Variation

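Among-Group Variation
- Variation due to differences among groups
- SSA = Σ nj (X̄j - X̄)², summed over the c groups, where nj is the size of group j, X̄j is its sample mean, and X̄ is the grand mean
- Mean Square Among: MSA = SSA / (c - 1)
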
Within-Group Variation
- Sum the variation within each group, then add over all groups
- SSW = Σj Σi (Xij - X̄j)², where Xij is observation i in group j
- Mean Square Within: MSW = SSW / (n - c)

One-Way ANOVA Table

Source of Variation   df      SS                MS (Variance)          F ratio
Among Groups          c - 1   SSA               MSA = SSA / (c - 1)    F = MSA / MSW
Within Groups         n - c   SSW               MSW = SSW / (n - c)
Total                 n - 1   SST = SSA + SSW

c = number of groups
n = sum of the sample sizes from all groups
df = degrees of freedom

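As a rough illustration, the table entries can be filled in from the two sums of squares and the counts n and c. The helper name `anova_table` and the input values are hypothetical.

```python
def anova_table(ssa, ssw, n, c):
    """Build the one-way ANOVA table rows from SSA, SSW, n and c."""
    df_among, df_within = c - 1, n - c
    msa = ssa / df_among    # mean square among groups
    msw = ssw / df_within   # mean square within groups
    return {
        "Among Groups": {"df": df_among, "SS": ssa, "MS": msa, "F": msa / msw},
        "Within Groups": {"df": df_within, "SS": ssw, "MS": msw},
        "Total": {"df": n - 1, "SS": ssa + ssw},
    }

# Hypothetical sums of squares for 9 observations in 3 groups
table = anova_table(ssa=26.0, ssw=6.0, n=9, c=3)
for source, row in table.items():
    print(source, row)
```
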
One-Way ANOVA: F Test Statistic
- H0: μ1 = μ2 = … = μc
- H1: At least two population means are different
- Test statistic: F = MSA / MSW
  - MSA is mean squares among groups
  - MSW is mean squares within groups
- Degrees of freedom
  - df1 = c - 1 (c = number of groups)
  - df2 = n - c (n = sum of sample sizes from all populations)

Interpreting One-Way ANOVA F Statistic
- The F statistic is the ratio of the among-group estimate of variance to the within-group estimate of variance
  - The ratio is never negative
  - df1 = c - 1 will typically be small
  - df2 = n - c will typically be large
- Decision rule (α = .05): Reject H0 if F > F_U, otherwise do not reject H0
[Figure: F distribution with the rejection region to the right of the critical value F_U]

One-Way ANOVA: F Test Example
You want to see if cholesterol level is different in three groups. You randomly select five patients from each group and measure their cholesterol levels. At the 0.05 significance level, is there a difference in mean cholesterol?
[Table: cholesterol levels for Gp 1, Gp 2, and Gp 3]

One-Way ANOVA Example: Scatter Diagram
[Figure: scatter diagram of cholesterol (vertical axis) by group (horizontal axis: Gp 1, Gp 2, Gp 3)]

One-Way ANOVA Example Computations
- Group means: X̄1 = 249.2, X̄2 = 226, X̄3 = 205.8; grand mean X̄ = 227
- Sample sizes: n1 = 5, n2 = 5, n3 = 5; n = 15; c = 3
- SSA = 5(249.2 - 227)² + 5(226 - 227)² + 5(205.8 - 227)² = 4716.4
- SSW = (254 - 249.2)² + (263 - 249.2)² + … + (204 - 205.8)²
- MSA = 4716.4 / (3 - 1) = 2358.2
- MSW = SSW / (15 - 3) = 93.3

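The arithmetic for the among-group terms can be reproduced from the summary figures on this slide alone. This is a sketch, not the original computation: the raw observations are not available here, so MSW = 93.3 is taken as stated rather than recomputed, and the variable names are hypothetical.

```python
# Summary figures as they appear in the example
group_means = [249.2, 226.0, 205.8]
group_sizes = [5, 5, 5]
grand_mean = 227.0
msw = 93.3  # taken from the slide; cannot be recomputed without the raw data

c = len(group_means)   # number of groups
n = sum(group_sizes)   # total sample size

# SSA: squared distance of each group mean from the grand mean, weighted by group size
ssa = sum(n_j * (m - grand_mean) ** 2 for n_j, m in zip(group_sizes, group_means))
msa = ssa / (c - 1)
f_stat = msa / msw

print(f"SSA = {ssa:.1f}, MSA = {msa:.1f}, F = {f_stat:.2f}")
```
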
One-Way ANOVA Example Solution
- H0: μ1 = μ2 = μ3
- H1: μj not all equal
- α = 0.05, df1 = 2, df2 = 12
- Critical value: F_U = 3.89
- Test statistic: F = MSA / MSW = 2358.2 / 93.3 ≈ 25.28
- Decision: Reject H0 at α = 0.05, since F > F_U
- Conclusion: There is evidence that at least one μj differs from the rest

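The decision rule is a single comparison. In the sketch below the function name `f_test_decision` is hypothetical; F_U = 3.89 and MSW = 93.3 come from the example, while MSA = 2358.2 is recomputed from the group means given in the computations and should be read as an assumption.

```python
def f_test_decision(f_stat, f_crit):
    """Apply the rule: reject H0 when the test F exceeds the critical F."""
    return "Reject H0" if f_stat > f_crit else "Do not reject H0"

f_crit = 3.89            # F_U for alpha = 0.05, df1 = 2, df2 = 12 (from the example)
f_stat = 2358.2 / 93.3   # MSA / MSW; MSA is recomputed from the group means (assumption)

print(f"F = {f_stat:.2f}, F_U = {f_crit}: {f_test_decision(f_stat, f_crit)}")
```
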
Significant and Non-significant Differences
- Significant: Between > Within
- Non-significant: Within > Between

ANOVA (summary)
- The null hypothesis is that there is no difference between the means.
- The alternate hypothesis is that at least two means differ.
- Use the F statistic as your test statistic. It tests the between-sample variance (difference between the means) against the within-sample variance (variability within the samples). The larger this ratio, the stronger the evidence that the means are different.
- Degrees of freedom for the numerator is k - 1 (k is the number of treatments).
- Degrees of freedom for the denominator is n - k (n is the number of responses).
- If the test F is larger than the critical F, then reject the null.
- If the p-value is less than alpha, then reject the null.

ANOVA (summary): WHEN YOU REJECT THE NULL
- For a one-way ANOVA, after you have rejected the null, you may want to determine which treatment yielded the best results.
- You must do follow-on analysis to determine whether the difference between each pair of means is significant (post-ANOVA test).

One-way ANOVA (example)
The study described here measures cortisol levels in 3 groups of subjects:
- Healthy (n = 16)
- Depressed: non-melancholic depressed (n = 22)
- Depressed: melancholic depressed (n = 18)

Results
Results were obtained as follows:
[Output: one-way ANOVA table with columns Source, DF, SS, MS, F, P and rows Grp, Error, Total, followed by individual 95% CIs for each level mean (Level, N, Mean, StDev) based on the pooled StDev]

Multiple Comparison of the Means - 1
Several methods are available, depending on whether one wishes to compare means with a control mean (Dunnett) or to make overall pairwise comparisons (Tukey and Fisher).
- Dunnett's comparisons with a control
  - Critical value = 2.27
  - Control = level (1) of Grp
  - Intervals for treatment mean minus control mean
[Output: Level, Lower, Center, Upper for each treatment level, with interval plots]

Multiple Comparison of the Means - 2
- Tukey's pairwise comparisons
  - Intervals for (column level mean) − (row level mean)
- Fisher's pairwise comparisons
  - Intervals for (column level mean) − (row level mean)