STATISTICAL INFERENCE PART IX: HYPOTHESIS TESTING - APPLICATIONS - MORE THAN TWO POPULATIONS
INFERENCES ABOUT POPULATION MEANS
Example: H0: μ1 = μ2 = μ3,
where μ1 = population mean for group 1, μ2 = population mean for group 2, μ3 = population mean for group 3.
H1: Not all of the means are equal.
Assumptions
Each population is normally distributed (or the sample sizes are large enough to use the CLT), with equal variances.
The populations are independent.
Cases within each sample are independent.
INFERENCES ABOUT POPULATION MEANS - ANOVA
When the differences in means are small relative to the overall variability, F tends to be small; when the differences in means are large relative to the overall variability, F tends to be large. Larger F-values typically yield more significant results. How large is large enough? We compare the observed F with the tabulated value, as sketched below.
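To make this concrete, here is a minimal R sketch (hypothetical simulated data, not the Kenton example that follows) that computes the one-way ANOVA F statistic by hand and compares it with the tabulated value from qf():

set.seed(1)
y <- c(rnorm(10, mean = 5), rnorm(10, mean = 7), rnorm(10, mean = 6))   # hypothetical data, 3 groups
g <- factor(rep(1:3, each = 10))

grand <- mean(y)
ni    <- tapply(y, g, length)
gmean <- tapply(y, g, mean)

ss_between <- sum(ni * (gmean - grand)^2)   # variability between group means
ss_within  <- sum((y - ave(y, g))^2)        # variability within groups
df1 <- nlevels(g) - 1                       # numerator d.f. = k - 1
df2 <- length(y) - nlevels(g)               # denominator d.f. = n - k

F_stat <- (ss_between / df1) / (ss_within / df2)   # MS_between / MS_within
F_crit <- qf(0.95, df1, df2)                       # tabulated value at alpha = 0.05
c(F_stat = F_stat, F_crit = F_crit)                # reject H0 if F_stat > F_crit
anova(lm(y ~ g))                                   # R's own one-way ANOVA table gives the same F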
INFERENCES ABOUT POPULATION MEANS
If the F test shows that there are significant differences among the means, then apply pairwise t-tests to see which one(s) are different, and apply a multiple testing correction to control the Type I error rate; see the sketch after this slide.
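A minimal sketch of such pairwise comparisons with a correction, using pairwise.t.test() from the stats package (applied to the hypothetical y and g from the previous sketch):

# all pairwise t-tests with Bonferroni-adjusted p-values
pairwise.t.test(y, g, p.adjust.method = "bonferroni")
# compare with the unadjusted p-values
pairwise.t.test(y, g, p.adjust.method = "none")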
Example
The Kenton Food Company wants to test 4 different package designs for a new product. The designs are introduced in 20 randomly selected markets, which are similar to each other in terms of location and sales records. Due to a fire incident, one of these markets was removed from the study, leading to an unbalanced study design.
Example taken from: Neter, J., Kutner, M.H., Nachtsheim, C.J., & Wasserman, W. (1996). Applied Linear Statistical Models, 4th edition, Irwin.
Example
[Table of Sales by Design (i) and Market (j) not shown.]
Is there a difference among the designs in terms of their average sales?
Example
> va1 = read.table("VAT1.txt", header = T)
> head(va1, 3)
  Case Design Market Sales
  ...
> aov1 = aov(Sales ~ Design, data = va1)
> summary(aov1)
            Df Sum Sq Mean Sq F value  Pr(>F)
Design       1    ...     ...     ... ...e-05 ***
Residuals   17    ...     ...
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

The degrees of freedom are wrong! Since there are 4 different designs, the d.f. for Design should be 3, not 1: Design was read in as a number, so aov() treated it as a numeric covariate.
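A quick way to catch this kind of problem before fitting (a sketch, assuming the same va1 data frame) is to inspect the column types:

str(va1)            # Design shows up as int, so aov() treats it as numeric
sapply(va1, class)  # same information, one class per column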
Example
> class(va1[,2])
[1] "integer"
> va1[,2] = as.factor(va1[,2])
> aov1 = aov(Sales ~ Design, data = va1)
> summary(aov1)
            Df Sum Sq Mean Sq F value  Pr(>F)
Design       3    ...     ...     ... ...e-05 ***
Residuals   15    ...     ...
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

# or, alternatively:
> aov1 = aov(Sales ~ factor(Design), data = va1)

The F test indicates that the 4 designs do not all have the same mean sales. But which one(s) are different?
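The assumptions listed earlier (normality, equal variances) can also be checked on the corrected fit; a minimal sketch, assuming the aov1 object and va1 data frame from above:

shapiro.test(residuals(aov1))               # normality of residuals
bartlett.test(Sales ~ Design, data = va1)   # equal variances across designs
par(mfrow = c(1, 2))
plot(aov1, which = 1:2)                     # residuals vs fitted and normal QQ plot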
Example
> library(multcomp)
> c1 = glht(aov1, linfct = mcp(Design = "Tukey"))
> summary(c1)

  Simultaneous Tests for General Linear Hypotheses

Multiple Comparisons of Means: Tukey Contrasts

Fit: aov(formula = Sales ~ Design, data = va1)

Linear Hypotheses:
           Estimate Std. Error t value Pr(>|t|)
2 - 1 == 0      ...        ...     ...      ...
3 - 1 == 0      ...        ...     ...      ...
4 - 1 == 0      ...        ...     ...      ...
3 - 2 == 0      ...        ...     ...      ...
4 - 2 == 0      ...        ...     ...      ...
4 - 3 == 0      ...        ...     ...      ...
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Adjusted p values reported -- single-step method)

The 4th design has higher average sales than all the other designs. The 3rd design is marginally significantly better than the 2nd design.
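The same c1 object can also give simultaneous confidence intervals for the pairwise differences; a sketch, assuming the multcomp fit above:

ci <- confint(c1, level = 0.95)   # simultaneous confidence intervals for all Tukey contrasts
ci
plot(ci)                          # interval plot of the pairwise differences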
Example
# or, alternatively:
> TukeyHSD(aov1, "Design", conf.level = 0.9)

There are many functions in R for multiple testing correction. For instance, you can look into the p.adjust function in the stats package for other types of corrections (e.g. Bonferroni): supply the raw p-values to obtain adjusted p-values; a small sketch follows below.
For different ANOVA types (e.g. two-factor, repeated measures) in R, see: Ilk, O. (2011). R Yazilimina Giris, ODTU, Ch. 7.
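A minimal sketch of p.adjust, using made-up raw p-values purely for illustration:

praw <- c(0.001, 0.012, 0.030, 0.450)   # hypothetical raw p-values
p.adjust(praw, method = "bonferroni")   # Bonferroni correction
p.adjust(praw, method = "BH")           # Benjamini-Hochberg (FDR) correction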