Published by Rosaline Nicholson. Modified over 9 years ago.
1
STATISTICAL INFERENCE PART IX: HYPOTHESIS TESTING - APPLICATIONS - MORE THAN TWO POPULATIONS
2
INFERENCES ABOUT POPULATION MEANS
Example: $H_0: \mu_1 = \mu_2 = \mu_3$, where $\mu_1$ = population mean for group 1, $\mu_2$ = population mean for group 2, and $\mu_3$ = population mean for group 3.
$H_1$: Not all of the means are equal.
3
Assumptions
Each of the populations is normally distributed (or the sample sizes are large enough to use the CLT), and the populations have equal variances.
The populations are independent.
Cases within each sample are independent.
4
INFERENCES ABOUT POPULATION MEANS - ANOVA
Difference in means small relative to overall variability: F tends to be small.
Difference in means large relative to overall variability: F tends to be large.
Larger F-values give stronger evidence against $H_0$ (smaller p-values). How large is large enough? We compare the observed F with the tabulated critical value of the F distribution.
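The tabulated value can also be obtained in R instead of from a printed table. The following is a minimal sketch, assuming k = 4 groups, n = 19 observations, and the observed F of 18.591 from the package-design example later in these slides; the 5% level and the variable names are choices made here for illustration.

```r
# Assumed setting: 4 groups and 19 observations in total,
# as in the package-design example in these slides.
k <- 4
n <- 19
alpha <- 0.05                 # significance level (an assumption)

df_between <- k - 1           # numerator degrees of freedom
df_within  <- n - k           # denominator degrees of freedom

# Critical value: reject H0 when the observed F exceeds this
f_crit <- qf(1 - alpha, df_between, df_within)

# p-value for an observed F (value taken from the corrected ANOVA table)
f_obs   <- 18.591
p_value <- pf(f_obs, df_between, df_within, lower.tail = FALSE)

print(f_crit)
print(p_value)
```

Comparing `f_obs` with `f_crit` and comparing `p_value` with `alpha` always lead to the same decision.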
5
INFERENCES ABOUT POPULATION MEANS
If the F test shows that there are significant differences between the means, then apply pairwise t-tests to see which one(s) are different.
Apply a multiple testing correction to control the overall Type I error rate across these comparisons.
6
Example
Kenton Food Company wants to test 4 different package designs for a new product. The designs are introduced in 20 randomly selected markets. These markets are similar to each other in terms of location and sales records. Due to a fire, one of these markets was removed from the study, leading to an unbalanced study design.
Example is taken from: Neter, J., Kutner, M.H., Nachtsheim, C.J., & Wasserman, W. (1996) Applied Linear Statistical Models, 4th edition, Irwin.
7
Example
Sales by design (i) and market (j):

              Market (j)
Design (i)    1    2    3    4    5
    1        11   17   16   14   15
    2        12   10   15   19   11
    3        23   20   18   17
    4        27   33   22   26   28

Is there a difference among designs in terms of their average sales?
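Since the file VAT1.txt used on the next slides is not included here, the table can be typed in directly. This is a sketch only: the object name `kenton` and the column layout are assumptions made to mirror the Design and Sales columns shown later.

```r
# Sales figures entered row by row from the table above.
# Design 3 has only four markets because one market was dropped.
sales <- c(11, 17, 16, 14, 15,
           12, 10, 15, 19, 11,
           23, 20, 18, 17,
           27, 33, 22, 26, 28)

# Design is stored as a factor so that R treats it as a category
design <- factor(rep(1:4, times = c(5, 5, 4, 5)))
kenton <- data.frame(Design = design, Sales = sales)

# Number of markets and sample mean of sales for each design
group_n    <- tapply(kenton$Sales, kenton$Design, length)
group_mean <- tapply(kenton$Sales, kenton$Design, mean)

print(group_n)
print(group_mean)
```

The sample means already suggest that design 4 sells best; the ANOVA F test checks whether such differences are larger than chance variation would explain.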
8
Example
> va1=read.table("VAT1.txt",header=TRUE)
> head(va1,3)
  Case Design Market Sales
1    1      1      1    11
2    2      1      2    17
3    3      1      3    16
> aov1 = aov(Sales ~ Design,data=va1)
> summary(aov1)
            Df Sum Sq Mean Sq F value    Pr(>F)
Design       1 483.08  483.08  31.186 3.289e-05 ***
Residuals   17 263.34   15.49
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
The degrees of freedom are wrong! Since there are 4 different designs, the d.f. for Design should be 3. Here Design was read as a number, so R fitted a single slope (1 d.f.) instead of comparing 4 groups.
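A quick way to spot this problem is to check how the grouping column is stored before trusting the ANOVA table. The sketch below re-enters the sales figures from the table in these slides with Design as integer codes; the names `dat` and `fit_numeric` are hypothetical stand-ins for `va1` and `aov1`.

```r
# Sales figures from the package-design table, with Design kept as
# integer codes (this is how read.table stores a column of 1, 2, 3, 4).
sales  <- c(11, 17, 16, 14, 15,
            12, 10, 15, 19, 11,
            23, 20, 18, 17,
            27, 33, 22, 26, 28)
design <- rep(1:4, times = c(5, 5, 4, 5))
dat    <- data.frame(Design = design, Sales = sales)

# With a numeric Design, aov() fits a straight line in Design
fit_numeric <- aov(Sales ~ Design, data = dat)

n_coef      <- length(coef(fit_numeric))           # intercept + one slope
df_numeric  <- summary(fit_numeric)[[1]][["Df"]]   # d.f. reported by R
df_expected <- length(unique(dat$Design)) - 1      # groups minus one

print(is.numeric(dat$Design))
print(n_coef)
print(df_numeric)
print(df_expected)
```

If the reported d.f. for the grouping variable does not equal the number of groups minus one, the variable is most likely being treated as numeric.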
9
Example
> class(va1[,2])
[1] "integer"
> va1[,2]=as.factor(va1[,2])
> aov1 = aov(Sales ~ Design,data=va1)
> summary(aov1)
            Df Sum Sq Mean Sq F value    Pr(>F)
Design       3 588.22 196.074  18.591 2.585e-05 ***
Residuals   15 158.20  10.547
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
# or, alternatively:
> aov1 = aov(Sales ~factor(Design),data=va1)
The F test is significant, so the 4 designs do not all have the same mean sales. But which one(s) are different?
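As a check on the corrected table, the F value can be recovered by hand from the reported sums of squares and degrees of freedom, with k = 4 designs and n = 19 markets. The symbols SSTR, SSE, MSTR and MSE (treatment and error sums of squares and mean squares) are notation assumed here; the slides themselves only show the R output.

```latex
\begin{align*}
MSTR &= \frac{SSTR}{k-1} = \frac{588.22}{3} \approx 196.07, \\
MSE  &= \frac{SSE}{n-k} = \frac{158.20}{15} \approx 10.547, \\
F    &= \frac{MSTR}{MSE} \approx 18.59 .
\end{align*}
```

This agrees with the F value of 18.591 in the R output, up to rounding of the printed sums of squares.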
10
Example
> library(multcomp)
> c1=glht(aov1, linfct = mcp(Design = "Tukey"))
> summary(c1)
Simultaneous Tests for General Linear Hypotheses
Multiple Comparisons of Means: Tukey Contrasts
Fit: aov(formula = Sales ~ Design, data = va1)
Linear Hypotheses:
           Estimate Std. Error t value Pr(>|t|)
2 - 1 == 0   -1.200      2.054  -0.584   0.9352
3 - 1 == 0    4.900      2.179   2.249   0.1545
4 - 1 == 0   12.600      2.054   6.135   <0.001 ***
3 - 2 == 0    6.100      2.179   2.800   0.0584 .
4 - 2 == 0   13.800      2.054   6.719   <0.001 ***
4 - 3 == 0    7.700      2.179   3.534   0.0141 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Adjusted p values reported -- single-step method)
The 4th design has significantly higher average sales than each of the other designs. The 3rd design is only marginally better than the 2nd design (adjusted p = 0.0584, significant at the 10% level but not at the 5% level).
11
Example
# or, alternatively
> TukeyHSD(aov1, "Design", conf.level=0.9)
There are many functions in R available for multiple testing correction. For instance, you can look into the “p.adjust” function in the stats package for other types of corrections (e.g. Bonferroni). Supply raw p-values to obtain adjusted p-values.
For different ANOVA types (e.g. 2-factor, repeated,…) in R, see: Ilk, O. (2011) R Yazilimina Giris, ODTU, Chp. 7
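The raw-to-adjusted workflow with p.adjust might look like the sketch below. It assumes the sales figures from the package-design table in these slides, and it uses pairwise.t.test with a pooled standard deviation to produce the raw p-values; that choice and the variable names are illustrative, not part of the original slides.

```r
# Sales figures and design labels from the package-design table
sales  <- c(11, 17, 16, 14, 15,
            12, 10, 15, 19, 11,
            23, 20, 18, 17,
            27, 33, 22, 26, 28)
design <- factor(rep(1:4, times = c(5, 5, 4, 5)))

# Raw (unadjusted) p-values for all 6 pairwise comparisons
raw_mat <- pairwise.t.test(sales, design, p.adjust.method = "none")$p.value
raw_p   <- raw_mat[lower.tri(raw_mat, diag = TRUE)]   # drop the NA cells

# Supply raw p-values, obtain adjusted p-values
p_bonf <- p.adjust(raw_p, method = "bonferroni")
p_holm <- p.adjust(raw_p, method = "holm")

print(round(cbind(raw = raw_p, bonferroni = p_bonf, holm = p_holm), 4))
```

Bonferroni and Holm are different corrections from the Tukey adjustment, so the adjusted p-values will not match the Tukey output exactly; Holm is never more conservative than Bonferroni.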