STAT 312 - Single-Factor ANOVA

STAT 312, 10.1 - Single-Factor ANOVA

Chapter 10 - Analysis of Variance (ANOVA)
Introduction
10.1 - Single-Factor ANOVA
10.2 - Multiple Comparisons in ANOVA
10.3 - More on Single-Factor ANOVA

H0: μ1 = μ2 = … = μk. Idea: Test all possible pairwise comparisons, each via a two-sample t-test. Example: Suppose there are k = 5 treatment groups. There are C(5, 2) = 10 such comparisons. PROBLEM???

SPURIOUS SIGNIFICANCE!!! Suppose we wish to make n independent comparisons, each at significance level α. As the number of comparisons increases, so does the probability of at least one rejection just by chance: P(at least one rejection) = 1 − (1 − α)^n. With α = .05, this probability exceeds 50% when n = 14. One remedy: Make each t-test comparison more conservative, i.e., harder to reject, by comparing each p-value against α* = α/n.
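
A minimal R sketch (not from the original slides) of how fast this family-wise error probability grows, assuming n independent comparisons each tested at α = .05:

> alpha = 0.05
> n = 1:20                              # number of independent comparisons
> p.any = 1 - (1 - alpha)^n             # P(at least one rejection purely by chance)
> round(p.any[c(1, 5, 10, 14)], 3)      # crosses 0.5 at n = 14, since 1 - 0.95^14 is about 0.512
> choose(5, 2)                          # number of pairwise comparisons among k = 5 groups: 10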

BONFERRONI CORRECTION: Test all possible pairwise comparisons H0: μi = μj, each via a two-sample t-test, but reject only if that test's p-value ≤ α* = α/n. Example: With k = 5 treatment groups there are n = 10 such comparisons, so each is tested at α* = .05/10 = .005. Related procedures: Holm-Bonferroni, Tukey's Honest Significant Difference, …
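
A hedged sketch of applying these corrections with base R's p.adjust(); the raw p-values below are hypothetical, used only to show the mechanics:

> p.raw = c(0.001, 0.004, 0.012, 0.030, 0.200)   # hypothetical raw p-values from pairwise t-tests
> p.adjust(p.raw, method = "bonferroni")         # each p-value multiplied by n (capped at 1)
> p.adjust(p.raw, method = "holm")               # Holm-Bonferroni step-down adjustment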

Alternate method ~ Analysis of Variance (ANOVA). Main Idea: Among several (k ≥ 2) independent, equivariant, normally-distributed "treatment groups," test H0: μ1 = μ2 = … = μk. MODEL ASSUMPTIONS?

Equivariance can be tested via the very similar "two variances" F-test of 6.2.2 (but this is very sensitive to the normality assumption), or other methods. If equivariance is violated, the Welch Test for two means can be extended to k groups.
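
One possible way to carry out these checks in base R, shown as a sketch using the concrete-curing data (days, cure, concrete) defined in the example later in this lecture; bartlett.test and oneway.test are standard base-R functions, not part of the original slides:

> bartlett.test(days ~ cure, data = concrete)                    # k-sample test of equal variances (sensitive to non-normality)
> oneway.test(days ~ cure, data = concrete, var.equal = FALSE)   # Welch-type one-way ANOVA, no equal-variance assumption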

Normality can be tested via the usual methods. If normality is violated, use the nonparametric Kruskal-Wallis Test.
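
A sketch of the usual normality checks and the nonparametric fallback in base R, again using the concrete-curing data from the example below:

> fit = aov(days ~ cure, data = concrete)
> shapiro.test(residuals(fit))                      # Shapiro-Wilk test of normality on the residuals
> qqnorm(residuals(fit)); qqline(residuals(fit))    # normal Q-Q plot of the residuals
> kruskal.test(days ~ cure, data = concrete)        # Kruskal-Wallis rank-sum test (no normality assumption)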

Extensions of ANOVA exist for data in matched "blocks" designs, repeated measures, multiple factor levels within groups, etc.
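
A minimal sketch of how such extensions are typically written as R model formulas; y, treatment, block, subject, and mydata are hypothetical names used only for illustration:

> aov(y ~ treatment + block, data = mydata)            # randomized complete block design
> aov(y ~ treatment + Error(subject), data = mydata)   # simple repeated-measures layout (within-subject error stratum)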

How are the significant group(s) identified once H0: μ1 = μ2 = … = μk is rejected? Pairwise testing, with a correction (e.g., Bonferroni) for spurious significance. Example: k = 5 groups result in 10 such tests, so let each α* = α/10.

“spurious significance”

Example: Different methods of curing concrete are being compared. A total of 27 slabs are selected; 9 are randomly assigned to each of three curing methods. The number of days each slab takes to be fully cured is recorded:

Cure A: 4 5 4 3 2 4 3 4 4
Cure B: 6 8 4 5 4 6 5 8 6
Cure C: 6 7 6 6 7 5 6 5 5

> days = c(4,5,4,3,2,4,3,4,4,6,8,4,5,4,6,5,8,6,6,7,6,6,7,5,6,5,5)
> cure = c(rep("A",9), rep("B",9), rep("C",9))
> concrete = data.frame(days, cure)
> concrete
   days cure
1     4    A
2     5    A
3     4    A
4     3    A
...
26    5    C
27    5    C

Side-by-side boxplots of the three groups:

> boxplot(days ~ cure, data = concrete)

Method A seems the fastest; B and C seem comparable, both taking longer.
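
A brief follow-up, not on the original slide, to back the visual impression with numerical summaries via base R's tapply():

> tapply(days, cure, mean)   # group means: A is about 3.67, B about 5.78, C about 5.89
> tapply(days, cure, sd)     # group standard deviations, to eyeball the equal-variance assumption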

Fit the one-way ANOVA model and test H0: μA = μB = μC:

> results = aov(days ~ cure, data = concrete)
> summary(results)
            Df Sum Sq Mean Sq F value    Pr(>F)    
cure         2 28.222 14.1111  11.906 0.0002559 ***
Residuals   24 28.444  1.1852                      
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

H0 is STRONGLY REJECTED (F = 11.906 on 2 and 24 df, p ≈ .00026): the mean curing times are not all equal.

Multiple Comparisons? Pairwise t-tests with a Bonferroni correction:

> pairwise.t.test(days, cure, p.adjust="bonferroni")

        Pairwise comparisons using t tests with pooled SD

data:  days and cure

  A       B      
B 0.00119 -      
C 0.00068 1.00000

P value adjustment method: bonferroni

B and C each differ significantly from A, but not from each other.
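
The Holm-Bonferroni procedure mentioned earlier is available through the same function; a sketch, not on the original slide:

> pairwise.t.test(days, cure, p.adjust.method = "holm")   # step-down adjustment, never more conservative than plain Bonferroni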

Alternate Method: Tukey's Honest Significant Difference (HSD).

> TukeyHSD(results, conf.level = 0.95)
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = days ~ cure, data = concrete)

$cure
         diff        lwr      upr     p adj
B-A 2.1111111  0.8295028 3.392719 0.0011107
C-A 2.2222222  0.9406139 3.503831 0.0006453
C-B 0.1111111 -1.1704972 1.392719 0.9745173

Same conclusion: B and C each differ significantly from A, but not from each other.
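
A one-line follow-up, not on the original slide: base R can plot the Tukey family-wise confidence intervals directly.

> plot(TukeyHSD(results, conf.level = 0.95))   # B-A and C-A intervals exclude 0; the C-B interval contains 0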