Lecture 12 One-way Analysis of Variance (Chapter 15.2)

Presentation transcript:

Lecture 12 One-way Analysis of Variance (Chapter 15.2) Multiple comparisons for one-way ANOVA (Chapter 15.7)

Review of one-way ANOVA Objective: Compare the means of K populations of interval data based on independent random samples from each. H0: μ1 = μ2 = … = μK; H1: At least two means differ. Notation: xij – ith observation of the jth sample; x̄j – mean of the jth sample; nj – number of observations in the jth sample; x̄ – grand mean of all observations.

Example 15.1 The marketing manager for an apple juice manufacturer needs to decide how to market a new product. Three strategies are considered, which emphasize the convenience, quality, and low price of the product, respectively. An experiment was conducted as follows: In three cities an advertisement campaign was launched. In each city only one of the three characteristics (convenience, quality, and price) was emphasized. The weekly sales were recorded for twenty weeks following the beginning of the campaigns.
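
A minimal sketch of this test in Python: the slide does not reproduce the raw weekly sales, so normally distributed stand-in values matched to the sample means and standard deviations quoted later substitute for them, and scipy.stats.f_oneway carries out the one-way ANOVA.

```python
# Hypothetical stand-in data: the slide does not reproduce the raw weekly
# sales, so normal samples matched to the means/SDs quoted later are used.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
convenience = rng.normal(577.55, 103.8, 20)  # city 1 (SD = sqrt(10774.44))
quality = rng.normal(653.00, 85.1, 20)       # city 2 (SD = sqrt(7238.61))
price = rng.normal(608.65, 93.1, 20)         # city 3 (SD = sqrt(8670.24))

f_stat, p_value = stats.f_oneway(convenience, quality, price)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
```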

Rationale Behind Test Statistic Two types of variability are employed when testing for the equality of population means: the variability of the sample means, and the variability within samples. The test statistic is essentially (variability of the sample means)/(variability within samples).

The rationale behind the test statistic – I If the null hypothesis is true, we would expect all the sample means to be close to one another (and as a result, close to the grand mean). If the alternative hypothesis is true, at least some of the sample means would differ. Thus, we measure variability between sample means.

Variability between sample means The variability between the sample means is measured as the sum of squared distances between each group mean and the grand mean, each times the sample size of the group. This sum is called the Sum of Squares for Treatments (SST). In our example treatments are represented by the different advertising strategies.

Sum of squares for treatments (SST) With k treatments, SST = Σj=1..k nj(x̄j − x̄)², where x̄j is the mean of sample j, nj is the size of sample j, and x̄ is the grand mean. Note: When the sample means are close to one another, their distances from the grand mean are small, leading to a small SST. Thus, a large SST indicates large variation between sample means, which supports H1.

Sum of squares for treatments (SST) Solution – continued Calculate SST. The grand mean is calculated by x̄ = (n1x̄1 + n2x̄2 + n3x̄3)/n = (577.55 + 653.00 + 608.65)/3 = 613.07 (equal sample sizes of 20). SST = 20(577.55 − 613.07)² + 20(653.00 − 613.07)² + 20(608.65 − 613.07)² = 57,512.23
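
A quick check of this arithmetic (a sketch, using only the three sample means and the common sample size of 20):

```python
# Check of the SST arithmetic using only the three sample means
# and the common sample size of 20.
means = [577.55, 653.00, 608.65]
grand_mean = sum(means) / len(means)                  # 613.07
sst = sum(20 * (m - grand_mean) ** 2 for m in means)
print(round(sst, 2))                                  # 57512.23
```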

Sum of squares for treatments (SST) Is SST = 57,512.23 large enough to reject H0 in favor of H1? Large compared to what?

A small variability within the samples makes it easier 20 25 30 1 7 Treatment 1 Treatment 2 Treatment 3 10 12 19 9 20 16 15 14 11 10 9 A small variability within the samples makes it easier to draw a conclusion about the population means. The sample means are the same as before, but the larger within-sample variability makes it harder to draw a conclusion about the population means. Treatment 1 Treatment 2 Treatment 3

The rationale behind the test statistic – II Large variability within the samples weakens the “ability” of the sample means to represent their corresponding population means. Therefore, even though sample means may markedly differ from one another, SST must be judged relative to the “within samples variability”.

Within samples variability The variability within samples is measured by adding up all the squared distances between observations and their sample means. This sum is called the Sum of Squares for Error (SSE). In our example this is the sum of all squared differences between sales in city j and the sample mean of city j (over all three cities).

Sum of squares for errors (SSE) Solution – continued Calculate SSE: SSE = (n1 − 1)s1² + (n2 − 1)s2² + (n3 − 1)s3² = (20 − 1)(10,774.44) + (20 − 1)(7,238.61) + (20 − 1)(8,670.24) = 506,983.50
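
The same check for SSE from the quoted sample variances (the last digit differs slightly because the variances shown are themselves rounded):

```python
# Check of the SSE arithmetic from the three sample variances.
variances = [10774.44, 7238.61, 8670.24]
sse = sum((20 - 1) * s2 for s2 in variances)
print(round(sse, 2))  # 506982.51 -- the slide's 506,983.50 differs only
                      # because the variances shown are themselves rounded
```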

Sum of squares for errors (SSE) Is SST = 57,512.23 large enough relative to SSE = 506,983.50 to reject the null hypothesis that specifies that all the means are equal?

The mean sum of squares To perform the test we need to calculate the mean squares as follows: MST (Mean Square for Treatments) = SST/(k − 1) = 57,512.23/2 = 28,756.12; MSE (Mean Square for Error) = SSE/(n − k) = 506,983.50/57 = 8,894.45.

The F test rejection region And finally the hypothesis test: H0: μ1 = μ2 = … = μk; H1: At least two means differ. Test statistic: F = MST/MSE. Rejection region: F > Fα,k−1,n−k

The F test H0: μ1 = μ2 = μ3; H1: At least two means differ. Test statistic: F = MST/MSE = 28,756.12/8,894.45 = 3.23. Since 3.23 > F.05,2,57 ≈ 3.15, there is sufficient evidence to reject H0 in favor of H1 and argue that at least one of the mean sales is different from the others.
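
For comparison with the table lookup, the critical value and p-value can be computed directly from the F distribution; this sketch uses scipy.stats.f with the SST and SSE values above:

```python
# Critical value and p-value for F = MST/MSE with (2, 57) degrees of freedom.
from scipy import stats

mst, mse = 57512.23 / 2, 506983.50 / 57
f = mst / mse                                  # 3.23
crit = stats.f.ppf(0.95, dfn=2, dfd=57)        # about 3.16 (table value: 3.15)
p = stats.f.sf(f, dfn=2, dfd=57)               # about 0.047
print(f"F = {f:.2f}, critical value = {crit:.2f}, p = {p:.4f}")
```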

Required Conditions for Test Independent simple random samples from each population. The populations are normally distributed (look for extreme skewness and outliers; probably okay regardless if each sample is large, e.g., each nj ≥ 30). The variances of all the populations are equal (rule of thumb: check that the largest sample standard deviation is less than twice the smallest standard deviation).

ANOVA Table – Example 15.1 Analysis of Variance
Source     DF   Sum of Squares   Mean Square   F Ratio   Prob > F
City        2         57512.23       28756.1    3.2330     0.0468
Error      57        506983.50        8894.4
C. Total   59        564495.73

Model for ANOVA xij = μ + τj + εij, where xij is the ith observation of the jth sample, μ is the overall mean level, τj is the differential effect of the jth treatment, and εij is the random error in the ith observation under the jth treatment. The errors are assumed to be independent, normally distributed with mean zero and variance σ². The τj are normalized: Στj = 0.

Model for ANOVA Cont. The expected response to the jth treatment is μ + τj. Thus, if all treatments have the same expected response (i.e., H0: all populations have the same mean), τj = 0 for all j. In general, τj − τj′ is the difference between the means of populations j and j′. MSE is an estimate of σ². Sums of squares decomposition: SS(Total) = SST + SSE
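
A small simulation can make the model and the decomposition concrete. This sketch uses illustrative values of μ, τj, and σ (not the Example 15.1 parameters) and checks SS(Total) = SST + SSE numerically:

```python
# Simulation of x_ij = mu + tau_j + eps_ij with illustrative parameter
# values (not those of Example 15.1), verifying SS(Total) = SST + SSE.
import numpy as np

rng = np.random.default_rng(1)
mu, sigma = 600.0, 90.0
tau = np.array([-20.0, 35.0, -15.0])  # normalized: the tau_j sum to zero
x = mu + tau[:, None] + rng.normal(0.0, sigma, size=(3, 20))  # 3 samples of 20

grand = x.mean()
group = x.mean(axis=1, keepdims=True)
sst = (20 * (group - grand) ** 2).sum()
sse = ((x - group) ** 2).sum()
ss_total = ((x - grand) ** 2).sum()
print(np.isclose(ss_total, sst + sse))  # True
```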

Review Question: Intro to ANOVA Which case does ANOVA generalize? (a) 2-sample mean comparison with the equal-variance assumption; (b) 2-sample mean comparison with unequal variances permitted; (c) 2-sample variance comparison.

Relationship between F-test and t-test for two samples For comparing two samples, the F-statistic equals the square of the t-statistic from the equal-variance t-test. For two samples, the ANOVA F-test is equivalent to testing H0: μ1 = μ2 versus H1: μ1 ≠ μ2.
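
This equivalence is easy to verify numerically; the sketch below uses arbitrary simulated samples:

```python
# For k = 2 groups, f_oneway's F equals the square of the equal-variance
# t-statistic, and the two p-values coincide.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
a, b = rng.normal(0.0, 1.0, 15), rng.normal(0.5, 1.0, 15)
t, p_t = stats.ttest_ind(a, b, equal_var=True)
f, p_f = stats.f_oneway(a, b)
print(np.isclose(t ** 2, f), np.isclose(p_t, p_f))  # True True
```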

Comparing Pairs What crucial information does F not provide? If F is statistically significant, there is evidence that not all group means are equal, but we don’t know where the differences between group means are. Example of differences that make F statistically significant: assume 4 groups whose true means are not all equal, e.g., mean sales at 4 locations of a store.

15.7 Multiple Comparisons When the null hypothesis is rejected, it may be desirable to find which mean(s) is (are) different, and how they rank. Three statistical inference procedures geared toward doing this are presented: Fisher’s least significant difference (LSD) method; the Bonferroni adjustment to Fisher’s LSD; Tukey’s multiple comparison method.

15.7 Multiple Comparisons Two means are considered different if the difference between the corresponding sample means is larger than a critical number. Then, the larger sample mean is believed to be associated with a larger population mean. Conditions common to all the methods here: The model is one-way analysis of variance, and the conditions required to perform the ANOVA are satisfied.

Fisher Least Significant Difference (LSD) Method This method builds on the equal-variances t-test of the difference between two means. The test statistic is improved by using MSE rather than sp². We conclude that μi and μj differ (at the α significance level) if |x̄i − x̄j| > LSD, where LSD = tα/2,n−k √(MSE(1/ni + 1/nj))

Experimentwise Type I error rate (αE) (the effective Type I error) Using Fisher’s method may result in an increased probability of committing a Type I error. The experimentwise Type I error rate is the probability of committing at least one Type I error when each test is carried out at significance level α. If C independent tests are done, αE = 1 − (1 − α)^C. The Bonferroni adjustment determines the required Type I error probability per test (α) to secure a pre-determined overall αE.
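
A worked illustration of this formula for α = .05 and a few values of C:

```python
# Experimentwise Type I error rate for C independent tests at alpha = .05.
alpha = 0.05
for c in (1, 3, 10):
    print(c, round(1 - (1 - alpha) ** c, 4))
# prints: 1 0.05 / 3 0.1426 / 10 0.4013
```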

Multiple Comparisons Problem A hypothetical study of the effect of birth control pills is done. Two groups of women (one taking birth control pills, the other not) are followed and 100 variables are recorded for each subject, such as blood pressure and psychological and medical problems. After the study, two-sample t-tests are performed for each variable, and it is found that women taking birth control pills have higher incidences of depression at the 5% significance level (the p-value equals .02). Does this provide strong evidence that women taking birth control pills are more likely to be depressed?

Bonferroni Adjustment Suppose we carry out C tests at significance level α. If the null hypothesis for each test is true, the probability that we will falsely reject at least one hypothesis is at most Cα. Thus, if we carry out C tests at significance level α/C, the experimentwise Type I error rate is at most α.

Bonferroni Adjustment for ANOVA The procedure: Compute the number of pairwise comparisons (C) [all: C = k(k − 1)/2], where k is the number of populations. Set α = αE/C, where αE is the desired bound on the probability of making at least one Type I error (the experimentwise Type I error rate). We conclude that μi and μj differ at the α = αE/C significance level (experimentwise error rate at most αE).

Fisher and Bonferroni Methods Example 15.1 – continued Rank the effectiveness of the marketing strategies (based on mean weekly sales) using Fisher’s method and the Bonferroni adjustment method. Solution (Fisher’s method) The sample mean sales were 577.55, 653.00, and 608.65. Then LSD = t.025,57 √(MSE(1/20 + 1/20)) = 2.00 × 29.82 ≈ 59.7, and the pairwise differences are 75.45, 31.10, and 44.35, so only μ1 and μ2 differ significantly.
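
The LSD computation can be reproduced as follows (a sketch; the MSE and pairwise differences are the values quoted in this example):

```python
# Fisher's LSD for Example 15.1 (equal sample sizes, MSE from the ANOVA table).
import math
from scipy import stats

mse, df, n = 8894.45, 57, 20
t_crit = stats.t.ppf(1 - 0.05 / 2, df)            # t_{.025,57}, about 2.00
lsd = t_crit * math.sqrt(mse * (1 / n + 1 / n))   # about 59.7
for pair, diff in [("1 vs 2", 75.45), ("1 vs 3", 31.10), ("2 vs 3", 44.35)]:
    print(pair, diff > lsd)   # only 75.45 (cities 1 vs 2) exceeds the LSD
```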

Fisher and Bonferroni Methods Solution (the Bonferroni adjustment) We calculate C = k(k − 1)/2 = 3(2)/2 = 3. We set α = .05/3 = .0167, thus t.0167/2,60−3 = 2.467 (Excel). Again, the significant difference is between μ1 and μ2.
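
The adjusted critical value is easy to reproduce (a sketch using scipy):

```python
# The Bonferroni-adjusted critical value used above.
from scipy import stats

t_crit = stats.t.ppf(1 - (0.05 / 3) / 2, 57)
print(round(t_crit, 3))  # about 2.467; adjusted LSD = 2.467 * 29.82 = 73.6,
                         # and 75.45 still exceeds it
```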

Tukey Multiple Comparisons The test procedure assumes an equal number of observations per population. Find a critical number ω as follows: ω = qα(k, ν) √(MSE/ng), where k = the number of populations, ν = degrees of freedom = n − k, ng = number of observations per population, α = significance level, and qα(k, ν) = a critical value obtained from the studentized range table (Appendix B17/B18).

Tukey Multiple Comparisons Select a pair of means and calculate the difference between the larger and the smaller mean. If that difference exceeds ω, there is sufficient evidence to conclude that μmax > μmin. Repeat this procedure for each pair of samples and rank the means if possible. If the sample sizes are not extremely different, we can use the above procedure with ng calculated as the harmonic mean of the sample sizes.

Tukey Multiple Comparisons Example 15.1 – continued We had three populations (three marketing strategies): k = 3 with equal sample sizes n1 = n2 = n3 = 20, ν = n − k = 60 − 3 = 57, MSE = 8,894. Take q.05(3,60) from the table: 3.40. Then ω = 3.40 √(8,894/20) ≈ 71.7. Mean sales: City 1: 577.55; City 2: 653.00; City 3: 608.65. City 1 vs. City 2: 653.00 − 577.55 = 75.45; City 1 vs. City 3: 608.65 − 577.55 = 31.10; City 2 vs. City 3: 653.00 − 608.65 = 44.35. Only the City 1 vs. City 2 difference exceeds ω.
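
The same ω can be computed without the printed table; this sketch assumes scipy >= 1.7, which provides the studentized range distribution:

```python
# Tukey's omega from the studentized range distribution (scipy >= 1.7).
import math
from scipy.stats import studentized_range

k, df, n_g, mse = 3, 57, 20, 8894.0
q = studentized_range.ppf(0.95, k, df)  # about 3.40, matching the tabled value
omega = q * math.sqrt(mse / n_g)        # about 71.7
for pair, diff in [("1 vs 2", 75.45), ("1 vs 3", 31.10), ("2 vs 3", 44.35)]:
    print(pair, diff > omega)           # only cities 1 vs 2 exceed omega
```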

Practice Problems 15.16, 15.22, 15.26, 15.66