Analysis of Variance or ANOVA

In ANOVA, we are interested in comparing the means of different populations (usually more than 2 populations). Since this technique has frequently been applied to medical research, samples from the different populations are referred to as different treatment groups.

Main Idea behind Analysis of Variance

Measure the variability within the treatment groups and also the variability between the treatment groups. The “between” variability is due to both differences in treatment and the random error component. The “within” variability is due only to the random error component. So, if there are significant treatment differences, the “between” variability should be substantially larger than the “within” variability.
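The contrast between the two kinds of variability can be sketched in a few lines of Python. This is only an illustration: the two small data sets are made up (they are not from these slides), and the helper names `ss_within` and `ss_between` are assumptions of this sketch. Both data sets have the same spread inside each group, but only the second has different group means.

```python
def mean(values):
    return sum(values) / len(values)

def ss_within(groups):
    """Squared deviations of each observation from its own group mean, summed."""
    return sum(sum((y - mean(g)) ** 2 for y in g) for g in groups)

def ss_between(groups):
    """Group size times squared deviation of the group mean from the grand mean, summed."""
    grand = mean([y for g in groups for y in g])
    return sum(len(g) * (mean(g) - grand) ** 2 for g in groups)

# Hypothetical data (not from the slides): identical spread inside every group.
similar_groups = [[9, 10, 11], [9, 10, 11], [9, 10, 11]]
shifted_groups = [[9, 10, 11], [14, 15, 16], [19, 20, 21]]

for name, groups in [("similar", similar_groups), ("shifted", shifted_groups)]:
    print(name, "within =", ss_within(groups), "between =", ss_between(groups))
```

In the first data set the between-group variability is zero; in the second it is much larger than the within-group variability, which is the pattern that signals a treatment difference.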

In ANOVA, the null hypothesis $H_0$ is always essentially “there are no differences in treatment,” and the alternative hypothesis $H_1$ is always essentially “there is a difference in treatment.” The wording of the problem does not change this. Whether the problem says “test whether there are no treatment effects” or “test whether there are treatment effects,” $H_0$ is “there are NO differences in treatment.”

We will begin with one-factor ANOVA. Suppose we’re looking at the effects of different types of heart medication on pulse rate. An individual’s pulse rate is the sum of different effects:

pulse rate = overall mean + treatment effect + random error

If there is no difference in the effects of the various heart medications or treatments, then the middle effect above (the treatment effect) is zero. That is what we would like to test.
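One common way to write this sum, assumed here for illustration, is observation = grand mean + group effect + residual, with the group effect as the middle term. The sketch below splits each observation that way and prints the pieces. The pulse rates are hypothetical, and the function name `decompose` is an assumption of this sketch.

```python
def decompose(groups):
    """Split every observation into grand mean + group effect + residual."""
    all_values = [y for g in groups for y in g]
    grand_mean = sum(all_values) / len(all_values)
    rows = []
    for g in groups:
        group_mean = sum(g) / len(g)
        effect = group_mean - grand_mean   # the "middle" term: treatment effect
        for y in g:
            residual = y - group_mean      # what is left over: random error
            rows.append((y, grand_mean, effect, residual))
    return rows

# Hypothetical pulse rates for three medications (not from the slides).
pulse = [[70, 72, 74], [68, 70, 72], [75, 77, 79]]
for y, grand_mean, effect, residual in decompose(pulse):
    print(y, "=", grand_mean, "+", effect, "+", residual)
```

If the medications did not differ, every estimated group effect would be close to zero and the observations would vary only through the residual term.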

Example 1: A manager is comparing the output of 3 machines. He has recorded 5 randomly selected minutes of operation for machine 1, 10 minutes for machine 2, and 6 minutes for machine 3. The output is as shown below.

[Table: output per minute, with columns Machine 1 ($Y_1$), Machine 2 ($Y_2$), and Machine 3 ($Y_3$)]

In this example, we have n = # of observations = 21 and c = # of groups (or # of machines here) = 3.

We compute the mean output per minute for each machine. The grand mean for all the machines combined is the sum of all 21 observations divided by 21.
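A short Python sketch of these two computations follows. Only the group sizes (5, 10, and 6) follow Example 1; the output values themselves are hypothetical, as are the function names. It also shows that the grand mean equals the average of the machine means weighted by group size, which matters when the groups are unequal.

```python
def group_means(groups):
    return [sum(g) / len(g) for g in groups]

def grand_mean(groups):
    """Sum of every observation divided by the total number of observations."""
    total = sum(sum(g) for g in groups)
    n = sum(len(g) for g in groups)
    return total / n

# Hypothetical outputs per minute; only the group sizes match Example 1.
machines = [
    [10, 12, 11, 13, 14],
    [15, 17, 16, 18, 14, 15, 16, 17, 15, 17],
    [12, 13, 14, 12, 13, 14],
]

means = group_means(machines)
sizes = [len(g) for g in machines]

# Weighting each machine mean by its group size reproduces the grand mean.
weighted = sum(n_j * m for n_j, m in zip(sizes, means)) / sum(sizes)

print("machine means:", means)
print("grand mean:", grand_mean(machines))
print("size-weighted average of machine means:", weighted)
```

A plain, unweighted average of the three machine means would give a different number here, because machine 2 contributes twice as many observations as machine 1.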

We next look at the variation in output by computing, for each machine, the sum of squared deviations of its observations from that machine’s mean.

The sum of squared variation within groups, or SS Within, is the sum of the sums of squared deviations for the 3 groups: SS Within = 72.63.

The mean squared variation within groups is MS Within = (SS Within)/(n – c) = 72.63/(21 – 3) = 72.63/18 = 4.035.

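The sum of squared variation between groups, or SS Between, is

$\text{SS Between} = \sum_{j=1}^{c} n_j \left(\bar{Y}_j - \bar{\bar{Y}}\right)^2$

where $n_j$ is the number of observations in group $j$, $\bar{Y}_j$ is the mean of group $j$, and $\bar{\bar{Y}}$ is the grand mean. In our example, $n_1 = 5$, $n_2 = 10$, $n_3 = 6$, and $c = 3$.

The mean square variation between groups, or MS Between, is

$\text{MS Between} = \dfrac{\text{SS Between}}{c - 1}$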

The sum of squares total is SS Total = SS Between + SS Within.

The mean squared total is MS Total = (SS Total)/(n – 1).

We compile our information in a table called an ANOVA table.

Source of variation   Degrees of freedom     Sum of squares   Mean square
Between Groups        c – 1 = 3 – 1 = 2
Within Groups         n – c = 21 – 3 = 18    72.63            4.035
Total                 n – 1 = 21 – 1 = 20

Notice that the degrees of freedom column adds up to the total at the bottom, and the sum of squares column adds up to the total at the bottom, but the mean square column does NOT add up to the total at the bottom.
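The remark about which columns add up can be checked with a small Python sketch that builds a one-factor ANOVA table from raw data. The data are hypothetical and chosen only to keep the arithmetic simple, and the function name `one_way_anova_table` is an assumption of this sketch. SS Total is computed directly from the grand mean, not by adding the other two rows, so the additivity is a genuine check.

```python
def one_way_anova_table(groups):
    """Return {source: (dof, sum_of_squares, mean_square)} for one-factor ANOVA."""
    values = [y for g in groups for y in g]
    n, c = len(values), len(groups)
    grand = sum(values) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    ss_total = sum((y - grand) ** 2 for y in values)  # computed directly
    rows = {
        "Between Groups": (c - 1, ss_between),
        "Within Groups": (n - c, ss_within),
        "Total": (n - 1, ss_total),
    }
    return {source: (dof, ss, ss / dof) for source, (dof, ss) in rows.items()}

# Hypothetical data (not the machine data from Example 1).
table = one_way_anova_table([[9, 10, 11], [14, 15, 16], [19, 20, 21]])
for source, (dof, ss, ms) in table.items():
    print(f"{source:<15}{dof:>4}{ss:>8.1f}{ms:>8.2f}")
```

For these numbers the degrees of freedom (2 + 6 = 8) and the sums of squares (150 + 6 = 156) add up to the total row, while the mean squares (75 and 1) do not add up to the total mean square of 19.5.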

Remember that the purpose of all the work we have been doing is to test whether there is a “treatment effect,” that is, a difference in the means of the various groups. The hypotheses are $H_0$: there is no difference in the means, and $H_1$: there is a difference in the means. In our current example, we want to know if there is a difference in the mean output level of the various machines.

The statistic we use here is the F-statistic:

$F_{c-1,\,n-c} = \dfrac{\text{MS Between}}{\text{MS Within}}$

The subscripts are the degrees of freedom of the F-statistic: c – 1 is associated with the numerator and n – c with the denominator.

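[Figure: density $f(F_{c-1,\,n-c})$ plotted against $F_{c-1,\,n-c}$, with the acceptance region on the left and the critical region in the right tail]

Like the $\chi^2$ distribution, the F distribution is skewed to the right, and the critical region for the test is the right tail.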

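For our current example, we have $F_{2,\,18} = \text{MS Between}/\text{MS Within} = 7.33$. The F table shows that for 2 and 18 degrees of freedom, the 5% critical value is 3.55. Since our F has a value of 7.33, which exceeds the critical value, we reject the null hypothesis and conclude that there is a difference in the mean output levels of the 3 machines.

[Figure: density $f(F_{2,\,18})$ with the acceptance region to the left of the critical value and the critical region in the right tail]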
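The table lookup can be cross-checked without a statistics library in the special case of 2 numerator degrees of freedom, where the upper tail of the F distribution has the closed form P(F > x) = (1 + 2x/d)^(–d/2) for d denominator degrees of freedom. The sketch below inverts that formula. It is not part of the original slides, the function name is an assumption, and the shortcut does not apply when the numerator degrees of freedom differ from 2.

```python
def f_critical_df1_2(alpha, df2):
    """Upper-tail critical value of an F(2, df2) distribution.

    Inverts P(F > x) = (1 + 2x/df2) ** (-df2/2), a closed form that
    holds only when the numerator degrees of freedom equal 2.
    """
    return (df2 / 2.0) * (alpha ** (-2.0 / df2) - 1.0)

critical = f_critical_df1_2(alpha=0.05, df2=18)
f_value = 7.33  # F statistic reported in Example 1

print("5% critical value for 2 and 18 dof:", round(critical, 2))
print("reject H0:", f_value > critical)
```

For other numerator degrees of freedom, a printed F table or a statistics package is the practical route.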

Example 2: In order to compare the average tread life of 3 brands of tires, random samples of 6 tires of each brand are tested on a machine which simulates road conditions. The tread life for each tire is measured. Complete the analysis of variance table and test at the 1% level whether there is a difference in the average tread life of the 3 tire brands.

Source of variation   Sum of squares   Degrees of freedom   Mean square
Between
Within
Total

First we can calculate SS Total by adding SS Between and SS Within: SS Total = 343.61.

The degrees of freedom for SS Between is c – 1 = 3 – 1 = 2, since there are three groups (3 tire brands).

The degrees of freedom for SS Within is n – c = 18 – 3 = 15, since there are 18 observations (6 of each of the 3 brands).

The degrees of freedom for SS Total is n – 1 = 18 – 1 = 17.

MS Between is (SS Between)/(dof Between) = (SS Between)/2.

MS Within is (SS Within)/(dof Within) = (SS Within)/15.

MS Total is (SS Total)/(dof Total) = 343.61/17 = 20.21.

Source of variation   Sum of squares   Degrees of freedom   Mean square
Between                                2
Within                                 15
Total                 343.61           17                   20.21

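So, $F_{2,\,15} = \text{MS Between}/\text{MS Within} = 14.19$. The F table shows that for 2 and 15 degrees of freedom, the 1% critical value is 6.36. Since our F has a value of 14.19, which exceeds the critical value, we reject the null hypothesis and conclude that there is a difference in the mean tread life of the 3 tire brands.

[Figure: density $f(F_{2,\,15})$ with the acceptance region to the left of the critical value and the critical region in the right tail]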
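An equivalent way to reach the same decision is to compute the probability of an F value at least as large as the observed one and compare it with the 1% level. The slides use the critical-value approach only, so this p-value sketch is an addition for illustration. It relies on the closed-form upper tail that is valid when the numerator has exactly 2 degrees of freedom, and the function name is an assumption.

```python
def f_upper_tail_df1_2(f_value, df2):
    """P(F > f_value) for an F(2, df2) distribution (numerator dof must be 2)."""
    return (1.0 + 2.0 * f_value / df2) ** (-df2 / 2.0)

# F statistic and denominator degrees of freedom from Example 2.
p_value = f_upper_tail_df1_2(14.19, 15)

print("p-value:", p_value)
print("reject H0 at the 1% level:", p_value < 0.01)
```

The two approaches always agree: the p-value falls below 1% exactly when the F statistic exceeds the 1% critical value.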

Two-Factor ANOVA with Interaction Effects

Now, we have two factors of interest, A and B, and the factors may interact to influence a particular variable Y with which we are concerned. For example, we may want to explore the effects of different teachers and different teaching methods on student performance. So, Y is the sum of an overall mean, the effect of factor A, the effect of factor B, the effect of the interaction of A and B, and a random error.

We have 3 sets of hypotheses.

$H_0$: factor A has no effect on Y. $H_1$: factor A has an effect on Y.

$H_0$: factor B has no effect on Y. $H_1$: factor B has an effect on Y.

$H_0$: the interaction of factors A and B has no effect on Y. $H_1$: the interaction of factors A and B has an effect on Y.

For each A and B possibility, we have r replications (or observations). For example, suppose we have a = 4 different teachers, b = 3 different methods, and r = 3 replications. There are 12 different teaching possibilities: You could have teacher A, B, C, or D, and that instructor could be using method 1, 2, or 3. For each one of these 12 situations, we have 3 replications, or 3 observations, or exam scores of 3 students.
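The layout of this design can be enumerated in Python as a quick check on the counts. The teacher labels and method numbers follow the example above; the variable names are assumptions of this sketch.

```python
from itertools import product

teachers = ["A", "B", "C", "D"]  # a = 4 teachers
methods = [1, 2, 3]              # b = 3 teaching methods
r = 3                            # replications (students) per combination

# Every teacher paired with every method gives one cell of the design.
cells = list(product(teachers, methods))
n = len(cells) * r               # total number of exam scores

print("teacher/method combinations:", len(cells))
print("total observations:", n)
```

Each of the 12 cells holds 3 exam scores, so the analysis rests on n = abr = 36 observations.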

Our ANOVA table now looks like this:

Source of Variation         Sum of Squares   Degrees of Freedom    Mean Square
Factor A                    SSA              a – 1                 MSA
Factor B                    SSB              b – 1                 MSB
Interaction between A & B   SSAB             (a – 1)(b – 1)        MSAB
Random Error                SSE              ab(r – 1)             MSE
Total                       SST              n – 1 (or abr – 1)    MST
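The degrees-of-freedom column can be written as a small Python helper. The function name `two_way_dof` is an assumption of this sketch; the formulas are the ones in the table. The built-in check confirms that the four component rows add up to the total.

```python
def two_way_dof(a, b, r):
    """Degrees of freedom for two-factor ANOVA with r replications per cell."""
    dof = {
        "Factor A": a - 1,
        "Factor B": b - 1,
        "Interaction": (a - 1) * (b - 1),
        "Error": a * b * (r - 1),
        "Total": a * b * r - 1,
    }
    # The component rows must add up to the total row.
    components = sum(v for source, v in dof.items() if source != "Total")
    assert components == dof["Total"]
    return dof

# a = 4 teachers, b = 3 methods, r = 3 replications, as in the example above.
print(two_way_dof(4, 3, 3))
```

With r = 1 the error row would have zero degrees of freedom, which is why replications are needed to test for interaction in this setup.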

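Test Statistics for Two-Way ANOVA

Testing for the effect of the interaction of A and B:
$F_{(a-1)(b-1),\,ab(r-1)} = \dfrac{\text{MSAB}}{\text{MSE}}$

Testing for the effect of factor A:
$F_{a-1,\,ab(r-1)} = \dfrac{\text{MSA}}{\text{MSE}}$

Testing for the effect of factor B:
$F_{b-1,\,ab(r-1)} = \dfrac{\text{MSB}}{\text{MSE}}$

Notice that in all three cases, the denominator is the same; it’s the Mean Squared Error.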

Example 3: Consider the following ANOVA table relating student performance to 4 teachers, 3 teaching methods, and the interaction of those 2 factors, using 3 replications.

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square
Teacher               300
Method                400
Interaction           900
Error                 1200
Total

Complete the table and then test at the 5% level whether student performance depends on (1) the teacher, (2) the teaching method, and (3) the interaction of the teacher and the method.

First, we complete the table.

Source of Variation   Sum of Squares   Degrees of Freedom                  Mean Square
Teacher               300              4 – 1 = 3                           100
Method                400              3 – 1 = 2                           200
Interaction           900              (4 – 1)(3 – 1) = 6                  150
Error                 1200             ab(r – 1) = (4)(3)(3 – 1) = 24      50
Total                 2800             n – 1 = abr – 1 = 36 – 1 = 35       80
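The completion of the table can be reproduced in Python from the four given sums of squares and the design sizes. The dictionary layout and variable names are assumptions of this sketch; the numbers are those of Example 3.

```python
a, b, r = 4, 3, 3  # teachers, methods, replications

# Sums of squares given in Example 3.
ss = {"Teacher": 300, "Method": 400, "Interaction": 900, "Error": 1200}
dof = {
    "Teacher": a - 1,
    "Method": b - 1,
    "Interaction": (a - 1) * (b - 1),
    "Error": a * b * (r - 1),
}

# The total row is the sum of the component rows.
ss["Total"] = sum(ss.values())
dof["Total"] = a * b * r - 1

# Each mean square is its sum of squares divided by its degrees of freedom.
ms = {source: ss[source] / dof[source] for source in ss}

for source in ss:
    print(f"{source:<12}{ss[source]:>6}{dof[source]:>5}{ms[source]:>8.1f}")
```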

Now we test the hypotheses. We’ll begin with the teachers. With MSE = 1200/24 = 50, the test statistic is $F_{3,\,24} = \text{MS Teacher}/\text{MSE} = 100/50 = 2.00$. From the F table, we find that the 5% critical value for 3 and 24 degrees of freedom is 3.01. Since our statistic, 2.00, is in the acceptance region, we accept $H_0$ that the teacher has no effect on student performance.

[Figure: density $f(F_{3,\,24})$ with the acceptance region to the left of the critical value and the critical region in the right tail]

Next we look at the teaching methods. With MSE = 50, the test statistic is $F_{2,\,24} = \text{MS Method}/\text{MSE} = 200/50 = 4.00$. From the F table, we find that the 5% critical value for 2 and 24 degrees of freedom is 3.40. Since our statistic, 4.00, is in the critical region, we reject $H_0$ and accept $H_1$ that the method does affect student performance.

[Figure: density $f(F_{2,\,24})$ with the acceptance region to the left of the critical value and the critical region in the right tail]

Last we look at the interaction of teachers and methods. With MSE = 50, the test statistic is $F_{6,\,24} = \text{MS Interaction}/\text{MSE} = 150/50 = 3.00$. From the F table, we find that the 5% critical value for 6 and 24 degrees of freedom is 2.51. Since our statistic, 3.00, is in the critical region, we reject $H_0$ and accept $H_1$: there is an interaction effect of teacher and method on student performance.

[Figure: density $f(F_{6,\,24})$ with the acceptance region to the left of the critical value and the critical region in the right tail]
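The three tests of Example 3 can be summarized in one short Python sketch. The mean squares follow from the example's sums of squares and degrees of freedom; the 5% critical values (3.01, 3.40, and 2.51) are hard-coded as read from a standard F table for 24 denominator degrees of freedom, not computed here. Variable names are assumptions of this sketch.

```python
ms_error = 1200 / 24  # MSE = SSE / dof Error = 50

# effect: (mean square, tabulated 5% critical value with 24 denominator dof)
tests = {
    "Teacher": (300 / 3, 3.01),      # numerator dof = 3
    "Method": (400 / 2, 3.40),       # numerator dof = 2
    "Interaction": (900 / 6, 2.51),  # numerator dof = 6
}

f_values = {}
decisions = {}
for effect, (ms_effect, critical) in tests.items():
    f_values[effect] = ms_effect / ms_error  # same denominator in every test
    reject = f_values[effect] > critical
    decisions[effect] = "reject H0" if reject else "do not reject H0"
    print(f"{effect:<12} F = {f_values[effect]:.2f}  critical = {critical:.2f}  {decisions[effect]}")
```

Only the teacher effect fails to reach its critical value; the method and interaction effects both fall in the critical region.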