
Analysis of Variance (ANOVA) W&W, Chapter 10

Introduction Last time we learned about the chi square test for independence, which is useful for data that is measured at the nominal or ordinal level of analysis. If we have data measured at the interval level, we can compare two or more population groups in terms of their population means using a technique called analysis of variance, or ANOVA.

Completely randomized design

Population 1      Population 2      …      Population k
Mean = μ1         Mean = μ2         …      Mean = μk
Variance = σ1²    Variance = σ2²    …      Variance = σk²

We want to know something about how the populations compare. Do they have the same mean? We can collect random samples from each population, which gives us the following data.

Completely randomized design

Mean = M1         Mean = M2         …      Mean = Mk
Variance = s1²    Variance = s2²    …      Variance = sk²
n1 cases          n2 cases          …      nk cases

Suppose we want to compare 3 college majors in a business school by the average annual income people make 2 years after graduation. We collect the following data (in $1000s) based on random surveys.

Completely randomized design

Accounting      Marketing      Finance
[individual income values not recoverable from the transcript; the group means and sizes appear in the SST calculation below]

Completely randomized design

Can the dean conclude that there are differences among the majors' incomes?

H0: μ1 = μ2 = μ3
HA: at least one μi differs from the others

In this problem we must take into account:
1) The variance between samples, or the actual differences by major. This is called the sum of squares for treatment (SST).

Completely randomized design

2) The variance within samples, or the variance of incomes within a single major. This is called the sum of squares for error (SSE). Recall that when we sample, there is always a chance of getting something different from the population. We account for this through #2, the SSE.

F-Statistic

For this test we will calculate an F statistic, which is used to compare variances:

F = [SST/(k−1)] / [SSE/(n−k)]

SST = sum of squares for treatment
SSE = sum of squares for error
k = the number of populations
n = total sample size
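The formula above can be sketched directly in Python. This is a minimal illustration (the function and variable names are our own, not from the text), plugged through with the example values that appear later in the lecture: SST = 193, SSE = 819.5, k = 3 majors, n = 18 graduates.

```python
# One-way ANOVA F statistic, using the notation on this slide:
# k populations, n total observations.
def f_statistic(sst, sse, k, n):
    mst = sst / (k - 1)   # mean square for treatment (between groups)
    mse = sse / (n - k)   # mean square for error (within groups)
    return mst / mse

# Values from the business-school example later in the lecture.
print(round(f_statistic(193, 819.5, 3, 18), 2))  # 1.77
```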

F-statistic

Intuitively, the F statistic is:

F = explained variance / unexplained variance

Explained variance is the difference between majors; unexplained variance is the variation due to random sampling within each group (see Figure 10-1, page 327).

Calculating SST

SST = Σ ni(Mi − X̄)²

X̄ = grand mean: the sum of all values across all groups, divided by the total sample size. With equal group sizes, as here, this reduces to X̄ = ΣMi/k.
Mi = mean for each sample
k = the number of populations

Calculating SST

By major:
Accounting:  M1 = 29,   n1 = 6
Marketing:   M2 = 33.5, n2 = 6
Finance:     M3 = 37,   n3 = 6

X̄ = (29 + 33.5 + 37)/3 = 33.17

SST = (6)(29 − 33.17)² + (6)(33.5 − 33.17)² + (6)(37 − 33.17)² = 193
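This arithmetic is easy to verify numerically. A minimal sketch (variable names are ours), using the group means and sizes from the slide:

```python
# Recompute SST for the three majors (means in $1000s, six students each).
means = {"Accounting": 29.0, "Marketing": 33.5, "Finance": 37.0}
n_i = 6
# Equal group sizes, so the grand mean is just the average of group means.
grand_mean = sum(means.values()) / len(means)
sst = sum(n_i * (m - grand_mean) ** 2 for m in means.values())
print(round(grand_mean, 2), round(sst, 1))  # 33.17 193.0
```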

Calculating SST

Note that when M1 = M2 = M3, SST = 0, which would support the null hypothesis. In this example the samples are of equal size, but the analysis can also be run with samples of different sizes.

Calculating SSE

SSE = Σ (Xit − Mi)²

In other words, it is the sum of squared deviations within each sample, added across the samples:

SSE = Σ(X1t − M1)² + Σ(X2t − M2)² + Σ(X3t − M3)²

SSE = [(27 − 29)² + (22 − 29)² + … + (29 − 29)²] + [( − 33.5)² + ( − 33.5)² + …] + [(48 − 37)² + (35 − 37)² + … + (29 − 37)²]

SSE = 819.5
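The within-group calculation can be sketched as a small function. Since the full per-student income data are not reproduced in this transcript, the two groups below are purely hypothetical, chosen only to show the mechanics:

```python
# SSE sums each group's squared deviations from its own group mean.
def sse(groups):
    total = 0.0
    for g in groups:
        mean = sum(g) / len(g)
        total += sum((x - mean) ** 2 for x in g)
    return total

# Hypothetical illustration (not the lecture's data):
print(sse([[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]]))  # 10.0
```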

Statistical Output

When you estimate this in a computer program, it will typically be presented in a table as follows:

Source of      df     Sum of          Mean                F-ratio
variation             squares         square
Treatment      k−1    SST             MST = SST/(k−1)     F = MST/MSE
Error          n−k    SSE             MSE = SSE/(n−k)
Total          n−1    SS = SST + SSE

Calculating F for our example

F = (193/2) / (819.5/15) = 96.5/54.63 = 1.77

Our calculated F is compared to the critical value from the F distribution, Fα, k−1, n−k, with:
k − 1 (numerator df) = 2
n − k (denominator df) = 15

The Results

For 95% confidence (α = .05), our critical F is 3.68 (interpolating between the table values at denominator df 14 and 16). In this case 1.77 < 3.68, so we fail to reject the null hypothesis. The dean is puzzled by these results because, just by eyeballing the data, it looks like finance majors make more money.
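The comparison above amounts to a simple decision rule. A sketch using the calculated and critical values from this example:

```python
# Decision rule for the one-way ANOVA test (alpha = .05, df = 2 and 15).
f_calc = 1.77   # calculated on the previous slide
f_crit = 3.68   # read from the F table

if f_calc < f_crit:
    print("Fail to reject H0: no evidence the mean incomes differ")
else:
    print("Reject H0: at least one mean income differs")
```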

The Results

Many other factors besides major may determine salary, such as GPA. The dean decides to collect new data, selecting one student at random from each major at each of the following grade levels.

New data

Grade    Accounting       Marketing       Finance         M(b)
A                                                         M(b1) = 45.67
A                                                         M(b2) = 39.67
B                                                         M(b3) = 30.83
B                                                         M(b4) = 32
C                                                         M(b5) = 29.67
C                                                         M(b6) = 25
         M(t)1 = 30.83    M(t)2 = 33.5    M(t)3 = 36.83   X̄ = 33.72

[individual salary values not recoverable from the transcript; the block means M(b), treatment means M(t), and grand mean are shown]

Randomized Block Design

Now the data in the 3 samples are not independent; they are matched by GPA level. Just as before, matched samples are superior to unmatched samples because they provide more information. In this case, we have added a factor that may account for some of the SSE.

Two-way ANOVA

Now SS(total) = SST + SSB + SSE, where SSB = the variability among blocks, a block being a matched group of observations, one from each of the populations. We can calculate a two-way ANOVA to test our null hypothesis. We will talk about this next week.
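The partition SS(total) = SST + SSB + SSE can be checked numerically. The small table below is hypothetical (values invented for illustration, since the lecture's raw salaries are not reproduced here): rows are blocks (matched GPA groups), columns are treatments (majors).

```python
# Two-way (randomized block) partition of the total sum of squares,
# for one observation per block/treatment cell.
data = [[45.0, 48.0, 44.0],
        [30.0, 34.0, 32.0],
        [28.0, 31.0, 33.0]]
b, k = len(data), len(data[0])                        # blocks, treatments
grand = sum(sum(row) for row in data) / (b * k)

col_means = [sum(data[i][j] for i in range(b)) / b for j in range(k)]
row_means = [sum(row) / k for row in data]

sst = b * sum((m - grand) ** 2 for m in col_means)    # treatment variability
ssb = k * sum((m - grand) ** 2 for m in row_means)    # block variability
sse = sum((data[i][j] - row_means[i] - col_means[j] + grand) ** 2
          for i in range(b) for j in range(k))        # residual
ss_total = sum((x - grand) ** 2 for row in data for x in row)

# The three components sum exactly to the total sum of squares.
assert abs(ss_total - (sst + ssb + sse)) < 1e-9
```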