Ka-fu Wong © 2003 Chap 12- 1 Dr. Ka-fu Wong ECON1003 Analysis of Economic Data.

Chapter Twelve: Analysis of Variance
GOALS
1. Discuss the general idea of analysis of variance.
2. List the characteristics of the F distribution.
3. Conduct a test of hypothesis to determine whether the variances of two populations are equal.
4. Organize data into a one-way and a two-way ANOVA table.
5. Define and understand the terms treatments and blocks.
6. Conduct a test of hypothesis among three or more treatment means.
7. Develop confidence intervals for the difference between treatment means.
8. Conduct a test of hypothesis to determine if there is a difference among block means.

Two Sample Tests
[Figure: two panels, "Test for equal variances" and "Test for equal means", each contrasting Population 1 and Population 2 under H_0 and under H_1.]

Characteristics of F-Distribution
There is a “family” of F distributions. Each member of the family is determined by two parameters: the numerator degrees of freedom and the denominator degrees of freedom. F cannot be negative, and it is a continuous distribution. The F distribution is positively skewed. Its values range from 0 to ∞. As F → ∞, the curve approaches the X-axis.

The F-Distribution, F(m,n)
[Figure: density of F(m,n). It takes nonnegative values only and is not symmetric (skewed to the right).]
Each member of the family is determined by two parameters: the numerator degrees of freedom (m) and the denominator degrees of freedom (n).

Test for Equal Variances (two-tail test)
For the two-tail test, the test statistic is given by
$$F = \frac{s_1^2}{s_2^2}$$
where $s_1^2$ and $s_2^2$ are the sample variances for the two samples, with the larger sample variance conventionally placed in the numerator. The null hypothesis is rejected at significance level $\alpha$ if the computed value of the test statistic is greater than the critical value of the F distribution at level $\alpha/2$ with the corresponding numerator and denominator degrees of freedom.

Test for Equal Variances (one-tail test)
For the one-tail test, the test statistic is given by
$$F = \frac{s_1^2}{s_2^2}$$
where $s_1^2$ and $s_2^2$ are the sample variances for the two samples. The null hypothesis is rejected at significance level $\alpha$ if the computed value of the test statistic is greater than the critical value of the F distribution at level $\alpha$ with the corresponding numerator and denominator degrees of freedom.

EXAMPLE 1
Colin, a stockbroker at Critical Securities, reported that the mean rate of return on a sample of 10 internet stocks was 12.6 percent with a standard deviation of 3.9 percent. The mean rate of return on a sample of 8 utility stocks was 10.9 percent with a standard deviation of 3.5 percent. At the .05 significance level, can Colin conclude that there is more variation in the internet stocks?

EXAMPLE 1 continued
Step 1: The hypotheses are $H_0: \sigma_I^2 \le \sigma_U^2$ versus $H_1: \sigma_I^2 > \sigma_U^2$, where I denotes internet stocks and U denotes utility stocks.
Step 2: The significance level is .05.
Step 3: The test statistic follows the F distribution.
Step 4: $H_0$ is rejected if F > 3.68. The degrees of freedom are 9 in the numerator and 7 in the denominator.
Step 5: The value of F is $(3.9)^2/(3.5)^2 = 1.2416$. $H_0$ is not rejected. There is insufficient evidence to show more variation in the internet stocks.
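The arithmetic in Example 1 can be checked with a short Python sketch. The standard deviations, sample sizes, and the critical value 3.68 are the ones quoted in the example; the variable names are illustrative only, and the critical value is copied from the slide rather than computed.

```python
# Summary statistics quoted in Example 1.
s_internet, n_internet = 3.9, 10   # standard deviation (%), sample size
s_utility, n_utility = 3.5, 8

# F statistic: ratio of the two sample variances, internet stocks on top.
f_stat = s_internet ** 2 / s_utility ** 2
df_num = n_internet - 1            # numerator degrees of freedom
df_den = n_utility - 1             # denominator degrees of freedom

# Critical value as stated in the example (alpha = .05, df = 9 and 7).
f_crit = 3.68
reject_h0 = f_stat > f_crit

print(f"F = {f_stat:.4f} with ({df_num}, {df_den}) df; reject H0: {reject_h0}")
```

Because the computed ratio is well below the critical value, the sketch reaches the same decision as Step 5.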

Analysis of Variance (ANOVA)

Underlying Assumptions for ANOVA
The F distribution is also used to test whether two or more sample means came from the same or equal populations, that is, whether any group mean differs from the mean of all groups combined. It answers the question: “Are all groups equal or not?” This technique is called analysis of variance, or ANOVA. ANOVA requires the following conditions:
1. The sampled populations follow the normal distribution.
2. The populations have equal standard deviations.
3. The samples are randomly selected and are independent.

The hypothesis
Suppose that we have independent samples of $n_1, n_2, \dots, n_K$ observations from K populations. If the population means are denoted by $\mu_1, \mu_2, \dots, \mu_K$, the one-way analysis of variance framework is designed to test the null hypothesis
$$H_0: \mu_1 = \mu_2 = \dots = \mu_K$$
against the alternative that not all population means are equal.

Sample Observations from Independent Random Samples of K Populations

Population            1         2         ...   K
Mean                  μ_1       μ_2       ...   μ_K
Variance              σ^2       σ^2       ...   σ^2       (the same for all populations)
Sample observations   x_11      x_21      ...   x_K1
                      x_12      x_22      ...   x_K2
                      ...       ...             ...
                      x_1n_1    x_2n_2    ...   x_Kn_K
Sample size           n_1       n_2       ...   n_K       (unequal in general)

The K samples generally contain unequal numbers of observations. The total sample size is n_T = n_1 + ... + n_K.

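Sum of Squares Decomposition for One-Way Analysis of Variance
Suppose that we have independent samples of $n_1, n_2, \dots, n_K$ observations from K populations. Denote by $\bar{x}_1, \bar{x}_2, \dots, \bar{x}_K$ the K group sample means and by $\bar{x}$ the overall sample mean. We define the following sums of squares:
$$\text{SSE} = \sum_{i=1}^{K}\sum_{j=1}^{n_i}(x_{ij}-\bar{x}_i)^2 \quad \text{(within groups)}$$
$$\text{SST} = \sum_{i=1}^{K} n_i(\bar{x}_i-\bar{x})^2 \quad \text{(between groups)}$$
$$\text{SSTotal} = \sum_{i=1}^{K}\sum_{j=1}^{n_i}(x_{ij}-\bar{x})^2 \quad \text{(total)}$$
where $x_{ij}$ denotes the jth sample observation in the ith group.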
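The three sums of squares can be computed directly from raw data. The following is a minimal Python sketch under the usual definitions (SST between groups, SSE within groups, SSTotal around the grand mean); the function name `sums_of_squares` and the observations are hypothetical and are not taken from the slides.

```python
def sums_of_squares(groups):
    """Return (SST, SSE, SSTotal) for a list of samples, one list per group."""
    all_obs = [x for g in groups for x in g]
    grand_mean = sum(all_obs) / len(all_obs)
    group_means = [sum(g) / len(g) for g in groups]

    # Between-group (treatment) sum of squares, weighted by group size.
    sst = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Within-group (error) sum of squares.
    sse = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    # Total sum of squares around the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for x in all_obs)
    return sst, sse, ss_total


# Hypothetical data: three groups with sample sizes 3, 4 and 3.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0, 5.0], [3.0, 5.0, 4.0]]
sst, sse, ss_total = sums_of_squares(groups)
print(f"SST = {sst:.2f}, SSE = {sse:.2f}, SSTotal = {ss_total:.2f}")
```

For any data set, the printed SSTotal should equal SST + SSE up to rounding, which is the decomposition the chapter relies on.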

A Numerical Example of Sum of Squares Decomposition

Population                  1      2      3
Mean                        μ_1    μ_2    μ_3
Variance                    σ^2    σ^2    σ^2
Sample observations (x_ij)
Sample size (n_i)           3      4      3
Sample mean

Grand mean = 2.9
SSTotal = SST + SSE

A Proof of SSTotal = SST + SSE

Population            1         2         ...   K
Sample observations   x_11      x_21      ...   x_K1
                      x_12      x_22      ...   x_K2
                      ...       ...             ...
                      x_1n_1    x_2n_2    ...   x_Kn_K
Sample size           n_1       n_2       ...   n_K
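One standard way to carry out the proof is sketched below. It assumes the usual notation: $\bar{x}_i$ is the sample mean of group i, $\bar{x}$ is the overall sample mean, and $n_i$ is the size of group i. This is a sketch of the textbook argument, not necessarily the derivation shown on the original slide.

```latex
\begin{align*}
\text{SSTotal}
  &= \sum_{i=1}^{K}\sum_{j=1}^{n_i} (x_{ij}-\bar{x})^2
   = \sum_{i=1}^{K}\sum_{j=1}^{n_i}
     \bigl[(x_{ij}-\bar{x}_i) + (\bar{x}_i-\bar{x})\bigr]^2 \\
  &= \sum_{i=1}^{K}\sum_{j=1}^{n_i} (x_{ij}-\bar{x}_i)^2
   + \sum_{i=1}^{K} n_i(\bar{x}_i-\bar{x})^2
   + 2\sum_{i=1}^{K} (\bar{x}_i-\bar{x})\sum_{j=1}^{n_i}(x_{ij}-\bar{x}_i) \\
  &= \text{SSE} + \text{SST} + 0 .
\end{align*}
```

The cross term vanishes because, within each group, the deviations from the group mean sum to zero: $\sum_{j}(x_{ij}-\bar{x}_i) = n_i\bar{x}_i - n_i\bar{x}_i = 0$.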

Two Ways to Estimate the Population Variance
Note that the variance is assumed to be identical across populations. If the population means are also identical, we have two ways to estimate the population variance:
1. Based on the K sample variances.
2. Based on the deviation of the K sample means from the grand mean.

An Estimate of the Population Variance Based on Sample Variances
Any one of the K sample variances can be used to estimate the population variance. We can get a more precise estimate if we use all the information from the K samples, which gives the pooled estimate MSE = SSE/(n_T - K).

An Estimate of the Population Variance Based on the Deviation of the K Sample Means from the Grand Sample Mean
If the sample sizes are the same for all samples, the Central Limit Theorem suggests that each sample mean is normally distributed with mean equal to the population mean and variance equal to the population variance divided by the sample size. When sample sizes differ across samples, the squared deviation of each sample mean has to be weighted by its sample size, which gives MST = SST/(K - 1).
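The weighting can be made explicit with a short derivation. It is a sketch under the ANOVA assumptions (common variance $\sigma^2$, independent samples) and under the null hypothesis of equal population means; $\bar{x}_i$, $\bar{x}$, and $n_i$ denote the group means, the grand mean, and the group sizes.

```latex
% Equal sample sizes n: each sample mean has variance sigma^2 / n,
% so n times the sample variance of the K means estimates sigma^2.
\[
\operatorname{Var}(\bar{x}_i) = \frac{\sigma^2}{n}
\quad\Longrightarrow\quad
\hat{\sigma}^2 = \frac{n}{K-1}\sum_{i=1}^{K}(\bar{x}_i-\bar{x})^2 .
\]
% Unequal sample sizes n_i: weight each squared deviation by n_i.
\[
\hat{\sigma}^2 = \frac{1}{K-1}\sum_{i=1}^{K} n_i(\bar{x}_i-\bar{x})^2
               = \frac{\text{SST}}{K-1} = \text{MST}.
\]
```

With equal sample sizes the second expression reduces to the first, so the weighted form covers both cases.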

Comparing the Variance Estimates: The F Test
If the null hypothesis is true and the ANOVA assumptions are valid, the sampling distribution of the ratio of the two variance estimates follows an F distribution with K - 1 numerator and n_T - K denominator degrees of freedom. If the means of the K populations are not equal, the value of the F statistic will be inflated because SST/(K - 1) will overestimate $\sigma^2$. Hence, we will reject $H_0$ if the resulting value of the F statistic appears to be too large to have been selected at random from the appropriate F distribution.

Test for the Equality of K Population Means
Hypotheses: $H_0: \mu_1 = \mu_2 = \mu_3 = \dots = \mu_K$ versus $H_1$: Not all population means are equal.
Test statistic: F = [SST/(K - 1)] / [SSE/(n_T - K)]
Rejection rule: Reject $H_0$ if $F > F_\alpha$, where the value of $F_\alpha$ is based on an F distribution with K - 1 numerator degrees of freedom and n_T - K denominator degrees of freedom.

Sampling Distribution of MST/MSE
The figure shows the rejection region associated with a level of significance equal to $\alpha$, where $F_\alpha$ denotes the critical value.
[Figure: sampling distribution of MST/MSE. Do not reject H_0 to the left of the critical value F_α; reject H_0 to the right of it.]

The ANOVA Table

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square   F
Treatment             SST              K - 1                MST           MST/MSE
Error                 SSE              n_T - K              MSE
Total                 SSTotal          n_T - 1
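The table layout above translates into a few lines of Python. This is a sketch only: the helper names `anova_table` and `fmt` are made up for illustration, and the sums of squares passed in are hypothetical values rather than numbers from the slides.

```python
def anova_table(sst, sse, k, n_total):
    """Return the rows of a one-way ANOVA table as a list of dicts."""
    df_treat, df_error = k - 1, n_total - k
    mst = sst / df_treat      # mean square for treatments
    mse = sse / df_error      # mean square error
    return [
        {"source": "Treatment", "ss": sst, "df": df_treat, "ms": mst, "f": mst / mse},
        {"source": "Error", "ss": sse, "df": df_error, "ms": mse, "f": None},
        {"source": "Total", "ss": sst + sse, "df": n_total - 1, "ms": None, "f": None},
    ]


def fmt(value):
    """Format a numeric cell, leaving empty cells blank."""
    return "" if value is None else f"{value:.2f}"


# Hypothetical sums of squares for K = 3 treatments and n_T = 10 observations.
rows = anova_table(sst=6.6, sse=9.0, k=3, n_total=10)
print(f"{'Source':<10}{'SS':>8}{'df':>5}{'MS':>8}{'F':>8}")
for r in rows:
    print(f"{r['source']:<10}{fmt(r['ss']):>8}{r['df']:>5}{fmt(r['ms']):>8}{fmt(r['f']):>8}")
```

The F value in the first row would then be compared with the critical value from an F table for (K - 1, n_T - K) degrees of freedom.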

Does learning method affect students’ exam scores?
Consider 3 methods: standard, osmosis, and shock therapy. Convince 15 students to take part. Assign 5 students randomly to each method. Wait eight weeks. Then test the students to get exam scores. Are the three learning methods equally effective? That is, are their population mean exam scores the same?

“Analysis of Variance” (Study #1)
The variation between the group means and the grand mean is larger than the variation within each of the groups.

ANOVA Table for Study #1
One-way Analysis of Variance

Source   DF   SS   MS   F   P
Factor
Error
Total

“Source” means “find the components of variation in this column”.
“DF” means “degrees of freedom”.
“SS” means “sums of squares”.
“MS” means “mean squared”.
“F” means “F test statistic”.
“P” means “P-value”.

ANOVA Table for Study #1
One-way Analysis of Variance

Source   DF   SS   MS   F   P
Factor
Error
Total

“Factor” means “variability between groups” or “variability due to the factor of interest”.
“Error” means “variability within groups” or “unexplained random variation”.
“Total” means “total variation from the grand mean”.

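ANOVA Table for Study #1
One-way Analysis of Variance

Source   DF   SS   MS   F   P
Factor
Error
Total

The entries are related as follows:
DF(Factor) = 3 - 1 = 2, DF(Error) = 15 - 3 = 12, DF(Total) = 15 - 1 = 14.
MS(Factor) = SS(Factor)/DF(Factor).
MS(Error) = SS(Error)/DF(Error) = 161.2/12 = 13.4.
F = MS(Factor)/MS(Error) = MS(Factor)/13.4.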

“Analysis of Variance” (Study #2)
The variation between the group means and the grand mean is smaller than the variation within each of the groups.

ANOVA Table for Study #2
One-way Analysis of Variance

Source   DF   SS   MS   F   P
Factor
Error
Total

The P-value is fairly large, so we cannot reject the null hypothesis. There is insufficient evidence to conclude that the average exam scores differ for the three learning methods.

Do Holocaust survivors have more sleep problems than others?

ANOVA Table for Sleep Study
One-way Analysis of Variance

Source   DF   SS   MS   F   P
Factor
Error
Total

The P-value is so small that we reject the null hypothesis of equal population means in favor of the alternative hypothesis that at least one pair of population means is different.

Potential problem with the analysis
What is driving the rejection of the null hypothesis of equal population means? From the plot, the Healthy and Depress groups seem to have different mean sleep quality. It looks as though the rejection is due to the difference between these two groups. If we pooled Healthy and Depress, the pooled distribution would look more like that of Survivor, so a failure to reject the null hypothesis would be more likely. This example illustrates that we have to be careful about our analysis and our interpretation of the result when we conduct a test of equal population means.

EXAMPLE 2
Rosenbaum Restaurants specialize in meals for senior citizens. Katy Polsby, President, recently developed a new meat loaf dinner. Before making it a part of the regular menu, she decides to test it in several of her restaurants. She would like to know if there is a difference in the mean number of dinners sold per day at the Aynor, Loris, and Lander restaurants. Use the .05 significance level.

EXAMPLE 2 continued
Number of dinners sold per day

Obs   Aynor   Loris   Lander

EXAMPLE 2 continued
Step 1: $H_0: \mu_1 = \mu_2 = \mu_3$ versus $H_1$: Treatment means are not all the same.
Step 2: $H_0$ is rejected if F > 4.10. There are 2 df in the numerator and 10 df in the denominator.

EXAMPLE 2 continued
To find the value of F:

Source      SS   df   MS   F   p-value
Treatment        2                 E-05
Error            10
Total            12

The decision is to reject the null hypothesis. The treatment means are not the same: the mean number of meals sold at the three locations is not the same.

Inferences About Treatment Means
When we reject the null hypothesis that the means are equal, we may want to know which treatment means differ. One of the simplest procedures is through the use of confidence intervals.

Confidence Interval for the Difference Between Two Means
$$(\bar{x}_1 - \bar{x}_2) \pm t\sqrt{\text{MSE}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$
where t is obtained from the t table with n_T - K degrees of freedom, and MSE = SSE/(n_T - K).
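The interval can be sketched in Python as follows, assuming the standard form (difference of sample means) ± t · sqrt(MSE(1/n_1 + 1/n_2)). The function name `mean_difference_ci` and every input value below are hypothetical; the t value must be looked up in a t table for n_T - K degrees of freedom (2.228 is the two-sided 95% value for 10 degrees of freedom).

```python
import math


def mean_difference_ci(mean_1, mean_2, n_1, n_2, mse, t_crit):
    """Confidence interval for the difference between two treatment means."""
    half_width = t_crit * math.sqrt(mse * (1 / n_1 + 1 / n_2))
    diff = mean_1 - mean_2
    return diff - half_width, diff + half_width


# Hypothetical inputs, not taken from the slides.
low, high = mean_difference_ci(
    mean_1=20.0, mean_2=15.0, n_1=5, n_2=5, mse=4.0, t_crit=2.228
)
means_differ = not (low <= 0.0 <= high)   # zero outside the interval
print(f"95% CI: ({low:.2f}, {high:.2f}); means differ: {means_differ}")
```

If zero lies outside the interval, the pair of treatment means is judged to differ at the chosen confidence level.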

EXAMPLE 3
From EXAMPLE 2, develop a 95% confidence interval for the difference in the mean number of meat loaf dinners sold in Lander and Aynor. Can Katy conclude that there is a difference between the two restaurants?
Because zero is not in the interval, we conclude that this pair of means differs. The mean number of meals sold in Aynor is different from Lander.

Two-Factor ANOVA
For the two-factor ANOVA, we test whether there is a significant difference among the treatment means and whether there is a significant difference among the block means.

Sample Observations from Independent Random Samples of K Populations

          TREATMENT
BLOCK     1       2       ...   K
1         x_11    x_21    ...   x_K1
2         x_12    x_22    ...   x_K2
...       ...     ...           ...
B         x_1B    x_2B    ...   x_KB

Sum of Squares Decomposition for Two-Way Analysis of Variance
Suppose that we have a sample of observations with $x_{ij}$ denoting the observation in the ith group and jth block. Suppose that there are K groups and B blocks, for a total of n = KB observations. Denote the group sample means by $\bar{x}_{i\cdot}$ (i = 1, ..., K), the block sample means by $\bar{x}_{\cdot j}$ (j = 1, ..., B), and the overall sample mean by $\bar{x}$.
SSTotal = SSE + SST + SSB

General Format of Two-Way Analysis of Variance Table

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square                  F Ratio
Treatments            SST              K - 1                MST = SST/(K - 1)            MST/MSE
Blocks                SSB              B - 1                MSB = SSB/(B - 1)            MSB/MSE
Error                 SSE              (K - 1)(B - 1)       MSE = SSE/[(K - 1)(B - 1)]
Total                 SSTotal          n_T - 1
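A minimal Python sketch of the two-way computation follows, for a design with one observation per treatment-block cell. It uses the standard formulas SST = B·Σ(treatment mean - grand mean)² and SSB = K·Σ(block mean - grand mean)², and obtains SSE by subtraction from SSTotal. The function names and the data are hypothetical and are not taken from the slides.

```python
def two_way_sums_of_squares(table):
    """table[i][j] is the observation for treatment i and block j."""
    k, b = len(table), len(table[0])
    grand_mean = sum(sum(row) for row in table) / (k * b)
    treat_means = [sum(row) / b for row in table]
    block_means = [sum(table[i][j] for i in range(k)) / k for j in range(b)]

    sst = b * sum((m - grand_mean) ** 2 for m in treat_means)   # treatments
    ssb = k * sum((m - grand_mean) ** 2 for m in block_means)   # blocks
    ss_total = sum((x - grand_mean) ** 2 for row in table for x in row)
    sse = ss_total - sst - ssb        # error sum of squares by subtraction
    return sst, ssb, sse, ss_total


def f_ratios(sst, ssb, sse, k, b):
    """Return (F for treatments, F for blocks)."""
    mse = sse / ((k - 1) * (b - 1))
    return (sst / (k - 1)) / mse, (ssb / (b - 1)) / mse


# Hypothetical data: K = 3 treatments (rows) by B = 4 blocks (columns).
data = [
    [10.0, 12.0, 11.0, 13.0],
    [14.0, 15.0, 13.0, 16.0],
    [9.0, 11.0, 10.0, 12.0],
]
sst, ssb, sse, ss_total = two_way_sums_of_squares(data)
f_treat, f_block = f_ratios(sst, ssb, sse, k=3, b=4)
print(f"SST={sst:.2f} SSB={ssb:.2f} SSE={sse:.2f} F_treat={f_treat:.2f} F_block={f_block:.2f}")
```

Each F ratio would then be compared with its own critical value: (K - 1, (K - 1)(B - 1)) degrees of freedom for treatments and (B - 1, (K - 1)(B - 1)) for blocks.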

EXAMPLE 4
The Bieber Manufacturing Co. operates 24 hours a day, five days a week. The workers rotate shifts each week. Todd Bieber, the owner, is interested in whether there is a difference in the number of units produced when the employees work on various shifts. A sample of five workers is selected and their output recorded on each shift. At the .05 significance level, can we conclude there is a difference in the mean production by shift and in the mean production by employee?

EXAMPLE 4 continued

EXAMPLE 4 continued
Treatment Effect:
Step 1: $H_0: \mu_1 = \mu_2 = \mu_3$ versus $H_1$: Not all means are equal.
Step 2: $H_0$ is rejected if F > 4.46. The degrees of freedom are 2 and 8.

EXAMPLE 4 continued
Step 3: Compute the various sums of squares:

Source       SS      df   MS   F   p-value
Treatments           2
Blocks       33.73   4
Error        43.47   8
Total                14

Step 4: $H_0$ is rejected. There is a difference in the mean number of units produced for the different time periods.

EXAMPLE 4 continued
Block Effect:
Step 1: $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5$ versus $H_1$: Not all means are equal.
Step 2: $H_0$ is rejected if F > 3.84. The degrees of freedom are 4 and 8.
Step 3: F = [33.73/4]/[43.47/8] = 1.55
Step 4: $H_0$ is not rejected. There is no significant difference in the average number of units produced for the different employees.
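The block-effect arithmetic in Step 3 can be reproduced with a few lines of Python. The sums of squares, degrees of freedom, and critical value are the ones stated in the example; the variable names are illustrative only.

```python
# Values stated in Example 4 for the block (employee) effect.
ssb, df_blocks = 33.73, 4     # block sum of squares and its degrees of freedom
sse, df_error = 43.47, 8      # error sum of squares and its degrees of freedom

msb = ssb / df_blocks         # mean square for blocks
mse = sse / df_error          # mean square error
f_blocks = msb / mse

f_crit = 3.84                 # critical value quoted in Step 2
reject_h0 = f_blocks > f_crit

print(f"MSB = {msb:.2f}, MSE = {mse:.2f}, F = {f_blocks:.2f}, reject H0: {reject_h0}")
```

The computed ratio falls below the critical value, in line with the decision in Step 4.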

END - Chapter Twelve Analysis of Variance