1 Math 10 M Geraghty Part 8 Chi-square and ANOVA tests © Maurice Geraghty 2015.

Slides:



Advertisements
Similar presentations
Lecture (11,12) Parameter Estimation of PDF and Fitting a Distribution Function.
Advertisements

Goodness Of Fit. For example, suppose there are four entrances to a building. You want to know if the four entrances are equally used. You observe 400.
15- 1 Chapter Fifteen McGraw-Hill/Irwin © 2005 The McGraw-Hill Companies, Inc., All Rights Reserved.
Inference about the Difference Between the
1 1 Slide © 2009, Econ-2030 Applied Statistics-Dr Tadesse Chapter 10: Comparisons Involving Means n Introduction to Analysis of Variance n Analysis of.
Chapter 12 Chi-Square Tests and Nonparametric Tests
Analysis of Variance: Inferences about 2 or More Means
Statistics Are Fun! Analysis of Variance
Lesson #23 Analysis of Variance. In Analysis of Variance (ANOVA), we have: H 0 :  1 =  2 =  3 = … =  k H 1 : at least one  i does not equal the others.
Irwin/McGraw-Hill © The McGraw-Hill Companies, Inc., 2000 LIND MASON MARCHAL 1-1 Chapter Thirteen Nonparametric Methods: Chi-Square Applications GOALS.
Ch 15 - Chi-square Nonparametric Methods: Chi-Square Applications
Chi-Square Tests and the F-Distribution
Chi-Square and Analysis of Variance (ANOVA)
AM Recitation 2/10/11.
The Chi-Square Distribution 1. The student will be able to  Perform a Goodness of Fit hypothesis test  Perform a Test of Independence hypothesis test.
Aim: How do we test a comparison group? Exam Tomorrow.
Business Statistics, A First Course (4e) © 2006 Prentice-Hall, Inc. Chap 11-1 Chapter 11 Chi-Square Tests Business Statistics, A First Course 4 th Edition.
12-1 Chapter Twelve McGraw-Hill/Irwin © 2005 The McGraw-Hill Companies, Inc., All Rights Reserved.
1 1 Slide © 2006 Thomson/South-Western Slides Prepared by JOHN S. LOUCKS St. Edward’s University Slides Prepared by JOHN S. LOUCKS St. Edward’s University.
1 Tests with two+ groups We have examined tests of means for a single group, and for a difference if we have a matched sample (as in husbands and wives)
1 1 Slide © 2005 Thomson/South-Western Chapter 13, Part A Analysis of Variance and Experimental Design n Introduction to Analysis of Variance n Analysis.
Copyright © 2013, 2010 and 2007 Pearson Education, Inc. Chapter Inference on Categorical Data 12.
12-1 Chapter Twelve McGraw-Hill/Irwin © 2006 The McGraw-Hill Companies, Inc., All Rights Reserved.
1 Chapter 13 Analysis of Variance. 2 Chapter Outline  An introduction to experimental design and analysis of variance  Analysis of Variance and the.
Copyright © 2004 Pearson Education, Inc.
Chapter 12 Analysis of Variance. An Overview We know how to test a hypothesis about two population means, but what if we have more than two? Example:
© Copyright McGraw-Hill 2000
Section 10.2 Independence. Section 10.2 Objectives Use a chi-square distribution to test whether two variables are independent Use a contingency table.
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 11-1 Chapter 11 Chi-Square Tests and Nonparametric Tests Statistics for.
© Copyright McGraw-Hill CHAPTER 11 Other Chi-Square Tests.
Copyright © Cengage Learning. All rights reserved. 12 Analysis of Variance.
Chapter Outline Goodness of Fit test Test of Independence.
Ka-fu Wong © 2003 Chap Dr. Ka-fu Wong ECON1003 Analysis of Economic Data.
Copyright (C) 2002 Houghton Mifflin Company. All rights reserved. 1 Understandable Statistics S eventh Edition By Brase and Brase Prepared by: Lynn Smith.
12-1 Chapter Twelve McGraw-Hill/Irwin © 2006 The McGraw-Hill Companies, Inc., All Rights Reserved.
Nonparametric Methods: Goodness-of-Fit Tests Chapter 15 Copyright © 2013 by The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin.
Copyright © 2013, 2009, and 2007, Pearson Education, Inc. Chapter 11 Analyzing the Association Between Categorical Variables Section 11.2 Testing Categorical.
ContentFurther guidance  Hypothesis testing involves making a conjecture (assumption) about some facet of our world, collecting data from a sample,
Econ 3790: Business and Economic Statistics Instructor: Yogesh Uppal
Econ 3790: Business and Economic Statistics Instructor: Yogesh Uppal
Copyright © 2004 McGraw-Hill Ryerson Limited. All rights reserved
Copyright © 2004 by The McGraw-Hill Companies, Inc. All rights reserved.
1 1 Slide © 2008 Thomson South-Western. All Rights Reserved Chapter 12 Tests of Goodness of Fit and Independence n Goodness of Fit Test: A Multinomial.
Chapter 12 Chi-Square Tests and Nonparametric Tests.
Jump to first page Inferring Sample Findings to the Population and Testing for Differences.
Chapter 10 Section 5 Chi-squared Test for a Variance or Standard Deviation.
Chapter 11: Categorical Data n Chi-square goodness of fit test allows us to examine a single distribution of a categorical variable in a population. n.
Class Seven Turn In: Chapter 18: 32, 34, 36 Chapter 19: 26, 34, 44 Quiz 3 For Class Eight: Chapter 20: 18, 20, 24 Chapter 22: 34, 36 Read Chapters 23 &
Analysis of Variance. The F Distribution Uses of the F Distribution – test whether two samples are from populations having equal variances – to compare.
 List the characteristics of the F distribution.  Conduct a test of hypothesis to determine whether the variances of two populations are equal.  Discuss.
Section 10.2 Objectives Use a contingency table to find expected frequencies Use a chi-square distribution to test whether two variables are independent.
CHI SQUARE DISTRIBUTION. The Chi-Square (  2 ) Distribution The chi-square distribution is the probability distribution of the sum of several independent,
Chapter 11 Created by Bethany Stubbe and Stephan Kogitz.
Test of independence: Contingency Table
Chapter 12 Chi-Square Tests and Nonparametric Tests
Chi-square test or c2 test
Characteristics of F-Distribution
Math 4030 – 10b Inferences Concerning Variances: Hypothesis Testing
Chapter Fifteen McGraw-Hill/Irwin
Statistics Analysis of Variance.
Inferential Statistics and Probability a Holistic Approach
Statistics for Business and Economics (13e)
Econ 3790: Business and Economic Statistics
Inferential Statistics and Probability a Holistic Approach
Inferential Statistics and Probability a Holistic Approach
Chapter 11: Inference for Distributions of Categorical Data
Analyzing the Association Between Categorical Variables
Chapter 10 – Part II Analysis of Variance
Quantitative Methods ANOVA.
Week ANOVA Four.
Presentation transcript:

1 Math 10 M Geraghty Part 8 Chi-square and ANOVA tests © Maurice Geraghty 2015

2 Characteristics of the Chi- Square Distribution The major characteristics of the chi- square distribution are: It is positively skewed It is non-negative It is based on degrees of freedom When the degrees of freedom change a new distribution is created 14-2

3 CHI-SQUARE DISTRIBUTION CHI-SQUARE DISTRIBUTION df = 3 df = 5 df = 10  2-2

4 Goodness-of-Fit Test: Equal Expected Frequencies Let O i and E i be the observed and expected frequencies respectively for each category. : there is no difference between Observed and Expected Frequencies : there is a difference between Observed and Expected Frequencies The test statistic is: The critical value is a chi-square value with (k-1) degrees of freedom, where k is the number of categories 14-4

5 EXAMPLE 1 The following data on absenteeism was collected from a manufacturing plant. At the.01 level of significance, test to determine whether there is a difference in the absence rate by day of the week. 14-5

6 EXAMPLE 1 continued Assume equal expected frequency: ( )/5=

7 EXAMPLE 1 continued H o : there is no difference between the observed and the expected frequencies of absences. H a : there is a difference between the observed and the expected frequencies of absences. Test statistic: chi-square=  (O-E) 2 /E= Decision Rule: reject H o if test statistic is greater than the critical value of (4 df,  =.01) Conclusion: reject H o and conclude that there is a difference between the observed and expected frequencies of absences. 14-7

8 Goodness-of-Fit Test: Unequal Expected Frequencies EXAMPLE 2 The U.S. Bureau of the Census (2000) indicated that 54.4% of the population is married, 6.6% widowed, 9.7% divorced (and not re-married), 2.2% separated, and 27.1% single (never been married). A sample of 500 adults from the San Jose area showed that 270 were married, 22 widowed, 42 divorced, 10 separated, and 156 single. At the.05 significance level can we conclude that the San Jose area is different from the U.S. as a whole? 14-8

9 EXAMPLE 2 continued 14-9

10 EXAMPLE 2 continued Design: H o : p 1 =.544 p 2 =.066 p 3 =.097 p 4 =.022 p 5 =.271 H a : at least one p i is different  =.05 Model: Chi-Square Goodness of Fit, df=4 H o is rejected if  2 > Data:  2 = 7.745, Fail to Reject Ho Conclusion: Insufficient evidence to conclude San Jose is different than the US Average 14-10

11 Contingency Table Analysis Contingency table analysis is used to test whether two traits or variables are related. Each observation is classified according to two variables. The usual hypothesis testing procedure is used. The degrees of freedom is equal to: (number of rows-1)(number of columns-1). The expected frequency is computed as: Expected Frequency = (row total)(column total)/grand total 14-15

12 EXAMPLE 3 In May 2014, Colorado became the first state to legalize the recreational use of marijuana. A poll of 1000 adults were classified by gender and their opinion about same-sex marriage. At the.05 level of significance, can we conclude that gender and the opinion about legalizing marijuana for recreational use are dependent events? 14-16

13 EXAMPLE 3 continued 14-17

14 EXAMPLE 3 continued Design: H o : Gender and Opinion are independent. H a : Gender and Opinion are dependent.  =.05 Model: Chi-Square Test for Independence, df=2 H o is rejected if  2 > 5.99 Data:  2 = 6.756, Reject Ho Conclusion: Gender and opinion are dependent variables. Men are more likely to support legalizing marijuana for recreational use

15 Characteristics of F- Distribution There is a “family” of F Distributions. Each member of the family is determined by two parameters: the numerator degrees of freedom and the denominator degrees of freedom. F cannot be negative, and it is a continuous distribution. The F distribution is positively skewed. Its values range from 0 to . As F   the curve approaches the X-axis. 11-3

16 Underlying Assumptions for ANOVA The F distribution is also used for testing the equality of more than two means using a technique called analysis of variance (ANOVA). ANOVA requires the following conditions: The populations being sampled are normally distributed. The populations have equal standard deviations. The samples are randomly selected and are independent. 11-8

17 Analysis of Variance Procedure The Null Hypothesis: the population means are the same. The Alternative Hypothesis: at least one of the means is different. The Test Statistic: F=(between sample variance)/(within sample variance). Decision rule: For a given significance level , reject the null hypothesis if F (computed) is greater than F (table) with numerator and denominator degrees of freedom. 11-9

18 ANOVA – Null Hypothesis Ho is true -all means the same Ho is false -not all means the same

19 ANOVA NOTES If there are k populations being sampled, then the df (numerator)=k-1 If there are a total of n sample points, then df (denominator) = n-k The test statistic is computed by:F=[(SS F )/(k-1)]/[(SS E )/(N-k)]. SS F represents the factor (between) sum of squares. SS E represents the error (within) sum of squares. Let T C represent the column totals, n c represent the number of observations in each column, and  X represent the sum of all the observations. These calculations are tedious, so technology is used to generate the ANOVA table

20 Formulas for ANOVA 11-11

21 ANOVA Table SourceSSdfMSF FactorSS Factor k-1SS F /df F MS F /MS E ErrorSS Error n-kSS E/ df E TotalSS Total n-1

22 EXAMPLE 4 Party Pizza specializes in meals for students. Hsieh Li, President, recently developed a new tofu pizza. Before making it a part of the regular menu she decides to test it in several of her restaurants. She would like to know if there is a difference in the mean number of tofu pizzas sold per day at the Cupertino, San Jose, and Santa Clara pizzerias for sample of five days. At the.05 significance level can Hsieh Li conclude that there is a difference in the mean number of tofu pizzas sold per day at the three pizzerias? 11-12

23 Example 4

Example 4 continued 24

25 Example 4 continued ANOVA TABLE SourceSSdfMSF Factor Error Total

26 EXAMPLE 4 continued Design: H o :  1=  2=  3 H a : Not all the means are the same  =.05 Model: One Factor ANOVA H 0 is rejected if F>4.10 Data: Test statistic: F=[76.25/2]/[9.75/10]= H 0 is rejected. Conclusion: There is a difference in the mean number of pizzas sold at each pizzeria

27

28 Post Hoc Comparison Test Used for pairwise comparison Designed so the overall signficance level is 5%. Use technology. Refer to Tukey Test Material in Supplemental Material.

29 Post Hoc Comparison Test

30 Post Hoc Comparison Test