© 2010 Pearson Prentice Hall. All rights reserved The Chi-Square Test of Independence.

Presentation transcript:

12-2 The chi-square test for independence is used to determine whether there is an association between a row variable and column variable in a contingency table constructed from sample data taken from a population of interest. The null hypothesis is that the variables are not associated; in other words, they are independent. The alternative hypothesis is that the variables are associated, or dependent.

12-3 “In Other Words” In a chi-square independence test, the null hypothesis is always
H0: The variables are independent.
The alternative hypothesis is always
H1: The variables are not independent.

12-4 The idea behind testing these types of claims is to compare actual counts to the counts we would expect if the null hypothesis were true (if the variables are independent). If a significant difference between the actual counts and expected counts exists, we would take this as evidence against the null hypothesis.

12-5 If two events are independent, then P(A and B) = P(A)P(B). We can use the Multiplication Principle for independent events to obtain the expected proportion of observations within each cell under the assumption of independence, and then multiply this result by n, the sample size, to obtain the expected count within each cell.

12-6 Parallel Example 1: Determining the Expected Counts in a Test for Independence
In a poll, 883 males and 893 females were asked “If you could have only one of the following, which would you pick: money, health, or love?” Their responses are presented in the table below. Determine the expected counts within each cell, assuming that gender and response are independent.
Source: Based on a Fox News Poll conducted in January 1999

12-7 Solution
Step 1: We first compute the row and column totals:

                Money   Health   Love   Row totals
Men                82      446    355          883
Women              46      574    273          893
Column totals     128     1020    628         1776

12-8 Solution
Step 2: Next, compute the relative marginal frequencies for the row variable and the column variable:

                     Money                Health                Love                 Relative frequency
Men                  82                   446                   355                  883/1776 ≈ 0.4972
Women                46                   574                   273                  893/1776 ≈ 0.5028
Relative frequency   128/1776 ≈ 0.0721    1020/1776 ≈ 0.5743    628/1776 ≈ 0.3536

12-9 Solution
Step 3: Assuming gender and response are independent, we use the Multiplication Rule for Independent Events to compute the proportion of observations we would expect in each cell.

         Money    Health   Love
Men      0.0358   0.2855   0.1758
Women    0.0362   0.2888   0.1778

12-10 Solution
Step 4: We multiply the expected proportions from Step 3 by 1776, the sample size, to obtain the expected counts under the assumption of independence.

         Money                    Health                    Love
Men      1776(0.0358) ≈ 63.58     1776(0.2855) ≈ 507.05     1776(0.1758) ≈ 312.22
Women    1776(0.0362) ≈ 64.29     1776(0.2888) ≈ 512.91     1776(0.1778) ≈ 315.77
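Steps 1 through 4 can be sketched in a few lines of Python. This is only an illustration: the dictionary layout and variable names are assumptions, and the counts are the observed frequencies from the poll. Because the slides round the proportions to four decimal places before multiplying, products such as 1776(0.0358) can differ from the unrounded values computed here in the second decimal place.

```python
# Observed counts from the poll (rows: gender, columns: response).
observed = {
    "Men": {"Money": 82, "Health": 446, "Love": 355},
    "Women": {"Money": 46, "Health": 574, "Love": 273},
}

# Step 1: row totals, column totals, and the table total.
columns = list(next(iter(observed.values())))
row_totals = {row: sum(cells.values()) for row, cells in observed.items()}
col_totals = {col: sum(observed[row][col] for row in observed) for col in columns}
n = sum(row_totals.values())

# Step 2: relative marginal frequencies.
row_props = {row: total / n for row, total in row_totals.items()}
col_props = {col: total / n for col, total in col_totals.items()}

# Steps 3 and 4: expected proportion under independence, scaled by n.
expected = {
    row: {col: n * row_props[row] * col_props[col] for col in columns}
    for row in observed
}

for row in observed:
    print(row, {col: round(expected[row][col], 2) for col in columns})
```

Keeping the proportions unrounded until the final step avoids the small rounding drift that appears when the work is done by hand.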

12-11 Expected Frequencies in a Chi-Square Test for Independence
To find the expected frequency in a cell when performing a chi-square independence test, multiply the row total of the row containing the cell by the column total of the column containing the cell, and divide the result by the table total. That is,

$$\text{Expected frequency} = \frac{(\text{row total})(\text{column total})}{\text{table total}}$$
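The row-total-times-column-total shortcut is the four-step multiplication approach collapsed into one expression. A short derivation, using the assumed notation R_i for the total of row i, C_j for the total of column j, and n for the table total (these symbols do not appear in the slides):

```latex
E_{ij}
  = n \cdot \underbrace{\frac{R_i}{n}}_{\text{row proportion}}
      \cdot \underbrace{\frac{C_j}{n}}_{\text{column proportion}}
  = \frac{R_i \, C_j}{n}
```

For the Men/Money cell of the poll data this gives 883 · 128 / 1776 ≈ 63.64.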

12-12 Test Statistic for the Test of Independence
Let O_i represent the observed count in the ith cell and E_i represent the expected count in the ith cell. Then

$$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i}$$

approximately follows the chi-square distribution with (r-1)(c-1) degrees of freedom, where r is the number of rows and c is the number of columns in the contingency table, provided that (1) all expected frequencies are greater than or equal to 1 and (2) no more than 20% of the expected frequencies are less than 5.
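A minimal sketch of the statistic, the degrees of freedom, and the two requirements, assuming the table is stored as a list of rows. The function names and the 2×2 table at the bottom are hypothetical and serve only to exercise the code.

```python
def expected_counts(observed):
    """Expected counts under independence: row total * column total / table total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def requirements_met(expected):
    """Both conditions: every E_i >= 1, and at most 20% of the E_i are below 5."""
    cells = [e for row in expected for e in row]
    share_below_5 = sum(e < 5 for e in cells) / len(cells)
    return min(cells) >= 1 and share_below_5 <= 0.20


def chi_square_statistic(observed, expected):
    """Sum of (O_i - E_i)^2 / E_i over every cell of the table."""
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


def degrees_of_freedom(observed):
    """(r - 1)(c - 1) for a table with r rows and c columns."""
    return (len(observed) - 1) * (len(observed[0]) - 1)


# Hypothetical 2x2 table, not taken from the slides.
table = [[20, 30], [30, 20]]
expected = expected_counts(table)
print(expected, requirements_met(expected))
print(chi_square_statistic(table, expected), degrees_of_freedom(table))
```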

12-13 Chi-Square Test for Independence
To test the association (or independence) of two variables in a contingency table:
Step 1: Determine the null and alternative hypotheses.
H0: The row variable and column variable are independent.
H1: The row variable and column variable are dependent.

12-14 Step 2: Choose a level of significance, α, depending on the seriousness of making a Type I error.

12-15 Step 3:
a) Calculate the expected frequencies (counts) for each cell in the contingency table.
b) Verify that the requirements for the chi-square test for independence are satisfied:
   1. All expected frequencies are greater than or equal to 1 (all E_i ≥ 1).
   2. No more than 20% of the expected frequencies are less than 5.

12-16 Step 3:
c) Compute the test statistic:

$$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i}$$

Note: O_i is the observed count and E_i is the expected count for the ith cell.

12-17 P-Value Approach
Step 4: Use Table VII to determine an approximate P-value by determining the area under the chi-square distribution with (r-1)(c-1) degrees of freedom to the right of the test statistic.

12-18 P-Value Approach
Step 5: If the P-value < α, reject the null hypothesis. If the P-value ≥ α, fail to reject the null hypothesis.

12-19 Step 6: State the conclusion in the context of the problem.
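Steps 4 and 5 can also be carried out numerically instead of with a printed table. The sketch below uses only the Python standard library and computes the right-tail area through a standard recurrence for the regularized upper incomplete gamma function. The function names and the example statistic are hypothetical, and a statistics library could be used in place of this hand-written routine.

```python
import math


def chi_square_right_tail(x, df):
    """Area to the right of x under the chi-square distribution with df degrees of freedom.

    With y = x / 2, the area equals Q(df / 2, y), built up with the recurrence
    Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1).
    """
    if x <= 0:
        return 1.0
    y = x / 2
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)             # starting value Q(1, y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))  # starting value Q(1/2, y)
    while a < df / 2:
        q += y ** a * math.exp(-y) / math.gamma(a + 1)
        a += 1
    return q


def decide(statistic, df, alpha):
    """Step 4 (P-value) and Step 5 (compare the P-value with alpha)."""
    p_value = chi_square_right_tail(statistic, df)
    return p_value, p_value < alpha


# Hypothetical statistic from a 2x2 table (1 degree of freedom).
p_value, reject = decide(4.0, df=1, alpha=0.05)
print(round(p_value, 4), "reject H0" if reject else "fail to reject H0")
```

Step 6 stays a matter of wording: the code returns a decision, and the conclusion still has to be stated in the context of the problem.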

12-20 Parallel Example 2: Performing a Chi-Square Test for Independence
In a poll, 883 males and 893 females were asked “If you could have only one of the following, which would you pick: money, health, or love?” Their responses are presented in the table below. Test the claim that gender and response are independent at the α = 0.05 level of significance.
Source: Based on a Fox News Poll conducted in January 1999

12-21 Solution
Step 1: We want to know whether gender and response are dependent or independent, so the hypotheses are:
H0: Gender and response are independent.
H1: Gender and response are dependent.
Step 2: The level of significance is α = 0.05.

12-22 Solution
Step 3: (a) The expected frequencies were computed in Example 1 and are given in parentheses in the table below, along with the observed frequencies.

         Money     Health     Love
Men      82 ( )    446 ( )    355 ( )
Women    46 ( )    574 ( )    273 ( )

12-23 Solution
Step 3: (b) Since none of the expected frequencies are less than 5, the requirements for the chi-square test for independence are satisfied.
(c) The test statistic is $\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i}$, summed over the six cells of the table.

12-24 Solution: P-Value Approach
Step 4: There are r = 2 rows and c = 3 columns, so we find the P-value using (2-1)(3-1) = 2 degrees of freedom. The P-value is the area under the chi-square distribution with 2 degrees of freedom to the right of the test statistic, which is approximately 0.

12-25 Solution: P-Value Approach
Step 5: Since the P-value is less than the level of significance α = 0.05, we reject the null hypothesis.

12-26 Solution
Step 6: There is sufficient evidence to conclude that gender and response are dependent at the α = 0.05 level of significance.
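The whole example can be checked with a short script. This is a sketch under stated assumptions: the variable names are illustrative, the expected counts are left unrounded, and the P-value uses the closed form exp(-x/2), which holds only for a chi-square distribution with exactly 2 degrees of freedom. With unrounded expected counts the statistic comes out near 36.8, so a value worked by hand from rounded counts may differ slightly.

```python
import math

# Observed counts; columns are Money, Health, Love.
observed = [
    [82, 446, 355],  # Men
    [46, 574, 273],  # Women
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Step 3a: expected counts under independence.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Step 3b: check the two requirements on the expected counts.
cells = [e for row in expected for e in row]
requirements_ok = min(cells) >= 1 and sum(e < 5 for e in cells) / len(cells) <= 0.20

# Step 3c: test statistic.
statistic = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Step 4: (r - 1)(c - 1) = 2 degrees of freedom, so the right-tail area is exp(-x / 2).
df = (len(observed) - 1) * (len(observed[0]) - 1)
p_value = math.exp(-statistic / 2)

# Step 5: decision at the 0.05 level of significance.
alpha = 0.05
print(f"chi-square = {statistic:.2f}, df = {df}, P-value = {p_value:.2e}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```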

12-27 To see the relation between response and gender, we draw bar graphs of the conditional distributions of response by gender. Recall that a conditional distribution lists the relative frequency of each category of a variable, given a specific value of the other variable in a contingency table.

12-28 Parallel Example 3: Constructing a Conditional Distribution and Bar Graph
Find the conditional distribution of response by gender for the data from the previous example, reproduced below.
Source: Based on a Fox News Poll conducted in January 1999

12-29 Solution
We first compute the conditional distribution of response by gender.

         Money               Health               Love
Men      82/883 ≈ 0.0929     446/883 ≈ 0.5051     355/883 ≈ 0.4020
Women    46/893 ≈ 0.0515     574/893 ≈ 0.6428     273/893 ≈ 0.3057

12-30 Solution
[Bar graph of the conditional distribution of response by gender]
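The conditional distribution divides each observed count by its row total, so each gender's proportions sum to 1. A minimal sketch, assuming the counts are stored in a nested dictionary and using a row of `#` characters as a rough text stand-in for the bar graph (the names and the 50-character scale are illustrative choices):

```python
# Observed counts from the poll (rows: gender, columns: response).
observed = {
    "Men": {"Money": 82, "Health": 446, "Love": 355},
    "Women": {"Money": 46, "Health": 574, "Love": 273},
}

# Relative frequency of each response, given gender.
conditional = {
    gender: {resp: count / sum(cells.values()) for resp, count in cells.items()}
    for gender, cells in observed.items()
}

# Text bar chart: one '#' per 2 percentage points.
for gender, dist in conditional.items():
    print(gender)
    for response, share in dist.items():
        bar = "#" * round(share * 50)
        print(f"  {response:<7}{share:.4f} {bar}")
```

Placing the two groups' bars next to each other makes the pattern behind the rejected null hypothesis visible: in this sample a larger share of women chose health, while larger shares of men chose money and love.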