Chi-Square Tests (Chapter 14): Chi-Square Test for Independence; Chi-Square Tests for Goodness-of-Fit. Copyright © 2010 by The McGraw-Hill.


1 Chi-Square Tests, Chapter 14. Chi-Square Test for Independence; Chi-Square Tests for Goodness-of-Fit. Copyright © 2010 by The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin

2 Chi-Square Test for Independence: Chi-Square Test
In a test of independence for an $r \times c$ contingency table, the hypotheses are
$H_0$: Variable A is independent of variable B
$H_1$: Variable A is not independent of variable B
Use the chi-square test for independence to test these hypotheses. This non-parametric test is based on frequencies. The n data pairs are classified into c columns and r rows, and then the observed frequency $f_{jk}$ is compared with the expected frequency $e_{jk}$.

3 Chi-Square Test for Independence: Chi-Square Distribution
The critical value comes from the chi-square probability distribution with $\nu$ degrees of freedom:
$\nu = (r - 1)(c - 1)$
where r = number of rows in the table and c = number of columns in the table.
Appendix E contains critical values for right-tail areas of the chi-square distribution.
The mean of a chi-square distribution is $\nu$, with variance $2\nu$.
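As a rough illustration of the degrees-of-freedom rule and of right-tail areas, the following Python sketch computes $\nu$ and a right-tail probability without a printed table. The function names are hypothetical, and the series used for the tail area is a standard incomplete-gamma expansion chosen here for illustration; it is not taken from the slides.

```python
import math

def degrees_of_freedom(r, c):
    """Degrees of freedom for an r x c contingency table."""
    return (r - 1) * (c - 1)

def chi2_right_tail(x, nu):
    """Right-tail area P(X > x) for a chi-square variable with nu d.f.

    Uses the series for the regularized lower incomplete gamma function
    P(a, z) with a = nu/2 and z = x/2; intended for moderate x only.
    """
    if x <= 0:
        return 1.0
    a, z = nu / 2.0, x / 2.0
    term = 1.0 / a          # first term of the series
    total = term
    n = 0
    while abs(term) > 1e-15 * abs(total):
        n += 1
        term *= z / (a + n)  # next term: multiply by z / (a + n)
        total += term
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return 1.0 - lower

nu = degrees_of_freedom(3, 4)           # a 3 x 4 table
tail = chi2_right_tail(5.991, 2)        # close to .05 for nu = 2
```

For $\nu = 2$ the tail area has the closed form $e^{-x/2}$, which gives a quick check on the series.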

4 Chi-Square Test for Independence: Expected Frequencies
Assuming that $H_0$ is true, the expected frequency of row j and column k is:
$e_{jk} = R_j C_k / n$
where $R_j$ = total for row j (j = 1, 2, …, r), $C_k$ = total for column k (k = 1, 2, …, c), and n = sample size.
Steps in Testing the Hypotheses
Step 1: State the Hypotheses
$H_0$: Variable A is independent of variable B
$H_1$: Variable A is not independent of variable B
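The expected-frequency formula $e_{jk} = R_j C_k / n$ can be sketched in a few lines of Python. The function name and the 2 x 3 table of observed counts are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def expected_frequencies(observed):
    """Return the table of e_jk = R_j * C_k / n for a table of counts."""
    row_totals = [sum(row) for row in observed]          # R_j
    col_totals = [sum(col) for col in zip(*observed)]    # C_k
    n = sum(row_totals)                                  # sample size
    return [[rj * ck / n for ck in col_totals] for rj in row_totals]

# Hypothetical observed frequencies f_jk (2 rows, 3 columns, n = 120)
observed = [[10, 20, 30],
            [20, 20, 20]]
expected = expected_frequencies(observed)
# Row totals are 60 and 60, column totals 30, 40, 50,
# so each row of expected counts is 15, 20, 25.
```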

5 Chi-Square Test for Independence: Steps in Testing the Hypotheses
Step 2: Specify the Decision Rule
Calculate $\nu = (r - 1)(c - 1)$. For a given $\alpha$, look up the right-tail critical value $\chi^2_R$ from Appendix E or by using Excel. Reject $H_0$ if the test statistic exceeds $\chi^2_R$.
Step 3: Calculate the Expected Frequencies
$e_{jk} = R_j C_k / n$

6 Chi-Square Test for Independence: Steps in Testing the Hypotheses
Step 4: Calculate the Test Statistic
The chi-square test statistic is
$\chi^2_{calc} = \sum_{j=1}^{r} \sum_{k=1}^{c} \frac{(f_{jk} - e_{jk})^2}{e_{jk}}$
Step 5: Make the Decision
Reject $H_0$ if $\chi^2_{calc} > \chi^2_R$ or if the p-value < $\alpha$.
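The five steps can be walked through on a small example. This is a minimal sketch, assuming a hypothetical 2 x 3 table of counts; the critical value 5.991 is the tabled right-tail value for $\nu = 2$ at $\alpha = .05$, and the p-value line uses the closed form $e^{-x/2}$ that holds only when $\nu = 2$.

```python
import math

# Hypothetical observed frequencies f_jk (r = 2 rows, c = 3 columns)
observed = [[10, 20, 30],
            [20, 20, 20]]

# Step 2: degrees of freedom and decision rule
r, c = len(observed), len(observed[0])
nu = (r - 1) * (c - 1)
alpha = 0.05
critical = 5.991  # tabled chi-square value for nu = 2, alpha = .05

# Step 3: expected frequencies e_jk = R_j * C_k / n
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)
expected = [[rj * ck / n for ck in col_totals] for rj in row_totals]

# Step 4: test statistic, summed over every cell
chi2_calc = sum(
    (f - e) ** 2 / e
    for f_row, e_row in zip(observed, expected)
    for f, e in zip(f_row, e_row)
)

# Step 5: decision (p-value formula below is valid only for nu = 2)
p_value = math.exp(-chi2_calc / 2)
reject = chi2_calc > critical
```

Here the statistic works out to 16/3, which is below 5.991, so independence is not rejected for these made-up counts; the p-value comparison leads to the same decision.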

7 Chi-Square Test for Independence: Small Expected Frequencies
The chi-square test is unreliable if the expected frequencies are too small.
Rules of thumb:
Cochran’s Rule requires that $e_{jk} > 5$ for all cells.
Alternatively, up to 20% of the cells may have $e_{jk} < 5$.
Most agree that a chi-square test is infeasible if $e_{jk} < 1$ in any cell. If this happens, try combining adjacent rows or columns to enlarge the expected frequencies.
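These rules of thumb are easy to check mechanically once the expected frequencies are known. The sketch below assumes a hypothetical helper name and made-up expected counts; it only reports the three conditions and leaves the judgment to the analyst.

```python
def small_expected_check(expected):
    """Summarize the rules of thumb for a table of expected counts."""
    cells = [e for row in expected for e in row]
    below_5 = sum(1 for e in cells if e < 5)
    return {
        "cochran_ok": all(e > 5 for e in cells),   # every e_jk > 5
        "share_below_5": below_5 / len(cells),     # compare with 20%
        "any_below_1": any(e < 1 for e in cells),  # test infeasible
    }

# Hypothetical expected frequencies with two small cells
report = small_expected_check([[0.8, 4.2],
                               [7.2, 37.8]])
```

For this table, half of the cells fall below 5 and one falls below 1, so combining adjacent rows or columns would be the suggested remedy.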

8 Chi-Square Test for Goodness-of-Fit
Why Do a Chi-Square Test on Numerical Data?
The researcher may believe there’s a relationship between X and Y, but doesn’t want to use regression.
There are outliers or anomalies that prevent us from assuming that the data came from a normal population.
The researcher has numerical data for one variable but not the other.
Purpose of the Test
The goodness-of-fit (GOF) test helps you decide whether your sample resembles a particular kind of population. The chi-square test will be used because it is versatile and easy to understand. Goodness-of-fit tests may lack power in small samples. As a guideline, a chi-square goodness-of-fit test should be avoided if n < 25.

9 Chi-Square Test for Goodness-of-Fit: Multinomial GOF Test
A multinomial distribution is defined by any k probabilities $\pi_1, \pi_2, \ldots, \pi_k$ that sum to unity.
For example, consider the following “official” proportions of M&M colors. The hypotheses are
$H_0$: $\pi_1 = .13$, $\pi_2 = .13$, $\pi_3 = .24$, $\pi_4 = .20$, $\pi_5 = .16$, $\pi_6 = .14$
$H_1$: At least one of the $\pi_j$ differs from the hypothesized value
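To make the multinomial test concrete, here is a minimal Python sketch using the six hypothesized proportions above. The observed counts are invented for illustration (they are not data from the slides), and the degrees of freedom are taken as c − 1 on the assumption that no parameters are estimated from the sample; 11.070 is the tabled right-tail critical value for 5 degrees of freedom at $\alpha = .05$.

```python
# Hypothesized proportions pi_1..pi_6 from H0
pi = [0.13, 0.13, 0.24, 0.20, 0.16, 0.14]
assert abs(sum(pi) - 1.0) < 1e-12  # probabilities must sum to unity

# Hypothetical observed counts f_j for a sample of n = 100 candies
observed = [10, 15, 25, 20, 18, 12]
n = sum(observed)

# Expected counts under H0: e_j = n * pi_j
expected = [n * p for p in pi]

# Chi-square statistic summed over the c classes
chi2_calc = sum((f - e) ** 2 / e for f, e in zip(observed, expected))

nu = len(pi) - 1       # c - 1, assuming no estimated parameters
critical = 11.070      # tabled value for nu = 5, alpha = .05
reject = chi2_calc > critical
```

With these made-up counts the statistic is roughly 1.58, far below the critical value, so the hypothesized proportions would not be rejected.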

10 Chi-Square Test for Goodness-of-Fit: Hypotheses for GOF
The hypotheses are:
$H_0$: The population follows a _____ distribution
$H_1$: The population does not follow a _____ distribution
The blank may contain the name of any theoretical distribution (e.g., uniform, Poisson, normal).

11 Chi-Square Test for Goodness-of-Fit: Test Statistic and Degrees of Freedom for GOF
Assuming n observations, the observations are grouped into c classes and then the chi-square test statistic is found using:
$\chi^2_{calc} = \sum_{j=1}^{c} \frac{(f_j - e_j)^2}{e_j}$
where $f_j$ = the observed frequency of observations in class j and $e_j$ = the expected frequency in class j if $H_0$ were true.

12 Uniform Goodness-of-Fit Test: Uniform Distribution
The uniform goodness-of-fit test is a special case of the multinomial in which every value has the same chance of occurrence. The chi-square test for a uniform distribution compares all c groups simultaneously.
The hypotheses are:
$H_0$: $\pi_1 = \pi_2 = \ldots = \pi_c = 1/c$
$H_1$: Not all $\pi_j$ are equal

13 Uniform Goodness-of-Fit Test: Grouped Data
The test can be performed on data that are already tabulated into groups.
Calculate the expected frequency $e_j$ for each cell.
The degrees of freedom are $\nu = c - 1$ since there are no parameters for the uniform distribution.
Obtain the critical value $\chi^2_\alpha$ from Appendix E for the desired level of significance $\alpha$.
The p-value can be obtained from Excel.
Reject $H_0$ if p-value < $\alpha$.

14 Uniform Goodness-of-Fit Test: Raw Data
First form c bins of equal width $(X_{max} - X_{min})/c$ and create a frequency distribution.
Calculate the observed frequency $f_j$ for each bin.
Define $e_j = n/c$ and perform the chi-square calculations.
The degrees of freedom are $\nu = c - 1$ since there are no parameters for the uniform distribution.
Obtain the critical value from Appendix E for a given significance level $\alpha$ and make the decision.
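The raw-data procedure can be sketched as follows. The function name and the 30 data values are hypothetical; the binning convention (the maximum value is placed in the last bin) is one reasonable choice rather than something the slides specify, and 9.488 is the tabled critical value for 4 degrees of freedom at $\alpha = .05$.

```python
def uniform_gof(data, c):
    """Chi-square uniform GOF on raw data using c equal-width bins."""
    x_min, x_max = min(data), max(data)
    width = (x_max - x_min) / c
    observed = [0] * c
    for x in data:
        # Bin index; the maximum value is assigned to the last bin
        j = min(int((x - x_min) / width), c - 1)
        observed[j] += 1
    e = len(data) / c                      # e_j = n / c for every bin
    chi2_calc = sum((f - e) ** 2 / e for f in observed)
    return observed, e, chi2_calc, c - 1   # last item is nu = c - 1

# Hypothetical sample of n = 30 values between 0 and 10
data = [0.0, 0.3, 0.6, 0.9, 1.2, 1.5, 1.7, 1.9,
        2.2, 2.7, 3.1, 3.5, 3.8,
        4.1, 4.4, 4.9, 5.2, 5.5, 5.9,
        6.3, 6.8, 7.2, 7.7,
        8.1, 8.4, 8.8, 9.1, 9.5, 9.9, 10.0]

observed, e, chi2_calc, nu = uniform_gof(data, c=5)
critical = 9.488                           # tabled value, nu = 4, alpha = .05
reject = chi2_calc > critical
```

The observed counts come out as 8, 5, 6, 4, 7 against an expected 6 per bin, giving a statistic of 10/6, so uniformity is not rejected for this invented sample.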

15 Uniform Goodness-of-Fit Test: Raw Data
Calculate the mean and standard deviation of the uniform distribution as:
$\mu = (a + b)/2$
$\sigma = \sqrt{[(b - a + 1)^2 - 1]/12}$
If the data are not skewed and the sample size is large (n > 30), then the mean is approximately normally distributed. So, test the hypothesized uniform mean using
$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
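A short Python sketch of this mean test follows. It assumes a uniform distribution on the integers a through b (which is what the "+ 1" in the standard deviation formula implies) and the usual large-sample z statistic for a mean; the function names, the die-roll range 1 to 6, and the sample values (mean 3.9 from n = 36) are all hypothetical.

```python
import math

def discrete_uniform_mean_sd(a, b):
    """Mean and standard deviation of a uniform distribution on a..b."""
    mu = (a + b) / 2
    sigma = math.sqrt(((b - a + 1) ** 2 - 1) / 12)
    return mu, sigma

def z_for_uniform_mean(xbar, n, a, b):
    """z statistic comparing a sample mean with the uniform mean."""
    mu, sigma = discrete_uniform_mean_sd(a, b)
    return (xbar - mu) / (sigma / math.sqrt(n))

# Hypothetical example: 36 rolls of a die with sample mean 3.9
mu, sigma = discrete_uniform_mean_sd(1, 6)
z = z_for_uniform_mean(xbar=3.9, n=36, a=1, b=6)
```

The resulting z can then be compared with a standard normal critical value, such as 1.96 for a two-tailed test at $\alpha = .05$.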

