Inference for Tables: Chi-Square Tests
Chi-Square Test Basics
–Used to test the counts of categorical data
–Formula for test statistic: $\chi^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$
–Conditions: the data come from a random sample/event, and all individual expected counts are at least 5 (sample size is large enough).
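As a minimal sketch of the test statistic and the expected-count condition, the Python below sums (observed – expected)² / expected over all cells and checks that every expected count is at least 5. The function names and the counts are illustrative assumptions, not taken from the slides.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def expected_counts_ok(expected, minimum=5):
    """Large-sample condition: every expected count is at least `minimum`."""
    return all(e >= minimum for e in expected)


# Hypothetical counts for illustration only (not data from the slides).
observed = [18, 22, 20]
expected = [20, 20, 20]

statistic = chi_square_statistic(observed, expected)
condition_met = expected_counts_ok(expected)
print(statistic, condition_met)
```

The condition is checked on the expected counts, not the observed counts, which is why the slides ask for the expected table to be shown.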
χ² distribution facts
–Different df have different curves
–Skewed right
–Only positive values
–As df increases, the curve becomes more like a normal curve
[Figures: chi-square density curves for df = 1, 2, 3, 4, 5, and 8]
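The distribution facts can be checked numerically with a small sketch. It relies on standard results for a chi-square distribution with df degrees of freedom (mean df, variance 2·df, skewness √(8/df)); the helper name `chi_square_shape` is an assumption made for illustration.

```python
import math


def chi_square_shape(df):
    """Mean, standard deviation, and skewness of a chi-square distribution."""
    return {
        "mean": df,
        "sd": math.sqrt(2 * df),
        "skewness": math.sqrt(8 / df),
    }


# Degrees of freedom matching the curves shown in the slides.
skewness_by_df = {}
for df in (1, 2, 3, 4, 5, 8):
    shape = chi_square_shape(df)
    skewness_by_df[df] = shape["skewness"]
    print(df, shape["mean"], round(shape["sd"], 3), round(shape["skewness"], 3))
```

The skewness is always positive (skewed right) and shrinks toward 0 as df grows, which is the sense in which the curve becomes more like a normal curve.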
Chi-square test: three types
–Goodness of fit
–Homogeneity
–Independence
Chi-square test: 4 steps
–Write hypotheses
–Check conditions (show expected table values)
–Find test statistic and p-value
–Decision and conclusion
Chi-Square Goodness of Fit
The chi-square (χ²) test for goodness of fit allows the observer to test whether a sample distribution is different from the hypothesized population distribution. We want to see how well the observed counts "fit" what we expect the counts to be.
H₀: the sample distribution is the same as the expected distribution
Hₐ: the sample distribution is different from the expected distribution
df = n – 1, where n is the number of categories, not the sample size!
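A rough sketch of the goodness-of-fit calculation: expected counts come from multiplying the sample size by each hypothesized proportion, and the degrees of freedom come from the number of categories. The function name `goodness_of_fit` and the counts are made up for illustration.

```python
def goodness_of_fit(observed, hypothesized_proportions):
    """Return (expected counts, chi-square statistic, degrees of freedom)."""
    sample_size = sum(observed)
    expected = [sample_size * p for p in hypothesized_proportions]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1  # number of categories minus 1, not sample size minus 1
    return expected, statistic, df


# Hypothetical example: 4 categories assumed equally likely, 100 observations.
observed = [30, 20, 25, 25]
proportions = [0.25, 0.25, 0.25, 0.25]

expected, statistic, df = goodness_of_fit(observed, proportions)
print(expected, statistic, df)
```

Here the sample size is 100 but df is 3, because df depends only on the 4 categories.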
Most computer software companies provide a telephone number that customers can use to obtain technical assistance. A particular manufacturer recently changed from a toll number to a free 800 number. The distribution for the lengths of calls (in seconds) when the number was a toll call is given below. After installing the 800 number, a random sample of 300 calls had the following durations. Is there sufficient evidence to conclude that a change has occurred in the distribution of the lengths of calls now that the number is toll free? Test at the 0.02 level.
Chi-square goodness of fit
H₀: the distribution of call lengths for the free number is the same as expected (the same as for the toll number)
Hₐ: the distribution of call lengths for the free number is different from expected (different from the toll number)
Conditions: given a random sample of 300 calls. [Table: expected values] The sample is large enough since all expected counts are greater than 5.
Formula: $\chi^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$, with df = number of categories – 1 = 4 (no need to rewrite the tables again).
Since p-value < α = 0.02, I reject H₀ at the 2% significance level. There is enough evidence to believe the distribution of toll-free call lengths is different from that of the toll calls. [Figure: chi-square table, df = 4]
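The p-value and decision step can be sketched without a printed table. The code below is one possible standard-library approach, assuming a series expansion of the regularized incomplete gamma function for the upper-tail probability; in practice a statistics package or the chi-square table does the same job. The statistic value 13.0 is hypothetical, since the slide's computed value is not shown in the text.

```python
import math


def chi_square_p_value(statistic, df):
    """Upper-tail probability P(X >= statistic) for a chi-square with `df`."""
    if statistic <= 0:
        return 1.0
    a = df / 2.0
    z = statistic / 2.0
    # Series for the lower regularized incomplete gamma function P(a, z).
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-15 * total:
        term *= z / (a + n)
        total += term
        n += 1
    lower_tail = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower_tail)


alpha = 0.02        # significance level from the example
df = 4              # number of categories minus 1
statistic = 13.0    # hypothetical value, not the slide's result

p_value = chi_square_p_value(statistic, df)
reject_null = p_value < alpha
print(p_value, reject_null)
```

The decision rule is the comparison on the last computed line: reject H₀ only when the p-value falls below the chosen α.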
Chi-Square Test of Independence
The chi-square (χ²) test for independence/association tests whether the variables in a two-way table are related (single random sample). It is used to see if two categorical variables in one population are associated or not associated (independent).
H₀: the variables are independent
Hₐ: the variables are dependent
df = (rows – 1)(columns – 1)
A study compared noncombat mortality rates for U.S. military personnel who were deployed in combat situations to those not deployed. The results of a random sample of 1580 military personnel are given in a two-way table with rows Deployed and Not deployed and columns for cause of death: Unintentional Injury, Illness, Homicide or Suicide. At the 0.05 significance level, test the claim that the cause of a noncombat death is independent of whether the military person was deployed in a combat zone.
Chi-square test of independence
H₀: Deployment status and cause of death are independent
Hₐ: Deployment status and cause of death are dependent
Given a random sample of 1580 military personnel.
Expected count formula: expected count = (row total × column total) / table total
[Tables: observed and expected counts, each with rows Deployed and Not deployed and columns Unintentional Injury, Illness, Homicide or Suicide]
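The expected count for each cell of a two-way table is (row total × column total) / table total. A minimal sketch follows, using a made-up 2×3 table rather than the study's counts; the function name `expected_counts` is an illustrative assumption.

```python
def expected_counts(table):
    """Expected count per cell: row total * column total / table total."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    table_total = sum(row_totals)
    return [[r * c / table_total for c in column_totals] for r in row_totals]


# Hypothetical 2x3 table of observed counts (not the study's data).
observed = [
    [20, 30, 50],
    [30, 30, 40],
]

expected = expected_counts(observed)
for row in expected:
    print(row)
```

Because both hypothetical rows have the same total, their expected rows are identical; with unequal row totals the expected rows would be proportional instead.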
Conditions: given a random sample of 1580 military personnel. [Table: expected values, with rows Deployed and Not deployed and columns Unintentional Injury, Illness, Homicide or Suicide] The sample is large enough since all expected counts are greater than 5.
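Formula: $\chi^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$, with df = (2 – 1)(3 – 1) = 2. Since p-value < α = 0.05, I reject H₀ at the 5% significance level. There is enough evidence to believe that cause of death depends on deployment status.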
[Figure: chi-square table, df = 2]
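The steps of an independence test on a 2×3 table can be sketched end to end. The counts below are hypothetical (the study's table is not reproduced in the text), and the p-value line assumes df = 2, where the upper-tail probability of the chi-square distribution has the closed form exp(–x/2).

```python
import math


def independence_test(table):
    """Return (chi-square statistic, degrees of freedom) for a two-way table."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    table_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * column_totals[j] / table_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Hypothetical 2x3 table: rows are two groups, columns are three categories.
observed = [
    [30, 10, 10],
    [20, 20, 10],
]

statistic, df = independence_test(observed)
# Closed form valid only for df = 2: P(X >= x) = exp(-x / 2).
p_value = math.exp(-statistic / 2)
reject_null = p_value < 0.05
print(statistic, df, p_value, reject_null)
```

With these made-up counts the p-value is above 0.05, so this sketch ends in "fail to reject H₀", the opposite decision from the slide's example.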
Chi-Square Test of Homogeneity
The chi-square (χ²) test for homogeneity allows the observer to test whether the populations within a two-way table are the same (multiple random samples). It is used to see if two or more populations are the same (homogeneous).
H₀: all the populations are the same for the given variable
Hₐ: the populations are not all the same for the given variable
df = (rows – 1)(columns – 1)
In the past, a number of professions were prohibited from advertising. In 1977, the U.S. Supreme Court ruled that prohibiting doctors and lawyers from advertising violated their right to free speech. The article “Should Dentists Advertise?” compared the attitudes of consumers and dentists toward the advertising of dental services. Separate random samples of 101 consumers and 124 dentists were asked to respond to the following statement “I favor the use of advertising by dentists to attract new patients.” The data is presented below: The authors of the article were interested in determining whether the two groups differed in their attitudes toward advertising.
Chi-square test of homogeneity
H₀: the opinions of consumers and dentists are the same
Hₐ: the opinions of consumers and dentists are different
Conditions: the table is from two independent random samples. [Table: expected values] The sample is large enough since all expected counts are greater than 5. It is safe to use the chi-square procedures.
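Formula: $\chi^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$, with df = (2 – 1)(5 – 1) = 4. Since p-value < α = 0.05, I reject H₀ at the 5% significance level. There is enough evidence to believe the opinions of consumers are different from the opinions of dentists.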
[Figure: chi-square table, df = 4]
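For a homogeneity test each row is a separate sample from its own population, so it helps to compare the row-wise response proportions before computing the statistic. The sketch below uses invented counts for two samples over five response levels (not the article's data) and assumes df = 4, where the upper-tail probability has the closed form exp(–x/2)(1 + x/2).

```python
import math

# Hypothetical counts: one row per population, five response levels each.
samples = {
    "consumers": [30, 30, 20, 10, 10],
    "dentists": [10, 20, 20, 20, 30],
}

table = list(samples.values())
row_totals = [sum(row) for row in table]
column_totals = [sum(column) for column in zip(*table)]
table_total = sum(row_totals)

# Response distribution within each population (row proportions).
proportions = [[count / total for count in row]
               for row, total in zip(table, row_totals)]

statistic = 0.0
for i, row in enumerate(table):
    for j, observed in enumerate(row):
        expected = row_totals[i] * column_totals[j] / table_total
        statistic += (observed - expected) ** 2 / expected

df = (len(table) - 1) * (len(table[0]) - 1)
# Closed form valid only for df = 4: P(X >= x) = exp(-x / 2) * (1 + x / 2).
p_value = math.exp(-statistic / 2) * (1 + statistic / 2)
reject_null = p_value < 0.05
print(proportions, statistic, df, p_value, reject_null)
```

The arithmetic is the same as for the independence test; what differs is the sampling design (separate samples per population) and therefore the wording of the hypotheses and conclusion.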
Summary of All Chi-Square Tests
–Chi-square test for goodness of fit: comparing the distribution of a variable to what is expected (given % or equally likely)
–Chi-square test of association/independence: comparing two variables within the same population
–Chi-square test for homogeneity of populations: comparing a single variable across multiple populations