Chapter 10 Associations between Categorical Variables
Chapter 10 Topics: Explore associations between categorical variables
Section 10.1: The basic ingredients for testing with categorical variables
Identify the Basic Ingredients for Testing with Categorical Variables
Introductory Example: Fair Die Suppose we wanted to determine if a standard 6-sided die was fair. In a perfect world, if the die was fair, the distribution of outcomes would look like this:
Introductory Example: Fair Die We roll a die 60 times and record the number of spots. The outcomes are shown in the table and graph below.
Introductory Example: Fair Die We can see that our outcomes were not exactly what we would expect in a perfect world. We will use a statistic, called the chi-square statistic, to compare the real outcomes with the expected outcomes. We use the chi-square distribution to find p-values that tell us whether we should be suspicious that our outcomes are not matching our expectations.
Contingency Table (Two-Way Table)
A summary table that displays frequencies for outcomes when two categorical variables are analyzed. Even though there are numbers in the table, these numbers are summaries of variables whose values are categories.
Example: Contingency Table
A sample of US adults and members of the American Association for the Advancement of Science (AAAS) were asked by the Pew Poll, “Is it safe to eat genetically modified foods?” The results are shown in the contingency table. (We assumed the sample size was 100 for each group.)
       US Adults   AAAS Scientists
Yes    37          88
No     63          12
Expected Counts The expected counts are the numbers of observations we would see in each cell of the contingency table if the null hypothesis were true.
Example: Expected Counts In our previous example of the fair die, if the die is rolled 60 times, we would expect 10 of each outcome. The table below shows the expected counts and the observed counts from the experiment.
Example: Finding Expected Counts
In the Pew Research poll, there are two categorical variables: Background (US Adult or scientist) and Belief in Safety of GMO (Yes/No). What counts should we expect if these variables are truly not related to each other?
       US Adults   AAAS Scientists
Yes    37          88
No     63          12
Example: Finding Expected Counts Starting with the GMO (Yes/No) variable: 125/200 (0.625) of the sample said “Yes” and 75/200 (0.375) of the sample said “No.” If GMO (Yes/No) is independent of background, then we should expect the same percentage of US adults and AAAS scientists to say “Yes” and the same percentage to say “No.”
Example: Finding Expected Counts There were 100 US Adults in the sample, so we expect 0.625(100) = 62.5 to say “Yes” and 0.375(100) = 37.5 to say “No.” Since the sample size of AAAS Scientists in the survey is the same as that of the US adults (100 in each group), we expect the same numbers of AAAS Scientists to say “Yes” and “No.”
Contingency Table Showing Observed and Expected Counts
The table below shows the actual counts and the expected counts (in parentheses):
         US Adults    AAAS Scientists   Total
Yes      37 (62.5)    88 (62.5)         125
No       63 (37.5)    12 (37.5)         75
Total    100          100               200
Notes about Expected Counts In this example we started with the GMO variable. We could also have started with the background variable and computed the expected counts using the background percentages. For example, 50% of the sample were AAAS Scientists, so we would expect 50% of those saying “Yes” to be scientists. If the expected counts are computed this way, the results are exactly the same. The formula $E = \frac{(\text{row total})(\text{column total})}{\text{table total}}$ can also be used to compute the expected counts, but in practice the expected counts are computed using technology.
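The expected-count reasoning above can be sketched in a few lines of Python. This is only an illustration, not the technology the chapter uses. The function name `expected_counts` is made up for this sketch, and it assumes the usual rule that each expected count is the row total times the column total divided by the table total.

```python
# Observed counts from the Pew example.
# Rows: Yes, No. Columns: US Adults, AAAS Scientists.
observed = [[37, 88],
            [63, 12]]

def expected_counts(table):
    """Expected count per cell = (row total * column total) / table total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    table_total = sum(row_totals)
    return [[r * c / table_total for c in col_totals] for r in row_totals]

expected = expected_counts(observed)
print(expected)  # [[62.5, 62.5], [37.5, 37.5]]
```

The result matches the expected counts of 62.5 and 37.5 found by hand.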
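The Chi-Square Statistic The chi-square statistic measures the amount that our expected counts differ from our observed counts. The formula for the chi-square statistic is $X^2 = \sum \frac{(O - E)^2}{E}$, where O is the observed count in each cell, E is the expected count in each cell, and $\sum$ means to add the results in each cell.
Example: Chi-Square Statistic In our GMO example, $X^2 = \frac{(37 - 62.5)^2}{62.5} + \frac{(88 - 62.5)^2}{62.5} + \frac{(63 - 37.5)^2}{37.5} + \frac{(12 - 37.5)^2}{37.5} = 10.404 + 10.404 + 17.34 + 17.34 = 55.488$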
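As a rough check on the arithmetic, the chi-square statistic for the GMO example can be computed directly from the observed and expected counts. This is a minimal sketch in which the four cells are listed in the order Yes/US Adults, Yes/Scientists, No/US Adults, No/Scientists.

```python
# Observed and expected counts for the four cells of the GMO table.
observed = [37, 88, 63, 12]
expected = [62.5, 62.5, 37.5, 37.5]

# Add (O - E)^2 / E over every cell.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(round(chi_square, 3))  # 55.488
```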
Finding the P-Value for the Chi-Square Statistic The p-value is found using the chi-square distribution. The chi-square distribution has only positive values for test statistics and is right skewed. Like the t-distribution, the shape of the chi-square distribution depends on a number called the degrees of freedom.
The Chi-Square Distribution The chi-square distribution provides a good approximation to the sampling distribution of the chi-square statistic only if the sample size is large enough (if each expected count is five or higher). In practice, technology is used to compute the chi-square statistic and the accompanying p-value.
GMO Example: Conclusion
Chi-square statistic: 55.488
p-value: < 0.0001
US Adults and AAAS scientists differ in their support of GMO foods.
Section 10.2: The chi-square test for goodness of fit
Use the Chi-Square Test for Goodness of Fit to Determine Whether the Distribution of a Categorical Variable Follows a Proposed Distribution
Goodness of Fit Test
Used to determine whether the distribution of a categorical variable follows a proposed distribution.
Can be applied when you are comparing the distribution of counts of one categorical variable with a distribution of expected counts.
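For a table with two rows and two columns, the degrees of freedom are (2 – 1)(2 – 1) = 1. In that case the p-value can be sketched with the standard library alone. The helper name `chi2_p_value_df1` is made up for this illustration, and it relies on the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal variable. It is not valid for other degrees of freedom.

```python
import math

def chi2_p_value_df1(chi_square):
    """Upper-tail area of the chi-square distribution with 1 degree of freedom.

    Uses P(Z^2 > x) = erfc(sqrt(x / 2)) for a standard normal Z.
    """
    return math.erfc(math.sqrt(chi_square / 2))

p_value = chi2_p_value_df1(55.488)
print(p_value < 0.0001)  # True
```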
Goodness of Fit Test
1. Hypothesize
H0: The population distribution of the variable is the same as the proposed distribution.
Ha: The distributions are different.
2. Prepare
Random sample
Independent measurements
Large sample: The expected count is at least 5 in each cell.
3. Compute to Compare
Use technology to find the chi-square statistic and p-value.
4. Interpret
Reject H0 if the p-value is less than the significance level.
Example: Goodness of Fit – Toss a Die A 6-sided die was tossed 180 times and the data is shown in the table below. Test the hypothesis that the die is not fair. Use a significance level of 0.05. #dots 1 2 3 4 5 6 Freq 23 42 33 36
Example: Goodness of Fit – Toss a Die
Hypothesize
H0: The die is fair (there will be about the same number of tosses resulting in 1, 2, 3, 4, 5, and 6).
Ha: The die is not fair (the distribution of tosses differs from what we expect if the die is fair).
Prepare
If the die is fair and we toss it 180 times, we would expect about 180/6 = 30 of each number of spots to come up. Our expected counts are all greater than the required minimum expected count of 5.
Example: Goodness of Fit – Toss a Die Compute to Compare Use technology to generate a test statistic and p-value. StatCrunch: Enter the data in a table. Stat > Goodness of fit > Chi-square test
Test statistic: Chi-square = 11.2 p-value: 0.0476
Goodness of Fit: Tossing a Die Interpret Since the p-value is less than the significance level, reject H0. The distribution of tosses is not what we would expect if the die were fair, suggesting the die is not fair.
Chi-Square Goodness of Fit Test Using a TI-84 Calculator
To run a chi-square goodness of fit test on the TI-84 calculator:
1. Enter your data, using one list for the observed values and one list for the expected values.
2. Push STAT > TESTS, then select the χ²GOF-Test option.
3. Make sure the Observed list is the one in which you have the observed counts and the Expected list is the one in which you have the expected counts.
4. For df, enter the number of categories – 1.
5. Select Calculate. The test statistic and p-value will be displayed.
Example: Jury Selection
The ethnic breakdown in a certain community is 60% white, 20% Hispanic, 10% Asian-American, and 10% African-American. Suppose 200 people are called for jury duty and the ethnic breakdown of the potential jurors is shown in the table below. Does the ethnic distribution of the jury pool match that of the community? Use a significance level of 0.05.
Ethnicity   White   Hispanic   Asian-American   African-American
Freq        137     23         18               22
Example: Jury Selection Hypothesize H0: The ethnic distribution of the jury pool matches that of the community. Ha: The ethnic distribution of the jury pool differs from that of the community.
Example: Jury Selection Prepare To find the expected counts, multiply the % of each ethnic group by the sample size: White: 0.60(200)=120, Hispanic: 0.20(200)=40, Asian-American: 0.10(200)=20, African-American: 0.10(200)=20. All expected counts are greater than the required minimum of 5.
Test statistic: Chi-square = 10.03 p-value: 0.0183
Example: Jury Selection
Compute to Compare
Test statistic: Chi-square = 10.033
p-value: 0.0183
Since the p-value is less than the significance level of 0.05, reject H0.
Interpret
The ethnic distribution of the jury pool does not match that of the community.
Section 10.3: Chi-square tests for associations between categorical variables
Use the Chi-Square Test to Determine Whether there is an Association between Two Categorical Variables
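The jury selection test can also be reproduced in Python as a rough cross-check on the technology output. This is a sketch using only the standard library. The helper `chi2_sf` is written for this illustration (it is not a library function) and computes the upper-tail area of the chi-square distribution for a whole-number degrees of freedom.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square distribution with df degrees of freedom."""
    half = x / 2
    if df % 2 == 0:
        # Even df: exp(-x/2) times a finite sum of (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: start from the df = 1 tail area, then add one term per two extra degrees of freedom.
    total = math.erfc(math.sqrt(half))
    for i in range((df - 1) // 2):
        total += math.exp(-half) * half ** (i + 0.5) / math.gamma(i + 1.5)
    return total

proportions = [0.60, 0.20, 0.10, 0.10]   # community breakdown
observed = [137, 23, 18, 22]             # jury pool counts
n = sum(observed)
expected = [p * n for p in proportions]  # about 120, 40, 20, 20

chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1                   # number of categories - 1
p_value = chi2_sf(chi_square, df)
print(round(chi_square, 3), round(p_value, 4))
```

The printed values should be close to the test statistic of 10.033 and the p-value of 0.0183 reported for this example.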
Two Tests for Association There are two tests to determine whether two categorical variables are associated. Which test you use depends on how the data were collected. Both methods use two-way tables to display data. Both are conducted in similar ways.
Test of Homogeneity Collect two (or more) independent samples, one from each population. Each object sampled has a categorical value that is recorded.
Test of Homogeneity Example: Collect a random sample of men and a random sample of women. Ask each person sampled if they agree that global warming is a serious problem. In this example we have two samples, one categorical response variable (opinion), and one categorical grouping variable (gender).
Test of Independence Collect only one sample. For objects in the sample we record two categorical response variables. Example: Collect a large sample of people and record their marital status and income.
Similarities in the Two Approaches In both situations we are interested in knowing whether the two categorical variables are related or unrelated. Both tests use the same chi-square test statistic and the same chi-square distribution to find the p-value.
Example: Homogeneity or Independence? A polling organization asks a random sample of people for their party affiliation (Democrat, Republican, or other) and whether they think the minimum wage should be raised. If the organization wanted to test whether party affiliation and opinion on minimum wage are associated, would this be a test of homogeneity or independence?
Example: Homogeneity or Independence? This is a test of independence because only one sample was collected and two categorical variables (party affiliation and opinion on minimum wage) were recorded for each member of the sample.
Example: Homogeneity or Independence? In 2013 the Pew Organization surveyed adults in eight countries that had legalized same-sex marriage, asking the question, “Should homosexuality be accepted?” If the organization wanted to investigate whether country of origin and opinion are associated, would this be a test of homogeneity or independence?
Example: Homogeneity or Independence? This is an example of a test of homogeneity. Eight independent samples are collected (a sample from each of eight countries) and a single categorical variable is recorded for each member of the sample (the response to the question “Should homosexuality be accepted?”).
Tests of Homogeneity and Independence Hypothesize H0: There is no association between the two variables (the variables are independent). Ha: There is an association between the two variables (the variables are not independent).
Tests of Homogeneity and Independence Prepare Random samples Independent samples and observations Large samples: The expected counts must be five or more in each cell of the table.
Tests of Homogeneity and Independence Compute to Compare The test statistic is $X^2 = \sum \frac{(O - E)^2}{E}$. Degrees of freedom = (#rows – 1)(#columns – 1). The p-value comes from the $\chi^2$ distribution. Technology can be used to compute the test statistic and p-value.
Tests of Homogeneity and Independence Interpret If the p-value is less than or equal to the significance level, we reject H0 and conclude there is an association between the variables. Otherwise we do not reject H0 and we cannot conclude there is an association between the variables.
Example: Republican Views on Global Warming The Yale Project on Climate Change investigated views on global warming among the Republican Party. Republicans surveyed identified themselves as Liberal, Moderate, Conservative, or Tea Party Republicans and also answered the question, “Do you believe global warming is happening?” The results are shown in the following two-way table.
Example: Republican Views on Global Warming Is this a test of homogeneity or independence? Run a test to see if there is an association between type of Republican and opinion. The report on this survey is titled “Not All Republicans Think Alike about Global Warming.” Do the results of the hypothesis test support this headline?
“Do you believe global warming is happening?”
Type of Republican
       Liberal   Moderate   Conservative   Tea Party
Yes    72        335        483            120
No     34        205        788            292
1. Hypothesize
H0: Republican type and opinion are independent.
Ha: Republican type and opinion are not independent.
2. Prepare
Samples are random and independent. Check on the technology output that all expected counts are greater than or equal to 5.
StatCrunch: Stat > Tables > Contingency > With Summary
All expected counts in the technology output are greater than or equal to 5.
3. Compute to Compare
Test statistic: $X^2$ = 151.59
p-value: < 0.0001
4. Interpret
Reject H0. There is an association between type of Republican and opinion.
Example: Republican Views on Global Warming This was a test of independence. There is an association between Republican type and opinion. The study supports the headline, “Not all Republicans Think Alike about Global Warming.”
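The Prepare and Compute steps for the Republican example can be sketched in Python. This illustration assumes the usual expected-count rule (row total times column total divided by table total). It checks the large-sample condition and computes the test statistic and degrees of freedom, leaving the p-value to technology.

```python
# Rows: Yes, No. Columns: Liberal, Moderate, Conservative, Tea Party.
observed = [[72, 335, 483, 120],
            [34, 205, 788, 292]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
table_total = sum(row_totals)

# Expected count for each cell if type and opinion were independent.
expected = [[r * c / table_total for c in col_totals] for r in row_totals]

# Large-sample condition: every expected count is at least 5.
all_large = all(e >= 5 for row in expected for e in row)

# Add (O - E)^2 / E over every cell of the table.
chi_square = sum((o - e) ** 2 / e
                 for obs_row, exp_row in zip(observed, expected)
                 for o, e in zip(obs_row, exp_row))

df = (len(observed) - 1) * (len(observed[0]) - 1)  # (#rows - 1)(#columns - 1)
print(all_large, round(chi_square, 2), df)
```

The statistic should come out close to the reported value of 151.59, with 3 degrees of freedom.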
To Run a Test of Independence Using a TI-84 Calculator
To run a test of independence on the TI-84 calculator:
1. Enter your data into a Matrix.
2. Push STAT > TESTS, then select the χ²-Test option.
3. Select Calculate. The $X^2$ test statistic and p-value will be displayed.
Example: Education and Marital Status Does a person’s educational level affect his or her decision about marrying? A sample of 665 people was taken. Their marital status and educational level were recorded. The data is shown in the table. Are the variables marital status and educational level independent?
The Data: Education and Marital Status
            College or higher   HS    Less HS
Divorced    15                  59    10
Married     98                  240   70
Single      27                  68    17
Widow/er    3                   30    28
Example: Education and Marital Status Hypothesize H0: Marital status and educational level are independent. Ha: Marital status and educational level are associated. Prepare We use technology to compute the test statistic, p-value, and expected counts. We need to check that the expected counts are all five or more.
All expected counts (in parentheses) are greater than or equal to 5.
Compute to Compare
Test statistic: $X^2$ = 39.97
p-value: < 0.0001
Example: Education and Marital Status Interpret We reject H0 because the p-value is small. Marital status and educational level are associated.
Drawback of the Chi-Square Test The chi-square test reveals only if two variables are associated, not how they are associated. When both categorical variables have only two categories, the data can be analyzed using a two-proportion z-test instead, which gives more information on how the variables are associated.
Example: AIDS Vaccine
In a study of a potential AIDS vaccine, 8200 volunteers were randomly assigned to receive a vaccine against AIDS and another 8200 to receive a placebo. The number in each group who had contracted AIDS at the end of 3 years was recorded. The data is shown in the following table.
           Vaccine   No Vaccine   Total
AIDS       51        74           125
No AIDS    8149      8126         16275
Total      8200      8200         16400
Example: AIDS Vaccine A Chi-square test could be used to determine if vaccine and AIDS are independent. The conclusion of this test could tell us there is an association between the variables but not how they are associated. What the researchers want to know is if the vaccine is effective.
Example: AIDS Vaccine
Because both categorical variables Vaccine and AIDS have only 2 outcomes (yes/no), the data can be analyzed using a two-proportion z-test. By testing the hypotheses
H0: $\text{prop}_{\text{vaccine}} = \text{prop}_{\text{placebo}}$
Ha: $\text{prop}_{\text{vaccine}} < \text{prop}_{\text{placebo}}$
the researchers can determine the direction of the effect; in other words, whether the vaccine was effective.
Section 10.4: Hypothesis tests when sample sizes are small
Identify Two Methods for Dealing with Data when the Expected Counts are Less than Five: Combining Categories and Fisher’s Exact Test
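A one-sided two-proportion z-test on the vaccine data could look roughly like the sketch below. The slides do not give the computation, so the details are assumptions: it uses the common pooled standard error and the left tail of the standard normal distribution, because Ha says the vaccine proportion is smaller. The variable names are chosen for this illustration.

```python
import math
from statistics import NormalDist

# Counts from the vaccine table.
x_vaccine, n_vaccine = 51, 8200
x_placebo, n_placebo = 74, 8200

prop_vaccine = x_vaccine / n_vaccine
prop_placebo = x_placebo / n_placebo

# Pooled proportion under H0: prop_vaccine = prop_placebo (an assumption of this sketch).
pooled = (x_vaccine + x_placebo) / (n_vaccine + n_placebo)
std_error = math.sqrt(pooled * (1 - pooled) * (1 / n_vaccine + 1 / n_placebo))

z = (prop_vaccine - prop_placebo) / std_error

# Left-tail area, because Ha is prop_vaccine < prop_placebo.
p_value = NormalDist().cdf(z)
print(round(z, 2), round(p_value, 3))
```

Unlike the chi-square statistic, the sign of z shows the direction of the difference: a negative z means the vaccine group had the smaller proportion.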
Example: Swine Flu Hospitalization The table contains data on the age and hospitalization (yes/no) of a sample of swine flu victims in the US. Test the hypothesis that hospitalization for swine flu is associated with age category. Age Category Hospital? Under 5 5-14 15-29 30-44 45-60 over 60 Yes 7 9 1 No 44 195 241 59 35 10 H0: Hospitalization and age are independent. Ha: Hospitalization and age are associated.
Several expected values are less than five, so we cannot do a chi-square test for independence.
Example: Swine Flu Hospitalization
One strategy for addressing expected counts that are less than five is to combine categories so that the expected counts increase to five or more. We decrease the age categories from six to three by combining several categories. The combined data is shown in the table.
             Age Category
Hospital?    Under 15   15-29   30 and older
Yes          16         9       10
No           239        241     104
By combining age categories, all expected counts are now five or more.
Test statistic: $X^2$ = 4.24
p-value: 0.12
Do not reject H0. There is not enough evidence to conclude there is an association between age and hospitalization.
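The test on the combined age categories can be checked with a short Python sketch. With two rows and three columns the degrees of freedom are (2 – 1)(3 – 1) = 2, and for exactly 2 degrees of freedom the chi-square upper-tail area is exp(–x/2). That shortcut is a property of this special case, not a general formula.

```python
import math

# Rows: hospitalized Yes, No. Columns: Under 15, 15-29, 30 and older.
observed = [[16, 9, 10],
            [239, 241, 104]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
table_total = sum(row_totals)
expected = [[r * c / table_total for c in col_totals] for r in row_totals]

# Smallest expected count, to confirm the large-sample condition.
smallest_expected = min(e for row in expected for e in row)

chi_square = sum((o - e) ** 2 / e
                 for obs_row, exp_row in zip(observed, expected)
                 for o, e in zip(obs_row, exp_row))

# Valid only because df = 2 here.
p_value = math.exp(-chi_square / 2)
print(round(smallest_expected, 2), round(chi_square, 2), round(p_value, 2))
```

The statistic and p-value should agree with the reported 4.24 and 0.12.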
Advantages and Disadvantages of Combining Categories The advantage of combining categories is that it enables us to use a chi-square test if all expected values are five or more. The disadvantage of combining categories is that our knowledge becomes less refined, as we now look at simply “under 15” instead of the previously separate categories of “under five” and “5-14.”
Fisher’s Exact Test
Called an “exact test” because in many cases the p-value can be found exactly instead of being approximated.
One disadvantage is that this test is not widely implemented in software packages.
Example: Scorpion Antivenom
A study was conducted on the effectiveness of an antivenom on children who had been stung by scorpions. This group is small and so a chi-square test cannot be used to determine if there is an association between the antivenom and improvement in the children. The data from the study is shown in the table.
                  Antivenom   Placebo   Total
No Improvement    1           6         7
Improvement       7           1         8
Total             8           7         15
Example: Scorpion Antivenom Imagine a parallel world similar to, but different from, the world from which the data was drawn. The parallel world has 15 children with scorpion stings. Eight of these children received the antivenom and eight improved; however, in the parallel world there is no association between treatment and outcome. Everything happens purely by chance.
Example: Scorpion Antivenom
                  Antivenom   Placebo   Total
No improvement                          7
Improvement                             8
Total             8           7         15
In the parallel world, the values in the four interior cells of the table happen purely by chance, keeping the given row and column totals.
Example: Scorpion Antivenom
                  Antivenom   Placebo   Total
No improvement    4           3         7
Improvement       4           4         8
Total             8           7         15
For example, suppose in the parallel world four children received the antivenom and improved. Then all the other cells can be filled in using the row and column totals. There are many different such tables that can be seen in the parallel world. Some are more likely than others.
Example: Scorpion Antivenom Mathematically we can calculate the p-value by figuring out the probability of each table in the parallel world and then calculating the probability of getting an outcome as extreme or more extreme than the 7/8 (improvement in the antivenom group) that the researchers saw. This is Fisher’s Exact test, and it can be done using StatCrunch.
To run Fisher’s Exact test:
StatCrunch > Stat > Tables > Contingency > With Summary
Select the columns, rows, and then select Fisher’s Exact test.
Since the p-value is 0.0101, we can conclude the antivenom was effective.
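The "parallel world" idea can be sketched in Python by listing every table that has the same row and column totals and computing its probability. The two-sided rule used here (add up every table that is no more likely than the observed one) is a common convention and an assumption of this sketch; the slides do not say which rule StatCrunch applies. The function name `table_probability` is made up for the illustration.

```python
from math import comb

# Margins from the scorpion table: 15 children, 8 got antivenom, 7 did not improve.
n_total = 15
n_antivenom = 8
n_no_improvement = 7
observed_cell = 1  # antivenom children who did not improve

def table_probability(a):
    """Chance of a table with `a` antivenom children not improving, given the fixed totals."""
    ways = comb(n_antivenom, a) * comb(n_total - n_antivenom, n_no_improvement - a)
    return ways / comb(n_total, n_no_improvement)

# Every value the top-left cell can take while keeping the row and column totals.
low = max(0, n_no_improvement + n_antivenom - n_total)
high = min(n_no_improvement, n_antivenom)
probabilities = {a: table_probability(a) for a in range(low, high + 1)}

p_observed = probabilities[observed_cell]

# One-sided: tables with as few or fewer antivenom children failing to improve.
one_sided = sum(p for a, p in probabilities.items() if a <= observed_cell)

# Two-sided (assumed convention): tables no more likely than the observed one.
two_sided = sum(p for p in probabilities.values() if p <= p_observed * (1 + 1e-9))

print(round(one_sided, 4), round(two_sided, 4))
```

Under this two-sided convention the result rounds to 0.0101, the p-value reported above.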
The chi-square distribution provides a good approximation to the sampling distribution of the chi-square statistic if each expected count is
A. Five or higher.
B. 10 or higher.
C. 25 or higher.
D. Any non-zero number.
Section 10.1
Answer: A. Five or higher.
When testing if two categorical variables are associated using two or more samples and one categorical response variable, we use which of the following tests?
A. Goodness-of-fit
B. Homogeneity
C. Independence
D. Interdependence
Section 10.3
Answer: B. Homogeneity
When testing if two categorical variables are associated using one sample and two categorical response variables, we use which of the following tests?
A. Goodness-of-fit
B. Homogeneity
C. Independence
D. Interdependence
Section 10.3
Answer: C. Independence
The table presents data on gender and political party affiliation for a sample of registered voters.
          Democrat   Independent   Republican   Total
Female    25         7             13           45
Male      38         18            29           85
Complete this sentence: Since ___% of the sample are female, we would expect to see ________ female Democrats.
A. 34.6, 15.6
B. 52.9, 23.8
C. 34.6, 21.8
D. 52.9, 15.6
Section 10.2
Answer: C. 34.6, 21.8
% female: 45/130 × 100% = 34.6%. We expect 34.6% of the 63 Democrats to be female (0.346 × 63 = 21.8).
When conducting a test of homogeneity or independence, the null hypothesis is always that the variables are
A. Associated.
B. Independent.
C. Dependent.
D. Both A and C are correct.
Section 10.3
Answer: B. Independent.