Chi-square Test of Independence Presentation 10.2
Another Significance Test for Proportions
But this time we want to test multiple variables. With this test we can determine whether two variables are independent or not. This is sometimes called inference for two-way tables.
Chi-square Test of Independence Formulas
Null Hypothesis (assumes independent): $H_0$: Observed = Expected
Alternate Hypothesis (not independent): $H_a$: Observed $\neq$ Expected
Test Statistic: $\chi^2 = \sum \frac{(O - E)^2}{E}$ (that symbol is called chi-squared)
The null and alternate hypotheses are always the same with a Test of Independence. O is the observed count for each cell in the table and E is the expected count for each cell in the table. Instead of a normal or t distribution, we now have a chi-squared distribution.
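As a minimal sketch, the test statistic (the sum of (O - E)^2 / E over every cell) can be written in a few lines of Python. The function name `chi_square_statistic` and the small 2x2 table are hypothetical, chosen only to illustrate the formula; they do not come from the presentation.

```python
def chi_square_statistic(observed, expected):
    """Sum (O - E)^2 / E over every cell of two same-shaped tables."""
    total = 0.0
    for obs_row, exp_row in zip(observed, expected):
        for o, e in zip(obs_row, exp_row):
            total += (o - e) ** 2 / e
    return total


# Hypothetical 2x2 table (illustration only, not the Titanic data).
observed = [[30, 20], [20, 30]]
# Every row and column total is 50 out of 100, so each expected count is 25.
expected = [[25.0, 25.0], [25.0, 25.0]]

statistic = chi_square_statistic(observed, expected)
print(statistic)  # 4.0
```

Each of the four cells contributes 5^2 / 25 = 1, so the statistic is 4. The larger the gaps between observed and expected counts, the larger this sum becomes.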
The Titanic
Look at the data of the passengers, their ticket, and whether or not they survived.
Type of Ticket   Rescued   Died
First Class
Second Class
Third Class      528       178
Conditions for the Test of Independence
None of the expected counts should be less than 1
No more than 20% of the expected counts should be less than 5
– Same as for the Goodness of Fit test
These are simple checks to make sure that the sample size is sufficient.
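The two conditions above can be sketched as a small check. This is an illustration under assumptions: the function name `counts_are_sufficient` and both example tables are hypothetical, and the function simply tests whatever table of counts it is given.

```python
def counts_are_sufficient(counts):
    """Return True when no count is below 1 and at most 20% are below 5."""
    cells = [c for row in counts for c in row]
    if any(c < 1 for c in cells):
        return False
    small = sum(1 for c in cells if c < 5)
    return small / len(cells) <= 0.20


# Hypothetical tables for illustration only.
ok_table = [[12, 30], [8, 25], [40, 6]]
sparse_table = [[3, 30], [4, 25], [40, 6]]

print(counts_are_sufficient(ok_table))      # True
print(counts_are_sufficient(sparse_table))  # False: 2 of 6 cells are below 5
```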
The Titanic
Check the conditions
– Since all counts are much greater than 5, we are OK to conduct the test
Write Hypotheses (these are always the same!)
– Null: $H_0$: Observed = Expected. That is, what we observed should be the same as what we expected given that the variables are independent.
– Alternate: $H_a$: Observed $\neq$ Expected. That is, the observed data are too different from what is expected to be attributed to random chance.
The Titanic Calculations
Find the expected values (assume independence)
Observed
Type of Ticket   Rescued   Died   Totals
First Class                       326
Second Class                      285
Third Class      528       178    706
Totals           849       468    1317
Expected
Type of Ticket   Rescued                  Died                     Totals
First Class      326*849/1317 = 210.15    326*468/1317 = 115.85    326
Second Class     285*849/1317 = 183.72    285*468/1317 = 101.28    285
Third Class      706*849/1317 = 455.12    706*468/1317 = 250.88    706
Totals           849                      468                      1317
To find an expected count: 849 out of 1317 total passengers were rescued (64.46%), so if ticket type and survival are independent, 849/1317 or 64.46% of the 326 first class passengers should have been rescued. This logic follows for each cell in the table.
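The expected-count rule (row total times column total, divided by the grand total) can be sketched in Python using only the totals that appear in the calculations on this slide. The variable names are illustrative assumptions, not part of the presentation.

```python
# Row and column totals taken from the expected-count formulas on the slide.
row_totals = {"First Class": 326, "Second Class": 285, "Third Class": 706}
column_totals = {"Rescued": 849, "Died": 468}
grand_total = sum(row_totals.values())  # 1317

# Under independence, each cell is row total * column total / grand total.
expected = {
    ticket: {
        outcome: row_total * column_total / grand_total
        for outcome, column_total in column_totals.items()
    }
    for ticket, row_total in row_totals.items()
}

for ticket, row in expected.items():
    print(ticket, {outcome: round(value, 2) for outcome, value in row.items()})
```

Each row of expected counts still adds up to its row total, and each column adds up to its column total, which is a quick way to check the arithmetic.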
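The Titanic Calculations
Then, find the sum $\chi^2 = \sum \frac{(O - E)^2}{E}$, just like with the Goodness of Fit Test.
Our degrees of freedom are: $df = (\text{rows} - 1)(\text{columns} - 1) = (3 - 1)(2 - 1) = 2$
Finally, use chi-square cdf with the test statistic as the lower bound: $\chi^2$cdf(99.69, 99999, 2)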
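For readers without the calculator, the degrees of freedom and the upper-tail area can be sketched in plain Python. This is not the calculator's method: the function names are hypothetical, and the sketch relies on a closed form for the chi-square upper tail that is valid only when the degrees of freedom are even (which covers this 3 x 2 table).

```python
import math


def degrees_of_freedom(n_rows, n_columns):
    """(rows - 1) * (columns - 1) for a two-way table."""
    return (n_rows - 1) * (n_columns - 1)


def chi_square_upper_tail(statistic, df):
    """P(X >= statistic) for a chi-square variable with even df.

    Uses exp(-x/2) * sum over k < df/2 of (x/2)^k / k!,
    a closed form that holds only when df is even.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only handles positive even df")
    half = statistic / 2.0
    series = sum(half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * series


df = degrees_of_freedom(3, 2)  # 3 ticket classes, 2 outcomes
p_value = chi_square_upper_tail(99.69, df)
print(df, p_value)
```

With 2 degrees of freedom the tail area reduces to exp(-99.69 / 2), a value on the order of 10^-22, so the result is effectively zero.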
The Titanic Calculations
Using the calculator
First go to the Matrix menu (2nd, then x^-1)
Go to edit and press enter
Enter the dimensions of the matrix (rows x columns)
– Your matrix should fit the look of your table
Enter the data
– Make the calculator match the table
Then go to your stats tests and choose the chi-square test
The Titanic Calculations
Using the calculator
Since you entered the data into matrix [A], you can go right to:
– Calculate
– Draw
Leave the expected matrix alone, as the calculator will calculate those values for you (see next slide)
The Titanic Calculations
Using the calculator
Let's go check out the expected table
– Go back to matrix
– Edit [B] to see the values
How cool is that!
The Titanic Calculations
Conclusions
– The p-value represents the chance of getting data at least this far from the expected counts, given the variables are independent.
– For the Titanic, this was a % chance
– REJECT THE NULL!
– There is very strong evidence to suggest an association between survival and the type of ticket.
Chi-square Test of Independence
This concludes this presentation.