1
Contingency Table Analysis: a chi-square test for independence (or test for association)
2
Contingency table analysis
Contingency table analysis is an important method in statistics. It can be used to infer whether one factor is associated with another factor. For example, "Does smoking cause lung cancer?" Let S=1 denote that a person smokes, and S=0 denote that the person does not smoke. Let L=1 denote that the person suffers from lung cancer, and L=0 denote that the person does not have lung cancer. Table 1 is a contingency table of observed counts, and Table 2 shows the corresponding probability of each cell. In Table 2, an asterisk marks a total over that index, so P1* = P11 + P10 and P*1 = P11 + P01.

Table 1
         L=1   L=0   Total
S=1        8    19      27
S=0        1    16      17
Total      9    35      44

Table 2
         L=1   L=0   Total
S=1      P11   P10     P1*
S=0      P01   P00     P0*
Total    P*1   P*0       1
3
If the two factors are independent (H0), then P(S=s, L=l) = P(S=s) × P(L=l) for s = 0, 1 and l = 0, 1, because P(AB) = P(A) × P(B) when events A and B are independent. For Table 2, we have

Psl = Ps* × P*l, for s = 0, 1 and l = 0, 1    (Equation 1)
4
We can estimate P1* to be 27/44 by referring to Table 1.
Similarly, we can estimate P*1 to be 9/44. According to Equation 1, we can estimate P11 to be 27/44 × 9/44. If H0 holds, the expected number of persons in cell (S=1, L=1) is 44 × P11, which is 44 × 27/44 × 9/44 = 27 × 9/44 ≈ 5.52.
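The same estimate can be worked out for all four cells at once. The sketch below is only an illustration using the counts from Table 1; the variable names are chosen here and are not part of the original slides.

```python
# Observed counts from Table 1: rows are S=1, S=0; columns are L=1, L=0.
observed = [[8, 19],
            [1, 16]]

row_totals = [sum(row) for row in observed]        # [27, 17]
col_totals = [sum(col) for col in zip(*observed)]  # [9, 35]
n = sum(row_totals)                                # 44

# Under H0, expected count = row total * column total / grand total.
expected = [[r * c / n for c in col_totals] for r in row_totals]

for row in expected:
    print([round(e, 2) for e in row])
# [5.52, 21.48]
# [3.48, 13.52]
```

The expected counts keep the same row and column totals as the observed table, which is a quick sanity check on the arithmetic.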
5
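Denote the observed count in cell i by Oi and the expected count under H0 by Ei. The test statistic is

χ² = Σ (Oi − Ei)² / Ei, summed over all cells i

which approximately follows a chi-square distribution with 1 degree of freedom for a 2×2 table when H0 holds.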
6
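For Table 1, we have χ² = (8 − 5.52)²/5.52 + (19 − 21.48)²/21.48 + (1 − 3.48)²/3.48 + (16 − 13.52)²/13.52 ≈ 3.62, with 1 degree of freedom. Look this value up in a chi-square table: if a value this large would rarely be seen under H0 (the p-value is small), reject H0 and conclude that the two factors are associated. The p-value for this case is 0.057; depending on your significance level α, you can decide whether or not to reject H0. For example, H0 is not rejected at α = 0.05 but is rejected at α = 0.10.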
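As a check on the numbers above, the statistic and p-value can be reproduced with a few lines of standard-library Python. This is a sketch under two assumptions: no continuity correction is applied (which is what reproduces the quoted p-value of 0.057), and the p-value uses the identity P(X > x) = erfc(√(x/2)), which holds only for a chi-square distribution with 1 degree of freedom.

```python
import math

# Observed counts from Table 1 and expected counts under H0,
# listed in the order (S=1,L=1), (S=1,L=0), (S=0,L=1), (S=0,L=0).
observed = [8, 19, 1, 16]
expected = [27 * 9 / 44, 27 * 35 / 44, 17 * 9 / 44, 17 * 35 / 44]

# Pearson chi-square statistic: sum of (O - E)^2 / E over all cells.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Upper-tail probability for 1 degree of freedom only.
p_value = math.erfc(math.sqrt(chi_square / 2))

print(round(chi_square, 2), round(p_value, 3))  # 3.62 0.057
```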
7
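In general, if the contingency table has r rows and c columns of cells, we have

Eij = (total of row i) × (total of column j) / N
χ² = Σi Σj (Oij − Eij)² / Eij

where Oij and Eij are the observed and expected counts in row i and column j, N is the grand total, and χ² is compared with a chi-square distribution with (r − 1)(c − 1) degrees of freedom.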
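For a table of any size, the computation can be sketched as a small function. The name chi_square_statistic is hypothetical and introduced here for illustration only; the function returns the statistic and its degrees of freedom and leaves the table lookup for the p-value to the reader.

```python
def chi_square_statistic(table):
    """Return (chi-square statistic, degrees of freedom) for an r x c table.

    `table` is a list of r rows, each a list of c observed counts.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            # Expected count under independence.
            e = row_totals[i] * col_totals[j] / n
            stat += (o - e) ** 2 / e

    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df


# Applied to Table 1, this reproduces the 2x2 result.
stat, df = chi_square_statistic([[8, 19], [1, 16]])
print(round(stat, 2), df)  # 3.62 1
```

The sketch assumes every row and column total is positive; a zero total would make an expected count zero and the division undefined.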