Hypothesis testing with two proportions and chi-square testing
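Hypothesis testing for homogeneity of proportions

We wish to compare the effects of two different insecticides. The first insecticide, InsectsRUs, was applied to 300 insects and 50 of them died. The second insecticide, InsectsBGone, was applied to 350 insects and 90 died. Test whether there is a significant difference in the proportion of insects killed.

Hypothesis of interest: $H_0: p_1 = p_2$ versus $H_A: p_1 \neq p_2$

Test statistic:

$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}\,(1 - \hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$

where $\hat{p}_1 = x_1/n_1$ and $\hat{p}_2 = x_2/n_2$ are the sample proportions and $\hat{p} = (x_1 + x_2)/(n_1 + n_2)$ is the pooled proportion.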
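Calculations

InsectsRUs: $\hat{p}_1 = 50/300 \approx 0.167$
InsectsBGone: $\hat{p}_2 = 90/350 \approx 0.257$
Pooled proportion: $\hat{p} = (50 + 90)/(300 + 350) = 140/650 \approx 0.215$
Test statistic: $z \approx -2.80$
p-value $\approx 0.005$ (two-sided)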
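The calculation on this slide can be checked with a few lines of Python. This is a minimal sketch that assumes the pooled form of the two-proportion z statistic and a two-sided alternative; the function name `two_proportion_z_test` is made up for illustration, and only the standard library is used.

```python
import math

def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-sample z-test for H0: p1 = p2 (two-sided alternative)."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    # Pooled proportion under H0 (assumption: pooled standard error)
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    # Two-sided normal tail area: P(|Z| >= |z|) = erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Insecticide data from the slides: 50 of 300 killed vs 90 of 350 killed
z, p_value = two_proportion_z_test(50, 300, 90, 350)
print(f"z = {z:.3f}, two-sided p-value = {p_value:.4f}")
```

With these counts the statistic comes out near -2.80 and the p-value near 0.005, so the difference in kill rates would be judged significant at the usual 5% level.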
Contingency Table

The same data can also be viewed as a contingency table:

                  Type of Insecticide
              InsectsRUs   InsectsBGone
Killed            50            90
Not killed       250           260
TOTAL            300           350
Chi-square

Chi-square testing can be used to test whether the distribution of outcomes is the same in each group (i.e., for each insecticide).
To carry out the test, we first need to find the expected values.
Expected Values for Chi-Square

Expected value = (row total) * (column total) / N

              InsectsRUs       InsectsBGone     TOTAL
Killed                                          (row total)
Not killed                                      (row total)
TOTAL         (column total)   (column total)   N
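Calculating Expected Values

Observed counts:

              InsectsRUs   InsectsBGone   TOTAL
Killed            50            90         140
Not killed       250           260         510
TOTAL            300           350         650

Expected counts:

              InsectsRUs      InsectsBGone    TOTAL
Killed        =140*300/650    =140*350/650     140
Not killed    =510*300/650    =510*350/650     510
TOTAL          300             350             650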
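The rule (row total) * (column total) / N can be applied to the whole table at once. The sketch below is one way to do it in plain Python; the helper name `expected_counts` is hypothetical, and the "not killed" counts are taken as 300 - 50 = 250 and 350 - 90 = 260, which follows from the problem statement.

```python
def expected_counts(observed):
    """Expected cell counts under H0: (row total) * (column total) / N."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

# Rows: killed, not killed. Columns: InsectsRUs, InsectsBGone.
observed = [[50, 90],
            [250, 260]]

expected = expected_counts(observed)
for label, row in zip(["Killed", "Not killed"], expected):
    print(label, [round(value, 2) for value in row])
```

The expected counts keep the same row and column totals as the observed table, which is a quick sanity check on the arithmetic.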
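Calculated expected values

              InsectsRUs   InsectsBGone   TOTAL
Killed           64.62         75.38       140
Not killed      235.38        274.62       510
TOTAL           300           350          650

Now calculate the chi-square statistic:

$$\chi^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}} \approx 7.82$$

degrees of freedom = 1
p-value $\approx 0.005$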
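As a rough check of the statistic, the sum of (observed - expected)^2 / expected over the four cells can be computed directly. The sketch below uses only the standard library; the p-value line relies on the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal, so it is valid for the 1-degree-of-freedom case only.

```python
import math

# Rows: killed, not killed. Columns: InsectsRUs, InsectsBGone.
observed = [[50, 90],
            [250, 260]]
expected = [[140 * 300 / 650, 140 * 350 / 650],
            [510 * 300 / 650, 510 * 350 / 650]]

# Sum of (O - E)^2 / E over all cells
chi_square = sum(
    (o - e) ** 2 / e
    for o_row, e_row in zip(observed, expected)
    for o, e in zip(o_row, e_row)
)

# Degrees of freedom: (rows - 1) * (columns - 1)
df = (len(observed) - 1) * (len(observed[0]) - 1)

# Upper tail for df = 1 only: P(X >= x) = erfc(sqrt(x / 2))
p_value = math.erfc(math.sqrt(chi_square / 2))

print(f"chi-square = {chi_square:.3f}, df = {df}, p-value = {p_value:.4f}")
```

For a 2x2 table the chi-square statistic equals the square of the pooled two-proportion z statistic, so both approaches give the same p-value.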
Comments on the Chi-square Test

The test of H0: no association versus HA: there is an association between two categorical variables is computationally the same as testing whether the conditional distributions are the same (this can be extended to more than two populations).
The degrees of freedom for a chi-square test are (r-1)*(c-1), where r = number of rows and c = number of columns.
Be cautious when expected frequencies are lower than 5.
The chi-square test can also be used to test for goodness of fit.
Goodness of fit

Test whether longleaf pine trees are evenly distributed across the four quadrants (example from Moore, McCabe, and Craig).

         Quad1   Quad2   Quad3   Quad4
Count

There are a total of 100 trees, so we would expect 1/4 of 100 to be in each quadrant (expected value = 0.25*100 = 25).

$$\chi^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}} = 10.8$$

degrees of freedom = 3
p-value $\approx 0.013$
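The goodness-of-fit calculation follows the same pattern. In the sketch below, the quadrant counts are an assumption: the observed counts are missing from the table above, so illustrative values were chosen that total 100 trees and reproduce the stated statistic of 10.8. The closed-form tail probability used for the p-value holds for 3 degrees of freedom only.

```python
import math

# ASSUMED counts for illustration (not taken from the slide):
# they sum to 100 and give the chi-square value of 10.8 quoted above.
observed = [18, 22, 39, 21]

total = sum(observed)
# H0: trees are evenly spread, so each quadrant expects 1/4 of the total
expected = [0.25 * total] * len(observed)

chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

# Upper tail of the chi-square distribution, valid for df = 3 only:
# P(X >= x) = erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2)
x = chi_square
p_value = math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

print(f"chi-square = {chi_square:.1f}, df = {df}, p-value = {p_value:.4f}")
```

A p-value near 0.013 would lead to rejecting the hypothesis of an even spread at the 5% level.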