Do the data match an expected ratio? Chi-square test and Fisher's exact test
When to Use Chi-Square Test for Homogeneity? When the following conditions are met:
- For each population, the sampling method is simple random sampling.
- The variable under study is categorical.
- If sample data are displayed in a contingency table (populations x category levels), the expected frequency count for each cell of the table is at least 5.
Ecology example: Do biogeographical realms differ in the relative number of endangered bird species?

             Endangered  Not endangered
Neotropics   500         2000
Nearctic     200         1100

Prediction / hypothesis?
Contingency table
> setwd("~/")
> Chi <- read.csv("ChiClass.csv")
> chisq.test(Chi$Neotropics, Chi$Nearctic)

        Pearson's Chi-squared test with Yates' continuity correction

data:  Chi$Neotropics and Chi$Nearctic
X-squared = 0, df = 1, p-value = 1

Warning message:
In chisq.test(Chi$Neotropics, Chi$Nearctic) :
  Chi-squared approximation may be incorrect

(This call is wrong: passing the two columns separately makes R cross-tabulate them against each other instead of testing the table itself. Pass the whole table, as below.)
> ?chisq.test
> chisq.test(Chi)

        Pearson's Chi-squared test with Yates' continuity correction

data:  Chi
X-squared = 11.818, df = 1, p-value = 0.0005866
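The contents of ChiClass.csv are not shown here; the following is a minimal sketch that builds the same 2x2 table directly in R (the orientation, realms in columns and endangered status in rows, is an assumption inferred from the transcript) and reproduces the result:

tab <- matrix(c( 500,  200,    # endangered
                2000, 1100),   # not endangered
              nrow = 2, byrow = TRUE,
              dimnames = list(c("endangered", "not.endangered"),
                              c("Neotropics", "Nearctic")))
chisq.test(tab)   # X-squared = 11.818, df = 1, p-value = 0.0005866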
Yates's continuity correction prevents overestimation of statistical significance when counts are small; chisq.test() applies it by default to 2x2 tables (correct = TRUE).
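To see the effect of the correction, rerun the test on the tab matrix sketched above with correct = FALSE; the uncorrected statistic is slightly larger:

chisq.test(tab, correct = FALSE)   # X-squared ~ 12.12 vs. 11.818 with Yates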
When to Use Chi-Square Goodness-of-Fit test? The classic example from Mendelian genetics, where a 9:3:3:1 phenotype ratio is expected:
- 800 yellow and smooth seeds
- 250 yellow and wrinkled seeds
- 255 green and smooth seeds
- 99 green and wrinkled seeds
observed = c(800, 250, 255, 99)       # observed frequencies
expected = c(9/16, 3/16, 3/16, 1/16)  # expected proportions

> chisq.test(x = observed, p = expected)

        Chi-squared test for given probabilities

data:  observed
X-squared = 2.5008, df = 3, p-value = 0.4751
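As a check on the "at least 5" rule, the expected counts under the 9:3:3:1 ratio can be computed directly; all are comfortably above 5:

sum(observed) * expected   # 789.75 263.25 263.25 87.75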
> observed = c(1203, 2919, 1678)
> expected.prop = c(0.211, 0.497, 0.292)
> expected.count = sum(observed) * expected.prop
> chi2 = sum((observed - expected.count)^2 / expected.count)
> chi2
[1] 0.9568563
> pchisq(chi2,
+        df = 2,   # df = k - 1 = 3 - 1 = 2
+        lower.tail = FALSE)
[1] 0.6197568
> chisq.test(x = observed, p = expected.prop)

        Chi-squared test for given probabilities

data:  observed
X-squared = 0.95686, df = 2, p-value = 0.6198

(Note: with three categories the degrees of freedom are k - 1 = 2; using df = 1 in pchisq() would give 0.328, which does not match chisq.test().)
When to Use Fisher's Exact Test?
“The usual rule of thumb for deciding whether the chi-squared approximation is good enough is that the chi-squared test is not suitable when the expected values in any of the cells of a contingency table are below 5, or below 10 when there is only one degree of freedom (this rule is now known to be overly conservative)”
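Rather than computing expected counts by hand, you can inspect them from the fitted test object; e.g. for the 2x2 table sketched above:

chisq.test(tab)$expected   # approx. 460.5, 239.5, 2039.5, 1060.5: all well above 5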
> fisher.test(Chi)

        Fisher's Exact Test for Count Data

data:  Chi
p-value = 0.0004879
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
 1.145349 1.654531
sample estimates:
odds ratio 
  1.374887 
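For reference, the sample odds ratio can be computed by hand from the same table; fisher.test() reports the conditional maximum-likelihood estimate, which is why its 1.374887 differs slightly:

(500 / 2000) / (200 / 1100)   # = 1.375: odds of endangerment, Neotropics vs. Nearctic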
Fisher, another example (for proportions): "A significantly larger proportion of trees in disturbed landscapes than in undisturbed forests contained at least one cavity (Fisher's exact test, P < 0.001), with cavities in 72% of the 35 dead trees (52% of all trees in disturbed-landscape plots) and in 9% of the 32 living trees (48% of all trees in disturbed-landscape plots)."
Post-hoc tests: when you have multiple groups, which ones are different?
Ecology example

             Endangered  Not endangered
Neotropics   500         2000
Nearctic     200         1100
Palearctic   50          333
Oriental     600         1100
> Chi2 <- read.csv("ChiClass2.csv")
> chisq.test(Chi2)

        Pearson's Chi-squared test

data:  Chi2
X-squared = 222.1, df = 3, p-value < 2.2e-16
Package ‘fifer’:

chisq.post.hoc(tbl, test = c("fisher.test"), popsInRows = TRUE,
               control = c("fdr", "BH", "BY", "bonferroni",
                           "holm", "hochberg", "hommel"),
               digits = 4, ...)
> Chi2
  Neotropics Nearctic Palearctic Oriental
1        500      200         50      600
2       2000     1100        333     1100
> chisq.post.hoc(Chi2, test=c"chisq.test")
Error: unexpected string constant in "chisq.post.hoc(Chi2,test=c"chisq.test""
> chisq.post.hoc(Chi2, test = c("chisq.test"))
Adjusted p-values used the fdr method.
  comparison raw.p adj.p
1    1 vs. 2     0     0

(Two problems here: the missing parentheses around "chisq.test" cause the syntax error, and because popsInRows = TRUE, Chi2 with realms in columns yields only the single row-vs-row comparison "1 vs. 2". The table must have the populations in rows, as in ChiClass3.csv below.)
> Chi3 <- read.csv("ChiClass3.csv")
> chisq.post.hoc(Chi3, test = c("chisq.test"))
Adjusted p-values used the fdr method.
  comparison  raw.p  adj.p
1    1 vs. 2 0.0006 0.0009
2    1 vs. 3 0.0016 0.0019
3    1 vs. 4 0.0000 0.0000
4    2 vs. 3 0.2960 0.2960
5    2 vs. 4 0.0000 0.0000
6    3 vs. 4 0.0000 0.0000
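Note that ‘fifer’ has since been archived from CRAN. If it is unavailable, a minimal base-R sketch achieves the same thing (the helper name pairwise.chisq is hypothetical): run chisq.test() on every pair of rows and adjust the raw p-values with p.adjust():

# Pairwise chi-square tests over the rows of a populations-in-rows table,
# with p-values adjusted by the chosen method (default fdr = Benjamini-Hochberg)
pairwise.chisq <- function(tbl, method = "fdr") {
  tbl <- as.matrix(tbl)
  pairs <- combn(nrow(tbl), 2)   # all pairs of rows
  raw.p <- apply(pairs, 2, function(ij) chisq.test(tbl[ij, ])$p.value)
  data.frame(comparison = apply(pairs, 2, paste, collapse = " vs. "),
             raw.p = round(raw.p, 4),
             adj.p = round(p.adjust(raw.p, method = method), 4))
}
pairwise.chisq(Chi3)   # should reproduce the fifer output above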
The problem of multiple comparisons
Say you have a set of hypotheses that you wish to test simultaneously. The first idea that might come to mind is to test each hypothesis separately, using some level of significance α. At first blush, this doesn't seem like a bad idea. However, consider a case where you have 20 hypotheses to test and a significance level of 0.05. What's the probability of observing at least one significant result just due to chance?

P(at least one significant result) = 1 − P(no significant results) = 1 − (1 − 0.05)^20 ≈ 0.64

So, with 20 tests being considered, we have a 64% chance of observing at least one significant result, even if all of the tests are actually not significant.
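The same arithmetic in R:

1 - (1 - 0.05)^20   # ~ 0.64: chance of at least one false positive in 20 tests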
Methods for dealing with multiple testing frequently call for adjusting α in some way, so that the probability of observing at least one significant result due to chance remains below your desired significance level. Famous example: Bonferroni Correction
The Bonferroni correction sets the significance cut-off at α/n. For example, in the example above, with 20 tests and α = 0.05, you'd only reject a null hypothesis if the p-value is less than 0.05/20 = 0.0025. The Bonferroni correction tends to be a bit too conservative. To demonstrate this, let's calculate the probability of observing at least one significant result when using the correction just described:

P(at least one significant result) = 1 − P(no significant results) = 1 − (1 − 0.0025)^20 ≈ 0.0488

Here, we're just a shade under our desired 0.05 level. We benefit here from assuming that all tests are independent of each other. In practical applications, that is often not the case. Depending on the correlation structure of the tests, the Bonferroni correction could be extremely conservative, leading to a high rate of false negatives.
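In R the correction can be applied either by lowering the cut-off to α/n or, equivalently, by inflating the p-values with p.adjust() and comparing them to the original α (the raw p-values below are hypothetical):

alpha <- 0.05
n <- 20
alpha / n                            # 0.0025: the Bonferroni cut-off
p <- c(0.001, 0.0125, 0.03)          # hypothetical raw p-values
p.adjust(p, method = "bonferroni")   # 0.02 0.25 0.60: compare to alpha = 0.05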
> chisq.post.hoc(Chi3, test = c("chisq.test"), control = "bonferroni")
Adjusted p-values used the bonferroni method.
  comparison  raw.p  adj.p
1    1 vs. 2 0.0006 0.0035
2    1 vs. 3 0.0016 0.0097
3    1 vs. 4 0.0000 0.0000
4    2 vs. 3 0.2960 1.0000
5    2 vs. 4 0.0000 0.0000
6    3 vs. 4 0.0000 0.0000