Hypothesis testing. Chi-square test Georgi Iskrov, MBA, MPH, PhD Department of Social Medicine
Parametric and non-parametric tests
Parametric test: estimates at least one population parameter from sample statistics. Assumption: the variable measured in the sample is normally distributed in the population to which we plan to generalize our findings.
Non-parametric test: distribution-free; makes no assumption about the distribution of the variable in the population.
Choosing a statistical test
The choice of a statistical test depends on:
- Level of measurement of the dependent and independent variables
- Number of groups or dependent measures
- Number of units of observation
- Type of distribution
- The population parameter of interest (mean, variance, differences between means and/or variances)
Normality test Normality tests are used to determine whether a data set is well modeled by a normal distribution and to assess how likely it is that the random variable underlying the data is normally distributed. In descriptive statistics terms, a normality test measures the goodness of fit of a normal model to the data: if the fit is poor, the data are not well modeled by a normal distribution in that respect, without any judgment being made about an underlying variable. In frequentist hypothesis testing, the data are tested against the null hypothesis that they are normally distributed.
Normality test Graphical methods An informal approach to testing normality is to compare a histogram of the sample data to a normal probability curve. The empirical distribution of the data (the histogram) should be bell-shaped and resemble the normal distribution. This might be difficult to see if the sample is small.
Normality test: frequentist tests
Tests of univariate normality include, among others:
- D'Agostino's K-squared test
- Jarque–Bera test
- Anderson–Darling test
- Cramér–von Mises criterion
- Lilliefors test
- Kolmogorov–Smirnov test
- Shapiro–Wilk test
Normality test Kolmogorov–Smirnov test K–S test is a nonparametric test of the equality of distributions that can be used to compare a sample with a reference distribution (1-sample K–S test), or to compare two samples (2-sample K–S test). K–S statistic quantifies a distance between the empirical distribution of the sample and the cumulative distribution of the reference distribution, or between the empirical distributions of two samples. The null hypothesis is that the sample is drawn from the reference distribution (in the 1-sample case) or that the samples are drawn from the same distribution (in the 2-sample case).
Normality test Kolmogorov–Smirnov test In the special case of testing for normality of the distribution, samples are standardized and compared with a standard normal distribution. This is equivalent to setting the mean and variance of the reference distribution equal to the sample estimates, and it is known that using these to define the specific reference distribution changes the null distribution of the test statistic.
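The distance described above can be sketched in a few lines. This is a minimal illustration, not a full test: the sample values are hypothetical, and no p-value is computed because, as noted, estimating the mean and variance from the sample changes the null distribution of the statistic.

```python
from statistics import NormalDist, mean, stdev

def ks_statistic_normal(sample):
    """One-sample K-S distance between the sample's empirical CDF and a
    normal CDF whose mean and SD are estimated from the same sample."""
    xs = sorted(sample)
    n = len(xs)
    reference = NormalDist(mean(xs), stdev(xs))
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = reference.cdf(x)
        # The empirical CDF jumps from (i-1)/n to i/n at x; check both sides.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

# Hypothetical measurements, for illustration only.
sample = [4.1, 4.8, 5.0, 5.2, 5.3, 5.6, 5.9, 6.1, 6.4, 7.2]
d = ks_statistic_normal(sample)
print(f"D = {d:.3f}")
```

Standardizing the sample first and comparing it with a standard normal distribution gives the same D, which is the equivalence stated on the slide.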
Is there an association? Chi-square test
The chi-square test is used to check for an association between two categorical variables.
H0: There is no association between the variables.
HA: There is an association between the variables.
If two categorical variables are associated, the chance that an individual falls into a particular category of one variable depends on the category they fall into for the other variable.
Assumptions
- A large sample of independent observations
- All expected counts should be ≥ 1 (no zeros)
- At least 80% of expected counts should be ≥ 5
Chi-square test
The following table presents the data on place of birth and alcohol consumption. The two variables of interest, place of birth and alcohol consumption, have r = 4 and c = 2 categories, resulting in 4 x 2 = 8 combinations of categories.

Place of birth   Alcohol   No alcohol
Big city         620       75
Rural            240       41
Small town       130       29
Suburban         190       38
Chi-square test
Observed counts (O) with row totals (R) and column totals (C):

Place of birth   Alcohol      No alcohol   Total
Big city         O11 = 620    O12 = 75     R1 = 695
Rural            O21 = 240    O22 = 41     R2 = 281
Small town       O31 = 130    O32 = 29     R3 = 159
Suburban         O41 = 190    O42 = 38     R4 = 228
Total            C1 = 1180    C2 = 183     n = 1363

Expected counts under independence, Eij = (Ri x Cj) / n:
E11 = (695 x 1180) / 1363    E12 = (695 x 183) / 1363
E21 = (281 x 1180) / 1363    E22 = (281 x 183) / 1363
E31 = (159 x 1180) / 1363    E32 = (159 x 183) / 1363
E41 = (228 x 1180) / 1363    E42 = (228 x 183) / 1363
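The expected counts can be evaluated directly from the margins. A minimal sketch using the counts from the table; the variable names are illustrative only.

```python
# Observed counts: (alcohol, no alcohol) for each place of birth.
observed = {
    "Big city":   (620, 75),
    "Rural":      (240, 41),
    "Small town": (130, 29),
    "Suburban":   (190, 38),
}

row_totals = {place: sum(counts) for place, counts in observed.items()}
col_totals = [sum(col) for col in zip(*observed.values())]
n = sum(col_totals)

# Expected count for each cell: (row total x column total) / n
expected = {
    place: tuple(row_totals[place] * c / n for c in col_totals)
    for place in observed
}

for place, (e_alcohol, e_none) in expected.items():
    print(f"{place:<10}  {e_alcohol:8.2f}  {e_none:8.2f}")
```

The expected counts keep the same row and column totals as the observed table, so they also add up to n.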
Chi-square test The test statistic measures the difference between the observed counts and the counts expected under independence: χ2 = Σ (Oij - Eij)^2 / Eij, summed over all cells. If the statistic is large, the observed counts are not close to the counts we would expect to see if the two variables were independent. Thus, a 'large' χ2 gives evidence against H0 and supports HA. The p-value of the χ2 test is the probability, if H0 is true, that the χ2 statistic is as large as or larger than the value we obtained. To get this probability we use a chi-square distribution with (r-1) x (c-1) degrees of freedom (df).
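A sketch of the full calculation for the place-of-birth table, using only the Python standard library. The helper `chi2_sf` is a hypothetical name; it evaluates the upper tail of the chi-square distribution with a closed-form series that holds for integer df. In practice a statistics package would normally be used instead.

```python
import math

def chi2_sf(x, df):
    """P(X >= x) for a chi-square variable with integer df (closed-form series)."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i=0}^{df/2-1} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + exp(-x/2) * sum_{i=1}^{(df-1)/2} (x/2)^(i-1/2) / Gamma(i+1/2)
    total = 0.0
    for i in range(1, (df - 1) // 2 + 1):
        total += half ** (i - 0.5) / math.gamma(i + 0.5)
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total

# Rows: Big city, Rural, Small town, Suburban; columns: alcohol, no alcohol.
observed = [[620, 75], [240, 41], [130, 29], [190, 38]]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

statistic = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n   # expected count under independence
        statistic += (o - e) ** 2 / e

df = (len(observed) - 1) * (len(observed[0]) - 1)
p_value = chi2_sf(statistic, df)
print(f"chi-square = {statistic:.2f}, df = {df}, p = {p_value:.4f}")
```

For these data the statistic comes out at roughly 9.7 with 3 df, giving a p-value of about 0.02.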
Association is not causation. Beware! Association is not causation. The observed association between two variables might be due to the action of a third, unobserved variable.
Special case
In many cases the categorical variables of interest have two levels each. The data can then be summarized in a contingency table with 2 rows and 2 columns (2 x 2 table):

         Column 1   Column 2   Total
Row 1    A          B          R1
Row 2    C          D          R2
Total    C1         C2         n

In this case, the χ2 statistic has a simplified form: χ2 = n (AD - BC)^2 / (R1 x R2 x C1 x C2). Under the null hypothesis, the χ2 statistic has a chi-square distribution with (2-1) x (2-1) = 1 degree of freedom.
Special case

Gender   Alcohol   No alcohol   Total
Male     540       52           592
Female   325       31           356
Total    865       83           948
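For a 2 x 2 table the statistic can be computed with the standard shortcut n(AD - BC)^2 / (R1 x R2 x C1 x C2). The sketch below applies it to the gender table and checks it against the general sum of (O - E)^2 / E. With 1 degree of freedom the p-value can be written with the complementary error function, which avoids any external library.

```python
import math

# Gender by alcohol consumption, from the 2 x 2 table above.
a, b = 540, 52   # male: alcohol, no alcohol
c, d = 325, 31   # female: alcohol, no alcohol

r1, r2 = a + b, c + d      # row totals
c1, c2 = a + c, b + d      # column totals
n = r1 + r2

# Shortcut form for a 2 x 2 table.
shortcut = n * (a * d - b * c) ** 2 / (r1 * r2 * c1 * c2)

# General form: sum of (O - E)^2 / E over the four cells.
general = 0.0
for o, row_total, col_total in [(a, r1, c1), (b, r1, c2), (c, r2, c1), (d, r2, c2)]:
    e = row_total * col_total / n
    general += (o - e) ** 2 / e

# With 1 degree of freedom, P(chi-square >= x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(shortcut / 2))
print(f"chi-square = {shortcut:.4f}, p = {p_value:.3f}")
```

Both forms give the same value. Here the statistic is close to zero, so these data provide no evidence of an association between gender and alcohol consumption.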
Limitations
- No expected count should be less than 1
- No more than 1/5 of the expected counts should be less than 5
- To correct for this, collect a larger sample or combine the categories with small expected counts until each combined expected count is 5 or more
Yates correction
- When there is only 1 degree of freedom (a 2 x 2 table), the regular chi-square test should not be used
- Apply the Yates correction by subtracting 0.5 from the absolute value of each O - E term, then continue as usual with the corrected values
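A sketch of the Yates correction as described, applied to the small dieting-by-gender table used in the Fisher's exact test example (cells 1, 11, 9, 3). It assumes every |O - E| is at least 0.5; when a difference is smaller than that, subtracting 0.5 would push the term past zero, and the correction needs separate handling.

```python
import math

# Small 2 x 2 table: rows male / female, columns dieting / non-dieting.
observed = [[1, 11], [9, 3]]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

plain = 0.0
corrected = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n
        plain += (o - e) ** 2 / e
        # Yates: shrink |O - E| by 0.5 before squaring.
        corrected += (abs(o - e) - 0.5) ** 2 / e

# 1 degree of freedom, so P(chi-square >= x) = erfc(sqrt(x / 2)).
p_plain = math.erfc(math.sqrt(plain / 2))
p_corrected = math.erfc(math.sqrt(corrected / 2))
print(f"uncorrected: chi-square = {plain:.2f}, p = {p_plain:.4f}")
print(f"Yates:       chi-square = {corrected:.2f}, p = {p_corrected:.4f}")
```

The corrected statistic is smaller than the uncorrected one, so the corrected test is the more conservative of the two.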
Fisher’s exact test This test is only available for 2 x 2 tables. For small n, the probability can be computed exactly by counting all possible tables that can be constructed based on the marginal frequencies. Thus, the Fisher exact test computes the exact probability under the null hypothesis of obtaining the current distribution of frequencies across cells, or one that is more uneven.
Fisher’s exact test

Gender   Dieting   Non-dieting   Total
Male     1         11            12
Female   9         3             12
Total    10        14            24

General notation:

Gender   Dieting   Non-dieting   Total
Male     a         c             a + c
Female   b         d             b + d
Total    a + b     c + d         a + b + c + d
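The counting argument can be sketched with the hypergeometric probability of each table that shares the observed margins. The function name and the two-sided rule used here (sum every table that is no more probable than the observed one) are assumptions for illustration; other conventions for "more uneven" exist.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for a 2 x 2 table laid out as in the
    slide: first row (a, c), second row (b, d)."""
    row1 = a + c          # first row total
    col1 = a + b          # first column total
    n = a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    lowest = max(0, row1 + col1 - n)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance so equally probable tables are not lost to rounding.
    return sum(p for p in map(prob, range(lowest, highest + 1))
               if p <= p_observed * (1 + 1e-9))

# Dieting example: male (1, 11), female (9, 3).
p_value = fisher_exact_two_sided(a=1, b=9, c=11, d=3)
print(f"two-sided p = {p_value:.4f}")
```

For the dieting table, only the tables with 0, 1, 9 or 10 male dieters are as improbable as the observed one, so the p-value is the sum of those four probabilities.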
Ordinal data independent samples. Mann-Whitney test Null hypothesis: Two sampled populations are equivalent in location (they have the same mean ranks). The observations from both groups are combined and ranked, with the average rank assigned in the case of ties. If the populations are identical in location, the ranks should be randomly mixed between the two samples.
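The ranking step and the resulting U statistic can be sketched as follows. The scores are hypothetical and `average_ranks` is an illustrative helper; no p-value is computed here.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # mean of rank positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

# Hypothetical ordinal scores for two independent groups.
group_a = [3, 4, 2, 6, 2, 5]
group_b = [9, 7, 5, 10, 6, 8]

ranks = average_ranks(group_a + group_b)   # combine both groups, then rank
n1, n2 = len(group_a), len(group_b)
r1 = sum(ranks[:n1])                       # rank sum of group A
u1 = r1 - n1 * (n1 + 1) / 2
u2 = n1 * n2 - u1
u = min(u1, u2)
print(f"R1 = {r1}, U1 = {u1}, U2 = {u2}, U = {u}")
```

If the two groups were well mixed, U1 and U2 would both be near n1 x n2 / 2; a very small U indicates that one group's ranks sit almost entirely below the other's.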
Ordinal data independent samples. Kruskal-Wallis test Null hypothesis: K sampled populations are equivalent in location. The observations from all groups are combined and ranked, with the average rank assigned in the case of ties. If the populations are identical in location, the ranks should be randomly mixed between the K samples.
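A sketch of the Kruskal-Wallis H statistic, H = 12 / (N(N+1)) x Σ Rj^2 / nj - 3(N+1), where Rj is the rank sum of group j. The data are hypothetical, the helper names are illustrative, and no tie correction is applied. H is compared with a chi-square distribution with K - 1 df, an approximation that is rough for samples this small.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # mean of rank positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction applied)."""
    pooled = [x for g in groups for x in g]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)

# Hypothetical ordinal scores for K = 3 independent groups.
groups = [[2, 4, 3, 5], [6, 5, 8, 7], [9, 10, 8, 12]]
h = kruskal_wallis_h(groups)

# With K = 3 groups there are 2 df, and P(chi-square >= x) = exp(-x / 2).
p_approx = math.exp(-h / 2)
print(f"H = {h:.3f}, approximate p = {p_approx:.4f}")
```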
Ordinal data 2 related samples. Wilcoxon signed rank test Two related variables. No assumptions about the shape of distributions of the variables. Null hypothesis: Two variables have the same distribution. Takes into account information about the magnitude of differences within pairs and gives more weight to pairs that show large differences than to pairs that show small differences. Based on the ranks of the absolute values of the differences between the two variables.
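The signed-rank calculation can be sketched as follows. The paired scores are hypothetical and the helper is illustrative; pairs with a zero difference are dropped, which is a common convention and is assumed here.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # mean of rank positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

# Hypothetical paired scores for the same subjects (before and after).
before = [7, 5, 8, 6, 9, 4, 7, 6]
after = [5, 5, 6, 7, 5, 3, 4, 4]

# Differences within pairs; zero differences are dropped.
diffs = [x - y for x, y in zip(before, after) if x != y]

# Rank the absolute differences, then sum the ranks separately by sign.
ranks = average_ranks([abs(d) for d in diffs])
w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
w = min(w_plus, w_minus)
print(f"W+ = {w_plus}, W- = {w_minus}, W = {w}")
```

Large differences receive high ranks, so they contribute more to W+ or W- than small ones, which is how the test weights pairs by the magnitude of their difference.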