Testing for Independence


1 Testing for Independence
QSCI 381 – Lecture 41 (Larson and Farber, Sect 10.2)

2 Independence
Two variables are independent if the occurrence of one does not affect the probability of the other. We often wish to examine whether two variables are independent, for example: age and having a “high” heavy metal concentration; or occupation and concerns regarding the most important factors influencing a fishery.

3 Contingency Tables
An r x c contingency table shows the observed frequencies for two variables. The observed frequencies are arranged in r rows and c columns. The intersection of a row and a column is called a cell.

4 Example-A-1
We wish to examine whether having a high concentration of heavy metals is independent of age.
High heavy metals?   Age-class
                     1-10    11-20   21-30   31-40   41+
Yes                    12      16      22      21     16
No                    219     180     232     190     75

5 Expected Frequencies
The expected frequency for a cell $E_{r,c}$ in a contingency table is:
$E_{r,c} = \frac{(\text{sum of row } r) \times (\text{sum of column } c)}{\text{sample size}}$
High heavy metals?   Age-class
                     1-10     11-20    21-30    31-40    41+      Total
Yes                   20.44    17.35    22.48    18.67    8.05      87
No                   210.56   178.65   231.52   192.33   82.95     896
Total                231      196      254      211      91        983
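
The expected frequencies for Example A can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library; the function name `expected_frequencies` is illustrative, and the "Yes" count of 16 for the 41+ age class is an assumption inferred from the row total (87) and column total (91) on this slide.

```python
# Observed counts for Example A (rows: Yes / No; columns: five age classes).
# Assumption: the "Yes" count for 41+ is 16, inferred from the slide's totals.
observed = [
    [12, 16, 22, 21, 16],
    [219, 180, 232, 190, 75],
]

def expected_frequencies(table):
    """Return E[r][c] = (row r total) * (column c total) / sample size."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

expected = expected_frequencies(observed)
for row in expected:
    print([round(e, 2) for e in row])
```

Rounded to two decimals, the printed rows should agree with the expected-frequency table on this slide.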

6 The Chi-square Test for Independence-I
A chi-square independence test is used to test the independence of two variables. The conditions for use of this test are: the observed frequencies must be obtained from a random sample; and each expected frequency must be greater than or equal to 5. The null hypothesis for the test is that the variables are independent and the alternative hypothesis is that they are dependent.

7 The Chi-square Test for Independence-II
The way this test works is to compare the observed frequencies with the expected frequencies (these expected frequencies are calculated assuming that the two variables are independent). If the value of the test statistic is high then we reject the null hypothesis of independence.

8 The Chi-square Test for Independence-III
The test statistic for the chi-square independence test is:
$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$
where $O_{ij}$ represents the observed frequencies and $E_{ij}$ represents the expected frequencies. The sampling distribution for the test statistic is a chi-square distribution with (r-1)(c-1) degrees of freedom.
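
As a rough illustration, the statistic and its degrees of freedom can be computed directly from a table of observed counts. The sketch below uses the Example A data; `chi_square_statistic` is a hypothetical helper name, and the "Yes" count of 16 for the 41+ age class is an assumption inferred from the totals on the expected-frequency slide.

```python
def chi_square_statistic(observed):
    """Return (chi-square statistic, degrees of freedom) for an r x c table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            # Expected count under independence for cell (i, j).
            e = row_totals[i] * col_totals[j] / n
            stat += (o - e) ** 2 / e
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return stat, df

# Example A counts; the 16 in the "Yes" row for 41+ is an inferred value.
example_a = [
    [12, 16, 22, 21, 16],
    [219, 180, 232, 190, 75],
]
stat, df = chi_square_statistic(example_a)
print(round(stat, 2), df)
```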

9 Example-A-2
Contribution of each cell, (O - E)² / E, to the test statistic:
High heavy metals?   Age-class
                     1-10    11-20   21-30   31-40   41+
Yes                  3.488   0.105   0.010   0.290   7.840
No                   0.339   0.010   0.001   0.028   0.761
The value of the test statistic is in the rejection region for α=0.05 but not for α=0.01.
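
The decision for Example A can be written out as a short worked comparison. The critical values are taken from a standard chi-square table for 4 degrees of freedom, and the statistic of about 12.87 is the sum of the ten cell contributions recomputed from the observed and expected frequencies (so it is approximate to rounding).

```latex
\[
  \mathrm{df} = (r-1)(c-1) = (2-1)(5-1) = 4, \qquad \chi^2 \approx 12.87
\]
\[
  \chi^2_{0.05,\,4} = 9.488 \;<\; 12.87 \;<\; 13.277 = \chi^2_{0.01,\,4}
\]
```

The statistic exceeds the critical value for α=0.05, so independence is rejected at that level, but it falls short of the critical value for α=0.01.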

10 Using EXCEL to conduct Chi-square Tests.
EXCEL includes a function CHITEST which can be used to test for independence: CHITEST(observed range, expected range). CHITEST returns the probability associated with the test statistic, i.e. it returns CHIDIST(χ², (r-1)(c-1)). Applying CHITEST to the data for the example returns a probability less than 0.05 and greater than 0.01.
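
For readers without EXCEL, the same upper-tail probability can be sketched in plain Python. This is not EXCEL's implementation: it relies on the closed form of the chi-square upper tail that holds only for an even number of degrees of freedom (Example A has 4). The names `chi_square_sf_even_df` and `chitest_like` are hypothetical, and the "Yes" count of 16 for the 41+ age class is an assumption inferred from the slide totals.

```python
import math

def chi_square_sf_even_df(x, df):
    """P(X >= x) for a chi-square variable with positive, even df.

    Closed form: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only handles positive even df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** k / math.factorial(k)
                                 for k in range(df // 2))

def chitest_like(observed):
    """Return the p-value of the independence test for an r x c table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    stat = sum((o - row_totals[i] * col_totals[j] / n) ** 2
               / (row_totals[i] * col_totals[j] / n)
               for i, row in enumerate(observed)
               for j, o in enumerate(row))
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi_square_sf_even_df(stat, df)

# Example A counts; the 16 in the "Yes" row for 41+ is an inferred value.
p_value = chitest_like([[12, 16, 22, 21, 16],
                        [219, 180, 232, 190, 75]])
print(p_value)
```

The printed probability should lie between 0.01 and 0.05, in line with the conclusion stated on the slide.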

11 Example-B-1
We sample 150 animals and assess the number in each of four categories to be:
          Mature   Immature
Female        30         32
Male          40         48
Test the null hypothesis that sex and maturity state are independent (α=0.01).

12 Example-B-2
Observed frequencies, with expected frequencies in parentheses:
          Mature        Immature
Female    30 (28.93)    32 (33.07)
Male      40 (41.07)    48 (46.93)
χ²=0.1256. We cannot reject the null hypothesis of independence. We did reject the null hypothesis that these data are consistent with a “healthy” marine mammal population.
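
Example B can be checked the same way. The sketch below recomputes the statistic for the 2 x 2 table and, because the test has a single degree of freedom, obtains the p-value from the identity P(χ² ≥ x) = erfc(√(x/2)). The variable names are illustrative, and the critical value 6.635 (α=0.01, 1 degree of freedom) is taken from a standard chi-square table.

```python
import math

# Observed counts for Example B (rows: Female, Male; columns: Mature, Immature).
observed_b = [
    [30, 32],
    [40, 48],
]

row_totals = [sum(row) for row in observed_b]
col_totals = [sum(col) for col in zip(*observed_b)]
n = sum(row_totals)

stat_b = 0.0
for i, row in enumerate(observed_b):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n  # expected count under independence
        stat_b += (o - e) ** 2 / e

df_b = (len(observed_b) - 1) * (len(observed_b[0]) - 1)

# Upper-tail probability for one degree of freedom.
p_value_b = math.erfc(math.sqrt(stat_b / 2.0))

critical_value = 6.635  # chi-square table, alpha = 0.01, df = 1
reject = stat_b > critical_value
print(round(stat_b, 4), df_b, reject)
```

The statistic is far below the critical value, which matches the slide's conclusion that independence cannot be rejected.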

13 Homogeneity of Proportions
The chi-square test can be used to test the null hypothesis that proportions in various categories are equal among several populations. The alternative hypothesis for this test is that at least one proportion differs among populations.

