Published by Cameron Walsh. Modified over 9 years ago.
1
G89.2228 Lecture 7a
Comparing proportions from independent samples
Analysis of matched samples
Small samples and 2×2 tables
Strength of association
– Odds ratios
2
Difference in proportions from independent samples
Henderson-King & Nisbett had a binary outcome that is most appropriately analyzed using methods for categorical data.
Consider their question: is the choice to sit next to a Black person different in groups exposed to a disruptive Black vs. White person?
In their study, $p_1 = 11/37 = .297$ and $p_2 = 16/35 = .457$. Are these numbers consistent with the same population mean? (Is the difference zero?)
Consider the general large-sample test statistic, which will be distributed N(0,1) when the sample sizes are large:
$$z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{SE_{\bar{X}_1}^2 + SE_{\bar{X}_2}^2}}$$
3
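Differences in proportions
Under the null hypothesis, the standard errors of the means are $\sigma/\sqrt{n_1}$ and $\sigma/\sqrt{n_2}$, where $\sigma$ is the population standard deviation.
Under the null hypothesis, the common proportion is estimated by pooling the data:
$$p = \frac{11 + 16}{37 + 35} = .375$$
The common variance is $\sigma^2 = p(1 - p) = (.375)(.625) = .234$.
The z statistic is then
$$z = \frac{p_2 - p_1}{\sqrt{p(1 - p)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{.160}{.114} = 1.40$$
The two-tailed p-value is .16.
A 95% CI on the difference is .160 ± (1.96)(.114) = (-.06, .38). It includes the $H_0$ value of zero.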
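The arithmetic on this slide can be checked with a short script. This is a minimal sketch, assuming the counts 11/37 and 16/35 quoted elsewhere in the lecture and a pooled standard error for both the test and the interval (which is what the .114 on the slide implies). The function name `two_proportion_z` is illustrative, not something from the lecture.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Pooled large-sample z test of H0: p1 == p2 (independent samples)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # common proportion under H0
    # Standard error of the difference, using the pooled variance p(1 - p)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
    return p2 - p1, se, z, p_two_tailed


# Counts assumed from the lecture example: 11 of 37 and 16 of 35
diff, se, z, p = two_proportion_z(11, 37, 16, 35)
lo, hi = diff - 1.96 * se, diff + 1.96 * se

print(f"diff={diff:.3f}  se={se:.3f}  z={z:.2f}  p={p:.2f}")
print(f"95% CI: ({lo:.2f}, {hi:.2f})")
```

A CI built from the unpooled standard error would be slightly narrower; the sketch keeps the pooled value so that it matches the numbers on the slide.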
4
Pearson Chi Square for 2×2 Tables
The z test statistic has a standard normal N(0,1) distribution for large samples.
$z^2$ is distributed as $\chi^2$ with 1 degree of freedom for large samples.
From the example, $1.4^2 = 1.96$. Howell's table for Chi Square shows $\Pr(\chi^2 > 1.96)$ to be in the range .10 to .25.
Pearson's calculation for this test statistic is:
$$\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$$
where $O_i$ is an observed frequency and $E_i$ is the expected frequency given the null hypothesis of equal proportions.
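Because a chi-square variable with 1 degree of freedom is the square of a standard normal, its upper-tail probability can be obtained from the normal CDF instead of a printed table. The sketch below is only an illustration of that relationship; the helper name `chi2_1df_upper_tail` is made up for this example.

```python
from math import sqrt
from statistics import NormalDist


def chi2_1df_upper_tail(x):
    """Pr(chi-square with 1 df > x), computed as the two-tailed normal area beyond sqrt(x)."""
    return 2 * (1 - NormalDist().cdf(sqrt(x)))


# z = 1.4 from the lecture example, so the chi-square statistic is 1.4 squared
p_value = chi2_1df_upper_tail(1.4 ** 2)
print(round(p_value, 3))  # falls inside the tabled range .10 to .25
```

The same shortcut does not apply for more than 1 degree of freedom, where a chi-square table or a statistics library is needed.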
5
Expected values for no association
From the example: $p_1 = 11/37 = .297$ and $p_2 = 16/35 = .457$.
The expected frequencies are based on a pooled $p = (11+16)/(37+35) = .375$.
6
Chi square test of association, continued
Marginal probabilities = pooled proportions
Expected joint probabilities | $H_0$ = product of marginals (e.g., .193 = .375 × .514)
$E_i$ = expected joint probability × n (e.g., 13.875 = .193 × 72 = 27 × 37/72)
We use these values in Pearson's formula.
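The expected frequencies and Pearson's statistic can be worked through in a few lines. This is a sketch under one assumption: the observed 2×2 table is rebuilt from the quoted proportions (11 of 37 and 16 of 35), so the row and column labels in the comments are placeholders, not the study's own labels.

```python
def pearson_chi_square(table):
    """Return (chi-square, expected counts) for a table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    # Expected count under independence: row total * column total / n
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    return chi2, expected


# Rows: group 1 (n = 37) and group 2 (n = 35).
# Columns: outcome present (11 and 16), outcome absent (26 and 19).
observed = [[11, 26], [16, 19]]
chi2, expected = pearson_chi_square(observed)

print(expected[0][0])  # 27 * 37 / 72
print(round(chi2, 2))  # agrees with z squared from the z test
```

The statistic matches the squared z value from the pooled test, which is the equivalence the slides rely on.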
7
Analysis of 2×2 tables for small samples
The z and $\chi^2$ tests are justified on the basis of the central limit theorem, and will be approximately correct for fairly small n's.
What if the sample is ridiculously small?
– Rule of thumb: if expected frequencies are less than 2.5, the sample is small
For small n's, Fisher recommended using a randomization test
– Suppose we have N subjects, $g_1$ are in group 1, and $r_1$ overall respond positively
– Under $H_0$, response and group are independent
– Consider this thought experiment: put all N subjects in an urn, randomly draw $r_1$ subjects, and pretend that they are the positive responders. How often would the original pattern of data emerge from such a random process?
8
Fisher's Exact test
Suppose we have the following table:
[table not recovered]
The Pearson Chi Square would be 3.6, and the two-tailed p is .058.
The hypergeometric probability of getting 1 or fewer Grp2 responses (given that 5 people responded) is:
[equation not recovered]
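The slide's table did not survive in this transcript, so the sketch below uses an assumed one: two groups of 5, with 4 responders in Grp1 and 1 in Grp2. That table is a guess, chosen because it reproduces the quoted Pearson Chi Square of 3.6 with 5 responders overall; the lecture's actual table may differ. The helper `hypergeom_pmf` is likewise an illustrative name.

```python
from math import comb


def hypergeom_pmf(x, n_total, n_group, n_drawn):
    """P(exactly x of the n_drawn responders come from a group of size n_group)."""
    return (
        comb(n_group, x)
        * comb(n_total - n_group, n_drawn - x)
        / comb(n_total, n_drawn)
    )


# ASSUMED table (not taken from the slide):
#          respond   not
#   Grp1      4       1
#   Grp2      1       4
a, b, c, d = 4, 1, 1, 4
n_total = a + b + c + d

# Pearson chi-square for a 2x2 table, as a check against the slide's 3.6
chi2 = n_total * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# One-tailed Fisher probability: 1 or fewer of the 5 responders are in Grp2
n_grp2, n_responders = c + d, a + c
p_one_tailed = sum(
    hypergeom_pmf(x, n_total, n_grp2, n_responders) for x in (0, 1)
)

print(chi2)
print(round(p_one_tailed, 3))
```

Under this assumed table the exact one-tailed probability is noticeably larger than half of the chi-square p-value, which illustrates why the large-sample approximation is not trusted for very small tables.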
9
Analysis of Matched Samples
Many research questions involve comparing proportions computed from related observations, the analogue of the paired t-test.
– Analysis of change
– Within-subjects designs
– Analysis of siblings, spouses, supervisor-employee pairs, …
– Samples constructed by matching on confounding variables
When the outcome is binary, display the data as the numbers of pairs in each cell (the joint distribution).
10
Example (Howell Ex. 6.21-22)
Is the proportion pro the same at the two time points?
Note that the marginals (30/40 and 15/40) are not independent.
Instead of comparing those proportions, examine those whose opinions change.
Compare the observed (5, 20) to the expected (12.5, 12.5) as a Chi Square test.
11
McNemar's test
McNemar showed that this test [whether (5,20) is significantly different from (12.5,12.5)] may be computed, with the Yates correction for continuity, as:
$$\chi^2 = \frac{(|b - c| - 1)^2}{b + c}$$
where b and c are the two counts of pairs whose responses changed.
For the example, $(20 - 5 - 1)^2/25 = 7.84$ is unusual for a $\chi^2$ with 1 d.o.f., yielding p = .005.
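McNemar's statistic depends only on the two counts of discordant pairs, so it is easy to verify by hand or in code. The sketch below assumes the continuity-corrected form, (|b - c| - 1)² / (b + c), and gets the p-value from the normal CDF because the reference distribution has 1 degree of freedom. The function name `mcnemar` is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def mcnemar(b, c):
    """Continuity-corrected McNemar chi-square for discordant pair counts b and c."""
    chi2 = (abs(b - c) - 1) ** 2 / (b + c)
    # Upper tail of chi-square with 1 df equals the two-tailed normal area
    p = 2 * (1 - NormalDist().cdf(sqrt(chi2)))
    return chi2, p


# Discordant pairs from the lecture example: 20 changed one way, 5 the other
chi2, p = mcnemar(20, 5)
print(f"chi2={chi2:.2f}  p={p:.3f}")
```

The concordant pairs (those whose opinion did not change) never enter the calculation, which is the point of examining only the changers.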
12
Confidence interval for matched proportion difference
Fleiss (1981) recommends using the general form of the symmetric CI for the difference between $p_1$ and $p_2$:
$$(p_1 - p_2) \pm z_{\alpha/2}\, SE(p_1 - p_2)$$
where the standard error is estimated using
[equation not recovered]
E.g., the 95% CI for the difference (30/40) − (15/40) = .375 requires the SE
[equation not recovered]
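The standard-error expression on this slide was lost in extraction. A commonly used large-sample estimate for the difference of two matched proportions is sqrt(b + c − (b − c)²/n) / n, where b and c are the discordant pair counts and n is the number of pairs. The sketch below assumes that formula and omits any continuity correction, so it may not match the slide's numbers exactly.

```python
from math import sqrt


def matched_diff_ci(b, c, n, z=1.96):
    """Approximate CI for p1 - p2 from matched pairs (no continuity correction)."""
    diff = (b - c) / n
    # ASSUMED standard error; the lecture's own expression was not recovered
    se = sqrt((b + c) - (b - c) ** 2 / n) / n
    return diff, se, (diff - z * se, diff + z * se)


# Lecture example: 40 pairs, 20 and 5 discordant, so p1 - p2 = 30/40 - 15/40
diff, se, (lo, hi) = matched_diff_ci(20, 5, 40)
print(f"diff={diff:.3f}  se={se:.3f}  95% CI=({lo:.2f}, {hi:.2f})")
```

Under these assumptions the interval excludes zero, which agrees with the McNemar result for the same data.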
13
Measures of association
Consider two tables:
[tables not recovered]
The proportions with D in groups A and B are .90 vs. .50 in the first table (.90 − .50 = .40) and .82 vs. .33 in the second (.82 − .33 = .49). Is the difference stronger in the second table?
14
Odds ratios as an alternative to differences in proportions
The proportions in group A within levels D and ~D do not differ across tables. Which way we look at the table gives different answers.
The odds of D (vs. ~D) are 9 to 1 in group A and 1 to 1 in group B in the first table. The odds ratio is 9: the odds are 9 times greater for group A than for group B.
The odds ratio is also 9 in the second table.
15
Properties of odds ratios
Invariant to multiplying rows or columns by a constant
Equal to one for equal odds
Approaches infinity when off-diagonal cells approach zero
Approaches zero when diagonal cells approach zero
Easily computed as OR = ad/bc
Log(OR) (a “logit”) has a less obvious interpretation, but nicer scale features: the equal-odds point is ln(1) = 0
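The invariance property is easiest to see with numbers. The two tables below are hypothetical: the lecture's own tables were not recovered, so these counts were chosen only to reproduce the quoted proportions (.90 vs. .50, and about .82 vs. .33). The second table is the first with its D column halved.

```python
def odds_ratio(table):
    """Odds ratio ad/bc for a 2x2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    return (a * d) / (b * c)


def proportion_difference(table):
    """Difference in row proportions: a/(a+b) - c/(c+d)."""
    (a, b), (c, d) = table
    return a / (a + b) - c / (c + d)


# HYPOTHETICAL counts. Rows: groups A and B. Columns: D and ~D.
table_1 = [[90, 10], [50, 50]]
table_2 = [[45, 10], [25, 50]]  # D column multiplied by 1/2

for name, table in (("table 1", table_1), ("table 2", table_2)):
    print(name, round(proportion_difference(table), 2), odds_ratio(table))
```

The difference in proportions changes when a column is rescaled, while the odds ratio stays at 9 in both tables.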
16
Confidence interval on odds ratio
Like other bounded parameters, confidence intervals for OR are difficult (a symmetric bound does not work well).
An approximate, but improved, CI on ln(OR) = ln(ad/bc) uses
$$SE(\ln OR) = \sqrt{\frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}}$$
Compute the CI on ln(OR), then take the antilog (i.e., $e^x$) of each bound.
17
Example
From the first table above, the 95% CI on ln(OR) is [values not recovered].
The 95% CI on OR is thus [values not recovered]: an asymmetric confidence interval.
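The numbers for this example were lost, so the sketch below only illustrates the procedure. It assumes the usual large-sample standard error for the log odds ratio, sqrt(1/a + 1/b + 1/c + 1/d), and hypothetical cell counts (90, 10, 50, 50) that give an odds ratio of 9. The resulting interval will not necessarily match the slide's.

```python
from math import exp, log, sqrt


def odds_ratio_ci(a, b, c, d, z=1.96):
    """CI for the odds ratio: build it on the log scale, then exponentiate."""
    log_or = log((a * d) / (b * c))
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # assumed large-sample SE of ln(OR)
    lo, hi = log_or - z * se, log_or + z * se
    return exp(log_or), (lo, hi), (exp(lo), exp(hi))


# HYPOTHETICAL counts: group A has 90 D and 10 ~D; group B has 50 and 50
or_hat, ci_log, ci_or = odds_ratio_ci(90, 10, 50, 50)

print(f"OR={or_hat:.2f}")
print(f"CI on ln(OR): ({ci_log[0]:.2f}, {ci_log[1]:.2f})")
print(f"CI on OR:     ({ci_or[0]:.2f}, {ci_or[1]:.2f})")
```

The interval is symmetric around ln(9) on the log scale, but after taking antilogs the upper bound lies much farther from 9 than the lower bound does.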