
Slide 1: G89.2228 Lecture 6b
Generalizing from tests of quantitative variables to tests of categorical variables
Testing a hypothesis about a single proportion
– “Exact” binomial test
– Large sample test: Normal
– Chi Square test
– Making the Binomial and Large sample tests agree
Confidence Bound on proportion
– symmetric bound
– nonsymmetric bound
Differences in proportions
– Large sample test
– Confidence bounds
– Chi Square test (2 × 2 table)

Slide 2: Generalizing from tests of quantitative variables to tests of categorical variables
Binary variables (X=0 or X=1) resemble quantitative variables in several ways:
The mean, E(X), is in the range (0,1)
– It is interpreted as a probability p
The variance of X, computed the usual way, turns out to be p(1-p)
The sample mean, $\bar{X}$, is itself approximately normally distributed for large sample sizes
The logic for tests of binary means works in large samples the same way it does for continuous means

Slide 3: Binomial Variation
If we know E(X), we know V(X): for a binary variable with E(X) = p, the variance is V(X) = p(1-p).
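
As a small illustration of why the mean determines the variance, the sketch below computes both moments of a binary variable directly from its two outcomes. The function name `bernoulli_mean_and_variance` is hypothetical and not part of the lecture.

```python
def bernoulli_mean_and_variance(p):
    """Return (E(X), V(X)) for a binary X with P(X=1) = p (illustrative helper)."""
    outcomes = [(0, 1.0 - p), (1, p)]  # (value, probability) pairs
    mean = sum(x * prob for x, prob in outcomes)
    variance = sum((x - mean) ** 2 * prob for x, prob in outcomes)
    return mean, variance


# The variance computed "the usual way" matches p(1-p) for each p.
for p in (0.1, 0.25, 0.5):
    mean, variance = bernoulli_mean_and_variance(p)
    print(p, mean, variance, p * (1 - p))
```

The variance is largest at p = .5 and shrinks toward zero as p approaches 0 or 1.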

Slide 4: Testing a hypothesis about a single proportion, $H_0: E(X) = p = k$
Example: Does a population of musicians have the same proportion of left-handed people as the population at large?
– Sample 20 Juilliard musicians, and find that 5 are left-handed (fictional data)
– If X=0 for right-handed and X=1 for left-handed, then $\bar{X} = 5/20 = .25$.
– If p=.1, is 5/20 an unusual event?
Exact test: application of the binomial distribution
$$\Pr(\text{5 or more left-handed} \mid n=20,\ p=.1) = 1 - \sum_{x=0}^{4} \binom{20}{x} (.1)^x (.9)^{20-x} = 1 - (.122 + .270 + .285 + .190 + .090) = .043$$
One-tailed inference would call $H_0$ into question, but two-tailed inference would not, although there are no relevant possibilities in the other tail!
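
The exact calculation can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the helper names `binomial_pmf` and `upper_tail_p_value` are illustrative.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


def upper_tail_p_value(observed, n, p):
    """P(X >= observed) for X ~ Binomial(n, p), via the complement of the lower tail."""
    return 1.0 - sum(binomial_pmf(x, n, p) for x in range(observed))


# 5 left-handed musicians out of 20, with p = .1 under the null hypothesis
p_value = upper_tail_p_value(5, 20, 0.1)
print(round(p_value, 3))  # 0.043
```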

Slide 5: Large sample test: Normal
While we can work with 20 subjects and 5 positive cases, what about 36, 48 or 180 subjects? Binomial calculations are often tedious.
How about using the central limit theorem to test a z statistic?
Assuming $H_0$ is true, we know that $E(\bar{X}) = p = .1$ and that the standard error of $\bar{X}$ is $\sqrt{p(1-p)/n} = \sqrt{(.1)(.9)/20} = .067$, so
$$Z = \frac{\bar{X} - p}{\sqrt{p(1-p)/n}} = \frac{.25 - .10}{.067} = \frac{.15}{.067} = 2.24$$
Under the null hypothesis, a z this extreme would be observed only 13 times out of 1000 for a one-tailed test, or 25 out of 1000 for a two-tailed test.
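
A minimal sketch of the large sample test, assuming the standard error is computed from the null value p = .1. The normal tail area is obtained from the standard library's `math.erfc`; the function names are illustrative.

```python
from math import erfc, sqrt


def normal_upper_tail(z):
    """P(Z > z) for a standard normal Z."""
    return 0.5 * erfc(z / sqrt(2.0))


def one_sample_proportion_z(successes, n, p0):
    """z statistic for H0: p = p0, using the null standard error."""
    p_hat = successes / n
    standard_error = sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / standard_error


z = one_sample_proportion_z(5, 20, 0.1)
one_tailed = normal_upper_tail(z)
two_tailed = 2 * normal_upper_tail(abs(z))
print(round(z, 2), round(one_tailed, 3), round(two_tailed, 3))
```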

Slide 6: Chi Square test
$Z^2 = (2.24)^2 = 5.0$ can also be evaluated as $\chi^2(1)$. On page 672 of Howell we see that this value corresponds to a p of .025.
Squaring the Z makes it implicitly a two-tailed test.
Pearson showed that this same statistic can be computed by comparing the observed frequencies, (5, 15), to the expected frequencies under $H_0$, (2, 18). Let $O_i$ represent the observed frequency and $E_i$ the expected frequency. Pearson’s goodness-of-fit Chi Square is:
$$\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i} = \frac{(5-2)^2}{2} + \frac{(15-18)^2}{18} = 5.0$$
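
The goodness-of-fit calculation for the left-handedness example can be sketched as follows. The p value uses the fact that a chi square with 1 degree of freedom is a squared standard normal, so its upper tail equals `erfc(sqrt(x / 2))`; the helper names are illustrative.

```python
from math import erfc, sqrt


def pearson_chi_square(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi_square_1df_p_value(chi_square):
    """P(chi2(1) > x), which equals P(|Z| > sqrt(x)) for a standard normal Z."""
    return erfc(sqrt(chi_square / 2.0))


n, p0 = 20, 0.1
observed = [5, 15]                  # left-handed, right-handed
expected = [n * p0, n * (1 - p0)]   # 2 and 18 under H0
chi_square = pearson_chi_square(observed, expected)
p_value = chi_square_1df_p_value(chi_square)
print(round(chi_square, 2), round(p_value, 3))
```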

Slide 7: Making the Binomial and Large sample tests agree
The one-tailed p value for the binomial test was .043, while for the z (or $\chi^2$) it was .013. Why the difference?
– The binomial has discrete jumps in probability as we consider the possibility of 0, 1,…,5 left-handed persons out of 20.
– The z and $\chi^2$ tests make use of continuous distributions.
Yates suggested a correction:
$$z_c = \frac{|\bar{X} - p| - 1/(2n)}{\sqrt{p(1-p)/n}} = \frac{.15 - .025}{.067} = 1.86$$
The p value for this corrected z (one-tailed) is .031. It is called a “correction for continuity”, but its use is somewhat controversial.
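
A sketch of the continuity-corrected z, assuming the usual form of the correction in which 1/(2n) is subtracted from the absolute difference between the sample proportion and the null value. With the lecture's numbers it reproduces the one-tailed p of about .031.

```python
from math import erfc, sqrt


def yates_corrected_z(successes, n, p0):
    """z statistic with the absolute difference shrunk by 1/(2n)."""
    p_hat = successes / n
    standard_error = sqrt(p0 * (1 - p0) / n)
    return (abs(p_hat - p0) - 1 / (2 * n)) / standard_error


def normal_upper_tail(z):
    """P(Z > z) for a standard normal Z."""
    return 0.5 * erfc(z / sqrt(2.0))


z_corrected = yates_corrected_z(5, 20, 0.1)
one_tailed = normal_upper_tail(z_corrected)
print(round(z_corrected, 2), round(one_tailed, 3))
```

The corrected p value sits between the uncorrected normal result (.013) and the exact binomial result (.043).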

Slide 8: Confidence Bound on binomial proportion
What procedure can be used to define a bound on µ that will contain the parameter 95% of the time it is used?
So far, we have considered symmetric bounds of the form $\hat{\theta} \pm 1.96\,\mathrm{SE}(\hat{\theta})$, where $\hat{\theta}$ is the estimate of the parameter value $\theta$.
This general form does not always work well for parameters that are bounded, such as the mean of a binary variable.
If p is in the range (.2,.8), we can usually get by with the symmetric form, with p the sample proportion:
$$p \pm 1.96\sqrt{p(1-p)/n} = .25 \pm 1.96\sqrt{(.25)(.75)/20} = (.06, .44)$$
The continuity correction expands the bounds by 1/(2n) = .025 on each side: (.06-.025, .44+.025) = (.035, .465).
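
The symmetric bounds for the example (5 of 20) can be reproduced with the sketch below. It assumes the standard error is estimated from the sample proportion and uses 1.96 as the multiplier; the function name and its `continuity` flag are illustrative.

```python
from math import sqrt


def symmetric_bounds(successes, n, k=1.96, continuity=False):
    """Symmetric confidence bounds p +/- k * sqrt(p(1-p)/n).

    With continuity=True the half-width is widened by 1/(2n).
    """
    p = successes / n
    half_width = k * sqrt(p * (1 - p) / n)
    if continuity:
        half_width += 1 / (2 * n)
    return p - half_width, p + half_width


plain = symmetric_bounds(5, 20)
corrected = symmetric_bounds(5, 20, continuity=True)
print([round(b, 3) for b in plain])
print([round(b, 3) for b in corrected])
```

Nothing in this form keeps the lower bound above zero, which is one reason it can behave poorly when the proportion is near 0 or 1.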

Slide 9: A nonsymmetric CI for p
Fleiss (1981), Statistical Methods for Rates and Proportions, 2nd Edition, gives better expressions for continuity-corrected bounds:
$$p_L = \frac{(2np + k^2 - 1) - k\sqrt{k^2 - (2 + 1/n) + 4p(nq + 1)}}{2(n + k^2)}$$
$$p_U = \frac{(2np + k^2 + 1) + k\sqrt{k^2 + (2 - 1/n) + 4p(nq - 1)}}{2(n + k^2)}$$
where q=1-p and k=1.96 for 95% bounds.
Applying these formulas gives the bounds (.096, .49).
Which bounds to use? Simulations show Fleiss’s to be best, but they are not necessarily the most often used.
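
A sketch of the nonsymmetric bounds for the example. The expressions coded here are the continuity-corrected score-type bounds, which are assumed to be the ones the slide refers to; they do reproduce the quoted bounds (.096, .49). The function name is illustrative.

```python
from math import sqrt


def continuity_corrected_score_bounds(successes, n, k=1.96):
    """Nonsymmetric bounds on a proportion (assumed form of Fleiss's expressions)."""
    p = successes / n
    q = 1 - p
    denominator = 2 * (n + k ** 2)
    lower = ((2 * n * p + k ** 2 - 1)
             - k * sqrt(k ** 2 - (2 + 1 / n) + 4 * p * (n * q + 1))) / denominator
    upper = ((2 * n * p + k ** 2 + 1)
             + k * sqrt(k ** 2 + (2 - 1 / n) + 4 * p * (n * q - 1))) / denominator
    return lower, upper


lower, upper = continuity_corrected_score_bounds(5, 20)
print(round(lower, 3), round(upper, 3))
```

The interval is not centered on .25: it extends further above the sample proportion than below it, and the lower bound stays positive.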

Slide 10: Difference in proportions from independent samples
Henderson-King & Nisbett had a binary outcome that is most appropriately analyzed using methods for categorical data.
Consider their question: is the choice to sit next to a Black person different in groups exposed to a disruptive Black vs. White person?
In their study $\bar{X}_1 = 11/37 = .297$ and $\bar{X}_2 = 16/35 = .457$. Are these numbers consistent with the same population mean?
Consider the general large sample test statistic, which will be distributed N(0,1) when the sample sizes are large:
$$z = \frac{\bar{X}_2 - \bar{X}_1}{\sqrt{\sigma^2_{\bar{X}_1} + \sigma^2_{\bar{X}_2}}}$$

Slide 11: Differences in proportions
Under the null hypothesis, the standard errors of the means are $\sigma/\sqrt{n_1}$ and $\sigma/\sqrt{n_2}$, where σ is the population standard deviation.
Under the null hypothesis, the common proportion is estimated by pooling the data:
$$p = \frac{11 + 16}{37 + 35} = \frac{27}{72} = .375$$
The common variance is $p(1-p) = (.375)(.625) = .234$.
The Z statistic is then
$$z = \frac{.457 - .297}{\sqrt{.234\,(1/37 + 1/35)}} = \frac{.160}{.114} = 1.40$$
The two-tailed p-value is .16.
A 95% CI bound on the difference is .160 ± (1.96)(.114) = (-.06, .38). It includes the $H_0$ value of zero.
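
The pooled test for the two groups (11 of 37 and 16 of 35) can be sketched as follows. The confidence bound here reuses the pooled standard error, as the slide does; the function name is illustrative.

```python
from math import erfc, sqrt


def two_proportion_z(x1, n1, x2, n2):
    """Difference (group 2 minus group 1), pooled standard error, and z statistic."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    difference = p2 - p1
    return difference, standard_error, difference / standard_error


difference, standard_error, z = two_proportion_z(11, 37, 16, 35)
two_tailed_p = erfc(abs(z) / sqrt(2.0))  # equals 2 * P(Z > |z|)
bounds = (difference - 1.96 * standard_error, difference + 1.96 * standard_error)
print(round(z, 2), round(two_tailed_p, 2), [round(b, 2) for b in bounds])
```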

Slide 12: Pearson Chi Square for 2 × 2 Tables
The z test statistic (e.g. for the difference of two proportions) is a standard normal
$z^2$ is distributed as $\chi^2$ with 1 degree of freedom
From the example, $1.4^2 = 1.96$, p is between .1 and .25, and this is effectively a two-tailed test
Pearson’s calculation for this test statistic is:
$$\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$$
where $O_i$ is an observed frequency and $E_i$ is the expected frequency given the null hypothesis of equal proportions.

Slide 13: Expected values for no association
From the example: $p_1 = 11/37 = .297$ and $p_2 = 16/35 = .457$
The expected frequencies are based on a pooled p = (11+16)/(37+35) = .375

Slide 14: Chi square test of independence vs. association, continued
Marginal probabilities = pooled proportions
Expected joint probabilities given $H_0$ = product of the marginals (e.g. .193 = .375 × .514)
$E_i$ = expected joint probability × n (e.g. 13.875 = 27 × 37/72, or about .193 × 72)
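
The expected frequencies and the Pearson statistic for the 2 × 2 example can be sketched as follows. The table layout (one row per group, columns for outcome = 1 and outcome = 0) is an assumption made for illustration; the counts come from 11 of 37 and 16 of 35.

```python
from math import erfc, sqrt


def expected_frequencies(table):
    """Expected counts under no association: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in column_totals] for r in row_totals]


def pearson_chi_square(table):
    """Sum of (O - E)^2 / E over all cells of the table."""
    expected = expected_frequencies(table)
    return sum((o - e) ** 2 / e
               for observed_row, expected_row in zip(table, expected)
               for o, e in zip(observed_row, expected_row))


observed = [[11, 26],   # group 1: outcome = 1, outcome = 0
            [16, 19]]   # group 2: outcome = 1, outcome = 0
expected = expected_frequencies(observed)
chi_square = pearson_chi_square(observed)
p_value = erfc(sqrt(chi_square / 2.0))  # upper tail of chi square with 1 df
print(expected[0][0], round(chi_square, 2), round(p_value, 2))
```

The statistic equals the square of the pooled z for the difference in proportions, so the two tests give the same two-tailed p value.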

