Computing a Confidence Interval for a Proportion
A Three Color Bowl Suppose we have a bowl containing marbles, each identical in size, texture and weight, in three colors: Red, Green, Blue.
Proportion Red Suppose we have a large population containing marbles, each identical in size, texture and weight, in three colors: Red, Green, Blue. Suppose further that we wish to estimate the population proportion of red, but that examining the population directly and exhaustively is impractical.
Sample Proportion Red
Color   n color   p color
Blue    20        p blue = (20/50) = .40
Green   15        p green = (15/50) = .30
Red     15        p red = (15/50) = .30
Total   50        50/50 = 1
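The table's arithmetic can be reproduced with a short sketch. The counts are taken from the slide; the names `counts` and `proportions` are illustrative only.

```python
# Counts observed in the sample of 50 marbles (from the slide).
counts = {"Blue": 20, "Green": 15, "Red": 15}

# Total sample size is the sum of the per-color counts.
n = sum(counts.values())

# Sample proportion for each color: count divided by sample size.
proportions = {color: k / n for color, k in counts.items()}

for color, p in proportions.items():
    print(f"{color}: {counts[color]}/{n} = {p:.2f}")
```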
Sample Proportion Red
n red = 15
n = 50
p red = n red / n = 15 / 50 = .30
sdp = sqrt(p*(1-p)/n) = sqrt(.30*(1-.30)/n) = sqrt(.30*.70/50) = sqrt(.210/50) = sqrt(.0042) ≈ .06481
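As a check on the hand calculation, here is a minimal sketch of the same formula, sdp = sqrt(p*(1-p)/n). The function name `sample_sd_of_proportion` is a hypothetical helper, not something from the slides.

```python
import math

def sample_sd_of_proportion(p, n):
    """Estimated standard deviation of a sample proportion: sqrt(p*(1-p)/n)."""
    return math.sqrt(p * (1 - p) / n)

n_red, n = 15, 50
p_red = n_red / n                      # .30
sdp = sample_sd_of_proportion(p_red, n)
print(round(sdp, 5))                   # 0.06481
```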
Confidence Level Our next step is to select a confidence level; this number provides a level of confidence in our estimation process. A standard choice is 95% confidence. Using the table, we find the row for 95% confidence: our multiplier is 2.00.
Z(k) PROBRT PROBCENT
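Where a printed table is not at hand, the multiplier can be looked up numerically. The sketch below uses the standard normal distribution from Python's standard library. It assumes the usual two-sided normal quantile is what the table encodes. The exact 95% quantile is about 1.96, and the rounded multiplier 2.00 corresponds to a central probability of roughly 95.45%.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

confidence = 0.95
# Two-sided multiplier: leave (1 - confidence)/2 in each tail.
z_exact = std_normal.inv_cdf(1 - (1 - confidence) / 2)

# Central probability actually covered by the rounded multiplier Z = 2.
central_prob_z2 = std_normal.cdf(2.0) - std_normal.cdf(-2.0)

print(f"exact multiplier for 95%: {z_exact:.4f}")
print(f"central probability for Z = 2: {central_prob_z2:.4f}")
```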
Lower Confidence Bound
p red = .30
sdp ≈ .06481
Z = 2
lower bound = p red – Z*sdp = .30 – Z*sdp = .30 – 2*sdp ≈ .30 – 2*.06481 ≈ .1703
Upper Confidence Bound
p red = .30
sdp ≈ .06481
Z = 2
upper bound = p red + Z*sdp = .30 + Z*sdp = .30 + 2*sdp ≈ .30 + 2*.06481 ≈ .4296
Write the Interval We write the approximate interval as [.1703,.4296 ].
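The two bounds can be computed together. This is a minimal sketch of the calculation on the slides, with Z = 2 as the default multiplier; the function name `proportion_interval` is hypothetical.

```python
import math

def proportion_interval(n_success, n, z=2.0):
    """Return (lower, upper) = p -/+ z*sdp, where sdp = sqrt(p*(1-p)/n)."""
    p = n_success / n
    sdp = math.sqrt(p * (1 - p) / n)
    return p - z * sdp, p + z * sdp

lower, upper = proportion_interval(15, 50)
print(f"[{lower:.4f}, {upper:.4f}]")
```

Carried to more digits, the lower bound is about .17039, which the slides report as .1703.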
Confidence Estimation Schematic
Population: true proportion P red
Obtain sample: size = n
Compute: n red, p red, sdp
Compute: lower = p red – Z*sdp, upper = p red + Z*sdp
Interpretation ─ Population and Proportion We have a large population of marbles. We seek the true population proportion of red marbles for this population.
Interpretation ─ Family of Samples We obtain random samples of n=50 marbles per sample. Each marble is drawn from the population with replacement. Our Family of Samples consists of every possible random sample as described above.
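Drawing one member of the Family of Samples can be sketched as follows. The population proportions in `POPULATION` are purely hypothetical, chosen for illustration; in practice they are unknown. `random.choices` draws with replacement, matching the sampling scheme described above.

```python
import random
from collections import Counter

# Hypothetical population proportions (an assumption for illustration only).
POPULATION = {"Red": 0.30, "Green": 0.30, "Blue": 0.40}

def draw_sample(n=50, rng=None):
    """Draw n marbles with replacement and return the count of each color."""
    if rng is None:
        rng = random.Random()
    colors = list(POPULATION)
    weights = list(POPULATION.values())
    return Counter(rng.choices(colors, weights=weights, k=n))

sample = draw_sample(rng=random.Random(1))  # seeded for repeatability
print(dict(sample))
```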
Interpretation ─ Family of Intervals From each member of the Family of Samples we compute the interval [p red ─ 2*sdp, p red + 2*sdp], where p red = n red /n and sdp = sqrt(p red *(1- p red )/n). Our Family of Intervals consists of every possible interval computed as above.
Interpretation ─ Confidence Approximately 95% of the members of the Family of Intervals cover P red, the true population proportion of red marbles. The remaining 5% or so fail. We view our single interval, [.1703,.4296 ], as being drawn at random from the Family of Intervals. If our interval is drawn from the 95% supermajority, then between 17.03% and 42.96% of the marbles are red.
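The "approximately 95%" claim can be explored by simulation. The sketch below assumes a hypothetical true proportion `p_true = 0.30`. It draws many samples of n = 50 with replacement, builds the interval from each, and reports the fraction that cover `p_true`. Because the interval rests on a normal approximation, the simulated rate should land near 95% rather than exactly on it.

```python
import math
import random

def covers(p_true, n, z, rng):
    """Draw one sample of size n and report whether its interval covers p_true."""
    n_red = sum(rng.random() < p_true for _ in range(n))
    p = n_red / n
    sdp = math.sqrt(p * (1 - p) / n)
    return p - z * sdp <= p_true <= p + z * sdp

def coverage_rate(p_true=0.30, n=50, z=2.0, trials=20000, seed=0):
    """Fraction of simulated intervals that cover the true proportion."""
    rng = random.Random(seed)
    hits = sum(covers(p_true, n, z, rng) for _ in range(trials))
    return hits / trials

rate = coverage_rate()
print(f"simulated coverage: {rate:.3f}")
```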