Estimating a Population Proportion ADM 2304 – Winter 2012 ©Tony Quon.

Slides:



Advertisements
Similar presentations
Overview 8.1 Z Interval for the Mean 8.2 t Interval for the Mean
Advertisements

Estimation in Sampling
Chap 8-1 Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc. Chapter 8 Estimation: Single Population Statistics for Business and Economics.
Chapter 11- Confidence Intervals for Univariate Data Math 22 Introductory Statistics.
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 7-1 Chapter 7 Confidence Interval Estimation Statistics for Managers.
Point and Confidence Interval Estimation of a Population Proportion, p
Business Statistics: A Decision-Making Approach, 6e © 2005 Prentice-Hall, Inc. Chap 7-1 Introduction to Statistics: Chapter 8 Estimation.
1 Hypothesis Testing In this section I want to review a few things and then introduce hypothesis testing.
Chapter 8 Estimation: Single Population
Chapter 7 Estimation: Single Population
7-2 Estimating a Population Proportion
Copyright © 2010, 2007, 2004 Pearson Education, Inc. Lecture Slides Elementary Statistics Eleventh Edition and the Triola Statistics Series by.
BCOR 1020 Business Statistics
Statistics for Managers Using Microsoft® Excel 7th Edition
Business Statistics, A First Course (4e) © 2006 Prentice-Hall, Inc. Chap 8-1 Chapter 8 Confidence Interval Estimation Business Statistics, A First Course.
SP Lesson B. Sampling Distribution Model Sampling models are what make Statistics work. Shows how a sample proportion varies from sample to sample.
Statistics for Managers Using Microsoft® Excel 7th Edition
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 7-1 Chapter 7 Confidence Interval Estimation Statistics for Managers.
Estimation Goal: Use sample data to make predictions regarding unknown population parameters Point Estimate - Single value that is best guess of true parameter.
Chapter 7 Confidence Intervals and Sample Sizes
Business Statistics: A Decision-Making Approach, 6e © 2005 Prentice-Hall, Inc. Chap 7-1 Business Statistics: A Decision-Making Approach 6 th Edition Chapter.
Confidence Interval Estimation
Chapter 7 Confidence Intervals (置信区间)
Chapter 6 Confidence Intervals 1 Larson/Farber 4th ed.
June 18, 2008Stat Lecture 11 - Confidence Intervals 1 Introduction to Inference Sampling Distributions, Confidence Intervals and Hypothesis Testing.
Inference for a Single Population Proportion (p).
9- 1 Chapter Nine McGraw-Hill/Irwin © 2006 The McGraw-Hill Companies, Inc., All Rights Reserved.
QBM117 Business Statistics Estimating the population mean , when the population variance  2, is known.
Section Copyright © 2014, 2012, 2010 Pearson Education, Inc. Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series.
6 Chapter Confidence Intervals © 2012 Pearson Education, Inc.
Confidence Intervals 1 Chapter 6. Chapter Outline Confidence Intervals for the Mean (Large Samples) 6.2 Confidence Intervals for the Mean (Small.
Population All members of a set which have a given characteristic. Population Data Data associated with a certain population. Population Parameter A measure.
Confidence Intervals for the Mean (Large Samples) Larson/Farber 4th ed 1 Section 6.1.
Confidence Intervals for the Mean (σ known) (Large Samples)
Maximum Likelihood Estimator of Proportion Let {s 1,s 2,…,s n } be a set of independent outcomes from a Bernoulli experiment with unknown probability.
Estimating a Population Proportion
Estimation Chapter 8. Estimating µ When σ Is Known.
Unit 6 Confidence Intervals If you arrive late (or leave early) please do not announce it to everyone as we get side tracked, instead send me an .
Chap 7-1 A Course In Business Statistics, 4th © 2006 Prentice-Hall, Inc. A Course In Business Statistics 4 th Edition Chapter 7 Estimating Population Values.
Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Section 7-1 Review and Preview.
Introduction to Inferece BPS chapter 14 © 2010 W.H. Freeman and Company.
Basic Business Statistics, 10e © 2006 Prentice-Hall, Inc. Chap 8-1 Confidence Interval Estimation.
LECTURE 25 THURSDAY, 19 NOVEMBER STA291 Fall
1 Section 10.1 Estimating with Confidence AP Statistics January 2013.
Introduction to Inference: Confidence Intervals and Hypothesis Testing Presentation 8 First Part.
Sampling distributions rule of thumb…. Some important points about sample distributions… If we obtain a sample that meets the rules of thumb, then…
Section 6.1 Confidence Intervals for the Mean (Large Samples) Larson/Farber 4th ed.
Section Estimating a Proportion with Confidence Objectives: 1.To find a confidence interval graphically 2.Understand a confidence interval as consisting.
Copyright ©2013 Pearson Education, Inc. publishing as Prentice Hall
Understanding Basic Statistics
1 Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Example: In a recent poll, 70% of 1501 randomly selected adults said they believed.
AP Statistics Unit 5 Addie Lunn, Taylor Lyon, Caroline Resetar.
Business Statistics: A Decision-Making Approach, 6e © 2005 Prentice-Hall, Inc. Chap 7-1 Business Statistics: A Decision-Making Approach 6 th Edition Chapter.
Point Estimates point estimate A point estimate is a single number determined from a sample that is used to estimate the corresponding population parameter.
Chapter Outline 6.1 Confidence Intervals for the Mean (Large Samples) 6.2 Confidence Intervals for the Mean (Small Samples) 6.3 Confidence Intervals for.
1 Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Example: In a recent poll, 70% of 1501 randomly selected adults said they believed.
Chapter 8 Estimation ©. Estimator and Estimate estimator estimate An estimator of a population parameter is a random variable that depends on the sample.
Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc. Chap 8-1 Chapter 8 Confidence Interval Estimation Business Statistics: A First Course 5 th Edition.
Chapter Confidence Intervals 1 of 31 6  2012 Pearson Education, Inc. All rights reserved.
Chapter 6 Confidence Intervals 1 Larson/Farber 4th ed.
Confidence Intervals.
CHAPTER 8 Estimating with Confidence
ESTIMATION.
Week 10 Chapter 16. Confidence Intervals for Proportions
Confidence Interval Estimation and Statistical Inference
Confidence Interval Estimation
Estimation Goal: Use sample data to make predictions regarding unknown population parameters Point Estimate - Single value that is best guess of true parameter.
Chapter 8 Confidence Intervals.
Lecture Slides Elementary Statistics Twelfth Edition
Confidence Intervals for the Mean (Large Samples)
Presentation transcript:

Estimating a Population Proportion ADM 2304 – Winter 2012 ©Tony Quon

Objective Based on the Central Limit Theorem, we construct (confidence) interval estimates for the population proportion p (and the population mean  ). Interval estimates are useful for giving a range of possible values (a measure of precision) and not just a point estimate, along with a measure of “confidence” in the estimates.

Standard Normal z-notation z  is a number such that Prob [z z  ] =  For example, z  = if  =.05 since Prob( z ) =.05 z  /2 is a number such that Prob [z z  /2 ] =  / 2 or Prob [-z  /2 < z < z  /2 ] = 1 - . For example, z  /2 = 1.96 if  =.05 since Prob(-1.96 < z < 1.96) =.95

Probability Intervals Based on the Sampling Distribution of p-hat, Prob (p-hat falls inside [ p ± z  /2 *  (pq/n) ] ) = 1 - . (the probability is defined over all possible samples) For example, Prob(p-hat falls inside [ p ± 2 *  (pq/n) ] )= 95%. The interval [ p - z  /2 *  (pq/n), p + z  /2 *  (pq/n)] is a fixed probability interval, since it is based on a fixed value of p; moreover, it can be calculated only if p is known or some assumed value, and is not based on sample data.

Confidence Intervals The interval [p-hat ± z  /2 *  (pq/n)] is a 100(1 -  ) % confidence interval; it is random because the value of p-hat can vary from sample to sample. The confidence interval covers the true value p, if and only if p-hat falls inside the probability interval [p ± z  /2 *  (pq/n)].

Use of Standard Error The confidence interval can be calculated only if we estimate the unknown value of the standard deviation  (pq/n) by the “standard error”  (p-hat * q-hat /n). The standard error is the estimated standard deviation of a sampling distribution. The standard error of the sample proportion is denoted SE(p-hat).

Confidence Interval for a (Population) Proportion Based on the "approximately normal" sampling distribution of p-hat (sample proportion), the 100(1-  )% confidence interval for estimating the population proportion is: p-hat ± z  /2 * SE(p-hat) where SE(p-hat) =  [ p-hat * q-hat / n] and q-hat = 1 - p-hat, and z  /2 is the upper  /2 point of the standard normal distribution; the SDVW text uses the notation z*. (We must have a random sample where n/N < 10% but n must be large enough such that both n * p and n * ( 1 - p) are at least 10. In practice, we check whether n * p-hat and n * q-hat are both at least 10.)

Common CIs Based on the normal probability table, a 95% confidence interval would be p-hat ± 1.96 * SE(p-hat) and a 99% confidence interval would be p-hat ± * SE(p-hat).

Terminology of Confidence Intervals We say that we are “100(1-  )% confident" that an interval calculated using the formula: p-hat ± z  /2 *  (p-hat * q-hat /n), contains or covers the value of the population proportion.

Interpretation of Confidence Intervals Confidence intervals are random intervals, since they are based on the value of a random quantity (p-hat) that varies from sample to sample. Given an actual sample of size n, the calculated interval either contains the true population parameter or it does not. 100(1-  )% of confidence intervals based on all possible samples will contain the parameter value. The confidence level is a collective characteristic of all intervals, not an individual characteristic. It is incorrect to say that a particular interval contains the true value with probability 95%. See conf.xlt We can only hope that we got one of the good samples where the 95% confidence interval does cover the true parameter value.

Example A survey of 1000 potential voters conducted on January 9-10 ( presidential_election/2012_presidential_matchups ) found that 44% supported Obama against Romney (41%). What is a 95% confidence interval for the true level of support? p-hat =.44 and SE(p-hat) =  (.44*.56 / 1000) =.0157 The 95% confidence interval for Obama’s support is: 44.0% ± 1.96 *.0157 or 44.0% ± 3.1% and the 3.1% is known as the “margin of error”. Such results are usually qualified by the statement: “The poll is accurate to within 3 percentage points, 19 times out of 20.”

Sample Size Determination Suppose we want to estimate the level of support for a tunnel-based LRT with a margin of error (ME) =.05 and a 95% confidence level. We need to find n such that: ± z  /2  (pq/n) = ± ME or n = pq (z  /2 / ME) 2 n is maximized when p = q =.5; therefore, a conservative estimate of n is: (.5*.5)(1.96/.05) 2 = 384. Smaller sample sizes n could be justified if we had better information on the value of p.

Other Confidence Intervals Usually confidence intervals are calculated as symmetric intervals, with a margin of error equal on either side of the sample estimate. This reflects an objective point of view, without any bias in favour of one side or the other.

1-sided Probability Intervals Instead of calculating a symmetric probability interval [ p ± z  /2 *  (pq/n)], where the probability of p_hat falling inside the probability interval is 1 - , we could calculate non- symmetric probability intervals: [ 0, p + z  *  (pq/n) ] or [p - z  *  (pq/n), 1], where again the probability is 1 -  that: p-hat < p + z  *  (pq/n) or that p-hat > p - z  *  (pq/n), respectively. The probabilities are defined over all possible samples of size n.

1-sided Confidence Intervals The corresponding 1-sided confidence intervals [p-hat - z  *  (p-hat * q-hat /n ), 1] or [0, p-hat + z  *  (p-hat * q-hat /n) ] could reflect particular points of view.

A Republican might want to say the proportion who support Obama is: “No more than p-hat + z  *  (p-hat * q-hat /n)”, placing an upper bound (but no lower bound) on the level of support. A Democrat might want to say the proportion supporting Obama is: “No less than p-hat - z  *  (p-hat * q-hat /n)”, placing a lower bound (but no upper bound) on the level of support.

Comparisons of CIs Compare the 2-sided CI: 44% ± 1.96*.0157 or 44% ± 3.1% = (40.9%, 47.1%) with the 1-sided CIs: “at most 44% *.0157” = ( 0, 46.6% ), or “at least 44% *.0157” = ( 41.4%, 1 )

Summary The sampling distribution concept allows us to calculate fixed probability intervals for p-hat: [ p ± z  /2 *  (pq/n) ] Keeping in mind the sampling interpretation, we calculate the confidence intervals for p: [ p-hat ± z  /2 *  (p-hat * q-hat /n) ] A sample where p-hat falls inside the probability interval will be one where the confidence interval will cover the true value of p. Such samples occur with probability (1 -  ) but we do not know whether a specific confidence interval does or does not cover p.

Using Minitab to calculate confidence intervals for a population proportion Go first to the Stat Menu, and select Basic Statistics, then “1 Proportion”. The column of individual values must contain two different values (numerical or qualitative) or else we need summarized data in terms of “number of trials” n and the “number of events” x.

Options Specify the confidence level and check “test and interval based on normal distribution” if the sample size condition is satisfied (otherwise, Minitab will use the binomial probability calculation). For a 2-sided confidence interval, select the ‘not equal’ “alternative”; for a 1-sided confidence interval, select the ‘less than’ or the ‘greater than’ “alternative” for a “no greater than” or a “no less than” confidence interval, respectively. (Next week, we will talk more about “alternative” hypotheses in the context of hypothesis testing.)