Objectives (BPS chapter 20): Inference for a population proportion • The sample proportion • The sampling distribution of $\hat{p}$ • Large-sample confidence interval

Presentation transcript:

Objectives (BPS chapter 20)
Inference for a population proportion
• The sample proportion $\hat{p}$
• The sampling distribution of $\hat{p}$
• Large-sample confidence interval for p
• Accurate confidence intervals for p
• Choosing the sample size
• Significance tests for a proportion

The two types of data: reminder
• Quantitative: something that can be counted or measured and then added, subtracted, averaged, etc., across individuals in the population.
  Example: how tall you are, your age, your blood cholesterol level.
• Categorical: something that falls into one of several categories. What can be counted is the proportion of individuals in each category.
  Example: your blood type (A, B, AB, O), your hair color, your family health history for genetic diseases, whether you will develop lung cancer.
How do you figure it out? Ask:
• What are the n individuals/units in the sample (of size n)?
• What is being recorded about those n individuals/units?
• Is that a number (→ quantitative) or a statement (→ categorical)?

The sample proportion
We now study categorical data and draw inference on the proportion, or percentage, of the population with a specific characteristic. If we call a given categorical characteristic in the population "success," then the sample proportion of successes, $\hat{p}$, is:
$\hat{p} = \dfrac{\text{count of successes in the sample}}{\text{count of observations in the sample}}$
• We choose 50 people in an undergrad class and find that 10 of them are Hispanic: $\hat{p} = 10/50 = 0.2$ (proportion of Hispanics in the sample).
• You treat a group of 120 herpes patients with a new drug; 30 get better: $\hat{p} = 30/120 = 0.25$ (proportion of patients improving in the sample).

Sampling distribution of $\hat{p}$
The sampling distribution of $\hat{p}$ is never exactly normal. But as the sample size increases, the sampling distribution of $\hat{p}$ becomes approximately normal, with mean $p$ and standard deviation $\sqrt{p(1-p)/n}$.
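As a rough check on the claim above, the sketch below computes the exact distribution of the sample proportion and compares its mean and standard deviation with p and sqrt(p(1 − p)/n). It assumes the success count in an SRS is binomial, which is reasonable when the population is much larger than the sample. The values n = 50 and p = 0.2 and the function names are illustrative choices, not taken from the slides.

```python
from math import comb, sqrt


def phat_distribution(n: int, p: float) -> dict:
    """Exact distribution of p-hat = X / n, assuming X ~ Binomial(n, p)."""
    return {k / n: comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}


def mean_and_sd(dist: dict) -> tuple:
    """Mean and standard deviation of a discrete distribution {value: prob}."""
    mean = sum(value * prob for value, prob in dist.items())
    variance = sum((value - mean) ** 2 * prob for value, prob in dist.items())
    return mean, sqrt(variance)


# Illustrative (assumed) population proportion and sample size.
n, p = 50, 0.2
mean, sd = mean_and_sd(phat_distribution(n, p))
print(f"mean of p-hat = {mean:.4f} (p = {p})")
print(f"sd of p-hat   = {sd:.4f} (formula: {sqrt(p * (1 - p) / n):.4f})")
```

The distribution is discrete, so it is never exactly normal, but its center and spread match the formulas for any n.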

Implication for estimating proportions
The mean and standard deviation (spread) of the sampling distribution are both completely determined by p and n. Thus, we have only one population parameter to estimate, p.
Therefore, inference for proportions can rely directly on the normal distribution (unlike inference for means, which requires the use of a t distribution with specific degrees of freedom).

Conditions for inference on p
Assumptions:
1. We regard our data as a simple random sample (SRS) from the population. That is, as usual, the most important condition.
2. The sample size n is large enough that the sampling distribution is approximately normal.
How large a sample size is enough? Different inference procedures require different answers (we'll see what to do in practice).

Large-sample confidence interval for p
A level C confidence interval contains the population proportion p in C% of all possible samples. For an SRS of size n drawn from a large population and with sample proportion $\hat{p}$ calculated from the data, an approximate level C confidence interval for p is:
$\hat{p} \pm m$, where $m = z^* \sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$ is the margin of error.
C is the area under the standard normal curve between −z* and z*.
Use this method when the number of successes and the number of failures are both at least 15.
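The large-sample interval can be sketched in a few lines of Python. This is a minimal illustration, not code from the slides: the function name is hypothetical, z* comes from the standard library's `statistics.NormalDist`, and the example reuses the herpes-treatment counts given earlier (30 improving out of 120).

```python
from math import sqrt
from statistics import NormalDist


def large_sample_ci(successes: int, n: int, conf_level: float = 0.90) -> tuple:
    """Approximate level-C confidence interval for a population proportion."""
    failures = n - successes
    # Rule of thumb from the slides: at least 15 successes and 15 failures.
    if successes < 15 or failures < 15:
        raise ValueError("large-sample method needs >= 15 successes and failures")
    p_hat = successes / n
    # z* leaves area C between -z* and z* under the standard normal curve.
    z_star = NormalDist().inv_cdf(0.5 + conf_level / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


low, high = large_sample_ci(successes=30, n=120, conf_level=0.90)
print(f"90% CI for p: ({low:.3f}, {high:.3f})")
```

Raising an error when the count condition fails is a design choice here; a gentler version could issue a warning instead.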

Medication side effects
Arthritis is a painful, chronic inflammation of the joints. An experiment on the side effects of pain relievers examined arthritis patients to find the proportion of patients who suffer side effects.
What are some side effects of ibuprofen?
Serious side effects (seek medical attention immediately):
• Allergic reactions (difficulty breathing, swelling, or hives)
• Muscle cramps, numbness, or tingling
• Ulcers (open sores) in the mouth
• Rapid weight gain (fluid retention)
• Seizures
• Black, bloody, or tarry stools
• Blood in your urine or vomit
• Decreased hearing or ringing in the ears
• Jaundice (yellowing of the skin or eyes)
• Abdominal cramping, indigestion, or heartburn
Less serious side effects (discuss with your doctor):
• Dizziness or headache
• Nausea, gaseousness, diarrhea, or constipation
• Depression
• Fatigue or weakness
• Dry mouth
• Irregular menstrual periods

Let's calculate a 90% confidence interval for the population proportion of arthritis patients who suffer some "adverse symptoms."
What is the sample proportion $\hat{p}$?
What is the sampling distribution for the proportion of arthritis patients with adverse symptoms for samples of 440? It is approximately normal, with mean $p$ and standard deviation $\sqrt{p(1-p)/440}$.
For a 90% confidence level, z* = 1.645.
Using the large-sample method, we calculate a margin of error $m = z^* \sqrt{\hat{p}(1-\hat{p})/n}$.
→ With 90% confidence, between 2.9% and 7.5% of arthritis patients taking this pain medication experience some adverse symptoms.

Choosing the sample size
You may need to choose a sample size large enough to achieve a specified margin of error. However, because the sampling distribution of $\hat{p}$ is a function of the population proportion p, this process requires that you guess a likely value for p, called p*:
$n = \left(\dfrac{z^*}{m}\right)^2 p^*(1-p^*)$
The margin of error will be less than or equal to m if p* is chosen to be 0.5, because $p^*(1-p^*)$ is largest at p* = 0.5.
Remember, though, that sample size is not always stretchable at will. There are typically costs and constraints associated with large samples.

What sample size would we need in order to achieve a margin of error of no more than 0.01 (1%) for a 90% confidence interval for the population proportion of arthritis patients who suffer some "adverse symptoms"?
We could use 0.5 for our guessed p*. However, since the drug has been approved for sale over the counter, we can safely assume that no more than 10% of patients should suffer "adverse symptoms" (a better guess than 50%).
For a 90% confidence level, z* = 1.645, so
$n = \left(\dfrac{z^*}{m}\right)^2 p^*(1-p^*) = \left(\dfrac{1.645}{0.01}\right)^2 (0.1)(0.9) \approx 2435.4$
→ To obtain a margin of error of no more than 1%, we would need a sample of about 2435 arthritis patients (2436 if we round up from the table value z* = 1.645).
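The sample-size calculation is simple to script. The sketch below is an illustration with a hypothetical function name; it takes z* as an explicit argument so the table value 1.645 can be used, and it always rounds up, since rounding down would give a margin slightly larger than requested.

```python
from math import ceil


def required_sample_size(margin: float, z_star: float, p_guess: float = 0.5) -> int:
    """Smallest n with z* * sqrt(p*(1 - p*) / n) <= margin, for a guessed p*."""
    return ceil((z_star / margin) ** 2 * p_guess * (1 - p_guess))


# 90% confidence (table value z* = 1.645), margin of error 0.01.
print(required_sample_size(0.01, 1.645, p_guess=0.1))  # informed guess p* = 0.1
print(required_sample_size(0.01, 1.645))               # conservative guess p* = 0.5
```

The conservative guess p* = 0.5 requires nearly three times as many patients as p* = 0.1, which is why an informed guess is worth making when one is available.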

Significance test for p
The sampling distribution of $\hat{p}$ is approximately normal for large sample sizes, and its shape depends solely on p and n. Thus, we can easily test the null hypothesis H0: p = p0 (a given value we are testing).
If H0 is true, the sampling distribution is known: approximately normal with mean $p_0$ and standard deviation $\sqrt{p_0(1-p_0)/n}$.
The likelihood of our sample proportion given the null hypothesis depends on how far from p0 our $\hat{p}$ is, in units of standard deviation:
$z = \dfrac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1-p_0)}{n}}}$
This is valid when both expected counts, expected successes $np_0$ and expected failures $n(1-p_0)$, are each 10 or larger.

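P-values and one- or two-sided hypotheses: reminder
• Ha: p > p0: the P-value is P(Z ≥ z).
• Ha: p < p0: the P-value is P(Z ≤ z).
• Ha: p ≠ p0: the P-value is 2P(Z ≥ |z|).
And as always, if the P-value is smaller than the chosen significance level α, then the difference is statistically significant and we reject H0.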

A national survey by the National Institute for Occupational Safety and Health on restaurant employees found that 75% said that work stress had a negative impact on their personal lives.
You investigate a restaurant chain to see if the proportion of all their employees negatively affected by work stress differs from the national proportion p0 = 0.75.
H0: p = p0 = 0.75 vs. Ha: p ≠ 0.75 (two-sided alternative)
In your SRS of 100 employees, you find that 68 answered "Yes" when asked, "Does work stress have a negative impact on your personal life?"
The expected counts are 100 × 0.75 = 75 and 100 × 0.25 = 25. Both are greater than 10, so we can use the z-test.
The test statistic is:
$z = \dfrac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1-p_0)}{n}}} = \dfrac{0.68 - 0.75}{\sqrt{\dfrac{(0.75)(0.25)}{100}}} \approx -1.62$

The test statistic is z = −1.62. By symmetry of the normal curve, we can work with |z| = 1.62: from Table A, the area to the left of z = 1.62 is 0.9474. Thus P(Z ≥ 1.62) = 1 − 0.9474, or 0.0526. Since the alternative hypothesis is two-sided, the P-value is the area in both tails, and P = 2 × 0.0526 = 0.1052.
→ The chain restaurant data are compatible with the national survey results ($\hat{p}$ = 0.68, z = −1.62, P = 0.11).
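The restaurant example can be reproduced with a short script. This is a minimal sketch with a hypothetical function name, using the standard library's `statistics.NormalDist` in place of Table A, so the result may differ slightly from a table lookup on the rounded z.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes: int, n: int, p0: float,
                          alternative: str = "two-sided") -> tuple:
    """z statistic and P-value for H0: p = p0."""
    # Condition from the slides: expected counts n*p0 and n*(1 - p0) >= 10.
    if n * p0 < 10 or n * (1 - p0) < 10:
        raise ValueError("expected successes and failures must both be >= 10")
    p_hat = successes / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    cdf = NormalDist().cdf
    if alternative == "greater":      # Ha: p > p0
        p_value = 1 - cdf(z)
    elif alternative == "less":       # Ha: p < p0
        p_value = cdf(z)
    elif alternative == "two-sided":  # Ha: p != p0
        p_value = 2 * (1 - cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p_value


# Restaurant chain: 68 of 100 employees said "Yes"; national proportion 0.75.
z, p_value = one_proportion_z_test(68, 100, 0.75)
print(f"z = {z:.2f}, P = {p_value:.2f}")
```

Because the P-value exceeds common significance levels such as 0.05, the script agrees with the slide's conclusion that the data are compatible with the national proportion.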

Objectives (BPS chapter 21)
Comparing two proportions
• The sampling distribution of a difference between proportions
• Large-sample confidence intervals for comparing two proportions
• Using technology
• Accurate confidence intervals for comparing two proportions
• Significance tests for comparing proportions

Comparing two independent samples
We often need to compare two treatments used on independent samples. We can compute the difference between the two sample proportions and compare it to the corresponding, approximately normal sampling distribution for $(\hat{p}_1 - \hat{p}_2)$, which has mean $p_1 - p_2$ and standard deviation $\sqrt{\dfrac{p_1(1-p_1)}{n_1} + \dfrac{p_2(1-p_2)}{n_2}}$.

Large-sample CI for two proportions
For two independent SRSs of sizes $n_1$ and $n_2$ with sample proportions of successes $\hat{p}_1$ and $\hat{p}_2$ respectively, an approximate level C confidence interval for $p_1 - p_2$ is:
$(\hat{p}_1 - \hat{p}_2) \pm m$, where $m = z^* \, SE$ and $SE = \sqrt{\dfrac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$
C is the area under the standard normal curve between −z* and z*.
Use this method only when the populations are at least 10 times larger than the samples and the number of successes and the number of failures are each at least 10 in each sample.
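A minimal sketch of the two-sample interval follows. The function name is hypothetical, and the example applies the method to the gastric-freezing counts quoted later in this transcript (28 of 82 improved with the treatment, 30 of 78 with the placebo), purely as an illustration of the arithmetic.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ci(x1: int, n1: int, x2: int, n2: int,
                      conf_level: float = 0.90) -> tuple:
    """Approximate level-C confidence interval for p1 - p2."""
    # Condition from the slides: at least 10 successes and 10 failures per sample.
    if min(x1, n1 - x1, x2, n2 - x2) < 10:
        raise ValueError("need >= 10 successes and failures in each sample")
    p1, p2 = x1 / n1, x2 / n2
    # Unpooled standard error of the difference p1-hat - p2-hat.
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z_star = NormalDist().inv_cdf(0.5 + conf_level / 2)
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se


low, high = two_proportion_ci(28, 82, 30, 78, conf_level=0.90)
print(f"90% CI for p1 - p2: ({low:.3f}, {high:.3f})")
```

An interval that contains 0, as this one does, means the data do not rule out equal proportions in the two groups.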

Cholesterol and heart attacks
How much does the cholesterol-lowering drug Gemfibrozil help reduce the risk of heart attack? We compare the incidence of heart attack over a 5-year period for two random samples of middle-aged men taking either the drug or a placebo.
Heart attack rate: Drug 2.73%, Placebo 4.14%
Standard error of the difference $\hat{p}_1 - \hat{p}_2$: $SE = \sqrt{\dfrac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$
So the 90% CI is $(0.0414 - 0.0273) \pm 1.645 \times SE$.
We are 90% confident that the percentage of middle-aged men who suffer a heart attack is 0.16 to 2.7 percentage points lower when taking the cholesterol-lowering drug.

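Test of significance
H0: $p_1 = p_2$ (that is, $p_1 - p_2 = 0$)
If the null hypothesis is true, then we can rely on the properties of the sampling distribution to estimate the probability of drawing two samples with proportions $\hat{p}_1$ and $\hat{p}_2$ at random. Under H0 both samples estimate the same proportion, so we pool them:
$\hat{p} = \dfrac{\text{total successes in both samples}}{\text{total observations in both samples}}$, $\quad z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$
This test is appropriate when all counts are at least 5 (number of successes and number of failures in each sample).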

Gastric freezing
Gastric freezing was once a treatment for ulcers. Patients would swallow a deflated balloon with tubes, and a cold liquid would be pumped for an hour to cool the stomach and reduce acid production, thus relieving ulcer pain. The treatment was shown to be safe and to significantly reduce ulcer pain, and so it was widely used for years.
A randomized comparative experiment later compared the outcome of gastric freezing with that of a placebo: 28 of the 82 patients subjected to gastric freezing improved, while 30 of the 78 in the control group improved.
H0: $p_{gf} = p_{placebo}$ vs. Ha: $p_{gf} > p_{placebo}$
With the pooled proportion $\hat{p} = 58/160 = 0.3625$:
$z = \dfrac{0.341 - 0.385}{\sqrt{(0.3625)(0.6375)\left(\dfrac{1}{82} + \dfrac{1}{78}\right)}} \approx -0.57$, so the one-sided P-value is $P(Z \geq -0.57) \approx 0.71$.
Conclusion: Gastric freezing was no better than a placebo (P-value ≈ 0.71), and this treatment was abandoned. ALWAYS USE A CONTROL!
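The gastric-freezing comparison can be checked with a pooled two-proportion z test. The sketch below uses a hypothetical function name and the standard library's `statistics.NormalDist` rather than Table A, so its P-value may differ somewhat from a rounded hand calculation.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1: int, n1: int, x2: int, n2: int) -> tuple:
    """Pooled z statistic and one-sided P-value for H0: p1 = p2 vs. Ha: p1 > p2."""
    # Condition from the slides: all four counts are at least 5.
    if min(x1, n1 - x1, x2, n2 - x2) < 5:
        raise ValueError("need >= 5 successes and failures in each sample")
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # common proportion estimated under H0
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 1 - NormalDist().cdf(z)  # upper tail, matching Ha: p1 > p2
    return z, p_value


# Gastric freezing: 28 of 82 improved; placebo: 30 of 78 improved.
z, p_value = two_proportion_z_test(28, 82, 30, 78)
print(f"z = {z:.2f}, one-sided P = {p_value:.2f}")
```

The treatment group improved slightly less often than the placebo group, so z is negative and the upper-tail P-value is large: no evidence that gastric freezing beats a placebo.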