Comparing Two Proportions Lesson 1 and Lesson 2: Section 10.1.


1 Comparing Two Proportions Lesson 1 and Lesson 2: Section 10.1

2 objectives
Lesson 1:
• Describe the characteristics of the sampling distribution of $\hat{p}_1 - \hat{p}_2$.
• Calculate probabilities using the sampling distribution of $\hat{p}_1 - \hat{p}_2$.
• Determine whether the conditions for performing inference are met.
• Construct and interpret a confidence interval to compare two proportions.
Lesson 2:
• Perform a significance test to compare two proportions.
• Interpret the results of inference procedures in a randomized experiment.

3 ACTIVITY: Drinking Age and Response Bias
• In Chapter 4, we learned how the wording of a question can create bias. Two AP Statistics students, Amy and Grace, decided to investigate this issue by asking high school students two different versions of a question about underage drinking. Here are the questions they asked:
• 1. Each year, approximately 5000 people under the age of 21 die as a result of underage alcohol consumption. Should minors (under age 21) be allowed to consume alcohol?
• 2. You are legally considered an adult when you turn 18, gaining voting rights and other privileges. Should minors (under age 21) be allowed to consume alcohol?
• Because of the wording, Amy and Grace speculated that a higher proportion of students would answer “Yes” to the second version of the question. Using 44 subjects from their school, Amy and Grace randomly assigned 21 students to get the first question and 23 students to get the second question.

4 ACTIVITY: Drinking Age and Response Bias
• Here are the results:

          Question 1   Question 2   Total
  Yes          6           16         22
  No          15            7         22
  Total       21           23         44

• The difference in the proportions of students who said “Yes” (Question 2 - Question 1) was 16/23 - 6/21 = 0.696 - 0.286 = 0.410. Does this difference provide convincing evidence that the wording of the questions has an effect on the response, or could the difference be due to the chance variation in random assignment?
• Follow the steps on your activity sheet to find out!
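
The activity sheet itself is not part of this transcript, so the following is only a sketch of one way to answer the question by simulation. It assumes the wording has no effect (the 22 "Yes" and 22 "No" answers stay fixed), reshuffles who gets which question, and counts how often chance alone produces a difference of 0.410 or more. The function name, the number of repetitions, and the seed are illustrative choices, not part of the lesson.

```python
import random

def simulate_reassignment(n_yes=22, n_no=22, n_q1=21, reps=10_000, seed=1):
    """Estimate how often random assignment alone gives a difference
    (Question 2 - Question 1) at least as large as the observed one."""
    rng = random.Random(seed)
    n_q2 = n_yes + n_no - n_q1            # 23 students got Question 2
    observed = 16 / n_q2 - 6 / n_q1       # 0.410 from the results table
    responses = [1] * n_yes + [0] * n_no  # 1 = "Yes", 0 = "No"
    extreme = 0
    for _ in range(reps):
        rng.shuffle(responses)            # re-do the random assignment
        yes_q1 = sum(responses[:n_q1])    # first 21 cards go to Question 1
        yes_q2 = n_yes - yes_q1           # the rest go to Question 2
        diff = yes_q2 / n_q2 - yes_q1 / n_q1
        if diff >= observed - 1e-12:
            extreme += 1
    return observed, extreme / reps

observed_diff, approx_p = simulate_reassignment()
print(f"observed difference = {observed_diff:.3f}")
print(f"share of shuffles at least that large = {approx_p:.4f}")
```

A small share here would suggest that a difference this large rarely happens by chance alone.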

5 comparing two proportions
• In a two-sample problem, we want to compare the proportions $p_1$ and $p_2$ of successes in two populations.
• NOTATION:

  Population   Population parameter (proportion)   Sample size   Sample statistic (proportion)
  1            $p_1$                               $n_1$         $\hat{p}_1$
  2            $p_2$                               $n_2$         $\hat{p}_2$

• We compare populations by doing inference about the difference $p_1 - p_2$ between the population proportions. The statistic that estimates this difference is the difference between the two sample proportions, $\hat{p}_1 - \hat{p}_2$.

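6 the sampling distribution of $\hat{p}_1 - \hat{p}_2$
• Shape: If the sampling distributions of $\hat{p}_1$ and $\hat{p}_2$ are approximately Normal, then the sampling distribution of $\hat{p}_1 - \hat{p}_2$ will be approximately Normal. How do we check for this? $n_1 p_1$, $n_1(1 - p_1)$, $n_2 p_2$, $n_2(1 - p_2)$ are all ≥ 10.
• Center: $\mu_{\hat{p}_1 - \hat{p}_2} = p_1 - p_2$ (that is, the difference of sample proportions is an unbiased estimator of the difference of population proportions)
• Spread: $\sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}}$
• This formula holds as long as each sample is ≤ 10% of its population.
• This is on your formula sheet under Two-Sample!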

7 example 1: who does more homework?
• Suppose that there are two large high schools, each with more than 2000 students, in a certain town. At School 1, 70% of students did their homework last night. Only 50% of the students at School 2 did their homework last night. The counselor at School 1 takes an SRS of 100 students and records the proportion that did homework. School 2’s counselor takes an SRS of 200 students and records the proportion that did their homework.
• (a) Describe the shape, center, and spread of the sampling distribution of $\hat{p}_1 - \hat{p}_2$.
• Since $n_1 p_1 = 70$, $n_1(1 - p_1) = 30$, $n_2 p_2 = 100$, $n_2(1 - p_2) = 100$ are all at least 10, the sampling distribution of $\hat{p}_1 - \hat{p}_2$ is approximately Normal. Its mean is $p_1 - p_2 = 0.70 - 0.50 = 0.20$ and its standard deviation is $\sqrt{\dfrac{0.70(0.30)}{100} + \dfrac{0.50(0.50)}{200}} \approx 0.058$.
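
As an optional check on part (a), the sampling distribution can be approximated by simulation. This sketch treats each sampled student as an independent draw, which is an assumption that is reasonable here because each sample is well under 10% of its school. The function name, repetition count, and seed are illustrative.

```python
import random
from math import sqrt
from statistics import mean, stdev

def simulate_differences(p1=0.70, n1=100, p2=0.50, n2=200, reps=5_000, seed=2):
    """Simulate many values of p-hat1 - p-hat2 for the two schools."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(reps):
        # Count simulated students who did homework in each sample.
        phat1 = sum(rng.random() < p1 for _ in range(n1)) / n1
        phat2 = sum(rng.random() < p2 for _ in range(n2)) / n2
        diffs.append(phat1 - phat2)
    return diffs

diffs = simulate_differences()
theory_sd = sqrt(0.70 * 0.30 / 100 + 0.50 * 0.50 / 200)
print(f"simulated mean = {mean(diffs):.3f} (theory: 0.200)")
print(f"simulated SD   = {stdev(diffs):.3f} (theory: {theory_sd:.3f})")
```

The simulated mean and standard deviation should sit close to the values given by the formulas for center and spread.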

8 example 1: who does more homework?
• (b) After the meeting, they both report to their principals that $\hat{p}_1 - \hat{p}_2 = 0.10$. Find the probability of getting a difference in sample proportions of 0.10 or less from the two surveys.
• We want to find $P(\hat{p}_1 - \hat{p}_2 \le 0.10)$, so we standardize: $z = \dfrac{0.10 - 0.20}{0.058} = -1.72$
• Using Table A, we find the area to the left of z = -1.72 under the standard Normal curve is 0.0427.
• (You could also have used normalcdf(-100, -1.72) or normalcdf(-100, 0.1, 0.2, 0.058).)
• The area is about 0.042 or 0.043, so there is about a 4.2% or 4.3% chance of getting a sample difference in proportions of 0.10 or less.
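
The same calculation can be sketched in Python with the standard library's NormalDist, which plays the role of normalcdf. Variable names are illustrative. Because this version does not round the standard deviation or z before looking up the area, its answer may differ slightly from the Table A value.

```python
from math import sqrt
from statistics import NormalDist

p1, n1 = 0.70, 100   # School 1
p2, n2 = 0.50, 200   # School 2

mu = p1 - p2                                          # center of the sampling distribution
sigma = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)  # spread
z = (0.10 - mu) / sigma                               # standardized value of 0.10
prob = NormalDist(mu, sigma).cdf(0.10)                # P(p-hat1 - p-hat2 <= 0.10)

print(f"sigma = {sigma:.3f}, z = {z:.2f}, probability = {prob:.4f}")
```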

9 example 2: who does more homework? part 2
• Suppose that two counselors at School 1, Mitchell and Zach, independently take a random sample of 100 students from their school and record the proportion of students who did their homework last night. When they are finished, they find that the difference in their proportions, $\hat{p}_M - \hat{p}_Z$, is 0.08. They are surprised to get a difference this big, considering that they were sampling from the same population.
• (a) Find the probability of getting two proportions that are at least 0.08 apart.
• Both samples come from School 1, where p = 0.70, so the sampling distribution of $\hat{p}_M - \hat{p}_Z$ has mean 0 and standard deviation $\sqrt{\dfrac{0.70(0.30)}{100} + \dfrac{0.70(0.30)}{100}} \approx 0.065$.
• We want to calculate $P(|\hat{p}_M - \hat{p}_Z| \ge 0.08)$, that is, $P(\hat{p}_M - \hat{p}_Z \le -0.08 \text{ or } \hat{p}_M - \hat{p}_Z \ge 0.08)$.
• After using Table A or the normalcdf function, you should get an area of 0.2184. There is a 21.84% chance of getting two sample proportions that are at least 0.08 apart.

10 example 2: who does more homework?
• (b) Should the counselors have been surprised to get a difference this big?
• Since the probability we calculated in part (a) isn’t very small, we shouldn’t be surprised to get a difference of sample proportions of 0.08 or larger just by chance, even when sampling from the same population.
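
Part (a) of Example 2 can also be sketched in Python. The sketch assumes both counselors sample from School 1 with p = 0.70, so the difference in sample proportions is centered at 0, and it adds the areas in both tails. Names are illustrative. The result should land near the slide's 0.2184, with small differences coming from rounding z before using Table A.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.70, 100                      # same school, two samples of 100
sigma = sqrt(2 * p * (1 - p) / n)     # SD of the difference of two sample proportions
dist = NormalDist(0, sigma)           # centered at 0: same population

# Two-tailed area: difference <= -0.08 or difference >= 0.08
prob_apart = dist.cdf(-0.08) + (1 - dist.cdf(0.08))
print(f"sigma = {sigma:.4f}, P(at least 0.08 apart) = {prob_apart:.4f}")
```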

11 conditions for z intervals and tests
• Confidence intervals and significance tests for two sample proportions have the same conditions but different standard errors (we will talk about this in a few slides).
• Conditions:
• Random: both samples are collected by random sampling or come from a randomized experiment.
• Normal: $n_1\hat{p}_1$, $n_1(1 - \hat{p}_1)$, $n_2\hat{p}_2$, $n_2(1 - \hat{p}_2)$ are all at least 10.
• Independent: Both the samples or groups themselves and the individual observations in each sample or group are independent. When sampling without replacement, check the 10% rule on both populations.

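12 Two Sample z Interval for a Difference Between Two Proportions
• When the conditions are met, an approximate level C confidence interval for $p_1 - p_2$ is
  $(\hat{p}_1 - \hat{p}_2) \pm z^* \sqrt{\dfrac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$
• where z* is the critical value for the standard Normal curve with area C between -z* and z*.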

13 example 3: Presidential approval
• Many news organizations conduct polls asking adults in the United States if they approve of the job the president is doing. How did President Obama’s approval rating change from August 2009 to September 2010? According to a CNN poll of 1024 randomly selected U.S. adults on September 1-2, 2010, 50% approved of Obama’s job performance. A CNN poll of 1010 randomly selected U.S. adults on August 28-30, 2009, showed that 53% approved of Obama’s job performance.
• Problem: (a) Use the results of these polls to construct and interpret a 90% confidence interval for the change in Obama’s approval rating among all U.S. adults.
• (b) Based on your interval, is there convincing evidence that Obama’s job approval rating changed between August 2009 and September 2010?

14 example 3: Presidential approval
• (a) STATE: We want to estimate $p_{2010} - p_{2009}$ at the 90% confidence level, where $p_{2010}$ = the true proportion of all U.S. adults who approved of President Obama’s job performance in September 2010 and $p_{2009}$ = the true proportion of all U.S. adults who approved of President Obama’s job performance in August 2009.
• PLAN: We should use a two-sample z interval for $p_{2010} - p_{2009}$ if the conditions are satisfied.
• Random: The data came from separate random samples.
• Normal: $n_{2010}\hat{p}_{2010} = 512$, $n_{2010}(1 - \hat{p}_{2010}) = 512$, $n_{2009}\hat{p}_{2009} \approx 535$, $n_{2009}(1 - \hat{p}_{2009}) \approx 475$ are all greater than or equal to 10.
• Independent: The samples were taken independently, and there were at least 10(1024) = 10,240 U.S. adults in 2010 and 10(1010) = 10,100 U.S. adults in 2009.

15 example 3: Presidential approval
• (a) DO: For a 90% confidence interval, using InvNorm(0.05) to find the critical value z* = 1.645, we have the interval:
  $(0.50 - 0.53) \pm 1.645 \sqrt{\dfrac{0.50(0.50)}{1024} + \dfrac{0.53(0.47)}{1010}}$ = -0.03 ± 0.036 = (-0.066, 0.006)
• CONCLUDE: We are 90% confident that the interval from -0.066 to 0.006 captures the true change in the proportion of U.S. adults who approve of President Obama’s job performance from August 2009 to September 2010. That is, it is plausible that his job approval has fallen by up to 6.6 percentage points or increased by up to 0.6 percentage points.
• (b) Since 0 is included in the interval, it is plausible that there has been no change in President Obama’s approval rating. Thus, we do not have convincing evidence that his approval rating changed between August 2009 and September 2010.

16 in your calculator: two sample z interval for difference between two proportions
• (For another example of a STATE, PLAN, DO, CONCLUDE with a two sample z interval for a difference between two proportions, see pg. 609/610 in text.)
• To check your answer in your calculator, press STAT, scroll to TESTS, and choose option B: 2-PropZInt
• In x1: type in successes from 1st population (.50*1024)
• In n1: type in sample size from 1st population (1024)
• In x2: type in successes from 2nd population (.53*1010; you might have to round because the calculator will not let you put in a decimal)
• In n2: type in sample size from 2nd population (1010)
• In C-Level: type in the confidence level (.90)
• Highlight Calculate and press Enter
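
For a check outside the calculator, here is a minimal Python sketch of the two-sample z interval applied to the approval-rating polls. The function name and its parameters are illustrative, not from the text. It works from the sample proportions directly rather than rounded counts, so its endpoints can differ slightly from 2-PropZInt in the later decimal places.

```python
from math import sqrt
from statistics import NormalDist

def two_prop_z_interval(phat1, n1, phat2, n2, conf_level=0.90):
    """Approximate level-C confidence interval for p1 - p2."""
    # Critical value z*: area conf_level between -z* and z*.
    z_star = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    # Unpooled standard error, as used for confidence intervals.
    se = sqrt(phat1 * (1 - phat1) / n1 + phat2 * (1 - phat2) / n2)
    diff = phat1 - phat2
    return diff - z_star * se, diff + z_star * se

# 2010 poll: 50% of 1024; 2009 poll: 53% of 1010
lower, upper = two_prop_z_interval(0.50, 1024, 0.53, 1010)
print(f"90% CI for p_2010 - p_2009: ({lower:.3f}, {upper:.3f})")
```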

17 two sample tests: setting up hypotheses
• Before we jump into a full-fledged significance test, let’s make sure we know how to set up our hypotheses and the standard error.
• Are teenagers going deaf? In a study of 3000 randomly selected teenagers in 1988-1994, 15% showed some hearing loss. In a similar study of 1800 teenagers in 2005-2006, 19.5% showed some hearing loss. Do these data give convincing evidence that the proportion of all teens with hearing loss has increased? (These data were reported in Arizona Daily Star, August 18, 2010.)
• State the hypotheses we are interested in testing. Define any parameters you use.
• $H_0: p_1 - p_2 = 0$ and $H_a: p_1 - p_2 > 0$, where $p_1$ = the proportion of all teenagers with hearing loss in 2005-2006 and $p_2$ = the proportion of all teenagers with hearing loss in 1988-1994.

18 will the null always be = 0?
• $H_0: p_1 - p_2 = 0$ will be the most common (also written as $H_0: p_1 = p_2$).
• However, there are occasions when the hypothesized difference is not zero. For example, suppose that a pharmaceutical company will decide to market a new drug if its success rate is at least 0.05 higher than the currently used drug. In this case, the hypotheses would be
• $H_0: p_{new} - p_{current} = 0.05$ and
• $H_a: p_{new} - p_{current} > 0.05$

19 two sample tests: standard error
• Remember, standard error is the estimated standard deviation of the statistic. I mentioned before that it is different in a two sample test (versus how we calculated it in a confidence interval).
• The formula is the same, but we do not use $\hat{p}_1$ and $\hat{p}_2$; we use $\hat{p}_C$.
• $\hat{p}_C = \dfrac{\text{count of successes in both samples combined}}{\text{count of individuals in both samples combined}} = \dfrac{X_1 + X_2}{n_1 + n_2}$
• This is called a pooled (or combined) sample proportion. Basically, we’re putting together our samples!
• We will use $\hat{p}_C$ in our standard error formula: $\sqrt{\dfrac{\hat{p}_C(1 - \hat{p}_C)}{n_1} + \dfrac{\hat{p}_C(1 - \hat{p}_C)}{n_2}}$
• Note that on the formula sheet, they give you the formula using the pooled proportion as $\sqrt{\hat{p}_C(1 - \hat{p}_C)}\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}$. It is the same, just written differently!
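
To see that the two ways of writing the pooled standard error agree, here is a short Python sketch using the hearing-loss data from slide 17. The counts 351 = 0.195 × 1800 and 450 = 0.15 × 3000 are derived from the reported percentages, and the variable names are illustrative.

```python
from math import sqrt, isclose

x1, n1 = 351, 1800   # 2005-2006: 19.5% of 1800 teenagers
x2, n2 = 450, 3000   # 1988-1994: 15% of 3000 teenagers

# Pooled proportion: all successes over all individuals.
p_c = (x1 + x2) / (n1 + n2)

# Form used on this slide: the usual formula with p_c in both terms.
se_slide = sqrt(p_c * (1 - p_c) / n1 + p_c * (1 - p_c) / n2)

# Form printed on the formula sheet: common factor pulled out.
se_sheet = sqrt(p_c * (1 - p_c)) * sqrt(1 / n1 + 1 / n2)

print(f"pooled proportion = {p_c:.4f}")
print(f"SE (slide form) = {se_slide:.5f}, SE (sheet form) = {se_sheet:.5f}")
print("same value:", isclose(se_slide, se_sheet))
```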

20 Two Sample z test for a Difference Between Two Proportions
• When the conditions are met, to test the hypothesis
• $H_0: p_1 - p_2 = 0$,
• First, find the pooled proportion of successes: $\hat{p}_C = \dfrac{X_1 + X_2}{n_1 + n_2}$
• Then calculate the z statistic: $z = \dfrac{(\hat{p}_1 - \hat{p}_2) - 0}{\sqrt{\dfrac{\hat{p}_C(1 - \hat{p}_C)}{n_1} + \dfrac{\hat{p}_C(1 - \hat{p}_C)}{n_2}}}$
• Then find the P-value by calculating the probability of getting a z statistic this large or larger in the direction specified by the alternative hypothesis $H_a$, using Table A or your calculator.

21 example 4: hearing loss
• Are teenagers going deaf? In a study of 3000 randomly selected teenagers in 1988-1994, 15% showed some hearing loss. In a similar study of 1800 teenagers in 2005-2006, 19.5% showed some hearing loss. (These data are reported in Arizona Daily Star, August 18, 2010.)
• Problem: (a) Do these data give convincing evidence that the proportion of all teens with hearing loss has increased?
• (b) Between the two studies, Apple introduced the iPod. If the results of the test are statistically significant, can we blame iPods for the increased hearing loss in teenagers?

22 example 4: hearing loss
• (a) STATE: We will test $H_0: p_1 - p_2 = 0$ versus $H_a: p_1 - p_2 > 0$ at the 0.05 significance level, where $p_1$ = the proportion of all teenagers with hearing loss in 2005-2006 and $p_2$ = the proportion of all teenagers with hearing loss in 1988-1994.
• PLAN: We should use a two-sample z test for $p_1 - p_2$ if the conditions are satisfied.
• Random: The data came from separate random samples.
• Normal: $n_1\hat{p}_C$, $n_1(1 - \hat{p}_C)$, $n_2\hat{p}_C$, $n_2(1 - \hat{p}_C)$ are all at least 10. (Note: it is possible to use $\hat{p}_1$ and $\hat{p}_2$ in this condition, but it is more accurate to use $\hat{p}_C$.)
• Independent: The samples were taken independently, and there were at least 10(1800) = 18,000 teenagers in 2005-2006 and 10(3000) = 30,000 teenagers in 1988-1994.

23 example 4: hearing loss
• (a) DO: $\hat{p}_C = \dfrac{351 + 450}{1800 + 3000} = 0.1669$, so
  $z = \dfrac{(0.195 - 0.15) - 0}{\sqrt{\dfrac{0.1669(0.8331)}{1800} + \dfrac{0.1669(0.8331)}{3000}}} = 4.05$
• Using Table A to find the P-value, $P(z > 4.05) = 1 - P(z \le 4.05) \approx 1 - 1 = 0$.
• CONCLUDE: Since the P-value is less than 0.05, we reject $H_0$. We have convincing evidence that the proportion of all teens with hearing loss has increased from 1988-1994 to 2005-2006.
• (b) No. Since we didn’t do an experiment where we randomly assigned some teens to listen to iPods and other teens to avoid listening to iPods, we cannot conclude iPods are the cause. What would a lurking variable be?

24 in your calculator: two sample z test for difference between two proportions
• (For another example, this one with a two-tailed test, see pg. 614 in text.)
• To check your answer in your calculator, press STAT, scroll to TESTS, and choose option 6: 2-PropZTest
• In x1: type in successes from 1st population (.195*1800)
• In n1: type in sample size from 1st population (1800)
• In x2: type in successes from 2nd population (.15*3000)
• In n2: type in sample size from 2nd population (3000)
• Choose the symbol that matches the alternative hypothesis
• Highlight Calculate and press Enter (or you can choose Draw to see the Normal curve)
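
The same check can be sketched in Python. The function below mirrors the inputs of 2-PropZTest (x1, n1, x2, n2, and a direction for the alternative), but its name and the string labels for the alternative are illustrative choices, not calculator or textbook notation. It is applied to the hearing-loss data from Example 4.

```python
from math import sqrt
from statistics import NormalDist

def two_prop_z_test(x1, n1, x2, n2, alternative="greater"):
    """Two-sample z test of H0: p1 - p2 = 0 using the pooled proportion."""
    p_c = (x1 + x2) / (n1 + n2)                    # pooled proportion
    se = sqrt(p_c * (1 - p_c) * (1 / n1 + 1 / n2))  # pooled standard error
    z = (x1 / n1 - x2 / n2) / se
    std_normal = NormalDist()
    if alternative == "greater":                   # Ha: p1 - p2 > 0
        p_value = 1 - std_normal.cdf(z)
    elif alternative == "less":                    # Ha: p1 - p2 < 0
        p_value = std_normal.cdf(z)
    elif alternative == "two-sided":               # Ha: p1 - p2 != 0
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p_value

# 2005-2006: 351 of 1800 with hearing loss; 1988-1994: 450 of 3000
z, p_value = two_prop_z_test(351, 1800, 450, 3000, alternative="greater")
print(f"z = {z:.2f}, P-value = {p_value:.6f}")
```

The P-value is tiny rather than exactly 0, which is what Table A's "1 - 1 = 0" is rounding.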

25 inference for experiments
• In the last example, although our results were significant, we could not draw a conclusion about what caused the hearing loss. Let’s look at an experiment and use inference.

26 example 5: cash for quitters
• In an effort to reduce health care costs, General Motors sponsored a study to help employees stop smoking. In the study, half of the subjects were randomly assigned to receive up to $750 for quitting smoking for a year while the other half were simply encouraged to use traditional methods to stop smoking. None of the 878 volunteers knew that there was a financial incentive when they signed up. At the end of one year, 15% of those in the financial rewards group had quit smoking while only 5% in the traditional group had quit smoking. Do the results of this study give convincing evidence that a financial incentive helps people quit smoking? (These data are reported in Arizona Daily Star, February 11, 2009.)

27 example 5: cash for quitters
• (a) STATE: We will test $H_0: p_1 - p_2 = 0$ versus $H_a: p_1 - p_2 > 0$ at the 0.05 significance level, where $p_1$ = the true quitting rate for employees like these who get a financial incentive to quit smoking and $p_2$ = the true quitting rate for employees like these who don’t get a financial incentive to quit smoking.
• PLAN: We should use a two-sample z test for $p_1 - p_2$ if the conditions are satisfied.
• Random: The treatments were randomly assigned.
• Normal: $n_1\hat{p}_C$, $n_1(1 - \hat{p}_C)$, $n_2\hat{p}_C$, $n_2(1 - \hat{p}_C)$ are all at least 10.
• Independent: The random assignment allows us to view these two groups as independent. We must assume that each employee’s decision to quit is independent of other employees’ decisions.

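28 example 5: cash for quitters
• (a) DO: With 439 subjects in each group, $\hat{p}_C = \dfrac{0.15(439) + 0.05(439)}{878} = 0.10$, so
  $z = \dfrac{(0.15 - 0.05) - 0}{\sqrt{\dfrac{0.10(0.90)}{439} + \dfrac{0.10(0.90)}{439}}} = 4.94$
• Using Table A to find the P-value, $P(z > 4.94) = 1 - P(z \le 4.94) \approx 1 - 1 = 0$.
• CONCLUDE: Since the P-value is less than 0.05, we reject $H_0$. We have convincing evidence that financial incentives help employees like these quit smoking.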

29 homework
• Assigned reading: Lesson 1 = p. 604-611
• Lesson 2 = p. 611-621
• HW problems: Lesson 1: p. 621/#1, 2, 5, 7-10, 11, 13
• Lesson 2: p. 623/#15, 17, 20, 21, 23, 25, 28, 29-32
• Check answers to odd problems.

