Hypothesis tests for the difference between two proportions


Hypothesis tests for the difference between two proportions Section 11.2

Objectives
1. Perform a hypothesis test for the difference between two proportions using the P-value method
2. Perform a hypothesis test for the difference between two proportions using the critical value method

Objective 1 Perform a hypothesis test for the difference between two proportions using the P-value method

Difference Between Two Population Proportions
The General Social Survey took a poll that asked 350 employed people aged 25–40 whether they used a computer at work, and 259 said they did. They also asked the same question of 500 employed people aged 41–65, and 384 of them said that they used a computer at work. We can compute the sample proportions of people who used a computer at work in each of these age groups. Among those aged 25–40, the sample proportion was 259/350 = 0.740, and among those aged 41–65 the sample proportion was 384/500 = 0.768. So the sample proportion is larger among older workers. The question of interest, however, involves the population proportions. There are two populations involved: the population of all employed people aged 25–40, and the population of all employed people aged 41–65. The question is whether the population proportion of people aged 41–65 who use a computer at work is greater than the population proportion among those aged 25–40. This is an example of a situation in which we have two independent samples involving sample proportions.

Notation
We begin by introducing notation for the population proportions, the sample proportions, the numbers of individuals in each category, and the sample sizes.
$p_1$ and $p_2$ are the population proportions of the category of interest in the two populations.
$\hat{p}_1$ and $\hat{p}_2$ are the proportions of the category of interest in the two samples.
$x_1$ and $x_2$ are the numbers of individuals in the category of interest in the two samples.
$n_1$ and $n_2$ are the two sample sizes.

Null and Alternate Hypothesis
In order to perform a hypothesis test in the previous situation, we need to examine the issue at hand, which is whether the population proportions $p_1$ and $p_2$ are equal. The null hypothesis says that they are equal:
$H_0: p_1 = p_2$
There are three possibilities for the alternate hypothesis:
$H_1: p_1 < p_2$
$H_1: p_1 > p_2$
$H_1: p_1 \neq p_2$

Mean and Standard Deviation
The test statistic is based on the difference between the sample proportions, $\hat{p}_1 - \hat{p}_2$. When the sample sizes are large, this difference is approximately normally distributed. The mean and standard deviation of this distribution are
Mean = $p_1 - p_2$
Standard Deviation = $\sqrt{\dfrac{p_1(1-p_1)}{n_1} + \dfrac{p_2(1-p_2)}{n_2}}$
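The standard deviation formula can be traced to the independence of the two samples. The short derivation below is a sketch that assumes the standard result that a sample proportion $\hat{p}_i$ based on $n_i$ observations has variance $p_i(1-p_i)/n_i$; the operator names $\operatorname{Var}$ and $\operatorname{SD}$ are notation introduced here, not taken from the slides.

```latex
\begin{aligned}
\operatorname{Var}(\hat{p}_1 - \hat{p}_2)
  &= \operatorname{Var}(\hat{p}_1) + \operatorname{Var}(\hat{p}_2)
     && \text{(independent samples)} \\
  &= \frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}, \\
\operatorname{SD}(\hat{p}_1 - \hat{p}_2)
  &= \sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}}.
\end{aligned}
```

The variances add even though the proportions are subtracted, because subtracting an independent quantity contributes its variance with coefficient $(-1)^2 = 1$.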

Pooled Proportion
To compute the test statistic, we must find values for the mean and standard deviation. The mean is straightforward: under the assumption that $H_0$ is true, $p_1 - p_2 = 0$. The standard deviation is a bit more involved. The standard deviation depends on the population proportions $p_1$ and $p_2$, which are unknown. We need to estimate $p_1$ and $p_2$. Under $H_0$, we assume that $p_1 = p_2$. Therefore, we need to estimate $p_1$ and $p_2$ with the same value. The value to use is the pooled proportion, which we will denote by $\hat{p}$. The pooled proportion is found by treating the two samples as though they were one big sample. We divide the total number of individuals in the category of interest in the two samples by the sum of the two sample sizes.
$\hat{p} = \dfrac{x_1 + x_2}{n_1 + n_2}$

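Standard Error/Test Statistic
The standard deviation is estimated with the standard error:
Standard Error = $\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n_1} + \dfrac{\hat{p}(1-\hat{p})}{n_2}} = \sqrt{\hat{p}(1-\hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$
The test statistic is the $z$-score of $\hat{p}_1 - \hat{p}_2$:
$z = \dfrac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\hat{p}(1-\hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \dfrac{(\hat{p}_1 - \hat{p}_2) - 0}{\sqrt{\hat{p}(1-\hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$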
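The pooled proportion, standard error, and test statistic can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides; the function name `two_proportion_z` is hypothetical, and the counts in the usage line are the General Social Survey figures quoted earlier in the presentation.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Return (pooled proportion, standard error, z) under H0: p1 = p2."""
    p_hat1 = x1 / n1                      # sample proportion, sample 1
    p_hat2 = x2 / n2                      # sample proportion, sample 2
    pooled = (x1 + x2) / (n1 + n2)        # treat both samples as one big sample
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p_hat1 - p_hat2) / se            # z-score of the observed difference
    return pooled, se, z

# Computer-use poll: 259 of 350 (ages 25-40) and 384 of 500 (ages 41-65)
pooled, se, z = two_proportion_z(259, 350, 384, 500)
print(f"pooled = {pooled:.6f}, z = {z:.2f}")
```

Working from the counts rather than from rounded proportions avoids carrying rounding error into the test statistic.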

Assumptions
The method just described for performing a hypothesis test for the difference of population proportions requires the following assumptions:
1. We have two independent simple random samples.
2. Each population is at least 20 times as large as the sample drawn from it.
3. The individuals in each sample are divided into two categories.
4. Both samples contain at least 10 individuals in each category.
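The two numerical conditions in this list (the 20-times rule and the 10-per-category rule) can be checked mechanically. The sketch below is an illustration only: the function name `check_assumptions` and the optional population-size arguments `pop1` and `pop2` are hypothetical, and independence and random sampling cannot be verified from counts alone.

```python
def check_assumptions(x1, n1, x2, n2, pop1=None, pop2=None):
    """Return a list of violated numerical conditions (empty if none found)."""
    problems = []
    samples = (("sample 1", x1, n1, pop1), ("sample 2", x2, n2, pop2))
    for label, x, n, pop in samples:
        # At least 10 individuals in each of the two categories
        if x < 10 or n - x < 10:
            problems.append(f"{label}: fewer than 10 individuals in a category")
        # Population at least 20 times the sample size (only if size is supplied)
        if pop is not None and pop < 20 * n:
            problems.append(f"{label}: population is less than 20 times the sample")
    return problems

# Computer-use poll counts; population sizes are not supplied here
print(check_assumptions(259, 350, 384, 500))
```

An empty list means only that the count-based conditions hold; the sampling design still has to be judged from how the data were collected.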

Hypothesis Test for $p_1 - p_2$
Step 1: State the null and alternate hypotheses.
Step 2: If making a decision, choose a significance level $\alpha$.
Step 3: Compute the test statistic $z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$ where $\hat{p} = \dfrac{x_1 + x_2}{n_1 + n_2}$.
Step 4: Compute the P-value.
Step 5: Interpret the P-value. If making a decision, reject $H_0$ if the P-value is less than or equal to the significance level $\alpha$.
Step 6: State a conclusion.
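Steps 3 through 5 can be sketched as one Python function. This is an illustrative sketch using only the standard library; the name `two_proportion_test` and the labels `"less"`, `"greater"`, and `"two-sided"` for the three alternate hypotheses are assumptions made here, not notation from the slides. The standard normal CDF is computed from `math.erfc`.

```python
import math

def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

def two_proportion_test(x1, n1, x2, n2, alternative, alpha=0.05):
    """Return (z, P-value, reject H0?) for H0: p1 = p2."""
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (x1 / n1 - x2 / n2) / se                 # Step 3
    if alternative == "less":                    # H1: p1 < p2, left tail
        p_value = normal_cdf(z)
    elif alternative == "greater":               # H1: p1 > p2, right tail
        p_value = normal_cdf(-z)
    elif alternative == "two-sided":             # H1: p1 != p2, both tails
        p_value = 2 * normal_cdf(-abs(z))
    else:
        raise ValueError("alternative must be 'less', 'greater' or 'two-sided'")
    return z, p_value, p_value <= alpha          # Steps 4 and 5

# Computer-use poll, testing H1: p1 < p2 at alpha = 0.05
z, p_value, reject = two_proportion_test(259, 350, 384, 500, "less")
print(f"z = {z:.2f}, P = {p_value:.4f}, reject H0: {reject}")
```

The right-tail area is computed as `normal_cdf(-z)` rather than `1 - normal_cdf(z)` to avoid loss of precision when the P-value is very small.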

Example
The General Social Survey took a poll that asked 350 employed people aged 25–40 whether they used a computer at work, and 259 said they did. They also asked the same question of 500 employed people aged 41–65, and 384 of them said that they used a computer at work. Can you conclude that the proportion of people who use a computer at work is greater among those aged 41–65 than among those aged 25–40? Use the $\alpha = 0.05$ level.
Solution: We first check the assumptions. We have two independent random samples, and the populations are more than 20 times as large as the samples. The individuals in each sample are divided into two categories with more than 10 individuals in each category. The assumptions are satisfied. We summarize the relevant information:
                        Ages 25–40                       Ages 41–65
Sample size             $n_1 = 350$                      $n_2 = 500$
Number of individuals   $x_1 = 259$                      $x_2 = 384$
Sample proportion       $\hat{p}_1 = 259/350 = 0.740$    $\hat{p}_2 = 384/500 = 0.768$
Population proportion   $p_1$ (unknown)                  $p_2$ (unknown)
The null and alternate hypotheses are:
$H_0: p_1 = p_2$
$H_1: p_1 < p_2$

Example – Perform a Hypothesis Test
Solution (continued): The pooled proportion is
$\hat{p} = \dfrac{x_1 + x_2}{n_1 + n_2} = \dfrac{259 + 384}{350 + 500} = 0.756471$.
The test statistic is
$z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \dfrac{0.740 - 0.768}{\sqrt{0.756471(1 - 0.756471)\left(\dfrac{1}{350} + \dfrac{1}{500}\right)}} = -0.94$.
The alternate hypothesis, $p_1 < p_2$, is left-tailed. Therefore, the P-value is the area to the left of $z = -0.94$. We find this area to be 0.1736. Since P > 0.05, we do not reject $H_0$ at the $\alpha = 0.05$ level. We cannot conclude that the proportion of workers aged 41–65 who use a computer at work is greater than the proportion among those aged 25–40.
Remember: $H_0: p_1 = p_2$, $H_1: p_1 < p_2$, $n_1 = 350$, $n_2 = 500$, $x_1 = 259$, $x_2 = 384$, $\hat{p}_1 = 0.740$, $\hat{p}_2 = 0.768$.

Hypothesis Testing on the TI-84 PLUS
The 2-PropZTest command will perform a hypothesis test for the difference between proportions. This command is accessed by pressing STAT and highlighting the TESTS menu. We enter the values of $x_1$, $n_1$, $x_2$, and $n_2$.

Example (TI-84 PLUS)
The General Social Survey took a poll that asked 350 employed people aged 25–40 whether they used a computer at work, and 259 said they did. They also asked the same question of 500 employed people aged 41–65, and 384 of them said that they used a computer at work. Can you conclude that the proportion of people who use a computer at work is greater among those aged 41–65 than among those aged 25–40? Use the $\alpha = 0.05$ level.
Solution: We first check the assumptions. We have two independent random samples, and the populations are more than 20 times as large as the samples. The individuals in each sample are divided into two categories with more than 10 individuals in each category. The assumptions are satisfied. We summarize the relevant information:
                        Ages 25–40         Ages 41–65
Sample size             $n_1 = 350$        $n_2 = 500$
Number of individuals   $x_1 = 259$        $x_2 = 384$
Population proportion   $p_1$ (unknown)    $p_2$ (unknown)
The null and alternate hypotheses are:
$H_0: p_1 = p_2$
$H_1: p_1 < p_2$

Example (TI-84 PLUS)
Press STAT and highlight the TESTS menu and select 2-PropZTest. We enter the following information:
                        Ages 25–40      Ages 41–65
Sample size             $n_1 = 350$     $n_2 = 500$
Number of individuals   $x_1 = 259$     $x_2 = 384$
Since we have a left-tailed test, select the <p2 option. Select Calculate. The P-value is 0.1746. Since P > 0.05, we do not reject $H_0$ at the $\alpha = 0.05$ level. We cannot conclude that the proportion of workers aged 41–65 who use a computer at work is greater than the proportion among those aged 25–40.
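The calculator's P-value of 0.1746 differs slightly from the 0.1736 obtained by hand, because the hand calculation rounds $z$ to two decimal places before looking up the area. The Python sketch below illustrates this using `statistics.NormalDist` from the standard library; it is an independent check written for this purpose, not a description of how the TI-84 computes its result.

```python
import math
from statistics import NormalDist

x1, n1 = 259, 350   # ages 25-40
x2, n2 = 384, 500   # ages 41-65

pooled = (x1 + x2) / (n1 + n2)
se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
z = (x1 / n1 - x2 / n2) / se

std_normal = NormalDist()                       # mean 0, standard deviation 1
p_unrounded = std_normal.cdf(z)                 # left-tail area at full precision
p_table = std_normal.cdf(round(z, 2))           # left-tail area at z rounded to -0.94

print(f"unrounded z: P = {p_unrounded:.4f}")
print(f"z rounded to two decimals: P = {p_table:.4f}")
```

Either way the P-value is well above 0.05, so the conclusion is the same.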

Objective 2 Perform a hypothesis test for the difference between two proportions using the critical value method

Using the Critical Value Method
The critical value method can be used to perform a hypothesis test for the difference between two proportions. To use the critical value method, compute the test statistic as in the P-value method. Since the critical value is a $z$-score, critical values can be found in Table A.2 or with technology. The assumptions for the critical value method are the same as for the P-value method.
Step 1: State the null and alternate hypotheses.
Step 2: If making a decision, choose a significance level $\alpha$ and find the critical value(s).
Step 3: Compute the test statistic $z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$ where $\hat{p} = \dfrac{x_1 + x_2}{n_1 + n_2}$.

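Using the Critical Value Method
Step 4: Determine whether to reject $H_0$, as follows:
Left-tailed ($H_1: p_1 < p_2$): reject if $z \le -z_\alpha$.
Right-tailed ($H_1: p_1 > p_2$): reject if $z \ge z_\alpha$.
Two-tailed ($H_1: p_1 \neq p_2$): reject if $z \le -z_{\alpha/2}$ or $z \ge z_{\alpha/2}$.
Step 5: State a conclusion.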
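The decision in Step 4 can be sketched in Python, assuming the usual rejection regions for a $z$ test: the left tail for $p_1 < p_2$, the right tail for $p_1 > p_2$, and both tails for $p_1 \neq p_2$. The function name `reject_h0` and the labels for the alternate hypothesis are hypothetical; critical values come from `statistics.NormalDist.inv_cdf` in place of a printed table.

```python
from statistics import NormalDist

def reject_h0(z, alpha, alternative):
    """Apply the critical value rule to a computed test statistic z."""
    std_normal = NormalDist()
    if alternative == "less":
        # Left-tailed: critical value -z_alpha has area alpha to its left
        return z <= std_normal.inv_cdf(alpha)
    if alternative == "greater":
        # Right-tailed: critical value z_alpha has area alpha to its right
        return z >= std_normal.inv_cdf(1 - alpha)
    if alternative == "two-sided":
        # Two-tailed: area alpha/2 in each tail
        return abs(z) >= std_normal.inv_cdf(1 - alpha / 2)
    raise ValueError("alternative must be 'less', 'greater' or 'two-sided'")

# Right-tailed critical value at alpha = 0.05
print(round(NormalDist().inv_cdf(0.95), 3))
```

For example, `reject_h0(-0.94, 0.05, "less")` evaluates to `False`, which matches the decision reached with the P-value method in the computer-use example.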

Example
Traffic engineers tabulated types of car accidents by drivers of various ages. Out of a total of 82,486 accidents involving drivers aged 15–24 years, 4243 of them, or 5.1%, occurred in a driveway. Out of a total of 219,170 accidents involving drivers aged 25–64 years, 10,701 of them, or 4.9%, occurred in a driveway. Can you conclude that accidents involving drivers aged 15–24 are more likely to occur in driveways than accidents involving drivers aged 25–64? Use $\alpha = 0.05$.
Solution: We have two independent samples, and the individuals in each sample fall into two categories with at least 10 individuals in each category. The assumptions are satisfied. We summarize the relevant information:
                        Ages 15–24                               Ages 25–64
Sample size             $n_1 = 82{,}486$                         $n_2 = 219{,}170$
Number of individuals   $x_1 = 4{,}243$                          $x_2 = 10{,}701$
Sample proportion       $\hat{p}_1 = 4243/82486 = 0.051439$      $\hat{p}_2 = 10701/219170 = 0.048825$
Population proportion   $p_1$ (unknown)                          $p_2$ (unknown)
The null and alternate hypotheses are:
$H_0: p_1 = p_2$
$H_1: p_1 > p_2$

Example – Perform a Hypothesis Test
Solution (continued): The pooled proportion is
$\hat{p} = \dfrac{x_1 + x_2}{n_1 + n_2} = \dfrac{4243 + 10701}{82486 + 219170} = 0.049540$.
The test statistic is
$z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = 2.95$.
Because this is a right-tailed test, the critical value is the value for which the area to the right is 0.05. This value is $z_\alpha = 1.645$. We reject $H_0$ if $z > z_\alpha$. Because $z = 2.95$ and $z_\alpha = 1.645$, we reject $H_0$ at the $\alpha = 0.05$ level. We conclude that accidents involving drivers aged 15–24 are more likely to occur in a driveway than accidents involving drivers aged 25–64.
Remember: $H_0: p_1 = p_2$, $H_1: p_1 > p_2$, $n_1 = 82{,}486$, $n_2 = 219{,}170$, $x_1 = 4{,}243$, $x_2 = 10{,}701$, $\hat{p}_1 = 0.051439$, $\hat{p}_2 = 0.048825$.
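The arithmetic in this example can be checked with a short Python sketch, which also shows that the critical value method and the P-value method lead to the same decision. The variable names are chosen here for illustration; only the counts come from the example.

```python
import math
from statistics import NormalDist

x1, n1 = 4243, 82486      # driveway accidents / all accidents, ages 15-24
x2, n2 = 10701, 219170    # driveway accidents / all accidents, ages 25-64
alpha = 0.05

pooled = (x1 + x2) / (n1 + n2)
se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
z = (x1 / n1 - x2 / n2) / se

std_normal = NormalDist()
z_alpha = std_normal.inv_cdf(1 - alpha)   # right-tailed critical value
p_value = std_normal.cdf(-z)              # area to the right of z

print(f"pooled = {pooled:.6f}, z = {z:.2f}, z_alpha = {z_alpha:.3f}")
print("reject by critical value:", z > z_alpha)
print("reject by P-value:", p_value <= alpha)
```

With samples this large, a difference of about a quarter of a percentage point is enough to reject $H_0$, which is a reminder that statistical significance does not by itself indicate a large practical difference.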

You Should Know…
How to perform a hypothesis test for the difference between two proportions using the P-value method
How to perform a hypothesis test for the difference between two proportions using the critical value method