CHAPTER 10 Asking and Answering Questions about a Population Proportion Created by Kathy Fritz
In the original study, 1385 women were sent a 19-question survey. Of the 561 surveys returned, 229 women said they would like to choose the sex of a future child. Of these 229 women, 140 would choose to have a girl. What is the population of interest? Are the 561 women who responded to the survey representative of the population? Does the high nonresponse rate pose a problem? Or, is observing a sample proportion as large as 140/229 ≈ 0.611 very unlikely if the population proportion is 0.50?
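The last question above can be explored numerically. The sketch below is only an illustration, not the method this chapter develops: it assumes the 229 responses behave like independent draws and sums the exact binomial probability of seeing 140 or more "girl" choices when the population proportion is 0.50. The function name `upper_tail_binomial` is made up for this example.

```python
from math import comb


def upper_tail_binomial(n: int, k: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k, n + 1))


n_choosers = 229  # women who said they would like to choose the sex
n_girl = 140      # of those, the number who would choose a girl

p_hat = n_girl / n_choosers
tail = upper_tail_binomial(n_choosers, n_girl, 0.50)

print(f"sample proportion = {p_hat:.3f}")
print(f"P(X >= {n_girl} when p = 0.50) = {tail:.5f}")
```

A very small tail probability would suggest that a sample proportion this large is hard to explain by chance alone if p = 0.50, although it says nothing about whether the respondents are representative.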
HYPOTHESES AND POSSIBLE CONCLUSIONS Null Hypothesis Alternative Hypothesis
Hypotheses In its simplest form, a hypothesis is a claim or statement about the value of a population characteristic. The following are examples of hypotheses about population proportions:
Hypothesis | Population Proportion of Interest | The Hypothesis Says...
p < 0.25 | p is the proportion of messages that included an attachment | Less than 25% of the messages sent included an attachment
p > 0.8 | p is the proportion of messages that were longer than 500 characters | More than 80% of the messages sent were longer than 500 characters
p = 0.3 | p is the proportion of messages that were sent to multiple recipients | 30% of the messages sent were sent to multiple recipients
What is a hypothesis test? A hypothesis test uses sample data to choose between two competing hypotheses about a population characteristic. Suppose that a particular community college claims that the majority of students completing an associate’s degree transfer to a 4-year college. You would then want to determine if the sample data provide convincing evidence in support of the hypothesis p > 0.5. To test a claim, set up competing hypotheses: p ≤ 0.5 and p > 0.5.
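Hypothesis statements: The null hypothesis, denoted by H0, is a claim about a population characteristic that is initially assumed to be true. The alternative hypothesis, denoted by Ha, is the competing claim. Two possible conclusions in a hypothesis test are: reject H0, or fail to reject H0.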
The Form of Hypotheses: Null hypothesis H0: p = p0. Alternative hypothesis Ha: p > p0, Ha: p < p0, or Ha: p ≠ p0, where p0 is the hypothesized value.
Let’s consider a murder trial... What is the null hypothesis? What is the alternative hypothesis?
In a study, researchers were interested in determining if sample data support the claim that more than one in four young adults live with their parents. Define the population characteristic: p = the proportion of young adults who live with their parents. State the hypotheses: H0: p = 0.25 versus Ha: p > 0.25.
A study included data from a survey of 1752 people ages 13 to 39. One of the survey questions asked participants how satisfied they were with their current financial situation. Suppose you want to determine if the survey data provide convincing evidence that fewer than 10% of adults ages 19 to 39 are very satisfied with their current financial situation. Define the population characteristic: p = the proportion of adults ages 19 to 39 who are very satisfied with their current financial situation. State the hypotheses: H0: p = 0.10 versus Ha: p < 0.10.
The manufacturer of M&Ms claims that 40% of plain M&Ms are brown. A sample of M&Ms will be used to determine if the proportion of brown M&Ms is different from what the manufacturer claims. Define the population characteristic: p = the proportion of plain M&Ms that are brown. State the hypotheses: H0: p = 0.40 versus Ha: p ≠ 0.40.
For each pair of hypotheses, indicate which are not legitimate and explain why
POTENTIAL ERRORS IN HYPOTHESIS TESTING Type I Errors Type II Errors Significance Level
When you perform a hypothesis test you make a decision: reject H0 or fail to reject H0. Each could possibly be a wrong decision; therefore, there are two types of errors.
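Type I error A Type I error is the error of rejecting H0 when H0 is actually true. The probability of a Type I error is denoted by α. In a hypothesis test, α is called the significance level of the test.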
Type II error A Type II error is the error of failing to reject H0 when H0 is actually false. The probability of a Type II error is denoted by β.
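To make the idea of a Type I error rate concrete, here is a minimal simulation sketch. It assumes the large-sample z test for a proportion that this chapter describes later, an upper-tailed alternative, and made-up settings (p0 = 0.5, n = 200, 2000 repetitions, a fixed random seed). Every sample is generated with H0 true, so each rejection is a Type I error.

```python
import random
from math import sqrt
from statistics import NormalDist


def z_statistic(successes: int, n: int, p0: float) -> float:
    """Large-sample z statistic for a single proportion."""
    p_hat = successes / n
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)


def simulate_type_i_rate(p0=0.5, n=200, alpha=0.05, reps=2000, seed=1):
    """Fraction of samples drawn with H0 true in which H0 is rejected."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha)  # upper-tail critical value
    rejections = 0
    for _ in range(reps):
        # Draw one sample of size n from a population where p really is p0.
        successes = sum(rng.random() < p0 for _ in range(n))
        if z_statistic(successes, n, p0) >= z_crit:
            rejections += 1
    return rejections / reps


type_i_rate = simulate_type_i_rate()
print(f"simulated Type I error rate: {type_i_rate:.3f}")
```

The simulated rate should land near the chosen significance level, though not exactly on it, because the count of successes is discrete and the simulation itself is random.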
The U.S. Bureau of Transportation Statistics reports that for 2008, 65.3% of all domestic passenger flights arrived on time (meaning within 15 minutes of the scheduled arrival time). Suppose that an airline with a poor on-time record decides to offer its employees a bonus if the airline’s proportion of on-time flights exceeds the overall industry rate of 0.653 in an upcoming month. Let p = the actual proportion of the airline’s flights that are on time during the month of interest. The hypotheses are: H0: p = 0.653 versus Ha: p > 0.653.
Boston Scientific developed a new heart stent used to treat arteries blocked by heart disease. The new stent, called the Liberte, is made of thinner metal than heart stents currently in use, making it easier for doctors to direct the stent to a blockage. In order to obtain approval to sell the new Liberte stent, the Food and Drug Administration (FDA) required Boston Scientific to provide evidence that the proportion of patients receiving the Liberte stent who experienced a re-blocked artery was less than 0.1. Let p = the proportion of patients receiving the Liberte stent who experience a re-blocked artery. H0: p = 0.1 versus Ha: p < 0.1
How does one decide what significance level α to use? After assessing the consequences of Type I and Type II errors, identify the largest α that is tolerable for the problem. Then employ a test procedure that uses this maximum acceptable value, rather than anything smaller, as the level of significance. Remember, using a smaller α increases β, the probability of a Type II error.
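The trade-off between the two error probabilities can be sketched numerically. The example below borrows the heart stent hypotheses (H0: p = 0.1 versus Ha: p < 0.1) but the sample size n = 500 and the "true" proportion 0.07 are invented for illustration. It assumes a lower-tailed large-sample z test and computes the probability of a Type II error exactly from the binomial distribution for several significance levels.

```python
from math import comb, floor, sqrt
from statistics import NormalDist


def binomial_cdf(c: int, n: int, p: float) -> float:
    """Return P(X <= c) for X ~ Binomial(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(0, c + 1))


def beta_lower_tailed(alpha: float, n: int, p0: float, p_true: float) -> float:
    """P(fail to reject H0: p = p0) when the true proportion is p_true < p0."""
    z_crit = NormalDist().inv_cdf(alpha)  # negative for a lower-tailed test
    # H0 is rejected when the success count is at or below this cutoff.
    cutoff = floor(n * p0 + z_crit * sqrt(n * p0 * (1 - p0)))
    return 1 - binomial_cdf(cutoff, n, p_true)


n, p0, p_true = 500, 0.10, 0.07  # n and p_true are hypothetical values
alphas = (0.10, 0.05, 0.01)
betas = [beta_lower_tailed(a, n, p0, p_true) for a in alphas]

for a, b in zip(alphas, betas):
    print(f"alpha = {a:.2f}  ->  beta = {b:.3f}")
```

With the sample size held fixed, each reduction in the significance level pushes the rejection cutoff further from the hypothesized value, so a false null hypothesis is missed more often.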
Heart Stents Revisited... Let p = the proportion of patients receiving the Liberte stent who experience a re-blocked artery. H0: p = 0.1 versus Ha: p < 0.1. A consequence of making a Type II error would be that the new stent is not approved for sale. Patients and doctors will not benefit from the new design. A consequence of making a Type I error would be that the new stent is approved for sale. More patients will experience re-blocked arteries.
THE LOGIC OF HYPOTHESIS TESTING An Informal Example
In June 2006, an Associated Press survey was conducted to investigate how people use the nutritional information provided on food packages. Interviews were conducted with 1003 randomly selected adult Americans, and each participant was asked a series of questions, including the following two: Question 1: When purchasing packaged food, how often do you check the nutritional labeling on the package? Question 2: How often do you purchase food that is bad for you, even after you’ve checked the nutrition labels? It was reported that 582 responded “frequently” to the question about checking labels and 441 responded “very often” or “somewhat often” to the question about purchasing bad foods even after checking the labels.
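Nutritional Labels Continued... For this sample: the proportion who frequently check nutritional labels is 582/1003 ≈ 0.58, and the proportion who very often or somewhat often purchase bad foods even after checking the labels is 441/1003 ≈ 0.44.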
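One informal way to reason about these two sample proportions is to ask how far each is from a reference value, measured in standard deviations of the sample proportion. The sketch below assumes the reference value 0.50 (that is, the question "is it a majority?"), which is an assumption made here for illustration rather than something stated on this slide.

```python
from math import sqrt

n = 1003             # adults interviewed
check_labels = 582   # responded "frequently" to Question 1
buy_bad_food = 441   # responded "very often" or "somewhat often" to Question 2
reference_p = 0.50   # assumed benchmark, not taken from the survey report


def sds_from_reference(count: int, n: int, p_ref: float) -> float:
    """Distance of count/n from p_ref, in standard deviations under p_ref."""
    p_hat = count / n
    return (p_hat - p_ref) / sqrt(p_ref * (1 - p_ref) / n)


p_check = check_labels / n
p_bad = buy_bad_food / n
z_check = sds_from_reference(check_labels, n, reference_p)
z_bad = sds_from_reference(buy_bad_food, n, reference_p)

print(f"check labels: p-hat = {p_check:.2f}, {z_check:+.2f} SDs from 0.50")
print(f"buy bad food: p-hat = {p_bad:.2f}, {z_bad:+.2f} SDs from 0.50")
```

A sample proportion several standard deviations above 0.50 would be hard to explain by chance if the population proportion were 0.50, while one below 0.50 gives no support to a claim that the population proportion exceeds 0.50.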
A PROCEDURE FOR CARRYING OUT A HYPOTHESIS TEST Test Statistic P-value
Test Statistic A test statistic is computed using sample data. The value of the test statistic is used to determine the P-value associated with the test.
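P-values The P-value (also sometimes called the observed significance level) is a measure of inconsistency between the null hypothesis and the observed sample. It is the probability, assuming that H0 is true, of obtaining a test statistic value at least as inconsistent with H0 as what actually resulted. You reject the null hypothesis when the P-value is less than or equal to the chosen significance level.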
Using P-values to make a decision: A decision in a hypothesis test is based on comparing the P-value to the chosen significance level α. For example, suppose that α = 0.05 and the P-value is no larger than 0.05. Then, because P-value ≤ 0.05, H0 would be rejected.
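The decision rule is simple enough to write as a one-line comparison. The function name `decide` and the example P-values below are hypothetical, chosen only to show both outcomes.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Compare a P-value to the significance level and return the decision."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a P-value must be between 0 and 1")
    return "reject H0" if p_value <= alpha else "fail to reject H0"


# Hypothetical P-values, used only to exercise the rule.
for p_value in (0.03, 0.05, 0.20):
    print(f"P-value = {p_value:.2f}, alpha = 0.05: {decide(p_value)}")
```

Note that the comparison uses "less than or equal to", so a P-value exactly equal to the significance level leads to rejecting H0.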
Recall the 5 Steps for Performing a Hypothesis Test
Step | This Step Includes...
H Hypotheses | 1. Describe the population characteristic of interest. 2. Translate the research question or claim into null and alternative hypotheses.
M Method | 1. Identify the appropriate test and test statistic. 2. Select a significance level for the test.
C Check | 1. Verify that any conditions for the selected test are met.
C Calculate | 1. Find the values of any sample statistics needed to calculate the value of the test statistic. 2. Calculate the value of the test statistic. 3. Determine the P-value for the test.
C Communicate Results | 1. Compare the P-value to the selected significance level and make a decision to either reject H0 or fail to reject H0. 2. Provide a conclusion in words that is in context and addresses the question of interest.
LARGE SAMPLE HYPOTHESIS TEST FOR A POPULATION PROPORTION
Computing P-values The calculation of the P-value depends on the form of the inequality in the alternative hypothesis.
Upper-tailed test (Ha: p > p0): P-value = area in the upper tail of the z curve, to the right of the calculated z.
Lower-tailed test (Ha: p < p0): P-value = area in the lower tail of the z curve, to the left of the calculated z.
Two-tailed test (Ha: p ≠ p0): P-value = sum of the areas in the two tails of the z curve, beyond the calculated -z and z.
A Large-Sample Test for a Population Proportion Appropriate when the following conditions are met: 1. The sample is a random sample from the population of interest or the sample is selected in a way that would result in a representative sample. 2. The sample size n is large. This condition is met when both np0 ≥ 10 and n(1 - p0) ≥ 10. When these conditions are met, the following test statistic can be used: z = (p̂ - p0) / sqrt(p0(1 - p0)/n), where p̂ is the sample proportion and p0 is the hypothesized value from the null hypothesis.
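The sample size check and the test statistic can be sketched in a few lines. The function names and the example numbers (60 successes in a sample of 100, hypothesized value 0.5) are invented for illustration. The random or representative sample condition cannot be checked by code and still has to be judged from how the data were collected.

```python
from math import sqrt


def large_sample_ok(n: int, p0: float) -> bool:
    """Sample size condition: n*p0 and n*(1 - p0) are both at least 10."""
    return n * p0 >= 10 and n * (1 - p0) >= 10


def z_statistic(successes: int, n: int, p0: float) -> float:
    """z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)."""
    p_hat = successes / n
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)


# Hypothetical data: 60 successes in a sample of 100, hypothesized p0 = 0.5.
n, successes, p0 = 100, 60, 0.5
if large_sample_ok(n, p0):
    z = z_statistic(successes, n, p0)
    print(f"z = {z:.2f}")
else:
    print("sample too small for the large-sample z test")
```

The standard deviation in the denominator uses the hypothesized value p0 rather than the sample proportion, because the P-value is computed assuming that H0 is true.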
A Large-Sample Test for a Population Proportion Continued... Null hypothesis: H0: p = p0
When the Alternative Hypothesis Is... | The P-value Is...
Ha: p > p0 | Area under the z curve to the right of the calculated value of the test statistic
Ha: p < p0 | Area under the z curve to the left of the calculated value of the test statistic
Ha: p ≠ p0 | 2·(area to the right of z) if z is positive, or 2·(area to the left of z) if z is negative
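The three P-value rules translate directly into code using the standard normal distribution from Python's standard library. The function name and the labels "greater", "less", and "two-sided" are naming choices made for this sketch.

```python
from statistics import NormalDist


def p_value_from_z(z: float, alternative: str) -> float:
    """P-value for a z statistic under the three forms of Ha."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "greater":    # Ha: p > p0, area to the right of z
        return 1 - std_normal.cdf(z)
    if alternative == "less":       # Ha: p < p0, area to the left of z
        return std_normal.cdf(z)
    if alternative == "two-sided":  # Ha: p != p0, both tails beyond |z|
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")


for alt in ("greater", "less", "two-sided"):
    print(f"z = 1.96, Ha {alt}: P-value = {p_value_from_z(1.96, alt):.4f}")
```

Using the absolute value of z in the two-sided case covers both rows of the rule at once, since the z curve is symmetric about 0.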
In a study, 2205 adolescents ages 12 to 19 took a cardiovascular treadmill test. The researchers conducting the study believed that the sample was representative of adolescents nationwide. Of the 2205 adolescents tested, 750 had a poor level of cardiovascular fitness. Does this sample provide convincing evidence that more than thirty percent of adolescents have a poor level of cardiovascular fitness? Hypotheses: Let p = the proportion of all adolescents who have a poor level of cardiovascular fitness. H0: p = 0.30 versus Ha: p > 0.30
Cardiovascular Fitness Continued... H0: p = 0.30 versus Ha: p > 0.30. Method: Because the answers to the four key questions are 1) hypothesis testing, 2) sample data, 3) one categorical variable, and 4) one sample, consider a large-sample hypothesis test for a population proportion. Significance level: α = 0.05. Because neither type of error is much worse than the other, you might choose a value of α = 0.05.
Cardiovascular Fitness Continued... H0: p = 0.30 versus Ha: p > 0.30. Check: 1. The researchers believed the sample to be representative of adolescents nationwide. 2. The sample size is large enough because np0 = 2205(0.3) = 661.5 ≥ 10 and n(1 - p0) = 2205(0.7) = 1543.5 ≥ 10.
Cardiovascular Fitness Continued... H0: p = 0.30 versus Ha: p > 0.30. Calculate: z = 4.00, P-value ≈ 0. Communicate Results: Decision: Because the P-value ≈ 0 is less than 0.05, reject H0. Conclusion: The sample provides convincing evidence that more than 30% of adolescents have a poor fitness level. Notice that the conclusion answers the question that was posed in the problem.
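The cardiovascular fitness example can be reproduced end to end as a check on the arithmetic. This is a sketch using only the numbers given in the problem (750 of 2205, p0 = 0.30, significance level 0.05). Without rounding any intermediate values, the z statistic comes out near 4.11 rather than the 4.00 shown above, which presumably reflects rounded intermediate values; the decision is the same either way.

```python
from math import sqrt
from statistics import NormalDist

n = 2205            # adolescents tested
poor_fitness = 750  # number with a poor level of cardiovascular fitness
p0 = 0.30           # hypothesized value from H0
alpha = 0.05        # significance level

# Check: sample size condition for the large-sample z test.
conditions_met = n * p0 >= 10 and n * (1 - p0) >= 10

# Calculate: sample proportion, test statistic, upper-tail P-value.
p_hat = poor_fitness / n
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
p_value = 1 - NormalDist().cdf(z)

# Communicate: compare the P-value to the significance level.
decision = "reject H0" if p_value <= alpha else "fail to reject H0"

print(f"conditions met: {conditions_met}")
print(f"p-hat = {p_hat:.4f}, z = {z:.2f}, P-value = {p_value:.6f}")
print(f"decision: {decision}")
```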
A Few Final Things to Consider 1. What about Small Samples? If np0 ≥ 10 and n(1 - p0) ≥ 10, the standard normal distribution is a reasonable approximation to the distribution of the z test statistic when the null hypothesis is true. If the sample size is not large enough to satisfy the large sample conditions, the distribution of the test statistic may be quite different from the standard normal distribution. Thus, you can’t use the standard normal distribution to calculate P-values.
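The slide above does not say what to do instead. One commonly used alternative, offered here as an assumption rather than as this chapter's method, is to compute the P-value directly from the binomial distribution. The sketch below handles only one-sided alternatives, and the data (3 successes in a sample of 20, H0: p = 0.3, Ha: p < 0.3) are made up so that np0 = 6 falls short of 10.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def exact_p_value(successes: int, n: int, p0: float, alternative: str) -> float:
    """Exact binomial P-value for a one-sided alternative."""
    if alternative == "less":     # Ha: p < p0
        return sum(binomial_pmf(x, n, p0) for x in range(0, successes + 1))
    if alternative == "greater":  # Ha: p > p0
        return sum(binomial_pmf(x, n, p0) for x in range(successes, n + 1))
    raise ValueError("this sketch supports only 'less' or 'greater'")


# Hypothetical small sample: 3 successes out of 20, testing H0: p = 0.3.
p_value = exact_p_value(3, 20, 0.3, "less")
print(f"exact P-value = {p_value:.4f}")
```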
A Few Final Things to Consider 2. Choosing a Potential Method Take a look back at Table 7.1 (on page 420 and also on the inside back cover of the text). As you get into the habit of answering the four key questions for each new situation that you encounter, it will become easier to use this table to select an appropriate method in a given situation.
AVOID THESE COMMON MISTAKES
Be sure to include all the relevant information:
1. Hypotheses. Whether specified in symbols or described in words, it is important that both the null and the alternative hypothesis be clearly stated. If using symbols, be sure to define them in the context of the problem.
2. Test procedure. You should be clear about what test procedure was used and why you think it was reasonable to use this procedure.
3. Test statistic. Be sure to include the value of the test statistic and the associated P-value.
4. Conclusion in context. Always provide a conclusion that is in the context of the problem and that answers the question posed.
Avoid These Common Mistakes 1. A hypothesis test can never show strong support for the null hypothesis. Make sure that you don’t confuse “There is no reason to believe the null is not true” with the statement “There is convincing evidence that the null hypothesis is true”. These are very different statements! This is like saying the defendant is “innocent” instead of “not guilty”.
Avoid These Common Mistakes 2. If you have complete information for the population (census), don’t carry out a hypothesis test!
Avoid These Common Mistakes 3. Don’t confuse statistical significance with practical significance. When a null hypothesis has been rejected, be sure to step back and evaluate the result in the light of its practical importance. For example, you may be convinced that the proportion who respond favorably to a proposed new medical treatment is greater than p 0 = 0.4. But if your estimate for the proposed new treatment is 0.405, it may not be of any practical use.
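A quick calculation shows how this can happen. The sample size below (100,000) is hypothetical and deliberately very large. With that many observations, a sample proportion of 0.405 is statistically distinguishable from p0 = 0.4 even though the difference is only half a percentage point.

```python
from math import sqrt
from statistics import NormalDist

p0 = 0.40           # hypothesized value from the example above
n = 100_000         # hypothetical, deliberately very large sample
successes = 40_500  # gives a sample proportion of 0.405

p_hat = successes / n
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
p_value = 1 - NormalDist().cdf(z)  # upper-tailed test, Ha: p > 0.4

print(f"p-hat = {p_hat:.3f}, z = {z:.2f}, P-value = {p_value:.5f}")
print(f"estimated improvement over p0: {p_hat - p0:.3f}")
```

A small P-value here says only that the difference from 0.4 is unlikely to be due to chance; whether an improvement of about 0.005 matters is a separate, practical question.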