AP Statistics Chapter 21 Notes “More about Hypothesis Testing”
“P-Value”
The “p-value” of a hypothesis test is the probability of getting results at least as extreme as your sample’s results through natural sampling variability alone, assuming the null hypothesis is true. It is NOT the probability that the null hypothesis is true.
Example: Interpret p = 7%
Correct: If the null hypothesis is true, there is a 7% chance of getting results at least as extreme as this sample’s through natural sampling variability.
Incorrect: There is a 7% chance that the null hypothesis is true.
When the “p-value” is small (usually less than 5%), results like our sample’s would be unlikely to occur naturally if the null hypothesis were true, so we reject the null hypothesis.
Significance Level (A.K.A. Alpha Level)
The significance level (or alpha level) is the threshold we use to decide whether the results of our hypothesis test call for rejecting or retaining the null hypothesis. 5% is the most common, but other significance levels such as 10% or 1% are also used.
If the “p-value” is less than the alpha level, we reject the null hypothesis. Say “we have sufficient evidence from our data to conclude…”
If the “p-value” is higher than the alpha level, we retain the null hypothesis. Say “we have insufficient evidence from our data to conclude…”
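The decision rule above can be sketched in a few lines of Python. This is only an illustration: the function name `decide` and its wording are made up for these notes, and the p-value of 7% is the one from the interpretation example.

```python
def decide(p_value, alpha=0.05):
    """Apply the decision rule: reject H0 only when the p-value is below alpha."""
    if p_value < alpha:
        return "reject the null hypothesis (sufficient evidence)"
    return "retain the null hypothesis (insufficient evidence)"


# The p = 7% example, judged at three common alpha levels.
for alpha in (0.10, 0.05, 0.01):
    print(f"alpha = {alpha:.0%}: {decide(0.07, alpha)}")
```

The same p-value leads to rejecting at the 10% level but retaining at the 5% and 1% levels, which is why the alpha level should be chosen before looking at the data.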
Types of Errors
We are never 100% certain. There is always a potential for error. Here are the types of errors we can make:
Type 1 Error – You rejected the null hypothesis, but it was actually true.
Type 2 Error – You retained the null hypothesis, but it was actually false.
Example
Production managers on an assembly line must monitor the output to be sure that no more than 2% of their products are defective. They periodically inspect a random sample of the items produced. Based on the results of their sample, they will shut down the assembly line if they believe that more than 2% of the items produced are defective.
State the hypotheses.
In this situation, what is a type 1 error? Why is this bad?
In this situation, what is a type 2 error? Why is this bad?
Example
A statistics professor has observed that for several years about 13% of the students who initially enroll in his introductory statistics course withdraw before the end of the semester. A salesman suggests that he try a certain software package that gets students more involved with computers, predicting that it will lower the dropout rate.
1. What are the null and alternative hypotheses?
2. In this situation, what is a type 1 error? Why is this bad?
3. In this situation, what is a type 2 error? Why is this bad?
4. Initially, 203 students signed up for his introductory statistics course and 11 dropped out before the end of the semester. Perform a significance test at the 5% level. Should the professor spend money to continue using this software?
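One way to carry out part 4 is a one-proportion z-test with a lower-tail alternative (the dropout rate is below 13%). The sketch below is not an official solution. It assumes the 203 students can be treated as a random sample and that the normal approximation is reasonable (203 × 0.13 and 203 × 0.87 are both at least 10). The function name `one_proportion_z_test` is made up for these notes.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0):
    """Lower-tail one-proportion z-test (Ha: p < p0). Returns (z, p_value)."""
    p_hat = successes / n             # sample proportion
    se = sqrt(p0 * (1 - p0) / n)      # standard deviation of p-hat if H0 is true
    z = (p_hat - p0) / se             # how many SDs the sample falls below p0
    p_value = NormalDist().cdf(z)     # area to the left of z under the normal curve
    return z, p_value


z, p_value = one_proportion_z_test(successes=11, n=203, p0=0.13)
alpha = 0.05
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "retain H0")
```

Under these assumptions z comes out near -3.21 and the p-value is far below 5%, so the test rejects the null hypothesis. Whether the software is worth the money is still a judgment call that the test alone cannot make.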
The Power of the Test
The power of the test is the probability that the null hypothesis will be rejected when it is actually false. A higher power means a better chance of detecting that the null hypothesis is false, and therefore a lower chance of a type 2 error.
You are not asked to calculate the value of the power of the test in this course, just to understand the factors that influence it.
Factors that influence the power of the test
Higher significance level = stronger power because there is more chance to reject the null. The trade-off is a higher chance of a type 1 error.
Example: A 10% alpha level has a stronger power than a 5% alpha level.
Larger sample size = stronger power because it reduces the standard deviation of the sampling distribution, which decreases the probability of a type 2 error.
Example: A sample size of 1000 has a stronger power than a sample size of 500.
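Calculating power is beyond this course, but a rough numerical sketch can make the two factors above concrete. It uses the normal approximation for a lower-tail one-proportion test with a null value of 13%. The "true" dropout rate of 10% is a made-up value chosen only for illustration (power always depends on what the truth actually is), and the function name `power_lower_tail` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def power_lower_tail(p0, p_true, n, alpha):
    """Approximate power of a lower-tail one-proportion z-test (Ha: p < p0)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(alpha)               # negative cutoff on the z scale
    cutoff = p0 + z_crit * sqrt(p0 * (1 - p0) / n)   # reject H0 when p-hat < cutoff
    sd_true = sqrt(p_true * (1 - p_true) / n)        # SD of p-hat under the true p
    return std_normal.cdf((cutoff - p_true) / sd_true)


P0, P_TRUE = 0.13, 0.10  # P_TRUE is an assumed value, not from the notes

# Factor 1: a higher alpha level gives more power (same sample size).
for alpha in (0.05, 0.10):
    print(f"n = 203, alpha = {alpha:.0%}: power = {power_lower_tail(P0, P_TRUE, 203, alpha):.2f}")

# Factor 2: a larger sample gives more power (same alpha level).
for n in (500, 1000):
    print(f"n = {n}, alpha = 5%: power = {power_lower_tail(P0, P_TRUE, n, 0.05):.2f}")
```

The exact numbers change with the assumed true rate, but the direction does not: raising alpha or raising the sample size each increases the power.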
Back to the last example…
How will the power of the test be influenced if the professor uses a 1% significance level rather than a 5% significance level?
How will the power of the test be influenced if the professor uses a larger sample of statistics students?