1 How Bad Is Oops?

2 When we make a decision in a hypothesis test (to reject or not to reject H0), that decision determines which kind of error we might be making. If we reject, meaning we found evidence, the truth of the matter might actually be that nothing was going on. This is called a false positive, and it is the Type I error. If we do not reject, meaning we did not find sufficient evidence, the truth of the matter might actually be that there was a change, but we just did not find the evidence. This is called a false negative, and it is the Type II error.

3 How Bad Is Oops? Sometimes a false positive is really, really bad, but a false negative is just kind of whatever. This means we should use a low alpha, like 1% instead of the usual 5%. Sometimes a false positive is kind of whatever, but a false negative is really, really bad. This means we should use a higher alpha, like 10%, in order to minimize our beta (β). Sometimes both are bad. This is the most likely situation. This means we should use a middle-of-the-road alpha, such as our old standby, 5%.
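
None of this is on the slide, but here is a minimal Python sketch (all numbers invented) of what α really means: if we run lots of tests where H0 is actually true, roughly an α-sized share of them reject anyway, which is exactly a Type I error.

```python
import numpy as np
from scipy.stats import norm

# Simulate many one-proportion z-tests where H0 (p = 0.5) is actually TRUE,
# and count how often we reject anyway -- a false positive / Type I error.
rng = np.random.default_rng(0)
p0, n, alpha, trials = 0.5, 100, 0.05, 10_000

false_positives = 0
for _ in range(trials):
    p_hat = rng.binomial(n, p0) / n                 # a sample where H0 holds
    z = (p_hat - p0) / np.sqrt(p0 * (1 - p0) / n)   # one-proportion z statistic
    if 2 * norm.sf(abs(z)) < alpha:                 # two-tailed p-value vs. alpha
        false_positives += 1

# Close to 0.05 (up to the normal approximation): with alpha = 5%,
# about 5% of tests where H0 is true still reject it.
print(false_positives / trials)
```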

4 Some Examples Consider one of life’s most terrifying tests. A pregnancy test. Laugh all you want, but some day, you’ll understand. Generally a false negative and a false positive would both be at least inconvenient, if not a serious issue. Let’s look at some exceptions, however.

5 False Negatives Are Whatever Consider a married couple that has just reached the stage in their marriage where they want to try and have a kid. If the woman is, in fact, pregnant but the pregnancy test says she is not pregnant, the couple will just have to keep trying. Hurray for them!

6 False Positives Are Whatever Consider a married couple that decides not to use any form of birth control and to leave whether or not the wife gets pregnant in God’s hands. P.S. – My experience is this is pretty much the same as the couple I was just talking about, but whatever. At least at first, a false positive is not likely to be a big deal, as this couple is apparently open to the idea of having children. If the test says pregnant, but the woman is not pregnant (yet), then it is probably just a matter of time. I’ve heard that studies show that when no contraception is used, pregnancy occurs roughly 20% of the time.

7 The Traditional Pregnancy Test Usually it totally matters what the test says, and either kind of wrong answer is a really big deal. So, even for pregnancy tests, an alpha of 5% is generally the way to go.

8 They’re Not Even Equally Inferior!

9 P-values When we run a hypothesis test, we find a p-value. The p-value is the probability that, if our H0 is true, we could have gotten a sample at least this unusual by pure randomness. If the p-value is really low, this is evidence that it was not an act of randomness, but instead evidence of something going on.
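
To make that concrete, here is a small Python sketch of the p-value calculation (scipy’s norm.sf plays the role of the normalcdf step; the z-score is made up):

```python
from scipy.stats import norm

z = 2.31   # a made-up z-score from some sample

# Upper-tail (>) p-value: the area beyond z -- the normalcdf step
p_one_tail = norm.sf(z)              # about 0.0104

# Two-tailed (≠) p-value: the area beyond |z| in both tails
p_two_tail = 2 * norm.sf(abs(z))     # about 0.0209

print(p_one_tail, p_two_tail)
```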

10 Critical Values Instead of finding the p-value for our actual z-score, we can figure out what z-score matches our α. Any z-score beyond this critical value is too extreme, and so we would reject H0. This saves us the trouble of calculating a p-value and makes it so we only have to calculate a z-score. In other words, it only cuts out the normalcdf step. Instead of the normalcdf step, we find our critical z.

11 Critical Values To find our critical z, we usually need to draw a picture where we mark off the α and then use invNorm to find our critical z. This is very similar to the critical z process we used for confidence intervals. It is more work than the normalcdf step in almost every case, and the p-value tells us more than the critical value method does. Before fancy calculators could do the normalcdf step, the balance of convenience was the other way around, especially for t tests.
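
For reference, here is what the invNorm step looks like in Python (scipy’s norm.ppf is the invNorm analogue; the α is just an example):

```python
from scipy.stats import norm

alpha = 0.05

# One-tailed (>) test: all of alpha sits in the upper tail
z_crit_one = norm.ppf(1 - alpha)         # about 1.645

# Two-tailed (≠) test: alpha is split between the two tails
z_crit_two = norm.ppf(1 - alpha / 2)     # about 1.960 -- the ±1.96 on the next slide

print(z_crit_one, z_crit_two)
```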

12 Critical Values For a 2-tailed (≠) test with 5% significance, the critical z is ±1.96. Our decision rules would look like this instead of the ones we have been using: If |z| > 1.96, reject H0. If |z| ≤ 1.96, DNR H0. This method is generally inferior, but it rears its ugly head on the AP test, so if you are taking that, we will discuss it in more detail during our review period and spare the rest of the class.
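
Those decision rules translate directly into code; a minimal sketch with a made-up z-score:

```python
# Two-tailed decision rule at 5% significance: reject H0 when |z| > 1.96.
z = -2.40          # a made-up z-score from some sample
z_crit = 1.96

if abs(z) > z_crit:
    print("Reject H0")            # too extreme to blame on randomness
else:
    print("Do not reject H0")     # DNR: not sufficient evidence
```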

13 Confidence Intervals This method has a fairly simple concept behind it. Confidence intervals are used to make a guess at what the true value for the population could reasonably be. In other words, we estimate p. In a hypothesis test, we assume a value for the population parameter: in the H0, we assume p = something. If our confidence interval does not contain the value from the H0, then clearly our assumption in the H0 was unreasonable. So we reject it and claim there is sufficient evidence.

14 Just To Clarify The decision rules would look like this: If p is outside of the confidence interval, reject H0. If p is in the confidence interval, do not reject H0. This p refers not to the p-value, but to the p in the null hypothesis. For example, H0: p = .5 would mean p is .5.
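
Here is a sketch of the whole confidence-interval method for one proportion (the sample counts are invented): build the interval, then check whether the p from the H0 lands inside it.

```python
import math
from scipy.stats import norm

# Made-up sample: 260 successes in 500 trials, testing H0: p = 0.5
x, n, p0, conf = 260, 500, 0.5, 0.95

p_hat = x / n
z_star = norm.ppf(1 - (1 - conf) / 2)                  # critical z, about 1.96
margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)   # margin of error
low, high = p_hat - margin, p_hat + margin

if low <= p0 <= high:
    print(f"p0 = {p0} is inside ({low:.3f}, {high:.3f}): do not reject H0")
else:
    print(f"p0 = {p0} is outside ({low:.3f}, {high:.3f}): reject H0")
```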

15 Why Shouldn’t We Do This? Confidence intervals are based on a specific confidence level and have to be recalculated if you change your α. This is also one of the reasons critical z testing is bad, since your critical z depends on α as well. P-values are compared to α, but they do not depend on it.

16 Why Should We Do Confidence Interval Testing? With the p-value method, our two results are evidence and not-so-evidence. With the confidence interval method, not only can we comment on evidence vs. not-so-evidence, but we also come away with an estimate for p. This allows us to discuss practical significance, since we can now estimate how much of a difference there probably is. We usually base our estimates on the more conservative end of the interval.

17 Example Let’s say I am looking for evidence that students this trimester had a higher fail rate in Algebra 1. If the previous fail rate was 19% and the 95% confidence interval is from 19.6% to 21.5%, then I can be confident that the fail rate is higher. But it is only definitely 0.6% higher and presumably not more than 2.5% higher. This is statistically significant, but I personally would not consider it to be practically significant, as that is a fairly small increase.
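
As a quick sanity check on the slide’s numbers in code:

```python
# The slide's numbers: old fail rate 19%, new 95% CI of (19.6%, 21.5%)
p_old = 0.19
low, high = 0.196, 0.215

print(p_old < low)   # True: 0.19 sits below the whole interval, so reject H0
print(f"increase is between {low - p_old:.1%} and {high - p_old:.1%}")
# -> increase is between 0.6% and 2.5%: statistically, but arguably not
#    practically, significant
```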

