
How Bad Is Oops?

When we make a decision in a hypothesis test (to reject or not to reject H0), we risk making one of two kinds of errors. If we reject, meaning we found evidence, the truth of the matter might actually be that nothing was going on. This is called a false positive, and it is a Type I error. If we do not reject, meaning we did not find sufficient evidence, the truth of the matter might actually be that there was a change and we just did not find the evidence. This is called a false negative, and it is a Type II error.
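The two error types can be seen directly in a simulation. This is a sketch that is not from the slides: the test scenario (a coin, n = 100 flips, a two-sided z test at alpha = 0.05, and a true alternative of p = 0.6) is made up for illustration.

```python
import random
from statistics import NormalDist

# Hypothetical setup: test H0: p = 0.5 for a coin using a two-sided z test
# at alpha = 0.05 on n = 100 flips per experiment.
N, ALPHA, N_TRIALS = 100, 0.05, 20_000
z_crit = NormalDist().inv_cdf(1 - ALPHA / 2)  # roughly 1.96

def rejects_h0(true_p):
    """Flip the coin N times, run the z test, report whether we reject H0."""
    heads = sum(random.random() < true_p for _ in range(N))
    p_hat = heads / N
    se = (0.5 * 0.5 / N) ** 0.5               # standard error assuming H0
    return abs((p_hat - 0.5) / se) > z_crit

random.seed(1)
# H0 actually true: any rejection is a false positive (Type I error).
type1_rate = sum(rejects_h0(0.5) for _ in range(N_TRIALS)) / N_TRIALS
# H0 actually false (true p = 0.6): failing to reject is a false negative
# (Type II error); this rate is beta for this particular alternative.
type2_rate = sum(not rejects_h0(0.6) for _ in range(N_TRIALS)) / N_TRIALS
print(type1_rate)  # lands near alpha = 0.05
print(type2_rate)
```

The Type I error rate hovers near alpha by construction; the Type II rate depends on how far the truth is from H0, which is why it cannot be set in advance the way alpha can.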

How Bad Is Oops? Sometimes a false positive is really, really bad, but a false negative is just kind of whatever. This means we should use a low alpha, like 1% instead of the usual 5%. Sometimes a false positive is kind of whatever, but a false negative is really, really bad. This means we should use a higher alpha, like 10%, in order to reduce our beta (β). Sometimes both are bad; this is the most common situation. This means we should use a middle-of-the-road alpha, such as our old standby, 5%.
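The alpha/beta trade-off can be computed rather than just asserted. This is a hedged sketch with made-up numbers, not from the slides: a one-sided z test of H0: mu = 0 vs H1: mu > 0 with sigma = 1 and n = 25, where the true mean is assumed to be 0.5.

```python
from statistics import NormalDist

Z = NormalDist()
n, true_mu = 25, 0.5                      # hypothetical scenario
shift = true_mu * n ** 0.5                # mean of the z statistic when H1 is true

betas = {}
for alpha in (0.01, 0.05, 0.10):
    z_crit = Z.inv_cdf(1 - alpha)         # reject H0 when z > z_crit
    betas[alpha] = Z.cdf(z_crit - shift)  # chance z still lands below the cutoff
    print(f"alpha={alpha:.2f}  beta={betas[alpha]:.3f}")
# Lower alpha -> higher beta: guarding harder against false positives
# makes false negatives more likely.
```

Running this shows beta shrinking as alpha grows, which is exactly the trade-off the slide describes: you pick which error to guard against, and the other one gets more likely.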

Some Examples Consider one of life’s most terrifying tests: a pregnancy test. Laugh all you want, but some day, you’ll understand. Generally a false negative and a false positive would both be at least inconvenient, if not a serious issue. Let’s look at some exceptions, however.

False Negatives Are Whatever Consider a married couple that has just reached the stage in their marriage where they want to try to have a kid. If the woman is, in fact, pregnant but the pregnancy test says she is not pregnant, the couple will just have to keep trying. Hurray for them!

False Positives Are Whatever Consider a married couple that decides not to use any form of birth control and to leave whether or not the wife gets pregnant in God’s hands. P.S. – My experience is this is pretty much the same as the couple I was just talking about, but whatever. At least at first, a false positive is not likely to be a big deal, as this couple is apparently open to the idea of having children. If the test says pregnant, but the woman is not pregnant (yet), then it is probably just a matter of time. I’ve heard studies show that, when no contraception is used, pregnancy occurs roughly 20% of the time.

The Traditional Pregnancy Test Usually it totally matters what the test says, and either kind of wrong answer is a really big deal. So, even for pregnancy tests, an alpha of 5% is generally the way to go.

They’re Not Even Equally Inferior!

P-values When we run a hypothesis test, we find a p-value. The p-value is the probability that, if our H0 is true, we could have gotten a sample this unusual by randomness alone. If the p-value is really low, that is evidence that the result was not just an act of randomness, but instead a sign that something is going on.
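As a minimal sketch of that definition (the z score here is hypothetical, not from the slides), here is a two-sided p-value computed from a z score with Python's standard library:

```python
from statistics import NormalDist

# Hypothetical result: a two-sided z test produced z = 2.10. The p-value is
# the chance, if H0 were true, of a z score at least this extreme in either
# direction -- hence the factor of 2.
z = 2.10
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
print(round(p_value, 4))  # 0.0357 -- below alpha = 0.05, so we would reject H0
```

This is exactly the "normalcdf step" the next slides refer to, just done in code instead of on the calculator.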

Critical Values Instead of finding the p-value for our actual z score, we can figure out what z score matches our α. Any z score beyond this critical value is too extreme, so we would reject H0. This saves us the trouble of calculating a p-value and makes it so we only have to calculate a z score. In other words, it only cuts out the normalcdf step. Instead of the normalcdf step, we find our critical z.

Critical Values To find our critical z, we usually need to draw a picture where we mark off the α and then use invNorm to find the critical z. This is very similar to the critical z process we used for confidence intervals. It is more work than the normalcdf step in almost every case, and the p-value tells us more than the critical value method does. Before fancy calculators could do the normalcdf step, the balance of convenience went the other way, especially for t tests.

Critical Values For a 2-tailed (≠) test with 5% significance, the critical z is ±1.96. Our decision rules would look like this instead of the ones we have been using: If |z| > 1.96, reject H0. If |z| ≤ 1.96, DNR H0. This method is generally inferior, but it rears its ugly head on the AP Test, so if you are taking it, we will discuss it in more detail during our review period and spare the rest of the class.
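The two decision rules always agree, which a quick sketch can confirm (the z score is hypothetical; the 1.96 comes from the slide):

```python
from statistics import NormalDist

Z = NormalDist()
alpha, z = 0.05, -2.30                 # hypothetical test statistic

# Rule 1: critical value method -- compare |z| to the critical z.
z_crit = Z.inv_cdf(1 - alpha / 2)      # about 1.96 for a two-sided 5% test
reject_by_critical = abs(z) > z_crit

# Rule 2: p-value method -- compare the p-value to alpha.
p_value = 2 * (1 - Z.cdf(abs(z)))
reject_by_p = p_value < alpha

print(reject_by_critical, reject_by_p)  # the two rules reach the same decision
```

They agree because both rules mark off the same 5% of the sampling distribution; the critical value method just fixes the cutoff in z units instead of probability units.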

Confidence Intervals This method has a fairly simple concept behind it. Confidence intervals are used to make a guess at what the true value for the population could reasonably be; in other words, we estimate p. In a hypothesis test, we assume a value for the population parameter because there is no evidence yet to the contrary: in H0, we assume p equals some specific value. If our confidence interval does not contain our value from H0, then clearly our assumption in H0 was unreasonable. So we reject it, and claim there is sufficient evidence.

Just To Clarify The decision rules would look like this: If p is outside of the confidence interval, reject H0. If p is inside the confidence interval, do not reject H0. This p refers not to the p-value, but to the p in the null hypothesis. For example, with H0: p = .5, we would check whether .5 is in the interval.
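Here is the confidence interval method in miniature. The data are made up for illustration (H0: p = 0.5, with a hypothetical sample of 120 successes in n = 200), and the interval is the usual large-sample z interval for a proportion:

```python
from statistics import NormalDist

# Hypothetical data: H0 claims p = 0.5; the sample gives 120 out of 200.
p0, n, successes = 0.5, 200, 120
p_hat = successes / n                        # sample proportion, 0.6

z_star = NormalDist().inv_cdf(0.975)         # critical z for 95% confidence
se = (p_hat * (1 - p_hat) / n) ** 0.5        # standard error using p_hat
lo, hi = p_hat - z_star * se, p_hat + z_star * se

# Decision rule from the slide: p0 outside the interval -> reject H0.
reject = not (lo <= p0 <= hi)
print(round(lo, 3), round(hi, 3), reject)
```

Because 0.5 falls below the interval, we reject H0, and the interval itself doubles as our estimate of where p really is, which is the selling point the next slide makes.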

Why Shouldn’t We Do This? Confidence intervals are built for a specific confidence level and have to be recalculated if you change your α. This is also one of the reasons critical z testing is inferior, since your critical z depends on α as well. P-values are compared to α, but they do not depend on it.

Why Should We Do Confidence Interval Testing? With the p-value method, our two possible results are evidence and not-so-evidence. With the confidence interval method, not only can we comment on evidence vs. not-so-evidence, but we also come away with an estimate for p. This allows us to discuss practical significance, since we can now estimate how much of a difference there probably is. We usually base our estimates on the more conservative end of the interval.

Example Let’s say I am looking for evidence that students this trimester had a higher fail rate in Algebra 1. If the previous fail rate was 19% and the 95% confidence interval is from 19.6% to 21.5%, then I can be confident that the fail rate is higher. But at the conservative end it is only 0.6 percentage points higher, and presumably not more than 2.5 points higher. This is statistically significant, but I personally would not consider it practically significant, as that is a fairly small increase.