Statistics: Unlocking the Power of Data Lock 5 Hypothesis Testing: Significance STAT 101 Dr. Kari Lock Morgan SECTION 4.3, 4.5 Significance level (4.3)

Slides:



Advertisements
Similar presentations
Our goal is to assess the evidence provided by the data in favor of some claim about the population. Section 6.2Tests of Significance.
Advertisements

Hypothesis Testing, Synthesis
Hypothesis Testing: Intervals and Tests
Chapter 10.  Real life problems are usually different than just estimation of population statistics.  We try on the basis of experimental evidence Whether.
Hypothesis Testing An introduction. Big picture Use a random sample to learn something about a larger population.
Chapter 12 Tests of Hypotheses Means 12.1 Tests of Hypotheses 12.2 Significance of Tests 12.3 Tests concerning Means 12.4 Tests concerning Means(unknown.
Inference Sampling distributions Hypothesis testing.
Our goal is to assess the evidence provided by the data in favor of some claim about the population. Section 6.2Tests of Significance.
Testing Hypotheses About Proportions Chapter 20. Hypotheses Hypotheses are working models that we adopt temporarily. Our starting hypothesis is called.
Hypothesis Testing: One Sample Mean or Proportion
Fundamentals of Hypothesis Testing. Identify the Population Assume the population mean TV sets is 3. (Null Hypothesis) REJECT Compute the Sample Mean.
© 1999 Prentice-Hall, Inc. Chap Chapter Topics Hypothesis Testing Methodology Z Test for the Mean (  Known) p-Value Approach to Hypothesis Testing.
Statistics: Unlocking the Power of Data Lock 5 Hypothesis Testing: Significance STAT 250 Dr. Kari Lock Morgan SECTION 4.3 Significance level (4.3) Statistical.
Determining Statistical Significance
Statistics: Unlocking the Power of Data Lock 5 Hypothesis Testing: p-value STAT 101 Dr. Kari Lock Morgan 9/25/12 SECTION 4.2 Randomization distribution.
Confidence Intervals and Hypothesis Tests
Hypothesis Testing III 2/15/12 Statistical significance Errors Power Significance and sample size Section 4.3 Professor Kari Lock Morgan Duke University.
Statistical Inference Decision Making (Hypothesis Testing) Decision Making (Hypothesis Testing) A formal method for decision making in the presence of.
Synthesis and Review 3/26/12 Multiple Comparisons Review of Concepts Review of Methods - Prezi Essential Synthesis 3 Professor Kari Lock Morgan Duke University.
McGraw-Hill/IrwinCopyright © 2009 by The McGraw-Hill Companies, Inc. All Rights Reserved. Chapter 9 Hypothesis Testing.
Hypothesis Testing A hypothesis is a conjecture about a population. Typically, these hypotheses will be stated in terms of a parameter such as  (mean)
© 2002 Prentice-Hall, Inc.Chap 7-1 Statistics for Managers using Excel 3 rd Edition Chapter 7 Fundamentals of Hypothesis Testing: One-Sample Tests.
© 2003 Prentice-Hall, Inc.Chap 9-1 Fundamentals of Hypothesis Testing: One-Sample Tests IE 340/440 PROCESS IMPROVEMENT THROUGH PLANNED EXPERIMENTATION.
More Randomization Distributions, Connections
Testing Hypotheses About Proportions
Section 9.1 Introduction to Statistical Tests 9.1 / 1 Hypothesis testing is used to make decisions concerning the value of a parameter.
1 Chapter 10: Section 10.1: Vocabulary of Hypothesis Testing.
Statistics: Unlocking the Power of Data Lock 5 Hypothesis Testing: Significance STAT 101 Dr. Kari Lock Morgan 9/27/12 SECTION 4.3 Significance level Statistical.
Week 8 Fundamentals of Hypothesis Testing: One-Sample Tests
Hypothesis Testing: p-value
Statistical Inference Decision Making (Hypothesis Testing) Decision Making (Hypothesis Testing) A formal method for decision making in the presence of.
Hypothesis Testing: One Sample Cases. Outline: – The logic of hypothesis testing – The Five-Step Model – Hypothesis testing for single sample means (z.
© 2003 Prentice-Hall, Inc.Chap 7-1 Business Statistics: A First Course (3 rd Edition) Chapter 7 Fundamentals of Hypothesis Testing: One-Sample Tests.
Copyright ©2011 Nelson Education Limited Large-Sample Tests of Hypotheses CHAPTER 9.
LECTURE 19 THURSDAY, 14 April STA 291 Spring
A Broad Overview of Key Statistical Concepts. An Overview of Our Review Populations and samples Parameters and statistics Confidence intervals Hypothesis.
Introduction to Probability and Statistics Thirteenth Edition Chapter 9 Large-Sample Tests of Hypotheses.
Agresti/Franklin Statistics, 1 of 122 Chapter 8 Statistical inference: Significance Tests About Hypotheses Learn …. To use an inferential method called.
Chapter 20 Testing hypotheses about proportions
Testing of Hypothesis Fundamentals of Hypothesis.
Lecture 16 Dustin Lueker.  Charlie claims that the average commute of his coworkers is 15 miles. Stu believes it is greater than that so he decides to.
Statistics: Unlocking the Power of Data Lock 5 Hypothesis Testing: Cautions STAT 250 Dr. Kari Lock Morgan SECTION 4.3, 4.5 Type I and II errors (4.3) Statistical.
Statistics: Unlocking the Power of Data Lock 5 Hypothesis Testing: Cautions STAT 250 Dr. Kari Lock Morgan SECTION 4.3, 4.5 Errors (4.3) Multiple testing.
Lecture 16 Section 8.1 Objectives: Testing Statistical Hypotheses − Stating hypotheses statements − Type I and II errors − Conducting a hypothesis test.
Economics 173 Business Statistics Lecture 4 Fall, 2001 Professor J. Petry
Chapter 20 Testing Hypothesis about proportions
Lecture 18 Dustin Lueker.  A way of statistically testing a hypothesis by comparing the data to values predicted by the hypothesis ◦ Data that fall far.
Copyright ©2013 Pearson Education, Inc. publishing as Prentice Hall 9-1 σ σ.
Lecture 17 Dustin Lueker.  A way of statistically testing a hypothesis by comparing the data to values predicted by the hypothesis ◦ Data that fall far.
Statistics: Unlocking the Power of Data Lock 5 Hypothesis Testing: Conclusions STAT 250 Dr. Kari Lock Morgan SECTION 4.3 Significance level (4.3) Statistical.
© 2004 Prentice-Hall, Inc.Chap 9-1 Basic Business Statistics (9 th Edition) Chapter 9 Fundamentals of Hypothesis Testing: One-Sample Tests.
Introduction Suppose that a pharmaceutical company is concerned that the mean potency  of an antibiotic meet the minimum government potency standards.
STA Lecture 221 !! DRAFT !! STA 291 Lecture 22 Chapter 11 Testing Hypothesis – Concepts of Hypothesis Testing.
What is a Hypothesis? A hypothesis is a claim (assumption) about the population parameter Examples of parameters are population mean or proportion The.
Stat 100 Chapter 23, Try prob. 5-6, 9 –12 Read Chapter 24.
Chapter 12 Tests of Hypotheses Means 12.1 Tests of Hypotheses 12.2 Significance of Tests 12.3 Tests concerning Means 12.4 Tests concerning Means(unknown.
6.2 Large Sample Significance Tests for a Mean “The reason students have trouble understanding hypothesis testing may be that they are trying to think.”
Today: Hypothesis testing p-value Example: Paul the Octopus In 2008, Paul the Octopus predicted 8 World Cup games, and predicted them all correctly Is.
Slide 20-1 Copyright © 2004 Pearson Education, Inc.
Hypothesis Tests for 1-Proportion Presentation 9.
Statistics: Unlocking the Power of Data Lock 5 Section 4.3 Determining Statistical Significance.
+ Homework 9.1:1-8, 21 & 22 Reading Guide 9.2 Section 9.1 Significance Tests: The Basics.
Module 10 Hypothesis Tests for One Population Mean
Hypothesis Testing: Conclusions
Testing Hypotheses about Proportions
A Closer Look at Testing
Determining Statistical Significance
Chapter 11: Introduction to Hypothesis Testing Lecture 5b
Hypothesis Testing A hypothesis is a claim or statement about the value of either a single population parameter or about the values of several population.
STA 291 Spring 2008 Lecture 17 Dustin Lueker.
Presentation transcript:

Statistics: Unlocking the Power of Data Lock 5 Hypothesis Testing: Significance STAT 101 Dr. Kari Lock Morgan SECTION 4.3, 4.5 Significance level (4.3) Statistical conclusions (4.3) Type I and II errors (4.3) Statistical versus practical significance (4.5) Multiple testing (4.5)

Statistics: Unlocking the Power of Data Lock 5 Review of Last Class The randomization distribution shows what types of statistics would be observed, just by random chance, if the null hypothesis were true A p-value is the chance of getting a statistic as extreme as that observed, if H 0 is true A p-value can be calculated as the proportion of statistics in the randomization distribution as extreme as the observed sample statistic The smaller the p-value, the stronger the evidence against H 0

Statistics: Unlocking the Power of Data Lock 5 Which of the following p-values gives the strongest evidence against H 0 ? a) b) 0.1 c) 0.32 d) 0.56 e) 0.94 p-value and H 0

Statistics: Unlocking the Power of Data Lock 5 Which of the following p-values gives the strongest evidence against H 0 ? a) 0.22 b) 0.45 c) 0.03 d) 0.8 e) 0.71 p-value and H 0

Statistics: Unlocking the Power of Data Lock 5 Two different studies obtain two different p- values. Study A obtained a p-value of and Study B obtained a p-value of 0.2. Which study obtained stronger evidence against the null hypothesis? a) Study A b) Study B p-value and H 0

Statistics: Unlocking the Power of Data Lock 5 Formal Decisions If the p-value is small:  REJECT H 0  the sample would be extreme if H 0 were true  the results are statistically significant  we have evidence for H a If the p-value is not small:  DO NOT REJECT H 0  the sample would not be too extreme if H 0 were true  the results are not statistically significant  the test is inconclusive; either H 0 or H a may be true

Statistics: Unlocking the Power of Data Lock 5 A formal hypothesis test has only two possible conclusions: 1.The p-value is small: reject the null hypothesis in favor of the alternative 2.The p-value is not small: do not reject the null hypothesis Formal Decisions How small?

Statistics: Unlocking the Power of Data Lock 5 Significance Level The significance level, , is the threshold below which the p-value is deemed small enough to reject the null hypothesis p-value <   Reject H 0 p-value >   Do not Reject H 0

Statistics: Unlocking the Power of Data Lock 5 Significance Level If the p-value is less than , the results are statistically significant, and we reject the null hypothesis in favor of the alternative If the p-value is not less than , the results are not statistically significant, and our test is inconclusive Often  = 0.05 by default, unless otherwise specified

Statistics: Unlocking the Power of Data Lock 5 Resveratrol, an ingredient in red wine and grapes, has been shown to promote weight loss in rodents, and has recently been investigated in primates (specifically, the Grey Mouse Lemur). A sample of lemurs had various measurements taken before and after receiving resveratrol supplementation for 4 weeks Red Wine and Weight Loss BioMed Central (2010, June 22). “Lemurs lose weight with ‘life-extending’ supplement resveratrol. Science Daily.

Statistics: Unlocking the Power of Data Lock 5 Red Wine and Weight Loss In the test to see if the mean resting metabolic rate is higher after treatment, the p-value is Using  = 0.05, is this difference statistically significant? (should we reject H 0 : no difference?) a) Yes b) No

Statistics: Unlocking the Power of Data Lock 5 Red Wine and Weight Loss In the test to see if the mean body mass is lower after treatment, the p-value is Using  = 0.05, is this difference statistically significant? (should we reject H 0 : no difference?) a) Yes b) No

Statistics: Unlocking the Power of Data Lock 5 Red Wine and Weight Loss In the test to see if locomotor activity changes after treatment, the p-value is Using  = 0.05, is this difference statistically significant? (should we reject H 0 : no difference?) a) Yes b) No

Statistics: Unlocking the Power of Data Lock 5 Red Wine and Weight Loss In the test to see if mean food intake changes after treatment, the p-value is Using  = 0.05, is this difference statistically significant? (should we reject H 0 : no difference?) a) Yes b) No

Statistics: Unlocking the Power of Data Lock 5 H 0 : X is an elephant H a : X is not an elephant Would you conclude, if you get the following data? X walks on two legs X has four legs Elephant Example

Statistics: Unlocking the Power of Data Lock 5 “For the logical fallacy of believing that a hypothesis has been proved to be true, merely because it is not contradicted by the available facts, has no more right to insinuate itself in statistical than in other kinds of scientific reasoning…” -Sir R. A. Fisher Never Accept H 0 “Do not reject H 0 ” is not the same as “accept H 0 ”! Lack of evidence against H 0 is NOT the same as evidence for H 0 !

Statistics: Unlocking the Power of Data Lock 5 Statistical Conclusions In a hypothesis test of H 0 :  = 10 vs H a :  < 10 the p-value is With α = 0.05, we conclude: a) Reject H 0 b) Do not reject H 0 c) Reject H a d) Do not reject H a

Statistics: Unlocking the Power of Data Lock 5 Statistical Conclusions In a hypothesis test of H 0 :  = 10 vs H a :  < 10 the p-value is With α = 0.01, we conclude: a) There is evidence that  = 10 b) There is evidence that  < 10 c) We have insufficient evidence to conclude anything

Statistics: Unlocking the Power of Data Lock 5 Statistical Conclusions In a hypothesis test of H 0 :  = 10 vs H a :  < 10 the p-value is With α = 0.01, we conclude: a) Reject H 0 b) Do not reject H 0 c) Reject H a d) Do not reject H a

Statistics: Unlocking the Power of Data Lock 5 Statistical Conclusions In a hypothesis test of H 0 :  = 10 vs H a :  < 10 the p-value is With α = 0.01, we conclude: a) There is evidence that  = 10 b) There is evidence that  < 10 c) We have insufficient evidence to conclude anything

Statistics: Unlocking the Power of Data Lock 5 Informal strength of evidence against H 0 : Formal decision of hypothesis test, based on  = 0.05 : Statistical Conclusions

Statistics: Unlocking the Power of Data Lock 5 Multiple Sclerosis and Sunlight It is believed that sunlight offers some protection against multiple sclerosis, but the reason is unknown Researchers randomly assigned mice to one of: Control (nothing) Vitamin D Supplements UV Light All mice were injected with proteins known to induce a mouse form of MS, and they observed which mice got MS Seppa, Nathan. “Sunlight may cut MS risk by itself”, Science News, April 24, 2010 pg 9, reporting on a study appearing March 22, 2010 in the Proceedings of the National Academy of Science.

Statistics: Unlocking the Power of Data Lock 5 Multiple Sclerosis and Sunlight For each situation below, write down  Null and alternative hypotheses  Informal description of the strength of evidence against H 0  Formal decision about H 0, using α = 0.05  Conclusion in the context of the question In testing whether UV light provides protection against MS (UV light vs control group), the p-value is In testing whether Vitamin D provides protection against MS (Vitamin D vs control group), the p- value is 0.47.

Statistics: Unlocking the Power of Data Lock 5 Multiple Sclerosis and Sunlight In testing whether UV light provides protection against MS (UV light vs control group), the p-value is

Statistics: Unlocking the Power of Data Lock 5 Multiple Sclerosis and Sunlight In testing whether Vitamin D provides protection against MS (Vitamin D vs control group), the p-value is 0.47.

Statistics: Unlocking the Power of Data Lock 5 There are four possibilities: Errors Reject H 0 Do not reject H 0 H 0 true H 0 false TYPE I ERROR TYPE II ERROR Truth Decision A Type I Error is rejecting a true null (false positive) A Type II Error is not rejecting a false null (false negative)

Statistics: Unlocking the Power of Data Lock 5 In the test to see if resveratrol is associated with food intake, the p-value is o If resveratrol is not associated with food intake, a Type I Error would have been made In the test to see if resveratrol is associated with locomotor activity, the p-value is o If resveratrol is associated with locomotor activity, a Type II Error would have been made Red Wine and Weight Loss

Statistics: Unlocking the Power of Data Lock 5 A person is innocent until proven guilty. Evidence must be beyond the shadow of a doubt. Types of mistakes in a verdict? Convict an innocent Release a guilty Analogy to Law

Statistics: Unlocking the Power of Data Lock 5 If the null hypothesis is true: 5% of statistics will be in the most extreme 5% 5% of statistics will give p-values less than % of statistics will lead to rejecting H 0 at α = 0.05 If α = 0.05, there is a 5% chance of a Type I error Distribution of statistics, assuming H 0 true: Probability of Type I Error

Statistics: Unlocking the Power of Data Lock 5 If the null hypothesis is true: 1% of statistics will be in the most extreme 1% 1% of statistics will give p-values less than % of statistics will lead to rejecting H 0 at α = 0.01 If α = 0.01, there is a 1% chance of a Type I error Distribution of statistics, assuming H 0 true: Probability of Type I Error

Statistics: Unlocking the Power of Data Lock 5 The probability of making a Type I error (rejecting a true null) is the significance level, α Probability of Type I Error

Statistics: Unlocking the Power of Data Lock 5 Probability of Type II Error How can we reduce the probability of making a Type II Error (not rejecting a false null)? Option 1: a) Decrease the significance level b) Increase the significance level Option 2: a) Decrease the sample size b) Increase the sample size

Statistics: Unlocking the Power of Data Lock 5 The probability of making a Type I error (rejecting a true null) if the null is true is the significance level, α The probability of making a Type II error (not rejecting a false null) if the alternative is true depends on the significance level and the sample size (among other things) α should be chosen depending how bad it is to make a Type I or Type II error Probability of Errors

Statistics: Unlocking the Power of Data Lock 5 Choosing α By default, usually α = 0.05 If a Type I error (rejecting a true null) is much worse than a Type II error, we may choose a smaller α, like α = 0.01 If a Type II error (not rejecting a false null) is much worse than a Type I error, we may choose a larger α, like α = 0.10

Statistics: Unlocking the Power of Data Lock 5 Come up with a hypothesis testing situation in which you may want to… Use a smaller significance level, like  = 0.01 Use a larger significance level, like  = 0.10 Significance Level

Statistics: Unlocking the Power of Data Lock 5 With small sample sizes, even large differences or effects may not be significant With large sample sizes, even a very small difference or effect can be significant A statistically significant result is not always practically significant, especially with large sample sizes Statistical vs Practical Significance

Statistics: Unlocking the Power of Data Lock 5 Example: Suppose a weight loss program recruits 10,000 people for a randomized experiment. A difference in average weight loss of only 0.5 lbs could be found to be statistically significant Suppose the experiment lasted for a year. Is a loss of ½ a pound practically significant? Statistical vs Practical Significance

Statistics: Unlocking the Power of Data Lock 5 Diet and Sex of Baby Are certain foods in your diet associated with whether or not you conceive a boy or a girl? To study this, researchers asked women about their eating habits, including asking whether or not they ate 133 different foods regularly A significant difference was found for breakfast cereal (mothers of boys eat more), prompting the headline “Breakfast Cereal Boosts Chances of Conceiving Boys”.

Statistics: Unlocking the Power of Data Lock 5 “Breakfast Cereal Boosts Chances of Conceiving Boys” I’m had identical twin boys a year ago, and I eat breakfast cereal every morning. Do you think this helped to boost my chances of having boys? a) Yes b) No c) Impossible to tell

Statistics: Unlocking the Power of Data Lock 5 Hypothesis Tests For each of the 133 foods studied, a hypothesis test was conducted for a difference between mothers who conceived boys and girls in the proportion who consume each food If there are NO differences (all null hypotheses are true), about how many significant differences would be found using α = 0.05? How might you explain the significant difference for breakfast cereal?

Statistics: Unlocking the Power of Data Lock 5 Multiple Testing When multiple hypothesis tests are conducted, the chance that at least one test incorrectly rejects a true null hypothesis increases with the number of tests. If the null hypotheses are all true, α of the tests will yield statistically significant results just by random chance.

Statistics: Unlocking the Power of Data Lock 5 Author: JB Landers

Statistics: Unlocking the Power of Data Lock 5 Multiple Comparisons Consider a topic that is being investigated by research teams all over the world  Using α = 0.05, 5% of teams are going to find something significant, even if the null hypothesis is true

Statistics: Unlocking the Power of Data Lock 5 Multiple Comparisons Consider a research team/company doing many hypothesis tests  Using α = 0.05, 5% of tests are going to be significant, even if the null hypotheses are all true

Statistics: Unlocking the Power of Data Lock 5 This is a serious problem The most important thing is to be aware of this issue, and not to trust claims that are obviously one of many tests (unless they specifically mention an adjustment for multiple testing) There are ways to account for this (e.g. Bonferroni’s Correction), but these are beyond the scope of this class Multiple Comparisons

Statistics: Unlocking the Power of Data Lock 5 Publication Bias publication bias refers to the fact that usually only the significant results get published The one study that turns out significant gets published, and no one knows about all the insignificant results This combined with the problem of multiple comparisons, can yield very misleading results

Statistics: Unlocking the Power of Data Lock 5 Jelly Beans Cause Acne!

Statistics: Unlocking the Power of Data Lock 5

Statistics: Unlocking the Power of Data Lock 5 Results are statistically significant if the p-value is less than the significance level, α In making formal decisions, reject H 0 if the p-value is less than α, otherwise do not reject H 0 Not rejecting H 0 is NOT the same as accepting H 0 Two types of errors: rejecting a true null (Type I) and not rejecting a false null (Type II) Statistical vs practical significance Using α = 0.05, 5% of all hypothesis tests will lead to rejecting the null, even if all nulls are true Summary

Statistics: Unlocking the Power of Data Lock 5 To Do Project 1 proposal due TODAY at 11:59pm Read Section 4.3, 4.5 Do HW 4 (due Monday, 2/24)