Statistics: Unlocking the Power of Data Lock 5 Hypothesis Testing: p-value STAT 250 Dr. Kari Lock Morgan SECTION 4.2 p-value.

Presentation transcript:

Two Plausible Explanations
If the sample data support the alternative, there are two plausible explanations:
1. The alternative hypothesis (H_a) is true
2. The null hypothesis (H_0) is true, and the sample results were just due to random chance
Do the data provide enough evidence to rule out #2? We need a way to quantify evidence against the null…

Question #1 of the Day
Is Desipramine or Lithium better at treating cocaine addiction? Is each better than a placebo?

Cocaine Addiction
In a randomized experiment on treating cocaine addiction, 48 people were randomly assigned to take either Desipramine (a new drug) or Lithium (an existing drug), and then followed to see who relapsed.
Question of interest: Is Desipramine better than Lithium at treating cocaine addiction?

Cocaine Addiction
Is Desipramine better than Lithium at treating cocaine addiction (preventing relapses)? What parameter(s) are we interested in?
a) Proportion
b) Mean
c) Difference in proportions
d) Difference in means
e) Correlation

Cocaine Addiction
Is Desipramine better than Lithium at treating cocaine addiction (preventing relapses)?
p_D: proportion of cocaine addicts who relapse after Desipramine
p_L: proportion of cocaine addicts who relapse after Lithium
What are the relevant hypotheses?
a) H_0: p_D = p_L, H_a: p_D ≠ p_L
b) H_0: p_D = p_L, H_a: p_D < p_L
c) H_0: p_D = p_L, H_a: p_D > p_L
d) H_0: p_D < p_L, H_a: p_D = p_L
e) H_0: p_D > p_L, H_a: p_D = p_L

1. Randomly assign units to treatment groups
[Figure: the 48 participants, each shown as a letter, are randomly split into a Desipramine group and a Lithium group]

1. Randomly assign units to treatment groups
2. Conduct experiment
3. Observe relapse counts in each group
R = Relapse, N = No Relapse
Desipramine: 10 relapse, 14 no relapse
Lithium: 18 relapse, 6 no relapse

Measuring Evidence against H_0
To see if a statistic provides evidence against H_0, we need to see what kind of sample statistics we would observe, just by random chance, if H_0 were true.

[Figure: observed results. Desipramine: 10 relapse, 14 no relapse; Lithium: 18 relapse, 6 no relapse]

Simulate another randomization
Desipramine: 16 relapse, 8 no relapse
Lithium: 12 relapse, 12 no relapse

Simulate another randomization
Desipramine: 17 relapse, 7 no relapse
Lithium: 11 relapse, 13 no relapse
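One simulated randomization like those above can be sketched in Python. This is a minimal sketch, not the tool used in the course: the helper name `rerandomize` and the seed are illustrative assumptions. It assumes the null hypothesis is true, so the 28 relapses and 20 non-relapses observed in total stay fixed and only the group labels are reshuffled.

```python
import random

def rerandomize(outcomes, group_size, rng):
    """Shuffle the outcomes and deal them into two groups (hypothetical helper)."""
    shuffled = list(outcomes)
    rng.shuffle(shuffled)
    return shuffled[:group_size], shuffled[group_size:]

# Observed totals across both groups: 10 + 18 = 28 relapses (R), 14 + 6 = 20 non-relapses (N).
outcomes = ["R"] * 28 + ["N"] * 20

rng = random.Random(0)  # arbitrary seed, used only for reproducibility
desipramine, lithium = rerandomize(outcomes, 24, rng)

for name, group in (("Desipramine", desipramine), ("Lithium", lithium)):
    print(f"{name}: {group.count('R')} relapse, {group.count('N')} no relapse")
```

Each call produces one possible split under the null hypothesis; repeating it many times builds up the distribution of the statistic.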

Distribution of Statistic under H_0
[Figure: randomization distribution of the statistic under H_0]
How extreme is our observed statistic?

p-value
The p-value is the chance of obtaining a sample statistic as extreme as (or more extreme than) the observed sample statistic, if the null hypothesis is true.

Calculating a p-value
1. What kinds of statistics would we get, just by random chance, if the null hypothesis were true? (randomization distribution)
2. What proportion of these statistics are as extreme as our original sample statistic? (p-value)
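The two steps can be sketched for the cocaine addiction data as follows. This is an illustrative sketch under stated assumptions: it takes the Desipramine group to be the one with 10 relapses out of 24, uses the one-sided alternative p_D < p_L, and the function name, number of simulations, and seed are arbitrary choices rather than anything prescribed by the slides.

```python
import random

def randomization_p_value(n_sims=10_000, seed=1):
    """Estimate a lower-tail p-value for Desipramine vs Lithium (hypothetical helper)."""
    rng = random.Random(seed)
    outcomes = [1] * 28 + [0] * 20   # 1 = relapse, 0 = no relapse, totals from both groups
    observed = 10 / 24 - 18 / 24     # Desipramine minus Lithium relapse proportion

    as_extreme = 0
    for _ in range(n_sims):
        # Step 1: re-randomize the outcomes into two groups of 24, as if H_0 were true.
        rng.shuffle(outcomes)
        diff = sum(outcomes[:24]) / 24 - sum(outcomes[24:]) / 24
        # Step 2: count statistics at least as extreme as the observed one
        # (small tolerance guards against floating-point rounding).
        if diff <= observed + 1e-12:
            as_extreme += 1
    return as_extreme / n_sims

p_value = randomization_p_value()
print(f"estimated p-value: {p_value:.3f}")
```

Because the estimate comes from random shuffles, it varies slightly with the seed and the number of simulations.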

Cocaine Addiction
[Figure: distribution of the statistic if H_0 is true, with the observed statistic marked; the p-value is the proportion as extreme as the observed statistic]
If the two drugs are equal regarding cocaine relapse rates, we have a 2% chance of seeing a difference in proportions as extreme as we observed.

p-value: The chance of obtaining a statistic as extreme as that observed, just by random chance, if the null hypothesis is true

Cocaine Addiction
In the cocaine addiction experiment, people were actually randomized to one of three groups: Desipramine, Lithium, or Placebo.
Does Desipramine do better than just a placebo at preventing relapses?
Does Lithium do better than just a placebo at preventing relapses?

Desipramine vs Placebo
[Figure: distribution of the statistic if H_0 is true, with the observed statistic and p-value marked]
If there were no difference between Desipramine and placebo regarding cocaine relapses, we would only see a difference as extreme as that observed 1 out of 1000 times.

Plausible Null?
Based on the Desipramine vs placebo results, do you think the null of no difference is plausible?
a) Yes
b) No

Lithium vs Placebo
[Figure: distribution of the statistic if H_0 is true, with the observed statistic and p-value marked]
If there were no difference between Lithium and placebo regarding cocaine relapses, we would see a difference as extreme as the one observed about 34% of the time.

Plausible Null?
Based on the Lithium vs placebo results, do you think the null of no difference is plausible?
a) Yes
b) No

Question #2 of the Day
Is sleep or caffeine better for memory?

Sleep versus Caffeine
Mednick, Cai, Kanady, and Drummond (2008). “Comparing the benefits of caffeine, naps and placebo on verbal, motor and perceptual memory,” Behavioral Brain Research, 193,
Students were given words to memorize, then randomly assigned to take either a 90 min nap or a caffeine pill. 2 ½ hours later, they were tested on their recall ability.
Explanatory variable: sleep or caffeine
Response variable: number of words recalled

Sleep versus Caffeine
What is the parameter of interest in the sleep versus caffeine experiment?
a) Proportion
b) Difference in proportions
c) Mean
d) Difference in means
e) Correlation

Sleep versus Caffeine
Let μ_s and μ_c be the true mean number of words recalled after sleeping and after caffeine.
Is sleep better than caffeine for memory? What are the null and alternative hypotheses?
a) H_0: μ_s ≠ μ_c, H_a: μ_s = μ_c
b) H_0: μ_s = μ_c, H_a: μ_s ≠ μ_c
c) H_0: μ_s ≠ μ_c, H_a: μ_s > μ_c
d) H_0: μ_s = μ_c, H_a: μ_s > μ_c
e) H_0: μ_s = μ_c, H_a: μ_s < μ_c

Sleep or Caffeine for Memory?
Words recalled, sleep group: 9, 11, 13, 14, 14, 15, 16, 17, 17, 18, 18, 21
Words recalled, caffeine group: 6, 7, 10, 10, 12, 12, 13, 14, 14, 15, 16, 18
sleep mean = 15.25
caffeine mean = 12.25
sleep mean - caffeine mean = 3
Is sleep actually better than caffeine for memory, or did the people with better memory just happen to get randomly assigned to sleep?

Sleep or Caffeine for Memory?
Observed data: sleep mean = 15.25, caffeine mean = 12.25
What kinds of results would you see, just by random chance, if there were no difference between sleep and caffeine?

Sleep or Caffeine for Memory?
[Figure: one simulated re-randomization of the sleep and caffeine labels, giving sleep mean - caffeine mean = 1]
What kinds of results would you see, just by random chance, if there were no difference between sleep and caffeine?

Sleep versus Caffeine
[Figure: distribution of the statistic if H_0 is true, with the observed statistic and p-value marked]
If there were no difference between sleep and caffeine regarding memory, we would see a difference as extreme as the one observed about 2% of the time.
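The roughly 2% figure can be approximated with a randomization test on the word-recall data. The sketch below is illustrative only: it assumes the one-sided alternative that sleep gives the higher mean, and the function name `simulate_differences`, the number of simulations, and the seed are arbitrary choices.

```python
import random
from statistics import mean

# Words recalled in each group, as listed on the data slide.
sleep = [9, 11, 13, 14, 14, 15, 16, 17, 17, 18, 18, 21]
caffeine = [6, 7, 10, 10, 12, 12, 13, 14, 14, 15, 16, 18]
observed = mean(sleep) - mean(caffeine)

def simulate_differences(group_a, group_b, n_sims=10_000, seed=1):
    """Re-randomize the group labels and return the simulated differences in means."""
    rng = random.Random(seed)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    diffs = []
    for _ in range(n_sims):
        rng.shuffle(pooled)  # under H_0 the labels are interchangeable
        diffs.append(mean(pooled[:n_a]) - mean(pooled[n_a:]))
    return diffs

diffs = simulate_differences(sleep, caffeine)
# Proportion of simulated differences at least as large as the observed one
# (small tolerance guards against floating-point rounding).
upper_tail = sum(d >= observed - 1e-9 for d in diffs) / len(diffs)
print(f"observed difference: {observed}, estimated upper-tail p-value: {upper_tail:.3f}")
```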

Sleep versus Caffeine
Let μ_s and μ_c be the true mean number of words recalled after sleeping and after caffeine.
Now: Is there a difference in average word recall between sleep and caffeine? What are the null and alternative hypotheses?
a) H_0: μ_s ≠ μ_c, H_a: μ_s = μ_c
b) H_0: μ_s = μ_c, H_a: μ_s ≠ μ_c
c) H_0: μ_s ≠ μ_c, H_a: μ_s > μ_c
d) H_0: μ_s = μ_c, H_a: μ_s > μ_c
e) H_0: μ_s = μ_c, H_a: μ_s < μ_c

Alternative Hypothesis
A one-sided alternative contains either > or <
A two-sided alternative contains ≠
The p-value is the proportion in the tail in the direction specified by H_a
For a two-sided alternative, the p-value is twice the proportion in the smaller tail
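These rules can be sketched as a small Python function that takes a list of simulated statistics. The function name `p_value`, the labels "greater", "less", and "two-sided", and the toy numbers are assumptions made for illustration; capping the two-sided result at 1 is also an added safeguard, not something stated on the slide.

```python
def p_value(null_stats, observed, alternative):
    """Proportion of simulated statistics in the tail(s) specified by H_a.

    alternative is "greater" (H_a contains >), "less" (H_a contains <),
    or "two-sided" (H_a contains ≠). Hypothetical helper for illustration.
    """
    n = len(null_stats)
    upper = sum(s >= observed for s in null_stats) / n  # right-tail proportion
    lower = sum(s <= observed for s in null_stats) / n  # left-tail proportion
    if alternative == "greater":
        return upper
    if alternative == "less":
        return lower
    if alternative == "two-sided":
        # Twice the smaller tail, capped at 1 so the result stays a valid proportion.
        return min(1.0, 2 * min(upper, lower))
    raise ValueError(f"unknown alternative: {alternative!r}")

# Toy randomization distribution, invented purely to exercise the function.
toy_stats = [-2, -1, 0, 0, 1, 2, 3, 3]
for alt in ("greater", "less", "two-sided"):
    print(alt, p_value(toy_stats, 3, alt))
```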

p-value and H_a
[Figure: three panels showing the p-value region for an upper-tail (right tail), lower-tail (left tail), and two-tailed alternative]

Sleep or Caffeine for Memory?
p-value = 2 × 0.022 = 0.044

p-value and H_0
The p-value measures how extreme the observed result would be, if the null hypothesis were true.
If the p-value is small, then a statistic as extreme as that observed would be unlikely if the null hypothesis were true, providing evidence against H_0.
The smaller the p-value, the stronger the evidence against the null hypothesis (and in favor of the alternative).

p-value and H_0
Which of the following p-values gives the strongest evidence against H_0?
a)
b) 0.1
c) 0.32
d) 0.56
e) 0.94

p-value and H_0
Which of the following p-values gives the strongest evidence against H_0?
a) 0.22
b) 0.45
c) 0.03
d) 0.8
e) 0.71

p-value and H_0
Two different studies obtain two different p-values. Study A obtained a p-value of and Study B obtained a p-value of 0.2. Which study obtained stronger evidence against the null hypothesis?
a) Study A
b) Study B

p-value and H_0
The smaller the p-value, the stronger the evidence against H_0.

Formal Decisions
A formal hypothesis test has only two possible conclusions:
1. The p-value is small: reject the null hypothesis in favor of the alternative
2. The p-value is not small: do not reject the null hypothesis
How small?

To Do
Read Section 4.2
HW 4.2 due Friday 10/23