Understanding the P-value… Really! Kari Lock Morgan Department of Statistical Science, Duke University with Robin Lock, Patti Frazer.

Slides:

Advertisements

Similar presentations

Panel at 2013 Joint Mathematics Meetings

Advertisements

Statistical Inference Using Scrambles and Bootstraps Robin Lock Burry Professor of Statistics St. Lawrence University MAA Allegheny Mountain 2014 Section.

Rerandomization in Randomized Experiments STAT 320 Duke University Kari Lock Morgan.

Using Randomization Methods to Build Conceptual Understanding in Statistical Inference: Day 1 Lock, Lock, Lock Morgan, Lock, and Lock MAA Minicourse –

Using Randomization Methods to Build Conceptual Understanding in Statistical Inference: Day 1 Lock, Lock, Lock, Lock, and Lock MAA Minicourse – Joint Mathematics.

Simulating with StatKey Kari Lock Morgan Department of Statistical Science Duke University Joint Mathematical Meetings, San Diego 1/11/13.

Statistics: Unlocking the Power of Data Lock 5 STAT 101 Dr. Kari Lock Morgan 10/23/12 Sections , Single Proportion, p Distribution (6.1)

Hypothesis Testing: Intervals and Tests

Bootstrap Distributions Or: How do we get a sense of a sampling distribution when we only have ONE sample?

Early Inference: Using Bootstraps to Introduce Confidence Intervals Robin H. Lock, Burry Professor of Statistics Patti Frazer Lock, Cummings Professor.

Hypothesis Testing I 2/8/12 More on bootstrapping Random chance

Intuitive Introduction to the Important Ideas of Inference Robin Lock – St. Lawrence University Patti Frazer Lock – St. Lawrence University Kari Lock Morgan.

Statistics: Unlocking the Power of Data Lock 5 Hypothesis Testing: p-value STAT 250 Dr. Kari Lock Morgan SECTION 4.2 Randomization distribution p-value.

A Fiddler on the Roof: Tradition vs. Modern Methods in Teaching Inference Patti Frazer Lock Robin H. Lock St. Lawrence University Joint Mathematics Meetings.

STAT 101 Dr. Kari Lock Morgan Exam 2 Review.

Connecting Simulation- Based Inference with Traditional Methods Kari Lock Morgan, Penn State Robin Lock, St. Lawrence University Patti Frazer Lock, St.

StatKey: Online Tools for Bootstrap Intervals and Randomization Tests Kari Lock Morgan Department of Statistical Science Duke University Joint work with.

Section 4.4 Creating Randomization Distributions.

Dr. Kari Lock Morgan Department of Statistics Penn State University Teaching the Common Core: Making Inferences and Justifying Conclusions ASA Webinar.

Starting Inference with Bootstraps and Randomizations Robin H. Lock, Burry Professor of Statistics St. Lawrence University Stat Chat Macalester College,

Using Simulation Methods to Introduce Statistical Inference Patti Frazer Lock Kari Lock Morgan Cummings Professor of Mathematics Assistant Professor of.

Building Conceptual Understanding of Statistical Inference with Lock 5 Dr. Kari Lock Morgan Department of Statistical Science Duke University Wake Forest.

Inference for Categorical Variables 2/29/12 Single Proportion, p Distribution Intervals and tests Difference in proportions, p 1 – p 2 One proportion or.

Bootstrapping: Let Your Data Be Your Guide Robin H. Lock Burry Professor of Statistics St. Lawrence University MAA Seaway Section Meeting Hamilton College,

Introducing Inference with Simulation Methods; Implementation at Duke University Kari Lock Morgan Department of Statistical Science, Duke University

Statistics: Unlocking the Power of Data Lock 5 Hypothesis Testing: p-value STAT 101 Dr. Kari Lock Morgan 9/25/12 SECTION 4.2 Randomization distribution.

Using Bootstrap Intervals and Randomization Tests to Enhance Conceptual Understanding in Introductory Statistics Kari Lock Morgan Department of Statistical.

ANOVA 3/19/12 Mini Review of simulation versus formulas and theoretical distributions Analysis of Variance (ANOVA) to compare means: testing for a difference.

Introducing Inference with Bootstrap and Randomization Procedures Dennis Lock Statistics Education Meeting October 30,

Dennis Shasha From a book co-written with Manda Wilson

Randomization Tests Dr. Kari Lock Morgan PSU /5/14.

Building Conceptual Understanding of Statistical Inference Patti Frazer Lock Cummings Professor of Mathematics St. Lawrence University

More Randomization Distributions, Connections

Using Simulation Methods to Introduce Inference Kari Lock Morgan Duke University In collaboration with Robin Lock, Patti Frazer Lock, Eric Lock, Dennis.

More About Significance Tests

Statistics: Unlocking the Power of Data Lock 5 Hypothesis Testing: Significance STAT 101 Dr. Kari Lock Morgan 9/27/12 SECTION 4.3 Significance level Statistical.

Statistics: Unlocking the Power of Data Lock 5 Synthesis STAT 250 Dr. Kari Lock Morgan SECTIONS 4.4, 4.5 Connecting bootstrapping and randomization (4.4)

Using Lock5 Statistics: Unlocking the Power of Data

Hypothesis Testing: p-value

How to Handle Intervals in a Simulation-Based Curriculum? Robin Lock Burry Professor of Statistics St. Lawrence University 2015 Joint Statistics Meetings.

Statistics: Unlocking the Power of Data Lock 5 Afternoon Session Using Lock5 Statistics: Unlocking the Power of Data Patti Frazer Lock University of Kentucky.

Building Conceptual Understanding of Statistical Inference Patti Frazer Lock Cummings Professor of Mathematics St. Lawrence University

Statistics: Unlocking the Power of Data Patti Frazer Lock Cummings Professor of Mathematics St. Lawrence University University of Kentucky.

Introducing Inference with Simulation Methods; Implementation at Duke University Kari Lock Morgan Department of Statistical Science, Duke University

Statistics: Unlocking the Power of Data Lock 5 STAT 101 Dr. Kari Lock Morgan 11/1/12 ANOVA SECTION 8.1 Testing for a difference in means across multiple.

Statistics: Unlocking the Power of Data Lock 5 Normal Distribution STAT 101 Dr. Kari Lock Morgan 10/18/12 Chapter 5 Normal distribution Central limit theorem.

Using Randomization Methods to Build Conceptual Understanding of Statistical Inference: Day 2 Lock, Lock, Lock Morgan, Lock, and Lock MAA Minicourse- Joint.

Robin Lock St. Lawrence University USCOTS Opening Session.

Section 10.1 Confidence Intervals

Introducing Inference with Bootstrapping and Randomization Kari Lock Morgan Department of Statistical Science, Duke University with.

Implementing a Randomization-Based Curriculum for Introductory Statistics Robin H. Lock, Burry Professor of Statistics St. Lawrence University Breakout.

Building Conceptual Understanding of Statistical Inference Patti Frazer Lock Cummings Professor of Mathematics St. Lawrence University Canton, New York.

Statistics: Unlocking the Power of Data Lock 5 Exam 2 Review STAT 101 Dr. Kari Lock Morgan 11/13/12 Review of Chapters 5-9.

Using Bootstrapping and Randomization to Introduce Statistical Inference Robin H. Lock, Burry Professor of Statistics Patti Frazer Lock, Cummings Professor.

Give your data the boot: What is bootstrapping? and Why does it matter? Patti Frazer Lock and Robin H. Lock St. Lawrence University MAA Seaway Section.

+ Using StatCrunch to Teach Statistics Using Resampling Techniques Webster West Texas A&M University.

Early Inference: Using Randomization to Introduce Hypothesis Tests Kari Lock, Harvard University Eric Lock, UNC Chapel Hill Dennis Lock, Iowa State Joint.

Synthesis and Review 2/20/12 Hypothesis Tests: the big picture Randomization distributions Connecting intervals and tests Review of major topics Open Q+A.

Bootstraps and Scrambles: Letting a Dataset Speak for Itself Robin H. Lock Patti Frazer Lock ‘75 Burry Professor of Statistics Cummings Professor of MathematicsSt.

Today: Hypothesis testing p-value Example: Paul the Octopus In 2008, Paul the Octopus predicted 8 World Cup games, and predicted them all correctly Is.

Statistics: Unlocking the Power of Data Lock 5 Hypothesis Testing: p-value STAT 250 Dr. Kari Lock Morgan SECTION 4.2 p-value.

Simulation-based inference beyond the introductory course Beth Chance Department of Statistics Cal Poly – San Luis Obispo

Using Randomization Methods to Build Conceptual Understanding in Statistical Inference: Day 1 Lock, Lock, Lock, Lock, and Lock Minicourse – Joint Mathematics.

Making computing skills part of learning introductory stats

Patti Frazer Lock Cummings Professor of Mathematics

Connecting Intuitive Simulation-Based Inference to Traditional Methods

Using Simulation Methods to Introduce Inference

Improving Conceptual Understanding in Intro Stats

Using Simulation Methods to Introduce Inference

Improving Conceptual Understanding in Intro Stats

Presentation transcript:

Understanding the P-value… Really! Kari Lock Morgan Department of Statistical Science, Duke University with Robin Lock, Patti Frazer Lock, Eric Lock, Dennis Lock Statistics: Unlocking the Power of Data Wiley Faculty Network 10/11/12

Mind-Set Matters 84 hotel maids recruited Half were randomly selected to be informed that their work satisfies recommendations for an active lifestyle After 8 weeks, the informed group had lost 1.59 more pounds, on average, than the control group Did the information actually cause them to lose more weight, or might we see a difference this extreme just by random chance??? Crum, A.J. and Langer, E.J. (2007). “Mind-Set Matters: Exercise and the Placebo Effect,” Psychological Science, 18:

Traditional Inference 1. Which formula? 2. Calculate numbers and plug into formula 3. Plug into calculator 4. Which theoretical distribution? 5. df? 6. find p-value < p-value < 0.01 > pt(2.65, 33, lower.tail=FALSE) [1]

Confidence intervals and hypothesis tests using the normal and t-distributions With a different formula for each situation, students often get mired in the details and fail to see the big picture Plugging numbers into formulas does little to help reinforce conceptual understanding Traditional Inference

Simulation methods (bootstrapping and randomization) are a computationally intensive alternative to the traditional approach Rather than relying on theoretical distributions for specific test statistics, we can directly simulate the distribution of any statistic Great for conceptual understanding! Simulation Methods

To generate a distribution assuming H 0 is true: Traditional Approach: Calculate a test statistic which should follow a known distribution if the null hypothesis is true (under some conditions) Randomization Approach: Decide on a statistic of interest. Simulate many randomizations assuming the null hypothesis is true, and calculate this statistic for each randomization Hypothesis Testing

Paul the Octopus

Paul the Octopus predicted 8 World Cup games, and predicted them all correctly Is this evidence that Paul actually has psychic powers? How unusual would this be if he was just randomly guessing (with a 50% chance of guessing correctly)? How could we figure this out? Paul the Octopus

Students each flip a coin 8 times, and count the number of heads Count the number of students with all 8 heads by a show of hands (will probably be 0) If Paul was just guessing, it would be very unlikely for him to get all 8 correct! How unlikely? Simulate many times!!! Simulate with Students

Simulate with StatKey

In a randomized experiment on treating cocaine addiction, 48 people were randomly assigned to take either Desipramine (a new drug), or Lithium (an existing drug) The outcome variable is whether or not a patient relapsed Is Desipramine significantly better than Lithium at treating cocaine addiction? Cocaine Addiction

RRRRRR RRRRRR RRRRRR RRRRRR RRRRRR RRRRRR RRRRRR RRRRRR RRRR RRRRRR RRRRRR RRRRRR RRRR RRRRRR RRRRRR RRRRRR New Drug Old Drug 1. Randomly assign units to treatment groups

RRRR RRRRRR RRRRRR NNNNNN RRRRRR RRRRNN NNNNNN RR NNNNNN R = Relapse N = No Relapse RRRR RRRRRR RRRRRR NNNNNN RRRRRR RRRRRR RRNNNN RR NNNNNN 2. Conduct experiment 3. Observe relapse counts in each group Old Drug New Drug 10 relapse, 14 no relapse18 relapse, 6 no relapse 1. Randomly assign units to treatment groups

Assume the null hypothesis is true Simulate new randomizations For each, calculate the statistic of interest Find the proportion of these simulated statistics that are as extreme as your observed statistic Randomization Test

RRRR RRRRRR RRRRRR NNNNNN RR RRRR RRRRNN NNNNNN RR NNNNNN 10 relapse, 14 no relapse18 relapse, 6 no relapse

RRRRRR RRRRNN NNNNNN NNNNNN RRRRRR RRRRRR RRRRRR NNNNNN RNRN RRRRRR RNRRRN RNNNRR NNNR NRRNNN NRNRRN RNRRRR Simulate another randomization New Drug Old Drug 16 relapse, 8 no relapse12 relapse, 12 no relapse

RRRR RRRRRR RRRRRR NNNNNN RR RRRR RNRRNN RRNRNR RR RNRNRR Simulate another randomization New Drug Old Drug 17 relapse, 7 no relapse11 relapse, 13 no relapse

Give students index cards labeled R (28 cards) and N (20 cards) Have them deal the cards into 2 groups Compute the difference in proportions Contribute to a class dotplot for the randomization distribution Simulate with Students

Why did you re-deal your cards? Why did you leave the outcomes (relapse or no relapse) unchanged on each card? Cocaine Addiction You want to know what would happen by random chance (the random allocation to treatment groups) if the null hypothesis is true (there is no difference between the drugs)

Simulate with StatKey The probability of getting results as extreme or more extreme than those observed if the null hypothesis is true, is about.02. p-value Proportion as extreme as observed statistic observed statistic Distribution of Statistic Assuming Null is True

Mind-Set Matters 84 hotel maids recruited Half were randomly selected to be informed that their work satisfies recommendations for an active lifestyle After 8 weeks, the informed group had lost 1.59 more pounds, on average, than the control group Did the information actually cause them to lose more weight, or might we see a difference this extreme just by random chance??? Crum, A.J. and Langer, E.J. (2007). “Mind-Set Matters: Exercise and the Placebo Effect,” Psychological Science, 18:

Simulation Approach Non-Informed … Informed … Non-Informed … Informed … Non-Informed … Informed … ACTUAL SIMULATED Difference in means: 1.59 Difference in means: Difference in means: 0.32

StatKey p-value Proportion as extreme as observed statistic observed statistic Distribution of Statistic by random chance, if H 0 true

Traditional Inference 1. Which formula? 2. Calculate numbers and plug into formula 3. Plug into calculator 4. Which theoretical distribution? 5. df? 6. find p-value < p-value < 0.01 > pt(2.65, 33, lower.tail=FALSE) [1]

Mind-Set and Weight Loss The Conclusion! The results seen in the experiment are very unlikely to happen just by random chance (just 6 out of 1000!) We have strong evidence that the information actually caused the informed maids to lose more weight! In other words, MIND-SET MATTERS!

Simulation vs Traditional Simulation methods intrinsically connected to concepts same procedure applies to all statistics no conditions to check minimal background knowledge needed Traditional methods (normal and t based) familiarity expected after intro stats needed for future statistics classes only summary statistics are needed insight from standard error

Simulation AND Traditional? Our book introduces inference with simulation methods, then covers the traditional methods Students have seen the normal distribution appear repeatedly via simulation; use this common shape to motivate traditional inference “Shortcut” formulas give the standard error, avoiding the need for thousands of simulations Students already know the concepts, so can go relatively fast through the mechanics

Topics Ch 1: Collecting Data Ch 2: Describing Data Ch 3: Confidence Intervals (Bootstrap) Ch 4: Hypothesis Tests (Randomization) Ch 5: Normal Distribution Ch 6: Inference for Means and Proportions (formulas and theory) Ch 7: Chi-Square Tests Ch 8: ANOVA Ch 9: Regression Ch 10: Multiple Regression (Optional): Probability

Normal and t-based inference after bootstrapping and randomization: Students have seen the normal distribution repeatedly – CLT easy! Same idea, just using formula for SE and comparing to theoretical distribution Can go very quickly through this! Theoretical Approach

p-value t-statistic

Introduce new statistic -  2 or F Students know that these can be compared to either a randomization distribution or a theoretical distribution Students are comfortable using either method, and see the connection! If conditions are met, the randomization and theoretical distributions are the same! Chi-Square and ANOVA

Chi-Square Statistic Randomization Distribution Chi-Square Distribution (3 df) p-value =  2 statistic = p-value = 0.356

Student Preferences Which way of doing inference gave you a better conceptual understanding of confidence intervals and hypothesis tests? Bootstrapping and Randomization Formulas and Theoretical Distributions %31%

Student Preferences Which way did you prefer to learn inference (confidence intervals and hypothesis tests)? Bootstrapping and Randomization Formulas and Theoretical Distributions %36% SimulationTraditional AP Stat3136 No AP Stat7424

Student Behavior Students were given data on the second midterm and asked to compute a confidence interval for the mean How they created the interval: Bootstrappingt.test in RFormula %8%

A Student Comment " I took AP Stat in high school and I got a 5. It was mainly all equations, and I had no idea of the theory behind any of what I was doing. Statkey and bootstrapping really made me understand the concepts I was learning, as opposed to just being able to just spit them out on an exam.” - one of my students

It is the way of the past… "Actually, the statistician does not carry out this very simple and very tedious process [the randomization test], but his conclusions have no justification beyond the fact that they agree with those which could have been arrived at by this elementary method." -- Sir R. A. Fisher, 1936

… and the way of the future “... the consensus curriculum is still an unwitting prisoner of history. What we teach is largely the technical machinery of numerical approximations based on the normal distribution and its many subsidiary cogs. This machinery was once necessary, because the conceptually simpler alternative based on permutations was computationally beyond our reach. Before computers statisticians had no choice. These days we have no excuse. Randomization-based inference makes a direct connection between data production and the logic of inference that deserves to be at the core of every introductory course.” -- Professor George Cobb, 2007

Book Statistics: Unlocking the Power of Data Robin H. Lock, St. Lawrence University Patti Frazer Lock, St. Lawrence University Kari Lock Morgan, Duke University Eric F. Lock, Duke University Dennis F. Lock, Iowa State To be published November 2012