Using Simulation/Randomization to Introduce p-value in Week 1 Soma Roy Department of Statistics, Cal Poly, San Luis Obispo ICOTS9, Flagstaff – Arizona.

Slides:



Advertisements
Similar presentations
Using Technology to Engage Students and Enhance Learning Roxy Peck Cal Poly, San Luis Obispo
Advertisements

Implementation and Order of Topics at Hope College.
Panel at 2013 Joint Mathematics Meetings
An Active Approach to Statistical Inference using Randomization Methods Todd Swanson & Jill VanderStoep Hope College Holland, Michigan.
Using Simulation/Randomization-based Methods to Introduce Statistical Inference on Day 1 Soma Roy Department of Statistics, Cal Poly, San Luis Obispo CMC.
ANALYZING MORE GENERAL SITUATIONS UNIT 3. Unit Overview  In the first unit we explored tests of significance, confidence intervals, generalization, and.
Concepts of Statistical Inference: A Randomization-Based Curriculum Allan Rossman, Beth Chance, John Holcomb Cal Poly – San Luis Obispo, Cleveland State.
QUANTITATIVE EVIDENCE FOR THE USE OF SIMULATION AND RANDOMIZATION IN THE INTRODUCTORY STATISTICS COURSE Nathan Tintle Associate Professor of Statistics.
A debate of what we know, think we know, and don’t know about the use of simulation and randomization-based methods as alternatives to the consensus curriculum.
John Holcomb - Cleveland State University Beth Chance, Allan Rossman, Emily Tietjen - Cal Poly State University George Cobb - Mount Holyoke College
A new approach to introductory statistics Nathan Tintle Hope College.
A Fiddler on the Roof: Tradition vs. Modern Methods in Teaching Inference Patti Frazer Lock Robin H. Lock St. Lawrence University Joint Mathematics Meetings.
STAT E100 Exam 2 Review.
Stat 217 – Day 9 Sampling. Recap (Lab 1) Observational unit = Sarah’s attempts (n = 8) Research hypothesis:Can Sarah solve problems  Research conjecture:
Stat 301 – Day 28 Review. Last Time - Handout (a) Make sure you discuss shape, center, and spread, and cite graphical and numerical evidence, in context.
Stat 301 – Day 15 Comparing Groups. Statistical Inference Making statements about the “world” based on observing a sample of data, with an indication.
Stat 301 – Day 14 Review. Previously Instead of sampling from a process  Each trick or treater makes a “random” choice of what item to select; Sarah.
Stat 217 – Day 27 Chi-square tests (Topic 25). The Plan Exam 2 returned at end of class today  Mean.80 (36/45)  Solutions with commentary online  Discuss.
Stat 217 – Day 25 Regression. Last Time - ANOVA When?  Comparing 2 or means (one categorical and one quantitative variable) Research question  Null.
Stat 512 – Day 8 Tests of Significance (Ch. 6). Last Time Use random sampling to eliminate sampling errors Use caution to reduce nonsampling errors Use.
Stat 217 – Day 15 Statistical Inference (Topics 17 and 18)
Chapter 9 Hypothesis Testing.
Using Bootstrap Intervals and Randomization Tests to Enhance Conceptual Understanding in Introductory Statistics Kari Lock Morgan Department of Statistical.
Introducing Inference with Bootstrap and Randomization Procedures Dennis Lock Statistics Education Meeting October 30,
Statistics: Unlocking the Power of Data Lock 5 Hypothesis Testing: Hypotheses STAT 101 Dr. Kari Lock Morgan SECTION 4.1 Statistical test Null and alternative.
Review Tests of Significance. Single Proportion.
Chapter 8 Hypothesis testing 1. ▪Along with estimation, hypothesis testing is one of the major fields of statistical inference ▪In estimation, we: –don’t.
Using Simulation Methods to Introduce Inference Kari Lock Morgan Duke University In collaboration with Robin Lock, Patti Frazer Lock, Eric Lock, Dennis.
Chapter 9 Comparing More than Two Means. Review of Simulation-Based Tests  One proportion:  We created a null distribution by flipping a coin, rolling.
Aiming to Improve Students' Statistical Reasoning: An Introduction to AIMS Materials Bob delMas, Joan Garfield, and Andy Zieffler University of Minnesota.
Using Lock5 Statistics: Unlocking the Power of Data
Statistics: Unlocking the Power of Data Lock 5 Afternoon Session Using Lock5 Statistics: Unlocking the Power of Data Patti Frazer Lock University of Kentucky.
Essential Statistics Chapter 131 Introduction to Inference.
Introducing Statistical Inference with Randomization Tests Allan Rossman Cal Poly – San Luis Obispo
Confidence intervals are one of the two most common types of statistical inference. Use a confidence interval when your goal is to estimate a population.
CHAPTER 17: Tests of Significance: The Basics
Review of Chapters 1- 6 We review some important themes from the first 6 chapters 1.Introduction Statistics- Set of methods for collecting/analyzing data.
Statistics - methodology for collecting, analyzing, interpreting and drawing conclusions from collected data Anastasia Kadina GM presentation 6/15/2015.
9.1 – The Basics Ch 9 – Testing a Claim. Jack’s a candidate for mayor against 1 other person, so he must gain at least 50% of the votes. Based on a poll.
Introducing Inference with Bootstrapping and Randomization Kari Lock Morgan Department of Statistical Science, Duke University with.
Introduction to the Practice of Statistics Fifth Edition Chapter 6: Introduction to Inference Copyright © 2005 by W. H. Freeman and Company David S. Moore.
Implementing a Randomization-Based Curriculum for Introductory Statistics Robin H. Lock, Burry Professor of Statistics St. Lawrence University Breakout.
Ch 10 – Intro To Inference 10.1: Estimating with Confidence 10.2 Tests of Significance 10.3 Making Sense of Statistical Significance 10.4 Inference as.
PANEL: Rethinking the First Statistics Course for Math Majors Joint Statistical Meetings, 8/11/04 Allan Rossman Beth Chance Cal Poly – San Luis Obispo.
Bayesian Inference, Review 4/25/12 Frequentist inference Bayesian inference Review The Bayesian Heresy (pdf)pdf Professor Kari Lock Morgan Duke University.
Statistics: Unlocking the Power of Data Lock 5 Exam 2 Review STAT 101 Dr. Kari Lock Morgan 11/13/12 Review of Chapters 5-9.
CHAPTER 15: Tests of Significance The Basics ESSENTIAL STATISTICS Second Edition David S. Moore, William I. Notz, and Michael A. Fligner Lecture Presentation.
CHAPTER 9 Testing a Claim
Rejecting Chance – Testing Hypotheses in Research Thought Questions 1. Want to test a claim about the proportion of a population who have a certain trait.
Lecture PowerPoint Slides Basic Practice of Statistics 7 th Edition.
Early Inference: Using Randomization to Introduce Hypothesis Tests Kari Lock, Harvard University Eric Lock, UNC Chapel Hill Dennis Lock, Iowa State Joint.
AP Statistics Chapter 11 Notes. Significance Test & Hypothesis Significance test: a formal procedure for comparing observed data with a hypothesis whose.
Stat 31, Section 1, Last Time Distribution of Sample Means –Expected Value  same –Variance  less, Law of Averages, I –Dist’n  Normal, Law of Averages,
Synthesis and Review 2/20/12 Hypothesis Tests: the big picture Randomization distributions Connecting intervals and tests Review of major topics Open Q+A.
The Practice of Statistics, 5th Edition Starnes, Tabor, Yates, Moore Bedford Freeman Worth Publishers CHAPTER 9 Testing a Claim 9.1 Significance Tests:
CHAPTER 15: Tests of Significance The Basics ESSENTIAL STATISTICS Second Edition David S. Moore, William I. Notz, and Michael A. Fligner Lecture Presentation.
Teaching Introductory Statistics with Simulation-Based Inference Allan Rossman and Beth Chance Cal Poly – San Luis Obispo
Simulation-based inference beyond the introductory course Beth Chance Department of Statistics Cal Poly – San Luis Obispo
The Practice of Statistics, 5th Edition Starnes, Tabor, Yates, Moore Bedford Freeman Worth Publishers CHAPTER 9 Testing a Claim 9.1 Significance Tests:
Introducing Statistical Inference with Resampling Methods (Part 1)
Simulation Based Inference for Learning
What Is a Test of Significance?
CHAPTER 9 Testing a Claim
Stat 217 – Day 28 Review Stat 217.
Stat 217 – Day 17 Review.
Using Simulation Methods to Introduce Inference
CHAPTER 9 Testing a Claim
Using Simulation Methods to Introduce Inference
CHAPTER 9 Testing a Claim
CHAPTER 9 Testing a Claim
Presentation transcript:

Using Simulation/Randomization to Introduce p-value in Week 1 Soma Roy Department of Statistics, Cal Poly, San Luis Obispo ICOTS9, Flagstaff – Arizona July 15, 2014

Overview  Background and Motivation  Philosophy and Approach  Course Materials  Sample Examples/Explorations  Advantages of this Approach ICOTS9: July 15,

Background  The Stat 101 course  Algebra-based introductory statistics for non-majors  Recent changes in Stat 101  Content, pedagogy, and use of technology  Implementation of GAISE (2002) guidelines  A more “modern” course compared to twenty years ago  Still, one “traditional” aspect  Sequencing of topics ICOTS9: July 15,

Background (contd.) Typical “traditional” sequence of topics  Part I:  Descriptive statistics (graphical and numerical)  Part II:  Data collection (types of studies)  Part III:  Probability (e.g. normal distribution, z-scores, looking up z-tables to calculate probabilities)  Sampling distribution/CLT  Part IV: Inference  Tests of hypotheses, and  Confidence intervals ICOTS9: July 15,

 Concerns with using the typical “traditional” sequence of topics  Puts the inference at the very end of the quarter/semester  Leaves very little time for students to  Develop a strong conceptual understanding of the logic of inference, and the reasoning process behind:  Statistical significance  p-values (evaluation and interpretation)  Confidence intervals  Estimation of parameter of interest ICOTS9: July 15, Motivation

 Concerns with using the typical “traditional” sequence of topics (contd.)  Content (parts I, II, III, and IV) appears disconnected and compartmentalized  Not successful at presenting the big picture of the entire statistical investigation process ICOTS9: July 15, Motivation (contd.)

 Chance and Rossman (ISCAM, 2005)  introduce statistical inference in week 1 or 2 of a 10- week quarter in a calculus-based introductory statistics course.  Malone et al. (2010)  discuss reordering of topics such that inference methods for one categorical variable are introduced in week 3 of a 15-week semester, in Stat 101 type courses. ICOTS9: July 15, Recent attempts to change the sequence of topics

Our Philosophy  Expose students to the logic of statistical inference early  Give them time to develop and strengthen their understanding of the core concepts of  Statistical significance, and p-value (interpretation, and evaluation)  Interval estimation  Help students see that the core logic of inference stays the same regardless of data type and data structure  Give students the opportunity to “discover” how the study design connects with the scope of inference ICOTS9: July 15,

Our Approach: Key features  Introduce the concept of p-value in week 1  Present the entire 6-Step statistical investigation process  Use a spiral approach to repeat the 6-Step statistical investigation process in different scenarios ICOTS9: July 15,

Implementation of Our Approach  Order of topics  Inference (Tests of significance and Confidence Intervals) for  One proportion  One mean  Two proportions  Two means  Paired data  More than 2 means  r x k tables  Regression ICOTS9: July 15,

Implementation (contd.)  The key question is the same every time “Is the observed result surprising (unlikely) to have happened by random chance alone?”  First through simulation/randomization, and then theory-based methods, every time  Start with a tactile simulation/randomization using coins, dice, cards, etc.  Follow up with technology – purposefully-designed (free) web applets (instead of commercial software); self-explanatory; lots of visual explanation  Wrap up with “theory-based” method, if available ICOTS9: July 15,

Implementation (contd.)  So, we start with “Part IV.” What about “Part I” and “Part II”? Descriptive statistics and data collection?  Descriptive statistics are introduced as and when the need arises; step 3 of the 6-Step process.  E.g. Segmented bar charts for comparing two proportions  Random sampling is discussed early on, but the discussion on random assignment (and experiments vs. observational studies) is saved for comparison of two groups  These concepts fall under the discussion of scope of inference (generalization and causation); step 5 of the 6-Step process. ICOTS9: July 15,

Implementation (contd.)  What about “Part III”? Probability and sampling distribution? Theoretical distributions?  Probability and sampling distribution are integrated into inference  Students start working with the null distribution and p-value in week 1.  Theoretical distributions are introduced as alternative paths to approximating a p-value in certain situations.  Show that (under certain conditions) the theory can predict what the simulation will show  Show the limitations of the theory-based approach ICOTS9: July 15,

Course materials  Challenge: no existing textbook has these features  Sequencing of topics; process of statistical investigations  Just-in-time introduction of descriptive statistics and data collection concepts  Alternating between simulation/randomization- based and theory-based inference  So, we developed our own materials  Another key feature: an instructor can choose from  Exposition-based example, or  Activity-based exploration ICOTS9: July 15,

Example 1: Introduction to chance models  Research question: Can chimpanzees solve problems?  A trained adult chimpanzee named Sarah was shown videotapes of 8 different problems a human was having (Premack and Woodruff, 1978)  After each problem, she was shown two photographs, one of which showed a potential solution to the problem.  Sarah picked the correct photograph 7 out of 8 times.  Question to students: What are two possible explanations for why Sarah got 7 correct out of 8? ICOTS9: July 15,

Example 1: Intro to chance models (contd.)  Generally, students can come up with the two possible explanations 1.Sarah guesses in such situations, and got 7 correct just by chance 2.Sarah tends to do better than guess in such situations  Question: Given her performance, which explanation do you find more plausible?  Typically, students pick explanation #2 as the more plausible explanation for her performance.  Question: How do you rule out explanation #1? ICOTS9: July 15,

Example 1: Intro to chance models (contd.)  Simulate what Sarah’s results could-have-been had she been just guessing  Coin tossing seems like a reasonable mechanism to model “just guessing” each time  How many tosses?  How many repetitions? What to record after each repetition?  Thus, we establish the need to mimic the actual study, but now assuming Sarah is just guessing, to generate the pattern of “just guessing” results ICOTS9: July 15,

Example 1: Intro to chance models (contd.)  Here are the results of 35 repetitions ( for a class size of 35)  Question: What next? How can we use the above dotplot to decide whether Sarah’s performance is surprising (i.e. unlikely) to have happened by chance alone?  Aspects of the distribution to discuss: center and variability; typical and atypical values ICOTS9: July 15,

The One Proportion applet  Move to the applet to increase the number of repetitions  Question: Does the long-run guessing pattern convince you that Sarah does better than guess in such situations? Explain. ICOTS9: July 15,

Example 1: Intro to chance models (contd.)  For this first example/exploration, we are deliberate about  Getting across the idea of “is the observed result surprising to have happened by chance alone?”  Using a simple model  Having the observed result be quite clearly in the tail of the null distribution  Avoid terminology such as parameter, hypotheses, null distribution, and p-value ICOTS9: July 15,

Example 1: Intro to chance models (contd.)  Follow-up or “Think about it” questions:  What if Sarah had got 5 correct out of 8? Would her performance be more convincing, less convincing, or similarly convincing that she tends to do better than guess?  What if Sarah had got 14 correct out of 16 questions?  Based on Sarah’s results, can we conclude that all chimpanzees tend to do better than guess?  Step 6 of the 6-Step Statistical Investigation Process ICOTS9: July 15,

Example 2: Measuring the strength of evidence  Research question: Does psychic functioning exist?  Utts (1995) cites research from various studies involving the Ganzfeld technique  “Receiver” sitting in a different room has to choose the picture (from 4 choices) being “sent” by the “sender”  Out of 329 sessions, 106 produced a “hit” (Bem and Honorton, 1994)  Key question: Is the observed number of hits surprising (i.e. unlikely) to have happened by chance alone? ICOTS9: July 15,

Example 2: Measuring the strength of evidence (contd.)  Question: What is the probability of getting a hit by chance?  0.25 (because 1 out of 4)  Can’t use a coin. How about a spinner?  Same logic as before:  Use simulation to generate what the pattern/distribution for “number of hits” could-have- been if receivers are randomly choosing an image from 4 choices.  Compare the observed number of hits (106) to this pattern ICOTS9: July 15,

The One Proportion applet  Question: Is the observed number of hits surprising (i.e. unlikely) to have happened by chance alone?  What’s a measure of how unlikely?  “Tail proportion”  The p-value ! ICOTS9: July 15,

The One Proportion applet  Approx. p-value =  Note that the statistic can either be the number of or the proportion of hits ICOTS9: July 15,

Example 2: Measuring the strength of evidence (contd.)  Natural follow-ups  The standardized statistic (or z-score) as a measure of how far the observed result is in the tail of the null distribution  Theoretical distribution: the normal model, and normal approximation-based p-value  Examples of studies where the normal approximation is not a valid approach ICOTS9: July 15,

Example 2: Measuring the strength of evidence (contd.)  For this example we are deliberate about  Formalizing terminology such as hypotheses, parameter vs. statistic (with symbols), null distribution, and p-value  Moving away from model  Still staying with a one-sided alternative to facilitate the understanding of what the p-value measures, but in a simpler scenario ICOTS9: July 15,

What comes next  Two-sided tests for one proportion  Sampling from a finite population  Tests of significance for one mean  Confidence intervals: for one proportion, and for one mean  Observational studies vs. experiments  Comparing two groups – simulating randomization tests… ICOTS9: July 15,

Advantages of this approach  Does not rely on a formal discussion of probability, and hence can be used to introduce statistical inference as early as week 1  Provides a lot of opportunity for activity/exploration-based learning  Students seem to find it easier to interpret the p-value  Students seem to find it easier to remember that smaller p- values provide stronger evidence against the null ICOTS9: July 15,

Advantages of this approach (contd.)  Allows one to use the spiral approach  To deepen student understanding throughout the course  Allows one to use other statistics that don’t have theoretical distributions; for example, difference in medians, or relative risk (without getting into logs)  Most importantly, this approach is more fun for instructors (not that I am biased ) ICOTS9: July 15,

Assessment results  Beth Chance and Karen McGaughey, “Impact of simulation/randomization-based curriculum on student understanding of p-values and confidence intervals” – Session 6B, Thursday, 10:55 am  Nathan Tintle, “Quantitative evidence for the use of simulation and randomization in the introductory statistics course” – Session 8A; see Proceedings  Todd Swanson and Jill VanderStoep, “Student attitudes towards statistics from a randomization-based curriculum” – Session 1F; see Proceedings ICOTS9: July 15,

Acknowledgements  Thank you for listening!  National Science Foundation DUE/TUES ,  If you’d like to know more:  Workshop on Saturday, July 19, 8:00 am to 5:00 pm  “Modifying introductory courses to use simulation methods as the primary introduction to statistical inference”  Presenters: Beth Chance, Kari Lock Morgan, Patti Lock, Robin Lock, Allan Rossman, Todd Swanson, Jill VanderStoep ICOTS9: July 15,

Resources  Course materials: Introduction to Statistical Investigations (Fall 2014, John Wiley and Sons) by Nathan Tintle, Beth Chance, George Cobb, Allan Rossman, Soma Roy, Todd Swanson, Jill VanderStoep  Applets:  ICOTS9: July 15,