Teaching Introductory Statistics with Simulation-Based Inference Allan Rossman and Beth Chance Cal Poly – San Luis Obispo


1 Teaching Introductory Statistics with Simulation-Based Inference Allan Rossman and Beth Chance Cal Poly – San Luis Obispo arossman@calpoly.edu bchance@calpoly.edu

2 Outline
- Who are you?
- Overview, motivation
- Three examples
- Advantages
- Implementation suggestions
- Assessment suggestions
- Resources
- Q&A
AMATYC webinar April 2016

3 Who are you? How many years have you been teaching?
- < 1 year
- 1-3 years
- 4-8 years
- 8-15 years
- > 15 years

4 Who are you? How many years have you been teaching statistics?
- Never
- 1-3 years
- 4-8 years
- 8-15 years
- > 15 years

5 Who are you? What is your background in statistics?
- No formal background
- A course or two
- Several courses but no degree
- Undergraduate degree in statistics
- Graduate degree in statistics
- Other

6 Who are you? Have you used simulation in teaching statistics?
- Never
- A bit, to demonstrate probability ideas
- Somewhat, to demonstrate sampling distributions
- A great deal, as an inference tool as well as for pedagogical demonstrations

7 p-values: Do your students ever struggle if you ask them to explain to you what a p-value represents?
- Never
- Sometimes
- Always

8 p-values: Do your students ever ask you whether they want a large p-value or a small p-value?
- Never
- Sometimes
- Always

9 Ptolemaic Curriculum?
“Ptolemy’s cosmology was needlessly complicated, because he put the earth at the center of his system, instead of putting the sun at the center. Our curriculum is needlessly complicated because we put the normal distribution, as an approximate sampling distribution for the mean, at the center of our curriculum, instead of putting the core logic of inference at the center.” – George Cobb (TISE, 2007)

10 Example 1: Helper/hinderer?
Sixteen pre-verbal infants were shown two videos of a toy trying to climb a hill
- One where a “helper” toy pushes the original toy up
- One where a “hinderer” toy pushes the toy back down
Infants were then presented with the two toys from the videos
- Researchers noted which toy the infant chose to play with
http://www.yale.edu/infantlab/socialevaluation/Helper-Hinderer.html

11 Example 1: Helper/hinderer?
Data: 14 of the 16 infants chose the “helper” toy
Two possible explanations
- Infants choose randomly, no genuine preference, researchers just got lucky
- Infants have a genuine preference for the helper toy
Core question of inference:
- Is such an extreme result unlikely to occur by chance (random choice) alone …
- … if there were no genuine preference (null model)?

12 Analysis options
Could use the normal approximation to the binomial, but sample size is too small for CLT
Could use a binomial probability calculation
We prefer a simulation approach
- To illustrate “how often would we get a result like this just by random chance?”
- Starting with tactile simulation

13 Strategy
Students flip a fair coin 16 times
- Count number of heads, with heads representing a choice of the helper toy and tails the hinderer toy
- Under the null model of no genuine preference
Repeat several times, combine results
- See how surprising it is to get 14 or more heads even with “such a small sample size”
- Approximate (empirical) p-value
Turn to applet for large number of repetitions: www.rossmanchance.com/ISIapplets.html (One Proportion)
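
The coin-flipping strategy above can be sketched in a few lines of Python. This is only an illustrative stand-in for the tactile simulation and the One Proportion applet, not the applet's own code; the function names, the seed, and the choice of 10,000 repetitions are assumptions made for this sketch. The exact binomial tail probability is computed alongside for comparison.

```python
import random
from math import comb

def simulate_heads(n_flips, rng):
    """Flip a fair coin n_flips times and return the number of heads."""
    return sum(rng.random() < 0.5 for _ in range(n_flips))

def empirical_p_value(observed, n_flips, reps, seed=2016):
    """Proportion of simulated repetitions with at least `observed` heads."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    hits = sum(simulate_heads(n_flips, rng) >= observed for _ in range(reps))
    return hits / reps

# Null model: no genuine preference, so each infant is a fair coin flip.
p_sim = empirical_p_value(observed=14, n_flips=16, reps=10_000)

# Exact binomial tail P(X >= 14) for X ~ Binomial(16, 0.5), for comparison.
p_exact = sum(comb(16, k) for k in range(14, 17)) / 2**16

print(f"simulated p-value: {p_sim:.4f}")
print(f"exact p-value:     {p_exact:.4f}")
```

The exact value is 137/65536, roughly 0.002, so 14 or more heads should turn up in only a handful of every thousand repetitions.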

14 Results
- Pretty unlikely to obtain 14 or more heads in 16 tosses of a fair coin, so …
- Pretty strong evidence that pre-verbal infants do have a genuine preference for helper toy and were not just choosing at random

15 Follow-up activity: Facial prototyping
- Who is on the left – Bob or Tim?

16 Follow-up activity: Facial prototyping
- Does our sample result provide convincing evidence that people have a genuine tendency to assign the name Tim to the face on the left?
- How can we use simulation to investigate this question?
- What conclusion would you draw?
- Explain reasoning process behind conclusion

17 Example 2: Dolphin therapy?
Subjects who suffer from mild to moderate depression were flown to Honduras, randomly assigned to a treatment
Is dolphin therapy more effective than control?
Core question of inference:
- Is such an extreme difference unlikely to occur by chance (random assignment) alone (if there were no treatment effect)?

18 Some approaches
Could calculate test statistic, p-value from approximate sampling distribution (z, chi-square)
- But it’s approximate
- But conditions might not hold
- But how does this relate to what “significance” means?
Could conduct Fisher’s Exact Test
- But there’s a lot of mathematical start-up required
- But that’s still not closely tied to what “significance” means
  Even though this is a randomization test

19 Alternative approach
Simulate random assignment process many times, see how often such an extreme result occurs
- 30 index cards representing 30 subjects
- Assume no treatment effect (null model)
  13 improver cards, 17 non-improver cards
- Re-randomize 30 subjects to two groups of 15 and 15
- Determine number of improvers in dolphin group
  Or, equivalently, difference in improvement proportions
- Repeat large number of times (turn to computer)
- Ask whether observed result is in tail of distribution
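
The index-card procedure above might be mimicked as follows. This is a minimal sketch, not the Two Proportions applet. The card counts (13 improvers, 17 non-improvers, two groups of 15) come from the slide; the observed count of 10 improvers in the dolphin group is an assumption, taken from the "10 or more successes" wording in the assessment slide. The function names, seed, and number of repetitions are also hypothetical.

```python
import random

def rerandomize_once(cards, group_size, rng):
    """Shuffle the cards and count improvers dealt to the first (dolphin) group."""
    shuffled = cards[:]
    rng.shuffle(shuffled)
    return sum(shuffled[:group_size])

def randomization_p_value(n_improvers, n_subjects, group_size, observed, reps, seed=2016):
    """Proportion of re-randomizations with at least `observed` improvers in the dolphin group."""
    rng = random.Random(seed)
    # 1 = improver card, 0 = non-improver card; outcomes are fixed under the null model.
    cards = [1] * n_improvers + [0] * (n_subjects - n_improvers)
    hits = sum(rerandomize_once(cards, group_size, rng) >= observed for _ in range(reps))
    return hits / reps

# ASSUMPTION: 10 improvers observed in the dolphin group (not stated on this slide).
OBSERVED_DOLPHIN_IMPROVERS = 10

p_value = randomization_p_value(
    n_improvers=13, n_subjects=30, group_size=15,
    observed=OBSERVED_DOLPHIN_IMPROVERS, reps=10_000,
)
print(f"approximate p-value: {p_value:.4f}")
```

Counting improvers in the dolphin group is equivalent to using the difference in improvement proportions, because the group sizes and the total number of improvers are fixed in every re-randomization.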

20 Analysis
www.rossmanchance.com/ISIapplets (Two Proportions)

21 Conclusion
Experimental result is statistically significant
- And what is the logic behind that?
  Observed result very unlikely to occur by chance (random assignment) alone (if dolphin therapy was not effective)
  Providing evidence that dolphin therapy is more effective

22 Example 3: Lingering sleep deprivation?
Does sleep deprivation have harmful effects on cognitive functioning three days later?
- 21 subjects; random assignment
Core question of inference:
- Is such an extreme difference unlikely to occur by chance (random assignment) alone (if there were no treatment effect)?

23 One approach
Calculate test statistic, p-value from approximate sampling distribution

24 Another approach
Simulate randomization process many times under null model, see how often such an extreme result (difference in group means) occurs
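
A randomization test on the difference in group means could look roughly like this. The transcript does not include the sleep deprivation data, so the two lists below are made-up placeholder scores, not the study's measurements, and the group names, function name, seed, and repetition count are likewise assumptions for this sketch.

```python
import random

def diff_in_means_p_value(group_a, group_b, reps=5000, seed=2016):
    """One-sided randomization test: how often is the shuffled difference
    in means (a minus b) at least as large as the observed difference?"""
    rng = random.Random(seed)
    n_a = len(group_a)
    observed = sum(group_a) / n_a - sum(group_b) / len(group_b)
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(reps):
        rng.shuffle(pooled)  # re-randomize subjects to the two groups
        new_a, new_b = pooled[:n_a], pooled[n_a:]
        diff = sum(new_a) / len(new_a) - sum(new_b) / len(new_b)
        if diff >= observed:
            hits += 1
    return observed, hits / reps

# PLACEHOLDER DATA (hypothetical improvement scores, NOT from the study).
unrestricted = [20, 15, 18, 25, 12, 23]
deprived = [5, 10, -3, 11, 7]

observed_diff, p_value = diff_in_means_p_value(unrestricted, deprived)
print(f"observed difference in means: {observed_diff:.2f}")
print(f"approximate p-value: {p_value:.4f}")
```

Swapping in a different statistic, such as a difference in medians, only requires changing the two lines that compute `observed` and `diff`.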

25 Advantages
You can do this from beginning of course!
Emphasizes entire process of conducting statistical investigations to answer real research questions
- From data collection to inference in one day
- As opposed to disconnected blocks of data analysis, then data collection, then probability, then statistical inference
Leads to deeper understanding of concepts such as statistical significance, p-value, confidence
Very powerful, easily generalized tool
- Flexibility in choice of test statistic (e.g. medians, odds ratio)
- Generalize to more than two groups

26 Implementation suggestions: What about normal-based methods: why?
Do not ignore them!
- A common shape often arises for empirical randomization/sampling distributions
  Duh!
- Students will see t-tests in other courses, research literature
- Process of standardization has inherent value
- Gain intuition through formulas

27 Implementation suggestions: What about normal-based methods: how?
Introduce after students have gained experience with randomization-based methods
As a prediction of the simulation results
Focus on role of standard deviation of statistic (standard error)
- Don’t do it all for them (you or technology)
  Start with tactile simulation (“I am that dot”)
  Applet still requires some thought, such as setting the sample size or entering the observed statistic to find the p-value

28 Implementation suggestions: What about interval estimation?
Two possible simulation-based approaches
- Invert test
  Test “all” possible values of parameter, see which do not put observed result in tail
  Easy enough (but tedious) with one-proportion situation (sliders), but not as obvious how to do this with comparing two proportions
- Estimate +/- margin-of-error
  Could estimate margin-of-error with simulated randomization distribution
  Rough confidence interval as statistic ± 2×(SD of statistic)
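
The "invert test" idea can be sketched for the one-proportion case using the helper/hinderer result (14 of 16). This is an illustration under stated assumptions rather than how the applet sliders work: the grid of candidate values (0.01 to 0.99), the two-sided p-value computed as twice the smaller tail, the 0.05 cutoff, the seed, and all function names are choices made for this sketch.

```python
import random

def two_sided_p_value(observed, n, p_null, reps, rng):
    """Simulate counts under p_null; return twice the smaller tail proportion."""
    counts = [sum(rng.random() < p_null for _ in range(n)) for _ in range(reps)]
    upper = sum(c >= observed for c in counts) / reps
    lower = sum(c <= observed for c in counts) / reps
    return min(1.0, 2 * min(upper, lower))

def plausible_values(observed, n, reps=2000, alpha=0.05, seed=2016):
    """Candidate parameter values that do NOT put the observed count in a tail."""
    rng = random.Random(seed)
    candidates = [k / 100 for k in range(1, 100)]  # grid of values to test
    return [p for p in candidates
            if two_sided_p_value(observed, n, p, reps, rng) > alpha]

kept = plausible_values(observed=14, n=16)
print(f"plausible values run from about {min(kept):.2f} to {max(kept):.2f}")
```

The retained values form a rough 95% interval for the probability of choosing the helper toy. Because each candidate is judged by simulation, the endpoints can shift slightly from run to run when the seed changes.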

29 Implementation suggestions: Can we introduce SBI gradually?
One class period:
- Use helper/hinderer activity to introduce concepts of statistical significance, p-value, could this have happened by random chance alone
Two class periods:
- Also use dolphin therapy activity to introduce inference for comparing two groups (chance = random assignment)
Three class periods:
- Also use sleep deprivation activity prior to two-sample t-tests
Four class periods:
- Also use an activity (perhaps draft lottery) to introduce inference for correlation (chance = drawing of numbers)

30 Assessment suggestions
Quick assessment of understanding of class activity
- What did the cards represent?
- What did shuffling and dealing the cards represent?
- What implicit assumption about the two groups did the shuffling of cards represent?
- What observational units were represented by the dots on the dotplot?
- Why did we count the number of repetitions with 10 or more “successes” (that is, why 10 and why “or more”)?

31 Assessment suggestions
Conceptual understanding of logic of inference
- Interpret p-value in context: Probability of observed data, or more extreme, under randomness hypothesis, if null model is true
- Summarize conclusion in context, and explain reasoning process
- Jargon-free multiple choice questions on interpretation, effect of changing sample size, etc.
- Ability to apply to new studies, scenarios
  Define null model, design simulation, draw conclusion
  More complicated scenarios (e.g., compare 3 groups), new statistics (e.g., relative risk)

32 Assessment
A graduate student is designing a research study. She is hoping to show that the results of an experiment are statistically significant. What type of p-value would she want to obtain?
- The magnitude of a p-value has no impact on statistical significance.
- A large p-value
- A small p-value

33 Assessment
A study of the effectiveness of a nicotine lozenge for helping smokers to quit found that of 459 nicotine lozenge users, 46.0% successfully abstained for 6 weeks, compared to 29.7% of the 458 smokers in the control group. What will be the purpose of the simulation analysis?
- To increase the sample size of the study
- To estimate the difference in the treatment probabilities
- To determine if the observed difference is unlikely to have happened by chance alone
- To create a normal distribution
- To create similar groups

34 Assessment suggestions: Multiple-choice example
You want to investigate a claim that women are more likely than men to dream in color. You take a random sample of men and a random sample of women (in your community) and ask whether they dream in color, and compare the proportions of each gender that dream in color.

35 Assessment suggestions: Multiple-choice example
If the difference in the proportions (who dream in color) between the two samples has a small p-value, which would be the best interpretation?
A. It would not be very surprising to obtain the observed sample results if there is really no difference between the proportions of men and women in your community that dream in color.
B. It would be very surprising to obtain the observed sample results if there is really no difference between the proportions of men and women in your community that dream in color.
C. It would be very surprising to obtain the observed sample results if there is really a difference between the proportion of men and women in your community that dream in color.
D. The probability is very small that there is no difference between the proportions of men and women in your community that dream in color.
E. The probability is very small that there is a difference between the proportions of men and women in your community that dream in color.

36 Assessment suggestions: Multiple-choice example
Suppose two more studies are conducted on this issue. Both studies find 30% of women sampled dream in color, compared to 20% of men. But Study C consists of 100 people of each sex, whereas Study D consists of 40 people of each gender. Which study would provide stronger evidence that there is a genuine difference between men and women on this issue?
A. Study C
B. Study D
C. The strength of evidence would be the same for these two studies
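
The effect of sample size in the Study C / Study D question can be explored with the same shuffling idea. The sketch below converts the stated percentages into counts (30 and 20 of 100 for Study C; 12 and 8 of 40 for Study D) and shuffles the pooled responses, treating the group labels as exchangeable under the null model of no difference. The function name, seed, and repetition count are assumptions for illustration.

```python
import random

def one_sided_p_value(successes_1, successes_2, n_per_group, reps=5000, seed=2016):
    """Proportion of shuffles whose difference in success counts (group 1 minus
    group 2) is at least as large as the observed difference."""
    rng = random.Random(seed)
    total_successes = successes_1 + successes_2
    # 1 = dreams in color, 0 = does not; pooled across both groups.
    cards = [1] * total_successes + [0] * (2 * n_per_group - total_successes)
    observed_diff = successes_1 - successes_2
    hits = 0
    for _ in range(reps):
        rng.shuffle(cards)
        group_1 = sum(cards[:n_per_group])
        group_2 = total_successes - group_1
        if group_1 - group_2 >= observed_diff:
            hits += 1
    return hits / reps

# Study C: 100 per group, 30% of women vs 20% of men.
p_study_c = one_sided_p_value(successes_1=30, successes_2=20, n_per_group=100)
# Study D: 40 per group, same percentages.
p_study_d = one_sided_p_value(successes_1=12, successes_2=8, n_per_group=40)

print(f"Study C approximate p-value: {p_study_c:.3f}")
print(f"Study D approximate p-value: {p_study_d:.3f}")
```

With identical sample proportions, the larger study yields the smaller p-value, which is the reasoning the question is meant to draw out.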

37 Conclusions
Put core logic of inference at center of course
- Normal-based methods obscure this logic
- Develop students’ understanding with experiential simulation-based inference
- Emphasize connections among
  Randomness in design of study
  Inference procedure
  Scope of conclusions

38 Resources

39 Resources

40 Resources
Simulation-based inference blog: www.causeweb.org/sbi/
ISI applets: www.rossmanchance.com/ISIapplets.html
StatKey app: lock5stat.com/statkey

41 Thanks! Questions?
arossman@calpoly.edu bchance@calpoly.edu

