Published by Jody Porter. Modified over 8 years ago.
1
Assessing Statistical Significance ROSS 2016 Lane-Getaz
2
Goals for the Dolphin Study Lesson(s)
Assess a categorical response variable (yes/no) between two groups.
Assess the design of the study.
Examine numerical (and graphical) summaries of the data.
Randomly assign subjects to groups (many times) to depict the null distribution.
Assess where the observed value (or larger) falls in the null distribution.
Draw an appropriate conclusion about statistical significance, within the confines of the study design.
3
Dolphin Therapy (Part I for Thursday)
Antonioli and Reveley (2005) investigated whether swimming with dolphins was therapeutic for reducing depression.
Researchers recruited 30 subjects aged 18-56 with mild to moderate depression.
Explanatory variable: both groups swam for one hour per day. The Dolphin Therapy group swam with bottlenose dolphins. The Control group did not swim with dolphins.
Response variable: whether the subject reported substantial improvement (yes or no).
4
The Dolphins and Depression Data
A depression scale was used to measure the subjects’ depression level before and after the treatment to obtain these results:
The researchers wish to claim that the large count of “improvers” in the Dolphin group was caused by the therapy.
How unlikely is it for the random assignment process alone to place 10 or more of these 13 improvers into the Dolphin Therapy group?
5
Dolphins and Depression (The Shuffle) Randomization Test
Observed Sample: Each card represents one subject, where 13 will improve regardless of treatment.
Explanatory variable: Dolphin Therapy (or Control) Group.
Response variable: Reports substantial improvement (or not).
Statistic: Count of the number improved in the Dolphin Group. (Note: Observed sample count = 10.)
6
Think about the Null Hypothesis
There are some number of subjects who will improve regardless of the treatment.
How often would the count of improvers in the Dolphin group be as large as the one observed (10) just by luck, if the treatment had no effect?
One method we can use to assess how unlikely it is to see 10 or more successes, if the therapy had no effect (Ho), is to depict the null distribution and locate the observed value in it.
We count the observed value (and others that would be even more convincing) out of all the values in the null distribution; that proportion is the p-value.
For every p-value there’s a null hypothesis waiting to be judged (Reject or not)!
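The card shuffle described above can be sketched in a few lines of Python. This is a minimal illustration, not the activity's own materials: it assumes 15 subjects per group (consistent with the reported difference of .467, since 10/15 − 3/15 ≈ .467), and the function names, seed, and number of repetitions are hypothetical choices.

```python
import random

def shuffle_count(rng, n_subjects=30, n_improvers=13, group_size=15):
    """Deal one random assignment and count improvers in the Dolphin stack."""
    # 1 = card for a subject who improves regardless of treatment, 0 = does not
    cards = [1] * n_improvers + [0] * (n_subjects - n_improvers)
    rng.shuffle(cards)
    # Assumption: the first `group_size` cards form the Dolphin Therapy group
    return sum(cards[:group_size])

def simulated_p_value(observed=10, reps=1000, seed=2016):
    """Proportion of shuffles with a Dolphin-group count >= the observed count."""
    rng = random.Random(seed)
    counts = [shuffle_count(rng) for _ in range(reps)]
    extreme = sum(c >= observed for c in counts)
    return extreme / reps, counts

if __name__ == "__main__":
    p, counts = simulated_p_value()
    print("simulated p-value:", p)
```

Each call to `shuffle_count` plays the role of one hand-dealt shuffle; the list `counts` is the simulated null distribution, and the p-value is the fraction of its entries at 10 or above.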
8
Thought Questions
What does this p-value mean relative to the null hypothesis? Do we reject Ho or not?
Based on the card shuffling simulation, can you say that the study results (the counts of success in the Dolphin stack) are statistically significant? If we reject Ho, the results are statistically significant.
Given the design of this study, would you conclude that the Dolphin Therapy is causing the reduction in depression? Why or why not?
9
The Randomization (Reshuffle) Test
This Dolphin study shuffle simulates the randomization test.
The randomization test was described in the first half of the 20th century by Sir Ronald A. Fisher.
Fisher’s Test finds all possible randomizations for the two-way table. (We generated just 60 of the possible random assignments and were still able to come to a conclusion.)
The p-value is the number of re-randomizations with counts at least as extreme as the observed count, divided by the total number of re-randomizations.
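For comparison with the 60 hand shuffles, the count over all possible randomizations can be computed directly. The sketch below assumes 15 subjects per group (an assumption consistent with the reported difference of .467) and uses only Python's standard `math.comb`; the function name and parameters are hypothetical.

```python
from math import comb

def exact_p_value(observed=10, n_improvers=13, n_non_improvers=17, group_size=15):
    """Exact one-sided p-value: P(Dolphin-group count >= observed) under Ho."""
    # Total number of ways to choose who lands in the Dolphin group
    total = comb(n_improvers + n_non_improvers, group_size)
    max_count = min(n_improvers, group_size)
    # Assignments that put k improvers (and group_size - k others) in the group
    extreme = sum(
        comb(n_improvers, k) * comb(n_non_improvers, group_size - k)
        for k in range(observed, max_count + 1)
    )
    return extreme / total

if __name__ == "__main__":
    print(round(exact_p_value(), 4))
```

Under these assumptions the exact one-sided p-value is roughly 0.013, which a shuffle simulation approximates more closely as the number of re-randomizations grows.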
10
Dolphin Therapy (Part II for Friday) Difference between Proportions
Observed Sample: Each card represents one subject, where 13 will improve regardless of treatment.
Explanatory variable: Dolphin Therapy (or Control) Group.
Response variable: Reports substantial improvement (or not).
Statistic: Difference in proportions of “improvers” in the Dolphin Group compared to the Control Group. (Note: Observed sample difference = 0.467.)
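As a worked check of the observed statistic, assuming 15 subjects per group (the group sizes are inferred here from the 30 subjects, the 13 improvers, and the reported difference, not stated on the slide):

```latex
\hat{p}_{\text{Dolphin}} - \hat{p}_{\text{Control}}
  = \frac{10}{15} - \frac{3}{15}
  = \frac{7}{15}
  \approx 0.467
```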
11
Dolphin Randomization Sample Data
Observed sample: http://www.rossmanchance.com/applets/ChiSqShuffle.html?dolphins=1
12
Conduct the Randomization Simulation: “Randomize, Replicate…and Reject?”
Null Hypothesis: There is no difference in the proportions who improve in the Dolphin Therapy Group compared to the Control Group.
Randomize: reshuffle the subjects into two groups.
Replicate: repeat the “random reassignment” many times and examine the variation in the distribution of statistics (differences in proportions) that occurs just by random assignment.
Reject: reject the null hypothesis if the sample statistic falls in the tails of the randomization distribution.
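The three steps can be sketched in Python as follows. This is an illustration under stated assumptions rather than the applet's implementation: it assumes 15 subjects per group with 10 and 3 improvers respectively, uses a one-sided (upper-tail) p-value, and the function names, seed, and repetition count are hypothetical.

```python
import random

GROUP_SIZE = 15  # assumption: 15 subjects in each group

def diff_in_proportions(outcomes):
    """Proportion improved in the Dolphin group minus that in the Control group."""
    dolphin, control = outcomes[:GROUP_SIZE], outcomes[GROUP_SIZE:]
    return sum(dolphin) / len(dolphin) - sum(control) / len(control)

def randomization_test(reps=5000, seed=1):
    rng = random.Random(seed)
    # Observed data: first 15 entries are the Dolphin group (10 improved),
    # last 15 are the Control group (3 improved); 1 = improved, 0 = not
    outcomes = [1] * 10 + [0] * 5 + [1] * 3 + [0] * 12
    observed = diff_in_proportions(outcomes)
    null_diffs = []
    for _ in range(reps):            # Replicate
        shuffled = outcomes[:]
        rng.shuffle(shuffled)        # Randomize: reassign subjects to groups
        null_diffs.append(diff_in_proportions(shuffled))
    # Upper-tail p-value; the small tolerance guards against float round-off
    p_value = sum(d >= observed - 1e-12 for d in null_diffs) / reps
    return observed, p_value, null_diffs

if __name__ == "__main__":
    observed, p_value, _ = randomization_test()
    print("observed difference:", round(observed, 3), "p-value:", p_value)
```

The "Reject?" step is then a judgment about whether the resulting p-value is small enough to count the observed difference as falling in the tail of `null_diffs`.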
13
Dolphin Randomization Simulation with Technology
Randomization distribution of proportion differences under the null hypothesis:
14
Thought Questions
Based on the computer simulation analysis, would you say that the difference in proportions of success in the Dolphin Therapy group compared to the Control Group is statistically significant? What does this mean relative to the null hypothesis?
What was the design of this study? Would you conclude that the Dolphin Therapy is causing the reduction in depression? Why or why not?
Do any of these answers really change just because we are looking at the difference in proportions instead of the count of successes in the Dolphin group?
15
After the Dolphin activity are you able to…
a) Identify types of variables in the study (categorical or quantitative?)
b) State the hypotheses for the Dolphin Therapy study (Ho and Ha).
c) Describe how playing cards were used to depict the null distribution.
d) Compute proportions for success in each study group.
e) Compute the difference between proportions (p Dolphin – p Control).
f) Discuss if the observed difference is in the direction we hypothesized.
g) Run the Two-way Table applet to create a null distribution.
h) Report and interpret your p-value in the context of the study.
i) Draw a conclusion in the context of the study design. (Are results generalizable, causal or neither?)
16
More about Randomization Tests
Bradley Efron, statistician (born May 24, 1938) and Professor at Stanford University: used compute-intensive statistical techniques, based on computer simulations, to replace some of the algebra-based statistical analysis.
Efron shuffled labels on data points to perform statistical significance tests (the randomization simulation) based on Fisher’s Test.
– Rossman (2008, p. 17) suggests simulation of the randomization test provides an informal and effective way to introduce “the logic of statistical inference.”
– We use these compute-intensive methods to further develop conceptual understanding of statistical methods.