1
An Introduction to Educational Research Statistics
Graham McMahon MD MMSc
gmcmahon@partners.org
2
Course Overview
Last week:
- Stages of a trial from design to completion
- Generating hypotheses
- Working with the IRB
- Considering the funding required
- Trial designs
Today:
- Choosing an outcome variable
- Powering your study
- Establishing inter-rater reliability
- Determining if there is a difference between two groups
- Test development
- Qualitative approaches
3
Stages of an Educational Interventional Trial
Stage | Activities
1 Initial Design | Hypothesis, size
2 Protocol Design | Define methods, collaborations, IRB
3 Recruitment | Subject acquisition, monitoring
4 Followup | Collect outcome data
5 Analysis | Prepare “Clean + Locked” database; perform analysis
6 Reporting | Write and submit manuscript
7 Additional analyses | Further explorations of trial data
4
Population & Sampling
Must balance:
- Variability [the smaller or more diverse the population, the more variable; variability creates error]
- Generalizability [population can’t be too specific]
- Access [you can only study those you have access to]
- Cost [larger studies are much more expensive]
Consider:
- Participation rate
- Multiple sites
- Online projects
- Lower reimbursement
5
Outcome
What is really important? What would colleagues care about?
- ‘Hard’ outcomes: death, attendance
- ‘Soft’ outcomes: satisfaction, self-confidence
6
Outcomes / Endpoints
Primary Outcome
- What you power your study on
Secondary Outcome
- Other related outcomes that may be interesting to test
Exploratory Outcomes
- Association studies, subgroups that may be interesting, but likely to be underpowered
- May serve as pilot data for future studies
Surrogate Endpoint
- In the causal pathway and affected by the intervention
7
Group Activity
Medical errors and patient safety continue to be an important concern for patients and physicians. Numerous reports have suggested that fatigue and sleepiness contribute to medical errors. You are the program director of an internal medicine residency that has 40 residents, and you want to make a contribution in this area.
- List a hypothesis that could be generated based on this reflection.
- How would you measure sleepiness?
8
You review the available sleepiness scales and must choose one. Which one is best?
Measure | A Awake index | B Sleepy score | C Doze Index | D Snory scale | E Yawn score
Scale Size | 8 | 100 | 20 | 60 | 12
Mean Rating for Residents | 6 | 72 | 15 | 30 | 5
Standard Deviation for Residents | 5 | 20 | 4 | 9 | 3
Distribution | Mean>Median | Mean=Median | Mean=Median | Mean<Median | Mean=Median
Expected Score Difference | 3 | 14 | 5 | 10 | 4
9
Power and Error
- α is the probability of making a Type I error
- Power is the likelihood of avoiding a Type II error
- Use trial type, α and power to calculate sample size
10
Sample Size Calculations
11
Calculating Sample Size
Effect Size
- 1 SD difference between groups with power of 0.8 requires 30-40 subjects
- 0.3 SD difference between groups with power of 0.8 requires 300-400 subjects
12
Simple Calculation
N (per group) = 15.8 / (effect size)^2
for power of 80% and α=0.05
Remember to increase enrollment so that the number completing ≥ expected sample size
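The simple calculation can be sketched in a few lines of Python. This is only an illustration of the rule of thumb on this slide: the function names and the dropout rate are assumptions made for the example, not part of the presentation.

```python
import math

def n_per_group(effect_size, constant=15.8):
    """Rule-of-thumb subjects per group for 80% power and two-sided alpha = 0.05."""
    if effect_size <= 0:
        raise ValueError("effect size must be positive")
    # Round up: a fractional subject cannot be enrolled.
    return math.ceil(constant / effect_size ** 2)

def enrollment_target(n_completing, dropout_rate):
    """Inflate enrollment so the number completing still meets the target.

    dropout_rate is a hypothetical planning figure (e.g. 0.1 for 10%).
    """
    if not 0 <= dropout_rate < 1:
        raise ValueError("dropout rate must be in [0, 1)")
    return math.ceil(n_completing / (1 - dropout_rate))

for d in (1.0, 0.3):
    n = n_per_group(d)
    print(f"effect size {d}: {n} per group, {2 * n} in total")

# Assumed 10% dropout, purely for illustration.
print("enroll per group:", enrollment_target(n_per_group(1.0), 0.1))
```

For a 1 SD difference this gives 16 per group (32 in total), and for a 0.3 SD difference 176 per group (352 in total), in line with the 30-40 and 300-400 ranges quoted on the previous slide.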
13
You review the available sleepiness scales and must choose one. Which one is best?
Measure | A Awake index | B Sleepy score | C Doze Index | D Snory scale | E Yawn score
Scale Size | 8 | 100 | 20 | 60 | 12
Mean Rating for Residents | 6 | 72 | 15 | 30 | 7
Standard Deviation for Residents | 5 | 20 | 4 | 9 | 3
Distribution | Mean>Median | Mean=Median | Mean=Median | Mean<Median | Mean=Median
Expected Score Difference | 3 | 14 | 5 | 10 | 4
Effect size = score difference / standard deviation
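As a rough illustration, the effect size for each scale can be computed directly from the expected score difference and the standard deviation in the table. The dictionary layout and variable names below are assumptions for the example. The sketch ranks scales by effect size only and ignores the distribution row, which also matters when choosing a scale.

```python
# (expected score difference, standard deviation) taken from the slide's table
scales = {
    "A Awake index": (3, 5),
    "B Sleepy score": (14, 20),
    "C Doze Index": (5, 4),
    "D Snory scale": (10, 9),
    "E Yawn score": (4, 3),
}

def effect_size(score_difference, standard_deviation):
    """Effect size = score difference / standard deviation."""
    return score_difference / standard_deviation

effect_sizes = {name: effect_size(diff, sd) for name, (diff, sd) in scales.items()}

for name, d in effect_sizes.items():
    print(f"{name}: effect size {d:.2f}")

# The scale with the largest effect size needs the fewest subjects.
largest = max(effect_sizes, key=effect_sizes.get)
print("largest effect size:", largest)
```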
14
Power and Sample Sizes
Measure | A Awake index | B Sleepy score | C Doze Index | D Snory scale | E Yawn score
Scale Size | 8 | 100 | 20 | 60 | 12
Mean Rating for Residents | 6 | 72 | 15 | 30 | 5
Standard Deviation for Residents | 5 | 20 | 4 | 9 | 3
Distribution | Mean>Median | Mean=Median | Mean=Median | Mean<Median | Mean=Median
Expected Score Difference | 3 | 14 | 5 | 10 | 4
N per group | 29 | 33 | 26 | 20 | 10
Power (N=15 per grp) | 0.35 | 0.45 | 0.91 | 0.84 | 0.94
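One way to get a feel for the power row is a normal approximation to the two-group comparison. This is a sketch under stated assumptions, not necessarily how the slide's figures were produced: it assumes a two-sided test at α = 0.05 and equal group sizes, and the function name is made up for the example. The approximation tends to run slightly higher than t-based power values.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison.

    Uses the normal approximation: power ~ Phi(d * sqrt(n / 2) - z_crit).
    The negligible contribution of the opposite tail is ignored.
    """
    standard_normal = NormalDist()
    z_crit = standard_normal.inv_cdf(1 - alpha / 2)
    return standard_normal.cdf(effect_size * sqrt(n_per_group / 2) - z_crit)

# Effect sizes = expected score difference / standard deviation for each scale.
for name, d in [("A", 3 / 5), ("B", 14 / 20), ("C", 5 / 4), ("D", 10 / 9), ("E", 4 / 3)]:
    print(f"Scale {name}: approximate power with 15 per group = {approx_power(d, 15):.2f}")
```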
15
Calculating Sample Size using Software
Inputs:
- Difference between groups
- Standard deviation
- Choose test
http://biostat.mc.vanderbilt.edu/twiki/bin/view/Main/PowerSampleSize
16
Two faculty offer to measure the sleepiness of residents using your scale. How can you find out if they are good raters?
17
Interrater Reliability
Interrater reliability is the extent to which two or more individuals (coders or raters) agree.
Training, education and monitoring skills can enhance interrater reliability.
Goal is generally reliability > 0.8
- Categorical: measure % agreement
- Ordinal: Spearman rho
- Continuous: Pearson r
Rater 1 | Rater 2 | Rater 1 | Rater 2
1 | 2 | 3 | 5
2 | 1 | 3 | 4
3 | 3 | 5 | 3
4 | 4 | 5 | 6
5 | 6 | 7 | 5
6 | 8 | 7 | 3
7 | 7 | 9 | 8
8 | 5 | 9 | 7
Pearson 0.81 (first pair of raters) | Pearson 0.56 (second pair of raters)
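The two Pearson values on this slide can be reproduced from the rater scores. The sketch below implements the standard Pearson formula by hand; the variable names are assumptions for the example.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length (at least 2 values)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / sqrt(s_xx * s_yy)

# First pair of raters from the slide
pair1_rater1 = [1, 2, 3, 4, 5, 6, 7, 8]
pair1_rater2 = [2, 1, 3, 4, 6, 8, 7, 5]

# Second pair of raters from the slide
pair2_rater1 = [3, 3, 5, 5, 7, 7, 9, 9]
pair2_rater2 = [5, 4, 3, 6, 5, 3, 8, 7]

r1 = pearson_r(pair1_rater1, pair1_rater2)
r2 = pearson_r(pair2_rater1, pair2_rater2)
print(f"first pair:  r = {r1:.2f}")   # 0.81, just above the 0.8 goal
print(f"second pair: r = {r2:.2f}")   # 0.56, below the 0.8 goal
```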
18
Analyzing your Data
- Plan your analysis
- Consider consulting a specialist
- Test for normality
- Choose the right test
- Avoid statistical explorations with the data
19
You start your study and find that among the interns the M:F ratio was 12:5 in one group and 8:9 in the other, and wonder if the groups are statistically unbalanced.
20
Categorical Counts
Chi-square statistic: no cell in the table should have an expected frequency of <1, and no more than 20% of the cells should have an expected frequency of <5.
Use Fisher’s exact test when numbers are small
Sex | Group 1 | Group 2
Men | 12 | 8
Women | 5 | 9
Chi-square = 1.1
Fisher exact, p=0.29
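A minimal sketch of both calculations for this 2x2 table, using only the Python standard library. The function names are assumptions for the example. The chi-square here applies the Yates continuity correction, which is the version that comes out at about 1.1 for these counts. The Fisher test is the usual two-sided form that sums all tables no more likely than the observed one, and it gives a p-value of roughly 0.3.

```python
from math import comb

def expected_counts(a, b, c, d):
    """Expected cell counts for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    rows, cols = (a + b, c + d), (a + c, b + d)
    return [[r * col / n for col in cols] for r in rows]

def chi_square_yates(a, b, c, d):
    """Chi-square statistic with Yates continuity correction for a 2x2 table."""
    n = a + b + c + d
    numerator = n * max(abs(a * d - b * c) - n / 2, 0) ** 2
    return numerator / ((a + b) * (c + d) * (a + c) * (b + d))

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def ways(x):
        # Number of tables with x in the top-left cell, margins held fixed.
        return comb(row1, x) * comb(row2, col1 - x)

    observed = ways(a)
    x_values = range(max(0, col1 - row2), min(row1, col1) + 1)
    # Integer counts are compared, so ties with the observed table are exact.
    extreme = sum(ways(x) for x in x_values if ways(x) <= observed)
    return extreme / comb(n, col1)

# Men: 12 vs 8, Women: 5 vs 9 (from the slide)
table = (12, 8, 5, 9)
print("expected counts:", expected_counts(*table))
print(f"chi-square (Yates) = {chi_square_yates(*table):.2f}")
print(f"Fisher exact p = {fisher_exact_two_sided(*table):.3f}")
```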
21
You collect your baseline observations and find the following sleepiness in each group. Are they different?
Grp 1: 8, 6, 5, 2, 3, 9, 11, 6, 11
Grp 2: 3, 5, 5, 2, 7, 4, 8, 10, 2
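One possible way to answer the question is an unpaired t-test on the two groups. The sketch below assumes the scores are roughly normal with similar variances, which should be checked first. The pooled-variance form and the function name are choices made for the example. The critical value 2.120 is the standard two-sided 5% table value for 16 degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance

group_1 = [8, 6, 5, 2, 3, 9, 11, 6, 11]
group_2 = [3, 5, 5, 2, 7, 4, 8, 10, 2]

def pooled_t(a, b):
    """Unpaired (pooled-variance) t statistic and its degrees of freedom."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    # t counts how many standard errors the mean difference is from zero.
    return (mean(a) - mean(b)) / standard_error, df

t_stat, df = pooled_t(group_1, group_2)

T_CRIT_DF16 = 2.120  # two-sided critical value, alpha = 0.05, df = 16
significant = abs(t_stat) > T_CRIT_DF16

print(f"t = {t_stat:.2f} on {df} degrees of freedom")
print("different at the 5% level:", significant)
```

With these data the statistic is about 1.18, below the critical value, so this test does not show a baseline difference between the groups.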
22
Summary of Tests
Columns: Type of Data | Two Paired Groups | Two Independent Groups | Many Independent Groups | Correlation
Categories: McNemar | Chi-square
Continuous: Paired t-test | t-test | ANOVA | Pearson r
Rank: Wilcoxon | Kruskal-Wallis | Spearman r
Test for Normality!
23
t-test
- Comparing two means
- Check if paired or unpaired
- The more SEs you are away from zero, the less likely that the difference occurred by chance
Measure | Had Elective | No Elective
Number of students | 145 | 48
Mean Score | 76% | 64%
SD | 12 | 11
24
Testing difference between two groups over time
- t-test on between-group difference at end
- t-test on change over time
[Figure: groups compared at Time 1 and Time 2]
25
ANOVA (analysis of variance)
- Extension of t-tests across more than two groups.
- Gives a single overall test of whether there are differences between groups.
- Avoids multiple comparisons
- Compares variance within a group to variance between groups.
- Generates an F-statistic and P-value
[Figure: Group 1, Group 2, Group 3, Group 4]
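The comparison of between-group and within-group variance can be sketched as follows. The four groups of scores are hypothetical, invented only to make the example run, and the function name is likewise an assumption. The P-value would then be read from an F table or statistical software using the two degrees of freedom.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F statistic, df between, df within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical sleepiness scores for four groups (not from the presentation).
hypothetical_groups = [
    [8, 6, 5, 9, 7],
    [3, 5, 4, 6, 2],
    [7, 9, 10, 8, 11],
    [5, 4, 6, 5, 7],
]

f_stat, df_b, df_w = one_way_anova_f(hypothetical_groups)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```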
26
Statistical Tests for Skewed or Rank Data
- These data do not follow a normal distribution
- Non-parametric tests are less powerful
- Two groups: Wilcoxon rank sum (= Mann-Whitney U)
- Three or more groups: Kruskal-Wallis
27
Wilcoxon Rank Sum
- Rank all observations in increasing order of magnitude, ignoring which group they come from.
- Add up the ranks in the smaller of the two groups.
- Look up the critical value of the sum of ranks for that size group.
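The ranking steps can be illustrated with the baseline sleepiness scores from the earlier slide. This sketch assumes tied observations share the average of their ranks, a common convention that the slide does not spell out. It stops at the rank sums, leaving the critical-value lookup to a table or software. The function names are made up for the example.

```python
def average_ranks(values):
    """Rank values from smallest to largest; ties get the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def rank_sums(group_a, group_b):
    """Sum of ranks for each group after ranking all observations together."""
    ranks = average_ranks(list(group_a) + list(group_b))
    return sum(ranks[:len(group_a)]), sum(ranks[len(group_a):])

group_1 = [8, 6, 5, 2, 3, 9, 11, 6, 11]
group_2 = [3, 5, 5, 2, 7, 4, 8, 10, 2]

w1, w2 = rank_sums(group_1, group_2)
print("rank sum, group 1:", w1)  # 99.0
print("rank sum, group 2:", w2)  # 72.0
```

The groups are the same size here, so either rank sum can be taken to the table.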
28
Summary of Tests
Columns: Type of Data | Two Paired Groups | Two Independent Groups | Many Independent Groups | Correlation
Categories: McNemar | Chi-square
Continuous: Paired t-test | t-test | ANOVA | Pearson r
Rank: Wilcoxon | Kruskal-Wallis | Spearman r
29
Summary
- Careful choice of your population will improve your chances of finding an effect
- Choose your outcome measure thoughtfully
- Estimate your power and sample size in advance
- Ensure internal consistency is good
- Determine normality and analyze your dataset accordingly
30
Graham McMahon
gmcmahon@partners.org