An Introduction to Educational Research Statistics Graham McMahon MD MMSc 1.

1 An Introduction to Educational Research Statistics
Graham McMahon MD MMSc
gmcmahon@partners.org

2 Course Overview
• Last week:
  - Stages of a trial from design to completion
  - Generating hypotheses
  - Working with the IRB
  - Considering the funding required
  - Trial designs
• Today:
  - Choosing an outcome variable
  - Powering your study
  - Establishing inter-rater reliability
  - Determining if there is a difference between two groups
  - Test development
  - Qualitative approaches

3 Stages of an Educational Interventional Trial
Stage | Activities
1 Initial Design | Hypothesis, size
2 Protocol Design | Define methods, collaborations, IRB
3 Recruitment | Subject acquisition, monitoring
4 Followup | Collect outcome data
5 Analysis | Prepare "Clean + Locked" database; perform analysis
6 Reporting | Write and submit manuscript
7 Additional analyses | Further explorations of trial data

4 Population & Sampling
• Must balance:
  - Variability [the smaller or more diverse the population, the more variable; variability creates error]
  - Generalizability [population can't be too specific]
  - Access [you can only study those you have access to]
  - Cost [larger studies are much more expensive]
• Consider:
  - Participation rate
  - Multiple sites
  - Online projects
  - Lower reimbursement

5 Outcome
• What is really important?
• What would colleagues care about?
• 'Hard' outcomes
  - Death, attendance
• 'Soft' outcomes
  - Satisfaction, self-confidence

6 Outcomes / Endpoints
• Primary Outcome
  - What you power your study on
• Secondary Outcome
  - Other related outcomes that may be interesting to test
• Exploratory Outcomes
  - Association studies, subgroups that may be interesting, but likely to be underpowered
  - May serve as pilot data for future studies
• Surrogate Endpoint
  - In the causal pathway and affected by the intervention

7 Group Activity
• Medical errors and patient safety continue to be an important concern for patients and physicians. Numerous reports have suggested that fatigue and sleepiness contribute to medical errors. You are the program director of an internal medicine residency that has 40 residents, and you want to make a contribution in this area.
• List a hypothesis that could be generated based on this reflection.
• How would you measure sleepiness?

8 You review the available sleepiness scales and must choose one. Which one is best?
Scale | A Awake index | B Sleepy score | C Doze Index | D Snory scale | E Yawn score
Scale Size | 8 | 100 | 20 | 60 | 12
Mean Rating for Residents | 6 | 72 | 15 | 30 | 5
Standard Deviation for Residents | 5 | 20 | 4 | 9 | 3
Distribution | Mean>Median | Mean=Median | Mean=Median | Mean<Median | Mean=Median
Expected Score Difference | 3 | 14 | 5 | 10 | 4

9 Power and Error
• α is the probability of making a Type I error (finding a difference that is not really there)
• Power is the likelihood of avoiding a Type II error (missing a difference that is really there)
• Use trial type, α and power to calculate sample size

10 Sample Size Calculations

11 Calculating Sample Size
• Effect Size
  - 1 SD difference between groups with power of 0.8 requires 30-40 subjects
  - 0.3 SD difference between groups with power of 0.8 requires 300-400 subjects

12 Simple Calculation
• N (per group) = 15.8 / (effect size)² for power of 80% and α = 0.05
• Remember to increase enrollment so that the number completing ≥ the expected sample size
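
The rule of thumb on slide 12 can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names and the 20% dropout rate are hypothetical, and rounding up to a whole subject is an assumption.

```python
import math

def n_per_group(effect_size: float, constant: float = 15.8) -> int:
    """Subjects per group for 80% power at alpha = 0.05 (slide 12 rule of thumb).

    effect_size is the expected difference divided by the standard deviation.
    Rounding up to a whole subject is an assumption, not stated on the slide.
    """
    if effect_size <= 0:
        raise ValueError("effect size must be positive")
    return math.ceil(constant / effect_size ** 2)

def enrollment_target(n_completers: int, dropout_rate: float) -> int:
    """Inflate enrollment so that the number completing still reaches n_completers.

    dropout_rate is a hypothetical planning figure (for example 0.20).
    """
    if not 0 <= dropout_rate < 1:
        raise ValueError("dropout rate must be in [0, 1)")
    return math.ceil(n_completers / (1 - dropout_rate))

for d in (1.0, 0.3):
    per_group = n_per_group(d)
    print(f"effect size {d}: {per_group} per group, {2 * per_group} in total")

print("enroll per group with 20% dropout:", enrollment_target(n_per_group(1.0), 0.20))
```

Doubling the per-group figure gives totals of 32 and 352 subjects, which fall inside the 30-40 and 300-400 ranges quoted on slide 11.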

13 You review the available sleepiness scales and must choose one. Which one is best?
Scale | A Awake index | B Sleepy score | C Doze Index | D Snory scale | E Yawn score
Scale Size | 8 | 100 | 20 | 60 | 12
Mean Rating for Residents | 6 | 72 | 15 | 30 | 7
Standard Deviation for Residents | 5 | 20 | 4 | 9 | 3
Distribution | Mean>Median | Mean=Median | Mean=Median | Mean<Median | Mean=Median
Expected Score Difference | 3 | 14 | 5 | 10 | 4
Effect size = score difference / standard deviation
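
As a rough sketch of the slide 13 formula, the snippet below computes the effect size for each scale from the expected score difference and standard deviation in the table. The dictionary layout and variable names are illustrative assumptions.

```python
# (expected score difference, standard deviation) for each scale, from the slide table
scales = {
    "A Awake index": (3, 5),
    "B Sleepy score": (14, 20),
    "C Doze Index": (5, 4),
    "D Snory scale": (10, 9),
    "E Yawn score": (4, 3),
}

# Effect size = score difference / standard deviation
effect_sizes = {name: diff / sd for name, (diff, sd) in scales.items()}

# Larger effect sizes need fewer subjects, so list the scales from largest to smallest
for name, d in sorted(effect_sizes.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: effect size {d:.2f}")
```

Effect size is only one criterion; the distribution row (mean versus median) also matters when choosing a scale, because a skewed scale may call for a non-parametric test.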

14 Power and Sample Sizes
Scale | A Awake index | B Sleepy score | C Doze Index | D Snory scale | E Yawn score
Scale Size | 8 | 100 | 20 | 60 | 12
Mean Rating for Residents | 6 | 72 | 15 | 30 | 5
Standard Deviation for Residents | 5 | 20 | 4 | 9 | 3
Distribution | Mean>Median | Mean=Median | Mean=Median | Mean<Median | Mean=Median
Expected Score Difference | 3 | 14 | 5 | 10 | 4
N per group | 29 | 33 | 26 | 20 | 10
Power (N=15 per grp) | 0.35 | 0.45 | 0.91 | 0.84 | 0.94
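
The power row for N = 15 per group can be approximated with a normal-distribution shortcut. This is a sketch under stated assumptions: it uses the normal approximation for a two-sided, two-sample comparison at α = 0.05, which is not necessarily the method behind the slide and tends to run a few points higher than a t-based calculation. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(effect_size: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-sample comparison of means.

    Normal approximation: power ~ Phi(d * sqrt(n / 2) - z_crit).
    The negligible contribution of the opposite tail is ignored.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    return z.cdf(effect_size * sqrt(n_per_group / 2) - z_crit)

# (expected score difference, standard deviation) from the slide table
scales = {
    "A Awake index": (3, 5),
    "B Sleepy score": (14, 20),
    "C Doze Index": (5, 4),
    "D Snory scale": (10, 9),
    "E Yawn score": (4, 3),
}

for name, (diff, sd) in scales.items():
    print(f"{name}: approximate power {approx_power(diff / sd, 15):.2f}")
```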

15 Calculating Sample Size using Software
• Difference between groups
• Standard deviation
• Choose test
http://biostat.mc.vanderbilt.edu/twiki/bin/view/Main/PowerSampleSize

16 Two faculty offer to measure the sleepiness of residents using your scale. How can you find out if they are good raters?

17 Interrater Reliability
• Interrater reliability is the extent to which two or more individuals (coders or raters) agree.
• Training, education and monitoring skills can enhance interrater reliability.
• Goal is generally reliability > 0.8
  - Categorical: measure % agreement
  - Ordinal: Spearman rho
  - Continuous: Pearson r

First pair of raters:
Rater 1 | Rater 2
1 | 2
2 | 1
3 | 3
4 | 4
5 | 6
6 | 8
7 | 7
8 | 5
Pearson 0.81

Second pair of raters:
Rater 1 | Rater 2
3 | 5
3 | 4
5 | 3
5 | 6
7 | 5
7 | 3
9 | 8
9 | 7
Pearson 0.56
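
The Pearson coefficients on slide 17 can be checked with a short calculation. This is a sketch that assumes the flattened slide table splits into two pairs of raters with eight ratings each; the variable names are illustrative.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists of ratings."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of the same length with at least 2 ratings")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Ratings as read from the slide table (assumed column split)
first_pair = ([1, 2, 3, 4, 5, 6, 7, 8], [2, 1, 3, 4, 6, 8, 7, 5])
second_pair = ([3, 3, 5, 5, 7, 7, 9, 9], [5, 4, 3, 6, 5, 3, 8, 7])

for label, (rater_1, rater_2) in (("first", first_pair), ("second", second_pair)):
    r = pearson_r(rater_1, rater_2)
    verdict = "meets" if r > 0.8 else "falls short of"
    print(f"{label} pair: r = {r:.2f}, {verdict} the 0.8 goal")
```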

18 Analyzing your Data
• Plan your analysis
• Consider consulting a specialist
• Test for normality
• Choose the right test
• Avoid statistical explorations with the data

19 You start your study and find that the M:F ratio among the interns was 12:5 in one group and 8:9 in the other, and you wonder whether the groups are statistically unbalanced.

20 Categorical Counts
• Chi-square statistic: no cell in the table should have an expected frequency of <1, and no more than 20% of the cells should have an expected frequency of <5.
• Use Fisher's exact test when numbers are small

 | Group 1 | Group 2
Men | 12 | 8
Women | 5 | 9

Chi-square = 1.1
Fisher exact, p = 0.29
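
The figures on slide 20 can be reproduced approximately with the standard library. This is a sketch under two assumptions: the chi-square of 1.1 appears to correspond to the continuity-corrected (Yates) statistic, and the Fisher p-value is taken as two-sided by summing all tables no more likely than the observed one. With those choices the code gives about 1.09 and about 0.296, close to the slide values. The function names are hypothetical.

```python
from math import comb

def yates_chi_square(a: int, b: int, c: int, d: int) -> float:
    """Continuity-corrected chi-square for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    corrected = max(abs(a * d - b * c) - n / 2, 0)
    return n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row_1, col_1, n = a + b, a + c, a + b + c + d
    total = comb(n, col_1)

    def prob(k: int) -> float:
        # Hypergeometric probability of k in the top-left cell with fixed margins
        return comb(row_1, k) * comb(n - row_1, col_1 - k) / total

    lowest = max(0, col_1 - (n - row_1))
    highest = min(row_1, col_1)
    p_observed = prob(a)
    # Sum every table that is no more likely than the observed one
    return sum(prob(k) for k in range(lowest, highest + 1)
               if prob(k) <= p_observed * (1 + 1e-9))

# Men: 12 in Group 1, 8 in Group 2; Women: 5 in Group 1, 9 in Group 2
print(f"chi-square (Yates) = {yates_chi_square(12, 8, 5, 9):.2f}")
print(f"Fisher exact p = {fisher_exact_two_sided(12, 8, 5, 9):.3f}")
```

Either way the p-value is well above 0.05, so the sex imbalance between the groups is not statistically significant.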

21 You collect your baseline observations and find the following sleepiness in each group. Are they different?
Grp 1: 8, 6, 5, 2, 3, 9, 11, 6, 11
Grp 2: 3, 5, 5, 2, 7, 4, 8, 10, 2
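
One way to approach the question on slide 21 is an unpaired t-test. The sketch below assumes the scores are roughly normal with similar variances (the deck advises testing for normality first) and compares the statistic with the standard two-sided 5% critical value for 16 degrees of freedom, 2.120. The function name is hypothetical.

```python
from math import sqrt
from statistics import mean, variance

group_1 = [8, 6, 5, 2, 3, 9, 11, 6, 11]
group_2 = [3, 5, 5, 2, 7, 4, 8, 10, 2]

def pooled_t(x, y):
    """Unpaired two-sample t statistic with pooled variance, plus degrees of freedom."""
    nx, ny = len(x), len(y)
    df = nx + ny - 2
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / df
    standard_error = sqrt(pooled_var * (1 / nx + 1 / ny))
    return (mean(x) - mean(y)) / standard_error, df

T_CRITICAL_DF16 = 2.120  # two-sided, alpha = 0.05, 16 degrees of freedom

t_stat, df = pooled_t(group_1, group_2)
print(f"mean 1 = {mean(group_1):.2f}, mean 2 = {mean(group_2):.2f}")
print(f"t = {t_stat:.2f} on {df} degrees of freedom")
print("significant at 0.05" if abs(t_stat) > T_CRITICAL_DF16 else "not significant at 0.05")
```

Under these assumptions the baseline difference of about 1.7 points does not reach significance.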

22 Summary of Tests
Type of Data | Two Paired Groups | Two Independent Groups | Many Independent Groups | Correlation
Categories | McNemar | Chi-square | |
Continuous | Paired t-test | t-test | ANOVA | Pearson r
Rank | Wilcoxon | Wilcoxon | Kruskal-Wallis | Spearman r
Test for Normality!

23 t-test
• Comparing two means
• Check if paired or unpaired
• The more standard errors (SEs) the difference is away from zero, the less likely it is that the difference occurred by chance

 | Had Elective | No Elective
Number of students | 145 | 48
Mean Score | 76% | 64%
SD | 12 | 11
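
The "number of SEs away from zero" idea can be made concrete with the summary statistics in the slide table. This sketch uses the unequal-variance (Welch) standard error, which is an assumption; the slide does not say which form of the t-test was intended. The function name is hypothetical.

```python
from math import sqrt

def t_from_summary(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Unpaired t statistic from summary statistics (Welch standard error)."""
    standard_error = sqrt(sd_1 ** 2 / n_1 + sd_2 ** 2 / n_2)
    return (mean_1 - mean_2) / standard_error, standard_error

# Had elective: n = 145, mean 76%, SD 12; no elective: n = 48, mean 64%, SD 11
t_stat, se = t_from_summary(76, 12, 145, 64, 11, 48)
print(f"difference = 12 points, SE = {se:.2f}, t = {t_stat:.1f}")
```

A difference more than six standard errors from zero is very unlikely to have arisen by chance.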

24 Testing difference between two groups over time
• t-test on between-group difference at end
• t-test on change over time
[Figure: Time 1 vs. Time 2]

25 ANOVA (analysis of variance)
• Extension of t-tests across more than two groups.
• Gives a single overall test of whether there are differences between groups.
• Avoids multiple comparisons
• Compares variance within a group to variance between groups.
• Generates an F-statistic and P-value
[Figure: Group 1, Group 2, Group 3, Group 4]
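
To illustrate how the F-statistic compares between-group and within-group variance, here is a minimal one-way ANOVA sketch. The four groups of scores are hypothetical, invented only for this example (the slide gives no data), and the function name is an assumption. Converting F to a P-value needs an F-distribution table or statistical software, which is left out here.

```python
from statistics import mean

def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA, with between and within degrees of freedom."""
    all_values = [value for group in groups for value in group]
    grand_mean = mean(all_values)
    # Between-group sum of squares: how far each group mean sits from the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of scores around their own group mean
    ss_within = sum((value - mean(g)) ** 2 for g in groups for value in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical scores for four groups (not from the slides)
hypothetical_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7], [6, 7, 8]]

f_stat, df_between, df_within = one_way_anova_f(hypothetical_groups)
print(f"F({df_between}, {df_within}) = {f_stat:.1f}")
```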

26 Statistical Tests for Skewed or Rank Data
• These data don't follow normal rules
• Non-parametric tests are less powerful
• Two groups
  - Wilcoxon rank sum (= Mann-Whitney U)
• Three or more groups
  - Kruskal-Wallis

27 Wilcoxon Rank Sum
• Rank all observations in increasing order of magnitude, ignoring which group they come from.
• Add up the ranks in the smaller of the two groups.
• Look up the critical value of the sum of ranks for that size group.
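
The first two steps above can be sketched in Python using the baseline sleepiness scores from slide 21. Giving tied observations the average of their ranks is an assumption (the slide does not say how to handle ties), and because both groups have nine observations, group 1 is used as the "smaller" group. The final step, looking up the critical value, still requires a published table or statistical software.

```python
def average_ranks(values):
    """Map each distinct value to its rank, averaging the ranks of tied values."""
    ordered = sorted(values)
    ranks = {}
    start = 0
    while start < len(ordered):
        end = start
        while end < len(ordered) and ordered[end] == ordered[start]:
            end += 1
        # Tied values occupy ranks start+1 .. end; assign their average
        ranks[ordered[start]] = (start + 1 + end) / 2
        start = end
    return ranks

group_1 = [8, 6, 5, 2, 3, 9, 11, 6, 11]
group_2 = [3, 5, 5, 2, 7, 4, 8, 10, 2]

# Step 1: rank all observations together, ignoring group membership
ranks = average_ranks(group_1 + group_2)

# Step 2: add up the ranks in one group (groups are the same size here)
rank_sum_1 = sum(ranks[value] for value in group_1)
rank_sum_2 = sum(ranks[value] for value in group_2)
print(f"rank sum, group 1 = {rank_sum_1}")
print(f"rank sum, group 2 = {rank_sum_2}")
```

As a sanity check, the two rank sums should add up to n(n+1)/2 for n = 18 observations, which is 171.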

28 Summary of Tests
Type of Data | Two Paired Groups | Two Independent Groups | Many Independent Groups | Correlation
Categories | McNemar | Chi-square | |
Continuous | Paired t-test | t-test | ANOVA | Pearson r
Rank | Wilcoxon | Wilcoxon | Kruskal-Wallis | Spearman r

29 Summary
• Careful choice of your population will improve your chances of finding an effect
• Choose your outcome measure thoughtfully
• Estimate your power and sample size in advance
• Ensure internal consistency is good
• Determine normality and analyze your dataset accordingly

30 Graham McMahon
gmcmahon@partners.org

