Improving the Design of STEM Impact Studies: Considerations for Statistical Power. Discussant Notes, Cristofer Price, SREE, March 1, 2017.

Theme for my notes

In the future, these conference presentations will be papers that you (today's audience) can use when you are:
- Planning or designing a study of impacts on STEM outcomes; or
- Interpreting or contextualizing the results from a study with STEM outcomes.

When we open the symposium for discussion, what feedback can you give the authors today that will make their papers as useful as possible to YOU in your future endeavors?

Some takeaways

Paper 1: Effect sizes for teacher outcomes
- Mean effect size = 0.44
- Few significant differences in variation by: RCT/QED, publication type, discipline, dependent variable type, treatment components, test developer, duration, preservice vs. in-service teachers, research design/quality

Paper 2: ICCs and R2s for teacher outcomes
- ICCs and R2s are provided for multiple outcomes from multiple studies
- ICCs and R2s vary a lot among outcomes and studies

Paper 3: MDEs for studies with student and teacher outcomes
- Expect that studies with MDES in the range of .17 to .33 for student outcomes would have MDESs in the range of .44 to .59 for teacher outcomes
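MDES figures like those in Paper 3 come from standard power formulas for two-level cluster-randomized designs. As a point of reference, here is a minimal sketch of the usual formula; every parameter value below (number of schools, students per school, ICC, R2s) is hypothetical, chosen only to land in the student-outcome range quoted above:

```python
import math

def mdes_cluster(J, n, icc, r2_level2=0.0, r2_level1=0.0, p=0.5, multiplier=2.8):
    """Approximate MDES for a two-level cluster-randomized design.

    J: number of clusters (e.g., schools); n: individuals per cluster;
    icc: intraclass correlation; r2_level2 / r2_level1: proportion of
    variance explained by covariates at each level; p: proportion of
    clusters assigned to treatment. The multiplier ~2.8 corresponds to
    power = .80 with a two-tailed alpha = .05 and ample degrees of freedom.
    """
    var_term = (icc * (1 - r2_level2) / (p * (1 - p) * J)
                + (1 - icc) * (1 - r2_level1) / (p * (1 - p) * J * n))
    return multiplier * math.sqrt(var_term)

# Hypothetical student-outcome design: 40 schools, 20 students per school,
# ICC = .20, pretest covariates explaining 50% (level 2) and 40% (level 1)
# of the outcome variance.
mdes = mdes_cluster(J=40, n=20, icc=0.20, r2_level2=0.5, r2_level1=0.4)
print(round(mdes, 2))  # → 0.31, inside the .17-.33 student-outcome range
```

The same formula, fed with teacher-outcome ICCs and R2s (and far fewer units per cluster), is what drives the much larger teacher-outcome MDESs.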

How will YOU use the information about effect sizes for teacher STEM outcomes?

Paper 1 suggests that you might use them:
- During the design-phase power analyses (and it suggests several options)
  - E.g., if the average ES is .44, maybe you should plan your study to have power to detect effects on teacher outcomes that are in the ballpark of .44
  - E.g., use the estimates provided to tweak the expected MDE, e.g., to .44 + .10 = .54 if you are planning an RCT
- During the reporting phase, to contextualize your findings

When you imagine your future self using these results:
- Is there something else that the authors could have provided that would be useful to you?
- Is there anything that they provide that you want to call out as being especially useful?
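To see what moving the target MDES from .44 to .54 buys you at the design stage, you can invert a standard two-level MDES formula and solve for the number of schools. A rough sketch; the design parameters (5 teachers per school, ICC = .20, R2s of .50 and .40) are hypothetical:

```python
import math

def clusters_needed(mdes, n, icc, r2_l2=0.0, r2_l1=0.0, p=0.5, multiplier=2.8):
    """Invert the two-level MDES formula to get the number of clusters J
    needed to detect an effect of size `mdes`. Approximate: it treats the
    t-multiplier (~2.8 for power .80, two-tailed alpha .05) as fixed
    rather than dependent on degrees of freedom."""
    inner = icc * (1 - r2_l2) + (1 - icc) * (1 - r2_l1) / n
    return math.ceil(multiplier**2 * inner / (p * (1 - p) * mdes**2))

# Hypothetical teacher-outcome design: 5 teachers per school.
print(clusters_needed(0.44, n=5, icc=0.20, r2_l2=0.5, r2_l1=0.4))  # → 32
print(clusters_needed(0.54, n=5, icc=0.20, r2_l2=0.5, r2_l1=0.4))  # → 22
```

Under these made-up parameters, relaxing the target from .44 to .54 cuts the required sample by roughly ten schools, which is the kind of trade-off these effect-size summaries let you reason about.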

I could imagine my future self thinking…

…I want to design a study that is powered to detect substantively meaningful impacts on teacher STEM outcomes; I don't care too much about detecting impacts smaller than that.

What is a substantively meaningful impact on a teacher outcome? It would be meaningful if the impact on teachers was big enough to be associated with an impact on a student outcome of, say, .15 SD units.

I wish those SREE papers could provide some info on how big the impacts on teachers were when the impacts on students were 0.15 SD units or larger.

Are there enough studies in the authors' data set to relate teacher and student impacts?

- If so, could you characterize how big teacher impacts were in studies where impacts on student outcomes were found?
- If so, could you present correlations or other descriptors of the relationships between teacher and student impacts?
- If so, I would also love to see Paper 3's results updated to reflect the scenarios where we think the impacts on teachers would be big enough to be associated with impacts on students.
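The simplest descriptor of that relationship would be a correlation across the studies that report both impacts. A sketch of the computation, using entirely made-up (teacher ES, student ES) pairs and no external libraries:

```python
import math

# Hypothetical pairs of (teacher effect size, student effect size)
# from studies that reported both outcomes.
pairs = [(0.30, 0.08), (0.45, 0.12), (0.55, 0.18), (0.70, 0.22), (0.90, 0.30)]

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

teacher, student = zip(*pairs)
r = pearson(teacher, student)
print(round(r, 2))
```

With real study data, a scatterplot alongside r would also show whether any teacher-impact threshold tends to accompany student impacts of .15 SD or more.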

I could also imagine my future self thinking something like…

…I am designing an RCT to evaluate the impacts of an intervention on outcomes for physics teachers.

I know that effect sizes for impacts on outcomes for physics teachers don't differ much from the grand mean effect size, and those for RCTs were a bit bigger (estimate = .10), so maybe I expect that for the average intervention, the impact would be .44 + .10 = .54.

I think I'm looking at a pretty good intervention, but what if it is actually below average? I wish I had some information about the distribution of effect sizes from studies that were like my study, e.g., the min, 25th percentile, 75th percentile, and the max. (The funnel plot gives some sense of this for all of the studies.)
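A distribution summary of that sort is easy to report once the effect sizes are tabulated. A sketch using only the standard library; the effect-size values are made up for illustration:

```python
import statistics

# Entirely hypothetical effect sizes from studies "like mine".
effect_sizes = [0.05, 0.20, 0.31, 0.44, 0.52, 0.61, 0.75, 0.90]

# quantiles(n=4) returns the three quartile cut points; "inclusive"
# treats the data as the whole population of observed studies.
q1, median, q3 = statistics.quantiles(effect_sizes, n=4, method="inclusive")
print(f"min={min(effect_sizes):.2f}  25th={q1:.4f}  "
      f"75th={q3:.4f}  max={max(effect_sizes):.2f}")
```

Reporting these five numbers per study subgroup (e.g., RCTs only, or physics-teacher outcomes only) would let a planner ask "what if my intervention is at the 25th percentile?" rather than assuming the mean.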

I hope you are thinking about… …what additional information these papers could provide that would be of use to your future self

When I'm designing a study to look at impacts on STEM teacher outcomes…

…my future self is going to use the results from Paper 2 to get plausible values of ICCs and R2s.

My future self might be wishing for:
- R2s from another model, one with only covariates that are student pretests and demographics aggregated to the teacher level

My future self may be hoping for more guidance on:
- How to make use of the standard errors of the ICCs
- How wide a range to use for ICC values
- Why the level-2 R2s (school level) are often so much higher than the level-1 R2s (teacher level)
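One practical use of the ICC standard errors is a sensitivity scan: recompute the MDES at the ICC point estimate and at plus/minus two standard errors. A sketch under the standard two-level cluster-design formula; the design numbers (30 schools, 4 teachers per school, ICC = .15 with SE = .05) are all hypothetical:

```python
import math

def mdes_cluster(J, n, icc, r2_l2=0.0, r2_l1=0.0, p=0.5, multiplier=2.8):
    """Approximate MDES for a two-level cluster-randomized design
    (multiplier ~2.8: power .80, two-tailed alpha .05)."""
    v = (icc * (1 - r2_l2) / (p * (1 - p) * J)
         + (1 - icc) * (1 - r2_l1) / (p * (1 - p) * J * n))
    return multiplier * math.sqrt(v)

# Hypothetical: reported ICC = .15 with SE = .05; scan ICC +/- 2 SE
# to see how sensitive the planned design is to the ICC estimate.
icc_hat, se = 0.15, 0.05
for icc in (icc_hat - 2 * se, icc_hat, icc_hat + 2 * se):
    print(f"ICC={icc:.2f}  MDES={mdes_cluster(J=30, n=4, icc=icc):.2f}")
```

If the MDES barely moves across that interval, the ICC uncertainty is not a design risk; if it moves a lot, plan to the pessimistic end.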

I could also imagine that it would be useful if the authors of Paper 2…

…used their data to approximate the design parameter information that would be needed to do power calculations for studies where teachers are assigned to T and C conditions within schools.

If teachers were assigned to T and C conditions within schools, the schools would be "assignment blocks," and the ICC information could be used to approximate the proportion of variance explained by the assignment blocks.

Ultimately, I think it would be useful to present the total proportion of teacher outcome variance explained by the blocks and all other covariates (teacher covariates, and student data aggregated to the teacher level).
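When schools become assignment blocks, the block dummies absorb roughly the between-school share of the variance (the ICC), so a crude approximation treats the design like individual random assignment of teachers with the block variance folded into a single R2. A sketch with hypothetical values:

```python
import math

def mdes_blocked(n_total, r2_total, p=0.5, multiplier=2.8):
    """Approximate MDES when teachers are randomized to T/C within
    school blocks; r2_total folds together the variance absorbed by
    the block dummies (roughly the school-level ICC) and by any
    teacher-level covariates."""
    return multiplier * math.sqrt((1 - r2_total) / (p * (1 - p) * n_total))

# Hypothetical: 100 teachers total; blocks absorb ~.20 of the variance
# (the ICC) and other covariates another ~.30, so r2_total ~ .50.
print(round(mdes_blocked(n_total=100, r2_total=0.50), 2))  # → 0.4
```

This is exactly the kind of calculation that the "total proportion of variance explained by blocks plus covariates" parameter would feed.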

I hope you are thinking about…

…whether you have any study data that you could contribute to the authors of Paper 2 to boost their database, and

…what additional information these papers could provide that would be of use to your future self.

If there are enough studies in the authors' data set to relate teacher and student impacts…

…then, as I already said, I would also love to see Paper 3's results updated to reflect the scenarios where we think the impacts on teachers would be big enough to be associated with impacts on students.

And, if Paper 2 were to present approximate design parameters for studies where teachers are assigned to T and C conditions within schools, then I think it could be useful if Paper 3 presented results corresponding to scenarios where teachers are assigned to T and C conditions within schools, and students are clustered within teachers.

I hope you are thinking about…

…what additional information these papers could provide that would be of use to your future self, and that you are ready to start discussing those ideas!