Rerandomization to Improve Baseline Balance in Educational Experiments

Presentation transcript:

Rerandomization to Improve Baseline Balance in Educational Experiments. Kari Lock Morgan, Department of Statistics, Pennsylvania State University, with Anna Saavedra and Amie Rapaport. SREE, March 1st, 2018.

Motivation: RCTs are the “gold standard” for estimating causal effects. Why? They eliminate confounding variables (balance covariates), and they yield unbiased estimates … on average! For any particular experiment, covariate imbalance is possible (and likely!), and conditional bias exists.

Typical RCT: (1) Randomize units to treatment groups. (2) Conduct experiment. (3) Check baseline balance. (4) Analyze results. Why not check balance before conducting the experiment, when you can still fix it?

Rerandomization: (1) Collect covariate data. (2) Specify objective criteria for acceptable balance. (3) (Re)randomize units to treatment groups. (4) Check balance: if unacceptable, return to step 3; if acceptable, continue. (5) Conduct experiment. (6) Analyze results.
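The loop on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in the KIA study: the function names (`rerandomize`, `max_std_diff`), the simulated covariates, the 0.25 threshold, and the cap on the number of draws are all assumptions made for the example.

```python
import numpy as np

def max_std_diff(X, w):
    """Largest absolute standardized difference in covariate means."""
    diff = X[w].mean(axis=0) - X[~w].mean(axis=0)
    return np.max(np.abs(diff) / X.std(axis=0, ddof=1))

def rerandomize(X, n_treat, is_acceptable, rng, max_draws=10_000):
    """Redraw a complete randomization until the balance criterion accepts it."""
    n = X.shape[0]
    for draw in range(1, max_draws + 1):
        w = np.zeros(n, dtype=bool)
        w[rng.choice(n, size=n_treat, replace=False)] = True
        if is_acceptable(X, w):
            return w, draw          # accepted assignment and draws used
    raise RuntimeError("no acceptable randomization found; loosen the criterion")

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))        # made-up covariates for 20 units
w, draws = rerandomize(X, 10, lambda X, w: max_std_diff(X, w) <= 0.25, rng)
print(f"accepted after {draws} draw(s); max standardized difference = "
      f"{max_std_diff(X, w):.3f}")
```

The criterion is passed in as a function so that it is fixed before any assignment is drawn, which mirrors the slide's requirement that the balance criteria be objective and specified in advance.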

Context: Students learn Advanced Placement (AP) content through the Knowledge in Action (KIA) project-based learning approach, designed to develop students’ deeper learning of skills and content. RCT evaluation of KIA impact on student outcomes. Recruited teachers across five districts; teachers in 76 schools enrolled. Randomized at the school level within districts.

KIA Covariates: Only previous cohort data were available at the time of randomization. Covariates varied by district, but included: standardized test scores (PSAT/AP/8th grade); socio-economic status; % nonwhite (some districts); course (APES or APGOV) (some districts).

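KIA Criteria: Standardized difference in means for each covariate: $\frac{|\bar{X}_{1,T} - \bar{X}_{1,C}|}{s_1}, \ \frac{|\bar{X}_{2,T} - \bar{X}_{2,C}|}{s_2}, \ \ldots$ Thresholds varied by district: 0.05 to 0.25. Another option is Mahalanobis distance: $(\bar{\mathbf{X}}_T - \bar{\mathbf{X}}_C)' \, \mathrm{cov}(\mathbf{x})^{-1} \, (\bar{\mathbf{X}}_T - \bar{\mathbf{X}}_C)$.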
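Both balance measures on this slide are short computations. The sketch below is illustrative only: the slide does not define $s_j$, so the code assumes it is the full-sample standard deviation of covariate $j$, and the function names and the tiny data set are made up for the example.

```python
import numpy as np

def standardized_differences(X, w):
    """|mean_T - mean_C| / s_j per covariate; s_j assumed to be the full-sample SD."""
    diff = X[w].mean(axis=0) - X[~w].mean(axis=0)
    return np.abs(diff) / X.std(axis=0, ddof=1)

def mahalanobis_distance(X, w):
    """(mean_T - mean_C)' cov(x)^{-1} (mean_T - mean_C), as written on the slide."""
    diff = X[w].mean(axis=0) - X[~w].mean(axis=0)
    cov = np.atleast_2d(np.cov(X, rowvar=False))   # sample covariance of x
    return float(diff @ np.linalg.solve(cov, diff))

# Made-up data: 4 units, 2 covariates, units 0 and 2 treated.
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0]])
w = np.array([True, False, True, False])

print(standardized_differences(X, w))
print(mahalanobis_distance(X, w))
```

One practical difference between the two: per-covariate thresholds let the experimenter demand tighter balance on some covariates than others, while the Mahalanobis distance collapses all covariates into a single number that does not change when covariates are rescaled.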

Covariate Balance: One District. Percent reduction in variance: $PRIV = \frac{\mathrm{var}(\bar{x}_{j,T} - \bar{x}_{j,C}) - \mathrm{var}(\bar{x}_{j,T} - \bar{x}_{j,C} \mid \text{rerand.})}{\mathrm{var}(\bar{x}_{j,T} - \bar{x}_{j,C})}$
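PRIV can be estimated by simulation: draw many pure randomizations, keep the ones the criterion would accept, and compare the variance of the difference in means in the two sets. The sketch below assumes a single simulated covariate, 40 units, and a 0.10 threshold; none of these values come from the study.

```python
import numpy as np

rng = np.random.default_rng(1)
n, n_treat, threshold = 40, 20, 0.10       # assumed design, for illustration
x = rng.normal(size=n)                     # one made-up covariate

def diff_in_means(x, n_treat, rng):
    """Difference in covariate means under one complete randomization."""
    w = np.zeros(len(x), dtype=bool)
    w[rng.choice(len(x), size=n_treat, replace=False)] = True
    return x[w].mean() - x[~w].mean()

# Distribution under pure randomization.
diffs = np.array([diff_in_means(x, n_treat, rng) for _ in range(20_000)])

# Keeping only the acceptable draws gives the distribution under rerandomization.
accepted = diffs[np.abs(diffs) / x.std(ddof=1) <= threshold]

priv = (diffs.var() - accepted.var()) / diffs.var()
print(f"acceptance rate = {len(accepted) / len(diffs):.2f}, PRIV = {priv:.1%}")
```

Tightening the threshold raises PRIV but lowers the acceptance rate, so more draws are needed before an assignment is accepted.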

Covariate Balance

Outcome Precision: If PRIV is equal for all covariates, then PRIV for the outcome difference in means is $PRIV_Y = R^2 \times PRIV_X$. Here, $R^2 \approx 0.75$ and $PRIV_X \approx 90\%$, so $PRIV_Y \approx 0.75 \times 0.90 = 67.5\%$. Precision increases by a factor of $\frac{1}{1 - 0.675} \approx 3.1$. Equivalent to more than tripling n!!! (Effective sample size goes from 76 to 234!)
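The arithmetic on this slide can be checked directly. The helper name `effective_sample_size` is hypothetical; the inputs (76 schools, $R^2 \approx 0.75$, $PRIV_X \approx 90\%$) are the values quoted on the slide.

```python
def effective_sample_size(n, r_squared, priv_x):
    """Apply PRIV_Y = R^2 * PRIV_X and convert it to a precision factor."""
    priv_y = r_squared * priv_x
    precision_factor = 1.0 / (1.0 - priv_y)
    return priv_y, precision_factor, n * precision_factor

priv_y, factor, n_eff = effective_sample_size(76, 0.75, 0.90)
print(f"PRIV_Y = {priv_y:.3f}, factor = {factor:.2f}, effective n = {n_eff:.0f}")
# prints: PRIV_Y = 0.675, factor = 3.08, effective n = 234
```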

More Power! Significance for smaller effect sizes! Use a randomization test to take advantage of this; otherwise inference will be conservative.
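A randomization test that respects rerandomization builds its null distribution only from assignments that would have passed the balance criterion. The sketch below is one way to do that under the sharp null of no effect for any unit; the data, the single covariate, the 0.25 threshold, and all function names are assumptions made for the example.

```python
import numpy as np

def acceptable(x, w, threshold=0.25):
    """Balance criterion: standardized difference in means within the threshold."""
    return abs(x[w].mean() - x[~w].mean()) / x.std(ddof=1) <= threshold

def draw_acceptable(x, n_treat, rng):
    """Draw complete randomizations until one satisfies the criterion."""
    while True:
        w = np.zeros(len(x), dtype=bool)
        w[rng.choice(len(x), size=n_treat, replace=False)] = True
        if acceptable(x, w):
            return w

def rerandomization_test(y, x, w_obs, rng, n_draws=2000):
    """Two-sided p-value for the sharp null, using acceptable assignments only."""
    observed = y[w_obs].mean() - y[~w_obs].mean()
    n_treat = int(w_obs.sum())
    null = np.empty(n_draws)
    for b in range(n_draws):
        w = draw_acceptable(x, n_treat, rng)
        null[b] = y[w].mean() - y[~w].mean()   # outcomes are fixed under the sharp null
    return float(np.mean(np.abs(null) >= abs(observed)))

rng = np.random.default_rng(2)
x = rng.normal(size=30)                        # made-up covariate
w_obs = draw_acceptable(x, 15, rng)            # the assignment actually used
y = 2.0 * x + rng.normal(scale=0.5, size=30) + 1.5 * w_obs   # made-up outcomes
p_value = rerandomization_test(y, x, w_obs, rng)
print(f"p-value = {p_value:.3f}")
```

Swapping `draw_acceptable` for an unrestricted draw would give the ordinary randomization test, whose null distribution is wider than the one the design actually produced; that is the sense in which ignoring rerandomization is conservative.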

Regression: Rerandomization reduces the need to account for covariates via modeling. But if you do still choose to model… $Y_i = \alpha + \boldsymbol{\beta} \mathbf{X}_i + \tau W_i + \varepsilon_i$. Better covariate balance means that estimation of $\tau$ depends less on estimation of $\boldsymbol{\beta}$, which increases precision/power and reduces reliance on modeling assumptions.
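For readers who do want the model-based estimate, the regression on this slide can be fit by ordinary least squares. The sketch below uses simulated data with an assumed true effect of 0.3; the coefficients, noise level, and variable names are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 76
X = rng.normal(size=(n, 2))                    # two made-up covariates
W = np.zeros(n)
W[rng.choice(n, size=n // 2, replace=False)] = 1.0   # treatment indicator

# Simulated outcome: alpha = 1.0, beta = (0.8, -0.5), tau = 0.3 (all assumed).
Y = 1.0 + X @ np.array([0.8, -0.5]) + 0.3 * W + rng.normal(scale=0.5, size=n)

# Fit Y_i = alpha + beta X_i + tau W_i + eps_i by least squares.
design = np.column_stack([np.ones(n), X, W])
coef, *_ = np.linalg.lstsq(design, Y, rcond=None)
alpha_hat, beta_hat, tau_hat = coef[0], coef[1:3], coef[3]
print(f"tau_hat = {tau_hat:.3f}")
```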

Why Rerandomize? Avoid bad/unlucky randomizations; improve covariate balance; increase power; reduce reliance on assumptions.

klm47@psu.edu. Morgan, K.L., and Rubin, D.B. (2012). “Rerandomization to Improve Covariate Balance in Experiments,” Annals of Statistics, 40(2): 1263-1282. Morgan, K.L., and Rubin, D.B. (2015). “Rerandomization to Balance Tiers of Covariates,” JASA, 110(512): 1412-1421.