Rerandomization in Randomized Experiments Kari Lock and Don Rubin Harvard University JSM 2010.

Presentation transcript:

The “Gold Standard”
Why are randomized experiments so good?
They yield unbiased estimates of the treatment effect.
They eliminate (?) confounding factors… ON AVERAGE.
For any particular experiment, covariate imbalance is possible (and likely).

Rerandomization
Suppose you are doing a randomized experiment and have covariate information available before conducting the experiment.
You randomize to treatment and control, but get a “bad” randomization.
Can you rerandomize?
Yes, but you first need to specify a concrete definition of “bad”.

1) Collect covariate data. Specify a criterion determining when a randomization is unacceptable, based on covariate balance.
2) (Re)randomize subjects to treated and control. Check covariate balance: if unacceptable, rerandomize (repeat step 2); if acceptable, continue to step 3.
3) Conduct experiment.
4) Analyze results with a Fisher randomization test.
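The procedure above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the balance criterion (`max_abs_std_diff` with a cutoff of 0.3), and the synthetic data are all hypothetical. The randomization test draws its reference assignments from the same rerandomization scheme, so only acceptable randomizations enter the null distribution.

```python
import numpy as np

def rerandomize(X, n_treated, is_acceptable, rng, max_draws=10_000):
    """Step 2: draw assignments until the balance criterion accepts one."""
    n = X.shape[0]
    for _ in range(max_draws):
        w = np.zeros(n, dtype=bool)
        w[rng.choice(n, size=n_treated, replace=False)] = True
        if is_acceptable(X, w):
            return w
    raise RuntimeError("no acceptable randomization found")

def max_abs_std_diff(X, w):
    """Largest absolute standardized difference in covariate means."""
    diff = X[w].mean(axis=0) - X[~w].mean(axis=0)
    return np.max(np.abs(diff / X.std(axis=0, ddof=1)))

def fisher_randomization_test(y, X, w_obs, is_acceptable, rng, n_draws=1000):
    """Step 4: two-sided p-value for the sharp null of no effect,
    using only assignments that pass the same balance criterion."""
    observed = y[w_obs].mean() - y[~w_obs].mean()
    n_treated = int(w_obs.sum())
    null_stats = np.empty(n_draws)
    for i in range(n_draws):
        w = rerandomize(X, n_treated, is_acceptable, rng)
        null_stats[i] = y[w].mean() - y[~w].mean()
    return float(np.mean(np.abs(null_stats) >= abs(observed)))

# Hypothetical data: 40 subjects, 3 covariates, no true treatment effect.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
y = X @ np.array([1.0, 0.5, 0.0]) + rng.normal(size=40)

# Step 1: the criterion is fixed before any assignment is drawn.
criterion = lambda X, w: max_abs_std_diff(X, w) < 0.3
w = rerandomize(X, 20, criterion, rng)
p_value = fisher_randomization_test(y, X, w, criterion, rng, n_draws=200)
```

The criterion used here depends only on the absolute difference in means, so swapping the treated and control labels never changes whether a randomization is accepted.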

Unbiased
To maintain an unbiased estimate of the treatment effect, the decision to rerandomize or not must be
• automatic and specified in advance
• blind to which group is treated
Theorem: If the treated and control groups are the same size, and if for every unacceptable randomization the exact opposite randomization is also unacceptable, then rerandomization yields an unbiased estimate of the treatment effect.
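The theorem can be checked by brute force on a tiny example. The sketch below is an illustration under assumptions, not part of the original slides: it invents eight subjects with hypothetical potential outcomes `y0` and `y1`, enumerates all 70 equal-sized assignments, and averages the difference-in-means estimate over the acceptable ones. With a symmetric criterion (based on the absolute covariate gap) the average equals the true effect; a one-sided criterion is included for contrast and will typically not match.

```python
from itertools import combinations
import numpy as np

rng = np.random.default_rng(1)
n, m = 8, 4                              # equal group sizes, as the theorem requires
x = rng.normal(size=n)                   # one hypothetical covariate
y0 = 2.0 * x + rng.normal(size=n)        # potential outcomes under control
y1 = y0 + rng.normal(loc=1.0, size=n)    # potential outcomes under treatment
true_effect = (y1 - y0).mean()

# Enumerate every assignment of m treated units among n.
assignments = []
for treated in combinations(range(n), m):
    w = np.zeros(n, dtype=bool)
    w[list(treated)] = True
    assignments.append(w)

def covariate_gap(w):
    """Treated-minus-control difference in covariate means."""
    return x[w].mean() - x[~w].mean()

# Cutoff chosen so that roughly half of all assignments are acceptable.
cutoff = np.median([abs(covariate_gap(w)) for w in assignments])

def mean_estimate(accept):
    """Average estimate over all acceptable assignments, and their count."""
    est = [y1[w].mean() - y0[~w].mean() for w in assignments if accept(w)]
    return float(np.mean(est)), len(est)

# Symmetric: w is acceptable exactly when its opposite ~w is acceptable.
sym_mean, sym_count = mean_estimate(lambda w: abs(covariate_gap(w)) <= cutoff)
# One-sided: violates the theorem's symmetry condition.
one_sided_mean, _ = mean_estimate(lambda w: covariate_gap(w) <= cutoff)

print(true_effect, sym_mean, one_sided_mean)
```

Under the symmetric rule every subject is treated in exactly half of the acceptable assignments, which is why the average recovers the true effect.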

Mahalanobis Distance
Define overall covariate distance by $M = D' r^{-1} D$
$D_j$: standardized difference between treated and control covariate means for covariate $j$
$k$ = number of covariates
$D = (D_1, \ldots, D_k)$
$r$ = covariate correlation matrix = $\mathrm{cov}(D)$
Choose $a$ and rerandomize when $M > a$
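A minimal sketch of computing M for one candidate assignment. The slide does not spell out the standardization, so this assumes each difference in means is divided by its standard deviation over randomizations, sqrt(s_j^2 (1/n_T + 1/n_C)), which is what makes cov(D) equal to the correlation matrix r. The function name and the example data are hypothetical.

```python
import numpy as np

def mahalanobis_balance(X, w):
    """M = D' r^{-1} D for covariate matrix X (n x k) and boolean assignment w."""
    n_t, n_c = int(w.sum()), int((~w).sum())
    diff = X[w].mean(axis=0) - X[~w].mean(axis=0)
    # Assumed standardization: randomization standard deviation of each difference.
    scale = np.sqrt(X.var(axis=0, ddof=1) * (1.0 / n_t + 1.0 / n_c))
    D = diff / scale
    r = np.atleast_2d(np.corrcoef(X, rowvar=False))
    return float(D @ np.linalg.solve(r, D))

# Hypothetical example: 30 subjects, 3 covariates, 15 treated.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))
w = np.zeros(30, dtype=bool)
w[rng.choice(30, size=15, replace=False)] = True
M = mahalanobis_balance(X, w)

# The same assignment scored on an affine transformation of the covariates.
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 3.0],
              [1.0, 0.0, 1.0]])   # invertible (determinant 5)
M_affine = mahalanobis_balance(X @ A + 5.0, w)
```

`M` and `M_affine` agree up to floating-point error, since M depends on the covariates only through quantities that are unchanged by invertible affine transformations.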

Rerandomization Based on M
Since M follows a known distribution, it is easy to specify the proportion of rejected randomizations
M is affinely invariant
Correlations between covariates are maintained
The variance reduction on each covariate is the same (and known)
The variance reduction for any linear combination of the covariates is known
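The slide does not name the distribution. The sketch below assumes the usual large-sample result that M is approximately chi-square with k degrees of freedom, and that the remaining fraction of covariate variance is v_a = P(χ²_{k+2} ≤ a) / P(χ²_k ≤ a). Both are assumptions brought in from the rerandomization literature, and the function names and example values (k = 5, acceptance probability 0.1) are hypothetical. It uses only the standard library.

```python
import math

def chi2_cdf(x, k):
    """P(chi-square with k degrees of freedom <= x), via the
    series expansion of the regularized lower incomplete gamma function."""
    if x <= 0:
        return 0.0
    s, z = k / 2.0, x / 2.0
    term = total = 1.0
    n = 0
    while term > 1e-16 * total:
        n += 1
        term *= z / (s + n)
        total += term
    return total * math.exp(s * math.log(z) - z - math.lgamma(s + 1.0))

def threshold_for_acceptance(p_a, k):
    """Find a with P(M <= a) = p_a when M ~ chi-square(k), by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, k) < p_a:
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, k) < p_a:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def variance_ratio(a, k):
    """Assumed formula: v_a = P(chi2_{k+2} <= a) / P(chi2_k <= a)."""
    return chi2_cdf(a, k + 2) / chi2_cdf(a, k)

# Hypothetical design: 5 covariates, accept 10% of randomizations.
k, p_a = 5, 0.1
a = threshold_for_acceptance(p_a, k)
v_a = variance_ratio(a, k)
print(f"a = {a:.3f}, v_a = {v_a:.3f}")
```

Choosing the acceptance probability first and solving for a keeps the expected number of redraws explicit: on average 1/p_a randomizations are needed to find an acceptable one.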

Rerandomization
Theorem: If $n_T = n_C$ and rerandomization occurs when $M > a$, then [equation missing from transcript] and [equation missing from transcript]
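The two displayed results of this theorem are not present in the transcript. As a hedged reconstruction, and an assumption rather than a recovery of the slide, the result is usually stated in the rerandomization literature in the form below, for covariate means that are approximately normal. Here $\bar{X}_T - \bar{X}_C$ denotes the vector of differences in covariate means, and $v_a$ is the variance ratio whose theoretical value is quoted with the figures that follow.

```latex
\begin{align*}
E\left(\bar{X}_T - \bar{X}_C \mid M \le a\right) &= 0, \\
\operatorname{cov}\left(\bar{X}_T - \bar{X}_C \mid M \le a\right)
  &= v_a \, \operatorname{cov}\left(\bar{X}_T - \bar{X}_C\right), \\
v_a &= \frac{P\left(\chi^2_{k+2} \le a\right)}{P\left(\chi^2_{k} \le a\right)},
  \qquad 0 < v_a < 1.
\end{align*}
```

Read this way, the first line says balance still holds on average after rerandomization, and the second says every covariate's variance shrinks by the same factor $v_a$.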

[Figure: Difference in Covariate Means; Difference in Outcome Means]

(theoretical $v_a$ = 0.16)

Difference in Outcome Means Under Null (theory = 0.58)
Equivalent to increasing the sample size by a factor of 1.7
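The sample-size claim is simple arithmetic. The variance of a difference in means scales as 1/n, so keeping a fraction 0.58 of the original variance is worth about as much as multiplying the sample size by 1/0.58. The second line rests on an assumption not stated on the slides: that the outcome variance is reduced by the factor $1 - (1 - v_a)R^2$, where $R^2$ (a symbol introduced here) is the squared multiple correlation between the outcome and the covariates. If so, the quoted values 0.16 and 0.58 would correspond to $R^2 = 0.5$.

```latex
\begin{align*}
\frac{1}{0.58} &\approx 1.72 \\
1 - (1 - v_a) R^2 = 0.58,\quad v_a = 0.16
  \;&\Longrightarrow\; R^2 = \frac{1 - 0.58}{1 - 0.16} = \frac{0.42}{0.84} = 0.5
\end{align*}
```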

Conclusion
Rerandomization improves covariate balance between the treated and control groups, and increases precision in estimating the treatment effect if the covariates are correlated with the response.
Rerandomization gives the researcher more power to detect a significant result, and more faith that an observed effect is really due to the treatment.