Quantifying Statistical Control: the Threshold of Theoretical Randomization Kenneth A. Frank Minh Duong Spiro Maroulis Michigan State University Ben Kelcey.

Presentation transcript:

Quantifying Statistical Control: The Threshold of Theoretical Randomization
Kenneth A. Frank, Minh Duong, Spiro Maroulis (Michigan State University); Ben Kelcey (University of Michigan)
Presented at Groningen May

Focal Example: The Effect of Kindergarten Retention on Reading and Math Achievement (Hong and Raudenbush 2005)
1. What is the average effect of kindergarten retention policy? (Example used here.) Should we expect to see a change in children’s average learning outcomes if a school changes its retention policy?
Propensity-based questions (not explored here):
2. What is the average impact of a school’s retention policy on children who would be promoted if the policy were adopted? Use principal stratification (Frangakis and Rubin 2002).
3. What is the effect of kindergarten retention on those who are retained? That is, how much more or less would kindergarten retainees have learned, on average, had they been promoted to the first grade rather than retained?

Data
Early Childhood Longitudinal Study, Kindergarten cohort (ECLS-K)
–US National Center for Education Statistics (NCES). Nationally representative
Kindergarten and 1st grade
–observed Fall 1998, Spring 1998, Spring 1999
Student
–background and educational experiences
–Math and reading achievement (dependent variable)
–experience in class
Parenting information and style
Teacher assessment of student
School conditions
Analytic sample (1,080 schools that do retain some children)
–471 kindergarten retainees
–10,255 promoted students

Effect of Retention on Reading Scores (Hong and Raudenbush)

Possible Confounding Variables
–Gender
–Two parent household
–Poverty
–Mother’s level of education (especially relevant for reading achievement)

What is the Impact of a Confounding Variable on an Inference for a Regression Coefficient?
(Frank, K. 2000. “Impact of a Confounding Variable on the Inference of a Regression Coefficient.” Sociological Methods and Research, 29(2).)

Impact appears in Partial Correlation
$$r_{ty|v} = \frac{r_{ty} - r_{tv}\, r_{yv}}{\sqrt{(1 - r_{tv}^2)(1 - r_{yv}^2)}}$$
$r_{ty}$ is the sample correlation between the treatment and the outcome
$r_{yv}$ is the sample correlation between a confound and the outcome
$r_{tv}$ is the sample correlation between a confound and the treatment
The correlation is reduced by the product of the two relevant correlations, $r_{tv} \times r_{yv}$ (the denominator is at most 1, so it can only increase the partial)
Inference for a regression coefficient is the same as that for the partial correlation
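
As a minimal sketch of the arithmetic on this slide, the Python below computes the impact of a confound as the product of its correlations with treatment and outcome, and then the standard first-order partial correlation. The function names and the numeric inputs are hypothetical, chosen only for illustration; they are not values from the study.

```python
import math


def impact(r_tv: float, r_yv: float) -> float:
    """Impact of a confound v: product of its correlations with t and y."""
    return r_tv * r_yv


def partial_correlation(r_ty: float, r_tv: float, r_yv: float) -> float:
    """Correlation between treatment t and outcome y, controlling for v."""
    denominator = math.sqrt((1 - r_tv ** 2) * (1 - r_yv ** 2))
    return (r_ty - impact(r_tv, r_yv)) / denominator


# Hypothetical correlations (assumptions for illustration only).
r_ty, r_tv, r_yv = 0.30, 0.20, 0.40
adjusted = partial_correlation(r_ty, r_tv, r_yv)
print(f"impact = {impact(r_tv, r_yv):.3f}, partial r = {adjusted:.3f}")
```

With a positive impact the numerator shrinks, so the partial correlation here is smaller than the zero-order correlation even though the denominator pushes it back up slightly.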

Impacts of Covariates on Correlation between Retention and Reading Achievement
Table columns: covariate; impact; component correlation with achievement; component correlation with retention
Table rows: Mother’s education; Female; Two parent; Poverty (cell values did not survive in the transcript)
Negative impact would reduce the magnitude of the coefficient for retention

Covariates and Absorbers (dependent variable: Reading in Spring 1999)
Covariates
–Mother’s education
–Poverty
–Gender
–Two parent home
–References: Hong and Raudenbush; Shepard; Coleman
Absorbers
–Schools as fixed effects
–Pre-test Spring 1998
–Growth trajectory: Fall 1998-Spring 1998
–References: Shadish et al.; Heckman and Hotz (1988: JASA)

Extent to which Pre-test Absorbs the Impacts of Covariates on Inference Regarding Effect of Retention on Reading Achievement
Controlling for pre-test absorbs 87% of the impact of mother’s education; once pre-test is controlled, there is less need to control for mother’s education.
Table columns: Family background; No control for pre-test (Impact, $r_{vt}$, $r_{vy}$); Control for pre-test (Impact, $r_{vt}$, $r_{vy}$); % Reduction (absorption)
Table rows: Mother’s education; Female; Two parent home; Poverty (cell values did not survive in the transcript)
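
The "% Reduction (absorption)" column can be read as the relative drop in a covariate's impact once the pre-test is controlled. The sketch below assumes that reading; the function name and the two impact values are hypothetical and are not the figures from the slide's table.

```python
def percent_absorbed(impact_without: float, impact_with: float) -> float:
    """Percent of a covariate's impact absorbed by adding a control.

    impact_without: impact (r_vt * r_vy) with no control for the pre-test
    impact_with:    impact (partial r_vt * partial r_vy) after controlling
    """
    if impact_without == 0:
        raise ValueError("impact without the control must be non-zero")
    return 100.0 * (impact_without - impact_with) / impact_without


# Hypothetical impacts (assumptions for illustration only).
before, after = 0.020, 0.005
print(f"{percent_absorbed(before, after):.0f}% of the impact is absorbed")
```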

Capacity of Controls to Absorb the Impacts of Covariates

Effect of Retention on Achievement After Adding each Covariate
Table columns: Controls; Est; Se; t
Table rows (cell values did not survive in the transcript):
–School
–School+Pre
–School+Pre2+(Pre2-Pre1)
–School+Pre2+(Pre2-Pre1)+Momed
–School+Pre2+(Pre2-Pre1)+Female
–School+Pre2+(Pre2-Pre1)+2parent
–School+Pre2+(Pre2-Pre1)+poverty
–Hong and Raudenbush (model based)
n = 10,065, $R^2$ = .40
Note: 1 year’s growth is about 10 points, so retention effect > 1 year growth

Randomization as the Gold Standard
Randomization preferred
Works in “long run”: What is “long run”?
Relationship between n and impact in theoretical randomized experiment
–Alternative “Silver Standard”
–Quantify statistical control in a quasi-experiment

Need for Simulation? Predicting Mean Impact Using Wei Pan’s Approximation (UGLY!)
Pan, W., and Frank, K.A. (2003), “A probability index of the robustness of a causal inference,” Journal of Educational and Behavioral Statistics, 28.
$\rho_{tv}$ = correlation between treatment and confound
$\rho_{yv}$ = correlation between outcome and confound
s, a, b = coefficients to obtain approximation

Pan’s Approximation (UGLY!): But Works
Simulate mean impact
–n: (20, 100, 1000)
–$\rho_{tv}$, $\rho_{ty}$, $\rho_{vy}$: (.1, .3, .5, .7)
Bias of predicted mean impact (Pan 2003) across simulations is with standard deviation of
We have a function for the impacts across a range of conditions

What is the Impact of a Confounding Variable in a Randomized Experiment? ≈ 0 in RCT
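
A small simulation can make the "long run" concrete: under random assignment the population correlation between treatment and confound is zero, but the sample impact $r_{tv} \times r_{yv}$ in any one experiment is not exactly zero and shrinks as n grows. The sketch below is not Pan's approximation; it is a hypothetical illustration that assumes a balanced coin-flip design, a standard normal confound, no treatment effect, and a chosen $\rho_{yv}$ of 0.5. All names are invented for this example.

```python
import math
import random


def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def mean_abs_impact(n, rho_yv, reps, rng):
    """Average |r_tv * r_yv| over simulated randomized experiments of size n."""
    total = 0.0
    noise_sd = math.sqrt(1.0 - rho_yv ** 2)
    for _ in range(reps):
        # Randomize: half treated, half control, in shuffled order.
        t = [0] * (n // 2) + [1] * (n - n // 2)
        rng.shuffle(t)
        # Confound v is independent of t; outcome y correlates rho_yv with v.
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        y = [rho_yv * vi + noise_sd * rng.gauss(0.0, 1.0) for vi in v]
        total += abs(pearson(t, v) * pearson(y, v))
    return total / reps


rng = random.Random(0)  # fixed seed so the run is repeatable
for n in (20, 100, 1000):
    print(n, round(mean_abs_impact(n, 0.5, 200, rng), 4))
```

The sample sizes match the grid used in the slides' own simulations (20, 100, 1000); the number of replications is kept small so the sketch runs quickly.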

Predicting Mean Impact Using Wei Pan’s Approximation
Assuming $\rho_{tv} = 0$ (no correlation between treatment and confound, as in a randomized experiment). Elegant!
Where
$\rho_{ty}$ = correlation between treatment and outcome
$\rho_{yv}$ = correlation between outcome and unobserved confound

Solving Pan’s Approximation for n (assuming randomized experiment):
–Allows us to predict effective n of a theoretical randomized experiment given a mean impact and hypothetical correlation between outcome and confound
–Can predict an effective n given an impact in a quasi-experiment

Predicted Sample Size as a Function of Impact of Mother’s Education

Quantitative Crosswalk between RCT and Quasi-experiment
Quasi-experiment can achieve same or better level of control as randomized experiment
–Red line: Hong and Raudenbush achieve control equivalent to randomized experiment of size 200 → better than a small RCT
But, with a randomized experiment
–Guaranteed no bias in long run
–Confidence interval captures uncertainty
Trade-off between precision and bias
–Quasi-experiment could be more precise, but possibly biased
–Key assumption: impacts of measured covariates represent impacts of unmeasured covariates.

Asymptotics of Randomization
“Elbow” in relationship between n and impact.
Imprecise prediction for small impact (where we care the most).
Leverage the shape by defining a single threshold: first derivative = -25/.001 = -25000, i.e., a change of 25 in n for a .001 change in impact.

Mean Impact and Effective N of each cell given at threshold. Asymptotics of Precision for Randomization Across Levels of Correlation between Outcome and the Treatment ($\rho_{yt}$) and Outcome and a Confound ($\rho_{yv}$)

Interpretations
Cutoffs appear reasonable: on the way to asymptotic land
More affected by treatment effect (can be estimated) than by relationship between outcome and unobserved confound (unknown). Good.

Discussion
Characterize control in terms of impact
Theoretical randomized experiment as “gold standard”
–Departure from Cook, who used actual experiments
Quasi-experiments (legitimacy)
–Can equate to theoretical experiment
–Obtain effective n
–Use effective n as weight in meta-analysis
–Cross threshold?
Procedure
–Establish impact of good covariates
–Establish absorption due to pre-test, etc.
–Equate to randomized experiment

What must be the Impact of an Unmeasured Confounding Variable to Invalidate the Inference?
Step 1: Establish Correlation Between Retention and Score
Step 2: Define a Threshold for Inference
Step 3: Calculate the Threshold for the Impact Necessary to Invalidate the Inference
Step 4: Multivariate Extension, with Measured Covariates

Step 1: Establish Correlation Between Retention and Score
$$r = \frac{t}{\sqrt{t^2 + (n - q - 1)}}$$
t is taken from the regression
n is the sample size
q is the number of parameters estimated
n - q - 1 = 9012
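
The conversion in Step 1 uses the standard identity between a regression t-ratio and the corresponding (partial) correlation, r = t / sqrt(t² + df) with df = n - q - 1. The sketch below applies it with the slide's df of 9012. The t-ratio did not survive in the transcript, so the value used here is a hypothetical stand-in, and the function name is invented for this example.

```python
import math


def r_from_t(t: float, df: int) -> float:
    """Convert a regression t-ratio to a correlation; df = n - q - 1."""
    return t / math.sqrt(t ** 2 + df)


DF = 9012        # n - q - 1, as given on the slide
t_ratio = 2.0    # hypothetical t-ratio (assumption, not the study's value)

r = r_from_t(t_ratio, DF)
print(f"r = {r:.4f}")
```

The sign of r follows the sign of t, so a negative retention effect would give a negative correlation.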

Step 2: Define a Threshold for Inference
Define $r^{\#}$ as the value of r that is just statistically significant:
$$r^{\#} = \frac{t_{critical}}{\sqrt{(n - q - 1) + t_{critical}^2}}$$
n is the sample size
q is the number of parameters estimated
$t_{critical}$ is the critical value of the t-distribution for making an inference
$r^{\#}$ can also be defined in terms of effect sizes
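
The threshold in Step 2 is the same t-to-r conversion applied to the critical t-value. The sketch below assumes a two-tailed test at the 5% level, for which 1.96 is the usual large-sample critical value; that choice, and the function name, are assumptions made for illustration rather than something stated on the slide.

```python
import math


def threshold_r(t_critical: float, df: int) -> float:
    """Smallest correlation that is just statistically significant.

    df is n - q - 1 (sample size minus parameters estimated minus one).
    """
    return t_critical / math.sqrt(df + t_critical ** 2)


DF = 9012          # n - q - 1, as given in Step 1 of the slides
T_CRITICAL = 1.96  # assumed large-sample, two-tailed 5% critical value

r_sharp = threshold_r(T_CRITICAL, DF)
print(f"r# = {r_sharp:.4f}")
```

With a sample this large the threshold is small: any correlation larger in magnitude than roughly .02 would be statistically significant.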

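Step 3: Calculate the Threshold for the Impact Necessary to Invalidate the Inference
Define the impact: $k = r_{x \cdot cv} \times r_{y \cdot cv}$, and assume $r_{x \cdot cv} = r_{y \cdot cv}$ (which maximizes the impact of the confounding variable; Frank 2000).
Set $r_{x \cdot y|cv} = r^{\#}$ and solve for k to find the threshold for the impact of a confounding variable (TICV), where $r_{x \cdot y}$ is the correlation between retention and score from Step 1:
$$r_{x \cdot y|cv} = \frac{r_{x \cdot y} - k}{1 - k} = r^{\#} \quad\Rightarrow\quad k = \frac{r_{x \cdot y} - r^{\#}}{1 - r^{\#}}$$
Impact of an unmeasured confound > .25 → inference invalid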
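
Under the slide's assumption that the confound correlates equally with treatment and outcome, both squared correlations equal k, so the partial correlation simplifies to (r - k) / (1 - k). Setting that equal to r# and solving gives k = (r - r#) / (1 - r#). The sketch below implements that algebra for a positive correlation; the function names and the input values are hypothetical, not the study's figures.

```python
import math


def impact_threshold(r_xy: float, r_sharp: float) -> float:
    """Impact k at which the partial correlation falls to the threshold r#.

    Assumes r_xy > r_sharp > 0 and equal component correlations.
    """
    return (r_xy - r_sharp) / (1 - r_sharp)


def partial_given_impact(r_xy: float, k: float) -> float:
    """Partial correlation when both component correlations equal sqrt(k)."""
    return (r_xy - k) / (1 - k)


# Hypothetical inputs (assumptions for illustration only).
r_xy, r_sharp = 0.30, 0.05
k = impact_threshold(r_xy, r_sharp)
component = math.sqrt(k)  # each correlation the confound would need

print(f"k = {k:.3f}; component correlations = {component:.3f}")
```

A confound whose impact exceeds k would push the partial correlation below r#, which is the sense in which the inference would be invalidated.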

Calculations made easy!

Step 4: Multivariate Extension, with Covariates
$k = r_{x \cdot cv|z} \times r_{y \cdot cv|z}$
Maximizing the impact with covariates z in the model implies
And = .21

Multivariate Calculations

What must be the Impact of an Unmeasured Confound to Invalidate the Inference?
If k > .25 (or .21 without covariates) then the inference is invalid.
Maximum for multivariate model occurs when $r_{x \cdot cv}$ = .46 and $r_{y \cdot cv}$ = .45. Furthermore, correlations of unobserved confound must be partialled for covariates z.
The magnitude of the impact of mother’s education (strongest measured covariate) = .0015 → impact of unmeasured confound would have to be more than 100 times greater than the impact of mother’s education to invalidate the inference. Hmmm….

Extensions
Logistic Regression
–See Imbens, Guido, “Sensitivity to Exogeneity Assumptions in Program Evaluation,” Recent Advances in Econometric Methodology (especially 128)
–David J. Harding, “Counterfactual Models of Neighborhood Effects: The Effect of Neighborhood Poverty on Dropping Out and Teenage Pregnancy,” American Journal of Sociology 109(3)
–Logistic regression (Ben Kelcey at U of M): use weighted least squares; use odds ratios
Multilevels
–Seltzer and Frank (AERA 2007)
Multiple thresholds
–Statistical significance: simply redefine $H_0 \neq 0$.
–Point estimates: define impact necessary to reduce coefficient below a series of thresholds, each one representing a separate decision. Halfway between Bayesian and Frequentist

Actual Randomized Experiment: Effect of Technology on Teaching
–Strong methods
–Randomization
–Still some confounding

Relationship Between background Characteristics and Treatment Assignment in a Randomized Study of the Effect of Technology on Achievement