Statistical Analysis of the Regression-Discontinuity Design

Analysis Requirements
- Pre-post
- Two-group
- Treatment-control (dummy-code)

Design notation (C = cutoff assignment, O = observation, X = treatment):
C O X O
C O   O

Assumptions in the Analysis
- Cutoff criterion perfectly followed.
- Pre-post distribution is a polynomial or can be transformed to one.
- Comparison group has sufficient variance on the pretest.
- Pretest distribution is continuous.
- Program is uniformly implemented.

The Curvilinearity Problem

[Figure: pretest (pre) on the x-axis vs. posttest effect (posteff) on the y-axis]

If the true pre-post relationship is not linear and we fit parallel straight lines as the model, the result will be biased.

The Curvilinearity Problem

[Figure: pretest (pre) on the x-axis vs. posttest effect (posteff) on the y-axis]

And even if the lines aren't parallel (an interaction effect), the result will still be biased.

Model Specification
- If you specify the model exactly, there is no bias.
- If you overspecify the model (add more terms than needed), the result is unbiased but inefficient.
- If you underspecify the model (omit one or more necessary terms), the result is biased.

Model Specification

For instance, if the true function is:

y_i = β0 + β1·X_i + β2·Z_i

and we fit:

y_i = β0 + β1·X_i + β2·Z_i + e_i

then our model is exactly specified and we obtain an unbiased and efficient estimate.

Model Specification

On the other hand, if the true function is:

y_i = β0 + β1·X_i + β2·Z_i

and we fit:

y_i = β0 + β1·X_i + β2·Z_i + β3·X_i·Z_i + e_i

then our model is overspecified; we included an unnecessary term, and we obtain an unbiased but inefficient estimate.

Model Specification

And finally, if the true function is:

y_i = β0 + β1·X_i + β2·Z_i + β3·X_i·Z_i + β4·X_i²

and we fit:

y_i = β0 + β1·X_i + β2·Z_i + e_i

then our model is underspecified; we excluded necessary terms, and we obtain a biased estimate.
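The consequences of underspecification can be seen in a small simulation. This is a minimal NumPy sketch with invented data and coefficients (the cutoff, effect size, and quadratic truth are assumptions for illustration): we generate data from a quadratic true function with a treatment jump of 5, then fit both an underspecified (linear) model and an exactly specified (quadratic) model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented data for illustration: pretest already centered at the cutoff (0).
n = 1000
x = rng.uniform(-1, 2, n)            # centered pretest (asymmetric around 0)
z = (x >= 0).astype(float)           # treatment assigned by the cutoff
true_effect = 5.0
y = 10 + 2 * x + true_effect * z + 1.5 * x**2 + rng.normal(0, 1, n)

def ols(design, y):
    """Least-squares coefficient vector for a given design matrix."""
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta

ones = np.ones(n)
# Underspecified: omits the needed quadratic term -> biased effect estimate.
b_under = ols(np.column_stack([ones, x, z]), y)
# Exactly specified: includes the quadratic term -> unbiased estimate.
b_exact = ols(np.column_stack([ones, x, z, x**2]), y)

print("underspecified effect estimate:", round(b_under[2], 2))
print("exact effect estimate:         ", round(b_exact[2], 2))
```

With an asymmetric pretest distribution like this one, the omitted quadratic term loads onto the treatment dummy, pulling the underspecified estimate well away from 5 while the exact model recovers it.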

Overall Strategy
- The best option is to specify the true function exactly.
- We would prefer to err by overspecifying the model, because that leads only to inefficiency.
- Therefore, start with a likely overspecified model and reduce it.

Steps in the Analysis
1. Transform the pretest by subtracting the cutoff.
2. Examine the relationship visually.
3. Specify higher-order terms and interactions.
4. Estimate the initial model.
5. Refine the model by eliminating unneeded higher-order terms.

Transform the Pretest
- Do this because we want to estimate the jump at the cutoff.
- When we subtract the cutoff from X, then X̃ = 0 at the cutoff, so the effect at the cutoff becomes the intercept difference.

X̃_i = X_i − X_c
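In code, the transformation is a single subtraction. A tiny sketch (the pretest values and the cutoff of 50 are made up):

```python
import numpy as np

# Made-up pretest scores with an assumed cutoff of 50.
pretest = np.array([42.0, 55.0, 50.0, 61.0, 48.0])
cutoff = 50.0

x_tilde = pretest - cutoff   # X-tilde: zero now marks the cutoff
print(x_tilde)
```

Cases exactly at the cutoff map to 0, so the model intercept is evaluated right where the jump occurs.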

Examine the Relationship Visually

Count the number of flexion points (bends) across both groups. Here, there are no bends, so we can assume a linear relationship.

Specify the Initial Model
- The rule of thumb is to include polynomial terms up to (number of flexion points) + 2.
- Here there were no flexion points, so specify 0 + 2 = 2 polynomial terms (i.e., up to the quadratic).
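The rule of thumb is simple enough to state as a helper (the function name is illustrative):

```python
def initial_polynomial_order(n_flexion_points: int) -> int:
    """Rule of thumb from the slides: order = (number of flexion points) + 2."""
    return n_flexion_points + 2

print(initial_polynomial_order(0))  # no bends -> 2 (start at the quadratic)
```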

The RD Analysis Model

y_i = β0 + β1·X̃_i + β2·Z_i + β3·X̃_i·Z_i + β4·X̃_i² + β5·X̃_i²·Z_i + e_i

where:
y_i = outcome score for the ith unit
β0 = coefficient for the intercept
β1 = linear pretest coefficient
β2 = mean difference for treatment
β3 = linear interaction
β4 = quadratic pretest coefficient
β5 = quadratic interaction
X̃_i = transformed pretest
Z_i = dummy variable for treatment (0 = control, 1 = treatment)
e_i = residual for the ith unit
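This model can be estimated with ordinary least squares. A minimal NumPy sketch, assuming simulated data (the function name, the cutoff of 50, and the jump of 10 are all invented for illustration; the slides' own output comes from a stats package):

```python
import numpy as np

def fit_rd_model(pretest, treated, outcome, cutoff):
    """Fit the full RD model by least squares.

    Design columns follow the slide's specification:
    1, X, Z, X*Z, X^2, X^2*Z, where X = pretest - cutoff.
    The returned beta[2] estimates the treatment effect at the cutoff.
    """
    x = np.asarray(pretest, float) - cutoff
    z = np.asarray(treated, float)
    design = np.column_stack([np.ones_like(x), x, z, x * z, x**2, x**2 * z])
    beta, *_ = np.linalg.lstsq(design, np.asarray(outcome, float), rcond=None)
    return beta

# Simulated example: linear truth with a jump of 10 at a cutoff of 50.
rng = np.random.default_rng(1)
pre = rng.uniform(30, 70, 500)
grp = (pre >= 50).astype(float)
post = 20 + 0.6 * (pre - 50) + 10 * grp + rng.normal(0, 0.5, 500)

beta = fit_rd_model(pre, grp, post, cutoff=50)
print("estimated treatment effect:", round(beta[2], 2))
```

Because the true function here is linear, the quadratic and interaction terms are unnecessary, but including them only costs efficiency; the jump estimate stays close to 10.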

Data to Analyze

Initial (Full) Model The regression equation is posteff = *precut *group *linint *quad quadint Predictor Coef Stdev t-ratio p Constant precut group linint quad quadint s = R-sq = 47.7% R-sq(adj) = 47.1%

Without Quadratic The regression equation is posteff = *precut *group *linint Predictor Coef Stdev t-ratio p Constant precut group linint s = R-sq = 47.5% R-sq(adj) = 47.2%

Final Model The regression equation is posteff = *precut *group Predictor Coef Stdev t-ratio p Constant precut group s = R-sq = 47.5% R-sq(adj) = 47.3%
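The reduction from the full model to the final one can be framed as nested-model comparisons: drop the highest-order terms and check whether the residual sum of squares rises significantly. A NumPy-only sketch with simulated data (the partial-F helper and the data are illustrative assumptions, not the slides' Minitab run):

```python
import numpy as np

def rss(design, y):
    """Residual sum of squares of a least-squares fit."""
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return float(np.sum((y - design @ beta) ** 2))

def partial_f(full, reduced, y):
    """Partial F statistic for dropping the columns absent from `reduced`."""
    n, p_full = full.shape
    p_red = reduced.shape[1]
    rss_f, rss_r = rss(full, y), rss(reduced, y)
    return ((rss_r - rss_f) / (p_full - p_red)) / (rss_f / (n - p_full))

# Simulated data: linear truth, no interaction, jump of 10 at the cutoff (0).
rng = np.random.default_rng(2)
x = rng.uniform(-20, 20, 400)       # pretest already centered at the cutoff
z = (x >= 0).astype(float)
y = 20 + 0.6 * x + 10 * z + rng.normal(0, 1.0, 400)

ones = np.ones_like(x)
m_inter = np.column_stack([ones, x, z, x * z])  # linear + interaction
m_final = np.column_stack([ones, x, z])         # candidate final model
m_noz = np.column_stack([ones, x])              # drops the treatment term

print("F for dropping the interaction:", round(partial_f(m_inter, m_final, y), 2))
print("F for dropping the treatment:  ", round(partial_f(m_final, m_noz, y), 2))
```

The first F is small (the interaction is unneeded and can be pruned, as in the slides' reduction), while the second is very large (the treatment term carries the effect and must stay).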

Final Fitted Model