Difference in Difference 1

Preliminaries
Office Hours: Fridays 4-5pm, 32Lif, 3.01
I will post slides from class on my website.



Diff-in-Diff
Principal tool in non-experimental applied micro over the last twenty years.
Takes an idea from the experimental literature (control groups) and applies it in non-experimental circumstances.
Suitability of the control group is key, as the control group provides the counterfactual.

Counterfactuals
Describe, as if to a policymaker with no background in econometrics, what a counterfactual is and why it is important for establishing the impact of a particular program.

Counterfactual
Suppose you are interested in assessing the effect of reducing class sizes on children’s final exam grades. You have test scores from students in classes where the class size was reduced starting in the prior year and from students in classes where the size remained the same.
Under what conditions would the difference in the average test scores across these two groups be a valid estimate for the effect of reducing class sizes?
Explicitly state the counterfactual you need and how it relates to the comparison group you actually have (i.e., students in classes where the size remained the same).

Re-Cap: Rubin Causal Model
We would like to know the effect of the ‘treatment’ on the treatment group:
$E(Y_i^T \mid T) - E(Y_i^C \mid T)$
What do these mean?
Do we observe (the sample analogue of) both of these objects?

Re-Cap: Rubin Causal Model
We would like to know the effect of the ‘treatment’ on the treatment group:
$E(Y_i^T \mid T) - E(Y_i^C \mid T)$
What do these mean?
We don’t observe (the sample analogue of) $E(Y_i^C \mid T)$.
Instead we often estimate $E(Y_i^T \mid T) - E(Y_i^C \mid C)$.
What is the problem?

Selection bias
Occurs when $E(Y_i^C \mid T) \neq E(Y_i^C \mid C)$.
Note: $\big(E(Y_i^T \mid T) - E(Y_i^C \mid T)\big) - \big(E(Y_i^T \mid T) - E(Y_i^C \mid C)\big) = E(Y_i^C \mid C) - E(Y_i^C \mid T) \neq 0$
Examples
What do we do with Diff-in-Diff?
Estimate $E(\Delta Y_i^T \mid T) - E(\Delta Y_i^C \mid C)$.
So biased if $E(\Delta Y_i^C \mid T) \neq E(\Delta Y_i^C \mid C)$.
What do we call this assumption?
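The selection-bias point can be written as a one-line decomposition, obtained by adding and subtracting $E(Y_i^C \mid T)$. This is a sketch in the slide's notation, where $Y_i^T$ and $Y_i^C$ are the treated and untreated potential outcomes and conditioning on $T$ or $C$ refers to group membership. The second line applies the same step to changes and additionally assumes (this is not stated on the slide) that pre-period outcomes are unaffected by the treatment.

```latex
\begin{align*}
E(Y_i^T \mid T) - E(Y_i^C \mid C)
  &= \underbrace{E(Y_i^T \mid T) - E(Y_i^C \mid T)}_{\text{effect on the treated}}
   + \underbrace{E(Y_i^C \mid T) - E(Y_i^C \mid C)}_{\text{selection bias}} \\
E(\Delta Y_i^T \mid T) - E(\Delta Y_i^C \mid C)
  &= \underbrace{E(\Delta Y_i^T \mid T) - E(\Delta Y_i^C \mid T)}_{\text{effect on the treated}}
   + \underbrace{E(\Delta Y_i^C \mid T) - E(\Delta Y_i^C \mid C)}_{\text{bias from differing trends}}
\end{align*}
```

In levels the comparison is unbiased only if the two groups would have had the same untreated outcome; in differences it only requires that they would have had the same untreated change.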

2. Productivity of Cocoa Farmers
a) Time Series Estimate? Any good? Over- or under-estimate? What is the identification assumption?
b) Cross Section Estimate? Any good? Over- or under-estimate? What is the identification assumption?
c) DiD Estimate? Any good? What problems has it solved? What is the identification assumption? Do we believe it?
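The three estimators in (a) to (c) can be compared on a 2x2 table of group means. The sketch below is in Python and uses made-up numbers, not the data from the exercise; the group labels, the `means` dictionary, and the function names are all hypothetical.

```python
# Hypothetical mean productivity by group and period (made-up numbers).
means = {
    ("treated", "before"): 100.0,
    ("treated", "after"): 130.0,
    ("control", "before"): 90.0,
    ("control", "after"): 110.0,
}


def time_series_estimate(m):
    """Before/after change for the treated group only."""
    return m[("treated", "after")] - m[("treated", "before")]


def cross_section_estimate(m):
    """Treated minus control, using the post-treatment period only."""
    return m[("treated", "after")] - m[("control", "after")]


def did_estimate(m):
    """Change for the treated group minus change for the control group."""
    treated_change = m[("treated", "after")] - m[("treated", "before")]
    control_change = m[("control", "after")] - m[("control", "before")]
    return treated_change - control_change


print(time_series_estimate(means))    # 30.0
print(cross_section_estimate(means))  # 20.0
print(did_estimate(means))            # 10.0
```

With these numbers the time-series estimate picks up the common trend (20) and the cross-section estimate picks up the pre-existing gap (10); the DiD estimate removes both, provided the control group's change is a valid counterfactual for the treated group's change.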

Stata Part
We are trying to find the effect of the announcement of an incinerator on house prices.
We have two years, 1978 and 1981.
The treated group is houses within three miles of the incinerator; the control group is houses more than three miles away.

Treatment Effect in 1981
reg lrprice nearinc if year==1981, robust

Linear regression          Number of obs = 142
                           F(1, 140)     =
                           Prob > F      =
                           R-squared     =
                           Root MSE      =

         |           Robust
 lrprice |  Coef.  Std. Err.  t  P>|t|  [95% Conf. Interval]
---------+--------------------------------------------------
 nearinc |
   _cons |

What is the estimated treatment effect?
What is the identification assumption?
What do we learn?
What are the means of house price in 1981 for treatment and control groups?

Treatment Effect in 1978
reg lrprice nearinc if year==1978, robust

Linear regression          Number of obs = 179
                           F(1, 177)     =
                           Prob > F      =
                           R-squared     =
                           Root MSE      =

         |           Robust
 lrprice |  Coef.  Std. Err.  t  P>|t|  [95% Conf. Interval]
---------+--------------------------------------------------
 nearinc |
   _cons |

What is the estimated ‘treatment’ effect?
What has this to do with our identification assumption?
What do we learn?
What are the means of house price in 1978 for treatment and control groups?
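The question about group means has a mechanical answer for any regression on a constant and a single 0/1 dummy: the intercept is the control-group mean and the intercept plus the slope is the treated-group mean. The Python sketch below checks this on made-up numbers; the data, the variable names, and the helper `ols_on_dummy` are hypothetical and are not the lecture's house-price data.

```python
from statistics import mean


def ols_on_dummy(y, d):
    """OLS of y on a constant and a 0/1 dummy d; returns (intercept, slope)."""
    d_bar, y_bar = mean(d), mean(y)
    sxy = sum((di - d_bar) * (yi - y_bar) for di, yi in zip(d, y))
    sxx = sum((di - d_bar) ** 2 for di in d)
    slope = sxy / sxx
    intercept = y_bar - slope * d_bar
    return intercept, slope


# Made-up log prices and a made-up "near the incinerator" dummy.
y = [11.0, 11.4, 11.2, 10.8, 11.0, 10.6]
d = [0, 0, 0, 1, 1, 1]

intercept, slope = ols_on_dummy(y, d)
control_mean = mean(yi for yi, di in zip(y, d) if di == 0)
treated_mean = mean(yi for yi, di in zip(y, d) if di == 1)

# The intercept matches the control mean; intercept + slope matches the treated mean.
print(round(intercept, 6), round(control_mean, 6))
print(round(intercept + slope, 6), round(treated_mean, 6))
```

The same logic applies to each single-year regression of lrprice on nearinc: _cons is the mean of lrprice for houses far from the incinerator, and _cons plus the nearinc coefficient is the mean for houses near it.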

Diff-in-Diff

Linear regression          Number of obs = 321
                           F(3, 317)     =
                           Prob > F      =
                           R-squared     =
                           Root MSE      =

             |           Robust
     lrprice |  Coef.  Std. Err.  t  P>|t|  [95% Conf. Interval]
-------------+--------------------------------------------------
     nearinc |
         y81 |
 y81_nearinc |
       _cons |

How would we write down the estimating equation?
Which is the variable of interest?
What is the estimated ‘treatment’ effect?
What is our identification assumption?
What do we learn?
How do the estimated coefficients relate to the previous tables?
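One way to write the estimating equation behind this table is sketched below. The variable names are taken from the output; the coefficient labels $\beta_0,\dots,\beta_3$ are an assumption introduced here, and y81_nearinc is taken to be the product of y81 and nearinc.

```latex
\begin{align*}
\text{lrprice}_i &= \beta_0 + \beta_1\,\text{nearinc}_i + \beta_2\,\text{y81}_i
                  + \beta_3\,(\text{y81}_i \times \text{nearinc}_i) + u_i \\[4pt]
E(\text{lrprice} \mid \text{far},\ 1978)  &= \beta_0 \\
E(\text{lrprice} \mid \text{near},\ 1978) &= \beta_0 + \beta_1 \\
E(\text{lrprice} \mid \text{far},\ 1981)  &= \beta_0 + \beta_2 \\
E(\text{lrprice} \mid \text{near},\ 1981) &= \beta_0 + \beta_1 + \beta_2 + \beta_3
\end{align*}
```

Because the model has one parameter per group-year cell, the coefficient on nearinc in the 1978-only regression corresponds to $\beta_1$, the coefficient in the 1981-only regression corresponds to $\beta_1 + \beta_3$, and the difference between the two is the diff-in-diff coefficient $\beta_3$.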

Diff-in-Diff plus Controls

Linear regression          Number of obs = 321
                           F(3, 317)     =
                           Prob > F      =
                           R-squared     =
                           Root MSE      =

             |           Robust
     lrprice |  Coef.  Std. Err.  t  P>|t|  [95% Conf. Interval]
-------------+--------------------------------------------------
     nearinc |
         y81 |
 y81_nearinc |
         age |
       agesq |
      lintst |
       lland |
       larea |
       rooms |
       baths |
        lcbd |
       _cons |

Which is the variable of interest?
What is the estimated ‘treatment’ effect? How does this change?
What is our identification assumption? How has it changed?

What are the policy implications?