Policy Evaluation
Antoine Bozio, Institute for Fiscal Studies
University of Oxford - January 2008

Outline

I. Why is policy evaluation important?
   1. Policy view
   2. Academic view
II. What are the evaluation problems?
   1. The Quest for the Holy Grail: causality
   2. The generic problem: the counterfactual
   3. Specific problems: selection, endogeneity
III. The evaluation methods
   1. Randomised social experiments
   2. Controlling for observables: regression, matching
   3. Natural experiments: diff-in-diff, regression discontinuity
   4. Instrumental variable strategy

I. Why is policy evaluation important? (policy view)

1. Policy interventions are very expensive: the UK government spends 41.5% of GDP
   - Need to know whether money is well spent
   - There are many alternative policies possible
2. Evaluation is key to modern democracies
   - Citizens differ in their preferences and in the goals they want policy to achieve => politics
   - Policies are means to achieve these goals
   - Citizens need to be informed about the efficiency of these means => policy evaluation

I. Why is policy evaluation important? (policy view)

3. It is rarely obvious what a policy's effects are
   - Unless you have strong beliefs (ideology), policies can have wide-ranging effects (the economy is complex and hard to predict)
   - Economic theory is very useful but leaves many policy conclusions indeterminate => they depend on parameters (individuals' behaviour, markets…)
   - Correlation is not causation…

I. Why is policy evaluation important? (academic view)

Evaluation is now a crucial part of applied economics:
- To estimate parameters
- To test models and theories

Evaluation techniques have become a field in themselves:
- Many advances in the last ten years
- Turning away from descriptive correlations and aiming at causal relationships

I. Why is policy evaluation important? (academic view)

Many fields of economics now rely heavily on evaluation work:
- Labour market policies
- Impact of taxes
- Impact of savings incentives
- Education policies
- Aid to developing countries

Use of micro data, as distinct from macro analysis (cross-country…)

II. What are the evaluation problems?

1. The Quest for the Holy Grail: causal relationships
2. The generic problem: the counterfactuals are missing
3. Specific obstacles: selection, endogeneity

II - 1. The Quest for Causality

Correlation is not causality!

Post hoc, ergo propter hoc: looking at what happens after the introduction of a policy is not proper evaluation.
- Long-term trends
- Macroeconomic changes
- Selection effects

Rubin's model for causal inference: the experimental setting and language - see Holland (1986)

II - 1. The Quest for Causality

We want to establish the causal effect of a policy T on a population U composed of agents u.

To measure how the treatment or cause (T) affects the outcome of interest (Y):

Define Y(u) as the outcome of interest defined over U (it can measure income, employment status, health, …)

II - 2. Looking for counterfactuals

The counterfactual question: what would have happened to this person's behaviour under an alternative policy?

E.g.:
- Do people work more when marginal taxes are lower?
- Do people earn more when they have more education?
- Do the unemployed find jobs more easily when unemployment benefit duration is reduced?

II - 2. Looking for counterfactuals

Treated: agent that has been exposed to the treatment (T)
Control: agent that has not been exposed to the treatment (C, non-T)

The role of Y is to measure the effect of treatment (causes have effects):
- Y_T(u): the outcome that would be observed had unit u been exposed to the treatment
- Y_C(u): the outcome that would be observed had unit u not been exposed to the treatment

II - 2. Looking for counterfactuals

The causal effect of treatment T on unit u, as measured by outcome Y, is

α(u) = Y_T(u) - Y_C(u)

It's a missing-data problem. The fundamental problem of causal inference: it is impossible to observe Y_T(u) and Y_C(u) simultaneously on the same unit. Therefore, it is impossible to observe α(u) for any unit u.
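A minimal simulation in Python (all numbers hypothetical, not from the slides) makes the missing-data point concrete: in a simulation we can generate both potential outcomes for every unit, but any real dataset reveals only one of them.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Both potential outcomes exist for every unit u in the simulation.
y_c = rng.normal(10, 2, n)   # Y_C(u): outcome without treatment
y_t = y_c + 2.0              # Y_T(u): outcome with treatment (true alpha = 2)

treated = rng.random(n) < 0.5

# In real data only y_obs and treated are available: Y_C(u) is missing
# for the treated and Y_T(u) for the controls, so alpha(u) = y_t - y_c
# can never be formed unit by unit.
y_obs = np.where(treated, y_t, y_c)

print("true average effect (computable only in simulation):", (y_t - y_c).mean())
```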

II - 2. Looking for counterfactuals

[Figure: outcome Y(u) over time; treated (T) and control (C) paths diverge after time k, and α(u) = Y_T(u) - Y_C(u) is the vertical gap between them]

The statistical solution

Use the population to compute the average causal effect of T over U. Need data on many individuals (micro data).

Average outcome of the treated: E[Y_T(u) | T]
Average outcome of the controls: E[Y_C(u) | C]

Compute the difference between averages:
D = E[Y_T(u) | T] - E[Y_C(u) | C]

II - 2. Selection bias

Compute the difference between averages:

D = E[Y_T(u) | T] - E[Y_C(u) | C]
D = E[Y_T(u) - Y_C(u) | T] + E[Y_C(u) | T] - E[Y_C(u) | C]
D = α + E[Y_C(u) | T] - E[Y_C(u) | C]
D = average causal effect + selection bias

Illustration: impact of an advanced course in maths

Treatment: advanced course in maths (against the standard course in maths)
Treated: students scoring above x in a maths test at the beginning of the year
Outcome of interest: score in a maths test at the end of the year

Measure the impact of treatment by comparing the average scores of treated and controls at the end of the year.

Problem? On average, treated students are better at maths than control students. The best students would always perform better on average!!

Selection bias: E[Y_C(u) | T] - E[Y_C(u) | C] > 0
=> Overestimation of the true effect of the course
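A sketch of this illustration in Python, under assumed numbers (score cutoff of 60, true effect of 3 points): the naive difference D recovers the true effect plus a large positive selection term.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

ability = rng.normal(50, 10, n)      # baseline maths score
treated = ability > 60               # advanced course goes to high scorers

y_c = ability + rng.normal(0, 5, n)  # end-of-year score under the standard course
y_t = y_c + 3.0                      # advanced course adds 3 points (true alpha)

y_obs = np.where(treated, y_t, y_c)

d = y_obs[treated].mean() - y_obs[~treated].mean()
selection_bias = y_c[treated].mean() - y_c[~treated].mean()
print(f"naive D = {d:.1f} = alpha (3.0) + selection bias ({selection_bias:.1f})")
```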

Illustration: impact of an advanced course in maths (cont.)

Alternative: compare the treated before (t-1) and after (t+1) treatment, i.e. before and after the advanced class:

D = E[Y_T(u) | T, t+1] - E[Y_T(u) | T, t-1]

Problems?
- Many other things might change: students grow older, smarter (trend issue)
- Hard to disentangle the impact of the advanced class from that of the regular class
- Grading before and after might not be equivalent…
=> Estimates likely to be biased

III. Evaluation methods: how to construct the counterfactual

1. Randomised social experiments
2. Controlling for observables: OLS, matching
3. Natural experiments: difference-in-differences, regression discontinuity design
4. Instrumental variable methodology
5. Other methods: selection models and structural estimation

III - 1. Randomised social experiments

Experiments solve the selection problem by randomly assigning units to treatment.

Because assignment to treatment is not based on any criterion related to the characteristics of the units, it is independent of potential outcomes:

E[Y_C(u) | C] = E[Y_C(u) | T] now holds

D = E[Y_T(u) | T] - E[Y_C(u) | C]
D = α = E[Y_T(u) - Y_C(u)] = causal effect

Convincing results. More and more randomised policies.
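Re-running the same hypothetical maths-course population with a coin-flip assignment instead of the score cutoff shows the selection term vanish (a sketch, not a real experiment):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10_000

ability = rng.normal(50, 10, n)
y_c = ability + rng.normal(0, 5, n)
y_t = y_c + 3.0                      # true alpha = 3

treated = rng.random(n) < 0.5        # random assignment, independent of ability

y_obs = np.where(treated, y_t, y_c)
d = y_obs[treated].mean() - y_obs[~treated].mean()
print(f"randomised D = {d:.2f}  (close to the true alpha of 3)")
```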

Reform to incapacity benefits: is the policy working?

[Figure: six-month off-flow rate from incapacity benefits around the October 2003 and April 2004 reform phases. Source: House of Commons Work and Pensions Select Committee Report (2006).]

Drawbacks to social experiments

Experiments are costly and therefore rare (less rare now)
- Cost and ethical reasons, feasibility

Threats to internal validity
- Non-response bias, non-random dropouts
- Substitution between treated and controls

Threats to external validity
- Limited duration
- Experiment specificity (region, timing…)
- Agents know they are observed
- General equilibrium effects

Threats to power
- Small samples

Non-experimental approaches

Aim at recovering randomisation, and thus recovering the missing counterfactual E[Y_C(u) | T].

This is done in different ways by different methods. Which one is most appropriate depends on the treatment being studied, the question of interest and the available data.

III - 2. Controlling for observables

1/ Regression analysis (OLS)

Y = a + b X1 + c X2 + d X3

Problems:
a/ Omitted variables lead to bias, and some variables may be unobservable
   Example: effect of education on earnings. Ability or preference for work is hardly observable.
b/ Explanatory variables might be endogenous
   Example: effect of unemployment benefit duration. When unemployment increases, policies tend to increase unemployment benefit duration.
=> Correlation is NOT causality!!
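A hedged illustration of problem a/ with plain numpy least squares (hypothetical education/earnings numbers): unobserved ability raises both schooling and earnings, so the short regression overstates the return to education.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 10_000

ability = rng.normal(0, 1, n)                       # unobservable to the analyst
education = 12 + 2 * ability + rng.normal(0, 1, n)  # correlated with ability
earnings = 1.0 * education + 3 * ability + rng.normal(0, 1, n)  # true b = 1.0

# Short regression omitting ability: earnings on [1, education]
X_short = np.column_stack([np.ones(n), education])
b_short = np.linalg.lstsq(X_short, earnings, rcond=None)[0]

# Long regression including ability (infeasible with real data)
X_long = np.column_stack([np.ones(n), education, ability])
b_long = np.linalg.lstsq(X_long, earnings, rcond=None)[0]

print(f"biased estimate of b: {b_short[1]:.2f}; with ability controlled: {b_long[1]:.2f}")
```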

III - 2. Controlling for observables

2/ Matching on observables

It is possible to compare groups that are similar with respect to the variables we observe (same education, income…). Uses all the observable information.

Take X to represent the observed characteristics of the units other than Y and D. Matching assumes that units with the same X are identical with respect to Y except possibly for the treatment status.

Formally, what is being assumed is E[Y_C | D=T, X] = E[Y_C | D=C, X]
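One simple way to act on this assumption is exact matching on a discrete X (a sketch on hypothetical data, not the only matching estimator): compare treated and control means within each X cell, then average the cell differences using treated counts as weights.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 10_000

x = rng.integers(0, 5, n)                  # discrete observable characteristic
treated = rng.random(n) < (0.2 + 0.1 * x)  # treatment probability rises with x
y_c = 2.0 * x + rng.normal(0, 1, n)        # outcome depends on x
y = np.where(treated, y_c + 1.5, y_c)      # true treatment effect = 1.5

# Exact matching: within-cell treated-minus-control differences,
# weighted by the number of treated units in each cell.
effects, weights = [], []
for cell in np.unique(x):
    t_mask = (x == cell) & treated
    c_mask = (x == cell) & ~treated
    if t_mask.any() and c_mask.any():
        effects.append(y[t_mask].mean() - y[c_mask].mean())
        weights.append(t_mask.sum())

att = np.average(effects, weights=weights)
print(f"matching estimate of the effect on the treated: {att:.2f}")
```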

Matching (cont.)

[Figure: outcomes between k and k+1 for treated units (T|D=T,X), matched controls (C|D=C,X) and the counterfactual (T|D=C,X); α_M(X) is the gap at k+1]

α_M(X) = Y_{D=T}(k+1, X) - Y_{D=C}(k+1, X)

Use the X characteristics to ensure that comparable units are being compared.

III - 3. Natural experiments

Exploit sudden changes or spatial variation in the rules governing behaviour.

Typically involve one group that is affected by the phenomenon (the treated) and another group that is not affected (the control). Observe how behaviour (the outcome of interest) changes compared to the change in the unaffected group.

- Difference-in-differences
- Regression discontinuity estimation

Difference-in-differences (DD)

Suppose a change in policy occurs at time k.

We observe agents affected by the policy change before and after it, say at times k-1 and k+1:
A = E[Y_T | t=k+1] - E[Y_T | t=k-1]

We also observe agents not affected by the policy change at the same time periods:
B = E[Y_C | t=k+1] - E[Y_C | t=k-1]

DD = A - B = true effect of the policy
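A minimal DD computation on simulated data (group effects, trend and policy effect are all hypothetical): the group fixed effects and the common time trend both cancel out of A - B.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 5_000

# Two groups observed at t = k-1 and t = k+1; only group T is treated after k.
def outcome(group_effect, t, treated_now):
    trend = 0.5 * t                        # common time effect across groups
    effect = 2.0 if treated_now else 0.0   # true policy effect = 2
    return group_effect + trend + effect + rng.normal(0, 1, n)

y_T_before = outcome(group_effect=3.0, t=-1, treated_now=False)
y_T_after  = outcome(group_effect=3.0, t=+1, treated_now=True)
y_C_before = outcome(group_effect=1.0, t=-1, treated_now=False)
y_C_after  = outcome(group_effect=1.0, t=+1, treated_now=False)

A = y_T_after.mean() - y_T_before.mean()   # change for the treated group
B = y_C_after.mean() - y_C_before.mean()   # change for the control group
print(f"DD = A - B = {A - B:.2f}  (true effect: 2.0)")
```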

Difference-in-differences

[Figure: outcome Y over time k-1, k, k+1; observed treated path (T|D=T), counterfactual treated path (T|D=C) and control path (C|D=C); α is the gap between the observed and counterfactual treated outcomes at k+1]

Use the fact that the difference between T and C remains fixed over time under the C regime.

Under certain conditions:
1. No composition changes within groups
2. Common time effects across groups

Checking the strategy:
- Check the DD strategy before the reform (placebo test on pre-reform periods)
- Use different control groups
- Use an outcome variable not affected by the reform

Has to be a careful study! Can take into account unobservable variables!

Difference-in-differences: differential trends

[Figure: time trends in the T and C groups differ, so the control-group change E[Y_C(k+1) - Y_C(k-1) | D=C] differs from the counterfactual treated change E[Y_C(k+1) - Y_C(k-1) | D=T], and DD mismeasures α]

Difference-in-differences: Ashenfelter's dip

[Figure: outcome paths from k-2 to k+1; the treated group's outcome dips just before enrolment]

The treated often experience particularly bad shocks before deciding to enrol into treatment. The DD assumption holds only in certain periods.

Diff-in-diff: Esther Duflo (AER 2001)

- Policy: school construction in Indonesia
- Regional difference: low and high programme intensity
- Children young enough to be affected = treated
- Children too old = control
- Estimate the impact of building schools on education
- Estimate the impact of education on earnings

Diff-in-diff: Esther Duflo (AER 2001)

[Results from Duflo (2001)]

Common problems with DD

Long-term versus reliability trade-off:
- Impact most reliable in the short term
- True impact might take time to appear

Heterogeneous behavioural responses:
- The average effect might hide high/low effects for certain groups

Local estimation:
- DD estimates are truly local and hard to generalise
- Need many estimations to establish a general causal effect

Regression discontinuity design (RD)

Use a discontinuity in the treatment rule.

Example: the New Deal for Young People in the UK
- Programme targeted at young unemployed aged 18 to 24
- Unemployed just older than 25 are in the control group
- Unemployed just younger than 25 are in the treated group
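A sketch of the RD logic on simulated data around the age-25 cutoff (the outcome, trend and jump sizes are all hypothetical): fit a local linear regression on each side of the cutoff and take the difference in intercepts at age 25.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 50_000

age = rng.uniform(18, 32, n)
treated = age < 25                     # programme covers 18-24 year olds

# The job-finding outcome drifts smoothly with age; the programme adds a jump.
y = 0.02 * age + 0.05 * treated + rng.normal(0, 0.1, n)

# Local linear RD: regress y on (age - 25) separately on each side
# within a bandwidth, then compare the intercepts at the cutoff.
h = 2.0
def intercept_at_cutoff(mask):
    X = np.column_stack([np.ones(mask.sum()), age[mask] - 25])
    return np.linalg.lstsq(X, y[mask], rcond=None)[0][0]

below = (age >= 25 - h) & (age < 25)   # just under 25: treated
above = (age >= 25) & (age <= 25 + h)  # just over 25: control
rd = intercept_at_cutoff(below) - intercept_at_cutoff(above)
print(f"RD estimate of the programme effect: {rd:.3f} (true jump: 0.05)")
```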

De Giorgi (2006): New Deal for Young People

[Figure from De Giorgi (2006)]

III - 4. Instrumental variables

Use the fact that a variable Z (the instrument) may be correlated with the endogenous variable X BUT not with the outcome Y, except through the variable X.

E.g.: the number of students per class is endogenous to test-score outcomes.

=> How to find a good instrument?
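A two-stage least squares sketch in numpy for the class-size example (all numbers hypothetical; the binary instrument merely stands in for an enrolment-cutoff rule in the spirit of Maimonides' rule): Z shifts class size but affects scores only through it.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 10_000

# Unobserved school quality drives both class size and scores (endogeneity).
quality = rng.normal(0, 1, n)
z = rng.integers(0, 2, n)              # instrument, e.g. an enrolment-cutoff rule
class_size = 30 - 5 * z + 2 * quality + rng.normal(0, 1, n)
score = -0.5 * class_size + 3 * quality + rng.normal(0, 1, n)  # true effect = -0.5

# OLS is biased because quality is omitted.
X = np.column_stack([np.ones(n), class_size])
ols = np.linalg.lstsq(X, score, rcond=None)[0][1]

# 2SLS: first stage predicts class size from Z; second stage uses the prediction.
Z = np.column_stack([np.ones(n), z])
class_size_hat = Z @ np.linalg.lstsq(Z, class_size, rcond=None)[0]
X_hat = np.column_stack([np.ones(n), class_size_hat])
iv = np.linalg.lstsq(X_hat, score, rcond=None)[0][1]

print(f"OLS: {ols:.2f}, 2SLS: {iv:.2f}, true effect: -0.50")
```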

Angrist and Lavy (QJE 1999): Maimonides' rule as an instrument for class size

[Figure from Angrist and Lavy (1999)]

III - 5. Other methods

- Matching combined with diff-in-diff
- Selection estimators
- Structural estimation

There is a trade-off between the reliability of the causal inference (identification) and the generalisability of the results.

Conclusion

Policy evaluation is crucial:
- For conducting efficient policies
- For improving scientific knowledge

Correlation is not causality. Beware of selection effects and endogenous variables!

Methods to draw causal inference are available => they need careful analysis!