Impact Evaluation Sebastian Galiani November 2006 Causal Inference.


Motivation
• The research questions that motivate most studies in the health sciences are causal in nature. For example:
  - What is the efficacy of a given drug in a given population?
  - What fraction of deaths from a given disease could have been avoided by a given treatment or policy?

Motivation
• The most challenging empirical questions in economics also involve cause-and-effect relationships:
  - Does school decentralization improve school quality?

Motivation
• Interest in these questions is motivated by:
  - Policy concerns. For example: does privatization of water systems improve child health?
  - Theoretical considerations
  - Problems facing individual decision makers

Causal Analysis
• The aim of standard statistical analysis, typified by likelihood and other estimation techniques, is to infer parameters of a distribution from samples drawn from that distribution.
• With the help of such parameters, one can:
  1. Infer associations among variables,
  2. Estimate the likelihood of past and future events, and
  3. Update the likelihood of events in light of new evidence or new measurements.

Causal Analysis
• These tasks are managed well by standard statistical analysis as long as experimental conditions remain the same.
• Causal analysis goes one step further: its aim is to infer aspects of the data-generating process. With the help of such aspects, one can deduce not only the likelihood of events under static conditions, but also the dynamics of events under changing conditions.

Causal Analysis
• This capability includes:
  1. Predicting the effects of interventions
  2. Predicting the effects of spontaneous changes
  3. Identifying causes of reported events
• This distinction implies that causal and associational concepts do not mix.

Causal Analysis
• The word "cause" is not in the vocabulary of standard probability theory.
• All that probability theory allows us to say is that two events are mutually correlated, or dependent, meaning that if we find one, we can expect to encounter the other.
• Scientists seeking causal explanations for complex phenomena or rationales for policy decisions must therefore supplement the language of probability with a vocabulary for causality.

Causal Analysis
• Two languages for causality have been proposed:
  1. Structural equation modeling (SEM) (Haavelmo 1943).
  2. The Neyman-Rubin potential outcome model (RCM) (Neyman, 1923; Rubin, 1974).

The Rubin Causal Model
• Define the population by U. Each unit in U is denoted by u.
• For each u ∈ U, there is an associated value Y(u) of the variable of interest Y, which we call the response variable.
• Let A be a second variable defined on U. We call A an attribute of the units in U.

• The key notion is the potential for exposing or not exposing each unit to the action of a cause:
  - Each unit has to be potentially exposable to any one of the causes.
• Thus, Rubin takes the position that causes are only those things that could be treatments in hypothetical experiments.
• An attribute cannot be a cause in an experiment, because the notion of potential exposability does not apply to it.

• For simplicity, we assume that there are just two causes or levels of treatment, denoted t (the treatment) and c (the control).
• Let D be a variable that indicates the cause to which each unit in U is exposed:
  - In a controlled study, D is constructed by the experimenter.
  - In an uncontrolled study, it is determined by factors beyond the experimenter's control.

• The values of Y are potentially affected by the particular cause, t or c, to which the unit is exposed.
• Thus, we need two response variables: Y_t(u) and Y_c(u).
• Y_t(u) is the value of the response that would be observed if the unit were exposed to t, and
• Y_c(u) is the value that would be observed on the same unit if it were exposed to c.

• Let D also be expressed as a binary variable: D = 1 if the unit is exposed to t and D = 0 if it is exposed to c. Correspondingly, write Y_1 for Y_t and Y_0 for Y_c.
• Then, the observed outcome of each unit can be written as: Y(u) = D Y_1(u) + (1 - D) Y_0(u)

• Definition: For every unit u, treatment {D_u = 1 instead of D_u = 0} causes the effect δ_u = Y_1(u) - Y_0(u).
• This definition of a causal effect assumes that the treatment status of one individual does not affect the potential outcomes of other individuals.
• Fundamental Problem of Causal Inference: It is impossible to observe the values of both Y_1(u) and Y_0(u) on the same unit and, therefore, it is impossible to observe the effect of t on u.
• Another way to express this problem is to say that we cannot infer the effect of treatment because we do not have the counterfactual evidence, i.e. what would have happened in the absence of treatment.
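The fundamental problem can be made concrete with a small simulation. The sketch below is illustrative only: the distributions, the sample size, and the names `simulate_units` and `observe` are assumptions introduced here, not part of the slides. Both potential outcomes exist for every simulated unit, but the observed record contains only D and Y = D Y_1 + (1 - D) Y_0.

```python
import random


def simulate_units(n, seed=0):
    """Draw hypothetical potential outcomes (Y_0, Y_1) and a treatment indicator D."""
    rng = random.Random(seed)
    units = []
    for _ in range(n):
        y0 = rng.gauss(10.0, 2.0)        # response if the unit is exposed to c
        y1 = y0 + rng.gauss(1.0, 0.5)    # response if the unit is exposed to t
        d = rng.randint(0, 1)            # cause the unit is actually exposed to
        units.append({"y0": y0, "y1": y1, "d": d})
    return units


def observe(unit):
    """Return what the analyst sees: D and Y = D*Y_1 + (1 - D)*Y_0."""
    d = unit["d"]
    y = d * unit["y1"] + (1 - d) * unit["y0"]
    return {"d": d, "y": y}


units = simulate_units(5)
for u in units:
    seen = observe(u)
    # The unit-level effect is available only because both outcomes were simulated.
    print(seen["d"], round(seen["y"], 2), "unobservable effect:", round(u["y1"] - u["y0"], 2))
```

In real data only the output of `observe` is available, so the unit-level difference Y_1(u) - Y_0(u) can never be computed directly.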

• Given that the causal effect for a single unit u cannot be observed, we aim to identify the average causal effect for the entire population or for sub-populations.
• The average treatment effect (ATE) of t (relative to c) over U (or any sub-population) is given by:
  ATE = E[Y_1(u) - Y_0(u)] = E[Y_1(u)] - E[Y_0(u)]   (1)

• The statistical solution replaces the impossible-to-observe causal effect of t on a specific unit with the possible-to-estimate average causal effect of t over a population of units.
• Although E(Y_1) and E(Y_0) cannot both be calculated, they can be estimated.
• Most econometric methods attempt to construct from observational data consistent estimates of E(Y_1) and E(Y_0).

• Consider the following simple estimator of ATE:
• Note that equation (1) is defined for the whole population, whereas equation (2) represents an estimator to be evaluated on a sample drawn from that population.
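The estimator labeled (2) is presumably the difference in mean observed outcomes between treated and control units, which is what the later slides call the standard estimator. Under that assumption, and writing n_1 and n_0 (names introduced here) for the number of sampled units with D = 1 and D = 0, it can be sketched as:

```latex
\widehat{ATE} \;=\; \frac{1}{n_1}\sum_{i:\,D_i=1} Y_i \;-\; \frac{1}{n_0}\sum_{i:\,D_i=0} Y_i \tag{2}
```

Its population counterpart is E[Y_1 | D = 1] - E[Y_0 | D = 0], since Y = Y_1 for treated units and Y = Y_0 for control units.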

• Let π equal the proportion of the population that would be assigned to the treatment group.
• Decomposing ATE, we have:
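A decomposition consistent with equation (1) follows from the law of total expectation. The sketch below assumes that π denotes P(D = 1), the treated share of the population:

```latex
\begin{aligned}
ATE &= \pi\,E[Y_1 - Y_0 \mid D=1] + (1-\pi)\,E[Y_1 - Y_0 \mid D=0] \\
    &= \bigl\{\pi\,E[Y_1 \mid D=1] + (1-\pi)\,E[Y_1 \mid D=0]\bigr\}
     - \bigl\{\pi\,E[Y_0 \mid D=1] + (1-\pi)\,E[Y_0 \mid D=0]\bigr\} \\
    &= E[Y_1] - E[Y_0]
\end{aligned}
```

Of the four conditional means, only E[Y_1 | D = 1] and E[Y_0 | D = 0] correspond to outcomes that are actually observed.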

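• If we assume that the average outcome under treatment and the average outcome under control are each the same in the treatment and control groups, the ATE reduces to E[Y_1 | D = 1] - E[Y_0 | D = 0], which is consistently estimated by its sample analog estimator.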

• Thus, a sufficient condition for the standard estimator to consistently estimate the true ATE is that: E[Y_1 | D = 1] = E[Y_1 | D = 0] and E[Y_0 | D = 1] = E[Y_0 | D = 0].
• In this situation, the average outcome under the treatment and the average outcome under the control do not differ between the treatment and control groups.
• In order to satisfy these conditions, it is sufficient that treatment assignment D be uncorrelated with the potential outcome distributions of Y_1 and Y_0.
• The principal way to achieve this uncorrelatedness is through random assignment of treatment.
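The role of random assignment can be illustrated with a simulation. This is a sketch under invented assumptions: the data-generating process, the latent variable `ability`, and the function names are hypothetical and chosen only so that self-selection is correlated with the potential outcomes.

```python
import random
from statistics import mean


def make_population(n, seed=0):
    """Hypothetical population of (ability, Y_0, Y_1); the true ATE is close to 1."""
    rng = random.Random(seed)
    pop = []
    for _ in range(n):
        ability = rng.gauss(0.0, 1.0)
        y0 = 10.0 + 2.0 * ability + rng.gauss(0.0, 1.0)
        y1 = y0 + 1.0 + 0.5 * ability    # heterogeneous unit-level effect
        pop.append((ability, y0, y1))
    return pop


def standard_estimator(pop, assign):
    """Mean observed outcome of the treated minus mean observed outcome of the controls."""
    treated = [y1 for (_, _, y1), d in zip(pop, assign) if d == 1]
    control = [y0 for (_, y0, _), d in zip(pop, assign) if d == 0]
    return mean(treated) - mean(control)


pop = make_population(20000)
true_ate = mean(y1 - y0 for _, y0, y1 in pop)

rng = random.Random(1)
random_assign = [rng.randint(0, 1) for _ in pop]           # D independent of (Y_0, Y_1)
self_selected = [1 if a > 0 else 0 for a, _, _ in pop]     # D depends on ability

est_random = standard_estimator(pop, random_assign)
est_selected = standard_estimator(pop, self_selected)
print(round(true_ate, 2), round(est_random, 2), round(est_selected, 2))
```

With random assignment the estimate should land close to the true ATE, while the self-selected assignment overstates it because high-ability units both select into treatment and have higher outcomes under either cause.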

• In most circumstances, there is simply no information available on how those in the control group would have reacted if they had received the treatment instead.
• This is the basis for an important insight into the potential biases of the standard estimator (2).
• After a bit of algebra, it can be shown that:
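The result referred to here is presumably the standard bias decomposition of the difference in means. A reconstruction in the notation of equation (1), assuming π = P(D = 1) and taking the population counterpart of estimator (2) to be E[Y_1 | D = 1] - E[Y_0 | D = 0]:

```latex
\begin{aligned}
E[Y_1 \mid D=1] - E[Y_0 \mid D=0]
  &= ATE \\
  &\quad + \bigl(E[Y_0 \mid D=1] - E[Y_0 \mid D=0]\bigr) \\
  &\quad + (1-\pi)\,\bigl(E[Y_1 - Y_0 \mid D=1] - E[Y_1 - Y_0 \mid D=0]\bigr)
\end{aligned}
```

The second line is the baseline difference between the groups and the third is the difference in average effects between them. The identity can be checked by substituting ATE = π E[Y_1 - Y_0 | D = 1] + (1 - π) E[Y_1 - Y_0 | D = 0].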

• This equation specifies the two sources of bias that need to be eliminated from estimates of causal effects from observational studies:
  1. Selection bias: baseline difference.
  2. Treatment heterogeneity.
• Most of the available methods deal only with selection bias, either by simply assuming that the treatment effect is constant in the population or by redefining the parameter of interest in the population.
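The two bias terms can be computed directly in a simulated population where both potential outcomes are known. The sketch below is illustrative: the data-generating process and the names `simulate` and `bias_decomposition` are assumptions made here, and the decomposition used is the usual one (naive contrast = ATE + baseline difference + (1 - π) times the difference in average effects between groups).

```python
import random
from statistics import mean


def simulate(n, seed=0):
    """Hypothetical population of (D, Y_0, Y_1) with self-selection into treatment."""
    rng = random.Random(seed)
    pop = []
    for _ in range(n):
        ability = rng.gauss(0.0, 1.0)
        y0 = 10.0 + 2.0 * ability + rng.gauss(0.0, 1.0)
        y1 = y0 + 1.0 + 0.5 * ability
        d = 1 if ability + rng.gauss(0.0, 1.0) > 0 else 0   # selection on ability
        pop.append((d, y0, y1))
    return pop


def bias_decomposition(pop):
    """Split the naive treated-minus-control contrast into ATE and two bias terms."""
    treated = [(y0, y1) for d, y0, y1 in pop if d == 1]
    control = [(y0, y1) for d, y0, y1 in pop if d == 0]
    pi = len(treated) / len(pop)
    ate = mean(y1 - y0 for _, y0, y1 in pop)
    naive = mean(y1 for _, y1 in treated) - mean(y0 for y0, _ in control)
    selection = mean(y0 for y0, _ in treated) - mean(y0 for y0, _ in control)
    heterogeneity = (1 - pi) * (
        mean(y1 - y0 for y0, y1 in treated) - mean(y1 - y0 for y0, y1 in control)
    )
    return {"naive": naive, "ate": ate, "selection": selection, "heterogeneity": heterogeneity}


terms = bias_decomposition(simulate(5000))
print({k: round(v, 3) for k, v in terms.items()})
```

In this hypothetical population both bias terms are positive, so the naive contrast exceeds the ATE. With real data neither term is computable, because each involves an unobserved potential outcome.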

Treatment on the Treated
• ATE is not always the parameter of interest.
• In a variety of policy contexts, it is the average treatment effect for the treated that is of substantive interest:
  TOT = E[Y_1(u) - Y_0(u) | D = 1] = E[Y_1(u) | D = 1] - E[Y_0(u) | D = 1]

Treatment on the Treated
• The standard estimator (2) consistently estimates TOT if:
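The condition referred to here can be derived from the definition of TOT. Assuming the standard estimator converges to E[Y_1 | D = 1] - E[Y_0 | D = 0], adding and subtracting E[Y_0 | D = 1] gives:

```latex
\begin{aligned}
E[Y_1 \mid D=1] - E[Y_0 \mid D=0]
  &= TOT + \bigl(E[Y_0 \mid D=1] - E[Y_0 \mid D=0]\bigr), \\
\text{so the estimator is consistent for } TOT \text{ if}\quad
E[Y_0 \mid D=1] &= E[Y_0 \mid D=0].
\end{aligned}
```

This is weaker than the condition for ATE: only the no-treatment outcome has to be comparable across groups, and nothing is required of E[Y_1 | D = 0].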

Structural Equation Modeling
• Structural equation modeling was originally developed by geneticists (Wright 1921) and economists (Haavelmo 1943).

Structural Equations
• Definition: An equation y = β x + ε (8) is said to be structural if it is to be interpreted as follows: in an ideal experiment where we control X to x and any other set Z of variables (not containing X or Y) to z, the value y of Y is given by β x + ε, where ε is not a function of the settings x and z.
• This definition is in the spirit of Haavelmo (1943), who explicitly interpreted each structural equation as a statement about a hypothetical controlled experiment.

• Thus, to the often asked question, "Under what conditions can we give a causal interpretation to structural coefficients?", Haavelmo would have answered: Always!
• According to the founding father of SEM, the conditions that make the equation y = β x + ε structural are precisely those that make the causal connection between X and Y have no other value but β, and that ensure that nothing about the statistical relationship between x and ε can ever change this interpretation of β.

• The average causal effect: The average causal effect on Y of treatment level x is the difference in the conditional expectations: E(Y | X = x) - E(Y | X = 0)
• In the context of dichotomous interventions (x = 1), this causal effect is called the average treatment effect (ATE).

Representing Interventions
• Consider the structural model M:
  z = f_z(w)
  x = f_x(z, ν)
  y = f_y(x, u)
• We represent an intervention in the model through a mathematical operator denoted do(x).
• do(x) simulates physical interventions by deleting certain functions from the model and replacing them by a constant X = x, while keeping the rest of the model unchanged.

• To emulate an intervention do(x_0) that holds X constant (at X = x_0) in model M, replace the equation for x with x = x_0, and obtain a new model, M_{x0}:
  z = f_z(w)
  x = x_0
  y = f_y(x, u)
• The joint distribution associated with the modified model, denoted P(z, y | do(x_0)), describes the post-intervention ("experimental") distribution.
• From this distribution, one is able to assess treatment efficacy by comparing aspects of this distribution at different levels of x_0.
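The replacement of one equation by a constant can be sketched in code. Everything specific below is an assumption made for illustration: the slides leave f_z, f_x and f_y unspecified, so the functional forms, the distributions of the exogenous terms, the dependence between u and w, and the helper name `solve` are all hypothetical.

```python
import random
from statistics import mean


# Hypothetical functional forms standing in for f_z, f_x and f_y.
def f_z(w):
    return 2.0 * w


def f_x(z, v):
    return 1.0 if z + v > 0 else 0.0


def f_y(x, u):
    return 3.0 * x + u


def solve(w, v, u, do_x=None):
    """Solve model M, or the modified model M_x0 when do_x is given."""
    z = f_z(w)
    x = f_x(z, v) if do_x is None else do_x   # do(x0) replaces the equation for x
    y = f_y(x, u)                             # the rest of the model is unchanged
    return z, x, y


rng = random.Random(0)
draws = []
for _ in range(10000):
    w = rng.gauss(0.0, 1.0)
    v = rng.gauss(0.0, 1.0)
    u = w + rng.gauss(0.0, 1.0)   # assumed dependence between u and w (confounding)
    draws.append((w, v, u))

# Post-intervention contrast: E[Y | do(x=1)] - E[Y | do(x=0)]
effect_do = (mean(solve(w, v, u, do_x=1.0)[2] for w, v, u in draws)
             - mean(solve(w, v, u, do_x=0.0)[2] for w, v, u in draws))

# Observational contrast: E[Y | X=1] - E[Y | X=0]
observed = [solve(w, v, u) for w, v, u in draws]
effect_obs = (mean(y for _, x, y in observed if x == 1.0)
              - mean(y for _, x, y in observed if x == 0.0))
print(round(effect_do, 3), round(effect_obs, 3))
```

Under these assumptions the post-intervention contrast recovers the coefficient on x in `f_y`, while the observational contrast is larger because units with high w tend to have both X = 1 and high u.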

Structural Parameters
• Definition: The interpretation of a structural equation as a statement about the behavior of Y under a hypothetical intervention yields a simple definition for the structural parameters. The meaning of β in the equation y = β x + ε is simply
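The expression that completes this definition is presumably the one given in Pearl (2000, chapter 5), where β is the rate of change of the post-intervention expectation of Y with respect to the level x at which X is held:

```latex
\beta \;=\; \frac{\partial}{\partial x}\, E\bigl[\,Y \mid do(x)\,\bigr]
```

For the linear equation y = β x + ε, with ε not a function of the setting x, this follows because E[Y | do(x)] = β x + E[ε].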

Counterfactual Analysis in Structural Models
• Consider again model M_{x0}. Call the solution for Y the potential response of Y to x_0.
• We denote it as Y_{x0}(u, ν, w).
• This entity can be given a counterfactual interpretation, for it stands for the way an individual with characteristics (u, ν, w) would respond had the treatment been x_0, rather than the x = f_x(z, ν) actually received by the individual.

• In our example, Y_{x0}(u, ν, w) = Y_{x0}(u) = y = f_y(x_0, u).
• This interpretation of counterfactuals, cast as solutions to modified systems of equations, provides the conceptual and formal link between structural equation modeling and the Rubin potential-outcome framework. It assures us that the end results of the two approaches will be the same. Thus, the choice of model is strictly a matter of convenience or insight.

References
• Judea Pearl (2000): Causality: Models, Reasoning and Inference, CUP. Chapters 1, 5 and 7.
• Trygve Haavelmo (1944): "The probability approach in econometrics", Econometrica 12, pp. iii-vi.
• Arthur Goldberger (1972): "Structural Equation Methods in the Social Sciences", Econometrica 40.
• Donald B. Rubin (1974): "Estimating causal effects of treatments in randomized and nonrandomized experiments", Journal of Educational Psychology 66.
• Paul W. Holland (1986): "Statistics and Causal Inference", Journal of the American Statistical Association 81, with discussion.