A stochastic dominance approach to program evaluation

Presentation transcript:

A stochastic dominance approach to program evaluation
And an application to child nutritional status in arid and semi-arid Kenya
- A theoretical paper: program evaluation vs. stochastic dominance
- Empirical application: ALRMP evaluation project
- AAEA paper: comments, please
Felix Naschold, University of Wyoming
Christopher B. Barrett, Cornell University
May 2012 seminar presentation, University of Sydney

Motivation
Program evaluation methods
- By design, they focus on the mean. Ex: “average treatment effect” (ATE)
- In practice, we are often interested in broader distributional impact
- Limited possibility for doing this by splitting the sample
Stochastic dominance
- By design, looks at the entire distribution
- Now commonly used in snapshot welfare comparisons
- But not for program evaluation (e.g., “differences-in-differences”)
This paper merges the two: Diff-in-Diff (DD) evaluation using stochastic dominance (SD) to compare changes in distributions over time between intervention and control populations
As much as this is true for many parts of the world in terms of MDGs, it is also true for villages in rural India. What do we do about this as researchers? One aspect: who moves in and out of poverty? Characteristics and targeting?

Main Contributions
- Proposes DD-based SD method for program evaluation
- First application to evaluating welfare changes over time
- Specific application to new dataset on changes in child nutrition in arid and semi-arid lands (ASAL) of Kenya
- Unique, large dataset of 600,000+ observations collected by the Arid Lands Resource Management Project (ALRMP II) in Kenya
- (One of the) first to use Z-scores of mid-upper arm circumference (MUAC)

Main Results
Methodology
- (Relatively) straightforward extension of SD to dynamic context: static SD results carry over
- Interpretation differs (as based on cdfs)
- Only feasible up to second order SD
Empirical results
- Child malnutrition in Kenyan ASALs remains dire
- No average treatment effect of ALRMP expenditures
- Differential impact, with fewer negative changes in treatment sublocations
- ALRMP a nutritional safety net?

Program evaluation (PE) methods
Fundamental problem of PE: we want to, but cannot, observe a person’s outcomes in both the treatment and the control state
- Solution 1: make treatment and control look the same (randomization). Gives the average treatment effect as [equation not preserved in the transcript]
- Solution 2: compare changes across treatment and control (Difference-in-Difference). Gives the average treatment effect as [equation not preserved in the transcript]
To next slide: So what is the key drawback of standard PE methods?
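For reference, the two estimators are usually written as below in standard textbook notation. This is a sketch, not the slide's own equations: the symbols are assumptions, with y the outcome, D = 1 for treatment and D = 0 for control, and subscripts 0 and 1 for the periods before and after the intervention.

```latex
\begin{align*}
\text{Randomization:}\quad
  \mathrm{ATE} &= E[\,y \mid D = 1\,] - E[\,y \mid D = 0\,] \\
\text{Difference-in-Difference:}\quad
  \mathrm{DD} &= \bigl(E[\,y_{1} \mid D = 1\,] - E[\,y_{0} \mid D = 1\,]\bigr)
               - \bigl(E[\,y_{1} \mid D = 0\,] - E[\,y_{0} \mid D = 0\,]\bigr)
\end{align*}
```

Both expressions are differences of means, which is why neither says anything about how impact is spread across the distribution.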

New PE method based on SD
Objective: to look beyond the ‘average treatment effect’
Approach: SD compares entire distributions, not just their summary statistics
Two advantages
- Circumvents the (highly controversial) cut-off point. Examples: poverty line, MUAC Z-score cut-off
- Unifies analysis for broad classes of welfare indicators

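Stochastic Dominance
First order: A FOD B up to xmax iff [equation not preserved in the transcript]
Sth order: A sth order dominates B iff [equation not preserved in the transcript]
[Figure: cdfs FA(x) and FB(x); horizontal axis: MUAC Z-score, up to xmax; vertical axis: cumulative % of population]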
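The textbook form of these criteria, written with the figure's symbols, is sketched below. The lower bound x_min of the outcome's support and the superscript notation for integrated cdfs are assumptions, and the slide's own equations may have differed in detail.

```latex
\begin{align*}
\text{First order:}\quad
  & F_A(x) \le F_B(x) \quad \forall\, x \in [x_{\min}, x_{\max}] \\
\text{$s$th order:}\quad
  & F_A^{(s)}(x) \le F_B^{(s)}(x) \quad \forall\, x \in [x_{\min}, x_{\max}], \\
  & \text{where } F^{(1)}(x) = F(x), \qquad
    F^{(s)}(x) = \int_{x_{\min}}^{x} F^{(s-1)}(u)\, du
\end{align*}
```

With a welfare indicator such as a MUAC Z-score, a lower cdf means a smaller share of the population at or below each value, so A dominating B means A has less malnutrition at every cut-off up to x_max.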

SD and single differences
These SD dominance criteria
- Apply directly to single difference evaluation (across time OR across treatment and control groups)
- Do not directly apply to DD
Literature to date:
- Single paper: Verme (2010) on single differences
- SD entirely absent from the program evaluation literature (e.g., Handbook of Development Economics)
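To illustrate the single-difference case, the sketch below checks first- and second-order dominance between two samples (for example, intervention and control Z-scores in one period) using empirical cdfs. It assumes the standard criteria: the cdf of A lies at or below the cdf of B everywhere up to x_max (first order), or the same inequality holds for the integrated cdfs (second order). Function names and data are hypothetical, and no test of statistical significance is attempted.

```python
from bisect import bisect_right

TOL = 1e-12  # tolerance for floating-point comparisons


def ecdf_at(sorted_sample, x):
    """Share of observations less than or equal to x."""
    return bisect_right(sorted_sample, x) / len(sorted_sample)


def integrated_ecdf_at(sample, x):
    """Integral of the empirical cdf up to x: the mean of max(x - x_i, 0)."""
    return sum(max(x - xi, 0.0) for xi in sample) / len(sample)


def _grid(a, b, x_max):
    # Empirical cdfs jump, and their integrals kink, only at observed values,
    # so checking those values plus x_max covers the whole range up to x_max.
    return sorted({x for x in list(a) + list(b) if x <= x_max} | {x_max})


def first_order_dominates(a, b, x_max):
    """True if F_A <= F_B everywhere up to x_max and F_A < F_B somewhere."""
    sa, sb = sorted(a), sorted(b)
    diffs = [ecdf_at(sa, x) - ecdf_at(sb, x) for x in _grid(a, b, x_max)]
    return all(d <= TOL for d in diffs) and any(d < -TOL for d in diffs)


def second_order_dominates(a, b, x_max):
    """Same check applied to the integrated empirical cdfs."""
    diffs = [integrated_ecdf_at(a, x) - integrated_ecdf_at(b, x)
             for x in _grid(a, b, x_max)]
    return all(d <= TOL for d in diffs) and any(d < -TOL for d in diffs)
```

Two samples with the same mean, such as `[-1.0, -1.0]` and `[-2.0, 0.0]`, show why higher orders matter: neither first-order dominates the other, but the less dispersed one dominates at second order.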

Expanding SD to DD estimation - Method
Practical importance: evaluate beyond-mean effects in non-experimental data
Let [definition of Δ not preserved in the transcript], and let G denote the set of probability density functions of Δ, with g_A(Δ), g_B(Δ) ∈ G
The respective cdfs of changes are G_A(Δ) and G_B(Δ)
Then A FOD B iff [equation not preserved in the transcript]
A sth order dominates B iff [equation not preserved in the transcript]
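A sketch of how these criteria would read if the static definitions carry over unchanged to cdfs of changes. The definition of Δ as the change in the outcome between two periods, the bounds Δ_min and Δ_max, and the superscript notation are assumptions made for illustration, not the slide's recovered equations.

```latex
\begin{align*}
\Delta &= y_{t} - y_{t-1} \\
\text{First order:}\quad
  & G_A(\Delta) \le G_B(\Delta) \quad \forall\, \Delta \in [\Delta_{\min}, \Delta_{\max}] \\
\text{$s$th order:}\quad
  & G_A^{(s)}(\Delta) \le G_B^{(s)}(\Delta) \quad \forall\, \Delta \in [\Delta_{\min}, \Delta_{\max}], \\
  & \text{where } G^{(1)} = G, \qquad
    G^{(s)}(\Delta) = \int_{\Delta_{\min}}^{\Delta} G^{(s-1)}(u)\, du
\end{align*}
```

Under this reading, A dominating B means that group A has a smaller share of units with changes at or below each value of Δ, that is, fewer large deteriorations.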

Expanding SD to DD: interpretation differences
1. Cut-off point is in terms of changes, not levels. The cdf orders changes from most negative to most positive, so it is ‘initial poverty blind’ or ‘initial malnutrition blind’. (Partial) remedy: run on the subset of ever-poor/always-poor
2. Interpretation of dominance orders
- FOD: differences in distributions of changes between intervention and control sublocations
- SOD: degree of concentration of these changes at lower end of distributions
- TOD: additional weight to lower end of distribution. Is there any value to doing this for welfare changes irrespective of absolute welfare? Probably not.

Setting and data
Arid and semi-arid districts in Kenya
- Characterized by pastoralism
- Highest poverty incidences in Kenya, high infant mortality, and malnutrition levels above emergency thresholds
Data
- From Arid Lands Resource Management Project (ALRMP) Phase II
- 28 districts, 128 sublocations, June 05 - Aug 09, 602,000 child obs.
- Welfare indicator: MUAC Z-scores
Severe malnutrition in 2005/6:
- Median child MUAC Z-score -1.22/-1.12 (Intervention/Control)
- 10 percent of children had Z-scores below -2.31/-2.14 (I/C)
- 25 percent of children had Z-scores below -1.80/-1.67 (I/C)

The pseudo panel
Sublocation-specific pseudo panel, 2005/06-2008/09
Why pseudo-panel?
- Inconsistent child identifiers
- MUAC data not available for all children in all months
- Graduation out of and birth into the sample
How?
- 14 summary statistics for annual mean monthly sublocation-specific stats: mean & percentiles and ‘poverty measures’
- Focus on malnourished children
- Thus, the present analysis uses the median MUAC Z-score of children with z ≤ 0
- Control and intervention according to project investment
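The construction described above can be sketched as follows, under stated assumptions: records are hypothetical `(sublocation, period, month, z_score)` tuples, the monthly statistic is the median Z-score among children with z ≤ 0, the annual value is the mean of those monthly medians, and the pseudo-panel outcome is the change between the first and last period. The field layout and function name are illustrative and not taken from the ALRMP data.

```python
from collections import defaultdict
from statistics import mean, median


def pseudo_panel_changes(records, threshold=0.0,
                         first="2005/06", last="2008/09"):
    """Return {sublocation: change in annual mean of monthly medians}."""
    # Collect Z-scores of malnourished children per sublocation-period-month.
    monthly = defaultdict(list)
    for subloc, period, month, z in records:
        if z <= threshold:
            monthly[(subloc, period, month)].append(z)

    # Monthly medians, grouped by sublocation and period.
    annual = defaultdict(list)
    for (subloc, period, _month), zs in monthly.items():
        annual[(subloc, period)].append(median(zs))

    # Annual level: mean of the monthly medians.
    levels = {key: mean(vals) for key, vals in annual.items()}

    # Keep only sublocations observed in both periods.
    changes = {}
    for subloc, period in levels:
        if period == first and (subloc, last) in levels:
            changes[subloc] = levels[(subloc, last)] - levels[(subloc, first)]
    return changes
```

Working with sublocation-level statistics sidesteps the inconsistent child identifiers, at the cost of reducing hundreds of thousands of child records to roughly one observation per sublocation.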

Results: DD Regression
Pseudo panel regression model [equation not preserved in the transcript], where
- D is the intervention dummy variable of interest
- NDVI is a control for agrometeorological conditions
- L are district fixed effects to control for unobservables within major jurisdictions
No statistically significant average program impact
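A plausible form of this model, inferred from the regressors that appear in the results table (intervention dummy, change in NDVI, its square, a constant, and district dummies). The coefficient symbols and the index j for sublocations are assumptions; the slide's own equation is not available.

```latex
\begin{equation*}
\Delta y_{j} = \alpha + \beta D_{j}
  + \gamma_{1}\, \Delta \mathrm{NDVI}_{j}
  + \gamma_{2}\, \bigl(\Delta \mathrm{NDVI}_{j}\bigr)^{2}
  + L_{d(j)} + \varepsilon_{j}
\end{equation*}
```

Here Δy_j would be the change in a sublocation-level MUAC Z-score statistic between 2005/06 and 2008/09, L_{d(j)} the fixed effect of the district containing sublocation j, and β the average program impact that the slide reports as statistically insignificant.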

DD regression panel results

| VARIABLES | (1) median of MUAC Z <0 | (2) 10th percentile | (3) 25th percentile | (4) median of MUAC Z <-1 | (5) median of MUAC Z <-2 |
| --- | --- | --- | --- | --- | --- |
| intervention dummy | 0.0735 (0.248) | 0.0832 (0.316) | 0.0661 (0.371) | 0.0793 (0.188) | 0.0531 (0.155) |
| change in NDVI 2005/06-08/09 | 1.308* (0.0545) | 2.611*** (0.00294) | 2.058*** (0.00754) | 0.927* (0.0997) | 0.768* (0.0767) |
| (change in NDVI)^2 2005/06-08/09 | -12.91** (0.0293) | -8.672 (0.136) | -12.70* (0.0510) | -0.954 (0.802) | 1.924 (0.479) |
| Constant | 0.501*** (2.99e-07) | 0.892*** (1.40e-08) | 0.839*** (8.70e-09) | 0.203*** (0.000133) | 0.120*** (0.00114) |
| R-squared | 0.319 | 0.299 | 0.297 | 0.249 | 0.280 |

Observations: 114, 106 (per-column values not preserved in the transcript)
Robust p-values in parentheses. *** p<0.01, ** p<0.05, * p<0.1. District dummy variables included.
Intervention dummy not significant. Few observations. What do you think of the fit? Not bad given just location dummies and NDVI?

SD Results
Three steps:
- Steps 1 & 2: Simple differences
- SD within control and treatment over time: no difference in trends. Both improved slightly.
- SD control vs. treatment at beginning and at end: control sublocations dominate in most cases, intervention never dominates.
- Step 3: SD on Diff-in-Diff (results focus for today)

Expanding SD to DD – controlling for covariates
In regression Diff-in-Diff: simply add (linear) controls
In SD-DD, a two-step method is needed
1. Regress outcome variable on covariates
2. Use residuals (the unexplained variation) in SD-DD
In the application below, use first-stage controls for agro-meteorological conditions (as reflected in a remotely sensed vegetation measure, NDVI).
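The two steps can be sketched as follows, under simplifying assumptions: a single covariate (the change in NDVI) with a plain OLS first stage, and a first-order comparison of the empirical cdfs of residual changes in the second stage. All names are hypothetical, and the sketch ignores district effects and statistical inference.

```python
from bisect import bisect_right
from statistics import mean


def ols_residuals(y, x):
    """Step 1: residuals from a one-covariate OLS fit y = a + b * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def cdf_gaps(a, b, cutoff):
    """G_A - G_B at every observed value up to the cutoff."""
    sa, sb = sorted(a), sorted(b)
    grid = sorted(v for v in set(sa) | set(sb) if v <= cutoff)
    return [bisect_right(sa, v) / len(sa) - bisect_right(sb, v) / len(sb)
            for v in grid]


def sd_dd_first_order(delta_y, delta_ndvi, treated, cutoff):
    """Step 2: does the intervention group first-order dominate control
    in the distribution of covariate-adjusted changes, up to the cutoff?"""
    resid = ols_residuals(delta_y, delta_ndvi)
    a = [r for r, t in zip(resid, treated) if t]
    b = [r for r, t in zip(resid, treated) if not t]
    gaps = cdf_gaps(a, b, cutoff)
    return bool(gaps) and all(g <= 0 for g in gaps) and any(g < 0 for g in gaps)
```

Residualizing first lets the dominance comparison run on variation not explained by vegetation conditions, which plays the role that linear controls play in the regression version.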

For (drought-adjusted) median MUAC Z-scores: below z=0.2, intervention sites FOD control sites, although not at the 5% statistical significance level. ALRMP interventions appear moderately effective in preventing worsening nutritional status among children.

Similar results at other quantile breaks

Conclusions
Existing program evaluation approaches focus on estimating the average treatment effect. In some cases, that is not really the impact statistic of interest.
This paper introduces a new SD-based method to evaluate impact across the entire distribution for non-experimental data
Results show the practical importance of looking beyond averages
- Standard Diff-in-Diff regressions: no impact at the mean
- SD DD: intervention locations had fewer negative observations, and of smaller magnitude, especially at the median and below
- ALRMP II may have functioned as a nutritional safety net (though this is only a correlation; there is no way to establish causality)

Thank you for your time, interest and comments

SD, poverty & social welfare orderings (1)
1. SD and poverty orderings
- Let SDs denote stochastic dominance of order s and Pα stand for poverty ordering (‘has less poverty’)
- Let α=s-1
- Then A Pα B iff A SDs B
SD and poverty orderings are nested
- A SD1 B ⇒ A SD2 B ⇒ A SD3 B
- A P1 B ⇒ A P2 B ⇒ A P3 B
Define U1 as the subset of U for which u’>0. U1 contains the monotonic utilitarian welfare functions. Less malnutrition is better, regardless of for whom. Let U2 be a subset of U1 such that u’’<0. This subset of social welfare functions represents equality preference in that a mean-preserving progressive transfer increases U2. Finally, define U3 as the subset of U2 for which u’’’>0. U3 contains the transfer-sensitive social welfare functions, which value a transfer more highly the lower in the distribution it occurs. Again, with a nutritional indicator this is only defensible up to a certain point, but certainly up to xmax.
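The link α = s-1 can be illustrated with a small sketch. It assumes that the poverty ordering is based on an unnormalized Foster-Greer-Thorbecke-type measure, the mean of (z - x_i)^α over observations at or below the line z, so that α = 0 is the headcount (the cdf at z) and α = 1 is the average shortfall (the integrated cdf at z). The function names are hypothetical, and dominance is only checked at the candidate lines supplied.

```python
def fgt(sample, line, alpha):
    """Unnormalized FGT-type measure: mean of (line - x_i)**alpha
    over observations at or below the line (others contribute zero)."""
    gaps = [line - x for x in sample if x <= line]
    return sum(g ** alpha for g in gaps) / len(sample)


def poverty_dominates(a, b, alpha, lines):
    """True if A has no more poverty than B at every candidate line
    and strictly less at one or more of them."""
    diffs = [fgt(a, z, alpha) - fgt(b, z, alpha) for z in lines]
    return all(d <= 1e-12 for d in diffs) and any(d < -1e-12 for d in diffs)
```

For the samples `[-1.0, -1.0]` and `[-2.0, 0.0]` with lines from -2 to 0, the first does not dominate under the headcount (α = 0, first order) but does under the average shortfall (α = 1, second order), matching the nesting of the orderings.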

SD, poverty & social welfare orderings (2)
2. Poverty and welfare orderings (Foster and Shorrocks 1988)
- Let U(F) be the class of symmetric utilitarian welfare functions
- Then A Pα B iff A Uα B
Examples:
- U1 represents the monotonic utilitarian welfare functions such that u’>0. Less malnutrition is better, regardless of for whom.
- U2 represents equality preference welfare functions such that u’’<0. A mean-preserving progressive transfer increases U2.
- U3 represents transfer-sensitive social welfare functions such that u’’’>0. A transfer is valued more the lower in the distribution it occurs.
Bottom line: for welfare levels, tests up to third order make sense

The data (2) – extent of malnutrition

DD Regression 2
Individual MUAC Z-score regression
- To test program impact with a much larger data set
- Still no statistically significant average program impact

Results – DD regression, individual data
Dependent variable: Individual MUAC Z-score

| VARIABLES | Coefficient (p-value) |
| --- | --- |
| time dummy (=1 for 2008/09) | 0.0785 (0.290) |
| control - intervention by investment | -0.0576 (0.425) |
| Diff in diff | 0.0245 (0.782) |
| Normalized Difference Vegetation Index | 1.029*** (6.25e-07) |
| Constant | -1.391*** (0) |
| Observations | 271061 |
| R-squared | 0.033 |

Robust p-values in parentheses. *** p<0.01, ** p<0.05, * p<0.1. District dummy variables included.
Not significant, even with very large sample size

Full SD results
- 2 data sets; 2 indicators for panel data
- Significant results for individual data: 08 and control dominate
- DD only for panel data (focus for the rest, as new method for this)
Next: look at some of these results in more detail (across whole distribution)