An Introduction to Impact Evaluation and Application to the Ethiopia NFSP
Workshop on Approaches to Evaluating the Impact of the National Food Security Program, Addis Ababa, January 16, 2006
Arianna Legovini, The World Bank
Contents
1. Context
2. Motivation
3. Defining impact
4. The evaluation problem
5. Problems with naive evaluations
6. Application to the NFSP
1. Context
Most social programs are targeted to a specific group of the population. Within that group, some units (individuals, households, communities) get the program and others do not, or some get it at the beginning of program implementation and others get it later, depending on budget constraints, implementation capacity, and targeting efficiency.
2. Motivating rigorous impact evaluation
Monitoring tells us whether things are moving in the right direction, but it cannot tell us the cause of those movements. Rigorous impact evaluation causally links activities to results. Impact evaluation has three main functions:
- It is a political tool to manage public opinion,
- It is a fiscal tool to allocate budget resources more efficiently, and
- It is a management tool to improve program design, results, and sustainability.
Political tool
Public opinion can easily shift in response to a few publicized failures. Empirical evidence on program impacts gives governments the ability to manage public opinion (media, parliament) effectively.
Fiscal tool
Shifts in budget allocation are often made on hunches about what might deliver more results. Impact evaluation turns hunches into quantitative estimates of the cost-effectiveness of alternative programs in reaching desired objectives. It provides the basis for shifting resources towards more effective programs.
Management tool
This is where impact evaluation can make huge contributions. By testing alternative ways of doing things within the same program, impact evaluation can measure the relative effectiveness of the alternatives. Feedback can be used to modify program design during implementation to ensure success. It is the ultimate managing-by-results tool.
3. Defining impact
In the language of the impact evaluation profession, the impact of a program is the amount of change in any outcome (short-, medium-, or long-term) that is caused by the program. Impact can be measured at any time:
- Even just after program announcement (e.g., a change in expectations may lead to changes in asset prices even before a program is implemented), or
- Any time after implementation (e.g., some changes are instantaneous and some take time to realize; BUT if nothing moves in the short run, you can rest assured nothing will move in the long run either).
4. The evaluation problem
The impact of the program is the difference between the outcome with the program and the outcome without the program. BUT we only observe outcomes with OR without the program, never both at the same time. This means that the problem of measuring impact is a problem of missing data: not because we fail to collect the data, but because they do not exist.
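The missing-data problem can be made concrete with a small sketch. The potential outcomes below (`y0` without the program, `y1` with it) and all the numbers are invented for illustration; in a real evaluation only one of the two is ever observed per household.

```python
import random

random.seed(1)

# Hypothetical potential outcomes for six households (illustration only):
# y0 = outcome without the program, y1 = outcome with the program.
households = [{"y0": v, "y1": v + 5} for v in [20, 25, 30, 35, 40, 45]]

# The true impact is knowable here only because we invented BOTH outcomes.
true_impact = sum(h["y1"] - h["y0"] for h in households) / len(households)
print(true_impact)  # 5.0

# In reality, each household reveals only ONE of its two outcomes,
# depending on whether it received the program; the other outcome is
# the missing counterfactual that the evaluation must reconstruct.
treated = set(random.sample(range(len(households)), 3))
observed = [h["y1"] if i in treated else h["y0"]
            for i, h in enumerate(households)]
```

The evaluation methods in the following slides are different ways of standing in for the unobserved half of this table.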
Impact evaluation is a set of rigorous analytical methods applied to solve the missing data problem by building a counterfactual and measuring observed outcomes against it.
Observed changes and changes against a counterfactual (chart)
Solving the missing data problem
We need to build a control or comparison group that looks the same at time zero as the group getting the program (the treatment group). What does "look the same" mean? It means that, on average, the comparison and treatment groups differ only in that one gets the program and the other does not.
Building a control
Experimental approach: randomization.
- Give all units in the eligible population an equal chance to participate, and compare those who win the lottery with those who do not; or
- Give all units in the eligible population an equal chance to be included in either phase I or phase II of the program, and then use units in phase II as controls for phase I.
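The phased-rollout lottery above can be sketched in a few lines. The unit names (`kebele_0`, …), group sizes, and seed are illustrative assumptions, not part of the NFSP design.

```python
import random

def assign_phases(units, n_phase1, seed=42):
    """Randomly split eligible units into phase I (treated first) and
    phase II (treated later). Until phase II is rolled in, it serves as
    the control group for phase I."""
    rng = random.Random(seed)           # fixed seed -> reproducible lottery
    shuffled = list(units)
    rng.shuffle(shuffled)
    return shuffled[:n_phase1], shuffled[n_phase1:]

# Hypothetical eligible communities:
kebeles = [f"kebele_{i}" for i in range(10)]
phase1, phase2 = assign_phases(kebeles, n_phase1=5)
# Every eligible unit had an equal chance of landing in phase I, so the
# two groups look the same at time zero, on average.
```

Because assignment is random, any systematic difference in later outcomes between the two phases can be attributed to the program.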
Building a comparison
Non-experimental approaches:
- Matching: match characteristics of participants and non-participants at time zero, then compare changes in outcomes in the treated population against changes in outcomes in the matched observations.
- Regression discontinuity: assume that, because of measurement error, units just above and just below the eligibility threshold are the same on average, and compare participants just below the threshold with non-participants just above it.
- Pipeline comparisons: compare current participants to prospective participants.
- Instrumental variables: find an exogenous variable that influences participation but not outcomes, and use it to identify the exogenous variation in outcomes due to the program.
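A minimal sketch of the matching idea, using nearest-neighbour matching on a single baseline characteristic. The data and field names are hypothetical; real applications typically match on a propensity score estimated over many baseline characteristics.

```python
def match_comparisons(treated, pool, key):
    """For each treated unit, pick the non-participant whose baseline
    characteristic `key` is closest (nearest-neighbour matching, with
    replacement)."""
    return [(t, min(pool, key=lambda c: abs(c[key] - t[key])))
            for t in treated]

# Hypothetical baseline (time zero) data:
treated = [{"id": "A", "assets": 10}, {"id": "B", "assets": 30}]
pool = [{"id": "X", "assets": 12}, {"id": "Y", "assets": 28},
        {"id": "Z", "assets": 50}]

pairs = match_comparisons(treated, pool, "assets")
# A is matched to X (|10 - 12| = 2) and B to Y (|30 - 28| = 2); changes
# in outcomes are then compared within each matched pair.
```

Matching only controls for *observed* baseline differences, which is why the slides pair it with change-over-change (double-difference) comparisons rather than ex-post levels.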
Estimating impact
- Single difference (the outcome of the treatment group minus the outcome of the control group at time t+1) can be used in an experimental setting with sufficiently large samples.
- Double difference (before and after, and against a control) is used in experimental settings with small samples and in non-experimental settings, to control for selection bias.
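Both estimators can be written down directly. The outcome numbers below are invented for illustration; the point is that the double difference nets out the baseline gap between the groups.

```python
from statistics import mean

def single_difference(treat_t1, ctrl_t1):
    """Mean treated outcome minus mean control outcome at follow-up (t+1)."""
    return mean(treat_t1) - mean(ctrl_t1)

def double_difference(treat_t0, treat_t1, ctrl_t0, ctrl_t1):
    """Change over time in the treatment group minus change over time in
    the comparison group (difference-in-differences). Any time-invariant
    baseline difference between the groups cancels out."""
    return (mean(treat_t1) - mean(treat_t0)) - (mean(ctrl_t1) - mean(ctrl_t0))

# Hypothetical outcomes (e.g., consumption) for two survey rounds:
treat_t0, treat_t1 = [8, 12], [16, 20]   # treated: mean 10 -> 18
ctrl_t0, ctrl_t1 = [10, 14], [12, 16]    # control: mean 12 -> 14

print(single_difference(treat_t1, ctrl_t1))                     # 4
print(double_difference(treat_t0, treat_t1, ctrl_t0, ctrl_t1))  # 6
```

Here the single difference (4) understates the effect because the control group started better off; the double difference (8 − 2 = 6) corrects for that baseline gap.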
Data requirements
- Surveys with representative samples of beneficiaries and non-beneficiaries.
- Questionnaires that include the outcomes of interest (consumption, income, assets, etc.), questions about the program in question, and questions about other programs.
Data requirements (continued)
- An experimental approach with large samples requires only a survey at time t+1 (single difference). In this case a baseline is needed only if we are interested in measuring targeting efficiency.
- Experimental approaches with small samples, and non-experimental approaches, generally require baseline and follow-up panel data (double difference).
5. Problems with naive evaluation techniques
- Comparing beneficiaries before and after: outcomes change for many reasons, so how do we determine which portion of that change is due to the program?
- Comparing beneficiaries ex post with the rest of the population: if the program targets the worse off, ex-post comparisons will set beneficiaries against non-beneficiaries who were initially somewhat better off, and wrongly conclude that the program did not have positive impacts.
6. Application to the National Food Security Program
The NFSP M&E plan:
- Is a good monitoring plan using administrative data, with a good logical framework.
- However, it has no plan for assessing targeting efficiency, and the plan for evaluation is flawed (no comparison group).
- The sampling for the baseline covers only beneficiaries, and the questionnaire does not permit estimating consumption.
Current NFSP evaluation plan
The plan is to track changes in outcomes before and after, and against (unspecified) standards. BUT program outcomes can improve or worsen because of, for example, the weather. These effects might be much larger than, and completely overshadow, the effects of the program. Adverse weather would lead us to conclude that the program is ineffective regardless of its true impact.
Strengthening NFSP impact evaluation
- Define the activities for which we want to estimate impact, and ask the right questions to improve the program over time.
- Implement a framework to answer these questions by adding variation in the treatment of beneficiaries.
- Build comparison groups via randomized experiments or by estimating a counterfactual using statistical methods.
- Expand the survey sample to include a comparison group, include a consumption module in the questionnaire, and collect panel data.
- Estimate impacts using double-difference estimators.
Impact of which NFSP activities?
Productive safety nets:
- Public works
- Workfare/cash or food transfers
- Unconditional transfers to labor-poor households
- Resettlement
Other food security activities:
- Financial services
- Agricultural extension
- Water
- Land conservation, etc.
Opportunities to find a comparison
- Use the phase-in of the program to compare current beneficiaries with prospective beneficiaries, since not all 15 M are beneficiaries at this time (pipeline comparison).
- Compare woredas or kebeles with the program to similar woredas or kebeles without the program (matching).
- Compare beneficiary households that barely qualified for the program with non-beneficiary households that barely did not (regression discontinuity design).
Data and analysis
- The questionnaire should include questions to capture the outcomes of the program. Currently it includes questions on assets, savings and credit, and agricultural activities.
- Including a consumption module would provide the basis for measuring welfare and graduation in a reliable and quantitative manner.
- Collect panel data (a baseline and a follow-up survey on the same sample of beneficiaries and non-beneficiaries).
- Estimate impacts.
Yet the most compelling reason to do impact evaluation is to learn how to improve program effectiveness over time. This requires adding variation in program treatment to determine which mechanisms work best to achieve, for example, graduation.
Examples of what can be learned: is farm-only best?
Is a wider menu of farm and non-farm support more effective than farm support only? Take a random sample of localities where support to non-farm activities is included, and compare results over time.
More examples of what can be learned: how much is enough?
To secure household graduation, is it more effective to spread support thinly across all eligible households (or communities), or to focus NFSP effort on a portion of the households (or communities) this year, another portion next year, and so on? Randomly select a small group of beneficiaries, double the fiscal effort on these households, and monitor whether graduation rates more than double.
More examples of what can be learned: credit pick-up rates and effectiveness
- Does one size fit all? Would modifying the size of the credit package improve credit pick-up rates? Randomly select some communities where households can choose the size of the credit, and compare them with one-size-credit communities.
- Would a package of credit, training, and technical assistance improve credit effectiveness? (Evidence from the rest of the world says yes.) Randomly select some communities where training and/or TA is attached to credit, and compare them with credit-only communities.
More examples of what can be learned: quality of productive community assets
Does the threat of audit significantly increase the quality of productive community assets? Take a random sample of communities and communicate the intention to audit project implementation. (Evidence from Indonesia says yes.)
Concluding remarks
Credible impact estimates require a strengthened framework for evaluation. This requires taking the following steps:
- Agreeing on the questions that can be asked
- Designing a framework that can answer these questions (program delivery, evaluation method, sampling, questionnaire)
- Implementing the framework, and analyzing and reporting periodically
- Improving the program over time based on evidence