IMPACT EVALUATION AND ANALYSIS OF DEVELOPMENT INTERVENTIONS: Increasing power through repeated measures. Ruth Vargas Hill, IFPRI.

Choosing outcome variables
- Testing too many outcome variables will almost inevitably result in one of them appearing statistically significant by chance.
- There is a tendency to do this in an attempt to find some impact!
- Avoid doing this:
  - Pre-specify outcomes of interest (publish protocols online)
  - Report results on all measured outcomes, even null results
  - Correct statistical tests for multiple comparisons. In genetics this is done with the Bonferroni correction; it over-corrects, but it can be a good discipline (see the sketch below).
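A minimal sketch of the Bonferroni idea, assuming a set of p-values from separate outcome regressions (the outcome names and numbers here are hypothetical, for illustration only):

```python
# Bonferroni correction: with m tests, compare each p-value to alpha / m
# instead of alpha. Outcomes and p-values below are hypothetical.
alpha = 0.05
p_values = {
    "consumption_pc": 0.030,
    "farm_profits": 0.012,
    "fertilizer_use": 0.004,
    "school_enrolment": 0.250,
}
m = len(p_values)
for outcome, p in p_values.items():
    significant = p < alpha / m  # corrected threshold here: 0.05 / 4 = 0.0125
    print(f"{outcome}: p = {p:.3f} -> significant after correction: {significant}")
```

Note how consumption_pc at p = 0.030 would pass a naive 0.05 threshold but fails the corrected one: the bar tightens as more outcomes are tested.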

Choosing outcome variables
- However, it is easy to end up with a lot of outcomes!
- Variables of ultimate interest (e.g. consumption per capita) have many determinants, so it is unlikely that the intervention will have a large detectable effect on them.
- We will also want to look at intermediate outcomes.
- Intermediate outcomes can be chosen carefully by thinking through the theory of change.
- Stated changes:
  - Back these up with measurement of the underlying change in behavior.

Choosing outcome variables
- Even when you know where you want to see impact, how do you collect data to document it?
- Document behaviour, rather than perceived changes:
  - Respondents may be tempted to report changes in stated outcomes ("did you change your behavior as a result of...?") that do not reflect any change in underlying behavior.
  - Back these up with measurement of the underlying change in behavior.
- Think of outcomes that are likely to be well measured:
  - Highly variable outcomes, or outcomes measured with a lot of noise, have a very large minimum detectable effect (MDE) for a given randomization design (see the formula below).
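For reference, the textbook MDE for a two-arm trial with n units split equally between treatment and control (this is the standard formula, not one stated on the slides):

```latex
\mathrm{MDE} = \left(z_{1-\alpha/2} + z_{1-\beta}\right)\sqrt{\frac{4\sigma^2}{n}}
```

Because the MDE scales with the outcome's standard deviation σ, halving measurement noise buys as much power as quadrupling the sample size.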

Working with outcomes with a high variance
- For highly variable outcomes, or outcomes measured with a lot of noise:
  - Take repeated measures, although be aware that repeated measurement can increase the salience of the intervention.
  - Improve the accuracy of measurement with shorter recall periods or other means of collection (diaries, regular visits, records at the marketing place of the extension agent).
- Whatever you do, do the same for both treatment and control.

Improving measurement and repeated measures
Improving measurement:
- Careful supervision of surveys; use PDAs where possible; ask multiple questions on key outcome variables.
- Visiting a household at the right time reduces recall error: conduct surveys after the main agricultural events to be assessed (planting, fertilizer application, harvest, sales of harvest).
- Visiting a household more often reduces recall error: e.g. for the number of loans taken in a year, or the number of gifts given or received.

Improving measurement and repeated measures
Improving measurement:
- Rely on more than just survey responses: field visits, extension officer reports, MFI loan data, sales data, data collected by traders or in markets.
Repeated measures:
- If the outcome of interest is highly variable with little autocorrelation across time (e.g. trader sales), then repeated surveys increase power.
- McKenzie (2011), "Beyond baseline and follow-up: The case for more T in experiments", World Bank Policy Research Working Paper 5639.

Repeated measurement is standard practice

Difference in difference
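The equations from this slide are not in the transcript; the textbook difference-in-differences regression, which yields the estimator in the comparison table below, is:

```latex
Y_{it} = \alpha + \gamma\,\mathrm{Treat}_i + \lambda\,\mathrm{Post}_t
       + \delta\,(\mathrm{Treat}_i \times \mathrm{Post}_t) + \varepsilon_{it}
```

Here Treat_i marks treatment assignment, Post_t marks the follow-up round, and δ is the impact estimate: with one round on each side it equals (Ȳ(T)₁ − Ȳ(C)₁) − (Ȳ(T)₀ − Ȳ(C)₀).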

ANCOVA
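Again as a reconstruction rather than the slide's own notation, the ANCOVA specification regresses the follow-up outcome on treatment while controlling for the baseline level:

```latex
Y_{i1} = \alpha + \delta\,\mathrm{Treat}_i + \rho\,Y_{i0} + \varepsilon_i
```

Unlike difference-in-differences, the baseline enters with an estimated coefficient ρ rather than being forced to equal 1, which is where the power gain shown below comes from.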

Comparing dif-dif and ANCOVA

| Estimation method | m | r | Estimator | Variance of estimator |
| Dif-dif | 1 | 1 | (Ȳ(T)₁ − Ȳ(C)₁) − (Ȳ(T)₀ − Ȳ(C)₀) | 4σ²(1−ρ)/n |
| ANCOVA | 1 | 1 | (Ȳ(T)₁ − Ȳ(C)₁) − ρ(Ȳ(T)₀ − Ȳ(C)₀) | 2σ²(1−ρ²)/n |

(m = number of baseline rounds, r = number of follow-up rounds, ρ = autocorrelation of the outcome, n = sample size per arm.)
More power in ANCOVA: since 2σ²(1−ρ²)/n = 2σ²(1−ρ)(1+ρ)/n ≤ 4σ²(1−ρ)/n for any ρ ≤ 1, its variance is always weakly smaller.
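A quick simulation sketch of this comparison; the setup (n per arm, ρ, number of replications) is illustrative:

```python
# Simulate the variance of the dif-in-dif vs ANCOVA impact estimators
# under a true treatment effect of zero, matching the table above.
import numpy as np

rng = np.random.default_rng(0)
n, rho, sigma, reps = 500, 0.3, 1.0, 2000

dd_estimates, ancova_estimates = [], []
for _ in range(reps):
    # Correlated baseline (col 0) and follow-up (col 1) outcomes for each arm
    cov = sigma**2 * np.array([[1, rho], [rho, 1]])
    yT = rng.multivariate_normal([0, 0], cov, size=n)  # treatment arm
    yC = rng.multivariate_normal([0, 0], cov, size=n)  # control arm
    dd = (yT[:, 1].mean() - yC[:, 1].mean()) - (yT[:, 0].mean() - yC[:, 0].mean())
    # ANCOVA weights the baseline difference by rho instead of forcing 1
    anc = (yT[:, 1].mean() - yC[:, 1].mean()) - rho * (yT[:, 0].mean() - yC[:, 0].mean())
    dd_estimates.append(dd)
    ancova_estimates.append(anc)

print("dif-dif variance:", np.var(dd_estimates), "theory:", 4 * sigma**2 * (1 - rho) / n)
print("ANCOVA  variance:", np.var(ancova_estimates), "theory:", 2 * sigma**2 * (1 - rho**2) / n)
```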

Other (better?) ways to use repeated measures
- McKenzie (2011), "Beyond baseline and follow-up: The case for more T in experiments", World Bank Policy Research Working Paper 5639.
- Why do we usually only do baseline and follow-up, even when in some cases we have reason to believe the baseline data will not increase our power?
- How about collecting repeated measures over time, before or after the intervention?
- Will that help? It might.

Motivation
- McKenzie was running a number of experiments on entrepreneur training and wanted to look at the impact on profits.
- Highly variable outcome:
  - Profits vary a lot over time; some months are very good, some months are very bad.
  - Recall is a big problem if the recall period is too long.
- Often small n:
  - Often there are not very many firms that can be randomized, so the n in the sample cannot easily be increased.

Two different ways to use repeated measures
1. Collecting data at repeated points in time after the intervention can help us see the short-run and long-run impacts of the intervention.
  - E.g. measuring impact at 1 year and at 2 years.
  - Here there is no power gain: use only t=1 for measuring impact after 1 year, and only t=2 for measuring impact after 2 years.
2. Using multiple measures to estimate an average impact of the intervention.
  - E.g. estimating impact at 9, 12 and 15 months after the intervention.
  - Using all measures increases the power of the test of impact at around 1 year after the intervention.

Repeated measures, DIF-DIF
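The slide's equations are not in the transcript; a standard pooled difference-in-differences form with m baseline and r follow-up rounds (round dummies λ_t are an assumption of this reconstruction) is:

```latex
Y_{it} = \alpha + \gamma\,\mathrm{Treat}_i + \lambda_t
       + \delta\,(\mathrm{Treat}_i \times \mathrm{Post}_t) + \varepsilon_{it},
\qquad t = 1,\dots,m+r
```

With a balanced panel, δ equals the difference between the post-round and pre-round treatment-control gaps shown in the table below.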

Repeated measures, ANCOVA
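A pooled ANCOVA counterpart, assuming the m baseline rounds are collapsed into a unit-level mean Ȳ_i^PRE (one common formulation, not necessarily the slide's own):

```latex
Y_{it} = \alpha + \delta\,\mathrm{Treat}_i + \theta\,\bar{Y}^{\mathrm{PRE}}_i
       + \lambda_t + \varepsilon_{it},
\qquad t \in \{\text{follow-up rounds}\}
```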

Repeated measures Metho d EstimatorVariance of estimator Dif-dif (Y(T) POST - Y(C) POST ) - (Y(T) PRE - Y(C) PRE ) Ancova (Y(T) POST - Y(C) POST ) -   (Y(T) PRE - Y(C) PRE )

Implications

- How to split a survey budget between pre- and post-treatment rounds?
  - The lower the autocorrelation ρ, the more post-treatment survey rounds should be conducted.
  - If ρ = 0.25 and there is only budget for 3 rounds, it is best to have three follow-up waves and no baseline (see the sketch below).
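A sketch that reproduces this rule of thumb. The variance expression assumes outcomes equicorrelated across rounds with autocorrelation ρ, in the spirit of McKenzie (2011); the exact formula is a reconstruction under that assumption, not the slide's own:

```python
# Split a fixed budget of survey rounds between m baseline and r follow-up
# waves, minimizing the ANCOVA-estimator variance under equicorrelation.
# sigma2 and n only scale the comparison, so the defaults are fine.

def ancova_variance(m, r, rho, sigma2=1.0, n=1.0):
    """Variance of the ANCOVA impact estimator with m baseline, r follow-up rounds."""
    post_term = (1 + (r - 1) * rho) / r
    pre_term = (m * rho**2) / (1 + (m - 1) * rho) if m > 0 else 0.0
    return (2 * sigma2 / n) * (post_term - pre_term)

for rho in (0.25, 0.75):
    splits = [(m, 3 - m) for m in range(3)]  # 3 rounds total: (0,3), (1,2), (2,1)
    best = min(splits, key=lambda s: ancova_variance(s[0], s[1], rho))
    print(f"rho = {rho}: best split is m = {best[0]} baseline, r = {best[1]} follow-up")
# rho = 0.25 -> (0, 3): all follow-up, no baseline, matching the slide.
# rho = 0.75 -> (1, 2): with high autocorrelation a baseline wave pays off.
```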

Implications
- How to choose n and T with a fixed budget of nT surveys?
  - High n when ρ is high.
  - High T when ρ is low.

Repeated measures: example
- Tanzanian milk firm: 131 households in 77 clans in contract with the firm.
- Experiment to look at the best contract design.
- Surveys pre and post intervention (wide variety of variables).
- Daily delivery data collected at the firm for one month prior to the intervention and for 3 months after the intervention (the key variable of interest).

Repeated measures: example MeasureSourceNo. of obs per individual Treatment effect Prob of daily delivery Delivery dataAbout *** (0.011) Number of monthly deliveries Delivery data52.275** (1.026) Number of monthly deliveries in season Delivery data (1.532) Survey data (2.698)