IMPACT EVALUATION PBAF 526 Class 5, October 31, 2011.



Today
- Reflections on Assignment 2?
- Continue thinking about research design
- Impact Evaluation: How certain can we be? Do we have to be?
- Block Watch: Random Assignment, Outcomes, and Indicators
- Issues in Impact and Random Assignment: Youth Transition Demonstration
  - Who is randomized?
  - Sample size, power, and effect size
  - Who's in the average?

Block Watch: Random Assignment, Outcomes, and Indicators
- What random assignment protocol would you use to assess the impacts of Block Watch?
- What are the strengths and weaknesses of your approach?
- What are the key outcomes you want to assess?
- What are indicators for those?

Youth Transition Demonstration Evaluation Plan
- Background on YTD evaluation plan
- The basics of impact size and significance
- Power and sample size
- No-shows: Intent to Treat vs. Treatment on the Treated
- Multiple comparisons
- Regression-adjusted comparisons

Youth Transition Demonstration
- Targets youth receiving disability payments to help in transition to adult life and employment
- Goals: increase earnings, decrease costs, facilitate transition to self-sufficiency
- Six program sites with variation in programs
- Services
  - Waiver of benefit decrease with earnings
  - Education, job training, work placements
  - Case management, counseling, referral to services

YTD Evaluation
- Selected 6 sites for demonstration and evaluation
- Intervention built on research from past programs and evaluations
- Randomly assigned youth to treatment or control
- Large sample sizes to allow identification of smaller effects and sub-group effects
- Process and Impact Evaluation
- Data collected from administrative files, surveys before and after program
- Advisory group of experts

Sampling
- Why did they divide the list of potential participants (sampling frame) into groups of 10 for contact?
- Why did they randomize 55 percent to the treatment?
- Why get pre-intervention characteristics if they are randomly assigning groups?

Comparisons may be:
- over time
- across intervention groups: with and without program; levels of intervention ("dosage")
Impact here!

Statistical significance
- When can we rule out that the estimated impact would show up by chance IF there is no true impact?
- Compare 2 means from independent samples:
  - Means:
  - Proportions:
  - Pooled sample variance:
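The formulas for this slide did not survive in the transcript. As a hedged reconstruction, the standard textbook test statistics for comparing two independent samples are given below. The notation is an assumption and may differ from the original slide: subscripts T and C denote the treatment and control groups, n is the group size, S^2 is a sample variance, and \hat{p} is a sample proportion.

```latex
% Difference in means (pooled-variance t statistic)
t = \frac{\bar{X}_T - \bar{X}_C}{S_p \sqrt{\dfrac{1}{n_T} + \dfrac{1}{n_C}}}

% Difference in proportions (z statistic with pooled proportion)
z = \frac{\hat{p}_T - \hat{p}_C}
         {\sqrt{\hat{p}\,(1 - \hat{p}) \left( \dfrac{1}{n_T} + \dfrac{1}{n_C} \right)}},
\qquad
\hat{p} = \frac{n_T \hat{p}_T + n_C \hat{p}_C}{n_T + n_C}

% Pooled sample variance
S_p^2 = \frac{(n_T - 1) S_T^2 + (n_C - 1) S_C^2}{n_T + n_C - 2}
```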


So, it's easier to say impact is "real" (not just randomness) if:
- Size of impact is larger
- Variation in outcomes (S) is smaller
- Sample sizes are larger
Same factors figure into deciding how big a sample we need to find the effect if it's there! [Power, sample size, minimum detectable effects]
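A minimal sketch of how these three factors enter a pooled-variance t statistic. The function names and all numbers are hypothetical, chosen only to illustrate the point; they do not come from the YTD evaluation.

```python
import math

def pooled_variance(s_t, s_c, n_t, n_c):
    """Pooled sample variance from two group standard deviations."""
    return ((n_t - 1) * s_t**2 + (n_c - 1) * s_c**2) / (n_t + n_c - 2)

def t_statistic(mean_t, mean_c, s_t, s_c, n_t, n_c):
    """Two-sample t statistic assuming equal variances in both groups."""
    sp2 = pooled_variance(s_t, s_c, n_t, n_c)
    se = math.sqrt(sp2 * (1 / n_t + 1 / n_c))
    return (mean_t - mean_c) / se

# Hypothetical baseline: impact of 5, S = 20, 50 per group.
baseline = t_statistic(105, 100, 20, 20, 50, 50)
# Change one factor at a time.
bigger_impact = t_statistic(110, 100, 20, 20, 50, 50)    # impact doubles
less_variation = t_statistic(105, 100, 10, 10, 50, 50)   # S halves
larger_sample = t_statistic(105, 100, 20, 20, 200, 200)  # n quadruples

for label, t in [("baseline", baseline), ("bigger impact", bigger_impact),
                 ("less variation", less_variation), ("larger sample", larger_sample)]:
    print(f"{label}: t = {t:.2f}")
```

Each single change moves the statistic further from zero than the baseline. Note that doubling the impact or halving S has the same effect as quadrupling the sample, because the standard error shrinks only with the square root of n.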

Power and sample size
- Given randomness, what % of the time will you be able to rule out the null, IF it is NOT true (there IS an impact)?
- How big a sample size do you need to rule out NO effect if the program DOES have an impact? (Rossi et al p.312)
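Both questions can be sketched with a normal approximation. This is an illustration under stated assumptions, not the calculation used in the YTD evaluation: two equal-sized groups, a common standard deviation, a two-sided test, and the negligible far tail ignored. The function names and example numbers are hypothetical.

```python
import math
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def power_two_groups(effect, sd, n_per_group, alpha=0.05):
    """Approximate power to detect `effect` with n_per_group in each arm."""
    se = sd * math.sqrt(2 / n_per_group)  # SE of the difference in means
    z_crit = Z.inv_cdf(1 - alpha / 2)     # two-sided critical value
    return Z.cdf(abs(effect) / se - z_crit)

def n_per_group_for_power(effect, sd, power=0.80, alpha=0.05):
    """Smallest n per arm that reaches the target power for `effect`."""
    z_crit = Z.inv_cdf(1 - alpha / 2)
    z_power = Z.inv_cdf(power)
    return math.ceil(2 * ((z_crit + z_power) * sd / effect) ** 2)

# Hypothetical inputs: true impact of 5, outcome standard deviation of 20.
n_needed = n_per_group_for_power(effect=5, sd=20)
print(n_needed, round(power_two_groups(5, 20, n_needed), 3))
```

The required sample grows with the square of S divided by the effect, so halving the effect you want to detect roughly quadruples the sample you need.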

Online Calculators for Sample Size and Power
- Sample size:
- Power:
- Lots of other sites:
To calculate sample size and power, you need to estimate both the effect of the program and the amount of statistical noise.

Minimum Detectable Impacts
- What are the smallest effects you will be able to detect given n and predicted S?
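A hedged sketch of a minimum detectable effect calculation, using the same normal approximation common in power analysis. The function name, the default 80 percent power and 5 percent two-sided significance level, and the sample sizes are all assumptions for illustration; the 55/45 split only echoes the allocation mentioned in the sampling discussion.

```python
import math
from statistics import NormalDist

def minimum_detectable_effect(sd, n_treat, n_control, power=0.80, alpha=0.05):
    """Smallest true impact detectable with the given power (normal approximation)."""
    z = NormalDist()
    multiplier = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)  # about 2.80 by default
    se = sd * math.sqrt(1 / n_treat + 1 / n_control)
    return multiplier * se

# Hypothetical: S = 20 and 1,000 youth in total.
mde_even = minimum_detectable_effect(20, 500, 500)    # 50/50 allocation
mde_55_45 = minimum_detectable_effect(20, 550, 450)   # 55/45 allocation
print(round(mde_even, 2), round(mde_55_45, 2))
```

With a fixed total sample, an even split gives the smallest detectable effect, but a modest tilt such as 55/45 costs very little precision.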

Adjustments to impact assessment
- Regression-adjusted impacts decrease S and increase power by controlling for "noise" using baseline characteristics
- Multiple comparisons are a problem because randomness happens if you look long enough!
  - MDRC picked "primary outcomes"
  - Use adjustments to account for multiple comparisons

Showing estimated impacts over time in program

Who's in the average?
- "No shows" in treatment group didn't get any services
  - Unlikely to be similar to "shows"
  - If dropped, then may overstate potential impacts
  - "Intent to Treat" outcomes include outcomes for no-shows
  - "Treatment on the Treated" outcomes do not include no-shows
- Non-response to follow-up surveys could bias impact assessments
  - Use administrative data available for all for key outcomes
  - Put resources into follow-up to minimize non-response
  - Construct weights to make survey sample estimates comparable to baseline sample
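A small sketch of why dropping no-shows can overstate impacts. The data are made up, and the treatment-on-the-treated estimate uses the Bloom no-show adjustment (ITT divided by the share of the treatment group that showed up). That adjustment is a common technique brought in here for illustration, not something stated on the slide, and it assumes no-shows were unaffected by the offer and no control group members received services.

```python
from statistics import mean

# Hypothetical follow-up earnings: (earnings, showed_up) for the treatment group.
treatment = [(1200, True), (1500, True), (900, False), (1400, True), (1000, False)]
control = [1000, 1100, 900, 1200, 1000]

# Intent to Treat: everyone assigned to treatment, including no-shows.
itt = mean(y for y, _ in treatment) - mean(control)

# Naive comparison: drop no-shows and compare "shows" with the full control group.
naive = mean(y for y, showed in treatment if showed) - mean(control)

# Bloom adjustment (assumption, see lead-in): scale ITT by the show rate.
show_rate = sum(showed for _, showed in treatment) / len(treatment)
tot = itt / show_rate

print(f"ITT = {itt:.1f}, naive = {naive:.1f}, Bloom TOT = {tot:.1f}")
```

In this made-up example the naive comparison is the largest of the three, because the youth who showed up had higher earnings than the no-shows for reasons unrelated to the program.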

Lessons from Summary:
- Randomization is hard
- Need to use power analysis to choose target sample sizes
- Even randomization may not give comparable baseline characteristics
- Regression may increase comparability and precision
- Worry about who we have outcome information for (both control and treatment)

EXTRA SLIDES

A Note About Sample Size
- When you want to calculate the sample size needed to estimate the difference between two groups, you usually want equal sample sizes.
- Use the same equation as for an estimate from one sample, but with a measure of the variance that combines information for both populations.
- For sample size for estimating the difference between population means:
- For sample size for estimating the difference between 2 population proportions:
- For small populations, use the finite population correction (without replacement):
- This is the with-replacement n.
- Where N is the size of the population, n is the with-replacement sample size, and n_wor is the without-replacement sample size.
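The equations for this slide are missing from the transcript. The block below gives standard textbook forms as a hedged stand-in; they may not match the original slide. Assumed notation: n is the sample size per group, E is the desired margin of error for the estimated difference, z_{\alpha/2} is the critical value for the chosen confidence level, and N, n, and n_{wor} are as defined above.

```latex
% Difference between two population means (equal n per group)
n = \frac{z_{\alpha/2}^2 \left( \sigma_1^2 + \sigma_2^2 \right)}{E^2}

% Difference between two population proportions (equal n per group)
n = \frac{z_{\alpha/2}^2 \left[ p_1 (1 - p_1) + p_2 (1 - p_2) \right]}{E^2}

% Finite population correction (sampling without replacement)
n_{wor} = \frac{n}{1 + \dfrac{n - 1}{N}}
```

For example, with n = 400 and N = 2000 the corrected size is 400 / (1 + 399/2000), or about 334.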

Practical Significance of Statistical Significance
- Difference on the original measurement scale
- Comparison with test norms of performance of a normative population
- Differences between criterion groups
- Proportion over a diagnostic or other success threshold
- Proportion over an arbitrary success threshold
- Comparison with the effects of similar programs
- Conventional guidelines
Rossi p

Adjustments for Multiple Testing
- Solution by Bonferroni: If k = number of comparisons, then α_b = α/k. Very conservative.
- Solution by Benjamini-Hochberg (BH): Adjusts for false discovery rate.
  - Rank p values from smallest to largest
  - Largest p value remains as it is
  - Second largest p value is multiplied by the total number of comparisons divided by its rank. If less than .05, then significant.
  - And so on.
- Other solutions, too!
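The two adjustments can be sketched in a few lines. This is an illustrative implementation with hypothetical p values, not code from the course. The BH function multiplies each p value by the total number of comparisons divided by its rank and then takes a running minimum from the largest p value down, which is the usual way to keep the adjusted values in order.

```python
def bonferroni(p_values, alpha=0.05):
    """Flag each p value as significant at the Bonferroni level alpha / k."""
    k = len(p_values)
    return [p <= alpha / k for p in p_values]

def benjamini_hochberg(p_values):
    """Return BH-adjusted p values in the original order."""
    m = len(p_values)
    # Indices sorted from largest to smallest p value.
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = m - position  # rank 1 = smallest p value, rank m = largest
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)  # adjusted values never increase
        adjusted[idx] = running_min
    return adjusted

# Hypothetical p values from four outcome comparisons.
p_values = [0.01, 0.04, 0.03, 0.005]
bonf_flags = bonferroni(p_values)
bh_adjusted = benjamini_hochberg(p_values)
bh_flags = [p < 0.05 for p in bh_adjusted]
print(bonf_flags, bh_adjusted, bh_flags)
```

With these made-up values Bonferroni keeps two of the four comparisons while BH keeps all four, which illustrates why Bonferroni is described as very conservative.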