Propensity Score Matching: A Primer in R

Presentation transcript:

Propensity Score Matching: A Primer in R
David Zepeda, Assistant Professor, Supply Chain & Information Management
Center for Health Policy and Healthcare Research Brown Bag Series, April 1, 2015

Outline
1. Problem description
2. Theory
3. Two-Step Approach
4. Implementation in R
5. Example 1 – Hospitals
6. Example 2 – Primary Care Clinics
7. Example 3 – Farm Land
8. References

Problem

- An observational unit is generally assigned only one of the two treatments.
- The treatment is not randomly assigned.
- This results in a number of potential problems regarding bias and model dependence.
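The bias from non-random assignment can be illustrated with a small simulation. This is only a sketch with made-up data: the variable names, the true effect of 2, and the data-generating model are all hypothetical and are not taken from the presentation.

```r
# Hypothetical simulation: treatment probability depends on a confounder x,
# so treatment is NOT randomly assigned.
set.seed(1)
n <- 5000
x <- rnorm(n)                      # confounder
p_treat <- plogis(1.5 * x)         # units with larger x are treated more often
treat <- rbinom(n, 1, p_treat)
true_effect <- 2                   # assumed true treatment effect
y <- 1 + true_effect * treat + 3 * x + rnorm(n)

# Naive comparison of group means ignores x and overstates the effect here
naive <- mean(y[treat == 1]) - mean(y[treat == 0])

# Adjusting for x recovers the effect, but only because we simulated the
# data and therefore know the correct outcome model (model dependence)
adjusted <- unname(coef(lm(y ~ treat + x))["treat"])

print(c(naive = naive, adjusted = adjusted))
```

In real observational data the correct outcome model is unknown, which is the motivation for preprocessing the data by matching before fitting a parametric model.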

Problem
Source: Ho, D. E., Imai, K., King, G., & Stuart, E. A. (2007). Matching as nonparametric preprocessing for reducing model dependence in parametric causal inference. Political Analysis, 15: 199-236.

Theory

Two-Step Approach

Implementation in R
What is R?
- A language and environment for statistical computing and graphics
- Provides a wide variety of statistical and graphical techniques
- Is highly extensible
- Provides an Open Source route to participation
- Great care has been taken over the defaults for the minor design choices in graphics
- User retains full control
- Available as Free Software
- Allows users to add additional functionality
- Can be extended (easily) via packages
The R Project for Statistical Computing

Implementation in R
MatchIt Package
- Dichotomous treatment variable
- Experimental and observational data
- Improving parametric statistical models
- Reduces model dependence
- Semi-parametric and non-parametric preprocessing
- Assess covariate distributions in the two groups (i.e., balance)
- Large range of matching methods: exact, subclassification, nearest neighbor, optimal, genetic
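A minimal sketch of what this preprocessing workflow can look like with the MatchIt package. It assumes MatchIt is installed and uses the lalonde example data that ships with the package; the choice of covariates and the outcome model are illustrative assumptions, not the analysis from this presentation.

```r
# Sketch only: runs the MatchIt part only if the package is available.
if (requireNamespace("MatchIt", quietly = TRUE)) {
  library(MatchIt)
  data("lalonde", package = "MatchIt")

  # Step 1 (preprocessing): match on a dichotomous treatment variable.
  # The covariate list is an illustrative choice.
  m.out <- matchit(treat ~ age + educ + married + nodegree + re74 + re75,
                   data = lalonde, method = "nearest")

  # Assess covariate balance in the two groups before and after matching
  print(summary(m.out))

  # Step 2 (analysis): extract the matched sample and fit a parametric model
  m.data <- match.data(m.out)
  fit <- lm(re78 ~ treat + age + educ + married + nodegree + re74 + re75,
            data = m.data, weights = weights)
  print(coef(fit)["treat"])
} else {
  message("MatchIt is not installed; skipping the example.")
}
```

The `weights` column is added by `match.data()`; standard errors for the second step need more care than this sketch shows.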

Implementation in R
Exact matching
- Simplest version of matching
- Match each treated unit to all possible control units with exactly the same values on all the covariates
- Sufficient matches often cannot be found
Subclassification
- Forms subclasses with "close" distributions of covariates
- Various subclassification schemes
- Can be used in conjunction with other matching methods
Nearest neighbor matching
- Selects "best" control matches for each treated unit
- Chooses the control unit not yet matched that is closest to the treated unit
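The nearest neighbor idea can be sketched in base R without any package. This is a simplified greedy 1:1 version on simulated data; the variable names (age, size) and the data-generating model are hypothetical, and MatchIt's own implementation offers more options than this.

```r
# Hypothetical data: treatment depends on two covariates
set.seed(42)
n <- 400
dat <- data.frame(age = rnorm(n, 50, 10), size = rnorm(n))
dat$treat <- rbinom(n, 1, plogis(-1.5 + 0.03 * (dat$age - 50) + 0.8 * dat$size))

# Step 1: estimate the propensity score with logistic regression
ps_model <- glm(treat ~ age + size, data = dat, family = binomial())
dat$ps <- fitted(ps_model)

# Step 2: greedy 1:1 nearest neighbor matching without replacement.
# Each treated unit (in data order) takes the closest not-yet-matched control.
treated  <- which(dat$treat == 1)
controls <- which(dat$treat == 0)
stopifnot(length(controls) >= length(treated))
matched_control <- integer(length(treated))
for (i in seq_along(treated)) {
  d <- abs(dat$ps[controls] - dat$ps[treated[i]])
  j <- which.min(d)
  matched_control[i] <- controls[j]
  controls <- controls[-j]          # remove so it cannot be reused
}
matched <- dat[c(treated, matched_control), ]

# Balance check: difference in covariate means before and after matching
diff_before <- mean(dat$size[dat$treat == 1]) - mean(dat$size[dat$treat == 0])
diff_after  <- mean(matched$size[matched$treat == 1]) -
               mean(matched$size[matched$treat == 0])
print(c(before = diff_before, after = diff_after))
```

Because the matching is greedy, the result depends on the order in which treated units are processed; optimal matching removes that dependence at higher computational cost.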

Implementation in R
Optimal matching
- Finds matched samples with the smallest average absolute distance
- Helpful when there are not many appropriate control matches
Genetic matching
- Uses a genetic search algorithm
- Optimal balance achieved after matching
- Performs statistical tests for determining balance
Variety of options for matching methods
- Number of matched control units
- Matching with or without replacement
- Kernel matching
- Discard treated units, control units, or both
- Number of subclasses
- Distance measure (e.g., logit)
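Some of these options map onto arguments of `matchit()`. The sketch below assumes MatchIt is installed and again uses the package's lalonde data; the specific settings (two controls per treated unit, with replacement) are chosen only to illustrate the arguments. As far as I know, `method = "optimal"` and `method = "genetic"` need additional packages (optmatch, and Matching plus rgenoud), so nearest neighbor matching is used here.

```r
# Sketch only: runs the MatchIt part only if the package is available.
if (requireNamespace("MatchIt", quietly = TRUE)) {
  library(MatchIt)
  data("lalonde", package = "MatchIt")

  m.out2 <- matchit(treat ~ age + educ + married + nodegree + re74 + re75,
                    data    = lalonde,
                    method  = "nearest",  # matching method
                    ratio   = 2,          # number of matched controls per treated unit
                    replace = TRUE)       # controls may be reused across treated units

  m.data2 <- match.data(m.out2)

  # With replacement, the control group can be smaller than ratio * treated,
  # and the weights column reflects how often each control was used
  print(table(treat = m.data2$treat))
  print(summary(m.data2$weights))
} else {
  message("MatchIt is not installed; skipping the example.")
}
```

Matching with replacement explains how a matched control group can contain fewer than twice as many units as the treatment group even when two controls are requested per treated unit.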

Example 1
Association between hospital system affiliation and hospital inventory in California hospitals (Zepeda, Nyaga, & Young, WP 2015)
- California hospital data from 2007 – observations (126 affiliated with smaller hospital systems)
- Preprocessing of data on affiliation with smaller hospital systems
- Genetic matching method
- 2 control observations with replacement for every treated observation
- 126 observations in treatment group
- 156 observations in control group
- Propensity score balance improved by 95%
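The slides do not say how the balance improvement figure was computed. Assuming it comes from MatchIt's balance summary, the package documentation defines percent balance improvement as follows, where a is the balance measure before matching (for example, the difference in mean propensity score between the two groups) and b is the same measure after matching:

```latex
\[
\text{Percent balance improvement} = 100 \times \frac{|a| - |b|}{|a|}
\]
```

As a purely hypothetical illustration, a mean difference of a = 0.20 before matching and b = 0.01 after matching gives 100 × (0.20 − 0.01) / 0.20 = 95%.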


Example 2
Association between IT-leveraging capability and high quality diabetes care in Minnesota primary care clinics (Zepeda & Sinha, WP 2015)
- Minnesota primary care clinics in observations (135 with high IT-leveraging capability)
- Preprocessing of data on high IT-leveraging capability
- Optimal matching method
- 1 control observation without replacement for every treated observation
- 135 observations in treatment group
- 135 observations in control group
- Propensity score balance improved by 76%


Example 3
Effect of easements on the selling price of farms in Minnesota (Taff & Weisberg, 2007)
Federal Conservation Reserve Program (CRP)
- Temporary conservation easement by USDA (10-15 years)
- Annual payment by USDA for enrolled land
- Land valuation theory predicts that temporary easements should have no effect on value of properties
Data
- Oct 1, 2002 – Sep 30, 2004
- Farm properties with short-term conservation easements
- Farm properties with no conservation easements
- Covariates
- 2,937 property sales (271 were restricted by CRP contracts)

Example 3
The primary objective
- Compare 271 sales with CRP restrictions to sales without
Standard observational study approach
- Use all sales with no CRP as a comparison group
Potential problem
- Properties sold without a random assignment
- Differences between observable sample and target population may be a cause for bias
Using propensity score matching
- Mimic a randomized experiment
- Sample of non-CRP and CRP sales
- Closely agree on salient property characteristics (i.e., balance)

Example 3
[Figure: medians, upper 75%, lower 25%; dotted lines = 95%]

Example 3
Six models developed and tested
- Models 1 – 3: use all data, CRP and portion of land RESTRICTED
- Model 4: restricts data to sales with PRODUCTIVITY measure
- Model 5: matched sample on CRP restriction
- Model 6: matched sample with PRODUCTIVITY measure
Consistency in results
- CRP contracts negatively associated with sales prices
- Most of CRP effect is captured by RESTRICTED amount
- Counter to land valuation theory


References
The R Project for Statistical Computing
MatchIt R Package
- Ho, D. E., Imai, K., King, G., & Stuart, E. A. (2007). Matching as nonparametric preprocessing for reducing model dependence in parametric causal inference. Political Analysis, 15: 199-236.
Examples
- Zepeda, D., Nyaga, G., & Young, G. (2015). Supply Chain Risk Management and Hospital Inventory: Effects of System Affiliation. Working Paper.
- Zepeda, D. & Sinha, K. (2015). IT-Leveraging Capability for Reducing Health Care Disparities: An Empirical Analysis of Primary Care Operations. Working Paper.
- Taff, S. J. & Weisberg, S. (2007). Compensated short-term conservation restrictions may reduce sales prices. The Appraisal Journal, Winter.

Thank You!
David Zepeda, Assistant Professor, Supply Chain & Information Management