Introduction to PSM: Practical Issues, Concerns, & Strategies Shenyang Guo, Ph.D. School of Social Work University of North Carolina at Chapel Hill January.


1 Introduction to PSM: Practical Issues, Concerns, & Strategies Shenyang Guo, Ph.D. School of Social Work University of North Carolina at Chapel Hill January 29, 2005 For Workshop Conducted at the School of Social Work, University of Illinois – Urbana-Champaign

2 1. Incomplete match versus inexact match Users of caliper matching almost always face a trade-off between two types of bias: tightening the caliper to keep matches close to exact excludes cases (incomplete matching), while widening it to retain more cases produces inexact matches. Try different caliper sizes and run a sensitivity analysis.
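A minimal sketch of the caliper trade-off, assuming greedy 1:1 nearest-neighbor matching without replacement on the propensity score. The function names and the scores are hypothetical and only illustrate how the number of matched cases and the average match distance move together as the caliper changes.

```python
def caliper_match(treated, controls, caliper):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    treated, controls: lists of propensity scores.
    Returns (treated_index, control_index, distance) for each accepted pair.
    """
    available = set(range(len(controls)))
    pairs = []
    for i, p_t in enumerate(treated):
        if not available:
            break
        # Closest control that has not been used yet.
        j = min(sorted(available), key=lambda k: abs(controls[k] - p_t))
        dist = abs(controls[j] - p_t)
        if dist <= caliper:  # accept only matches inside the caliper
            pairs.append((i, j, dist))
            available.remove(j)
    return pairs


def summarize(treated, controls, caliper):
    """Report how many treated cases matched and how close the matches are."""
    pairs = caliper_match(treated, controls, caliper)
    matched = len(pairs)
    mean_dist = sum(d for _, _, d in pairs) / matched if matched else 0.0
    return {
        "caliper": caliper,
        "matched": matched,
        "unmatched": len(treated) - matched,
        "mean_distance": mean_dist,
    }


# Hypothetical propensity scores, for illustration only.
treated_scores = [0.30, 0.45, 0.52, 0.70, 0.85]
control_scores = [0.10, 0.28, 0.44, 0.60, 0.62]

# Sensitivity analysis over several caliper sizes.
results = [summarize(treated_scores, control_scores, c) for c in (0.015, 0.05, 0.25)]
for row in results:
    print(row)
```

In this toy run the narrow caliper leaves most treated cases unmatched, while the wide caliper keeps more cases at the cost of a larger average distance between matched scores.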

3 2. Three limitations with PSM (In Rubin’s View) According to Rubin (1997), propensity scores: 1. cannot adjust for unobserved covariates, 2. work better in larger samples, and 3. handle a covariate that is related to treatment assignment but not to outcome in the same way as a covariate with the same relation to treatment assignment but strongly related to outcome.

4 3. Problem associated with the common-support region Propensity score matching almost always excludes subjects from the study, because nonparticipants outside the upper end of the common-support region and participants outside the lower end of the common-support region cannot find matches.

5 Illustration of the Common-Support Region (P score is log[(1-p)/p]) [Figure labels: Median = 3.62; Median = .494; Common Support Region] Cases outside the common-support region are eliminated.
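A minimal sketch of how a common-support region can be computed, assuming it is defined as the overlap between the score ranges of the two groups. The probabilities are hypothetical; the score uses the log[(1-p)/p] transform named on the slide, so a high participation probability corresponds to a low score.

```python
import math


def slide_score(p):
    """Score used on the slide: log[(1 - p) / p]."""
    return math.log((1.0 - p) / p)


def common_support(treated, controls):
    """Overlap of the two score ranges: [max of the minima, min of the maxima]."""
    low = max(min(treated), min(controls))
    high = min(max(treated), max(controls))
    return low, high


# Hypothetical estimated participation probabilities, for illustration only.
treated_p = [0.35, 0.50, 0.65, 0.80, 0.95]
control_p = [0.05, 0.20, 0.40, 0.55, 0.70]

treated_s = [slide_score(p) for p in treated_p]
control_s = [slide_score(p) for p in control_p]

low, high = common_support(treated_s, control_s)

kept_treated = [s for s in treated_s if low <= s <= high]
kept_controls = [s for s in control_s if low <= s <= high]
dropped_treated = [s for s in treated_s if s < low or s > high]
dropped_controls = [s for s in control_s if s < low or s > high]

print("common support:", (round(low, 3), round(high, 3)))
print("treated kept/dropped:", len(kept_treated), len(dropped_treated))
print("controls kept/dropped:", len(kept_controls), len(dropped_controls))
```

With these made-up values the dropped participants all fall below the region and the dropped nonparticipants all fall above it, which is the pattern the preceding slide describes.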

6 4. Using appropriate set of conditioning variables The propensity score literature almost unanimously emphasizes the importance of including a “correct” or appropriate set of conditioning variables in the model predicting propensity scores. Simulation and replication studies have found that estimated treatment effects are sensitive to the specification of conditioning variables.

7 4. Using appropriate set of conditioning variables (continued) Smith and Todd (2004) found that using more conditioning variables could exacerbate the common-support problem. Decide whether to use variables as conditioning variables in the logistic regression or as selection variables. [e.g.] The four need variables in the evaluation of substance abuse treatment on child well-being. Endogeneity bias: a conditioning variable that is highly correlated with the treatment assignment. [e.g.] “CWW Report of Need” is almost the same as service use.

8 5. The issue of selection biases Various types of selection biases: self-selection, triaging (the selection and service provision based on higher client level of need), bureaucratic selection, geographic selection, attrition selection, analytic selection.

9 5. The issue of selection biases (continued) Selection based on observables and unobservables. PSM cannot control for selection bias due to unobservables. Using HLM or random-effects modeling after matching may reduce selection bias due to unobservables. Random effects: a lump sum of unobserved variables.

10 6. Various types of covariates and strategies for using them Covariates correlated to treatment assignment and outcome approximately equally, Covariates correlated to treatment assignment but not to outcome, Covariates correlated more strongly to outcome than to treatment assignment. Little is known about best strategies to deal with the last type. Monte Carlo study or data simulation is needed.

11 7. Removing differences in covariate distributions between groups It is common practice to run bivariate analyses before and after matching, using t-tests, chi-square tests, or other bivariate methods, to test whether the treated and control groups differ on each covariate included in the logistic regression. In rerunning the propensity score model, one may include a squared term of a covariate that remains significant after matching, or a product of two covariates if one believes that the correlation between these two covariates differs between groups (Rosenbaum & Rubin, 1984).
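A minimal sketch of a before-and-after balance check on a single covariate, assuming a Welch two-sample t statistic as the bivariate test. The covariate values are hypothetical, and the sketch reports only the t statistic; obtaining a p-value would require a t distribution from a statistics package.

```python
import math
import statistics


def welch_t(x, y):
    """Welch two-sample t statistic for the difference in means of x and y."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    var_x, var_y = statistics.variance(x), statistics.variance(y)
    standard_error = math.sqrt(var_x / len(x) + var_y / len(y))
    return (mean_x - mean_y) / standard_error


# Hypothetical child ages (years), for illustration only.
treated_age = [10, 11, 12, 13, 14]
controls_before = [4, 5, 6, 7, 8]       # all available controls
controls_after = [10, 11, 12, 12, 14]   # controls retained by matching

t_before = welch_t(treated_age, controls_before)
t_after = welch_t(treated_age, controls_after)

print(f"t before matching: {t_before:.2f}")
print(f"t after matching:  {t_after:.2f}")
```

A covariate whose statistic stays large after matching is a candidate for the squared or product terms mentioned above when the propensity score model is rerun.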

12 8. Choosing best bandwidth value (in DID) via cross validation In practice, a data-driven procedure is needed to select the bandwidth value. The optimal bandwidth can be found by searching for the value that minimizes the cross-validation function: [formula not preserved in this transcript] Currently this procedure is not available in Stata. We plan to program it ourselves.
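A minimal sketch of a data-driven bandwidth search, assuming a generic leave-one-out cross-validation criterion for a Gaussian-kernel (Nadaraya-Watson) regression. This is not necessarily the criterion the slide refers to, since that formula is not shown; the data, the grid, and the function name are hypothetical.

```python
import math


def loo_cv_score(x, y, h):
    """Mean squared leave-one-out prediction error for bandwidth h."""
    n = len(x)
    total = 0.0
    for i in range(n):
        num = den = 0.0
        for j in range(n):
            if j == i:
                continue  # leave observation i out of its own prediction
            weight = math.exp(-0.5 * ((x[i] - x[j]) / h) ** 2)
            num += weight * y[j]
            den += weight
        if den == 0.0:
            return float("inf")  # bandwidth so small that no neighbor gets weight
        total += (y[i] - num / den) ** 2
    return total / n


# Hypothetical data: a smooth signal plus a small deterministic disturbance.
x = [i / 10 for i in range(30)]
y = [math.sin(v) + 0.1 * (-1) ** i for i, v in enumerate(x)]

# Grid search: keep the bandwidth with the smallest cross-validation score.
grid = [0.05, 0.1, 0.2, 0.5, 1.0, 2.0]
scores = {h: loo_cv_score(x, y, h) for h in grid}
best_h = min(scores, key=scores.get)

for h in grid:
    print(f"h = {h:<4}  CV score = {scores[h]:.4f}")
print("selected bandwidth:", best_h)
```

A grid search is the simplest option; a numerical optimizer could replace it when a finer resolution is needed.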

13 9. The issue of sample size and statistical power PSM requires a large sample. The bottom line is that the matched sample should be large enough for the multivariate analysis to have adequate power. This may be a problem when the treatment group is small (say, below 50). Sample size is usually not a problem when using Heckman’s DID approach. We plan to use Cohen’s general framework of statistical power analysis and the statistical/econometric literature to develop a framework of statistical power analysis for PSM. It will explicitly take into consideration variations in the split fractions defining the sizes of the treated and nontreated groups.
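A minimal sketch of how the split between treated and nontreated groups affects power, assuming a standard normal approximation for a two-sided, two-group comparison of means with a standardized effect size. This is a textbook approximation, not the PSM-specific framework the slide proposes to develop; the function name and sample sizes are hypothetical.

```python
import math
from statistics import NormalDist


def two_group_power(effect_size, n_treated, n_control, alpha=0.05):
    """Approximate power of a two-sided test for a standardized mean difference."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Noncentrality grows with the "effective" sample size n_t * n_c / (n_t + n_c).
    ncp = effect_size * math.sqrt(n_treated * n_control / (n_treated + n_control))
    return z.cdf(ncp - z_crit) + z.cdf(-ncp - z_crit)


# Same total sample (128) and same medium effect size (0.5), different splits.
balanced = two_group_power(0.5, 64, 64)
unbalanced = two_group_power(0.5, 30, 98)

print(f"power with a 64/64 split: {balanced:.3f}")
print(f"power with a 30/98 split: {unbalanced:.3f}")
```

Holding the total sample fixed, the unequal split yields lower power, which is why a small treatment group can be a problem even when many comparison cases are available.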

14 10. Standardized or metric-free measure of effect size in the context of PSM In DID, the standard errors are estimated by the bootstrapping procedure. It’s uncertain whether or not one can use such SE to calculate the so-called metric-free measure of effect size. The standard measure of E.S. usually employs a pooled within-sample estimate of the standard error. The comparability of such E.S. to those produced by DID and bootstrapping needs to be determined. We plan to explore this issue, and develop a procedure for calculating standardized effect size so that researchers can compare differences on an outcome measure between intervention conditions estimated by the DID estimator to those produced by other studies and interventions.
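A minimal sketch of the conventional standardized effect size, assuming Cohen's d computed with the pooled within-group standard deviation. The outcome values are hypothetical, and the sketch does not address whether a bootstrapped standard error from DID can be substituted, which the slide leaves open.

```python
import math
import statistics


def cohens_d(treated, control):
    """Mean difference divided by the pooled within-group standard deviation."""
    n_t, n_c = len(treated), len(control)
    var_t = statistics.variance(treated)
    var_c = statistics.variance(control)
    pooled_sd = math.sqrt(((n_t - 1) * var_t + (n_c - 1) * var_c) / (n_t + n_c - 2))
    return (statistics.mean(treated) - statistics.mean(control)) / pooled_sd


# Hypothetical outcome scores, for illustration only.
treated_outcome = [12, 14, 16, 18, 20]
control_outcome = [10, 12, 14, 16, 18]

d = cohens_d(treated_outcome, control_outcome)
print(f"Cohen's d = {d:.3f}")
```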

15 Questions and Discussions

16 Discussion Example 1: Disruptions of subsidized guardianship vs. adoption In Illinois, a federal IV-E waiver supports the placement of foster children in private guardian homes. We've created a Stata data file that lists all of the adoptions and guardianships in the state from 1998 to 2004 and tracked the rate of disruption. We find that SG disrupts more often than adoption.

17 Discussion Example 1: Disruptions of subsidized guardianship vs. adoption (Continued) Research Question: whether the disruption rate would have been lower if the children who had gone into guardianship had instead been adopted (counterfactual). Variables: the file includes predictors on race, type of placement, age of child, region of state, and permanency cohort.

18 Discussion Example 1: Disruptions of subsidized guardianship vs. adoption (Continued) Analytic strategies: Because the disruption rate needs to be analyzed by event history analysis, we cannot use Heckman’s DID. 1. Run nearest-neighbor matching within a caliper to create a new sample in which the SG and A children are similar on observed characteristics. 2. Run survival analysis (Kaplan-Meier/Cox regression) to see the difference in disruption rate between SG and A.
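A minimal sketch of the second step, assuming a Kaplan-Meier estimate of the probability of remaining undisrupted in each matched group. The durations and event indicators are hypothetical and stand in for a matched sample; a Cox regression would need a statistics package and is not shown.

```python
def kaplan_meier(durations, events):
    """Kaplan-Meier survival estimate.

    durations: time to disruption or censoring.
    events: 1 if the placement disrupted, 0 if the case was censored.
    Returns [(time, survival_probability)] at each time with a disruption.
    """
    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        disrupted = leaving = 0
        # Collect every case that exits the risk set at time t.
        while i < len(data) and data[i][0] == t:
            disrupted += data[i][1]
            leaving += 1
            i += 1
        if disrupted:
            survival *= 1 - disrupted / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical matched samples (years until disruption or censoring).
sg_curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
adoption_curve = kaplan_meier([2, 4, 6, 7, 8], [0, 1, 0, 0, 0])

print("SG:", [(t, round(s, 3)) for t, s in sg_curve])
print("Adoption:", [(t, round(s, 3)) for t, s in adoption_curve])
```

Comparing the two curves at the same follow-up times shows which group retains more undisrupted placements; a log-rank test or Cox model would be the usual way to test the difference formally.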

19 Discussion Example 2: Crisis Nursery Services The intervention: crisis nursery services, 5 sites. Research objective: match children and families with similar characteristics to see if the interventions made a difference in interface with the child welfare system or future mental health problems. Analytic strategies: outcome measures; matching variables; matching methods.

20 Discussion Example 3: IV-E waiver experiment testing the effects of a training program for caseworkers on child welfare outcomes The problem: teams of workers were randomly assigned to treatment and control conditions. In the treatment condition, not all of the workers managed to enroll in the program and some dropped out before completing treatment. Research objective: We've taken an intent-to-treat approach to comparing the two groups, but we're wondering if PSM can help us in conducting an analysis on the subset of treatment cases that actually completed the full sequence of instruction.

