Average Treatment Effects and Propensity Score Methods
Zhehui Luo, MS, PhD
Department of Epidemiology
Michigan State University
What experimental data to use?
  Treatment group
  Control group
  Both treatment and control groups
Where to draw the comparison sample?
Specifications for propensity scores (balancing tests)
  Mean difference by p-score strata
  Standardized differences
  Hotelling's T² tests of joint null
  Regression-based tests
Matching methods
  Nearest neighbor
    With replacement
    Without replacement
    Seed for the random number generator
  Kernel/local linear
    Bandwidth
  Caliper
    Radius
Bias estimation
  Differentials between treatment and comparison groups, against the experimental benchmark
  Differentials between control and comparison groups, against zero
Stratification
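One of the balancing checks listed above, the standardized difference, can be sketched in a few lines of Python. This is only an illustration under an assumed definition (difference in means divided by the square root of the average of the two sample variances, times 100); the slides do not say which variant was used. The function name and the sample ages are hypothetical.

```python
import math
from statistics import mean, variance


def standardized_difference(treated, comparison):
    """Standardized difference in percent for one covariate.

    Assumed definition:
        100 * (mean_t - mean_c) / sqrt((var_t + var_c) / 2)
    using sample variances (n - 1 denominator).
    """
    pooled_sd = math.sqrt((variance(treated) + variance(comparison)) / 2.0)
    if pooled_sd == 0:
        raise ValueError("covariate has no variation in either group")
    return 100.0 * (mean(treated) - mean(comparison)) / pooled_sd


# Hypothetical ages, made up for illustration (not taken from the study data).
treated_age = [58, 61, 64, 55, 60]
comparison_age = [70, 72, 68, 75, 71]

d = standardized_difference(treated_age, comparison_age)
print(f"standardized difference in age: {d:.1f}%")
```

A large absolute value flags a covariate that is poorly balanced between the treated group and a candidate comparison sample. The same function can be applied before and after matching to see whether balance improved.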
Cognitive Behavioral Intervention on reducing symptom severity during chemotherapy for cancer patients, – CBI
Problem Solving Therapy on reducing symptom severity during chemotherapy for cancer patients, – WCI
Community Based Family Care Study, , short term impact of cancer treatments on newly diagnosed patients – FCS
Data provided by research supported by the following: Given, B., Given, C.W., McCorkle, R. Family Home Care for Cancer, a community-based model. Grant #R01 CA79280, funded by NCI, NINR, NIA P30 AG08808, in collaboration with Walther Cancer Institute, IN
[Diagram: samples used in the comparisons]
CBI all: CBI treatment group, CBI control group
WCI all: WCI treatment group, WCI control group
FCS all: FCS-1, FCS-2, FCS-3
FCS-1: Undergoing chemotherapy
FCS-2: Advanced cancer stage
FCS-3: Advanced cancer stage and undergoing chemotherapy
Table 1. Characteristics of CBI, WCI and FCS patients.

                  CBI-Exp      CBI-Con      WCI-Exp      WCI-Con      FCS-All      FCS-1        FCS-2        FCS-3
Baseline N
week N (%)        79 (67.5%)   86 (72.9%)   38 (61.3%)   46 (74.2%)   678 (76.4%)  206 (70.6%)  157 (60.2%)  85 (59.0%)
% Breast Cancer
% Colon Cancer
% Lung Cancer
% Late Stage
Age (std)         (9.80)       (11.18)      (13.06)      (11.65)      (5.15)       (4.79)       (5.12)       (4.79)
Table 1 continued.

                      CBI-Exp   CBI-Con   WCI-Exp   WCI-Con   FCS-All   FCS-1     FCS-2     FCS-3
SF-36 PF Baseline     (28.59)   (29.98)   (27.38)   (27.27)   (28.79)   (28.37)   (30.05)   (29.00)
SF-36 PF 20-week      (22.90)   (30.23)   (27.35)   (23.99)   (26.59)   (29.01)   (30.48)   (30.38)
SF-36 MH Baseline     (17.25)   (18.63)   (19.16)   (17.77)   (15.34)   (15.78)   (15.99)   (15.28)
SF-36 MH 20-week      (16.38)   (19.21)   (13.94)   (19.20)   (15.20)   (16.09)   (15.32)   (13.94)
Table 2. Testing between 10 composite samples

Columns (p-values): CBI-exp & WCI-con; WCI-exp & CBI-con; CBI-exp & FCS; WCI-exp & FCS; CBI-exp & FCS-1; WCI-exp & FCS-1; CBI-exp & FCS-2; WCI-exp & FCS-2; CBI-exp & FCS-3; WCI-exp & FCS-3
Rows: 10-week N; 20-week N; Cancer Sites; Late Stage; Female; White; Education; Married; Age; Severity baseline; Severity 20-week; CESD baseline; CESD 20-week; SF36 PF baseline; SF36 PF 20-week; SF36 MH baseline; SF36 MH 20-week
Note: Yellow cells indicate significant differences at the 0.05 level for baseline variables. Light green cells indicate significant differences at the 0.05 or 0.1 level at 20 weeks.
Table 3. Treatment effects on SF-36 Physical Functioning, using CBI-treated group and various comparisons

Experimental estimates
                  Model 1          Model 2          Model 3
                  (4.30)**         5.17 (3.87)      7.17 (3.81)*

Non-experimental estimates
                  FCS-1            FCS-2            FCS-3            FCS-all          WCI-con
Model 1           (3.64)*          6.14 (3.90)      12.40 (4.32)**   1.84 (3.17)      19.03 (4.62)***
Model 2           (3.05)           1.53 (3.40)      6.64 (3.94)*     -1.14 (2.63)     11.39 (3.97)***
Model 3           (4.21)           -2.48 (4.74)     -0.93 (5.74)     -0.44 (3.24)     8.22 (4.56)*
Model 1 | p       7.51 (6.54)      6.05 (6.57)      7.33 (8.34)      3.54 (4.96)      14.27 (5.68)**
attnd             -8.01 (16.05)    48.16 (28.77)*   -7.13 (19.17)    35.29 (21.69)    12.43 (6.24)**
atts              6.98 (7.37)      4.97 (8.25)      7.26 (15.23)     34.90 (15.45)**  15.37 (5.04)***
attk (0.06)       9.05 (9.32)      12.64 (8.73)     6.72 (16.21)     15.35 (8.76)*    14.69 (5.33)***
psmatch2, n(1)    25.77 (25.97)    55.77 (27.94)*   70.77 (32.38)**  (26.44)          35.77 (25.66)
psmatch2, n(4)    (12.21)***       42.02 (12.99)**  (13.59)**        -5.21 (5.67)     10.49 (11.90)
match, n(4)       7.33 (3.29)**    5.64 (3.76)      -5.47 (4.65)     -7.70 (3.07)**   4.23 (4.47)

* significant at 10%; ** significant at 5%; *** significant at 1%
Table 4. Treatment effects on SF-36 Mental Health, using CBI-treated group and various comparisons

Experimental estimates
                  Model 1          Model 2          Model 3
                  (3.00)           2.50 (2.66)      1.97 (2.78)

Non-experimental estimates
                  FCS-1            FCS-2            FCS-3            FCS-all          WCI-con
Model 1           (2.34)           2.00 (2.38)      1.74 (2.74)      1.85 (1.86)      0.67 (3.61)
Model 2           (2.21)           0.73 (2.36)      0.04 (2.75)      1.23 (1.84)      0.58 (3.59)
Model 3           (3.22)           0.73 (3.46)      -0.15 (4.27)     2.97 (2.35)      -0.23 (4.20)
Model 1 | p       8.09 (4.03)**    7.46 (4.02)*     9.66 (5.27)*     7.93 (2.91)***   -0.26 (4.14)
attnd             5.78 (5.34)      17.00 (7.60)**   6.76 (3.75)*     10.96 (5.58)*    -1.46 (6.07)
atts              7.96 (3.18)**    7.49 (3.84)*     8.54 (4.80)*     9.88 (2.52)***   1.11 (4.23)
attk              7.55 (5.15)      9.66 (4.87)*     6.38 (5.75)      8.99 (3.04)***   -0.39 (6.44)
psmatch2, n(1)    (19.10)          11.68 (15.18)    7.68 (15.93)     (14.41)          (20.38)
psmatch2, n(4)    -4.32 (8.14)     1.68 (8.60)      -3.32 (8.05)     (7.43)           12.15 (10.85)
match, n(4)       7.12 (2.37)***   3.41 (2.38)      7.41 (2.64)***   12.08 (2.02)***  -1.24 (3.81)

* significant at 10%; ** significant at 5%; *** significant at 1%
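The matching rows in the tables differ in how many neighbors are used, so it may help to see how a nearest-neighbor estimate of the effect on the treated is assembled. The sketch below is a generic illustration in Python, not the implementation behind psmatch2, match, or attnd. It assumes matching with replacement on an already-estimated propensity score, an optional caliper, and an unweighted average of the matched outcomes. The function name and the toy data are hypothetical.

```python
def att_nearest_neighbor(treated, comparison, k=1, caliper=None):
    """Average treatment effect on the treated by k-nearest-neighbor matching.

    treated, comparison: lists of (propensity_score, outcome) pairs.
    Matching is with replacement: every treated unit searches the full
    comparison sample. Ties in distance are broken by list order, so the
    sort order of the data can affect the result.
    """
    effects = []
    for p_t, y_t in treated:
        # Rank comparison units by distance in propensity score.
        ranked = sorted(comparison, key=lambda unit: abs(unit[0] - p_t))
        if caliper is not None:
            ranked = [u for u in ranked if abs(u[0] - p_t) <= caliper]
        matches = ranked[:k]
        if not matches:
            continue  # no comparison unit inside the caliper; unit is dropped
        y_matched = sum(y for _, y in matches) / len(matches)
        effects.append(y_t - y_matched)
    if not effects:
        raise ValueError("no treated unit could be matched")
    return sum(effects) / len(effects)


# Toy data, made up for illustration: (propensity score, outcome).
treated = [(0.30, 60.0), (0.70, 80.0)]
comparison = [(0.28, 50.0), (0.40, 58.0), (0.65, 70.0), (0.90, 90.0)]

att_k1 = att_nearest_neighbor(treated, comparison, k=1)
att_k2 = att_nearest_neighbor(treated, comparison, k=2)
att_k2_caliper = att_nearest_neighbor(treated, comparison, k=2, caliper=0.15)
print(att_k1, att_k2, att_k2_caliper)
```

Even on four comparison units, the estimate moves with the number of neighbors and the caliper. With small comparison samples such as those in the tables above, that sensitivity can be substantial.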
Concluding Remarks
Cautionary tale
Comparison Samples
Stata Commands
Specification Tests
Questions to the audience
Why did psmatch2 produce such drastically different results?
Why should differences between estimated ATE using various econometric techniques and ATE from randomized trial be viewed as specification errors in econometric methods and not as selection bias in the experimental results?
[Figure: outcomes pre-treatment and post-treatment for the treated group with treatment, the treated group without treatment, the control group without treatment, and the comparison group]
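The groups in this figure map onto the two bias measures named in the outline. One way to write them down is given below. The notation is an assumption and is not taken from the slides: $Y_1$ and $Y_0$ are the outcomes with and without treatment, $D = 1$ marks the treated, $T$, $C$ and $N$ denote the experimental treatment group, the experimental control group and the non-experimental comparison group, and $\hat{\Delta}(A, B)$ is any estimator of the effect that uses $A$ as the treated sample and $B$ as the untreated sample.

```latex
\begin{align*}
\text{ATT} &= E[\,Y_1 - Y_0 \mid D = 1\,] \\
\text{Experimental benchmark} &= \hat{\Delta}(T, C) \\
\text{Bias}_1 &= \hat{\Delta}(T, N) - \hat{\Delta}(T, C) \\
\text{Bias}_2 &= \hat{\Delta}(C, N) - 0
\end{align*}
```

The second measure is compared against zero because neither the control group nor the comparison group receives the treatment. Any nonzero estimate therefore reflects differences between the samples or the estimator's specification, not a treatment effect.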