Propensity Scores: How to do it – Part 1
X11 X12 X13
X21 X22 X23
X31 X32 X33
No matrices were harmed in this presentation
WHY YOU NEED IT: TWO NON-EQUIVALENT GROUPS. Examples: patients in specialized units; people who attend a fundraising event.
Research Question: Are nursing homes dangerous for seniors? Does admission to a nursing home increase the risk of death in adults over 65 years of age, controlling for age, gender, race, and number of emergency room visits?
Propensity Score Matching or Do nursing homes kill you?
ANY TIME YOU CAN ASK THE QUESTION … Is there a difference on OUTCOME between levels of “treatment” A, controlling for X, Y and Z?
Examples
OUTCOME | “TREATMENT” LEVELS | COVARIATES
DROP OUT | PUBLIC, PRIVATE | INCOME, PARENT EDUCATION, GR. 8 ACHIEVEMENT
BMI | DAILY SOFT DRINKS, NO SOFT DRINKS | GENDER, AGE, RACE, EXERCISE FREQ.
DEATH | LIVES AT HOME, NURSING HOME | AGE, GENDER, TOTAL ER VISITS
1. Make sure there are pre-existing differences (Thank you, Captain Obvious)
2a. Decide on covariates: Are the differences pre-existing, or could they possibly be due to the different “treatment” levels? Race and gender are good choices for covariates: if more students at private vs. public schools are black or female, the schooling probably didn’t cause that. Differences in grade 10 math scores, by contrast, may be a result of the type of school, so they are a poor choice.
2b. Decide on covariates: Don’t use your outcome variable as one of your covariates.
3. Run logistic regression to generate propensity scores
LOGISTIC REGRESSION VARIABLES dep
  /METHOD=ENTER indep1 indep2 indep3
  /SAVE=PRED
  /CRITERIA=PIN(.05) POUT(.10) ITERATE(20) CUT(.5).
RENAME VARIABLES (PRE_1=propen).
SAVE OUTFILE="test.sav".
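For readers who want to see what the saved PRE_1 column represents, here is a minimal sketch outside SPSS, in plain Python. It assumes hypothetical toy data (ten made-up ages and a made-up nursing-home flag) and a simple gradient-ascent fit rather than the estimation routine SPSS uses; the point is only that a propensity score is the predicted probability of being in the "treatment" group given the covariates.

```python
import math

def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit logistic regression by batch gradient ascent.
    Returns weights [intercept, b1, b2, ...]."""
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
            p = 1.0 / (1.0 + math.exp(-z))
            err = yi - p                      # gradient of the log-likelihood
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj + lr * g / n for wj, g in zip(w, grad)]
    return w

def propensity(w, xi):
    """Predicted probability of treatment for one case."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical toy data (not from the presentation): age in years and
# a 0/1 flag for living in a nursing home.
ages = [65, 68, 70, 72, 75, 79, 81, 84, 88, 90]
y = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

# Center and rescale age so the simple optimizer behaves well.
mean_age = sum(ages) / len(ages)
X = [[(a - mean_age) / 10.0] for a in ages]

w = fit_logistic(X, y)
scores = [propensity(w, xi) for xi in X]   # analogous to the renamed "propen"
for a, t, s in zip(ages, y, scores):
    print(a, t, round(s, 3))
```

Each case gets one score between 0 and 1, and the matching step works on that single number instead of on all the covariates at once.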
4. Select matching method
  1. Quintiles
  2. Nearest neighbors
  3. Calipers
ALL OF THE ABOVE CAN BE DONE EITHER WITH OR WITHOUT REPLACEMENT
5. Run matching program & test its effectiveness
6. Run your analysis using the matched data set
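The three methods can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the matching program used in the presentation: the case IDs, the scores, and the caliper width of 0.05 are all hypothetical, and the matching shown is greedy 1:1 nearest neighbor without replacement.

```python
def quintile(scores):
    """Assign each score to a quintile (1 = lowest fifth) by rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    q = [0] * len(scores)
    for rank, i in enumerate(order):
        q[i] = rank * 5 // len(scores) + 1
    return q

def match_nearest(treated, controls, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching on the propensity score,
    without replacement. A pair is kept only if the scores differ by
    no more than the caliper. Inputs are dicts of id -> score."""
    available = dict(controls)
    pairs = []
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]       # without replacement: use each control once
    return pairs

# Hypothetical propensity scores
treated = {"t1": 0.30, "t2": 0.55, "t3": 0.90}
controls = {"c1": 0.28, "c2": 0.33, "c3": 0.52, "c4": 0.60}

pairs = match_nearest(treated, controls)
print(pairs)                                   # t3 has no control within the caliper
print(quintile([0.1, 0.9, 0.5, 0.3, 0.7]))
```

Matching with replacement would simply skip the `del available[c_id]` step, so one control could serve as the match for several treated cases.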
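One way to test whether matching worked is to compare a covariate across groups before and after matching. The sketch below uses the standardized mean difference, which is one common balance measure and not necessarily the check used in this presentation; the ages are hypothetical.

```python
import math
from statistics import mean, variance

def std_mean_diff(treated, control):
    """Standardized mean difference: difference in means divided by the
    square root of the average of the two group variances."""
    pooled_sd = math.sqrt((variance(treated) + variance(control)) / 2)
    return (mean(treated) - mean(control)) / pooled_sd

# Hypothetical ages before matching (all cases)
treated_before = [80, 84, 88, 90]
control_before = [65, 68, 70, 75, 81, 85]

# Hypothetical ages after matching (matched pairs only)
treated_after = [80, 84]
control_after = [81, 84]

smd_before = std_mean_diff(treated_before, control_before)
smd_after = std_mean_diff(treated_after, control_after)
print(round(smd_before, 2), round(smd_after, 2))
```

A value near zero after matching suggests the groups are balanced on that covariate; a value that stays large suggests trying a different model or matching method.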
An actual example Do specialized hospital units save lives?
Our problem: We have cities with and without specialized care units (trauma center, burn unit). We want to see if the cities with specialized units have higher survival rates, controlling for other variables.
Creating Propensity Scores: What variables are related to group? Example: Age group and gender were significantly related to city.
Preparing the data: Maximum likelihood solutions are large-sample methods. You may wish to combine or delete categories with small numbers.
Consider dropping or combining categories… (this was done)
[Frequency table for MECHANISM; columns: Frequency, Cumulative Percent; rows: Fall, GSW, MVC, Other Accidents, Shark attacks, HWB, Total]
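Combining small categories can be sketched as follows. This is a plain-Python illustration, not the recoding actually done for this study: the counts are made up, and the cutoff of 5 cases and the label "Other" are assumptions chosen for the example.

```python
from collections import Counter

def collapse_rare(values, min_count=5, other="Other"):
    """Replace any category with fewer than min_count cases by a
    single catch-all label."""
    counts = Counter(values)
    return [v if counts[v] >= min_count else other for v in values]

# Hypothetical mechanism-of-injury values
mechanism = ["Fall"] * 6 + ["MVC"] * 5 + ["GSW"] * 2 + ["Shark attack"]

collapsed = collapse_rare(mechanism)
print(Counter(mechanism))
print(Counter(collapsed))
```

The logistic regression then estimates one coefficient for the combined category instead of several unstable ones for categories with only a case or two.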
Start SPSS. Open example.sav: File > Open > Data. Note: This is real data with some changes made for confidentiality.
An appearance by Captain Obvious: Propensity score matching essentially checks whether the difference between groups disappears once pre-existing differences are controlled. So, before you go to all of this trouble, test to see that the groups are, in fact, significantly different.
Syntax vs Pointy-clicky stuff
EDIT > OPTIONS > Viewer
Example: City study ANALYZE > Descriptive Statistics > Crosstabs > Statistics > Chi-square
Use crosstabs to test for difference on categorical variables
Move the desired variables to Rows and Columns. Click on Statistics. Note: You can put multiple variables under Rows.
Click on Chi-square. If desired, select the phi coefficient also.
SYNTAX
CROSSTABS
  /TABLES=OUTCOME Age_groups CategGCS BY City_of_injury
  /FORMAT=AVALUE TABLES
  /STATISTICS=CHISQ
  /CELLS=COUNT
  /COUNT ROUND CELL.
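To show what the chi-square test on a crosstab computes, here is a small plain-Python sketch of the Pearson chi-square statistic. The 2 x 2 counts are hypothetical, not taken from the city data, and the sketch reports only the statistic and degrees of freedom, not a p-value.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a
    table of observed counts (list of rows)."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(row_tot) - 1) * (len(col_tot) - 1)
    return stat, df

# Hypothetical counts: rows = outcome (lived, died), columns = city (A, B)
table = [[10, 20],
         [20, 10]]

stat, df = chi_square(table)
print(round(stat, 3), df)
```

With 1 degree of freedom, a statistic above 3.841 is significant at the .05 level, so these made-up groups would count as different on outcome.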
Basic statistics to test covariates: Testing for differences on numeric variables. ANALYZE > COMPARE MEANS > INDEPENDENT SAMPLES T-TEST
Independent samples t-test
Age as test variable; City_of_injury as grouping variable
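The comparison behind the independent samples t-test can be sketched in plain Python. This is the equal-variances (pooled) form only, and the two lists of values are hypothetical stand-ins for the ages in each city.

```python
import math
from statistics import mean, variance

def pooled_t(a, b):
    """Independent samples t statistic assuming equal variances.
    Returns (t, degrees of freedom)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / se, df

# Hypothetical values for the test variable in each group
city_a = [1, 2, 3, 4, 5]
city_b = [2, 3, 4, 5, 6]

t, df = pooled_t(city_a, city_b)
print(round(t, 3), df)
```

A t value this close to zero would mean the groups do not differ on that variable, which is the situation where it adds little as a matching covariate.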
What differs between cities? Age in years and Age group were not significantly different between cities. Gender, Trauma Type, Mechanism of Injury, Admission to ICU, GCS, ISS & RTS are all significantly different between cities.
What differs between outcomes? ICU_LOS, Trauma Type, Mechanism of Injury, Admission to ICU, GCS, ISS & RTS are all significantly different between outcomes.
What variables should be controlled? Example of City A vs B:
- Logistic regression with city as dependent and Age Group, Trauma Type & Admission to ICU as independents.
- Logistic regression with city as dependent and Age Group, Gender, Trauma Type, Mechanism of Injury, Admission to ICU, GCS, ISS & RTS as independents.
Since running the logistic regression and creating propensity scores takes relatively little time, it is not much trouble to test more than one model.
Logistic regression From SPSS menu select: ANALYZE > REGRESSION > BINARY LOGISTIC
Covariates: MECHANISM, TRAUMATYPE, RTS, ISS, CategGCS, ICULOS
Define categorical variables
Select Predicted Probabilities (not yet)
SYNTAX
LOGISTIC REGRESSION VARIABLES City_of_injury
  /METHOD=ENTER MECHANISM TRAUMATYPE RTS ISS CategGCS ICULOS
  /CONTRAST (MECHANISM)=Indicator
  /CONTRAST (CategGCS)=Indicator
  /CONTRAST (TRAUMATYPE)=Indicator
  /SAVE=PRED
  /CRITERIA=PIN(.05) POUT(.10) ITERATE(20) CUT(.5).
Note: Don’t include the /SAVE=PRED line yet.