Propensity Scores How to do it – Part 1

X11 X12 X13
X21 X22 X23
X31 X32 X33
No matrices were harmed in this presentation.

WHY YOU NEED IT: TWO NON-EQUIVALENT GROUPS
- Patients in specialized units
- People who attend a fundraising event

Research Question: Are nursing homes dangerous for seniors? Does admission to a nursing home increase the risk of death in adults over 65 years of age when controlling for age, gender, race, and number of emergency room visits?

Propensity Score Matching or Do nursing homes kill you?

ANY TIME YOU CAN ASK THE QUESTION ... Is there a difference in OUTCOME between levels of “treatment” A, controlling for X, Y and Z?

Examples
OUTCOME | “TREATMENT” LEVELS | COVARIATES
DROP OUT | PUBLIC, PRIVATE | INCOME, PARENT EDUCATION, GR. 8 ACHIEVEMENT
BMI | DAILY SOFT DRINKS, NO SOFT DRINKS | GENDER, AGE, RACE, EXERCISE FREQ.
DEATH | LIVES AT HOME, NURSING HOME | AGE, GENDER, TOTAL ER VISITS

1. Make sure there are pre-existing differences (Thank you, Captain Obvious)

2a. Decide on covariates
Are the differences pre-existing, or could they be due to the different “treatment” levels?
- Race and gender are good choices for covariates. If more students at private than at public schools are black or female, the schooling probably didn’t cause that.
- Differences in grade 10 math scores may be a result of the type of school, so they are a poor choice of covariate.

2b. Decide on covariates: Don’t use your outcome variable as one of your covariates.

3. Run logistic regression to generate propensity scores
LOGISTIC REGRESSION VARIABLES dep
  /METHOD=ENTER indep1 indep2 indep3
  /SAVE=PRED
  /CRITERIA=PIN(.05) POUT(.10) ITERATE(20) CUT(.5).
RENAME VARIABLES (PRE_1=propen).
SAVE OUTFILE="test.sav".

4. Select matching method
   1. Quintiles
   2. Nearest neighbors
   3. Calipers
ALL OF THE ABOVE CAN BE DONE EITHER WITH OR WITHOUT REPLACEMENT
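For the quintile option, one possible sketch in SPSS syntax is shown here. It assumes the propensity score saved in step 3 is named propen and the group variable is dep; propen_q is a hypothetical name for the new quintile variable.

```spss
* Assumption: propen holds the predicted probabilities saved in step 3.
RANK VARIABLES=propen (A)
  /NTILES(5) INTO propen_q
  /PRINT=NO
  /TIES=MEAN.
* Check how many cases from each group fall into each quintile.
CROSSTABS
  /TABLES=propen_q BY dep
  /CELLS=COUNT.
```

Cases from the two groups can then be compared within the same quintile. Nearest-neighbor and caliper matching need a separate matching program and are not covered by this sketch.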

5. Run matching program & test its effectiveness
6. Run your analysis using the matched data set
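One way to test the effectiveness of the matching is to repeat the group comparisons on the matched file and confirm that the covariate differences are no longer significant. This is only a sketch: matched.sav is a hypothetical file name, dep is assumed to be coded 0 and 1, indep1 is assumed to be categorical, and indep2 is assumed to be numeric.

```spss
* Hypothetical file name for the output of the matching program.
GET FILE="matched.sav".
* Categorical covariate by group, with a chi-square test.
CROSSTABS
  /TABLES=indep1 BY dep
  /STATISTICS=CHISQ
  /CELLS=COUNT COLUMN.
* Numeric covariate and the propensity score by group.
T-TEST GROUPS=dep(0 1)
  /VARIABLES=indep2 propen.
```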

An actual example: Do specialized hospital units save lives?

Our problem: We have cities with and without specialized care units (trauma center, burn unit). We want to see if the cities with specialized units have higher survival rates, controlling for other variables.

Creating Propensity Scores What variables are related to group? Example: Age group and gender were significantly related to city.

Preparing the data: Maximum likelihood solutions are large-sample methods. You may wish to combine or delete categories with small numbers of cases.

Consider dropping or combining categories... (this was done)
Frequency table for MECHANISM (columns: Frequency, Cumulative Percent; values not preserved in the transcript). Rows: Fall, GSW, MVC, Other Accidents, Shark attacks, HWB, Total.

Start SPSS and open example.sav: File > Open > Data
Note: This is real data with some changes made for confidentiality.

An appearance by Captain Obvious: Because propensity score matching essentially checks whether the difference between groups disappears once pre-existing differences are controlled, before you go to all of this trouble, test to see that the groups are, in fact, significantly different.

Syntax vs Pointy-clicky stuff

EDIT > OPTIONS > Viewer

Example: City study ANALYZE > Descriptive Statistics > Crosstabs > Statistics > Chi-square

Use crosstabs to test for difference on categorical variables

Move the desired variables to Rows and Columns, then click on Statistics. Note: You can put multiple variables under Rows.

Click on Chi-square. If desired, also select the phi coefficient.

SYNTAX
CROSSTABS
  /TABLES=OUTCOME Age_groups CategGCS BY City_of_injury
  /FORMAT=AVALUE TABLES
  /CELLS=COUNT
  /COUNT ROUND CELL.
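The CROSSTABS syntax on this slide does not request any test statistics. A sketch of what the syntax would look like with the chi-square and phi options from the Statistics dialog selected, using the same variables:

```spss
* Same tables as on the slide, with the chi-square test and phi coefficient requested.
CROSSTABS
  /TABLES=OUTCOME Age_groups CategGCS BY City_of_injury
  /FORMAT=AVALUE TABLES
  /STATISTICS=CHISQ PHI
  /CELLS=COUNT
  /COUNT ROUND CELL.
```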

Basic statistics to test covariates. Testing for differences on numeric variables: ANALYZE > COMPARE MEANS > INDEPENDENT SAMPLES T-TEST

Independent samples t-test

Age as test variable, City_of_injury as grouping variable
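The same test can be run from syntax. In this sketch the variable name Age and the group codes 1 and 2 for City_of_injury are assumptions; substitute the names and codes used in your own file.

```spss
* Assumption: City_of_injury is numeric, with the two cities coded 1 and 2.
T-TEST GROUPS=City_of_injury(1 2)
  /MISSING=ANALYSIS
  /VARIABLES=Age
  /CRITERIA=CI(.95).
```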

What differs between cities? Age in years and age group were not significantly different between cities. Gender, Trauma Type, Mechanism of Injury, Admission to ICU, GCS, ISS & RTS are all significantly different between cities.

What differs between outcomes? ICU_LOS, Trauma Type, Mechanism of Injury, Admission to ICU, GCS, ISS & RTS are all significantly different between outcomes.

What variables should be controlled? Example of City A vs B:
- Logistic regression with city as dependent and age group, trauma type & admission to ICU as independents.
- Logistic regression with city as dependent and Age Group, Gender, Trauma Type, Mechanism of Injury, Admission to ICU, GCS, ISS & RTS as independents.

Since running the logistic regression and creating propensity scores takes relatively little time, it is not much trouble to test more than one model.
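As an illustration, the smaller of the two models listed for City A vs B might be written as follows. Age_groups and TRAUMATYPE are the names used elsewhere in these slides; ICU_ADMIT is a hypothetical name for the ICU admission variable, assumed to be coded 0/1.

```spss
* ICU_ADMIT is a hypothetical variable name, not taken from the slides.
* No propensity scores are saved at this stage.
LOGISTIC REGRESSION VARIABLES City_of_injury
  /METHOD=ENTER Age_groups TRAUMATYPE ICU_ADMIT
  /CONTRAST (Age_groups)=Indicator
  /CONTRAST (TRAUMATYPE)=Indicator
  /CRITERIA=PIN(.05) POUT(.10) ITERATE(20) CUT(.5).
```

The larger model follows the same pattern with more variables after ENTER, plus one CONTRAST line for each categorical covariate.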

Logistic regression From SPSS menu select: ANALYZE > REGRESSION > BINARY LOGISTIC

Covariates: MECHANISM, TRAUMATYPE, RTS, ISS, CategGCS, ICULOS

Define categorical variables

Select Predicted Probabilities (not yet)

SYNTAX
LOGISTIC REGRESSION VARIABLES City_of_injury
  /METHOD=ENTER MECHANISM TRAUMATYPE RTS ISS CategGCS ICULOS
  /CONTRAST (MECHANISM)=Indicator
  /CONTRAST (CategGCS)=Indicator
  /CONTRAST (TRAUMATYPE)=Indicator
  /SAVE=PRED  /* Do not include this yet */
  /CRITERIA=PIN(.05) POUT(.10) ITERATE(20) CUT(.5).
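Once a model has been chosen, the pattern from step 3 can be applied to this example to save the propensity scores. This is a sketch under two assumptions: the saved variable receives the default name PRE_1 (the first set of predicted probabilities saved in the file), and example_propen.sav is a hypothetical output file name.

```spss
* Final run: same model, now saving the predicted probabilities.
LOGISTIC REGRESSION VARIABLES City_of_injury
  /METHOD=ENTER MECHANISM TRAUMATYPE RTS ISS CategGCS ICULOS
  /CONTRAST (MECHANISM)=Indicator
  /CONTRAST (CategGCS)=Indicator
  /CONTRAST (TRAUMATYPE)=Indicator
  /SAVE=PRED
  /CRITERIA=PIN(.05) POUT(.10) ITERATE(20) CUT(.5).
* Assumption: the saved variable is named PRE_1.
RENAME VARIABLES (PRE_1=propen).
* Hypothetical output file name.
SAVE OUTFILE="example_propen.sav".
```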