LOGISTIC REGRESSION
BINARY LOGISTIC REGRESSION (BLR)
Literature
- Hosmer, Lemeshow: Applied Logistic Regression (1989, 2000; 375 p.)
- Scott Long: Regression Models for Categorical and Limited Dependent Variables (1997; 296 p.)
- Fred C. Pampel: Logistic Regression: A Primer (2000; 85 p.)
- Agresti: Categorical Data Analysis (2002; 710 p.)
- In Czech: Řeháková, Nebojte se logistické regrese (Soc. časopis 4:475-492)
- Logistic regression in SPSS: Andy Field, Discovering Statistics Using SPSS for Windows: Advanced Techniques for the Beginner (2000, 2005, 2009); Norusis
Assumptions, variables
Dependent variable:
a) binary (binary logistic regression): today
b) ordinal (ordinal regression): later
c) nominal (polytomous logistic regression): later
Independent variables: all types
Related technique: discriminant analysis (DA makes more assumptions, e.g. normality of the independent variables)
Binary logistic regression: model
- Probability
- Odds
- Odds ratio
- Logit: natural logarithm of the odds
- Equation for binary logistic regression and back transformations
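The quantities on this slide can be sketched in a few lines of Python. This is a minimal illustration, assuming the usual definitions: odds = p / (1 - p), logit = ln(odds), model equation logit = b0 + b1*x1 + ... + bk*xk, and back transformation p = 1 / (1 + exp(-logit)). The function names are illustrative and not taken from SPSS.

```python
import math

def odds_from_probability(p):
    """Odds = p / (1 - p)."""
    return p / (1.0 - p)

def logit(p):
    """Logit = natural logarithm of the odds."""
    return math.log(odds_from_probability(p))

def probability_from_logit(z):
    """Back transformation from the logit scale to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def odds_ratio(p1, p0):
    """Ratio of the odds in two groups with probabilities p1 and p0."""
    return odds_from_probability(p1) / odds_from_probability(p0)

def predicted_probability(b0, coefs, xs):
    """Model equation: logit = b0 + sum(b_i * x_i), then back-transform."""
    z = b0 + sum(b * x for b, x in zip(coefs, xs))
    return probability_from_logit(z)
```

For example, a probability of 0.75 corresponds to odds of 3, and comparing it with a probability of 0.25 (odds 1/3) gives an odds ratio of 9.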
Equation, independent variables, interactions
- Binary or nominal independent variables: use dummy variables (SPSS creates them for us; the last category is the reference category)
- Interactions: see linear regression
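Dummy coding with the last category as the reference can be sketched as follows. This is only an illustration of the idea, not a reproduction of what SPSS does internally; the function name and the education categories in the usage note are hypothetical.

```python
def dummy_code(values, categories):
    """Indicator (dummy) coding with the LAST category as reference.

    Returns the list of categories that get their own dummy variable
    and one row of 0/1 indicators per input value. The reference
    category is coded as all zeros.
    """
    indicator_levels = categories[:-1]  # every category except the reference
    rows = []
    for value in values:
        if value not in categories:
            raise ValueError("unknown category: %r" % (value,))
        rows.append([1 if value == level else 0 for level in indicator_levels])
    return indicator_levels, rows
```

With the categories basic, secondary, university, a respondent with university education is coded (0, 0), so the coefficients of the two dummies compare basic and secondary education with university.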
Basic questions in BLR
- Does the model fit the data? (linear regression: F-test and R2; BLR: pseudo R2 and the likelihood-ratio test or the Hosmer-Lemeshow test)
- How important and statistically significant are the independent variables? (linear regression: t-tests and beta coefficients; BLR: Wald test and standardized coefficients)
- Do my data fulfill the assumptions? (linear regression: linear relationship between the independent and dependent variables; BLR: linear relationship between the independent variables and the logit)
Estimation
- OLS is not used
- Basic technique: maximum likelihood (ML; see also loglinear models, structural equation modelling, etc.)
- Iterative solution in several steps; impractical to solve without a computer
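To make the iterative nature of ML estimation concrete, here is a minimal sketch of Newton-Raphson iterations for a model with an intercept and one independent variable. It is an assumption that this resembles what statistical packages do; the actual SPSS algorithm and its convergence criteria may differ. The names max_iter and tol are illustrative.

```python
import math

def fit_logistic(x, y, max_iter=20, tol=1e-8):
    """Maximum likelihood fit of logit(p) = b0 + b1 * x by Newton-Raphson.

    Returns (b0, b1, number_of_iterations_used).
    """
    b0, b1 = 0.0, 0.0
    for iteration in range(1, max_iter + 1):
        g0 = g1 = 0.0           # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0   # information matrix (negative Hessian)
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        # Newton step: solve the 2x2 system (information) * step = gradient
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if max(abs(d0), abs(d1)) < tol:
            break
    return b0, b1, iteration
```

With a binary independent variable the result can be checked by hand: if 1 of 4 cases with x = 0 and 3 of 4 cases with x = 1 have y = 1, the estimates converge to b0 = ln(1/3) and b1 = 2 ln(3), the logit of 0.25 and the log odds ratio.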
Example
Internet use: WIP 2006
- Introduction to the data set, selection of variables, and exploration
BLR in SPSS
- Equation, Wald tests
- Menu: Analyze > Regression > Binary Logistic
Automated inclusion of variables
Direction: forward (1) or backward (2)
Criterion:
a) LR (likelihood ratio): based on the overall test
b) Wald: based on the partial test
c) conditional: a simpler version of LR
6 combinations (1 or 2 with a) to c))
Recommendation: use forward with LR or conditional
Menu SPSS
- Categorical: define nominal variables and the reference category
- Save: residuals and influence statistics
- Hosmer-Lemeshow test
- Classification plot
Syntax
LOGISTIC REGRESSION VAR=inet
  /METHOD=ENTER age edu
  /CLASSPLOT
  /PRINT=GOODFIT CI(95)
  /CRITERIA PIN(.05) POUT(.10) ITERATE(20) CUT(.5) .
Outputs
Estimates (Variables in the Equation)
Interpretation of the coefficients:
- B: change in the logit when the independent variable increases by one unit (continuous independent variable) or in comparison with the reference category (dummy or binary variables)
- exp(B): multiplicative change in the odds when the independent variable increases by one unit (if the scale is long, use more than one unit, e.g. 10, 100, 1000)
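The advice to use more than one unit on a long scale follows from the fact that a change of k units multiplies the odds by exp(k * B), which equals exp(B) raised to the power k. A minimal sketch, with an invented coefficient in the usage note:

```python
import math

def odds_multiplier(b, units=1.0):
    """Factor by which the odds change when the variable increases by `units`."""
    return math.exp(b * units)

def percent_change_in_odds(b, units=1.0):
    """The same change expressed as a percentage of the original odds."""
    return (odds_multiplier(b, units) - 1.0) * 100.0
```

For a hypothetical coefficient B = -0.05 for age in years, odds_multiplier(-0.05) describes the change per year and odds_multiplier(-0.05, 10) the change per ten years; the second value is the first raised to the tenth power, not ten times the first.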
Other outputs
- Wald test
- CI for exp(B): 95% confidence interval for exp(B)
- LR tests in the table of changes in goodness of fit:
  Model: comparison of our model with the model including only the intercept
  Block: changes between blocks
  Step: changes between steps, if more models are fitted
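These outputs can be recomputed by hand from a coefficient B, its standard error, and the log-likelihoods of two nested models. The sketch below assumes the common definitions: Wald = (B / SE)^2 with 1 degree of freedom for a single coefficient, a confidence interval computed on the logit scale and then exponentiated, and LR statistic = -2 (LL of the smaller model - LL of the larger model). The function names are illustrative.

```python
import math

def wald_test(b, se):
    """Wald statistic for a single coefficient and its p-value (chi-square, 1 df)."""
    wald = (b / se) ** 2
    # For 1 df: P(chi2 > wald) = P(|Z| > sqrt(wald)) = erfc(sqrt(wald / 2))
    p_value = math.erfc(math.sqrt(wald / 2.0))
    return wald, p_value

def exp_b_confidence_interval(b, se, z=1.959964):
    """Confidence interval for exp(B); the default z gives a 95% interval."""
    return math.exp(b - z * se), math.exp(b + z * se)

def likelihood_ratio_statistic(ll_null, ll_model):
    """LR chi-square; df = number of parameters added to the smaller model."""
    return -2.0 * (ll_null - ll_model)
```

An interval for exp(B) that contains 1 corresponds to a coefficient that is not significant at the matching level. The p-value formula above applies only to tests with 1 degree of freedom; a nominal variable with several dummies needs a chi-square distribution with more degrees of freedom.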
Pseudo R2
- Analogous to R2 (the coefficient of determination) in linear regression
- Cannot be interpreted as explained variance (the dependent variable is binary)
- Several formulae exist:
  Cox and Snell R2: range (0; 1), never reaches 1
  Nagelkerke R2: modified Cox and Snell, can reach 1
  McFadden R2: range (0; 1), never reaches 1
- The most frequently used is Nagelkerke R2
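The three measures can be computed from the log-likelihood of the intercept-only model (ll_null), the log-likelihood of the fitted model (ll_model), and the sample size n. The sketch below uses the standard textbook formulas; it is an assumption that they match the SPSS output exactly.

```python
import math

def pseudo_r2(ll_null, ll_model, n):
    """Cox and Snell, Nagelkerke, and McFadden pseudo R2."""
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    # Largest value Cox and Snell can take for these data (always below 1)
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    nagelkerke = cox_snell / max_cox_snell
    mcfadden = 1.0 - ll_model / ll_null
    return {"cox_snell": cox_snell, "nagelkerke": nagelkerke, "mcfadden": mcfadden}
```

Because Nagelkerke R2 divides Cox and Snell R2 by its maximum attainable value, it is always at least as large as Cox and Snell R2 for the same model.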
Logistic regression: outputs
- Classification table: percentage of correctly classified cases
- Histogram of estimated probabilities
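A classification table cross-tabulates the observed category with the category predicted from the estimated probability and a cut value. The sketch below assumes a cut value of 0.5 and that a probability equal to the cut is classified as 1; how SPSS treats ties is not stated on the slide.

```python
def classification_table(observed, probabilities, cut=0.5):
    """Counts keyed by (observed, predicted) and the percentage classified correctly."""
    table = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}
    for y, p in zip(observed, probabilities):
        predicted = 1 if p >= cut else 0  # assumption: ties go to category 1
        table[(y, predicted)] += 1
    correct = table[(0, 0)] + table[(1, 1)]
    percent_correct = 100.0 * correct / len(observed)
    return table, percent_correct
```

The percentage of correctly classified cases should be compared with the percentage obtained by always predicting the more frequent category, since a model can look accurate simply because one category dominates.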
Probability curve in BLR
How to prepare this curve? Tool in MS Excel
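As an alternative to a spreadsheet, the points of the curve can be generated directly from the estimated coefficients. This is a minimal sketch for a model with one independent variable; the coefficients in the usage note are invented for illustration.

```python
import math

def probability_curve(b0, b1, x_min, x_max, steps=50):
    """Return (x, estimated probability) pairs on an evenly spaced grid."""
    points = []
    for i in range(steps + 1):
        x = x_min + (x_max - x_min) * i / steps
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
        points.append((x, p))
    return points
```

For hypothetical coefficients b0 = -2 and b1 = 0.05, probability_curve(-2.0, 0.05, 0, 80) gives an S-shaped curve that crosses probability 0.5 at x = 40, the point where the logit equals zero. The pairs can be passed to any plotting tool.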
Recommendations for publishing
- Differentiate between logits and odds
- Keep in mind which categories are compared
- Report coefficients and standardized coefficients
- Report the LR test, Wald tests, Hosmer-Lemeshow test, and pseudo R2 (mostly Nagelkerke)
- It is good to publish the classification table, or at least the percentage of correctly classified cases
POLYTOMOUS LOGISTIC REGRESSION (PLR)
PLR
- Dependent variable: nominal
- More equations
- Each category is compared with the last category of the dependent variable
- Menu: Analyze > Regression > Multinomial Logistic
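With the last category as the reference, each of the other categories has its own equation for the logit of that category against the reference. The sketch below shows how such logits translate back into category probabilities; it assumes the standard multinomial logit form, and the function name is illustrative.

```python
import math

def multinomial_probabilities(linear_predictors):
    """Category probabilities from the logits of each category vs. the LAST one.

    `linear_predictors` holds one value per non-reference category.
    The returned list ends with the probability of the reference category.
    """
    exps = [math.exp(z) for z in linear_predictors]
    denominator = 1.0 + sum(exps)  # the reference category contributes exp(0) = 1
    probabilities = [e / denominator for e in exps]
    probabilities.append(1.0 / denominator)
    return probabilities
```

A dependent variable with three categories therefore needs two equations, and the probabilities of all three categories sum to 1.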
SPSS: options
- Categorical vs. covariate
- Statistics: classification table
- Model: independent variables and interactions
- Predicted category
- Options: transformations
Syntax
NOMREG election BY gender WITH rightor libaut astat intaln extaln anomie age edyrs
  /CRITERIA = CIN(95) DELTA(0) MXITER(100) MXSTEP(5) LCONVERGE(0) PCONVERGE(1.0E-6) SINGULAR(1.0E-8)
  /MODEL
  /INTERCEPT = INCLUDE
  /PRINT = CLASSTABLE FIT PARAMETER SUMMARY LRT .
ORDINAL REGRESSION
Assumptions and variables
- One equation with thresholds
- It is necessary to test whether the lines are parallel (if not, use PLR)
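"One equation with thresholds" can be illustrated with the cumulative logit model, in which a single coefficient is shared by all thresholds (this is the parallel lines assumption). The sketch assumes the parameterization logit P(Y <= j) = threshold_j - beta * x, which is how the logit link is usually written for this procedure; treat the sign convention as an assumption when comparing with other software.

```python
import math

def ordinal_probabilities(thresholds, beta, x):
    """Category probabilities for an ordinal outcome with one predictor.

    `thresholds` must be increasing; k thresholds define k + 1 categories.
    The same beta is used at every threshold (parallel lines).
    """
    cumulative = [1.0 / (1.0 + math.exp(-(t - beta * x))) for t in thresholds]
    probabilities = []
    previous = 0.0
    for c in cumulative:
        probabilities.append(c - previous)  # P(Y = j) = P(Y <= j) - P(Y <= j - 1)
        previous = c
    probabilities.append(1.0 - previous)    # highest category
    return probabilities
```

Under this parameterization a positive beta shifts probability towards the higher categories as x increases. If the data require a different beta at each threshold, the lines are not parallel and the model is not appropriate.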
Syntax
PLUM q30crec BY q36rec
  /CRITERIA = CIN(95) DELTA(0) LCONVERGE(0) MXITER(100) MXSTEP(5) PCONVERGE(1.0E-6) SINGULAR(1.0E-8)
  /LINK = LOGIT
  /PRINT = CELLINFO FIT PARAMETER SUMMARY TPARALLEL .
* PLUM: PoLytomous Universal Model (SPSS 10)
Ordinal regression: outputs
Estimates (Variables in the Equation)
Interpretation of the coefficients: change in the logit when the independent variable increases by 1 unit (continuous variables) or in comparison with the reference category (dummy or binary variables). Use exp(B), which gives the change in the odds (of a higher category in comparison with a lower one).
Pseudo R2
- Cox and Snell R2: range (0; 1), never reaches 1
- Nagelkerke R2: modified Cox and Snell, can reach 1
- McFadden R2: range (0; 1), never reaches 1
RELATED TECHNIQUES
Related techniques
- Discriminant analysis
- Loglinear analysis
- Logit analysis
- Classification and regression trees
Follow-up procedures: ROC curves