Published by Amos Warren. Modified over 9 years ago.
Session 10
Applied Regression -- Prof. Juran

Outline
Binary Logistic Regression
Why?
– Theoretical and practical difficulties in using regular regression (which assumes a continuous dependent variable)
How?
– Minitab procedure
– Interpreting results
– Some diagnostics
– Making predictions
– Comparison with regular regression model
Logistic Regression
In our previous discussions of regression analysis, we have implicitly assumed that the dependent variable is continuous. We have learned some methods for operationalizing binary independent variables (using dummy variables), but have not discussed any method for handling categorical or binary dependent variables with regression analysis. (One non-regression method is discriminant analysis.) There are a number of tools available, but we will focus here on logistic regression.
The basic idea: instead of predicting the exact value of the (binary) dependent variable, we will try to model the probability that the dependent variable takes on the value of 1. In English, P(1) is the probability that the dependent variable is 1, given a particular vector of values for the independent variables.
Example: Rick Beck Consumer Credit
Why not a normal multiple regression model?
Here the fitted value of the ordinary regression plays the role of an estimated probability, so it shouldn’t go outside the range from zero to one. But our regression equation is unbounded, and in this data set the fitted value sometimes takes on illogical estimated values.
We address this problem with a logistic response function:
P(1) = exp(Y')/(1 + exp(Y')), where Y' is a linear function of the independent variables.
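To see why a logistic response function keeps estimates inside the range from zero to one, here is a minimal Python sketch. The function name `logistic_response` and the sample values of the linear predictor are illustrative assumptions, not part of the original slides.

```python
import math


def logistic_response(y_prime: float) -> float:
    """Map a linear predictor Y' to a probability exp(Y') / (1 + exp(Y'))."""
    # Algebraically equivalent forms, chosen to avoid overflow for large |Y'|.
    if y_prime >= 0:
        return 1.0 / (1.0 + math.exp(-y_prime))
    e = math.exp(y_prime)
    return e / (1.0 + e)


# Illustrative values of the linear predictor (not taken from the data set).
for y_prime in (-10.0, -2.0, 0.0, 2.0, 10.0):
    print(f"Y' = {y_prime:6.1f}  ->  P(1) = {logistic_response(y_prime):.4f}")
```

However large or small the linear predictor becomes, the result stays strictly between 0 and 1, which an unbounded linear regression cannot guarantee.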
This sort of relationship will meet our criterion of keeping the estimated probability in the proper range. (Note: the cumulative normal distribution has a similar shape, and is the basis for the probit model.) What we need is a transformation of either X or the probability such that the relationship is linear. This would enable us to use linear regression to create a model.
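One transformation with this property is the log of the odds (the logit). The short derivation below is a sketch under assumed notation: p stands for P(1), Y' is the linear predictor as in the Minitab output, and the symbols \beta_0, ..., \beta_k are introduced here for illustration.

```latex
\begin{align*}
p &= \frac{e^{Y'}}{1 + e^{Y'}}, \qquad Y' = \beta_0 + \beta_1 X_1 + \cdots + \beta_k X_k \\
1 - p &= \frac{1}{1 + e^{Y'}} \\
\frac{p}{1 - p} &= e^{Y'} \\
\ln\!\left(\frac{p}{1 - p}\right) &= Y' = \beta_0 + \beta_1 X_1 + \cdots + \beta_k X_k
\end{align*}
```

The log odds are linear in the independent variables, which is why each coefficient is read as a change in the log of P(1)/P(0).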
Minitab Results

Response Information
Here we get the number of observations that fall into each of the two response categories. The response value that has been designated as the “reference event” is the first entry under Value and is labeled as the event. In this case, the reference event is “being in default”.

Variable  Value  Count
Default   1        153  (Event)
          0        847
          Total   1000
Deviance Table

Source       DF  Adj Dev  Adj Mean  Chi-Square  P-Value
Regression    5  283.811   56.7621      283.81    0.000
  Single      1   13.113   13.1125       13.11    0.000
  Credit D    1   60.523   60.5230       60.52    0.000
  Credit E    1   84.985   84.9850       84.98    0.000
  Children    1    9.932    9.9316        9.93    0.002
  Debt        1   39.674   39.6744       39.67    0.000
Error       994  571.945    0.5754
Total       999  855.756

The Regression row is similar to the F test for all slopes. The rows for the individual terms are similar to the t tests for individual slopes.
Smaller values of the Akaike Information Criterion (AIC) indicate a better fit.

Deviance R-Sq  R-Sq(adj)     AIC
       33.16%     32.58%  583.95

Coefficients (the regression model)

Term           Coef   SE Coef   VIF
Constant     -1.139     0.337
Single        0.970     0.272  1.56
Credit D      2.023     0.263  1.18
Credit E      3.038     0.348  1.24
Children     -0.849     0.271  1.57
Debt      -0.000019  0.000004  1.07
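The summary statistics can be related back to the deviance table with a little arithmetic. The sketch below assumes that Deviance R-Sq is one minus the ratio of error deviance to total deviance, and that for 0/1 response data the error deviance equals minus twice the log-likelihood, so that AIC is the error deviance plus twice the number of estimated parameters. These formulas are assumptions made for illustration; the slides do not state them.

```python
# Values copied from the Minitab deviance table.
error_deviance = 571.945  # "Error" row, Adj Dev
total_deviance = 855.756  # "Total" row, Adj Dev
n_parameters = 6          # constant plus five slopes

# Assumed definition: share of the total deviance explained by the model.
deviance_r_sq = 1.0 - error_deviance / total_deviance

# Assumed definition: error deviance stands in for -2 * log-likelihood.
aic = error_deviance + 2 * n_parameters

print(f"Deviance R-Sq = {deviance_r_sq:.4f}")
print(f"AIC           = {aic:.3f}")
```

Under these assumptions the results agree, up to rounding, with the 33.16% and 583.95 that Minitab reports.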
The coefficient of 0.970 for Single represents the estimated change in the log of P(default)/P(not default) when the subject is single compared to when he/she is not single, with the other independent variables held constant.
The coefficient of -0.000019 for Debt is the estimated change in the log of P(default)/P(not default) for a $1 increase in Debt, which amounts to -0.019 for a $1000 increase, with the other independent variables held constant.
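Because the coefficients are changes in the log odds, exponentiating them gives multiplicative changes in the odds of default. The following sketch applies that step to the reported coefficients; the dictionary names are illustrative, and the Debt coefficient is rescaled to a $1000 increase as in the interpretation above.

```python
import math

# Coefficients copied from the Minitab output (log-odds scale).
coefficients = {
    "Single": 0.970,
    "Credit D": 2.023,
    "Credit E": 3.038,
    "Children": -0.849,
    "Debt (per $1000)": -0.000019 * 1000,
}

# exp(coefficient) is the factor by which the odds P(default)/P(not default)
# are multiplied when the variable increases by one unit, all else held constant.
odds_ratios = {term: math.exp(coef) for term, coef in coefficients.items()}

for term, ratio in odds_ratios.items():
    print(f"{term:18s} odds ratio = {ratio:.3f}")
```

A ratio above 1 means higher odds of default and a ratio below 1 means lower odds, which matches the signs of the coefficients.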
Regression Equation
P(1) = exp(Y')/(1 + exp(Y'))
Y' = -1.139 + 0.970 Single + 2.023 Credit D + 3.038 Credit E - 0.849 Children - 0.000019 Debt

Goodness-of-Fit Tests

Test              DF  Chi-Square  P-Value
Deviance         994      571.95    1.000
Pearson          994      642.32    1.000
Hosmer-Lemeshow    8       29.76    0.000
Fits and Diagnostics for Unusual Observations

      Observed
Obs  Probability     Fit    Resid  Std Resid
  6       1.0000  0.4641   1.2391       1.25  X
 39       1.0000  0.4372   1.2864       1.30  X
 58       1.0000  0.4671   1.2338       1.25  X
 62       1.0000  0.0872   2.2087       2.21  R
 66       1.0000  0.6670   0.9000       0.91  X
 85       1.0000  0.4510   1.2619       1.28  X
 90       0.0000  0.6372  -1.4240      -1.44  X
115       0.0000  0.5637  -1.2879      -1.30  X
123       1.0000  0.6899   0.8616       0.88  X
136       1.0000  0.1037   2.1288       2.14  R

In Minitab output, R flags an observation with a large residual and X flags an observation with unusual values of the independent variables.
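The Resid column can be reproduced from the Observed Probability and Fit columns if we assume it holds deviance residuals. That assumption is not stated on the slide, but it agrees with the listed values. A minimal sketch, with `deviance_residual` as an illustrative name:

```python
import math


def deviance_residual(observed: int, fit: float) -> float:
    """Deviance residual for one 0/1 observation with fitted probability `fit`."""
    if observed == 1:
        # Event occurred: positive residual, larger when the fit was low.
        return math.sqrt(-2.0 * math.log(fit))
    # Event did not occur: negative residual, larger when the fit was high.
    return -math.sqrt(-2.0 * math.log(1.0 - fit))


# (Obs, observed value, Fit, reported Resid) copied from the Minitab table.
rows = [
    (6, 1, 0.4641, 1.2391),
    (62, 1, 0.0872, 2.2087),
    (90, 0, 0.6372, -1.4240),
]

for obs, observed, fit, reported in rows:
    computed = deviance_residual(observed, fit)
    print(f"Obs {obs:3d}: computed {computed:8.4f}, reported {reported:8.4f}")
```

Observation 62 shows why a case earns the R flag: the subject defaulted even though the fitted probability of default was under 9%.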
Making Predictions
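A prediction follows directly from the regression equation reported by Minitab: compute Y' from the coefficients, then convert it to a probability. The sketch below assumes that Single, Credit D, Credit E, and Children are coded as 0/1 indicators and that Debt is in dollars. The slides do not spell out the coding, and the applicant shown is hypothetical rather than taken from the data set.

```python
import math

# Coefficients copied from the Minitab regression equation.
COEF = {
    "Constant": -1.139,
    "Single": 0.970,
    "Credit D": 2.023,
    "Credit E": 3.038,
    "Children": -0.849,
    "Debt": -0.000019,
}


def predict_default_probability(single: int, credit_d: int, credit_e: int,
                                children: int, debt: float) -> float:
    """Return P(1) = exp(Y') / (1 + exp(Y')) for one applicant."""
    y_prime = (COEF["Constant"]
               + COEF["Single"] * single
               + COEF["Credit D"] * credit_d
               + COEF["Credit E"] * credit_e
               + COEF["Children"] * children
               + COEF["Debt"] * debt)
    return math.exp(y_prime) / (1.0 + math.exp(y_prime))


# Hypothetical applicant: single, neither credit indicator set,
# no children, $10,000 of debt.
p = predict_default_probability(single=1, credit_d=0, credit_e=0,
                                children=0, debt=10_000)
print(f"Estimated probability of default: {p:.4f}")
```

Unlike a fitted value from an ordinary regression, the result is always a valid probability, whatever values the independent variables take.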
Summary
Binary Logistic Regression
Why?
– Theoretical and practical difficulties in using regular regression (which assumes a continuous dependent variable)
How?
– Minitab procedure
– Interpreting results
– Some diagnostics
– Making predictions
– Comparison with regular regression model
For Sessions 11 and 12: Student presentations