Multiple Logistic Regression STAT E-150 Statistical Methods.


1 Multiple Logistic Regression STAT E-150 Statistical Methods

2 Multiple Logistic Regression is used when there are several predictors and the response variable is binary. As we did previously, we can use an indicator variable to represent the response variable, with 1 to represent the presence of some condition ("success" or "yes") and 0 to represent the absence of the condition ("failure" or "no"). The logistic regression model describes how the probability of "success" is related to the values of the explanatory variables, which can be categorical or quantitative.

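3 The Multiple Logistic Regression Model

Logit form: $\log\left(\dfrac{\pi}{1-\pi}\right) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k$

Probability form: $\pi = \dfrac{e^{\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k}}{1 + e^{\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k}}$

Here $\pi$ is the probability of "success" and $X_1, \ldots, X_k$ are the predictors.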
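The logit form and the probability form are two views of the same model: applying the inverse of the logit to the linear combination of predictors gives the probability of "success". A minimal Python sketch of that relationship follows; the function names are illustrative and do not come from the slides.

```python
import math

def logit(p):
    """Log-odds of a probability p, for 0 < p < 1."""
    return math.log(p / (1.0 - p))

def inv_logit(eta):
    """Probability that corresponds to the log-odds value eta."""
    return 1.0 / (1.0 + math.exp(-eta))

def predict_probability(intercept, coefs, xs):
    """Probability form: the inverse logit of b0 + b1*x1 + ... + bk*xk."""
    eta = intercept + sum(b * x for b, x in zip(coefs, xs))
    return inv_logit(eta)
```

Because `inv_logit` undoes `logit`, a coefficient can be read on either scale: as an additive change in the log-odds, or (after exponentiating) as a multiplicative change in the odds.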

4 Conditions: Linearity: Check for a linear relationship between each predictor and the logit-transformed response variable, using logit plots. Probability Model: The response values must be random and independent. Think carefully about how the data were produced.

5 As with other regression methods, if the conditions are satisfied, we can test hypotheses and construct confidence intervals, and use the results to describe relationships and make predictions. There are advantages to this analysis. First, there are no assumptions about the distributions of the predictors; they do not have to be normally distributed, linearly related, or have equal variances within each group. In addition, there are no restrictions on the type of predictors.

6 Let's return to the example we discussed earlier: Suppose that the sales director of appliance stores wants to find out which factors encourage customers to purchase extended warranties after a major appliance purchase. The response variable indicates whether a warranty is purchased. The predictor variables are
- Customer gender
- Age of the customer
- Whether a gift is offered with the warranty
- Price of the appliance
- Race of the customer (this is coded with four indicator variables to represent White, African-American, Hispanic, and Other)

7 Let's start with the full model, using all predictors:

Variables in the Equation (Step 1 a)

Predictor               B     S.E.    Wald   df   Sig.    Exp(B)
Gender             -3.772    2.568   2.158    1   .142      .023
Gift                2.715    1.567   3.003    1   .083    15.112
Age                  .091     .056   2.638    1   .104     1.096
Price                .001     .000   3.363    1   .067     1.001
White               3.773   13.863    .074    1   .785    43.518
AfricanAmerican     1.163   13.739    .007    1   .933     3.199
Hispanic            6.347   14.070    .203    1   .652   570.898
Constant          -12.018   14.921    .649    1   .421      .000

a. Variable(s) entered on step 1: Gender, Gift, Age, Price, White, AfricanAmerican, Hispanic.

The significance of each predictor is measured using the Wald statistic. (Note that SPSS reports a Wald chi-square and not the Wald z that you may see elsewhere; the magnitude of z is just the square root of the chi-square value, and z takes the sign of the corresponding coefficient estimate, β_i.)
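The relationship between the Wald chi-square, the Wald z, and the Sig. column can be checked by hand. The sketch below assumes the usual definition of the Wald chi-square as (B / S.E.) squared with 1 degree of freedom; the function names are illustrative. Small differences from the SPSS output are expected, because the printed B and S.E. values are rounded.

```python
import math

def wald_chi_square(b, se):
    """Wald chi-square for one coefficient: (estimate / standard error) squared."""
    return (b / se) ** 2

def wald_z(b, chi_square):
    """Wald z: square root of the chi-square, carrying the sign of the coefficient."""
    return math.copysign(math.sqrt(chi_square), b)

def p_value_df1(chi_square):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi_square / 2.0))

# Gift row of the full model: B = 2.715, S.E. = 1.567
chi2_gift = wald_chi_square(2.715, 1.567)   # close to the reported 3.003
z_gift = wald_z(2.715, chi2_gift)           # positive, like the coefficient
p_gift = p_value_df1(3.003)                 # close to the reported Sig. of .083
```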

8 Which predictors in the full model are significant at the .10 level of significance? Gender (p = .142) and the race variables White (p = .785), AfricanAmerican (p = .933), and Hispanic (p = .652) are not significant. Age is marginal (p = .104), and we'll leave it in for now.
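The screening step can be written out explicitly. This is only a sketch: the Sig. values are copied from the full-model output, and the helper name `screen` is made up for illustration. Note that a strict cutoff of .10 drops Age, so keeping Age in the reduced model is a judgment call rather than a result of the rule.

```python
# Sig. values copied from the full-model "Variables in the Equation" output.
p_values = {
    "Gender": 0.142, "Gift": 0.083, "Age": 0.104, "Price": 0.067,
    "White": 0.785, "AfricanAmerican": 0.933, "Hispanic": 0.652,
}

def screen(p_values, alpha=0.10):
    """Return the predictors whose Wald p-value is below alpha, in input order."""
    return [name for name, p in p_values.items() if p < alpha]

print(screen(p_values))  # ['Gift', 'Price']
```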

9 We can now repeat the analysis using only the significant predictors. Which predictors are significant in this reduced model?

10 We can now repeat the analysis using only the significant predictors. Which predictors are significant in this reduced model? The reduced model indicates that all variables are significant, even at the .05 level of significance.

11 As we have discussed, the output contains seemingly contradictory values: the coefficient for Price is too small to show in three decimal places and its odds ratio is displayed as 1, yet the test shows that Price is a significant predictor. The cause is the unit of measurement: a one-unit change in price has only a tiny effect on the odds. We can get more information by dividing the values of Price by 100, so that one unit of the new variable, Price100, represents 100 units of price.

12 Here are the results for the new model:

Variables in the Equation (Step 1 a)

Predictor        B     S.E.    Wald   df   Sig.   Exp(B)
Gift         2.339    1.131   4.273    1   .039   10.368
Age           .064     .032   4.132    1   .042    1.066
Price100      .040     .016   6.165    1   .013    1.041
Constant    -6.096    2.142   8.096    1   .004     .002

a. Variable(s) entered on step 1: Gift, Age, Price100.

Note that the odds ratio for Price100 is now 1.041 and the new coefficient is .040. All other values are unchanged.
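A short sketch shows why the rescaling helps and how the fitted model produces a prediction. The coefficients are copied from the Price100 output; the customer in the example (gift offered, age 40, price 1000) is hypothetical and chosen only for illustration, as are the function names.

```python
import math

# Coefficients copied from the Price100 output.
INTERCEPT = -6.096
B_GIFT, B_AGE, B_PRICE100 = 2.339, 0.064, 0.040

def odds_ratio(b):
    """Exp(B): multiplicative change in the odds per one-unit change in a predictor."""
    return math.exp(b)

# Per 100 units of price the odds ratio is visible in three decimals.
or_per_100 = odds_ratio(B_PRICE100)
# Per single unit of price the coefficient is .0004, so Exp(B) rounds to 1.000.
or_per_unit = odds_ratio(B_PRICE100 / 100.0)

def predicted_probability(gift, age, price):
    """Probability of a warranty purchase; price is in original units."""
    eta = INTERCEPT + B_GIFT * gift + B_AGE * age + B_PRICE100 * (price / 100.0)
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical customer: gift offered (1), age 40, appliance price 1000.
p_example = predicted_probability(gift=1, age=40, price=1000)
```

Rescaling changes only the unit in which the Price effect is reported. The fitted probabilities are the same whether the model uses Price or Price100.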

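13 How do we assess this model? Use the -2(log likelihood) value, abbreviated -2LL; lower values indicate a better fit. When you are comparing nested models, consider the difference in the -2LL values to see if the difference is significant. This difference is approximately chi-square distributed, with degrees of freedom equal to the number of predictors removed from the larger model.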
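The comparison of two nested models by their -2LL values can be sketched as follows. The -2LL numbers in the example are hypothetical, because the transcript does not preserve the SPSS values, and the sketch handles only the case of a single removed predictor (1 degree of freedom). The function names are illustrative.

```python
import math

def drop_in_deviance(neg2ll_reduced, neg2ll_full):
    """Test statistic: how much -2LL rises when predictors are removed."""
    return neg2ll_reduced - neg2ll_full

def p_value_one_predictor(statistic):
    """Upper-tail chi-square probability with 1 df (one predictor removed)."""
    return math.erfc(math.sqrt(statistic / 2.0))

# Hypothetical -2LL values, for illustration only.
neg2ll_with_age = 25.0
neg2ll_without_age = 30.0

g = drop_in_deviance(neg2ll_without_age, neg2ll_with_age)
p = p_value_one_predictor(g)
# A small p-value says the removed predictor improved the fit significantly,
# so the larger model would be preferred.
```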

14 What if we remove the Age variable? (The current model, shown on slide 12, includes Gift, Age, and Price100.)

15 What if we remove the Age variable?

16 With the Age variable: Without the Age variable: We can conclude that the larger model is better.

17 We can also compare the results of the Hosmer-Lemeshow test for goodness of fit; it assesses the model as a whole, not the individual model parameters. With the Age variable: Without the Age variable: Smaller p-values indicate a lack of fit for the model. Which model appears to be better based on this value?

18 We can also compare the results of the Hosmer-Lemeshow test for goodness of fit; it assesses the model as a whole, not the individual model parameters. With the Age variable: Without the Age variable: Again we can conclude that the larger model is a better fit.
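For readers who want to see what the Hosmer-Lemeshow test computes, here is a minimal sketch. It assumes the common textbook version: cases are sorted by fitted probability, split into equal-sized groups, and the observed and expected numbers of successes are compared with a chi-square statistic on (groups - 2) degrees of freedom. SPSS may form its groups differently, for example when fitted probabilities are tied, so the numbers need not match its output exactly. The function names are illustrative.

```python
import math

def chi_square_sf_even(x, df):
    """Upper-tail chi-square probability; this closed form holds for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("this sketch supports positive even df only")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

def hosmer_lemeshow(y, p, groups=10):
    """Return (statistic, df, p_value) for 0/1 outcomes y and fitted probabilities p."""
    pairs = sorted(zip(p, y))  # order the cases by fitted probability
    n = len(pairs)
    if n < groups:
        raise ValueError("need at least one case per group")
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        n_g = len(chunk)
        observed = sum(yi for _, yi in chunk)   # successes seen in the group
        expected = sum(pi for pi, _ in chunk)   # successes the model predicts
        mean_p = expected / n_g
        stat += (observed - expected) ** 2 / (n_g * mean_p * (1.0 - mean_p))
    df = groups - 2
    return stat, df, chi_square_sf_even(stat, df)
```

A large p-value means the observed counts are close to what the model predicts, which is why smaller p-values are read as evidence of lack of fit.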

