
1 Logistic Regression 2 Sociology 8811 Lecture 7 Copyright © 2007 by Evan Schofer Do not copy or distribute without permission

2 Logistic Regression We can convert a probability to odds. The "logit" is the natural log (ln) of the odds; natural log means base "e", not base 10. We can then model the logit as a function of independent variables (see the formulas below), just as we model Y (in OLS) or a probability (in the LPM).
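In symbols, the standard definitions the slide refers to:

$$\text{odds} = \frac{p}{1-p}, \qquad \text{logit}(p) = \ln\!\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 X_1 + \dots + \beta_K X_K$$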

3 Logistic Regression Note: We can solve for "p" and reformulate the model (shown below). Why model the logit rather than a probability? Because it is a useful non-linear transformation: it always generates predicted probabilities between 0 and 1, regardless of the values of the X variables. Note: the probit transformation has a similar effect.
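Solving the logit equation for p gives the standard logistic form:

$$p = \frac{e^{\beta_0 + \beta_1 X_1 + \dots + \beta_K X_K}}{1 + e^{\beta_0 + \beta_1 X_1 + \dots + \beta_K X_K}} = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X_1 + \dots + \beta_K X_K)}}$$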

4 Logistic Regression Benefits of logistic regression: You can now effectively model a probability as a function of X variables; you don't have to worry about violations of OLS assumptions; and predictions fall between 0 and 1. Downsides: You lose the "simple" interpretation of linear coefficients. In a linear model, the effect of each unit change in X on Y is constant; in a non-linear model, it isn't. Also, you can't compute some familiar statistics (e.g., R-square).

5 Logistic Regression Example Stata output for gun ownership:

. logistic gun male educ income south liberal, coef

Logistic regression                               Number of obs   =        850
                                                  LR chi2(5)      =      89.53
                                                  Prob > chi2     =     0.0000
Log likelihood = -502.7251                        Pseudo R2       =     0.0818

------------------------------------------------------------------------------
         gun |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        male |   .7837017    .156764     5.00   0.000     .4764499    1.090954
        educ |  -.0767763   .0254047    -3.02   0.003    -.1265686     -.026984
      income |   .2416647   .0493794     4.89   0.000     .1448828    .3384466
       south |   .7363169   .1979038     3.72   0.000     .3484327    1.124201
     liberal |  -.1641107   .0578167    -2.84   0.005    -.2774294    -.0507921
       _cons |   -2.28572   .6200443    -3.69   0.000    -3.500984    -1.070455
------------------------------------------------------------------------------

Note: Results aren't that different from the LPM. We're dealing with big effects and a large sample. But the predicted probabilities and standard errors will be better.

6 Interpreting Coefficients Raw coefficients (βs) show the effect of a 1-unit change in X on the log odds of Y=1. Positive coefficients make "Y=1" more likely; negative coefficients make it less likely. But effects are not linear: the effect of a unit change on p(Y=1) isn't the same for all values of X! Rather, the Xs have a linear effect on the "log odds." But it is hard to think in units of "log odds," so we need to do further calculations. NOTE: the log-odds interpretation doesn't work for probit!

7 Interpreting Coefficients The best way to interpret logit coefficients is to exponentiate them. This converts from "log odds" to simple "odds." Exponentiation is the opposite of the natural log; on a calculator, use the "e^x" or "inverse ln" function. Exponentiated coefficients are called odds ratios. An odds ratio of 3.0 indicates the odds are 3 times higher for each unit change in X; or, you can say the odds increase "by a factor of 3." An odds ratio of .5 indicates the odds decrease by half for each unit change in X. Odds ratios < 1 indicate negative effects.

8 Raw Coefs vs. Odds Ratios It is common to present results either way:

. logistic gun male educ income south liberal, coef

------------------------------------------------------------------------------
         gun |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        male |   .7837017    .156764     5.00   0.000     .4764499    1.090954
        educ |  -.0767763   .0254047    -3.02   0.003    -.1265686     -.026984
      income |   .2416647   .0493794     4.89   0.000     .1448828    .3384466
       south |   .7363169   .1979038     3.72   0.000     .3484327    1.124201
     liberal |  -.1641107   .0578167    -2.84   0.005    -.2774294    -.0507921
       _cons |   -2.28572   .6200443    -3.69   0.000    -3.500984    -1.070455
------------------------------------------------------------------------------

. logistic gun male educ income south liberal

------------------------------------------------------------------------------
         gun | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        male |   2.189562   .3432446     5.00   0.000     1.610347    2.977112
        educ |    .926097   .0235272    -3.02   0.003     .8811137    .9733768
      income |   1.273367   .0628781     4.89   0.000     1.155904    1.402767
       south |    2.08823   .4132686     3.72   0.000     1.416845    3.077757
     liberal |    .848648    .049066    -2.84   0.005     .7577291    .9504762
------------------------------------------------------------------------------

Can you see the relationship? Negative coefficients yield odds ratios below 1.0!

9 Interpreting Coefficients Example: Do you drink coffee? Y=1 indicates coffee drinkers; Y=0 indicates no coffee. Key independent variable: year in grad program. Observed "raw" coefficient: b = 0.67. A positive effect: each year increases the log odds by .67. But how big is it really? Exponentiate: e^.67 = 1.95. The odds increase multiplicatively by 1.95. If a person's initial odds were 2.0 (2:1), an extra year of school would result in 2.0 × 1.95 = 3.90. The odds nearly DOUBLE for each unit change in X, net of the other variables in the model.
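You can check this arithmetic with Stata's display command (a quick sketch using the hypothetical coffee-example numbers):

. display exp(.67)           // odds ratio: ~1.954
. display 2.0 * exp(.67)     // new odds if initial odds were 2:1: ~3.908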

10 Interpreting Coefficients Exponentiated coefficients ("odds ratios") operate multiplicatively; the effect on the odds is found by multiplying coefficients. An e^b of 1.0 means the variable has no effect (multiplying anything by 1.0 leaves it unchanged). An e^b > 1.0 means the variable has a positive effect on the odds of "Y=1"; an e^b < 1.0 means a negative effect. Hint: Papers may present results as "raw" coefficients or as odds ratios, and it is important to be aware of which you're looking at. If all numbers are positive, it is probably odds ratios!

11 Interpreting Coefficients To further aid interpretation, we can convert exponentiated coefficients to a % change in odds. Calculate: (exponentiated coef − 1) × 100%. Ex: (e^.67 − 1) × 100% = (1.95 − 1) × 100% = 95%. Interpretation: Every unit change in X (year of school) increases the odds of coffee drinking by 95%. What about a 2-point change in X? Is it 2 × 95%? No!!! You must multiply odds ratios: (1.95 × 1.95 − 1) × 100% = (3.80 − 1) × 100% = +280%. A 3-point change = (1.95 × 1.95 × 1.95 − 1) × 100%. In general, an N-point change = (OR^N − 1) × 100%, as sketched below.
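A sketch of the N-point calculation in Stata, again with the hypothetical b = .67 (the slide's 280% reflects rounding the odds ratio to 1.95):

. display (exp(.67)^2 - 1) * 100    // 2-point change: ~281.9%
. display (exp(.67)^3 - 1) * 100    // 3-point change: ~646%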

12 Interpreting Coefficients What is the effect of a 1-unit decrease in X? No, you can't just flip the sign: it isn't −95%. You must invert the odds ratio to see the opposite effect. An additional year in school: (1.95 − 1) × 100% = +95%. One year less: (1/1.95 − 1) × 100% = (.513 − 1) × 100% = −48.7%. What is the effect of two variables together? To combine odds ratios you must multiply. Ex: Have a mean advisor; b = 1.2; OR = e^1.2 = 3.32. Effect of 1 additional year AND a mean advisor: (1.95 × 3.32 − 1) × 100% = (6.47 − 1) × 100% = a 547% increase in the odds of coffee drinking.
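These, too, are easy to verify with display (hypothetical numbers; small differences from the slide reflect rounding):

. display (1/exp(.67) - 1) * 100              // one year less: ~ -48.8%
. display (exp(.67) * exp(1.2) - 1) * 100     // extra year + mean advisor: ~ +549%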

13 Interpreting Coefficients Gun ownership: Effect of education?

. logistic gun male educ income south liberal, coef

Logistic regression                               Number of obs   =        850
                                                  LR chi2(5)      =      89.53
                                                  Prob > chi2     =     0.0000
Log likelihood = -502.7251                        Pseudo R2       =     0.0818

------------------------------------------------------------------------------
         gun |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        male |   .7837017    .156764     5.00   0.000     .4764499    1.090954
        educ |  -.0767763   .0254047    -3.02   0.003    -.1265686     -.026984
      income |   .2416647   .0493794     4.89   0.000     .1448828    .3384466
       south |   .7363169   .1979038     3.72   0.000     .3484327    1.124201
     liberal |  -.1641107   .0578167    -2.84   0.005    -.2774294    -.0507921
       _cons |   -2.28572   .6200443    -3.69   0.000    -3.500984    -1.070455
------------------------------------------------------------------------------

Education: (e^−.0768 − 1) × 100% ≈ −7.4%, i.e., about 7.4% lower odds per year of education. Also: Male: (e^.784 − 1) × 100% ≈ +119% – more than double!

14 Interpreting Interactions Interactions work like linear regression:

. gen maleXincome = male * income
. logistic gun male educ income maleXincome south liberal, coef

Logistic regression                               Number of obs   =        850
                                                  LR chi2(6)      =      93.10
                                                  Prob > chi2     =     0.0000
Log likelihood = -500.93966                       Pseudo R2       =     0.0850

------------------------------------------------------------------------------
         gun |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        male |   2.914016   1.186788     2.46   0.014     .5879542    5.240078
        educ |  -.0783493   .0254356    -3.08   0.002    -.1282022    -.0284964
      income |   .3595354   .0879431     4.09   0.000     .1871701    .5319008
 maleXincome |  -.1873155   .1030033    -1.82   0.069    -.3891982     .0145672
       south |   .7293419   .1987554     3.67   0.000     .3397886    1.118895
     liberal |  -.1671854   .0579675    -2.88   0.004    -.2807996    -.0535711
       _cons |   -3.58824   1.030382    -3.48   0.000     -5.60775    -1.568729
------------------------------------------------------------------------------

The income coefficient for women is .359. For men it is .359 + (−.187) = .172; exp(.172) = 1.187. Combining odds ratios (by multiplying) gives identical results: exp(.359) × exp(−.187) = 1.432 × .829 = 1.187.
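To get the men's income effect and a confidence interval for it directly, one option is Stata's lincom post-estimation command after fitting the interaction model (a sketch, not from the original slides):

. logistic gun male educ income maleXincome south liberal
. lincom income + maleXincome, or    // combined income effect for men, as an odds ratio (~1.187)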

15 Predicted Probabilities To determine predicted probabilities, first compute the predicted logit value; then plug the logit value back into the formula for P, shown below.
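In symbols:

$$\hat{L} = \beta_0 + \beta_1 X_1 + \dots + \beta_K X_K, \qquad \hat{p} = \frac{e^{\hat{L}}}{1 + e^{\hat{L}}} = \frac{1}{1 + e^{-\hat{L}}}$$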

16 Predicted Probabilities: Own a gun? Predicted probability for a female PhD student: a highly educated, northern, liberal female.

------------------------------------------------------------------------------
         gun |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        male |   .7837017    .156764     5.00   0.000     .4764499    1.090954
        educ |  -.0767763   .0254047    -3.02   0.003    -.1265686     -.026984
      income |   .2416647   .0493794     4.89   0.000     .1448828    .3384466
       south |   .7363169   .1979038     3.72   0.000     .3484327    1.124201
     liberal |  -.1641107   .0578167    -2.84   0.005    -.2774294    -.0507921
       _cons |   -2.28572   .6200443    -3.69   0.000    -3.500984    -1.070455
------------------------------------------------------------------------------
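A sketch of the calculation, assuming the profile used later in the deck (male = 0, educ = 20, income = 4, south = 0, liberal = 7); invlogit() is Stata's built-in inverse-logit function:

. display invlogit(-2.28572 + 20*(-.0767763) + 4*(.2416647) + 7*(-.1641107))    // ~ .0179

This agrees with the "adjust" output for liberal = 7, male = 0 on slide 24 (.017927) and the P = .017 quoted on slide 19.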

17 The Logit Curve [Figure: the S-shaped logistic curve. The effect of the log odds on probability is nonlinear! From Knoke et al., p. 300]

18 Predicted Probabilities Important point: the substantive effect of a variable on the predicted probability differs depending on the values of the other variables. If the probability is already high (or low), variable changes may matter less. Suppose a 1-point change in X doubles the odds. The effect isn't substantively consequential if the probability that Y=1 is already very high. Ex: 20:1 odds = .952 probability; 40:1 odds = .976 probability – a change of only about .02. The effect matters a lot for cases with probabilities near .5: 1:1 odds = .5 probability; 2:1 odds = .67 probability – a change of nearly .2!
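The conversion from odds back to probability used in these examples:

$$p = \frac{\text{odds}}{1 + \text{odds}}: \qquad \frac{20}{21} \approx .952, \quad \frac{40}{41} \approx .976; \qquad \frac{1}{2} = .50, \quad \frac{2}{3} \approx .67$$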

19 Logit Example: Own a gun? The predicted probability of gun ownership for a female PhD student is very low: P = .017. Two additional years of education lowers the probability from .017 to .015 – not a big effect. An additional unit change can't have a big effect, because the probability can't go below zero. It would matter much more for a southern male.

20 Predicted Probabilities Predicted probabilities are a great way to make findings accessible to a reader. Often people make bar graphs of probabilities. 1. Show predicted probabilities for real cases. Ex: probability of civil war for Ghana vs. Sweden. 2. Show probabilities for "hypothetical" cases that exemplify key contrasts in your data. Ex: Guns: southern male vs. female PhD student. 3. Show how a change in a critical independent variable would affect the predicted probability. Ex: Guns: What would happen to a southern male who went and got a PhD?

21 Predicted Probabilities: Stata As with OLS regression, we can calculate predicted values for all cases:

. predict predprob, pr
(1488 missing values generated)

. list predprob gun if gun ~= .

     +----------------+
     | predprob   gun |
     |----------------|
  1. |  .486874     0 |
  2. | .6405225     1 |
  6. | .7078031     1 |
  9. | .6750654     1 |
 14. | .4243994     0 |
     |----------------|
 17. | .0617232     0 |
 19. | .6556235     1 |
 22. | .6356462     0 |
 27. | .3670604     0 |
 32. | .5620316     0 |

Many of the predictions are pretty good. But some aren't!

22 Predicted Probabilities: Stata The "adjust" command can produce predicted values for different groups in your data. It can also set variables at their means or at specific values. NOTE: "adjust" does not take into account weighted data. Example: probabilities for men and women:

. adjust, pr by(male)

------------------------------------------------------------------
 Dependent variable: gun     Command: logistic
 Variables left as is: educ, income, south, liberal
------------------------------------------------------------------

----------------------
     male |         pr
----------+-----------
        0 |    .225814
        1 |    .417045
----------------------

Note that the predicted probability for men is nearly twice as high as for women.

23 Stata Notes: Adjust Command The Stata "adjust" command can be tricky. 1. By default it uses the entire sample, not just the cases in your prior analysis. It is best to restrict it to the estimation sample: adjust if e(sample), pr by(male). 2. For non-specified variables, Stata uses group means (groups defined by the "by" option). Don't assume it pegs cases to the overall sample mean: variables "left as is" take on the mean for each subgroup. 3. It doesn't take into account weighted data; use "lincom" if you have weighted data.

24 Predicted Probabilities: Stata Effect of political views and gender for PhD students:

. adjust south=0 income=4 educ=20, pr by(liberal male)

------------------------------------------------------------
 Dependent variable: gun     Command: logistic
 Covariates set to value: south = 0, income = 4, educ = 20
------------------------------------------------------------

----------------------------
          |       male
  liberal |        0        1
----------+-----------------
        1 |  .046588  .096652
        2 |  .039818  .083241
        3 |  .033996  .071544
        4 |     .029   .06138
        5 |  .024719  .052578
        6 |  .021057  .044978
        7 |  .017927  .038433
----------------------------

Note that the independent variables are set to values of interest. (Or they can be set to their means.)

25 Graphing Predicted Probabilities [Figure: P(Y=1) for women and men, plotted across the liberal scale.] Stata command (with the predicted probabilities stored in variables named Women and Men): scatter Women Men Liberal, c(l l)

26 Did the model categorize cases correctly? We can choose a criterion – say, predicted P > .5:

. estat clas

              -------- True --------
Classified |         D            ~D  |      Total
-----------+--------------------------+-----------
     +     |        64            48  |        112
     -     |       229           509  |        738
-----------+--------------------------+-----------
   Total   |       293           557  |        850

Classified + if predicted Pr(D) >= .5
True D defined as gun != 0
--------------------------------------------------
Sensitivity                     Pr( +| D)   21.84%
Specificity                     Pr( -|~D)   91.38%
Positive predictive value       Pr( D| +)   57.14%
Negative predictive value       Pr(~D| -)   68.97%
--------------------------------------------------
False + rate for true ~D        Pr( +|~D)    8.62%
False - rate for true D         Pr( -| D)   78.16%
False + rate for classified +   Pr(~D| +)   42.86%
False - rate for classified -   Pr( D| -)   31.03%
--------------------------------------------------
Correctly classified                        67.41%
--------------------------------------------------

The model yields predicted p > .5 for 112 people; only 64 of them actually have guns. Overall, this simple model doesn't offer extremely accurate predictions: 67% of people are correctly classified. Note: results change if you use a different cutoff (e.g., p > .6).
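Where those summary figures come from, using the cells of the classification table:

$$\text{Sensitivity} = \frac{64}{293} = 21.84\%, \qquad \text{Specificity} = \frac{509}{557} = 91.38\%, \qquad \text{Correctly classified} = \frac{64 + 509}{850} = 67.41\%$$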

27 Sensitivity / Specificity of Prediction Sensitivity: Of gun owners, what proportion were correctly predicted to own a gun? Specificity: Of non-gun owners, what proportion did we correctly predict? Choosing a different probability cutoff changes those values. If we reduce the cutoff to P > .4, we'll catch a higher proportion of gun owners, but we'll incorrectly classify more non-gun owners as owners – more false positives.

28 Sensitivity / Specificity of Prediction Stata can produce a plot showing how predictions will change as we vary the "P" cutoff. Stata command: lsens

29 Hypothesis Tests Testing hypotheses using logistic regression. H0: There is no effect of year in grad program on coffee drinking. H1: Year in grad school is associated with coffee. (Or, a one-tailed test: year in school increases the probability of coffee.) MLE estimation yields standard errors, just as OLS does. Test statistic: 2 options; both yield the same results: z = b/SE (analogous to the t-test in OLS regression), or the Wald test (chi-square, 1 df), which is essentially the square of z. Reject H0 if z or the Wald statistic exceeds the critical value, or if the p-value is less than alpha (usually .05).
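For example, a Wald test for the male coefficient in the gun model, sketched with Stata's standard post-estimation test command:

. logistic gun male educ income south liberal
. test male        // Wald chi2(1); here z = 5.00, so chi2 = 5.00^2 = 25.0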

30 Model Fit: Likelihood Ratio Tests MLE computes a likelihood for the model; "better" models have higher likelihoods. The log likelihood is typically a negative value, so "better" means a less negative value: -100 > -1000. The log likelihood ratio test allows comparison of any two nested models: one model's variables must be a subset of the other's. You can't compare totally unrelated models! And the models must use the exact same sample.

31 Model Fit: Likelihood Ratio Tests The default LR test comparison: current model versus the "null model." The null model has only a constant – no covariates; K = 0. Also useful: compare a small and a large model. Do the added variables (as a group) fit the data better? Ex: Suppose a theory suggests 4 psychological variables will have an important effect. We could use an LR test to compare the "base model" to a model with the 4 additional variables. STATA: Run the first model; "store" the estimates; run the second model; then use the Stata command "lrtest" to compare the models, as sketched below.
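A sketch of that workflow (psych1-psych4 are hypothetical stand-ins for the 4 added variables):

. logistic gun male educ income south liberal
. estimates store base
. logistic gun male educ income south liberal psych1 psych2 psych3 psych4
. lrtest base        // LR chi-square test of the stored model vs. the current one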

32 Model Fit: Likelihood Ratio Tests The likelihood ratio test is based on the G-square statistic, which is chi-square distributed with df = K1 − K0, where K = # of variables (K1 = full model, K0 = simpler model) and L1 = likelihood for the full model, L0 = likelihood for the simpler model. A significant likelihood ratio test indicates that the larger model (L1) is an improvement: G2 > critical value, or p-value < .05.
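In symbols, using the log likelihoods:

$$G^2 = -2\ln\!\left(\frac{L_0}{L_1}\right) = 2\left(\ln L_1 - \ln L_0\right) \sim \chi^2_{K_1 - K_0}$$

For the gun model (log likelihoods from slides 33-34): $2\,\big({-502.7} - ({-547.5})\big) \approx 89.5$, matching the reported LR chi2(5) = 89.53.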

33 Model Fit: Likelihood Ratio Tests Stata's default LR test compares to the null model:

. logistic gun male educ income south liberal, coef

Logistic regression                               Number of obs   =        850
                                                  LR chi2(5)      =      89.53
                                                  Prob > chi2     =     0.0000
Log likelihood = -502.7251                        Pseudo R2       =     0.0818

------------------------------------------------------------------------------
         gun |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        male |   .7837017    .156764     5.00   0.000     .4764499    1.090954
        educ |  -.0767763   .0254047    -3.02   0.003    -.1265686     -.026984
      income |   .2416647   .0493794     4.89   0.000     .1448828    .3384466
       south |   .7363169   .1979038     3.72   0.000     .3484327    1.124201
     liberal |  -.1641107   .0578167    -2.84   0.005    -.2774294    -.0507921
       _cons |   -2.28572   .6200443    -3.69   0.000    -3.500984    -1.070455
------------------------------------------------------------------------------

LR chi2(5) is the G-square with 5 degrees of freedom. Prob > chi2 is the p-value; p < .05 indicates a significantly better model. The model log likelihood is -502.7; the null model's is lower (more negative).

34 Model Fit: Likelihood Ratio Tests Example: Null model log likelihood: -547.5; full model: -502.7. So G2 = 2 × (547.5 − 502.7) ≈ 89.5. There are 5 new variables, so K1 − K0 = 5. According to the chi-square table, the critical value is 11.07. Since 89.5 greatly exceeds 11.07, we are confident that the full model is an improvement. Also, the observed p-value in the STATA output is .000!

35 Model Fit: Pseudo R-Square The pseudo R-square is "a descriptive measure that indicates roughly the proportion of observed variation accounted for by the… predictors." (Knoke et al., p. 313)

Logistic regression                               Number of obs   =        850
                                                  LR chi2(5)      =      89.53
                                                  Prob > chi2     =     0.0000
Log likelihood = -502.7251                        Pseudo R2       =     0.0818

------------------------------------------------------------------------------
         gun | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        male |   2.189562   .3432446     5.00   0.000     1.610347    2.977112
        educ |    .926097   .0235272    -3.02   0.003     .8811137    .9733768
      income |   1.273367   .0628781     4.89   0.000     1.155904    1.402767
       south |    2.08823   .4132686     3.72   0.000     1.416845    3.077757
     liberal |    .848648    .049066    -2.84   0.005     .7577291    .9504762
------------------------------------------------------------------------------

The model explains roughly 8% of the variation in Y.

36 Assumptions & Problems Assumption: Independent random sample. Serial correlation or clustering violates this assumption, biasing SE estimates and hypothesis tests. We will discuss possible remedies in the future. Multicollinearity: High correlation among independent variables causes problems – unstable, inefficient estimates. Watch for coefficient instability, and check VIF/tolerance. Remove unneeded variables or create indexes of related variables.

37 Assumptions & Problems Outliers/influential cases: Unusual or extreme cases can distort results, just as in OLS. But logistic regression requires different influence statistics. Example: dbeta – very similar to Cook's D in OLS. Outlier diagnostics are available in STATA; after the model, run: predict outliervar, dbeta. Lists and graphs of residuals and dbetas can identify influential cases.

38 Assumptions & Problems Insufficient variance: You need cases for both values of the dependent variable. Extremely rare (or common) events can be a problem: suppose N = 1000, but only 3 cases are coded Y=1 – the estimates won't be great. Also: maximum likelihood estimates cannot be computed if any independent variable perfectly predicts the outcome (Y=1). Ex: Suppose Soc 8811 drives all students to drink coffee, so there is no variation. In that case, you cannot include a dummy variable for taking Soc 8811 in the model.

39 Assumptions & Problems Model specification / Omitted variable bias Just like any regression model, it is critical to include appropriate variables in the model Omission of important factors or ‘controls’ will lead to misleading results.

40 Stata Notes: Logistic Regression Stata has two commands: "logit" & "logistic". Logit, by default, produces raw coefficients; logistic, by default, produces odds ratios (it exponentiates all coefficients for you!). Note: Both yield identical results. The following pairs of commands are equivalent.
For raw coefficients:
. logit gun male educ income south liberal
. logistic gun male educ income south liberal, coef
And for odds ratios (logit's odds-ratio option is "or"):
. logit gun male educ income south liberal, or
. logistic gun male educ income south liberal

41 Real World Example: Coups Issue: Many countries face the threat of a coup d'etat – a violent overthrow of the regime. What factors affect whether a country will have a coup? Paper handout: Belkin and Schofer (2005). What are the basic findings? How much do the odds of a coup differ for military regimes vs. civilian governments? b = 1.74; (e^1.74 − 1) × 100% = +470%. What about a 2-point increase in log GDP? b = −.233; (e^−.233 × e^−.233 − 1) × 100% = −37%.
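A quick check of both calculations with display (a sketch; results rounded):

. display (exp(1.74) - 1) * 100          // military regime: ~ +469.7%
. display (exp(2 * -.233) - 1) * 100     // 2-point log GDP increase: ~ -37.2%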

