Logistic Regression Sociology 229: Advanced Regression

Logistic Regression Sociology 229: Advanced Regression Copyright © 2010 by Evan Schofer Do not copy or distribute without permission

Announcements None

Agenda Today's class: Introductions Go over syllabus Review topic: Logistic regression Not required – only for those who want to stay… Next week: Multinomial logistic regression

Introduction Goal of this course: expand your methodological “toolbox” Regression is extremely robust and versatile... BUT: often we have data that violates assumptions of regression models… Such as a dichotomous dependent variable OR: we wish to do a kind of analysis beyond what can be done with ordinary regression models Ex: quantile regression So, we need to develop a set of additional tools…

Introduction Main course topics Multinomial logistic regression Count models Event history / survival analysis Multilevel models & panel models & some additional stuff squeezed in… Issue: There is always a trade-off between depth and coverage The course covers a lot of topics briefly Advantage: exposes you to lots of useful things Disadvantage: We don’t have nearly enough time to cover material thoroughly…

Review Syllabus Main points: All readings are available online Complete readings prior to class in the week they are assigned Grades are based on several short assignments Plus, a small "participation" component No big paper at the end NOTE: This class has some overlap with my Event History Analysis class I've come up with some (optional) alternative material for those who took my earlier class.

Introductions This is a small class… let’s introduce ourselves Also: It is helpful to get to know your classmates… for when you are stuck on the homework…

Review: Types of Variables Continuous variable = can be measured with infinite precision Age: we may round off, but great precision is possible Discrete variable = can only take on a specific set of values Typically: Positive integers or a small set of categories Ex: # children living in a household; Race; gender Note: Dichotomous = discrete with 2 categories.

Review: Types of Variables And, don’t forget about measurement scales: Nominal: Categories that can’t be ordered Note: Also called “categorical” variables Ex: Religion; race; geographic state of residence Ordinal: Orderable categories Ex: Social class; College “rankings”; Most attitudinal measures (Do you approve of… on a 1-5 scale) Interval/Continuous: Ordered, with consistent differences across units Ex: Age; Cholesterol level; Income (in dollars).

Review: OLS Regression Question: What kinds of variables can be analyzed with OLS regression? Basic correlation and regression was designed for 2 interval/ratio variables Does fat consumption correlate with cholesterol level? Also: It is easy to incorporate nominal/categorical independent variables Strategy: Use dummy variables in regression Ex: Is gender associated with cholesterol level? Also: OLS is “robust” and works reasonably well with many ordinal measures (ideally 5+ categories) Ex: Are environmental attitudes associated with approval of the president?

Example 1: OLS Regression Example: Study time and student achievement. X variable: Average # hours spent studying per day Y variable: Score on reading test

[Scatterplot: reading score (Y) by hours studied (X)]

Case     X      Y
  1     2.6    28
  2     1.4    13
  3     .65    19
  4     4.1    31
  5     .25     8
  6     1.9    16

Example 2: Dichotomous Variable Ex: Did students pass the test (score > 18)? Does OLS regression make sense here?

[Scatterplot: pass (1) / fail (0) by hours studied (X)]

Case     X      Y
  1     2.6     1
  2     1.4     0
  3     .65     1
  4     4.1     1
  5     .25     0
  6     1.9     0

OLS & Dichotomous Variables Problem: OLS regression wasn’t really designed for dichotomous dependent variables Two possible outcomes (typically labeled 0 & 1) What kinds of problems come up? Linearity assumption doesn’t hold up Error distribution is not normal The model offers nonsensical predicted values Instead of predicting pass (1) or fail (0), the regression line might predict -.5.

The Linear Probability Model (LPM) Solution #1: Use OLS regression anyway! Dependent variable = the probability that a case scores 1 (as opposed to 0) In previous example, 1 = passed test; 0 = failed. We’ll assume that the probability changes as a linear function of independent variables: Note: This assumption may not be appropriate
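In equation form, the LPM is the usual linear model with a probability on the left-hand side:

$$P(Y_i = 1) = a + b_1 X_{1i} + b_2 X_{2i} + \dots + b_k X_{ki} + e_i$$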

Linear Probability Model (LPM) The LPM may yield reasonable results Often good enough to get a “crude look” at your data Results tend to be better if data is well behaved Ex: If there are decent numbers of cases in each category of the dependent variable. Interpretation: Coefficients (b) reflect the increase in probability of Y=1 for each unit change in X Constant (a) reflects the base probability of Y=1 if all X variables are zero Significance tests are done; but may not be trustworthy due to OLS assumption violations.

LPM Example: Own a gun? Stata OLS output:

. regress gun male educ income south liberal

      Source |       SS       df       MS              Number of obs =     850
-------------+------------------------------           F(  5,   844) =   17.86
       Model |  18.3727851     5  3.67455703           Prob > F      =  0.0000
    Residual |  173.628391   844  .205720843           R-squared     =  0.0957
-------------+------------------------------           Adj R-squared =  0.0903
       Total |  192.001176   849  .226149796           Root MSE      =  .45356

------------------------------------------------------------------------------
         gun |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        male |   .1637871   .0314914     5.20   0.000     .1019765    .2255978
        educ |  -.0153661     .00525    -2.93   0.004    -.0256706   -.0050616
      income |   .0379628   .0071879     5.28   0.000     .0238546    .0520711
       south |   .1539077   .0420305     3.66   0.000     .0714111    .2364043
     liberal |  -.0313841    .011572    -2.71   0.007    -.0540974   -.0086708
       _cons |     .13901   .1027844     1.35   0.177    -.0627331    .3407531
------------------------------------------------------------------------------

Interpretation: Each additional year of education decreases probability of gun ownership by .015. What about other vars?

LPM Example: Own a gun? OLS results can yield predicted probabilities Just plug in values of constant, X's into linear equation Ex: A conservative, poor, southern male:

------------------------------------------------------------------------------
         gun |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        male |   .1637871   .0314914     5.20   0.000     .1019765    .2255978
        educ |  -.0153661     .00525    -2.93   0.004    -.0256706   -.0050616
      income |   .0379628   .0071879     5.28   0.000     .0238546    .0520711
       south |   .1539077   .0420305     3.66   0.000     .0714111    .2364043
     liberal |  -.0313841    .011572    -2.71   0.007    -.0540974   -.0086708
       _cons |     .13901   .1027844     1.35   0.177    -.0627331    .3407531
------------------------------------------------------------------------------

LPM Example: Own a gun? Predicted probability for a female PhD student Highly educated northern liberal female

------------------------------------------------------------------------------
         gun |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        male |   .1637871   .0314914     5.20   0.000     .1019765    .2255978
        educ |  -.0153661     .00525    -2.93   0.004    -.0256706   -.0050616
      income |   .0379628   .0071879     5.28   0.000     .0238546    .0520711
       south |   .1539077   .0420305     3.66   0.000     .0714111    .2364043
     liberal |  -.0313841    .011572    -2.71   0.007    -.0540974   -.0086708
       _cons |     .13901   .1027844     1.35   0.177    -.0627331    .3407531
------------------------------------------------------------------------------
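A minimal sketch of both plug-in calculations in Stata, assuming illustrative covariate values (the exact inputs are not given on the slides; educ, income, and liberal settings below are made-up low/high values):

* hypothetical conservative, poor, southern male: educ=12, income=2, liberal=2
display .13901 + .1637871*1 - .0153661*12 + .0379628*2 + .1539077*1 - .0313841*2
* yields roughly .29

* hypothetical highly educated northern liberal female: educ=20, income=4, liberal=6
display .13901 + .1637871*0 - .0153661*20 + .0379628*4 + .1539077*0 - .0313841*6
* yields roughly -.20 -- a negative "probability", previewing the LPM weakness below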

LPM: Weaknesses Model yields nonsensical predicted values Probabilities should always fall between 0 and 1. Assumptions of OLS regression are violated Linearity Homoskedasticity (equal error variance across values of X): error variance is low for predictions near 0 or 1 and high in between Normality of error distribution Coefficients (b) are not biased; but not "best" (i.e., lowest possible sampling variance) Variances & standard errors will be inaccurate Hypothesis tests (t-tests, F-tests) can't be trusted

Logistic Regression Better Alternative: Logistic Regression Also called “Logit” A non-linear form of regression that works well for dichotomous dependent variables Other non-linear formulations also work (e.g., probit) Based on “odds” rather than probability Rather than model P(Y=1), we model “log odds” of Y=1 “Logit” refers to the natural log of an odds… Logistic regression is regression for a logit Rather than a simple variable “Y” (OLS) Or a probability (the Linear Probability Model).

Probability & Odds Probability of event A defined as p(A): Example: Coin Flip… probability of “heads” 1 outcome is “heads”, 2 total possible outcomes P(“heads”) = 1 / 2 = .5 Odds of A = Number of outcomes that are A, divided by number of outcomes that are not A Odds of “heads” = 1 / 1 = 1.0 Also equivalent to: probability of event over probability of it not happening: p/(1-p) = (.5 / 1-.5) = 1.0
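In symbols, odds and probability are interchangeable:

$$\text{odds} = \frac{p}{1-p}, \qquad p = \frac{\text{odds}}{1+\text{odds}}$$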

Logistic Regression We can convert a probability to odds: “Logit” = natural log (ln) of an odds Natural log means base “e”, not base 10 We can model a logit as a function of independent variables: Just as we model Y or a probability (the LPM)
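Written out, this is the standard logit specification:

$$\text{logit}(p_i) = \ln\!\left(\frac{p_i}{1-p_i}\right) = a + b_1 X_{1i} + \dots + b_k X_{ki}$$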

The Logit Curve Note: The predicted probability always falls between 0 and 1, even though the logit itself ranges over all real values From Knoke et al. p. 300

Logistic Regression Note: We can solve for “p” and reformulate the model: Why model this rather than a probability? Because it is a useful non-linear transformation It always generates Ps between 0 and 1, regardless of the values of X variables Note: probit transformation has similar effect.
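Solving for p yields the logistic (inverse-logit) form, which is bounded by 0 and 1:

$$p_i = \frac{e^{\,a + \sum_k b_k X_{ki}}}{1 + e^{\,a + \sum_k b_k X_{ki}}} = \frac{1}{1 + e^{-(a + \sum_k b_k X_{ki})}}$$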

Logistic Regression: Estimation Estimation: We can model the logit Recall: “Hat” = estimate… Solution requires Maximum Likelihood Estimation (MLE) In OLS there was an algebraic solution Here, we allow the computer to “search” for the best values of coefficients (“a” and “b”s) to fit observed data.
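The quantity the computer "searches" to maximize is the standard binary-outcome likelihood over the N cases:

$$L(a,b) = \prod_{i=1}^{N} \hat{p}_i^{\,Y_i}\,(1-\hat{p}_i)^{\,1-Y_i}$$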

Logistic Regression: Estimation Properties of Maximum Likelihood Estimation See Long & Freese 2003:69, Long 1997:54 for a summary “Consistent, efficient and asymptotically normal as N approaches infinity.” Large N = better! Rules of thumb regarding sample size N > 500 = fine; N < 100 can be worrisome Results aren’t necessarily wrong if N<100; But it is a possibility; and hard to know when problems crop up Plus ~10 cases per independent variable Eliason (1993) suggests minimum N~60 for up to 5 IVs Higher N is needed if data are problematic due to: Multicollinearity Limited variation in dependent variable.

Logistic Regression Benefits of logistic regression: You can now effectively model probability as a function of X variables You don't have to worry about violations of OLS assumptions Predictions fall between 0 and 1 Downsides: You lose the "simple" interpretation of linear coefficients In a linear model, the effect of each unit change in X on Y is consistent In a non-linear model, the effect isn't consistent… Also, you can't compute some stats (e.g., R-square).

Logistic Regression Example Stata output for gun ownership:

. logistic gun male educ income south liberal, coef

Logistic regression                               Number of obs   =        850
                                                  LR chi2(5)      =      89.53
                                                  Prob > chi2     =     0.0000
Log likelihood = -502.7251                        Pseudo R2       =     0.0818

------------------------------------------------------------------------------
         gun |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        male |   .7837017    .156764     5.00   0.000     .4764499    1.090954
        educ |  -.0767763   .0254047    -3.02   0.003    -.1265686    -.026984
      income |   .2416647   .0493794     4.89   0.000     .1448828    .3384466
       south |   .7363169   .1979038     3.72   0.000     .3484327    1.124201
     liberal |  -.1641107   .0578167    -2.84   0.005    -.2774294   -.0507921
       _cons |   -2.28572   .6200443    -3.69   0.000    -3.500984   -1.070455
------------------------------------------------------------------------------

Note: Results aren't that different from LPM We're dealing with big effects, large sample… But, predicted probabilities & SEs will be better.

Interpreting Coefficients Raw coefficients (bs) show effect of 1-unit change in X on the log odds of Y=1 Positive coefficients make “Y=1” more likely Negative coefficients mean “less likely” But, effects are not linear Effect of unit change on p(Y=1) isn’t same for all values of X! Rather, Xs have a linear effect on the “log odds” But, it is hard to think in units of “log odds”, so we need to do further calculations NOTE: log-odds interpretation doesn’t work on Probit!

Interpreting Coefficients Best way to interpret logit coefficients is to exponentiate them This converts from "log odds" to simple "odds" Exponentiation = opposite of natural log On a calculator, use the "e^x" or "inverse ln" function Exponentiated coefficients are called odds ratios An odds ratio of 3.0 indicates odds are 3 times higher for each unit change in X Or, you can say the odds increase "by a factor of 3". An odds ratio of .5 indicates odds decrease by ½ for each unit change in X. Odds ratios < 1 indicate negative effects.

Interpreting Coefficients Example: Do you drink coffee? Y=1 indicates coffee drinkers; Y=0 indicates no coffee Key independent variable: Year in grad program Observed "raw" coefficient: b = 0.67 A positive effect… each year increases log odds by .67 But how big is it really? Exponentiation: e^.67 = 1.95 Odds increase multiplicatively by 1.95 If a person's initial odds were 2.0 (2:1), an extra year of school would result in: 2.0*1.95 = 3.90 The odds nearly DOUBLE for each unit change in X Net of other variables in the model…
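A quick way to do these conversions in Stata (using the coffee example's coefficient):

* exponentiate the raw logit coefficient to get the odds ratio
display exp(0.67)
* odds of 2:1 after one extra year of school
display 2.0*exp(0.67)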

Interpreting Coefficients Exponentiated coefficients ("odds ratios") operate multiplicatively Effect on odds is found by multiplying coefficients e^b of 1.0 means that a variable has no effect Multiplying anything by 1.0 results in same value e^b > 1.0 means that the variable has a positive effect on the odds of "Y=1" e^b < 1.0 means that the variable has a negative effect Hint: Papers may present results as "raw" coefficients or odds ratios It is important to be aware of what you're looking at If all coeffs are positive, they might be odds ratios!

Interpreting Coefficients To further aid interpretation, we can: convert exponentiated coefficients to % change in odds Calculate: (exponentiated coef - 1)*100% Ex: (e^.67 – 1) * 100% = (1.95 – 1) * 100% = 95% Interpretation: Every unit change in X (year of school) increases the odds of coffee drinking by 95% What about a 2-point change in X? Is it 2 * 95%? No!!! You must multiply odds ratios: (1.95 * 1.95 – 1) * 100% = (3.80 – 1) * 100 = +280% 3-point change = (1.95 * 1.95 * 1.95 – 1) * 100% N-point change = (OR^n – 1) * 100%

Interpreting Coefficients What is the effect of a 1-unit decrease in X? No, you can't flip sign… it isn't -95% You must invert odds ratios to see the opposite effect Additional year in school = (1.95 – 1) * 100% = +95% One year less: (1/1.95 – 1)*100 = (.512 - 1)*100 = -48.7% What is the effect of two variables together? To combine odds ratios you must multiply Ex: Have a mean advisor; b = 1.2; OR = e^1.2 = 3.32 Effect of 1 additional year AND mean advisor: (1.95 * 3.32 – 1)*100 = (6.47 – 1) * 100% = 547% increase in odds of coffee drinking…

Interpreting Coefficients Gun ownership: Effect of education?

. logistic gun male educ income south liberal, coef

Logistic regression                               Number of obs   =        850
                                                  LR chi2(5)      =      89.53
                                                  Prob > chi2     =     0.0000
Log likelihood = -502.7251                        Pseudo R2       =     0.0818

------------------------------------------------------------------------------
         gun |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        male |   .7837017    .156764     5.00   0.000     .4764499    1.090954
        educ |  -.0767763   .0254047    -3.02   0.003    -.1265686    -.026984
      income |   .2416647   .0493794     4.89   0.000     .1448828    .3384466
       south |   .7363169   .1979038     3.72   0.000     .3484327    1.124201
     liberal |  -.1641107   .0578167    -2.84   0.005    -.2774294   -.0507921
       _cons |   -2.28572   .6200443    -3.69   0.000    -3.500984   -1.070455
------------------------------------------------------------------------------

(e^-.0768 – 1)*100% = 7.39% lower odds per year Also: Male: (e^.78 – 1)*100% = +118% -- more than double!

Raw Coefs vs. Odds ratios It is common to present results either way:

. logistic gun male educ income south liberal, coef

------------------------------------------------------------------------------
         gun |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        male |   .7837017    .156764     5.00   0.000     .4764499    1.090954
        educ |  -.0767763   .0254047    -3.02   0.003    -.1265686    -.026984
      income |   .2416647   .0493794     4.89   0.000     .1448828    .3384466
       south |   .7363169   .1979038     3.72   0.000     .3484327    1.124201
     liberal |  -.1641107   .0578167    -2.84   0.005    -.2774294   -.0507921
       _cons |   -2.28572   .6200443    -3.69   0.000    -3.500984   -1.070455
------------------------------------------------------------------------------

Can you see the relationship? Negative coefficients yield odds ratios below 1.0!

. logistic gun male educ income south liberal

------------------------------------------------------------------------------
         gun | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        male |   2.189562   .3432446     5.00   0.000     1.610347    2.977112
        educ |    .926097   .0235272    -3.02   0.003     .8811137    .9733768
      income |   1.273367   .0628781     4.89   0.000     1.155904    1.402767
       south |    2.08823   .4132686     3.72   0.000     1.416845    3.077757
     liberal |    .848648    .049066    -2.84   0.005     .7577291    .9504762
------------------------------------------------------------------------------

Interpreting Interactions Interactions work like linear regression

. gen maleXincome = male * income
. logistic gun male educ income maleXincome south liberal, coef

Logistic regression                               Number of obs   =        850
                                                  LR chi2(6)      =      93.10
                                                  Prob > chi2     =     0.0000
Log likelihood = -500.93966                       Pseudo R2       =     0.0850

------------------------------------------------------------------------------
         gun |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        male |   2.914016   1.186788     2.46   0.014     .5879542    5.240078
        educ |  -.0783493   .0254356    -3.08   0.002    -.1282022   -.0284964
      income |   .3595354   .0879431     4.09   0.000     .1871701    .5319008
 maleXincome |  -.1873155   .1030033    -1.82   0.069    -.3891982    .0145672
       south |   .7293419   .1987554     3.67   0.000     .3397886    1.118895
     liberal |  -.1671854   .0579675    -2.88   0.004    -.2807996   -.0535711
       _cons |   -3.58824   1.030382    -3.48   0.000     -5.60775   -1.568729
------------------------------------------------------------------------------

Income coef for women is .359. For men it is .359 + (-.187) = .172; exp(.172) = 1.19 Combining odds ratios (by multiplying) gives identical results: exp(.359) * exp(-.187) = 1.43 * .83 = 1.19
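A sketch of how you might get the men's income effect (with a proper standard error) directly in Stata after the interaction model above; the option shown is a standard lincom option, though the slides don't use it:

* combined income effect for men: income + maleXincome
lincom income + maleXincome
* roughly .172 on the log-odds scale; as an odds ratio:
lincom income + maleXincome, or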

Predicted Probabilities To determine predicted probabilities, first compute the predicted Logit value: Then, plug logit values back into P formula:
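Written out, with hats denoting estimates:

$$\hat{L}_i = \hat{a} + \hat{b}_1 X_{1i} + \dots + \hat{b}_k X_{ki}, \qquad \hat{p}_i = \frac{1}{1 + e^{-\hat{L}_i}}$$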

Predicted Probabilities: Own a gun? Predicted probability for a female PhD student Highly educated northern liberal female

------------------------------------------------------------------------------
         gun |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        male |   .7837017    .156764     5.00   0.000     .4764499    1.090954
        educ |  -.0767763   .0254047    -3.02   0.003    -.1265686    -.026984
      income |   .2416647   .0493794     4.89   0.000     .1448828    .3384466
       south |   .7363169   .1979038     3.72   0.000     .3484327    1.124201
     liberal |  -.1641107   .0578167    -2.84   0.005    -.2774294   -.0507921
       _cons |   -2.28572   .6200443    -3.69   0.000    -3.500984   -1.070455
------------------------------------------------------------------------------
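A sketch of the plug-in calculation, assuming the covariate values used on the "adjust" slide later (male=0, south=0, income=4, educ=20, liberal=7; these specific settings are an assumption):

* predicted logit for a highly educated northern liberal female
display -2.28572 + .7837017*0 - .0767763*20 + .2416647*4 + .7363169*0 - .1641107*7
* result is about -4.00; convert the logit to a probability
display invlogit(-4.0034)
* roughly .018, matching the P = .017 cited below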

The Logit Curve Effect of log odds on probability = nonlinear! From Knoke et al. p. 300

Predicted Probabilities Important point: Substantive effect of a variable on predicted probability differs depending on values of other variables If probability is already high (or low), variable changes may matter less… Suppose a 1-point change in X doubles the odds… Effect isn’t substantively consequential if probability (Y=1) is already very high Ex: 20:1 odds = .95 probability; 40:1 odds = .975 probability Change in probability is only .025 Effect matters a lot for cases with probabilities near .5 1:1 odds = .5 probability. 2:1 odds = .67 probability Change in probability is nearly .2!

Logit Example: Own a gun? Predicted probability of gun ownership for a female PhD student is very low: P=.017 Two additional years of education lowers probability from .017 to .015 – not a big effect Additional unit change can’t have a big effect – because probability can’t go below zero It would matter much more for a southern male…

Predicted Probabilities Predicted probabilities are a great way to make findings accessible to a reader Often people make bar graphs of probabilities 1. Show predicted probabilities for real cases Ex: probability of civil war for Ghana vs. Sweden 2. Show probabilities for “hypothetical” cases that exemplify key contrasts in your data Ex: Guns: Southern male vs. female PhD student 3. Show how a change in critical independent variable would affect predicted probability Ex: Guns: What would happen to southern male who went and got a PhD?

Predicted Probabilities: Stata Like OLS regression, we can calculate predicted values for all cases

. predict predprob, pr
(1488 missing values generated)

. list predprob gun if gun ~= .

       +----------------+
       | predprob   gun |
       |----------------|
    1. |  .486874     0 |
    2. | .6405225     1 |
    6. | .7078031     1 |
    9. | .6750654     1 |
   14. | .4243994     0 |
   17. | .0617232     0 |
   19. | .6556235     1 |
   22. | .6356462     0 |
   27. | .3670604     0 |
   32. | .5620316     0 |

Many of the predictions are pretty good But, some aren't!

Predicted Probabilities: Stata "Adjust" (Stata 9/10) and "margins" (Stata 11) commands can produce predicted values for different groups in your data Also – can set variables at mean or specific values Example: Probabilities for men/women

. adjust, pr by(male)

------------------------------------------------------------------
     Dependent variable: gun     Command: logistic
     Variables left as is: educ, income, south, liberal
------------------------------------------------------------------

----------------------
     male |         pr
----------+-----------
        0 |    .225814
        1 |    .417045
----------------------

Note that the predicted probability for men is nearly twice as high as for women.

Stata Notes: Adjust Command Stata “adjust” command can be tricky 1. By default it uses the entire sample, not just cases in your prior analysis Best to specify prior sample: adjust if e(sample), pr by(male) 2. For non-specified variables, stata uses group means (defined by “by” command) Don’t assume it pegs cases to overall sample mean Variables “left as is” take on mean for subgroups 3. It doesn’t take into account weighted data Use “lincom” if you have weighted data

Marginal Change in Logit Issue: How to best capture effect size in non-linear models? % Change in odds ratios for 1-unit change in X Change in actual probability for 1-unit change in X Either for hypothetical cases or an actual case Another option: marginal change The actual slope of the curve at a specific point Again, can be computed for real or hypothetical cases Use “adjust” (stata 9/10) or “margins” (stata 11) Recall from calculus: derivatives are slopes... So, a marginal change is just a derivative.
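For the logit model the marginal change has a simple closed form (a standard result), which shows why the slope depends on where p sits:

$$\frac{\partial p}{\partial X_k} = b_k \, p \, (1-p)$$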

Marginal vs Discrete Change in Logit Long and Freese 2006:169

Predicted Probabilities: Stata Effect of pol views & gender for PhD students Note that independent variables are set to values of interest. (Or can be set to mean).

. adjust south=0 income=4 educ=20, pr by(liberal male)

------------------------------------------------------------
     Dependent variable: gun     Command: logistic
     Covariates set to value: south = 0, income = 4, educ = 20
------------------------------------------------------------

----------------------------
          |       male
  liberal |       0        1
----------+-----------------
        1 | .046588  .096652
        2 | .039818  .083241
        3 | .033996  .071544
        4 |    .029   .06138
        5 | .024719  .052578
        6 | .021057  .044978
        7 | .017927  .038433
----------------------------

Graphing Predicted Probabilities P(Y=1) for Women & Men by Liberal

. scatter Women Men Liberal, c(l l)

Did model categorize cases correctly? We can choose a criterion: predicted P > .5:

. estat clas

              -------- True --------
Classified |         D            ~D  |      Total
-----------+--------------------------+-----------
     +     |        64            48  |        112
     -     |       229           509  |        738
-----------+--------------------------+-----------
     Total |       293           557  |        850

Classified + if predicted Pr(D) >= .5
True D defined as gun != 0
--------------------------------------------------
Sensitivity                     Pr( +| D)   21.84%
Specificity                     Pr( -|~D)   91.38%
Positive predictive value       Pr( D| +)   57.14%
Negative predictive value       Pr(~D| -)   68.97%
--------------------------------------------------
False + rate for true ~D        Pr( +|~D)    8.62%
False - rate for true D         Pr( -| D)   78.16%
False + rate for classified +   Pr(~D| +)   42.86%
False - rate for classified -   Pr( D| -)   31.03%
--------------------------------------------------
Correctly classified                        67.41%
--------------------------------------------------

The model yields predicted p > .5 for 112 people; only 64 of them actually have guns Overall, this simple model doesn't offer extremely accurate predictions… 67% of people are correctly classified Note: Results change if you use a different criterion (e.g., p > .6)

Sensitivity / Specificity of Prediction Sensitivity: Of gun owners, what proportion were correctly predicted to own a gun? Specificity: Of non-gun owners, what proportion did we correctly predict? Choosing a different probability cutoff affects those values If we reduce the cutoff to P > .4, we’ll catch a higher proportion of gun owners But, we’ll incorrectly identify more non-gun owners. And, we’ll have more false positives.

Sensitivity / Specificity of Prediction Stata can produce a plot showing how predictions will change if we vary “P” cutoff: Stata command: lsens

Hypothesis tests Testing hypotheses using logistic regression H0: There is no effect of year in grad program on coffee drinking H1: Year in grad school is associated with coffee Or, one-tail test: Year in school increases probability of coffee MLE estimation yields standard errors… like OLS Test statistic: 2 options; both yield same results z = b/SE… analogous to the t-test in OLS regression (Stata labels it z for logit) Wald test (Chi-square, 1 df); essentially the square of z Reject H0 if Wald or z > critical value Or if p-value less than alpha (usually .05).
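In symbols (standard large-sample MLE tests):

$$z = \frac{\hat{b}}{SE(\hat{b})}, \qquad W = z^2 \sim \chi^2_1 \ \text{under } H_0$$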

Model Fit: Likelihood Ratio Tests MLE computes a likelihood for the model “Better” models have higher likelihoods Log likelihood is typically a negative value, so “better” means a less negative value… -100 > -1000 Log likelihood ratio test: Allows comparison of any two nested models One model must be a subset of vars in other model You can’t compare totally unrelated models! Models must use the exact same sample.

Model Fit: Likelihood Ratio Tests Default LR test comparison: Current model versus “null model” Null model = only a constant; no covariates; K=0 Also useful: Compare small & large model Do added variables (as a group) fit the data better? Ex: Suppose a theory suggests 4 psychological variables will have an important effect… We could use LR test to compare “base model” to model with 4 additional variables. STATA: Run first model; “store” estimates; run second model; use stata command “lrtest” to compare models
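A sketch of that sequence in Stata (the outcome and variable names here are hypothetical, following the coffee example):

* base model; store its estimates
logit coffee yearinprog
estimates store base
* larger model adding the four psychological variables
logit coffee yearinprog psych1 psych2 psych3 psych4
* LR test of the stored base model against the current (full) model
lrtest base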

Model Fit: Likelihood Ratio Tests Likelihood ratio test is based on the G-square Chi-square distributed; df = K1 – K0 K = # variables; K1 = full model, K0 = simpler model L1 = likelihood for full model; L0 = simpler model Significant likelihood ratio test indicates that the larger model (L1) is an improvement G2 > critical value; or p-value < .05.
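The statistic itself:

$$G^2 = -2\ln\!\left(\frac{L_0}{L_1}\right) = 2\,(\ln L_1 - \ln L_0) \;\sim\; \chi^2_{K_1 - K_0}$$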

Model Fit: Likelihood Ratio Tests Stata's default LR test; compares to null model

. logistic gun male educ income south liberal, coef

Logistic regression                               Number of obs   =        850
                                                  LR chi2(5)      =      89.53
                                                  Prob > chi2     =     0.0000
Log likelihood = -502.7251                        Pseudo R2       =     0.0818

------------------------------------------------------------------------------
         gun |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        male |   .7837017    .156764     5.00   0.000     .4764499    1.090954
        educ |  -.0767763   .0254047    -3.02   0.003    -.1265686    -.026984
      income |   .2416647   .0493794     4.89   0.000     .1448828    .3384466
       south |   .7363169   .1979038     3.72   0.000     .3484327    1.124201
     liberal |  -.1641107   .0578167    -2.84   0.005    -.2774294   -.0507921
       _cons |   -2.28572   .6200443    -3.69   0.000    -3.500984   -1.070455
------------------------------------------------------------------------------

LR chi2(5) indicates G-square for 5 degrees of freedom Prob > chi2 is a p-value. p < .05 indicates a significantly better model Model log likelihood = -502.7 Null model is a lower value (more negative)

Model Fit: Likelihood Ratio Tests Example: Null model log likelihood: -547.5; Full model: -502.7 G² = 2 × (-502.7 - (-547.5)) = 89.5 5 new variables, so K1 – K0 = 5. According to the χ² table, crit value = 11.07 Since 89.5 greatly exceeds 11.07, we are confident that the full model is an improvement Also, observed p-value in STATA output is .000!

Model Fit: Pseudo R-Square "A descriptive measure that indicates roughly the proportion of observed variation accounted for by the… predictors." Knoke et al, p. 313

Logistic regression                               Number of obs   =        850
                                                  LR chi2(5)      =      89.53
                                                  Prob > chi2     =     0.0000
Log likelihood = -502.7251                        Pseudo R2       =     0.0818

------------------------------------------------------------------------------
         gun | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        male |   2.189562   .3432446     5.00   0.000     1.610347    2.977112
        educ |    .926097   .0235272    -3.02   0.003     .8811137    .9733768
      income |   1.273367   .0628781     4.89   0.000     1.155904    1.402767
       south |    2.08823   .4132686     3.72   0.000     1.416845    3.077757
     liberal |    .848648    .049066    -2.84   0.005     .7577291    .9504762
------------------------------------------------------------------------------

Model explains roughly 8% of variation in Y

Assumptions & Problems Assumption: Independent random sample Serial correlation or clustering violate assumptions; bias SE estimates and hypothesis tests We will discuss possible remedies in the future Multicollinearity: High correlation among independent variables causes problems Unstable, inefficient estimates Watch for coefficient instability, check VIF/tolerance Remove unneeded variables or create indexes of related variables.

Assumptions & Problems Outliers/Influential cases Unusual/extreme cases can distort results, just like OLS Logistic requires different influence statistics Example: dbeta – very similar to OLS “Cooks D” Outlier diagnostics are available in STATA After model: “predict outliervar, dbeta” Lists & graphs of residuals & dbetas can identify influential cases.

Plotting Residuals by Casenumber

. predict sresid, rstandard
. gen casenum = _n
. scatter sresid casenum

Assumptions & Problems Insufficient variance: You need cases for both values of the dependent variable Extremely rare (or common) events can be a problem Suppose N=1000, but only 3 are coded Y=1 Estimates won't be great Also: Maximum likelihood estimates cannot be computed if any independent variable perfectly predicts the outcome (Y=1) Ex: Suppose taking sociology classes drives all students to drink coffee... So there is no variation… In that case, you cannot include a dummy variable for taking sociology classes in the model.

Assumptions & Problems Model specification / Omitted variable bias Just like any regression model, it is critical to include appropriate variables in the model Omission of important factors or ‘controls’ will lead to misleading results.

Probit Probit models are an alternative to logistic regression Involves a different non-linear transformation Generally yields results very similar to logit models Coefficients are rescaled by factor of (approx) 1.6 For ‘garden variety’ analyses, there is little reason to prefer either logit or probit But, probit has advantages in some circumstances Ex: Multinomial models that violate the IIA assumption (to be discussed later).

Example: Unions and Political Participation Handout

Example: Coup d'etat Issue: Many countries face the threat of a coup d'etat – violent overthrow of the regime What factors affect whether a country will have a coup? Paper Handout: Belkin and Schofer (2005) What are the basic findings? How much do the odds of a coup differ for military regimes vs. civilian governments? b = 1.74; (e^1.74 – 1)*100% = +470% What about a 2-point increase in log GDP? b = -.233; ((e^-.233 * e^-.233) – 1)*100% = -37%