Logistic regression: A quick intro

Why Logistic Regression?
- Big idea: the dependent variable is a dichotomy (though it can be used for more than 2 categories, i.e. multinomial logistic regression)
- Why would we use it?
  - It is one thing to use a t-test (or its multivariate counterpart) to show that groups are different; however, the research goal may be to predict group membership
- Clinical/Medical context
  - Schizophrenic or not
  - Clinical depression or not
  - Cancer or not
- Social/Cognitive context
  - Vote yes or no
  - Prefer A over B
  - Graduate or not

Basic Model (Same as MR)
[Path diagram: predictors X1, X2, X3, X4 each pointing to a categorical outcome Y]

Questions
- Can the cases be accurately classified given a set of predictors?
- Can the solution generalize to predicting new cases?
- Comparison of the equation with predictors plus intercept to a model with just the intercept
- What is the relative importance of each predictor?
- How does each variable affect the outcome?
- Does a predictor make the solution better or worse, or have no effect?
- Are there interactions among predictors?
- Does adding interactions among predictors (continuous or categorical) significantly improve the model?
- Can parameters be accurately estimated?
- What is the strength of association between the outcome variable and a set of predictors?

Why logistic regression? Why not?
- Goal: to assess the likelihood of falling into one of the DV categories, given a set of predictors
- Does not require the assumptions of linearity, homoscedasticity and normality that we had in MR, though the outcome categories must be exclusive and exhaustive, and there are LR counterparts to those assumptions
- Differs from DFA in that DFA focuses on loadings, whereas logistic regression focuses on odds ratios: how much more likely it is that an individual will fall into the target outcome category, given a 1-unit change in a predictor
- While logistic regression is more flexible in terms of assumptions, it usually requires larger samples because it uses maximum likelihood estimation
- If your DFA meets its assumptions, it might be the better (more statistically powerful) alternative
- Furthermore, with DFA one can assess different linear combinations on which groups may be classified if there are more than two groups in the DV

Multiple regression approach
- With MR, we used a method that minimizes the squared deviations from our predicted values
- This can't really be pulled off with a dichotomous outcome
  - Only two outcome values to produce residuals
  - Can't meet the normality or homoscedasticity assumptions
  - While it could produce what are essentially predicted probabilities of belonging to a particular group, those probabilities are not bounded by 0 and 1
- Logistic regression will allow us to go about the prediction/explanation process in a similar manner, but without these problems
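A minimal sketch of this point in R, using simulated data (all names here are made up for illustration): an ordinary linear model fit to a 0/1 outcome produces "predicted probabilities" outside the 0-1 range, while the logistic model keeps them in range.

# illustrative only: simulated data, not the lecture example
set.seed(42)
x <- rnorm(200)
p <- plogis(-0.5 + 2 * x)             # true probabilities
y <- rbinom(200, size = 1, prob = p)  # dichotomous outcome
lin  <- lm(y ~ x)                      # linear probability model
logi <- glm(y ~ x, family = binomial)  # logistic regression
range(fitted(lin))                     # can fall below 0 or above 1
range(fitted(logi))                    # always strictly between 0 and 1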

Assumptions
- The only "real" limitation with logistic regression is that the outcome must be discrete
- If the distributional assumptions are met, then discriminant function analysis may be more powerful, although it has been shown to overestimate the association when using discrete predictors
- If the outcome is continuous, then multiple regression is more powerful, given that its assumptions are met

Assumptions
- Ratio of cases to variables: using discrete variables requires that there are enough responses in every given category to allow reasonable estimation of parameters/predictive power
  - Due to the maximum likelihood approach, some suggest as many as 50 cases per predictor as a rule of thumb
- Linearity in the logit: the IVs should have a linear relationship with the logit form of the DV (a sketch of one common check appears below)
  - There is no assumption about the predictors being linearly related to each other
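One common way to check linearity in the logit for a continuous predictor is a Box-Tidwell style test: add the predictor multiplied by its own natural log and see whether that extra term is significant. A rough sketch, assuming a binary outcome y and a strictly positive continuous predictor x in a data frame dat (names are illustrative):

# Box-Tidwell style check for linearity in the logit (x must be > 0)
dat$x_logx <- dat$x * log(dat$x)
fit_bt <- glm(y ~ x + x_logx, family = binomial, data = dat)
summary(fit_bt)  # a significant x_logx term suggests non-linearity in the logit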

Assumptions
- Absence of collinearity among predictors
- No outliers
- Independence of errors
- Assumes categories are mutually exclusive

Model fit
- Significance test: log-likelihood (LL) χ² test between the model (M) with predictors + intercept and the intercept-only (I) model
  - If the likelihood-ratio χ² test is significant, the model with predictors is preferred
- Goodness-of-fit statistics help you determine whether the model adequately describes the data
  - Here statistical significance is not desired
  - More like a badness-of-fit measure really, and problematic since one can't accept the null hypothesis on the basis of non-significance
  - Best used descriptively, perhaps
- Pseudo R² statistics
  - In this dichotomous situation we have trouble devising an R² (see the sketch below)
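A rough sketch of these model-fit checks in base R, assuming a binary outcome y and predictors x1, x2 in a data frame dat (all names are illustrative):

fit  <- glm(y ~ x1 + x2, family = binomial, data = dat)
null <- glm(y ~ 1, family = binomial, data = dat)   # intercept-only model
# likelihood-ratio chi-square test: predictors + intercept vs. intercept only
anova(null, fit, test = "Chisq")
# one pseudo R-squared (McFadden's): 1 - LL(model)/LL(intercept-only)
1 - as.numeric(logLik(fit)) / as.numeric(logLik(null))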

Coefficients
- In interpreting coefficients we're now thinking about a particular case's tendency toward some outcome
- The problem with probabilities is that they are non-linear
  - Going from .10 to .20 doubles the probability, but going from .80 to .90 only increases it somewhat in relative terms
- With logistic regression we start to think about the odds
  - Odds are just an alternative way of expressing the likelihood (probability) of an event
  - Probability is the expected number of occurrences of the event divided by the total number of possible outcomes
  - Odds are the expected number of occurrences of the event divided by the expected number of non-occurrences
  - Odds express the likelihood of occurrence relative to the likelihood of non-occurrence

Odds
- Let's begin with probability. Let's say that the probability of success is .8, thus
  - p = .8
- Then the probability of failure is
  - q = 1 - p = .2
- The odds of success are defined as
  - odds(success) = p/q = .8/.2 = 4
  - that is, the odds of success are 4 to 1
- We can also define the odds of failure
  - odds(failure) = q/p = .2/.8 = .25
  - that is, the odds of failure are 1 to 4

Odds Ratio
- Next, let's compute the odds ratio:
  - OR = odds(success)/odds(failure) = 4/.25 = 16
  - The interpretation of this odds ratio is that the odds of success are 16 times greater than the odds of failure
- If we had formed the odds ratio the other way around, with the odds of failure in the numerator, we would have gotten
  - OR = odds(failure)/odds(success) = .25/4 = .0625
  - Here the interpretation is that the odds of failure are one-sixteenth the odds of success
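These calculations are easy to reproduce; a quick sketch in R:

p <- 0.8                      # probability of success
q <- 1 - p                    # probability of failure
odds_success <- p / q         # 4
odds_failure <- q / p         # 0.25
odds_success / odds_failure   # odds ratio = 16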

Logit
- The logit is the natural (base e) log of the odds
  - Often called the log odds
- The logit scale is linear
- Logits are continuous and are centered on zero (kind of like z-scores)
  - p = 0.50, odds = 1, then logit = 0
  - p = 0.70, odds = 2.33, then logit = 0.85
  - p = 0.30, odds = 0.43, then logit = -0.85
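A quick sketch of these conversions in R (qlogis and plogis are base R's logit and inverse-logit functions):

p <- c(0.50, 0.70, 0.30)
odds  <- p / (1 - p)          # 1.00, 2.33, 0.43
logit <- log(odds)            # 0, 0.85, -0.85
all.equal(logit, qlogis(p))   # same thing via base R's logit function
plogis(logit)                 # back to the original probabilities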

Logit
- So, conceptually, putting things in our standard regression form:
  - log odds = b0 + b1X
  - Now a one-unit change in X leads to a b1 change in the log odds
- In terms of odds: odds = e^(b0 + b1X)
- In terms of probability: p = e^(b0 + b1X) / (1 + e^(b0 + b1X))
- Thus the logit, odds and probability are different ways of expressing the same thing
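A concrete sketch of moving among the three forms, with made-up coefficients (b0 = -1, b1 = 0.5):

b0 <- -1; b1 <- 0.5; X <- 2    # illustrative values
log_odds <- b0 + b1 * X        # logit = 0
odds <- exp(log_odds)          # odds = 1
p <- odds / (1 + odds)         # probability = .5
exp(b1)                        # factor change in the odds per one-unit change in X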

Coefficients
- The raw coefficients for our predictor variables in the output are the amount of increase in the log odds given a one-unit increase in that predictor
- The coefficients are determined through an iterative process that finds the values that best match the data at hand
  - Maximum likelihood
  - Starts with a set of coefficients (e.g. ordinary least squares estimates) and then alters them until there is almost no change in fit

Coefficients
- We also receive a different type of coefficient, expressed in odds
  - Anything above 1 suggests an increase in the odds of the event; anything below 1, a decrease in the odds
  - For example, if it is 1.14, moving up 1 unit on the predictor variable increases the odds of the event by a factor of 1.14
- Essentially it is the odds ratio for one value of X vs. the next value of X
- More intuitively, it gives the percentage increase (or decrease) in the odds of being a member of the outcome group with a one-unit increase in the predictor (e.g. 1.14 corresponds to a 14% increase); a sketch of obtaining both coefficient types follows below
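A minimal sketch of pulling both coefficient types from a fitted binomial glm in R (object and variable names are illustrative):

fit <- glm(y ~ x1 + x2, family = binomial, data = dat)
coef(fit)          # raw coefficients: change in log odds per one-unit increase
exp(coef(fit))     # odds ratios: multiplicative change in the odds
exp(confint(fit))  # confidence intervals on the odds-ratio scale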

Example
- Example: predicting art museum visitation by education, age, income, and political views
  - GSS93 dataset
- Key things to look for:
  - Model fit: pseudo-R²
  - Coefficients
  - Classification accuracy
- Performing a logistic regression is no different from multiple regression
  - Once the appropriate function/menu is selected, one selects variables in the same fashion and may do sequential, stepwise, etc.
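In R, the analysis described here would look roughly like the sketch below; the data object and variable names are assumptions for illustration and may be named differently in your copy of the GSS93 data.

# illustrative: outcome coded 1 = visited an art museum, 0 = did not
museum_fit <- glm(visitart ~ educ + age + income + polviews,
                  family = binomial, data = gss93)
summary(museum_fit)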

Model fit
- Cox & Snell's value would not reach 1.0 even for a perfect fit
- Nagelkerke's is a rescaled version of Cox & Snell's that would
  - Probably preferred, but it may be a little optimistic (just like our regular R-square)
- The Hosmer and Lemeshow GOF test suggests we're OK too
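A rough sketch of computing these by hand for a fitted binomial glm called fit; the Hosmer-Lemeshow line assumes the ResourceSelection package, and all object names are illustrative.

n   <- nobs(fit)
ll1 <- as.numeric(logLik(fit))
ll0 <- as.numeric(logLik(update(fit, . ~ 1)))   # intercept-only model
cox_snell  <- 1 - exp(2 * (ll0 - ll1) / n)
nagelkerke <- cox_snell / (1 - exp(2 * ll0 / n))
# Hosmer-Lemeshow goodness-of-fit test (non-significance = no evidence of misfit)
library(ResourceSelection)
hoslem.test(fit$y, fitted(fit), g = 10)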

Coefficients
- It would appear that age is the only predictor that doesn't contribute significantly
  - Note its odds ratio of 1.00
- Polviews (1 = extremely liberal, 7 = extremely conservative) isn't perhaps doing much either
  - The more conservative, the less likely to go to a museum
- Education
  - More education, more likely to visit
- Income
  - Higher income, more likely to visit

Classification
- Classification table
- Here we get a good sense of how well we're able to predict the outcome
- 69% correct overall, compared to 58.7% if we had just guessed the more prevalent class, 'no' (a sketch of building such a table follows below)
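A minimal sketch of building a classification table from a fitted binomial glm at a .5 cut-off (object names are illustrative):

pred_class <- ifelse(fitted(fit) >= 0.5, 1, 0)        # predicted group at a .5 cut-off
tab <- table(Predicted = pred_class, Actual = fit$y)
tab
sum(diag(tab)) / sum(tab)            # overall correct classification rate
max(table(fit$y)) / length(fit$y)    # accuracy from guessing the most prevalent class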

Other measures regarding classification
Given a 2 x 2 classification table:

             Actual +   Actual -
Predicted +      a          b
Predicted -      c          d

Measure                        Calculation
Prevalence                     (a + c)/N
Overall Diagnostic Power       (b + d)/N
Correct Classification Rate    (a + d)/N
Sensitivity                    a/(a + c)
Specificity                    d/(b + d)
False Positive Rate            b/(b + d)
False Negative Rate            c/(a + c)
Positive Predictive Power      a/(a + b)
Negative Predictive Power      d/(c + d)
Misclassification Rate         (b + c)/N
Odds ratio                     (ad)/(cb)
Kappa                          [(a + d) - ((a + c)(a + b) + (b + d)(c + d))/N] / [N - ((a + c)(a + b) + (b + d)(c + d))/N]
NMI                            1 - [-a·ln(a) - b·ln(b) - c·ln(c) - d·ln(d) + (a+b)·ln(a+b) + (c+d)·ln(c+d)] / [N·ln(N) - ((a+c)·ln(a+c) + (b+d)·ln(b+d))]

The classification stats from DFA would apply here as well.
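A quick sketch of computing several of these measures from hypothetical 2 x 2 counts (the numbers below are made up and follow the cell labels in the table above):

a <- 40   # predicted +, actual +
b <- 10   # predicted +, actual -
c <- 15   # predicted -, actual +
d <- 35   # predicted -, actual -
N <- a + b + c + d
(a + d) / N          # correct classification rate
a / (a + c)          # sensitivity
d / (b + d)          # specificity
a / (a + b)          # positive predictive power
d / (c + d)          # negative predictive power
(a * d) / (c * b)    # odds ratio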

Doing a much better logreg in R

# more output using the Design library; the x and y arguments will allow us to
# validate later by freeing up the predictors and outcome for bootstrapping
# (Design has since been superseded by the rms package, which keeps these functions)
library(Design)
attach(Dataset)
GLM.2 <- lrm(outcome ~ pred1 + pred2 + pred3, x = TRUE, y = TRUE, data = Dataset)
GLM.2

# this part is required for the Design library to do effects summaries for your
# predictors; the 'options' line isn't necessary unless you put this code before
# fitting the model
ddist <- datadist(pred1, pred2, pred3)   # list all of your predictors here
options(datadist = "ddist")

# the actual summaries, including odds ratios and CIs for them
summary(GLM.2)
plot(GLM.2)

# validate the model so as to get a bias-corrected R2 and other metrics
validate(GLM.2, method = "boot", B = 100)