Overview of Logistic Regression and Its SAS Implementation


Logistic regression is widely used in finance, marketing research, and clinical studies when the dependent variable is dichotomous, representing an event or a non-event. Because ordinary linear regression was routinely used for such data before modern statistical packages could fit logit models, we first compare the statistical assumptions of logistic regression with those of ordinary least squares linear regression. Next we examine PROC LOGISTIC as implemented in SAS and discuss the basic statistical output needed to understand logistic regression results. We then discuss how to set up and interpret logistic regression when the dependent variable has more than two outcomes. We conclude by comparing PROC LOGISTIC with other SAS procedures that can also perform logistic regression.

Examples of discrete responses: getting a disease vs. not getting a disease; good, medium, and bad credit risks; responders vs. non-responders (in both marketing and clinical trial studies); married vs. unmarried; guilty vs. not guilty.

Comparing linear and logistic regression: the linear probability model vs. the logit model
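
A minimal sketch of the two models being compared, assuming the standard single-predictor forms. The symbols $p_i$ (probability of the event for observation $i$), $\alpha$, $\beta$, and $x_i$ are assumptions of this sketch and are not taken from the slides.

```latex
\begin{align*}
\text{Linear probability model:}\quad & p_i = \alpha + \beta x_i \\
\text{Logit model:}\quad & \ln\!\left(\frac{p_i}{1 - p_i}\right) = \alpha + \beta x_i
\quad\Longleftrightarrow\quad
p_i = \frac{1}{1 + e^{-(\alpha + \beta x_i)}}
\end{align*}
```

The linear form leaves $p_i$ unbounded, while the logit form keeps it strictly between 0 and 1.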

Why can linear regression work reasonably well on binary dependent variables?

Consequences of violating each of the five linear regression assumptions:

Assumption 1. Consequence of violation: biased parameter estimates. Notes: the parameters are hard to interpret, except as a linear approximation to a nonlinear function; predictions can be < 0 or > 1.
Assumption 2. Consequence of violation: biased intercept estimate.
Assumption 3. Consequence of violation: unbiased parameter estimates, but biased estimates of their variance. Notes: biased confidence intervals.
Assumption 4. Consequence of violation: same as 3.
Assumption 5. Consequence of violation: the t and F tests for regression models cannot be used. Notes: the estimates may still be approximately normal if the sample size is large; a biased variance makes the t and F tests unreliable.

If 1) and 2) are true, it can be shown that 3) and 5) are necessarily false. However, the consequences may not be as serious as you might expect.
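
The note that predictions "can be < 0, > 1" can be illustrated with a small sketch: fit an ordinary least squares line to a 0/1 outcome and inspect the fitted values. The data and the helper `ols_fit` are hypothetical, made up only for this illustration.

```python
# Illustrative only: the ages and outcomes below are made-up data.

def ols_fit(xs, ys):
    """Return (intercept, slope) of the least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

ages = [20, 30, 40, 50, 60, 70]
events = [0, 0, 0, 1, 1, 1]  # binary dependent variable

intercept, slope = ols_fit(ages, events)

# Fitted "probabilities" from the linear probability model.
predictions = [intercept + slope * age for age in ages]

for age, p in zip(ages, predictions):
    flag = "  <-- outside [0, 1]" if p < 0 or p > 1 else ""
    print(f"age={age:3d}  predicted={p:6.3f}{flag}")
```

In the middle of the age range the fitted line behaves like a reasonable probability, which is one reason the linear model can still work tolerably well in practice.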

Logistic regression for binary response variables

Basic syntax:

    /* Fit a stepwise logistic model of CHD on age; save the estimates in PARMS */
    proc logistic data=chdage1 outest=parms descending;
      model chd = age / selection=stepwise ctable pprob=(0 to 1 by 0.1) outroc=roc1;
    run;

    /* Apply the saved estimates to each observation */
    proc score data=chdage1 score=parms out=scored type=parms;
      var age;
    run;

In the events/trials syntax, you specify two variables that contain count data for a binomial experiment. These two variables are separated by a slash. The value of the first variable, events, is the number of positive responses (or events). The value of the second variable, trials, is the number of trials.
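
A scored linear predictor is on the log-odds scale, so it must be passed through the inverse logit to obtain a probability. The sketch below shows that conversion. It assumes the score produced by the scoring step is the linear predictor, and the coefficient values are hypothetical, chosen only for illustration.

```python
import math

def inverse_logit(xbeta):
    """Map a linear predictor (log-odds) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-xbeta))

# Hypothetical estimates, chosen only for illustration (not from the slides).
intercept = -5.0
slope_age = 0.1

def predicted_probability(age):
    """Probability of the event at a given age under the hypothetical model."""
    return inverse_logit(intercept + slope_age * age)

for age in (20, 50, 80):
    print(f"age={age}  p={predicted_probability(age):.3f}")
```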

Interpretation of SAS output - continued

Model convergence criterion: the change in parameter estimates between iterations is small enough.

Model fit statistics:
    -2 Log L = -2 * log(max likelihood)
    AIC = -2 * log(max likelihood) + 2 * k
    SIC = -2 * log(max likelihood) + log(N) * k
For AIC (Akaike Information Criterion) and SIC (Schwarz Information Criterion), the smaller the better. k is the number of estimated parameters (the covariates plus the intercept) and N is the sample size.

Testing the global null hypothesis BETA=0:
    Likelihood ratio: -2 * [ln(L0) - ln(Lmax)], where L0 is the likelihood of the intercept-only model and Lmax is the likelihood of the model with intercept and covariates. It is similar to the F test in a linear model: the larger the difference, the better.
    Score: based on the first and second derivatives of log(L).
    Wald: (coefficient / std error)^2
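
The fit statistics above are simple functions of the maximized log-likelihoods, as the following sketch shows. The function names and the numeric inputs are hypothetical, and k is taken here to count every estimated parameter, intercept included, which is an assumption of this sketch.

```python
import math

def fit_statistics(loglik_null, loglik_model, n_params, n_obs):
    """Return -2 Log L, AIC, SIC, and the likelihood ratio chi-square.

    loglik_null  : maximized log-likelihood of the intercept-only model
    loglik_model : maximized log-likelihood of the fitted model
    n_params     : k, the number of estimated parameters (assumed to include the intercept)
    n_obs        : N, the sample size
    """
    neg2_loglik = -2.0 * loglik_model
    return {
        "neg2_loglik": neg2_loglik,
        "aic": neg2_loglik + 2.0 * n_params,
        "sic": neg2_loglik + math.log(n_obs) * n_params,
        "lr_chi_square": -2.0 * (loglik_null - loglik_model),
    }

def wald_chi_square(coefficient, std_error):
    """Wald statistic for a single coefficient: (coefficient / std error)^2."""
    return (coefficient / std_error) ** 2

# Made-up log-likelihoods, for illustration only.
stats = fit_statistics(loglik_null=-68.0, loglik_model=-54.0, n_params=2, n_obs=100)
for name, value in stats.items():
    print(f"{name:14s} {value:8.3f}")
```

A larger likelihood ratio chi-square means the covariates improve the fit more over the intercept-only model, while smaller AIC and SIC values indicate the preferred model among candidates.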

Interpretation of SAS output - continued

Analysis of Maximum Likelihood Estimates: parameter estimates and their significance tests.

Odds Ratio Estimates:
    Odds: O = p / (1 - p)
    Odds ratio: Oi / Oj, the ratio of the odds per unit change in the covariate.

Association of Predicted Probabilities and Observed Responses:
    Pairs: 43 (events) * 57 (non-events) = 2451
    Concordant: the non-event (0) has the lower predicted probability and the event (1) the higher.
    Discordant: the non-event (0) has the higher predicted probability and the event (1) the lower.
    Tie: all other pairs.

The ROC curve is used to visualize the model's predictive strength.
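
The pair counts and the odds ratio can be reproduced by hand, as sketched below. The predicted probabilities are made up, the function names are hypothetical, and the c statistic shown (concordant pairs plus half the ties, divided by all pairs) is the usual area-under-the-ROC summary rather than something stated on the slide.

```python
import math

def pair_association(event_probs, nonevent_probs):
    """Count concordant, discordant, and tied (event, non-event) pairs."""
    concordant = discordant = tied = 0
    for p_event in event_probs:
        for p_nonevent in nonevent_probs:
            if p_event > p_nonevent:
                concordant += 1   # event has the higher predicted probability
            elif p_event < p_nonevent:
                discordant += 1   # event has the lower predicted probability
            else:
                tied += 1
    pairs = len(event_probs) * len(nonevent_probs)
    return {
        "pairs": pairs,
        "concordant": concordant,
        "discordant": discordant,
        "tied": tied,
        "c": (concordant + 0.5 * tied) / pairs,
    }

def odds_ratio(beta, units=1.0):
    """Odds ratio for a change of `units` in a covariate with coefficient beta."""
    return math.exp(beta * units)

# Made-up predicted probabilities, for illustration only.
event_probs = [0.9, 0.7, 0.4]
nonevent_probs = [0.8, 0.4, 0.2, 0.1]

result = pair_association(event_probs, nonevent_probs)
print(result)
print("pairs with 43 events and 57 non-events:", 43 * 57)
```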

Interpretation of SAS output - continued

Classification table: the model classifies an observation as an event if its estimated probability is greater than or equal to a given probability cutpoint.
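
The classification rule can be sketched as a loop over cutpoints from 0 to 1 in steps of 0.1. The data and function name are hypothetical, and this sketch uses the raw predicted probabilities with no bias adjustment, so it should be read as the basic idea only.

```python
def classification_table(probabilities, outcomes, cutpoint):
    """Cross-tabulate predicted vs. observed events at one probability cutpoint."""
    tp = fp = tn = fn = 0
    for p, y in zip(probabilities, outcomes):
        predicted_event = p >= cutpoint  # "greater than or equal to" the cutpoint
        if predicted_event and y == 1:
            tp += 1
        elif predicted_event and y == 0:
            fp += 1
        elif y == 1:
            fn += 1
        else:
            tn += 1
    return {
        "cutpoint": cutpoint,
        "true_pos": tp,
        "false_pos": fp,
        "true_neg": tn,
        "false_neg": fn,
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        "correct": (tp + tn) / len(outcomes),
    }

# Made-up predicted probabilities and observed outcomes (1 = event).
probabilities = [0.9, 0.7, 0.4, 0.8, 0.4, 0.2, 0.1]
outcomes = [1, 1, 1, 0, 0, 0, 0]

cutpoints = [i / 10 for i in range(11)]  # 0.0, 0.1, ..., 1.0
tables = [classification_table(probabilities, outcomes, c) for c in cutpoints]
at_half = tables[5]  # the row for cutpoint 0.5

for row in tables:
    print(f"{row['cutpoint']:.1f}  sens={row['sensitivity']:.2f}  "
          f"spec={row['specificity']:.2f}  correct={row['correct']:.2f}")
```

Raising the cutpoint trades sensitivity for specificity, which is the same trade-off the ROC curve traces out.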

Logistic regression for polychotomous response variables

Example: three outcomes
The cumulative probability model
The assumption: a common slope parameter is associated with the predictor.
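
A minimal sketch of a cumulative probability model with three ordered outcomes, assuming the usual cumulative logit form logit(P(Y <= j)) = a_j + b * x with one intercept per cumulative split and a single common slope. The intercepts, slope, and function names are hypothetical.

```python
import math

def inverse_logit(z):
    """Map a log-odds value to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def cumulative_logit_probs(x, intercepts, slope):
    """Category probabilities under a cumulative logit model.

    intercepts : ascending values a_1 < a_2 < ... (one per cumulative split)
    slope      : the common slope b shared by every split
    Returns one probability per ordered category (len(intercepts) + 1 of them).
    """
    # Cumulative probabilities P(Y <= j); the last category closes at 1.
    cumulative = [inverse_logit(a + slope * x) for a in intercepts] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)  # P(Y = j) = P(Y <= j) - P(Y <= j - 1)
        previous = c
    return probs

# Hypothetical parameters for three outcomes (two intercepts, one slope).
probs = cumulative_logit_probs(x=1.0, intercepts=[-1.0, 1.0], slope=0.5)
print([round(p, 3) for p in probs])
```

Because the slope is shared, changing x shifts every cumulative logit by the same amount, which is exactly the common-slope assumption.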

Logistic regression for polychotomous response variables

Examples:

    proc logistic data=diabetes descending;
      model group = glutest;
      output out=probs predicted=prob xbeta=logit;
      format group gp.;
    run;

Other SAS Procedures for Logistic Regression Models

LOGISTIC
    Notes: the events/trials format only works for a binary response.

GENMOD
    Model options: Dist=Binomial Link=Logit
    Notes: one of the generalized linear models.

PROBIT
    Model options: Dist=Logistic

CATMOD
    Model:
        proc catmod data=diabetes;
          direct glutest;
          response logits / out=cat_prob;
          model group = glutest;
        run;
    Notes: allows for individual parameters.

PHREG
    Model:
        proc phreg data=diabetes;
          model t * group = glutest;
    Notes: the trick is that events occur at time one and non-events occur at a later time (censored). t is a dummy time variable and group is the censoring variable.

The GENMOD procedure fits a generalized linear model to the data by maximum likelihood estimation of the parameter vector. Its syntax is similar to that of PROC GLM for the specification of the response and model effects, including interaction terms and automatic coding of classification variables.

The PROBIT procedure calculates maximum likelihood estimates of regression parameters and the natural (or threshold) response rate for quantal response data from biological assays or other discrete event data. This includes probit, logit, ordinal logistic, and extreme value (or gompit) regression models.

The CATMOD procedure performs categorical data modeling of data that can be represented by a contingency table. PROC CATMOD fits linear models to functions of response frequencies, and it can be used for linear modeling, log-linear modeling, logistic regression, and repeated measurement analysis.

The PHREG procedure performs regression analysis of survival data based on the Cox proportional hazards model. Cox's semiparametric model is widely used in the analysis of survival data to explain the effect of explanatory variables on survival times.
