
Logistic Regression I

Outline
- Introduction to maximum likelihood estimation (MLE)
- Introduction to generalized linear models
- The simplest logistic regression (from a 2x2 table): illustrates how the math works
- Step-by-step examples
- Dummy variables
- Confounding and interaction

Introduction to Maximum Likelihood Estimation

A little coin problem… You have a coin that you know is biased towards heads, and you want to know what the probability of heads (p) is. YOU WANT TO ESTIMATE THE UNKNOWN PARAMETER p.

Data: You flip the coin 10 times and the coin comes up heads 7 times. What's your best guess for p? Can we agree that your best guess is 0.7, based on the data?

The Likelihood Function

What is the probability of our data (seeing 7 heads in 10 coin tosses) as a function of p? The number of heads in 10 coin tosses is a binomial random variable with N = 10 and unknown success probability p, so:

L(p) = (10 choose 7) p^7 (1 − p)^3

This function is called a LIKELIHOOD FUNCTION. It gives the likelihood (or probability) of our data as a function of our unknown parameter p.

The Likelihood Function We want to find the p that maximizes the probability of our data (or, equivalently, that maximizes the likelihood function). THE IDEA: We want to find the value of p that makes our data the most likely, since it’s what we saw!

Maximizing a function… Here comes the calculus. Recall: how do you maximize a function?
1. Take the log of the function. This turns a product into a sum, for ease of taking derivatives. [The log of a product equals the sum of the logs: log(a·b·c) = log a + log b + log c, and log(a^c) = c·log a.]
2. Take the derivative with respect to p. The derivative with respect to p gives the slope of the tangent line at any value of p.
3. Set the derivative equal to 0 and solve for p. The value of p where the slope of the tangent line is 0 (a horizontal tangent) must occur at a peak or a trough.

1. Take the log of the likelihood function:
log L(p) = log(10 choose 7) + 7·log p + 3·log(1 − p)
2. Take the derivative with respect to p:
d/dp [log L(p)] = 7/p − 3/(1 − p)
3. Set the derivative equal to 0 and solve for p:
7/p = 3/(1 − p)  =>  7(1 − p) = 3p  =>  p = 7/10 = 0.7
Jog your memory: the derivative of a constant is 0; the derivative of 7·f(x) is 7·f'(x); the derivative of log x is 1/x; chain rule.

The actual maximum value of the likelihood might not be very high. RECAP: here, the maximized likelihood is L(0.7) ≈ 0.267, so the −2 log likelihood (which will become useful later) is −2·log(0.267) ≈ 2.64.

Thus, the MLE of p is 0.7. So we've managed to prove the obvious here! But many times it's not obvious what your best guess for a parameter is. MLE tells us what the most likely values are of regression coefficients, odds ratios, averages, differences in averages, etc. (Getting the variance of that best-guess estimate is much trickier; it's based on the second derivative, but that's for another time.)
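The coin example can be checked without calculus. The following is a minimal sketch in Python (not part of the original lecture, which uses SAS): it evaluates the binomial likelihood over a grid of candidate values of p and confirms that the maximum lands at 0.7, with −2 log likelihood ≈ 2.64.

```python
import math

def binom_lik(p, n=10, k=7):
    """Likelihood of k heads in n tosses as a function of p."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

# Grid search over candidate values of p in (0, 1).
grid = [i / 1000 for i in range(1, 1000)]
p_hat = max(grid, key=binom_lik)          # value of p maximizing the likelihood

neg2loglik = -2 * math.log(binom_lik(p_hat))
print(p_hat)                  # 0.7, matching the calculus answer k/n
print(round(neg2loglik, 2))   # 2.64
```

A grid search scales poorly beyond one or two parameters, which is why real software maximizes the log likelihood with derivative-based methods instead.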

Generalized Linear Models: twice the generality! The generalized linear model is a generalization of the general linear model. SAS uses PROC GLM for general linear models and PROC GENMOD for generalized linear models.

Recall: linear regression
- Requires normally distributed response variables and homogeneity of variances.
- Uses least squares estimation to estimate parameters: finds the line that minimizes the total squared error around the line.
- Sum of Squared Error: SSE = Σ(Y_i − (α + βx_i))²
- Minimize the squared error function: set derivative[Σ(Y_i − (α + βx_i))²] = 0 and solve for α, β.
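Setting those derivatives to zero yields closed-form least squares estimates. A small sketch in Python (the toy data are made up for illustration) computes them directly from the usual normal-equation formulas:

```python
def ols(xs, ys):
    """Intercept and slope minimizing sum((y - (a + b*x))**2)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # Closed-form solution of the normal equations for simple regression.
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    return a, b

# Toy data lying exactly on the line y = 1 + 2x.
xs = [0, 1, 2, 3]
ys = [1, 3, 5, 7]
a, b = ols(xs, ys)
print(a, b)  # 1.0 2.0
```

With noiseless data the fitted line recovers the true intercept and slope exactly; with noisy data it returns the least squares compromise.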

Why generalize? General linear models require normally distributed response variables and homogeneity of variances. Generalized linear models do not. The response variables can be binomial, Poisson, or exponential, among others.

Example: the Bernoulli (binomial) distribution. Outcome: lung cancer (yes/no); predictor: smoking (cigarettes/day).

Could model the probability of lung cancer directly: p = α + β1·X, with X = smoking (cigarettes/day). But why might this not be best modeled as linear? (A probability must lie between 0 and 1, while the line α + β1·X is unbounded.)

Alternatively… log(p/(1 − p)) = α + β1·X. This is the logit function.

The Logit Model

logit(p) = log(p/(1 − p)) = α + β1x1 + β2x2 + β3x3 + β4x4 …

The left-hand side is the logit function (the log odds of disease or outcome); e^α gives the baseline odds, and β1x1 + β2x2 + β3x3 + β4x4 … is a linear function of the risk factors and covariates for individual i. (Bolded variables in the slides represent vectors.)

Example: for a given individual i, the log odds of the outcome equal the baseline log odds α plus the individual's covariate values weighted by their coefficients.

Relating odds to probabilities

Odds:  p/(1 − p) = e^(α + βx)

Algebra:
p = e^(α + βx) · (1 − p)
p + p·e^(α + βx) = e^(α + βx)
p · (1 + e^(α + βx)) = e^(α + βx)

Probability:  p = e^(α + βx) / (1 + e^(α + βx))
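The logit and its inverse are exact mirror images of each other, which is what makes the algebra above work. A quick sketch in Python (not from the lecture) defines both and checks the round trip:

```python
import math

def logit(p):
    """Log odds: log(p / (1 - p))."""
    return math.log(p / (1 - p))

def inv_logit(x):
    """Inverse logit (the logistic/expit function): e^x / (1 + e^x)."""
    return math.exp(x) / (1 + math.exp(x))

p = 0.8
x = logit(p)                    # log odds: log(0.8 / 0.2) = log(4)
print(round(inv_logit(x), 6))   # 0.8, the round trip recovers p
```

Note that logit(0.5) = 0: even odds correspond to a log odds of zero, which is why β = 0 means "no association" later on.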

Individual Probability Functions

The probability associated with each individual's observed outcome y_i (0 or 1) is:

P(Y_i = y_i) = p_i^(y_i) · (1 − p_i)^(1 − y_i)

Example: if y_i = 1 this equals p_i; if y_i = 0 it equals 1 − p_i.

The Likelihood Function

The likelihood function is an equation for the joint probability of the observed events as a function of β:

L(β) = Π p_i^(y_i) · (1 − p_i)^(1 − y_i)

Maximum Likelihood Estimates of β. Take the log of the likelihood function to change the product into a sum. Then maximize the function (just basic calculus): take the derivative of the log likelihood, set the derivative equal to 0, and solve for β.

"Adjusted" Odds Ratio Interpretation: for a binary exposure with coefficient β, the adjusted odds ratio is OR = e^β, holding the other covariates in the model constant.

Adjusted odds ratio, continuous predictor: for a Δ-unit increase in a continuous predictor with coefficient β, the adjusted odds ratio is e^(β·Δ).

Practical Interpretation: the odds of disease increase multiplicatively by e^β for every one-unit increase in the exposure, controlling for the other variables in the model.
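This multiplicative reading can be made concrete. A brief sketch in Python (the coefficient value 0.3 is a made-up illustration, not from the lecture data):

```python
import math

beta = 0.3  # hypothetical log odds ratio per one-unit increase in exposure

or_per_unit = math.exp(beta)        # odds ratio for a 1-unit increase
or_per_5 = math.exp(5 * beta)       # 5-unit increase: e^(5*beta) = (e^beta)**5

print(round(or_per_unit, 3))  # 1.35
print(round(or_per_5, 3))     # 4.482
```

The key point: odds ratios for larger increments compound multiplicatively, i.e. e^(5β) = (e^β)^5, not 5·e^β.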

Simple Logistic Regression

2x2 Table (courtesy Hosmer and Lemeshow)

Under the model logit(p) = α + β·Exposure, the cell probabilities are:

              Exposure = 1                  Exposure = 0
Disease = 1   e^(α+β) / (1 + e^(α+β))      e^α / (1 + e^α)
Disease = 0   1 / (1 + e^(α+β))            1 / (1 + e^α)

Odds Ratio for the simple 2x2 table (courtesy Hosmer and Lemeshow):

OR = [odds of disease when exposed] / [odds of disease when unexposed] = e^(α+β) / e^α = e^β
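The collapse of that ratio to e^β can be verified numerically for any α and β. A minimal sketch in Python (the values of α and β are arbitrary illustrations):

```python
import math

def p_model(alpha, beta, x):
    """P(disease | exposure x) under logit(p) = alpha + beta*x."""
    z = alpha + beta * x
    return math.exp(z) / (1 + math.exp(z))

alpha, beta = -1.2, 0.9   # arbitrary illustrative coefficients
p1 = p_model(alpha, beta, 1)   # P(D = 1 | E = 1)
p0 = p_model(alpha, beta, 0)   # P(D = 1 | E = 0)

# Odds ratio formed from the 2x2 table of model probabilities.
odds_ratio = (p1 / (1 - p1)) / (p0 / (1 - p0))
print(abs(odds_ratio - math.exp(beta)) < 1e-9)  # True: OR = e^beta
```

Because the (1 + e^z) denominators cancel within each column, the odds ratio depends only on β, never on α.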

Example 1: CHD and Age (2x2) (from Hosmer and Lemeshow)

              ≥55 yrs   <55 yrs
CHD present      21        22
CHD absent        6        51

The Logit Model: logit(p) = α + β·X, where X = 1 if age ≥55 and X = 0 if age <55.

The Likelihood: L(α, β) = Π p_i^(y_i) · (1 − p_i)^(1 − y_i), the product taken over all subjects.

The Log Likelihood: log L(α, β) = Σ [y_i·(α + βx_i) − log(1 + e^(α + βx_i))]

Derivative(s) of the log likelihood: ∂log L/∂α = Σ(y_i − p_i) and ∂log L/∂β = Σ x_i·(y_i − p_i), where p_i = e^(α + βx_i)/(1 + e^(α + βx_i)). Setting both to 0 gives the MLEs.

Maximize α: setting ∂log L/∂α = 0 gives e^α̂ = the odds of disease in the unexposed (<55).

Maximize  1

Hypothesis Testing, H0: β = 0 (the null value of beta is 0: no association)
1. The Wald test: Z = β̂ / SE(β̂), or equivalently χ² = (β̂ / SE(β̂))² with 1 df.
2. The Likelihood Ratio test: LR = −2·[log L(reduced) − log L(full)] ~ χ² with p df, where the reduced model has k parameters and the full model has k + p parameters.

Hypothesis Testing, H0: β = 0
1. What is the Wald test here?
2. What is the Likelihood Ratio test here?
- Full model = includes the age variable
- Reduced model = includes only the intercept
The maximum likelihood for the reduced model ought to be (.43)^43 × (.57)^57 (43 cases and 57 controls). Does MLE yield this?

The Reduced Model

Likelihood value for the reduced model: the intercept-only model estimates p̂ = 43/100 = .43 for everyone, so e^α̂ = .43/.57 = the marginal odds of CHD!
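Using the marginal totals of 43 subjects with CHD and 57 without, the reduced (intercept-only) model can be evaluated directly. A small sketch in Python (not from the original SAS session):

```python
import math

cases, controls = 43, 57
n = cases + controls
p_hat = cases / n   # 0.43, the marginal proportion with CHD

# Intercept-only model: exp(alpha_hat) is the marginal odds of CHD.
alpha_hat = math.log(cases / controls)

# -2 log likelihood of the reduced model: -2 * log(.43^43 * .57^57).
neg2loglik = -2 * (cases * math.log(p_hat) + controls * math.log(1 - p_hat))

print(round(math.exp(alpha_hat), 3))  # 0.754, the marginal odds 43/57
print(round(neg2loglik, 2))           # 136.66
```

This −2 log likelihood is the "reduced" piece of the likelihood ratio test; subtracting the full model's −2 log likelihood gives the LR statistic.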

Likelihood value of full model

Finally the LR…

Example 2: >2 exposure levels (dummy coding) (from Hosmer and Lemeshow)

CHD status   White   Black   Hispanic   Other
Present         5      20       15        10
Absent         20      10       10        10

SAS CODE

data race;
  input chd race_2 race_3 race_4 number;
  * one row per chd-by-race cell; number = cell count;
  datalines;
1 0 0 0 5
1 1 0 0 20
1 0 1 0 15
1 0 0 1 10
0 0 0 0 20
0 1 0 0 10
0 0 1 0 10
0 0 0 1 10
;
run;

proc logistic data=race descending;
  weight number;
  model chd = race_2 race_3 race_4;
run;

Note the use of "dummy variables." The "baseline" category is White here.

What's the likelihood here? In this case there is more than one unknown beta (regression coefficient), so β represents a vector of beta coefficients: L(β) = Π p_i^(y_i)·(1 − p_i)^(1 − y_i), with logit(p_i) = α + β1·race_2 + β2·race_3 + β3·race_4.

SAS OUTPUT – model fit Intercept Intercept and Criterion Only Covariates AIC SC Log L Testing Global Null Hypothesis: BETA=0 Test Chi-Square DF Pr > ChiSq Likelihood Ratio Score Wald

SAS OUTPUT – regression coefficients Analysis of Maximum Likelihood Estimates Standard Wald Parameter DF Estimate Error Chi-Square Pr > ChiSq Intercept race_ race_ race_

SAS output – OR estimates The LOGISTIC Procedure Odds Ratio Estimates Point 95% Wald Effect Estimate Confidence Limits race_ race_ race_ Interpretation: 8x increase in odds of CHD for Black vs. White; 6x increase in odds of CHD for Hispanic vs. White; 4x increase in odds of CHD for Other vs. White.
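Because each dummy variable compares one group to the White reference, these odds ratios can be read straight off the 2x2 sub-tables. A sketch in Python (the cell counts below are assumed values chosen to be consistent with the reported odds ratios, not output from the original slide):

```python
# Assumed cell counts (CHD present, CHD absent) by race, consistent
# with the reported odds ratios of 8, 6, and 4 vs. the White reference.
counts = {
    "white":    (5, 20),   # reference category
    "black":    (20, 10),
    "hispanic": (15, 10),
    "other":    (10, 10),
}

ref_cases, ref_controls = counts["white"]
ref_odds = ref_cases / ref_controls   # odds of CHD in the reference group

for race in ("black", "hispanic", "other"):
    cases, controls = counts[race]
    print(race, (cases / controls) / ref_odds)  # black 8.0, hispanic 6.0, other 4.0
```

Each printed ratio matches e^β̂ for the corresponding dummy variable: dummy coding makes every coefficient a comparison against the baseline category.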

Example 3: Prostate Cancer Study (same data as from lab 3). Question: Does PSA level predict tumor penetration into the prostatic capsule (yes/no)? (This is a bad outcome, meaning the tumor has spread.) Is this association confounded by race? Does race modify this association (interaction)?

1.What’s the relationship between PSA (continuous variable) and capsule penetration (binary)?

[Scatter plot: capsule (yes/no) vs. psa (mg/ml)]

[Plot: mean PSA per quintile (mg/ml) vs. proportion with capsule=yes. S-shaped?]

[Logit plot: estimated logit vs. psa, by quintiles. Linear in the logit?]

[Plot: psa (mg/ml) vs. proportion with capsule=yes, by decile]

[Logit plot: estimated logit vs. psa, by decile]

Model: capsule = psa Testing Global Null Hypothesis: BETA=0 Test Chi-Square DF Pr > ChiSq Likelihood Ratio <.0001 Score <.0001 Wald <.0001 Analysis of Maximum Likelihood Estimates Standard Wald Parameter DF Estimate Error Chi-Square Pr > ChiSq Intercept <.0001 psa <.0001

Model: capsule = psa race Analysis of Maximum Likelihood Estimates Standard Wald Parameter DF Estimate Error Chi-Square Pr > ChiSq Intercept psa <.0001 race No indication of confounding by race, since the regression coefficient for psa is essentially unchanged in magnitude.

Model: capsule = psa race psa*race Standard Wald Parameter DF Estimate Error Chi-Square Pr > ChiSq Intercept psa race psa*race Evidence of effect modification by race (p=.07).

STRATIFIED BY RACE:

race=0 (white): Standard Wald Parameter DF Estimate Error Chi-Square Pr > ChiSq Intercept <.0001 psa <.0001

race=1 (black): Analysis of Maximum Likelihood Estimates Standard Wald Parameter DF Estimate Error Chi-Square Pr > ChiSq Intercept psa

How to calculate ORs from a model with an interaction term

Standard Wald Parameter DF Estimate Error Chi-Square Pr > ChiSq Intercept psa race psa*race

Increased odds for every 5 mg/ml increase in PSA:
If white (race=0): OR = e^(5·β_psa)
If black (race=1): OR = e^(5·(β_psa + β_psa*race))
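The stratum-specific odds ratios above can be sketched numerically. The coefficient values below are hypothetical placeholders (the slide's fitted estimates were not preserved in this transcript); only the formulas are taken from the model:

```python
import math

# Hypothetical coefficients from a model with a psa*race interaction.
b_psa = 0.04          # log odds per 1 mg/ml of PSA (assumed value)
b_interaction = 0.08  # additional log odds per mg/ml when race = 1 (assumed)

delta = 5  # a 5 mg/ml increase in PSA

or_white = math.exp(delta * b_psa)                    # race = 0
or_black = math.exp(delta * (b_psa + b_interaction))  # race = 1

print(round(or_white, 3))  # 1.221
print(round(or_black, 3))  # 1.822
```

With an interaction term in the model, there is no single "adjusted" odds ratio for PSA: the effect must be reported separately for each race stratum, exactly as the stratified output does.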