Applied Epidemiologic Analysis - P8400 Fall 2002

Labs 6 & 7: Case-Control Analysis: Logistic Regression

Henian Chen, M.D., Ph.D.

Logistic Regression for Intercept Only

SAS Program

    proc logistic data=case_control978 descending;
      model status=;
    run;

The DESCENDING option makes SAS model the probability (and report the OR) for dependent variable = 1.

SAS Output

The LOGISTIC Procedure

Model Information

    Data Set                     WORK.CASE_CONTROL978
    Response Variable            status
    Number of Response Levels    2
    Number of Observations       978
    Model                        binary logit
    Optimization Technique       Fisher's scoring

Logistic Regression for Intercept Only

SAS Output

Response Profile

    Ordered Value    status    Total Frequency

Probability modeled is status=1.

Model Convergence Status

    Convergence criterion (GCONV=1E-8) satisfied.

    -2 Log L =

Analysis of Maximum Likelihood Estimates

    Parameter    DF    Estimate    Standard Error    Wald Chi-Square    Pr > ChiSq
    Intercept                                                           <.0001

Logistic Regression for Intercept Only

$\log\left[\frac{Y}{1-Y}\right] = \alpha$

$Y = \frac{e^{\alpha}}{1 + e^{\alpha}} = \frac{\exp(\alpha)}{1 + \exp(\alpha)}$

In our model, α is the log odds of cancer for the total sample, and the odds are $e^{\alpha}$.

Y = exp(α) / [1 + exp(α)] = 200/978, the proportion of cases among the 978 observations.

Y is related to α in the logistic model.
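
The relationship between α and Y can be checked numerically. The sketch below is only an illustration, not SAS output: it assumes the 200 in the slide is the number of cases and that the remaining observations out of 978 are controls, then recovers α, the odds, and Y from those counts. The function name `inverse_logit` is a label chosen here.

```python
import math


def inverse_logit(log_odds):
    """Convert a log odds value into a probability: exp(a) / (1 + exp(a))."""
    return math.exp(log_odds) / (1.0 + math.exp(log_odds))


# Assumption: 200 cases among the 978 observations; the rest are controls.
cases = 200
total = 978
controls = total - cases

alpha = math.log(cases / controls)  # intercept implied by the counts
odds = math.exp(alpha)              # odds of cancer in the total sample
y = inverse_logit(alpha)            # probability of being a case

print(f"alpha = {alpha:.4f}, odds = {odds:.4f}, Y = {y:.4f}")
```

Under these assumptions Y comes back as exactly the sample proportion of cases, which is why the intercept-only model carries no more information than the raw case fraction.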

Logistic Regression for Dichotomous Predictor

Alcohol consumption (alcgrp): 0 = 0-39 gm/day; 1 = 40+ gm/day

SAS Program

    proc logistic data=case_control978 descending;
      model status=alcgrp;
    run;

SAS Output

Model Fit Statistics

    Criterion    Intercept Only    Intercept and Covariates
    -2 Log L

Likelihood Ratio Test

    G = [-2 Log L, intercept only] - [-2 Log L, intercept and covariates], df = 1

The model with variable 'alcgrp' fits significantly better than the intercept-only model.
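
As a rough sketch of how the likelihood ratio test works for one added covariate: G is the drop in -2 Log L, and with df = 1 its p-value can be obtained from the complementary error function, since a chi-square variable with 1 df is a squared standard normal. The -2 Log L values used below are hypothetical placeholders, not the values from this SAS output, and the function name is chosen here for illustration.

```python
import math


def likelihood_ratio_test_df1(neg2logl_null, neg2logl_model):
    """Return (G, p-value) for a likelihood ratio test with 1 degree of freedom.

    G = (-2 Log L of the intercept-only model) - (-2 Log L of the fitted model).
    For df = 1, P(chi-square > G) = erfc(sqrt(G / 2)).
    """
    g = neg2logl_null - neg2logl_model
    if g < 0:
        raise ValueError("G is negative; check the order of the two -2 Log L values")
    p_value = math.erfc(math.sqrt(g / 2.0))
    return g, p_value


# Hypothetical -2 Log L values, NOT taken from the SAS output above.
g, p = likelihood_ratio_test_df1(1000.0, 996.16)
print(f"G = {g:.2f}, p = {p:.4f}")
```

The shortcut through `erfc` only holds for df = 1; a test with more degrees of freedom would need a general chi-square distribution function.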

Logistic Regression for Dichotomous Predictor

SAS Output

Analysis of Maximum Likelihood Estimates

    Parameter    DF    Estimate    Standard Error    Wald Chi-Square    Pr > ChiSq
    Intercept                                                           <.0001
    alcgrp             1.7641                                           <.0001

Odds Ratio Estimates

    Effect    Point Estimate    95% Wald Confidence Limits
    alcgrp

The intercept α is the log odds of cancer for light drinkers (alcgrp=0). The log odds of cancer for heavy drinkers (alcgrp=1) is α + β = -0.827, which with β = 1.7641 implies α = -2.5911.

Applying Y = exp(log odds) / [1 + exp(log odds)] gives Y ≈ 0.07 for light drinkers and Y ≈ 0.30 for heavy drinkers.

OR = exp(β) = exp(1.7641) ≈ 5.84

Heavy drinkers (alcgrp=1) have about 6 times the odds of cancer of light drinkers (alcgrp=0).

The OR does not depend on α in the logistic model.
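
The arithmetic on this slide can be reproduced in a few lines. This is a sketch under one assumption: the intercept is back-calculated from the two figures that do appear in the text (β = 1.7641 and a log odds of -0.827 for heavy drinkers), rather than read from the SAS table.

```python
import math


def inverse_logit(log_odds):
    """Convert a log odds value into a probability."""
    return math.exp(log_odds) / (1.0 + math.exp(log_odds))


beta = 1.7641            # coefficient for alcgrp, from the slide
log_odds_heavy = -0.827  # log odds for alcgrp=1, from the slide
alpha = log_odds_heavy - beta  # assumption: intercept implied by the two values above

y_light = inverse_logit(alpha)         # alcgrp = 0
y_heavy = inverse_logit(alpha + beta)  # alcgrp = 1
odds_ratio = math.exp(beta)

# The OR can also be rebuilt from the two probabilities; alpha cancels out.
or_from_probs = (y_heavy / (1 - y_heavy)) / (y_light / (1 - y_light))

print(f"alpha = {alpha:.4f}")
print(f"Y light = {y_light:.3f}, Y heavy = {y_heavy:.3f}")
print(f"OR = {odds_ratio:.3f}, OR from probabilities = {or_from_probs:.3f}")
```

The last two numbers agree whatever the intercept is, which is the sense in which the OR is not related to α.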

Logistic Regression for Ordinal Predictor

Alcohol consumption (alcgrp4): 0 = 0-39 gm/day; 1 = 40-79 gm/day; 2 = 80-119 gm/day; 3 = 120+ gm/day

SAS Program

    proc logistic data=case_control978 descending;
      model status=alcgrp4;
    run;

SAS Output

Model Fit Statistics

    Criterion    Intercept Only    Intercept and Covariates
    -2 Log L

Likelihood Ratio Test

    G = [-2 Log L, intercept only] - [-2 Log L, intercept and covariates], df = 1

The model with variable 'alcgrp4' fits significantly better than the intercept-only model.

Logistic Regression for Ordinal Predictor

SAS Output

Analysis of Maximum Likelihood Estimates

    Parameter    DF    Estimate    Standard Error    Wald Chi-Square    Pr > ChiSq
    Intercept                                                           <.0001
    alcgrp4            1.0453                                           <.0001

Odds Ratio Estimates

    Effect     Point Estimate    95% Wald Confidence Limits
    alcgrp4

OR = exp(1.0453) ≈ 2.84

Men with alcgrp4=1 have about 3 times the odds of cancer of men with alcgrp4=0. Because alcgrp4 enters the model as a single linear term, the same OR applies to any one-level step: alcgrp4=2 vs. alcgrp4=1, or alcgrp4=3 vs. alcgrp4=2.

OR = exp[(3-1)*1.0453] = exp(2.0906) ≈ 8.09 for alcgrp4=3 vs. alcgrp4=1

OR = exp[(3-0)*1.0453] = exp(3.1359) ≈ 23.01 for alcgrp4=3 vs. alcgrp4=0
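
The odds ratios for comparisons more than one level apart all come from the same coefficient. A minimal sketch, using the slide's estimate of 1.0453 and a helper name chosen here for illustration:

```python
import math

BETA_ALCGRP4 = 1.0453  # coefficient for alcgrp4, from the slide


def odds_ratio_between_levels(high, low, beta=BETA_ALCGRP4):
    """OR comparing level `high` with level `low` of a linearly coded ordinal predictor."""
    return math.exp((high - low) * beta)


for high, low in [(1, 0), (3, 1), (3, 0)]:
    print(f"alcgrp4={high} vs. alcgrp4={low}: OR = {odds_ratio_between_levels(high, low):.2f}")
```

Only the difference between the two levels matters, so 1 vs. 0, 2 vs. 1, and 3 vs. 2 all return the same value. That equal-step property is a modeling assumption of entering alcgrp4 as one linear term, not something the data guarantee.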

Logistic Regression for Continuous Predictor

Alcohol consumption (alcohol): daily consumption in grams

SAS Program

    proc logistic data=case_control978 descending;
      model status=alcohol;
    run;

SAS Output

Analysis of Maximum Likelihood Estimates

    Parameter    DF    Estimate    Standard Error    Wald Chi-Square    Pr > ChiSq
    Intercept                                                           <.0001
    alcohol                                                             <.0001

Odds Ratio Estimates

    Effect     Point Estimate    95% Wald Confidence Limits
    alcohol

Logistic Regression for Continuous Predictor

OR = exp(0.0261) ≈ 1.026. The odds of cancer increase by a factor of about 1.026 for each one-gram increase in daily alcohol consumption.

OR = exp[40*(0.0261)] = exp(1.044) ≈ 2.84 for a 40-gram increase in alcohol consumption per day.

OR = exp[120*(0.0261)] = exp(3.132) ≈ 22.92 for a man who drinks 160 grams per day compared with a man who is similar in other respects but drinks 40 grams per day.
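
For a continuous predictor, the OR for any increase is the per-unit OR raised to the size of that increase. A small sketch, with the slide's coefficient of 0.0261 per gram and a function name chosen here:

```python
import math

BETA_PER_GRAM = 0.0261  # coefficient for alcohol (per gram per day), from the slide


def odds_ratio_for_increase(grams, beta=BETA_PER_GRAM):
    """OR associated with an increase of `grams` in daily alcohol consumption."""
    return math.exp(grams * beta)


for grams in (1, 40, 160 - 40):
    print(f"+{grams} g/day: OR = {odds_ratio_for_increase(grams):.3f}")
```

Note that 160 vs. 40 grams per day and, say, 120 vs. 0 grams per day give the same OR, because only the 120-gram difference enters the calculation.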

Interaction in Logistic Regression

Model without interaction: $\text{logit}(\text{status}) = \alpha + \beta_1\,\text{alcgrp} + \beta_2\,\text{tobgrp}$

β1: the effect of alcohol on cancer, controlling for tobacco (i.e., the same OR across levels of tobacco)

β2: the effect of tobacco on cancer, controlling for alcohol (i.e., the same OR across levels of alcohol)

Model with interaction: $\text{logit}(\text{status}) = \alpha + \beta_1\,\text{alcgrp} + \beta_2\,\text{tobgrp} + \beta_3\,\text{alcgrp} \times \text{tobgrp}$

β1: the effect of alcohol on cancer among non-smokers (tobgrp=0)

β2: the effect of tobacco on cancer among light drinkers (alcgrp=0)

β3: the interaction between alcohol and tobacco, i.e., how much the log OR for one exposure changes when the other exposure is present
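
To see why β1 becomes a stratum-specific effect once the product term is added, it helps to write out the OR for alcohol at a fixed tobacco level t. The short derivation below uses only the interaction model as stated; the symbol t is introduced here for illustration.

```latex
\begin{align*}
\log \mathrm{OR}_{\text{alcohol} \mid \text{tobgrp}=t}
  &= \bigl[\alpha + \beta_1(1) + \beta_2 t + \beta_3 (1)(t)\bigr]
   - \bigl[\alpha + \beta_1(0) + \beta_2 t + \beta_3 (0)(t)\bigr] \\
  &= \beta_1 + \beta_3 t \\[4pt]
\mathrm{OR}_{\text{alcohol} \mid \text{tobgrp}=0} &= \exp(\beta_1) \\
\mathrm{OR}_{\text{alcohol} \mid \text{tobgrp}=1} &= \exp(\beta_1 + \beta_3)
\end{align*}
```

By symmetry, the OR for tobacco is exp(β2) when alcgrp=0 and exp(β2 + β3) when alcgrp=1. When β3 = 0, both expressions collapse to the single ORs of the model without interaction.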

Interaction in Logistic Regression

Model: status = 2.28 (alcgrp) + 1.38 (tobgrp) - 0.98 (alcgrp*tobgrp)

The log odds and odds below are expressed relative to group A; the intercept cancels in every odds ratio.

    Group                      Log odds                              Odds
    A: alcgrp=0 & tobgrp=0     2.28*0 + 1.38*0 - 0.98*0*0 = 0.00     1.00
    B: alcgrp=1 & tobgrp=0     2.28*1 + 1.38*0 - 0.98*1*0 = 2.28     9.78
    C: alcgrp=0 & tobgrp=1     2.28*0 + 1.38*1 - 0.98*0*1 = 1.38     3.97
    D: alcgrp=1 & tobgrp=1     2.28*1 + 1.38*1 - 0.98*1*1 = 2.68     14.59

Odds Ratio

    B vs. A    9.78  = 9.78/1.00
    C vs. A    3.97  = 3.97/1.00
    D vs. A    14.59 = 14.59/1.00
    D vs. B    1.49  = 14.59/9.78
    D vs. C    3.68  = 14.59/3.97
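
The table of odds and odds ratios can be rebuilt from the coefficients. The sketch below is illustrative only. The coefficients 2.28 and -0.98 appear on the slide; the tobgrp coefficient of 1.38 is an assumption back-solved from the reported odds of 3.97 for group C (ln 3.97 ≈ 1.38). The intercept is left out, so all odds are relative to group A.

```python
import math

B_ALC = 2.28   # alcgrp coefficient, from the slide
B_TOB = 1.38   # tobgrp coefficient: ASSUMED, back-solved from the odds of 3.97 for group C
B_INT = -0.98  # alcgrp*tobgrp coefficient, from the slide


def relative_log_odds(alcgrp, tobgrp):
    """Log odds relative to group A (alcgrp=0, tobgrp=0); the intercept is omitted."""
    return B_ALC * alcgrp + B_TOB * tobgrp + B_INT * alcgrp * tobgrp


groups = {"A": (0, 0), "B": (1, 0), "C": (0, 1), "D": (1, 1)}
odds = {name: math.exp(relative_log_odds(a, t)) for name, (a, t) in groups.items()}

for numerator, denominator in [("B", "A"), ("C", "A"), ("D", "A"), ("D", "B"), ("D", "C")]:
    ratio = odds[numerator] / odds[denominator]
    print(f"{numerator} vs. {denominator}: OR = {ratio:.2f}")
```

Because the slide divides rounded odds (for example 14.59/3.97), its last digit can differ slightly from a ratio computed at full precision. The D vs. B ratio is the alcohol OR among smokers, exp(2.28 - 0.98), which is much smaller than the alcohol OR among non-smokers, exp(2.28); that gap is what the negative interaction term encodes.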