Published by Benedict Perkins. Modified over 9 years ago.
1
LOGISTIC REGRESSION
A statistical procedure to relate the probability of an event to explanatory variables.
Used in epidemiology to describe and evaluate the effect of a risk factor on the occurrence of a disease event.
Example: Framingham Heart Study, coronary heart disease and blood pressure.
2
LOGISTIC REGRESSION: AN EXAMPLE
Event: Coronary heart disease (CHD). Occurrence is the dependent variable, which takes 2 values: Yes or No.
Risk factor: Blood pressure. Systolic blood pressure (SBP) is the independent variable X, a continuous measurement.
The probability of getting coronary heart disease depends on blood pressure.
3
DATA
4
SCATTER PLOT
5
LINEAR REGRESSION FOR Prob.(CHD): NOT A GOOD IDEA!
6
PROPORTION WITH CHD BY SBP GROUP
Systolic BP Range    CHD / Total    Proportion
130-149 mmHg         0/3            0.00
150-169 mmHg         2/4            0.50
170-189 mmHg         3/3            1.00
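The grouped proportions can be reproduced directly from the counts on this slide. A minimal Python sketch; only the counts come from the table, while the dictionary layout and variable names are illustrative assumptions.

```python
# CHD cases and group sizes by systolic blood pressure range (counts from the slide).
groups = {
    "130-149 mmHg": (0, 3),
    "150-169 mmHg": (2, 4),
    "170-189 mmHg": (3, 3),
}

# Observed proportion with CHD in each group: cases / total.
proportions = {rng: cases / total for rng, (cases, total) in groups.items()}

for rng, prop in proportions.items():
    print(f"{rng}: {prop:.2f}")
```

The proportions rise from 0 to 1 across the three groups, which is the pattern the S-shaped logistic curve is meant to smooth.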
7
LOGISTIC REGRESSION PROBABILITY MODEL
$$p(X) = \frac{1}{1 + \exp(-\beta_0 - \beta_1 X)}$$
The probability of the event varies as an S-shaped function of the risk factor X: the logistic curve.
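To see the S shape, the probability model can be evaluated at a few values of X. This is a sketch only: the function name `logistic` and the coefficients used here are hypothetical and chosen to make the curve cross 0.5 at X = 160; they are not taken from the slides.

```python
import math

def logistic(x, b0, b1):
    """Probability of the event at risk-factor level x: 1 / (1 + exp(-b0 - b1*x))."""
    return 1.0 / (1.0 + math.exp(-b0 - b1 * x))

# Hypothetical coefficients, used only to show the shape of the curve.
b0, b1 = -8.0, 0.05

for x in (100, 130, 160, 190, 220):
    print(x, round(logistic(x, b0, b1), 3))
```

The output stays between 0 and 1 and increases with X whenever b1 is positive, which a straight line cannot guarantee.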
8
LOGISTIC CURVE MODEL: OCCURRENCE OF CHD AS A FUNCTION OF SBP
9
LOGISTIC MODEL: LOG ODDS
$$\log \frac{p(X)}{1 - p(X)} = \beta_0 + \beta_1 X$$
The log of the odds of the event is a linear function of X.
Log(odds of CHD) = -6.08 + 0.0243(SBP)
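The log-odds form is a rearrangement of the logistic probability model from the earlier slide. A short derivation, with $\eta$ introduced here only as shorthand for $\beta_0 + \beta_1 X$:

```latex
\begin{align*}
p(X) &= \frac{1}{1 + e^{-\eta}}, \qquad \eta = \beta_0 + \beta_1 X \\
1 - p(X) &= \frac{e^{-\eta}}{1 + e^{-\eta}} \\
\frac{p(X)}{1 - p(X)} &= \frac{1}{e^{-\eta}} = e^{\eta} \\
\log \frac{p(X)}{1 - p(X)} &= \eta = \beta_0 + \beta_1 X
\end{align*}
```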
10
ODDS
The odds of an event is the chance that the event occurs divided by the chance of its not occurring:
Odds = p/(1 - p) = p/q
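Converting between probability and odds is a one-line calculation in each direction. A small Python sketch; the function names are illustrative, not part of the slides.

```python
def odds_from_prob(p):
    """Odds = p / (1 - p), defined for 0 <= p < 1."""
    return p / (1.0 - p)

def prob_from_odds(odds):
    """Inverse conversion: p = odds / (1 + odds)."""
    return odds / (1.0 + odds)

print(odds_from_prob(0.5))   # 1.0  (even odds)
print(odds_from_prob(0.75))  # 3.0  (3 to 1)
print(prob_from_odds(3.0))   # 0.75
```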
11
$\beta_1$: KEY PARAMETER OF THE LOGISTIC MODEL
$$\log \frac{p(X)}{1 - p(X)} = \beta_0 + \beta_1 X$$
The parameter $\beta_1$ is like the slope of a linear regression model.
$\beta_1 = 0$ indicates that X has no effect on the probability, e.g., a man’s chance of CHD does not depend on his SBP.
12
$\beta_1$: KEY PARAMETER
$$\log \frac{p(X)}{1 - p(X)} = \beta_0 + \beta_1 X$$
The coefficient $\beta_1$ measures the amount of change in the log of the odds per unit change in X.
13
$\beta_1$: KEY PARAMETER
$$\log \mathrm{odds}(X+1) = \beta_0 + \beta_1 (X+1) = \beta_0 + \beta_1 X + \beta_1$$
$$\log \mathrm{odds}(X) = \beta_0 + \beta_1 X$$
Difference in log odds = $\beta_1$
E.g., the log of the odds of getting CHD increases by 0.0243 for an increase of 1 mmHg of systolic blood pressure. (Hard to explain to a patient!)
14
THE COEFFICIENT $\beta_1$ AND THE ODDS RATIO
The difference in log odds given by $\beta_1$ translates into the odds ratio (OR).
$\exp(\beta_1)$ = OR = ratio of the odds at risk level X+1 to the odds at risk level X.
$\beta_1 = 0 \iff \mathrm{OR} = 1$.
15
THE COEFFICIENT $\beta_1$ AND THE ODDS RATIO
For example, the odds of CHD are multiplied by the factor exp(0.0243) = 1.025 for every increase of 1 mmHg in SBP.
A difference of 10 mmHg multiplies the odds of CHD by exp(10 × 0.0243) = exp(0.243), or 1.275.
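The odds ratio for any difference in SBP follows from the same coefficient by exponentiating the coefficient times the difference. A minimal Python check using the slope quoted on the slides (0.0243 per mmHg); the function name `odds_ratio` is illustrative.

```python
import math

B1 = 0.0243  # change in log odds of CHD per 1 mmHg of SBP (from the slides)

def odds_ratio(b1, delta=1.0):
    """Odds ratio for an increase of `delta` units in the risk factor."""
    return math.exp(b1 * delta)

print(round(odds_ratio(B1), 3))      # 1.025
print(round(odds_ratio(B1, 10), 3))  # 1.275
```

Exponentiating the unrounded coefficient avoids the small drift that comes from raising the rounded value 1.025 to the 10th power.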
16
ESTIMATION OF THE PARAMETERS
Technique: Maximum likelihood estimation.
For large sample sizes, the normal distribution is used to put a confidence interval around the estimate of the coefficient $\beta_1$.
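One common way to compute the maximum likelihood estimates is Newton-Raphson iteration on the log-likelihood; the slides do not say which algorithm was used, so this is a sketch of the general technique. The ten (SBP, CHD) observations below are hypothetical, made up for illustration, and the 1.96 multiplier assumes a 95% interval.

```python
import math

def fit_logistic(x, y, max_iter=25, tol=1e-8):
    """Newton-Raphson maximum likelihood fit of log odds = b0 + b1 * x."""
    b0, b1 = 0.0, 0.0
    for _ in range(max_iter):
        # Score (gradient) and information matrix at the current estimate.
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information matrix) * step = score.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) < tol and abs(d1) < tol:
            break
    # Large-sample standard error of b1 from the inverse information matrix,
    # evaluated at the last iterate before the final (tiny) step.
    se_b1 = math.sqrt(h00 / det)
    return b0, b1, se_b1

# Hypothetical data for illustration only (not the data shown on the slides).
sbp = [120, 130, 140, 150, 160, 170, 180, 190, 200, 210]
chd = [0, 0, 1, 0, 0, 1, 0, 1, 1, 1]

b0_hat, b1_hat, se_b1 = fit_logistic(sbp, chd)
ci = (b1_hat - 1.96 * se_b1, b1_hat + 1.96 * se_b1)  # normal-based 95% interval
print(f"b0 = {b0_hat:.3f}, b1 = {b1_hat:.4f}, 95% CI for b1 = ({ci[0]:.4f}, {ci[1]:.4f})")
```

With only ten observations the large-sample interval is rough; in practice a statistics package would be used for the fit.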
17
HYPOTHESIS TESTING
Ho: $\beta_1 = 0$
No difference in risk at different levels of the risk factor X.
No association between risk factor X and probability of occurrence.
18
HYPOTHESIS TESTING
Ha: $\beta_1 \neq 0$
or $\beta_1 > 0$ (risk increases with X)
or $\beta_1 < 0$ (risk goes down as X increases)
19
HYPOTHESIS TESTING
Ho: OR = 1
Ha: OR $\neq$ 1
or OR > 1 (risk increases with X)
or OR < 1 (X is protective)
20
RESULTS OF LOGISTIC REGRESSION
The OR with its confidence interval and p value indicates whether there is a significant association between the level of the risk factor and the chance of occurrence.
OR = 1.025 (1.015, 1.034), p < 0.001
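The reported interval and p value can be cross-checked against each other on the log scale. This sketch assumes the interval is a 95% normal-based (Wald) interval, which the slide does not state, so the back-calculated standard error and z statistic are approximations only.

```python
import math

or_hat, ci_low, ci_high = 1.025, 1.015, 1.034  # values reported on the slide
z_crit = 1.96  # assumption: the interval is a 95% Wald interval

b1 = math.log(or_hat)                                        # coefficient on the log-odds scale
se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)   # implied standard error
z = b1 / se                                                  # Wald z statistic for Ho: b1 = 0
p_value = math.erfc(abs(z) / math.sqrt(2))                   # two-sided normal p value

print(f"b1 = {b1:.4f}, SE = {se:.4f}, z = {z:.2f}, p = {p_value:.1e}")
```

Under that assumption the implied p value falls well below 0.001, in line with the reported result.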
21
RESULTS OF LOGISTIC REGRESSION
Can be used to predict an individual’s risk, e.g., prob. of CHD when SBP = 180:
p/q = exp{-6.082 + 0.0243(180)} = exp(-1.708) = 0.181
Solve for p: prob. of CHD = 0.181/1.181 = 0.153
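Predicting an individual's risk amounts to evaluating the fitted log odds, exponentiating to get the odds, and converting odds to probability. A minimal Python sketch using the coefficients quoted on the slide; the function name `chd_probability` and the extra SBP values are illustrative.

```python
import math

B0, B1 = -6.082, 0.0243  # fitted intercept and SBP coefficient quoted on the slide

def chd_probability(sbp):
    """Predicted probability of CHD at a given systolic blood pressure (mmHg)."""
    log_odds = B0 + B1 * sbp
    odds = math.exp(log_odds)     # p/q
    return odds / (1.0 + odds)    # solve p/(1 - p) = odds for p

for sbp in (120, 150, 180):
    print(sbp, round(chd_probability(sbp), 3))
```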
22
MULTIVARIATE LOGISTIC REGRESSION
Model with additional risk factors:
$$\log \frac{p(X)}{1 - p(X)} = \beta_0 + \beta_1 X_1 + \beta_2 X_2$$
Log(odds of CHD) = $\beta_0 + \beta_1$(SBP) + $\beta_2$(CHOL) + $\beta_3$(smoker)
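In the multivariate model each coefficient still exponentiates to an odds ratio, now with the other risk factors held fixed. The sketch below illustrates this for the smoker term; the slides give no fitted values for this model, so every coefficient here is hypothetical, as are the function name and the coding of smoker as 0/1.

```python
import math

def chd_odds(sbp, chol, smoker, b0, b1, b2, b3):
    """Odds of CHD from the linear predictor b0 + b1*SBP + b2*CHOL + b3*smoker."""
    return math.exp(b0 + b1 * sbp + b2 * chol + b3 * smoker)

# Hypothetical coefficients for illustration only (not fitted values).
coef = dict(b0=-9.0, b1=0.02, b2=0.01, b3=0.5)

# Same SBP and cholesterol, differing only in smoking status (1 = smoker, 0 = non-smoker).
odds_smoker = chd_odds(140, 220, 1, **coef)
odds_nonsmoker = chd_odds(140, 220, 0, **coef)

# The ratio of the two odds equals exp(b3), whatever SBP and CHOL are.
print(odds_smoker / odds_nonsmoker, math.exp(coef["b3"]))
```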