Published by Charla Payne. Modified over 6 years ago.
Logistic Regression
Chung-Yi Li, PhD
Dept. of Public Health, College of Med., NCKU
Types of Logistic Regression
Response Variable                                                       Type of Logistic Regression
Two categories (AMI / non-AMI; LBW / NBW)                               Binary
Three or more categories (Accuracy: under- / accurate / over-report)    Nominal
Ordinal categories (BMI: low / normal / high;
  gestational age: <34 / / / >=37)                                      Ordinal
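Study of the Effects of Radio-Frequency Electromagnetic Waves on Children's Health (DOH99-HP-1407)
Principal investigator: 陳楚杰, Associate Professor and Chair, Department of Health Care Management, National Taipei University of Nursing and Health Sciences
Co-principal investigator: 李中一 (Chung-Yi Li)*, Professor, Department and Graduate Institute of Public Health, National Cheng Kung University
Co-principal investigator: 劉介宇, Assistant Professor, Department of Nursing, National Taipei University of Nursing and Health Sciences
2010/11/25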
What Does Logistic Regression Do?
The logistic regression model uses the predictor variables, which can be categorical or continuous, to predict the probability of specific outcomes. In other words, logistic regression is designed to describe the probability associated with each value of the response variable.
Logistic Regression Models
Logistic regression is used for binary outcomes to predict the probability (or, equivalently, the odds) of the ‘event’ happening.
Example: what are the odds of developing diabetes by age 50, given body mass index and dietary factors?
Logistic regression models the effects of different predictors on the odds of the event in a particular person or group, while keeping the predicted probability within the limits of zero and one.
What Are Odds and Odds Ratio?
Odds ratios are used extensively in epidemiology to express the odds of developing a condition in the presence of a factor compared with its absence.
The odds of success are the ratio of the probability of success to the probability of failure.
The odds ratio is the ratio of the odds of disease with a specific factor to the odds of disease without that factor.
Probability of Outcome
            Outcome
            Yes    No     Total
Group A     20     60     80
Group B     10     90     100
Total       30     150    180

Probability of a “Yes” outcome in Group A = 20/80 (25%)
Probability of a “No” outcome in Group A = 60/80 (75%)
Probability of Outcome
From the same table (Group B: 10 Yes, 90 No, 100 in total):
Probability of a “Yes” outcome in Group B = 10/100 (10%)
Probability of a “No” outcome in Group B = 90/100 (90%)
Odds of outcome in Group A = (probability of a “Yes” outcome in Group A) / (probability of a “No” outcome in Group A) = 0.25 / 0.75 = 0.33
Odds of outcome in Group B = (probability of a “Yes” outcome in Group B) / (probability of a “No” outcome in Group B) = 0.10 / 0.90 = 0.11
Odds Ratio of Group A to Group B
Odds ratio = (odds of outcome in Group A) / (odds of outcome in Group B) = 0.33 / 0.11 = 3
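The probability, odds and odds ratio calculations for Groups A and B can be checked with a short Python sketch. The counts are those in the outcome table; the function and variable names are illustrative only and are not part of the original slides.

```python
# Counts from the outcome table: (yes, no) for each group.
counts = {"A": (20, 60), "B": (10, 90)}

def probability_yes(yes, no):
    """Probability of a "Yes" outcome within one group."""
    return yes / (yes + no)

def odds_yes(yes, no):
    """Odds of "Yes": P(Yes) / P(No), which simplifies to yes / no."""
    p = probability_yes(yes, no)
    return p / (1 - p)

prob = {group: probability_yes(*c) for group, c in counts.items()}
odds = {group: odds_yes(*c) for group, c in counts.items()}

# Odds ratio of Group A relative to Group B.
odds_ratio = odds["A"] / odds["B"]

for group in counts:
    print(f"Group {group}: P(Yes) = {prob[group]:.2f}, odds = {odds[group]:.2f}")
print(f"Odds ratio (A vs B) = {odds_ratio:.2f}")
```

Working from unrounded odds gives an odds ratio of exactly 3, which agrees with the rounded hand calculation 0.33 / 0.11.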
Odds Ratios
OR < 1: Group B more likely
OR = 1: no association
OR > 1: Group A more likely
Thus, an odds ratio of 2.0 indicates that a person with the specific factor has twice the odds of contracting the disease as a person without that factor.
Logistic Regression Model
The logistic regression model is written as
Logit(p) = ln(p / [1-p]) = ln(odds) = α + βx
To predict the odds of an event for an individual, use
exp(α + βx) = p / [1-p] = probability of disease / probability of being disease-free
Logistic Regression Model
Logit(p) = α + βx
where
Logit(p) = logit transformation of the probability of the event
α = intercept of the regression line
β = slope of the regression line
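As a minimal sketch of how the logit, the odds and the probability relate, the following Python code converts between them for a model with one intercept and one slope. The numeric values of `alpha` and `beta` are hypothetical, chosen only for illustration, and the function names are not from the slides.

```python
import math

def logit(p):
    """Log-odds of a probability p, for 0 < p < 1."""
    return math.log(p / (1 - p))

def inverse_logit(eta):
    """Probability corresponding to a log-odds value eta."""
    return 1 / (1 + math.exp(-eta))

def predicted_odds(alpha, beta, x):
    """Odds implied by logit(p) = alpha + beta * x."""
    return math.exp(alpha + beta * x)

# Hypothetical intercept and slope (assumptions, not estimates from the slides).
alpha, beta = -2.0, 0.5
x = 1.0

odds = predicted_odds(alpha, beta, x)
p = odds / (1 + odds)  # convert odds back to a probability

print(f"odds = {odds:.4f}, probability = {p:.4f}")
print(f"logit(p) = {logit(p):.4f}, alpha + beta*x = {alpha + beta * x:.4f}")
```

The last line prints the same value twice, which is the point: taking the logit of the predicted probability recovers the linear expression.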
Odds Ratio Estimated From Logistic Regression Model
Because logit(p) = ln(p/[1-p]) = α + βx:
For exposed subjects (x=1): logit(p1) = ln(p1/[1-p1]) = α + β = ln(odds of disease for exposed subjects)
For non-exposed subjects (x=0): logit(p2) = ln(p2/[1-p2]) = α = ln(odds of disease for non-exposed subjects)
ln(p1/[1-p1]) - ln(p2/[1-p2]) = ln(odds ratio) = (α + β) - α = β
OR = exp(β)
Relationship Between 2x2 Table and Logistic Regression
Example: consider a two-by-two table showing the distribution of exposure for both cases and controls.

               Diseased    Non-diseased
Exposed        a           b
Non-exposed    c           d
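The link between the 2x2 table and the model can be sketched in Python. With a single binary exposure the model has one parameter per exposure group, so the fitted log-odds equal the observed log-odds in each row. The counts used here are illustrative only (borrowed from the Group A / Group B table, treating Group A as exposed, which is an assumption), and `logistic_from_2x2` is a hypothetical helper name.

```python
import math

def logistic_from_2x2(a, b, c, d):
    """Intercept and slope of logit(p) = alpha + beta * x for a 2x2 table.

    a, b: diseased / non-diseased counts among the exposed (x = 1)
    c, d: diseased / non-diseased counts among the non-exposed (x = 0)
    """
    alpha = math.log(c / d)          # log odds of disease when x = 0
    beta = math.log(a / b) - alpha   # difference in log odds, x = 1 vs x = 0
    return alpha, beta

# Illustrative counts only.
a, b, c, d = 20, 60, 10, 90

alpha, beta = logistic_from_2x2(a, b, c, d)
cross_product_or = (a * d) / (b * c)

print(f"alpha = {alpha:.4f}, beta = {beta:.4f}")
print(f"exp(beta) = {math.exp(beta):.4f}, ad/bc = {cross_product_or:.4f}")
```

Exponentiating the slope reproduces the familiar cross-product ratio ad/bc.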
Interpreting the Logistic Model
Parameters are on the logit (log-odds) scale and are used to calculate the odds ratios.
The odds ratio for high income is exp(parameter for high income); it is the multiplicative change in odds compared with the reference group.
An odds ratio of 1 is equivalent to a regression parameter of 0.
Odds ratios greater than 1 indicate an increase in the odds of the event.
Odds ratios less than 1 indicate a decrease in the odds.
Multiple Logistic Regression Model
Let's consider an example of a 12-year cohort study evaluating risk factors for coronary heart disease (CHD) in 742 men aged 40-49 at start of study.
X1: age (40-44, 45-49)
X2: cholesterol level (normal, abnormal)
X3: systolic blood pressure (normal, abnormal)
X4: weight (kg) (<50, 50-59, 60-69, 70-79, 80+)
X5: hemoglobin level (normal, abnormal)
X6: smoking (none, <1 pack, 1 pack, >1 pack)
X7: ECG (normal, abnormal)
Multiple Logistic Regression Model Used for a Binary Outcome Variable
The logistic model offers an opportunity to perform multiple regression with a binary outcome variable. Recall that when the outcome was continuous, multiple regression was based on the model
Y = β0 + β1X1 + ... + βkXk + ε
with assumptions of homogeneity of error variances and normality.
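If we try to apply this same formulation to a binary Y, the linear combination β0 + β1X1 + ... + βkXk can take any value, whereas E(Y) = p is a probability and must lie between 0 and 1.
The logistic model specifies that the probability of disease p depends on the X's through
p = 1 / (1 + exp[-(β0 + β1X1 + ... + βkXk)])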
Therefore, it is not E(Y), but a function of it (the logit) which is represented as a linear combination of the X's.
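A small Python sketch may make the contrast concrete: a linear combination of the X's can fall outside the range 0 to 1, whereas the same linear combination passed through the logistic function cannot. The coefficients `b0` and `b1` and the x values are hypothetical, chosen only to make the effect visible.

```python
import math

def linear_predictor(b0, b1, x):
    """Linear combination b0 + b1 * x; unbounded as x varies."""
    return b0 + b1 * x

def logistic(eta):
    """Map a linear predictor to a probability strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical coefficients and predictor values (assumptions for illustration).
b0, b1 = 0.2, 0.3
xs = [-10, -2, 0, 2, 10]

etas = [linear_predictor(b0, b1, x) for x in xs]
probs = [logistic(eta) for eta in etas]

for x, eta, p in zip(xs, etas, probs):
    print(f"x = {x:>3}  linear predictor = {eta:6.2f}  logistic probability = {p:.4f}")
```

Read as probabilities, several of the raw linear predictors would be invalid, while every logistic value is a valid probability whose logit equals the linear predictor.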
Interpretation of βs
X1: age (in years)
X2: cholesterol level (mg/dl)
X3: systolic blood pressure (mmHg)
X4: relative weight
X5: hemoglobin level (%)
X6: smoking (0 = none, 1 = <1 pack, 2 = 1 pack, 3 = >1 pack)
X7: ECG (normal, abnormal)
A logistic regression analysis produced:
Parameter         Estimate (βi)    Standard error (βi)
β0 (intercept)
β1                0.1216           0.0437
β2                0.0070           0.0025
β3                0.0068           0.0060
β4                0.0257           0.0091
β5                                 0.0098
β6                0.4223           0.1031
β7                0.7206           0.4009
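As a sketch of how these estimates are read, each coefficient can be exponentiated to give the odds ratio for a one-unit increase in that variable with the others held fixed. The Python code below uses only the estimates that survive in the table; the intercept and the hemoglobin estimate are missing from the source and are left out. The dictionary labels and the helper `odds_ratio_for_change` are illustrative names, not part of the slides.

```python
import math

# Estimates copied from the table (intercept and hemoglobin estimate are missing).
estimates = {
    "age (per year)": 0.1216,
    "cholesterol (per mg/dl)": 0.0070,
    "systolic blood pressure (per mmHg)": 0.0068,
    "relative weight (per unit)": 0.0257,
    "smoking (per category)": 0.4223,
    "ECG (abnormal vs normal)": 0.7206,
}

def odds_ratio_for_change(beta, units=1.0):
    """Odds ratio for a change of `units` in a predictor with coefficient beta."""
    return math.exp(beta * units)

odds_ratios = {name: odds_ratio_for_change(b) for name, b in estimates.items()}

for name, value in odds_ratios.items():
    print(f"{name}: OR = {value:.3f}")

# A larger contrast multiplies the coefficient before exponentiating,
# for example a 10-year difference in age.
or_age_10_years = odds_ratio_for_change(estimates["age (per year)"], units=10)
print(f"age (per 10 years): OR = {or_age_10_years:.3f}")
```

Because the odds ratio depends on the unit of measurement, a small coefficient on a finely measured variable such as cholesterol does not by itself indicate a weak effect.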
Recall that
p = 1 / (1 + exp[-(β0 + β1X1 + ... + β7X7)])
can be used to estimate the probability of CHD incidence in the next 12 years for an individual with characteristics (X1, X2, ..., X7).
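For Example
To estimate the probability of CHD in the next 12 years for a 45-year-old man with cholesterol level = 210, SBP = 130, relative weight = 100, hemoglobin level = 120, who is a non-smoker with a normal ECG, we compute
β0 + β1(45) + β2(210) + β3(130) + β4(100) + β5(120) + β6(0) + β7(0) = -2.9813
so the estimated probability is p = 1 / (1 + exp[2.9813]) = 0.0483.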
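Example Continued
For a man with the same characteristics as above, but who smokes more than 1 pack per day (X6 = 3),
β0 + β1(45) + β2(210) + β3(130) + β4(100) + β5(120) + β6(3) + β7(0) = -1.7144
so the estimated probability is p = 1 / (1 + exp[1.7144]) = 0.1526.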
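The step from the non-smoker to the heavy smoker can be sketched in Python. Only the smoking term changes, so the heavy smoker's linear predictor is the non-smoker's value plus three times the smoking coefficient. The value -2.9813 and the coefficient 0.4223 are taken from the example and the table; the variable names are illustrative.

```python
import math

def inverse_logit(eta):
    """Probability corresponding to a linear predictor (log-odds) eta."""
    return 1 / (1 + math.exp(-eta))

eta_nonsmoker = -2.9813   # linear predictor from the worked example (X6 = 0)
beta_smoking = 0.4223     # estimate for X6 from the table

# Moving from X6 = 0 to X6 = 3 adds 3 * beta_smoking to the linear predictor.
eta_heavy_smoker = eta_nonsmoker + beta_smoking * (3 - 0)

p_nonsmoker = inverse_logit(eta_nonsmoker)
p_heavy_smoker = inverse_logit(eta_heavy_smoker)

print(f"non-smoker:   eta = {eta_nonsmoker:.4f}, p = {p_nonsmoker:.4f}")
print(f"heavy smoker: eta = {eta_heavy_smoker:.4f}, p = {p_heavy_smoker:.4f}")
```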
Measure of Association for Smoking > 1 Pack (X6=3) Vs None (X6=0)
Risk difference = 0.1526 - 0.0483 = 0.1043
Risk ratio (relative risk) = 0.1526 / 0.0483 = 3.16
Odds ratio = (0.1526 / [1 - 0.1526]) / (0.0483 / [1 - 0.0483]) = 3.55
Note that RR = 3.16 ≈ OR = 3.55, since the outcome is relatively rare at baseline (p = 0.0483).
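The three measures of association can be reproduced with a few lines of Python. The baseline probability 0.0483 is the one quoted in the slide; the heavy smoker's probability 0.1526 is the baseline plus the stated risk difference of 0.1043. Function and variable names are illustrative only.

```python
def odds(p):
    """Odds corresponding to a probability p."""
    return p / (1 - p)

p_nonsmoker = 0.0483      # baseline probability (X6 = 0)
p_heavy_smoker = 0.1526   # probability for X6 = 3 (0.0483 + 0.1043)

risk_difference = p_heavy_smoker - p_nonsmoker
risk_ratio = p_heavy_smoker / p_nonsmoker
odds_ratio = odds(p_heavy_smoker) / odds(p_nonsmoker)

print(f"Risk difference = {risk_difference:.4f}")
print(f"Risk ratio      = {risk_ratio:.2f}")
print(f"Odds ratio      = {odds_ratio:.2f}")
```

The odds ratio is somewhat further from 1 than the risk ratio, and the gap widens as the baseline probability grows.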
Estimation of OR From Multiple Logistic Regression Models
exp(3β6) = OR of disease for smokers of >1 pack vs non-smokers
Why is OR = exp(3β6)? Recall that
logit(p) = ln(p / [1-p]) = β0 + β1X1 + ... + β7X7
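Therefore, for non-smokers (X6 = 0),
ln(odds) = β0 + β1X1 + ... + β5X5 + β6(0) + β7X7
and for heavy smokers (>1 pack per day, X6 = 3),
ln(odds) = β0 + β1X1 + ... + β5X5 + β6(3) + β7X7
With the other X's held fixed, the difference between the two is 3β6.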
Note that exp(3β6) is the odds ratio of disease for heavy smokers relative to non-smokers, whatever the values of the other characteristics, as long as they are held constant.
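This invariance can be checked numerically. The Python sketch below compares heavy smokers with non-smokers for two different covariate profiles: the first baseline linear predictor (-2.9813) is from the worked example, while the second (-0.5) is a hypothetical value standing in for a higher-risk individual. The smoking coefficient 0.4223 is from the table; all names are illustrative.

```python
import math

def inverse_logit(eta):
    """Probability corresponding to a linear predictor (log-odds) eta."""
    return 1 / (1 + math.exp(-eta))

def odds(p):
    """Odds corresponding to a probability p."""
    return p / (1 - p)

beta_smoking = 0.4223
# Non-smoker linear predictors for two profiles; the second is hypothetical.
baseline_etas = [-2.9813, -0.5]

odds_ratios = []
risk_ratios = []
for eta0 in baseline_etas:
    p0 = inverse_logit(eta0)                      # non-smoker (X6 = 0)
    p3 = inverse_logit(eta0 + 3 * beta_smoking)   # heavy smoker (X6 = 3)
    odds_ratios.append(odds(p3) / odds(p0))
    risk_ratios.append(p3 / p0)

for eta0, or_value, rr_value in zip(baseline_etas, odds_ratios, risk_ratios):
    print(f"baseline eta = {eta0:7.4f}: OR = {or_value:.4f}, RR = {rr_value:.4f}")
```

The odds ratio is the same for both profiles, while the risk ratio changes with the baseline risk, which is why the model's coefficients are interpreted on the odds scale.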
Recap
We introduced the logistic regression model
p = 1 / (1 + exp[-(β0 + β1X1 + ... + βkXk)])
which can be linearized into
logit(p) = ln(p / [1-p]) = β0 + β1X1 + ... + βkXk
Recap (Continued)
We already
- defined
- interpreted
- tested hypotheses