Download presentation
Presentation is loading. Please wait.
Published byLeslie Matthews Modified over 9 years ago
1
Logistic Regression Applications Hu Lunchao
2
2 Contents 1 1 What Is Logistic Regression? 2 2 Modeling Categorical Responses 3 3 Modeling Ordinal Variables 4 4 Matched Case-Control Studies
3
An Example 3
4
4
5
5
6
6 What Is Logistic Regression? Logistic Function: Logit transformation: The variance:
7
7 Modeling Categorical Responses Logistic regression model for binary response variables. For instance, logistic regression can model A voter's choice in a presidential election, with predictor variables political ideology, annual income, education level, and religious affiliation. Whether a person uses illegal drugs (yes or no), with predictors education level, whether employed, religiosity, marital status, and annual income.
8
Modeling Categorical Responses 8 Income and Having Travel Credit Cards
9
Modeling Categorical Responses 9 Let x = annual income and y = whether have a travel credit card (1 = yes, 0 = no). Software provides results shown in Table Fit of Logistic Regression Model for Italian Credit Card Data betaS.E.Exp(beta) income.1054.02621.111 Constant-3.5179.7103
10
Modeling Categorical Responses 10 The logistic prediction equation is
11
Modeling Categorical Responses 11 Interpretation for beta: The sign of beta tells us whether it is increasing or decreasing as x increases ; The curve has slope beta*P(y = 1)[1-P(y = 1)] , where P(y = 1) is the probability at that point.
12
12 This exponential relationship implies that every unit increase in x has a multiplicative effect of e beta on the odds. Interpretation Using the Odds and Odds Ratio
13
Modeling for Ordinal Variables 13 Many applications have a categorical response variable with more than two categories. For instance, the General Social Survey recently asked subjects whether government spending on the environment should increase, remain the same, or decrease
14
Cumulative Probabilities 14 Let y denote an ordinal response variable. Let P(y <=j) denote the probability that the response falls in category j or below. With four categories, for example, the cumulative probabilities are
15
Cumulative Logits 15 A c-category response has c cumulative probabilities With c = 4, for example, the logits are
16
Cumulative Logit Models for an Ordinal Response 16 The model requires a separate intercept parameter j for each cumulative probability. Since the cumulative probabilities increase as j increases, so do. The effect of an explanatory variable on all the cumulative probabilities for y should be the same, so beta keeps unchanged.
17
17 Software estimates the parameters using all the cumulative probabilities at once.
18
Do Republicans tend to be more conservative than Democrats? 18 Let x be a dummy variable for party affiliation, with x = 1 for Democrats and x = 0 for Republicans.
19
19 Democrats tending to be more liberal than Republicans
20
Matched Case-Control Studies 20 Case-control studies are used to identify factors that may contribute to a medical condition by comparing subjects who have that condition (the 'cases') with patients who do not have the condition but are otherwise similar. Within each stratum, samples of cases(y=1) and controls(y=0) are chosen. The number of cases and controls need not be constant across strata, but the most common matched designs include one case and form one to five controls per stratum and are thus referred to as 1-M matched studies.
21
21 In a case–control study, the logistic regression outcome variable Y is binary; here Y = 1 for cases and Y = 0 for controls. Then, it is written as:
22
ML Approaches to Estimate beta: unconditional and conditional 22 The unconditional likelihood is where n is the total number of individuals.
23
ML Approaches to Estimate beta: unconditional and conditional 23 The conditional likelihood is where m is the number of cases, and the sum in the denominator is over the C m n possibilities of dividing the numbers from 1 to n into one group {1, 2,...,m} of size m and its complement {l m+1,...,l n }.
24
What Factors Attribute to Low-birth-weight 24 Dataset : 189 women, of whom 59 had low-birth -weight babies and 130 had normal-weight babies Risk factors: weight in pounds at the last menstrual period (LWT), presence of hypertension (HT), smoking status during pregnancy (Smoke), presence of uterine irritability (UI). The woman’s age (Age),used as the matching variable.
25
Data Input 25 The variable Low is used to determine whether the subject is a case (Low=1, low-birth-weight baby) or a control (Low=0, normal-weight baby). The dummy time variable Time takes the value 1 for cases and 2 for controls
26
Conditional logistic regression Result 26 The variable Time is the response, and Low is the censoring variable. The matching variable Age is used in the STRATA statement so that each unique age value defines a stratum.
27
27
28
28 Results indicate that women were more likely to have low-birth- weight babies if they were underweight in the last menstrual cycle, were hypertensive, smoked during pregnancy, or suffered uterine irritability
29
29 THANKS FOR YOUR ATTENTION!
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.