1
Logistic Regression Hal Whitehead BIOL4062/5062
2
Categorical data
Logistic regression on binary data
Odds ratio
Logits
Probit regression
With many categories
3
Categorical data
Categorical data: Sex, species, morph, physiological state
Categorical vs Continuous:
  Continuous => Continuous: Linear regression
  Categorical => Continuous: ANOVA
  Categorical => Categorical: Log-linear models
  Continuous => Categorical: Logistic regression
  {Also: Continuous + Categorical => Categorical}
4
Logistic Regression on Binary Data
two categories
proportions
want to work out probability of being in a category: P
Logistic regression:
P = e^Z / (1 + e^Z)
Z = β0 + β1·X1 + …
5
Logistic Regression Z= β0 + β1 · X1 + …
If Z is large and positive: P ~ 1.0
If Z is large and negative: P ~ 0.0
Fit β0, β1 using maximum likelihood
X’s can be categorical as well as continuous
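The limiting behaviour of P can be checked numerically. This is a minimal sketch assuming the standard logistic function P = 1/(1 + e^-Z); the function names `logistic` and `predict_probability` are illustrative and do not come from the slides.

```python
import math

def logistic(z: float) -> float:
    """Map a linear predictor Z to a probability P between 0 and 1."""
    # Two algebraically equal forms are used so that exp() never overflows.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_probability(x: float, b0: float, b1: float) -> float:
    """P for one independent variable X, with Z = b0 + b1 * X."""
    return logistic(b0 + b1 * x)

# Large negative Z gives P near 0, Z = 0 gives P = 0.5, large positive Z gives P near 1.
for z in (-10.0, 0.0, 10.0):
    print(z, logistic(z))
```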
6
Logistic Regression: Outputs
Estimates of regression coefficients: β0, β1, …
Significance of regression coefficients and overall logistic regression
Quantile probabilities
Accuracy of prediction
Odds ratios
7
Logistic Regression
Regression coefficients estimated by maximizing log-likelihood iteratively
Significance of coefficients indicated by:
  likelihood ratio test (theoretically best)
  Wald test (normal approximation)
Can reduce numbers of independent variables using stepwise elimination
Or choose “best” model using AIC
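The iterative fit and the two significance tests can be sketched for a single independent variable. This is an illustration under stated assumptions, not the procedure used for the slides: it uses Newton-Raphson as the iterative maximizer, the data in `xs` and `ys` are hypothetical, and all function names are invented for the example.

```python
import math

def score_and_information(b0, b1, xs, ys):
    """Gradient of the log-likelihood and entries of the 2x2 information matrix."""
    g0 = g1 = h00 = h01 = h11 = 0.0
    for x, y in zip(xs, ys):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
        w = p * (1.0 - p)
        g0 += y - p
        g1 += (y - p) * x
        h00 += w
        h01 += w * x
        h11 += w * x * x
    return g0, g1, h00, h01, h11

def log_likelihood(b0, b1, xs, ys):
    # Bernoulli log-likelihood: sum over observations of y*Z - log(1 + e^Z)
    return sum(y * (b0 + b1 * x) - math.log1p(math.exp(b0 + b1 * x))
               for x, y in zip(xs, ys))

def fit(xs, ys, with_slope=True, iterations=25):
    """Newton-Raphson updates; with_slope=False fits the constant-only model."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0, g1, h00, h01, h11 = score_and_information(b0, b1, xs, ys)
        if with_slope:
            det = h00 * h11 - h01 * h01
            b0, b1 = (b0 + (h11 * g0 - h01 * g1) / det,
                      b1 + (h00 * g1 - h01 * g0) / det)
        else:
            b0 += g0 / h00
    return b0, b1

# Hypothetical data: x is a continuous predictor, y is the binary outcome.
xs = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
ys = [0, 0, 1, 0, 1, 1, 0, 1]

b0, b1 = fit(xs, ys)
null_b0, _ = fit(xs, ys, with_slope=False)

# Likelihood ratio test: twice the gain in log-likelihood, chi-square with 1 d.f.
lr_stat = 2.0 * (log_likelihood(b0, b1, xs, ys)
                 - log_likelihood(null_b0, 0.0, xs, ys))
lr_p = math.erfc(math.sqrt(lr_stat / 2.0))

# Wald test: coefficient divided by its standard error, normal approximation.
_, _, h00, h01, h11 = score_and_information(b0, b1, xs, ys)
se_b1 = math.sqrt(h00 / (h00 * h11 - h01 * h01))
wald_z = b1 / se_b1
wald_p = math.erfc(abs(wald_z) / math.sqrt(2.0))

print(b0, b1, lr_p, wald_p)
```

With more than one independent variable the same updates apply with a larger information matrix, which is usually left to a statistics package.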
8
Example: Fruit-fly Death
Dose Dead Alive
9
Logistic Regression
β0 = 0.56 (Constant)
β1 = 0.92 (x Log(Dose))
P=0.255
Overall P=0.0064
10
Model selection using AIC
Constant only         Log(L)=    AIC=35.650
Const, dose           Log(L)=    AIC=30.224
Const, dose, dose²    Log(L)=    AIC=31.738
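The comparison can be sketched using the AIC values shown on the slide. The helper `aic` assumes the usual definition AIC = 2k - 2·Log(L), where k is the number of fitted parameters; the Log(L) values themselves are not reproduced here, so the helper is included only to show the formula.

```python
def aic(log_l: float, n_params: int) -> float:
    """Standard AIC: 2k - 2*log-likelihood (lower is better)."""
    return 2.0 * n_params - 2.0 * log_l

# AIC values as given on the slide.
aic_values = {
    "Constant only": 35.650,
    "Const, dose": 30.224,
    "Const, dose, dose2": 31.738,
}

# The preferred model is the one with the lowest AIC.
best = min(aic_values, key=aic_values.get)

# Differences from the best model show how much support each alternative loses.
delta = {name: round(value - aic_values[best], 3)
         for name, value in aic_values.items()}

print(best, delta)
```

On these values the model with a constant and dose has the lowest AIC; adding the squared dose term raises the AIC again.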
11
Accuracy of prediction
[Table: Actual (Died, Lived) against Predicted (Died, Lived), with Correct for each row and Overall correct]
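A classification table of this kind can be built from fitted probabilities. This sketch assumes a cut-off of 0.5 for predicting "died"; the probabilities and outcomes below are hypothetical and are not the fruit-fly results.

```python
def classification_table(probs, outcomes, threshold=0.5):
    """Count (actual, predicted) pairs; outcome 1 means died, 0 means lived."""
    table = {("died", "died"): 0, ("died", "lived"): 0,
             ("lived", "died"): 0, ("lived", "lived"): 0}
    for p, y in zip(probs, outcomes):
        actual = "died" if y == 1 else "lived"
        predicted = "died" if p >= threshold else "lived"
        table[(actual, predicted)] += 1
    return table

def proportion_correct(table, actual=None):
    """Proportion correct for one actual category, or overall if actual is None."""
    rows = [actual] if actual else ["died", "lived"]
    correct = sum(table[(row, row)] for row in rows)
    total = sum(n for (a, _), n in table.items() if a in rows)
    return correct / total

# Hypothetical fitted probabilities of dying and observed outcomes.
probs = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.6, 0.1]
outcomes = [1, 1, 1, 0, 0, 0, 1, 0]

table = classification_table(probs, outcomes)
print(table)
print(proportion_correct(table, "died"),
      proportion_correct(table, "lived"),
      proportion_correct(table))
```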
12
Odds ratio
Compares probabilities of something happening at two values of independent variable:
ω = [P(A)/(1-P(A))] / [P(B)/(1-P(B))]
“Odds of dying in next 5 years are ω times greater for smokers than non-smokers”
Log(ω) = β
The regression coefficient is the log of the odds ratio: when the independent variable changes by one, the odds of the event happening are multiplied by ω = e^β
13
Odds ratio
Odds ratio for β1 = 2.5 (e^0.92 ≈ 2.5)
95% c.i
Odds of dying are 2.5 times greater when dose is 10-fold stronger
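The odds ratio follows from the coefficient β1 = 0.92 reported for the fruit-fly example. The confidence-interval helper uses the Wald form, exp(β ± 1.96·SE), which is an assumption about how the interval was obtained; the standard error below is hypothetical because the slide's interval is not reproduced in this text.

```python
import math

beta1 = 0.92  # coefficient of Log(Dose) from the fruit-fly example

# The odds ratio for a one-unit change in the predictor is e^beta.
odds_ratio = math.exp(beta1)

def odds_ratio_ci(beta: float, se: float, z: float = 1.96):
    """Wald-type interval for the odds ratio: exp(beta - z*se), exp(beta + z*se)."""
    return math.exp(beta - z * se), math.exp(beta + z * se)

hypothetical_se = 0.4  # NOT from the slides; for illustration only
lower, upper = odds_ratio_ci(beta1, hypothetical_se)

print(round(odds_ratio, 2), lower, upper)
```

Since a 10-fold increase in dose is read as a one-unit change, the predictor is presumably log10 of dose.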
14
Example: Matriarchs As Repositories of Social Knowledge in African Elephants
Playback vocalizations of other elephants to matriarchal groups of elephants
Do they “bunch”?
McComb et al. Science 2001
15
Elephant Knowledge Dependent variable: Bunch / not bunch
Independent variables:
  Family [Categorical]
  Age of matriarch
  Mean age of other females
  Number of females in group
  Number of calves in group
  Age of youngest calf
  Presence of adult males
  Association index between group and playback individual
  Interactions: Age of matriarch X ...
16
Logistic Regression Elephant Bunching on:
β d.f.
Variables included in final model
  Family: P = 0.029
  Age of matriarch: P = 0.005
  Association index: P = 0.147
  Age of matriarch × association index: P = 0.011
Variables excluded from final model
  Age of other females: P = 0.248
  Females in group: P = 0.867
  Calves in group: P = 0.946
  Age of youngest calf: P = 0.194
  Presence of males: P = 0.166
  Other interactions with Age of matriarch
17
Logistic Regression Elephant Bunching on:
β d.f.
Variables included in final model
  Family: P = 0.029
  Age of matriarch: P = 0.005
  Association index: P = 0.147
  Age of matriarch × association index: P = 0.011
[Figure: panels for 55 yr-old matriarchs and 35 yr-old matriarchs]
“sensitivity of the bunching response to the association index increased with the age of the matriarch”
McComb et al. Science 2001
18
Logit
Logistic regression: P = e^Z / (1 + e^Z)
Logit transformation: Z = ln[P/(1-P)] = β0 + β1 · X1 + …
Logit transformation is inverse of logistic function
Logit differences are logs of odds-ratios
Logit regression (almost) equivalent to logistic regression
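Both properties of the logit can be checked numerically: it undoes the logistic function, and a difference of two logits equals the log of the corresponding odds ratio. This is a small sketch with illustrative function names and arbitrary probabilities.

```python
import math

def logistic(z: float) -> float:
    """Logistic function: linear predictor Z to probability P."""
    return 1.0 / (1.0 + math.exp(-z))

def logit(p: float) -> float:
    """Logit transformation: probability P (0 < P < 1) to log-odds."""
    return math.log(p / (1.0 - p))

# Inverse relationship: logistic(logit(P)) returns P.
p = 0.3
round_trip = logistic(logit(p))

# Logit difference equals the log of the odds ratio for two probabilities.
p_a, p_b = 0.8, 0.5
log_odds_ratio = math.log((p_a / (1.0 - p_a)) / (p_b / (1.0 - p_b)))
logit_difference = logit(p_a) - logit(p_b)

print(round_trip, log_odds_ratio, logit_difference)
```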
19
Probit Regression
Transforms values in range [0, 1] using inverse cumulative normal function
Useful for proportions (when numbers are not available)
Type of generalized linear model
[Figure: Probit(Y) plotted against Y]
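The probit transformation can be sketched with the Python standard library, whose `statistics.NormalDist.inv_cdf` gives the inverse cumulative normal function. The function name `probit` and the example proportions are illustrative.

```python
from statistics import NormalDist

_standard_normal = NormalDist(mu=0.0, sigma=1.0)

def probit(y: float) -> float:
    """Inverse cumulative standard normal of a proportion Y, with 0 < Y < 1."""
    return _standard_normal.inv_cdf(y)

# Proportions near 0 or 1 are stretched out; 0.5 maps to 0.
for y in (0.025, 0.5, 0.975):
    print(y, probit(y))

# The cumulative normal function undoes the transformation.
recovered = _standard_normal.cdf(probit(0.2))
```

Proportions of exactly 0 or 1 have no finite probit, and `inv_cdf` raises an error for them.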
20
With Many Categories Logistic regression for one category against rest
Canonical Variate Analysis
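The "one category against rest" approach amounts to recoding a many-category response into one binary response per category, each of which could then be used in its own logistic regression. The sketch below shows the recoding step only; the species labels are hypothetical.

```python
def one_vs_rest(labels, target):
    """Binary response: 1 where the label is the target category, else 0."""
    return [1 if label == target else 0 for label in labels]

# Hypothetical categorical response with three categories.
labels = ["A", "B", "C", "A", "C", "B", "A"]

# One binary response per category.
binary_responses = {category: one_vs_rest(labels, category)
                    for category in sorted(set(labels))}

print(binary_responses)
```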