Logistic Regression Hal Whitehead BIOL4062/5062.


- Categorical data
- Logistic regression on binary data
- Odds ratio
- Logits
- Probit regression
- With many categories

Categorical data
- Categorical data: sex, species, morph, physiological state
- Categorical vs continuous (independent variable => dependent variable):
    Continuous => Continuous: linear regression
    Categorical => Continuous: ANOVA
    Categorical => Categorical: log-linear models
    Continuous => Categorical: logistic regression
    {Also: Continuous + Categorical => Categorical}

Logistic Regression on Binary Data
- Two categories
- Proportions
- Want to work out the probability of being in a category: P
- Logistic regression: Z = β0 + β1·X1 + …, where P = exp(Z) / (1 + exp(Z))

Logistic Regression
- Z = β0 + β1·X1 + …
- If Z is large and positive: P ≈ 1.0
- If Z is large and negative: P ≈ 0.0
- Fit β0, β1, … using maximum likelihood
- X's can be categorical as well as continuous
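The link between Z and P can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard logistic function P = 1 / (1 + exp(-Z)); the function name `logistic` and the sample Z values are illustrative and do not come from the slides.

```python
import math

def logistic(z: float) -> float:
    """Map a linear predictor Z to a probability P in (0, 1)."""
    # Use the algebraically equivalent form for negative z so that
    # exp() is never called with a large positive argument.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

# Large positive Z gives P close to 1, large negative Z gives P close to 0,
# and Z = 0 gives P = 0.5.
p_large_positive = logistic(10.0)
p_large_negative = logistic(-10.0)
p_zero = logistic(0.0)
print(p_large_positive, p_large_negative, p_zero)
```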

Logistic Regression: Outputs
- Estimates of regression coefficients: β0, β1, …
- Significance of regression coefficients and of the overall logistic regression
- Quantile probabilities
- Accuracy of prediction
- Odds ratios

Logistic Regression
- Regression coefficients are estimated by maximizing the log-likelihood iteratively
- Significance of coefficients is indicated by:
    likelihood ratio test (theoretically best)
    Wald test (normal approximation)
- Can reduce the number of independent variables using stepwise elimination
- Or choose the "best" model using AIC

Example: Fruit-fly Death

Dose     Dead   Alive
0.01     1      4
0.1      3      2
1.0      2      3
10.0     4      1
100.0    5      0

Logistic Regression
Z = β0 + β1 × Log(Dose)
- β0 = 0.56 (constant), P = 0.255
- β1 = 0.92 (× Log(Dose))
- Overall P = 0.0064
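As a rough check on these estimates, the fruit-fly counts can be fitted by maximum likelihood with a hand-written Newton-Raphson loop. This is only a sketch: it assumes that Log(Dose) means log base 10 (consistent with the later "10-fold" interpretation), and the function names `fit` and `log_likelihood` are illustrative. The slides do not say which software or algorithm produced the reported values.

```python
import math

# Fruit-fly data from the example: dose, number dead, number alive.
doses = [0.01, 0.1, 1.0, 10.0, 100.0]
dead = [1, 3, 2, 4, 5]
alive = [4, 2, 3, 1, 0]

# Assumption: "Log(Dose)" is log base 10.
x = [math.log10(d) for d in doses]
n = [d + a for d, a in zip(dead, alive)]

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(b0, b1):
    """Binomial log-likelihood (without the constant binomial coefficients)."""
    total = 0.0
    for xi, yi, ni in zip(x, dead, n):
        p = logistic(b0 + b1 * xi)
        total += yi * math.log(p) + (ni - yi) * math.log(1.0 - p)
    return total

def fit(max_iter=50, tol=1e-10):
    """Maximize the log-likelihood by Newton-Raphson for two coefficients."""
    b0, b1 = 0.0, 0.0
    for _ in range(max_iter):
        g0 = g1 = 0.0              # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0      # information matrix (negative Hessian)
        for xi, yi, ni in zip(x, dead, n):
            p = logistic(b0 + b1 * xi)
            resid = yi - ni * p
            w = ni * p * (1.0 - p)
            g0 += resid
            g1 += resid * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        # Solve the 2x2 system (information matrix) * step = gradient.
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

beta0, beta1 = fit()
max_log_lik = log_likelihood(beta0, beta1)
print(round(beta0, 2), round(beta1, 2), round(max_log_lik, 3))
```

Under the base-10 assumption the fitted coefficients should land close to the values on the slide.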

Model selection using AIC

Model                 Log(L)     AIC
Constant only         -16.825    35.650
Const, dose           -13.112    30.224
Const, dose, dose²    -12.869    31.738
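The AIC column follows from the log-likelihoods via AIC = 2k - 2·Log(L), where k is the number of fitted coefficients. The sketch below recomputes it and also shows a likelihood ratio test of the dose model against the constant-only model. The parameter counts (1, 2, 3) and the use of a chi-square distribution with 1 degree of freedom are assumptions consistent with the table, not details stated on the slide.

```python
import math

# (log-likelihood, number of fitted coefficients) for each candidate model.
models = {
    "constant only": (-16.825, 1),
    "constant, dose": (-13.112, 2),
    "constant, dose, dose^2": (-12.869, 3),
}

def aic(log_lik, k):
    """Akaike information criterion: lower is better."""
    return 2 * k - 2 * log_lik

aic_values = {name: aic(ll, k) for name, (ll, k) in models.items()}
best_model = min(aic_values, key=aic_values.get)

# Likelihood ratio test: dose model against constant-only model.
lr_stat = 2 * (models["constant, dose"][0] - models["constant only"][0])
# Upper tail of a chi-square distribution with 1 d.f. is erfc(sqrt(x / 2)).
p_overall = math.erfc(math.sqrt(lr_stat / 2))

for name, value in aic_values.items():
    print(f"{name}: AIC = {value:.3f}")
print("lowest AIC:", best_model)
print(f"LR statistic = {lr_stat:.3f}, P = {p_overall:.4f}")
```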

Accuracy of prediction

                   Predicted:
Actual:            Died    Lived
Died               10.6    4.4
Lived              4.4     5.6
Correct            0.7     0.6
Overall correct    0.65
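One way to arrive at a table like this is to sum the fitted probabilities rather than count hard classifications, which would explain the fractional counts. The sketch below takes that approach using the reported coefficients (β0 = 0.56, β1 = 0.92) and log base 10 of dose. Both the method and the log base are assumptions, since the slides do not describe how the table was built.

```python
import math

doses = [0.01, 0.1, 1.0, 10.0, 100.0]
dead = [1, 3, 2, 4, 5]
alive = [4, 2, 3, 1, 0]

beta0, beta1 = 0.56, 0.92  # coefficients reported for the fruit-fly example

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

# Fitted probability of death at each dose (assumes log base 10).
p_death = [logistic(beta0 + beta1 * math.log10(d)) for d in doses]

# Expected counts: each fly contributes its fitted probability.
died_pred_died = sum(y * p for y, p in zip(dead, p_death))
died_pred_lived = sum(y * (1 - p) for y, p in zip(dead, p_death))
lived_pred_died = sum(a * p for a, p in zip(alive, p_death))
lived_pred_lived = sum(a * (1 - p) for a, p in zip(alive, p_death))

correct_died = died_pred_died / sum(dead)
correct_lived = lived_pred_lived / sum(alive)
overall_correct = (died_pred_died + lived_pred_lived) / (sum(dead) + sum(alive))

print(f"Died:  {died_pred_died:.1f} {died_pred_lived:.1f}")
print(f"Lived: {lived_pred_died:.1f} {lived_pred_lived:.1f}")
print(f"Correct: {correct_died:.2f} {correct_lived:.2f}, overall {overall_correct:.2f}")
```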

Odds ratio
- Compares the odds of something happening at two values of the independent variable:
  ω = [P(A)/(1-P(A))] / [P(B)/(1-P(B))]
- "Odds of dying in next 5 years are ω times greater for smokers than non-smokers"
- Log(ω) = β: the log of the odds ratio for a one-unit change in the independent variable is the regression coefficient, so ω = exp(β)

Odds ratio
- Odds ratio for β1 = 2.5 (95% c.i. 1.2-5.4)
- Odds of dying are 2.5 times greater when the dose is 10-fold stronger
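The odds ratio can be recovered from the coefficient in two equivalent ways, sketched below with the fruit-fly values (β0 = 0.56, β1 = 0.92). The comparison points, log10(dose) = 0 and 1, are chosen here for illustration. Any two values one unit apart give the same ratio.

```python
import math

beta0, beta1 = 0.56, 0.92  # coefficients reported for the fruit-fly example

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)

# Two doses that differ 10-fold, i.e. by one unit of log10(dose).
x_low, x_high = 0.0, 1.0
p_low = logistic(beta0 + beta1 * x_low)
p_high = logistic(beta0 + beta1 * x_high)

odds_ratio_from_probs = odds(p_high) / odds(p_low)
odds_ratio_from_beta = math.exp(beta1)

print(round(odds_ratio_from_probs, 2), round(odds_ratio_from_beta, 2))
```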

Example: Matriarchs As Repositories of Social Knowledge in African Elephants
- Play back vocalizations of other elephants to matriarchal groups of elephants
- Do they "bunch"?
McComb et al. Science 2001

Elephant Knowledge
- Dependent variable: Bunch / not bunch
- Independent variables:
    Family [Categorical]
    Age of matriarch
    Mean age of other females
    Number of females in group
    Number of calves in group
    Age of youngest calf
    Presence of adult males
    Association index between group and playback individual
- Interactions: Age of matriarch × ...

Logistic Regression
Elephant Bunching on:

                                          β         d.f.   P
Variables included in final model
  Family                                  -         20     0.029
  Age of matriarch                        -0.514    1      0.005
  Association index                       98.0      1      0.147
  Age of matriarch × association index    -4.31     1      0.011
Variables excluded from final model
  Age of other females                    -0.201    1      0.248
  Females in group                        0.033     1      0.867
  Calves in group                         0.015     1      0.946
  Age of youngest calf                    0.032     1      0.194
  Presence of males                       -0.851    1      0.166
  Other interactions with Age of matriarch

Logistic Regression
Elephant Bunching on:

                                          β         d.f.   P
Variables included in final model
  Family                                  -         20     0.029
  Age of matriarch                        -0.514    1      0.005
  Association index                       98.0      1      0.147
  Age of matriarch × association index    -4.31     1      0.011

[Figure labels: 55 yr-old matriarchs; 35 yr-old matriarchs]
"sensitivity of the bunching response to the association index increased with the age of the matriarch"
McComb et al. Science 2001
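The interaction term means the effect of the association index on Z depends on the matriarch's age. The sketch below assumes the usual form of an interaction, Z = … + β_AI·AI + β_int·(age × AI), so the slope of Z with respect to the association index is β_AI + β_int·age. The variable and function names are illustrative, the family effects are omitted because the slide does not list them, and the units of the variables are not given.

```python
# Coefficients from the final model in the table.
beta_assoc = 98.0         # association index
beta_interaction = -4.31  # age of matriarch x association index

def assoc_slope(age_of_matriarch: float) -> float:
    """Change in Z per unit of association index at a given matriarch age."""
    return beta_assoc + beta_interaction * age_of_matriarch

# The two ages labelled in the figure.
slope_35 = assoc_slope(35)
slope_55 = assoc_slope(55)

print(f"35 yr-old matriarch: {slope_35:.2f}")
print(f"55 yr-old matriarch: {slope_55:.2f}")
```

Under this assumption the slope is larger in magnitude for the older matriarch, which matches the quoted statement that sensitivity to the association index increased with age.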

Logit
- Logistic regression: Z = β0 + β1·X1 + …
- Logit transformation is the inverse of the logistic function
- Logit differences are logs of odds ratios
- Logit regression is (almost) equivalent to logistic regression
[Figure labels: Logistic regression; Logit transformation]
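Two of these statements are easy to check numerically: that the logit undoes the logistic function, and that a difference of logits equals the log of an odds ratio. A short sketch follows. The probabilities 0.8 and 0.3 are arbitrary illustrative values.

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))

# Round trip: logit is the inverse of the logistic function.
p = 0.3
round_trip = logistic(logit(p))

# A difference of logits is the log of the odds ratio.
p_a, p_b = 0.8, 0.3
logit_difference = logit(p_a) - logit(p_b)
log_odds_ratio = math.log((p_a / (1 - p_a)) / (p_b / (1 - p_b)))

print(round_trip, logit_difference, log_odds_ratio)
```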

Probit Regression
- Transforms values in the range [0, 1] using the inverse cumulative normal function
- Useful for proportions (when numbers are not available)
- A type of generalized linear model
[Figure axes: Probit(Y), Y]
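The probit transformation itself is available in the Python standard library (version 3.8 or later) as the inverse CDF of the standard normal distribution. The sketch below only illustrates the transformation, not a full probit regression fit. The function name `probit` is illustrative.

```python
from statistics import NormalDist

_standard_normal = NormalDist(mu=0.0, sigma=1.0)

def probit(y: float) -> float:
    """Inverse cumulative standard normal; y must lie strictly between 0 and 1."""
    return _standard_normal.inv_cdf(y)

# A proportion of 0.5 maps to 0; proportions near 0 or 1 map to
# large negative or large positive values.
for y in (0.025, 0.5, 0.975):
    print(y, round(probit(y), 3))
```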

With Many Categories
- Logistic regression for one category against the rest
- Canonical Variate Analysis