Basic Introduction LOGISTIC REGRESSION

Presentation transcript:

Basic Introduction LOGISTIC REGRESSION Mike Bailey 2/19/2019

Course at statistics.com

BASICS
Response: dichotomous
Predictors X:
  categorical (usually made into dichotomous design variables)
  real-valued

PREDICTING PROBABILITIES

LOGISTIC MODEL

LOGIT FUNCTION
So, if we can estimate p(x) and take the logit, we have a linear function of the x's. We can then use regression to estimate the b's.
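A small R sketch of the relationship described on this slide, assuming a single predictor. The coefficients b0 and b1 are made-up values for illustration and do not come from the slides. plogis() is R's inverse logit and qlogis() is the logit.

```r
# Hypothetical coefficients, chosen only for illustration
b0 <- -2
b1 <- 0.5
x  <- c(0, 1, 2, 4, 8)

eta     <- b0 + b1 * x   # linear predictor
p       <- plogis(eta)   # p(x) = exp(eta) / (1 + exp(eta))
logit_p <- qlogis(p)     # log(p / (1 - p))

# Taking the logit of p(x) recovers the linear function of x
print(cbind(x, p, logit_p))
```

The last column equals b0 + b1 * x, which is why the b's can be estimated with a regression on the logit scale.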

ODDS
p(x)/(1-p(x)) is the odds that Y=1 given x

CASE 1: DICHOTOMOUS x
Data: contingency table

        Y=0   Y=1
  X=0    a     d
  X=1    c     b

ODDS
What are the odds of Y=1 when X=1?

ODDS RATIO

        Y=0   Y=1
  X=0    a     d
  X=1    c     b

Odds of Y=1 when X=1: b/c
Odds of Y=1 when X=0: d/a
Odds ratio (ratio of the odds for Y=1): (b/c)/(d/a) = ab/(cd)
Odds ratios have an easily understood interpretation.
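The odds ratio can be computed directly from a 2x2 table. The sketch below uses the slide's cell layout (X=0 row: a, d; X=1 row: c, b) with hypothetical counts that are not from the presentation.

```r
# Hypothetical counts, laid out as on the slide
a  <- 40; d <- 10   # X=0 row: Y=0, Y=1
c0 <- 30; b <- 20   # X=1 row: Y=0, Y=1 (named c0 to avoid masking base::c)

# matrix() fills column by column: first the Y=0 column, then the Y=1 column
tab <- matrix(c(a, c0, d, b), nrow = 2,
              dimnames = list(X = c("0", "1"), Y = c("0", "1")))

odds_x1 <- tab["1", "1"] / tab["1", "0"]   # odds of Y=1 when X=1
odds_x0 <- tab["0", "1"] / tab["0", "0"]   # odds of Y=1 when X=0
odds_ratio <- odds_x1 / odds_x0

print(tab)
print(odds_ratio)
```

The result equals the cross-product ratio ab/(cd) of the table.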

EXAMPLE
Y = 1 if the baby has low birth weight
X = 1 if the mother has frequent prenatal care
ODDS RATIO: the multiplicative change in the odds of Y=1 when X=1 rather than X=0
“The odds of low birth weight are half as large (O.R. = ½) when the mother has frequent prenatal care.”
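An odds ratio acts on the odds, not directly on the probability. The sketch below shows the conversion for O.R. = ½, assuming a baseline probability of 0.10 without prenatal care. That baseline is a made-up number, used only to show the arithmetic.

```r
p0         <- 0.10   # hypothetical P[Y=1] when X=0
odds_ratio <- 0.5    # O.R. from the slide's example

odds0 <- p0 / (1 - p0)        # baseline odds
odds1 <- odds_ratio * odds0   # odds when X=1
p1    <- odds1 / (1 + odds1)  # back to a probability

print(c(p0 = p0, p1 = p1, ratio_of_probabilities = p1 / p0))
```

Under this assumed baseline the probability falls to 1/19, slightly more than half of 0.10. The ratio of probabilities is close to the odds ratio only when the outcome is rare.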


THE MAGIC CONTINUES...
b1 = ln(O.R.)
The logit is linear in x.
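One way to see why b1 is the log odds ratio, assuming the single-predictor model logit p(x) = b_0 + b_1 x with a dichotomous x (the notation b_0 for the intercept is an assumption here):

```latex
\begin{aligned}
\operatorname{logit} p(1) - \operatorname{logit} p(0)
  &= (b_0 + b_1) - b_0 = b_1, \\
\ln \frac{p(1)/\bigl(1 - p(1)\bigr)}{p(0)/\bigl(1 - p(0)\bigr)}
  &= b_1
  \quad\Longrightarrow\quad
  \text{O.R.} = e^{b_1}.
\end{aligned}
```

The left-hand side of the second line is the log of the ratio of the odds at X=1 to the odds at X=0, which is the odds ratio defined earlier.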

USING R
> G <- glm(formula = weight ~ prenatal, family = binomial(link = logit))
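The call on this slide does not show its data. A runnable sketch is given below, assuming the data arrive as counts in a 2x2 table. The counts and the data frame name are hypothetical and were chosen so that the table odds ratio is ½, as in the birth-weight example.

```r
# Hypothetical counts: one row per cell of the 2x2 table
counts <- data.frame(
  prenatal = c(0, 0, 1, 1),     # X: frequent prenatal care
  weight   = c(0, 1, 0, 1),     # Y: low birth weight
  n        = c(60, 40, 75, 25)  # number of babies in each cell
)

# Cell counts enter the fit as frequency weights
G <- glm(formula = weight ~ prenatal, family = binomial(link = logit),
         data = counts, weights = n)

table_or <- (25 / 75) / (40 / 60)   # odds at X=1 divided by odds at X=0
print(coef(G))
print(exp(coef(G)["prenatal"]))     # odds ratio implied by the fitted b1
print(table_or)
```

With a single dichotomous predictor, exp(b1) from the fit agrees with the odds ratio computed from the table.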

DATA
Save out of Excel as a .csv file
[spreadsheet screenshot; column headers: y x1 x2 x3 x4 x5 marine army navy iraqi; cell values shown: 200 1 100 90 300 50 150]
> eof2 <- read.csv(file="e:datafile2.csv", header = TRUE)

RESULTS
> g2 <- glm(formula = y ~ x1+x2+x3+x4+x5, family = binomial(link=logit), data = eof2)
> g2

Call:  glm(formula = y ~ x1 + x2 + x3 + x4 + x5, family = binomial(link = logit), data = eof)

Coefficients:
(Intercept)           x1           x2           x3           x4           x5
   -950.506        3.714       -3.716           NA      951.613      -75.118

Degrees of Freedom: 36 Total (i.e. Null);  32 Residual
Null Deviance:      29.31
Residual Deviance:  3.802      AIC: 13.8

H0: The model doesn’t explain the variability in the data
Deviance statistic ~ sum of squares ~ χ²
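The deviance comparison behind H0 can be sketched in R. The example assumes the usual approach of referring the drop in deviance to a chi-squared distribution, with degrees of freedom equal to the drop in residual degrees of freedom. The slides do not show this computation, and the chi-squared reference is an asymptotic approximation. The numbers are copied from the printout above.

```r
# Values copied from the glm printout on the slide
null_dev  <- 29.31; null_df  <- 36
resid_dev <- 3.802; resid_df <- 32

drop_in_dev <- null_dev - resid_dev   # deviance explained by the predictors
df_used     <- null_df - resid_df     # number of estimated slopes (x3 is NA)

# Upper-tail chi-squared probability for the drop in deviance
p_value <- pchisq(drop_in_dev, df = df_used, lower.tail = FALSE)
print(c(drop_in_dev = drop_in_dev, df = df_used, p_value = p_value))
```

A small p-value is evidence against H0, that is, evidence that the predictors explain some of the variability in y.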

ARMY vs. USMC
> SERV2 <- glm(formula = y ~ marine + army, family = binomial(link=logit), data = eof2)
> SERV2

Call:  glm(formula = y ~ marine + army, family = binomial(link = logit), data = eof2)

Coefficients:
(Intercept)       marine         army
 -1.757e+01    1.577e+01    2.312e-09

Degrees of Freedom: 36 Total (i.e. Null);  34 Residual
Null Deviance:      29.31
Residual Deviance:  28.71      AIC: 34.71

AOR
> region2 <- glm(formula = y ~ raleigh + topeka + denver + mobile + oshkosh + eagle, family = binomial(link=logit), data = eof2)
> region2

Call:  glm(formula = y ~ raleigh + topeka + denver + mobile + oshkosh + eagle, family = binomial(link = logit), data = eof2)

Coefficients:
(Intercept)      raleigh       topeka       denver       mobile      oshkosh        eagle
 -1.957e+01    1.693e+01   -2.086e-08    1.847e+01    3.913e+01   -2.086e-08           NA

Degrees of Freedom: 36 Total (i.e. Null);  31 Residual
Null Deviance:      29.31
Residual Deviance:  20.84      AIC: 32.84

EXAMPLE
Fear of Violence in Children
Y = 1 iff the interviewee anticipates being the victim of violence in the next 6 months
Predictors are demographic:
  Age (design variable, 2-year categories)
  Race (design variable)
  Below the poverty line (dichotomous)
  Sex (dichotomous)
  Two-parent home (dichotomous)
  Recent victim (dichotomous)

EXAMPLE
Fear of Violence in Children
Source: poster display, Gornto Teletechnet Center, ODU

EARLY SEXUAL EXPERIENCE AND IQ
Y = 1 if the subject had sexual experience
Predictors (X) are:
  design variables for intervals of the AHVPT (IQ)
  design variables for age (HS, Undergrad, Grad)
  design variables for specific universities
Source: http://www.gnxp.com/blog/2007/04/intercourse-and-intelligence.php

RESULTS
A subject with an IQ of 100 was 5x more likely to have had intercourse than a subject with an IQ of 130 (odds ratios)
Each IQ point increases the odds of virginity by 2.7% for males and 1.7% for females (estimates of b)
Probability of virginity (predicted values of Y):
  Age 19 males: 20%
  Age 19 females: 25%
  College aged: 13%
  Princeton undergrads: 44%
  Harvard undergrads: 41%
  MIT graduate students: 35%
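A per-point percentage change in the odds maps to a coefficient b through b = ln(1 + change), and the effect over several points compounds multiplicatively. The sketch below shows only that arithmetic. It assumes the per-point figures on the slide apply to a model that is linear in IQ, and the 10-point difference is a hypothetical choice, not a figure from the source.

```r
# Per-point change in the odds of virginity, as quoted on the slide
pct_per_point <- c(male = 0.027, female = 0.017)

b <- log(1 + pct_per_point)    # implied logistic coefficient per IQ point
k <- 10                        # hypothetical IQ difference

odds_multiplier <- exp(k * b)  # odds ratio for a k-point difference
print(round(b, 4))
print(round(odds_multiplier, 3))
```

The multiplier applies to the odds. Turning it into a probability requires a baseline, as with any odds ratio.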

RESULTS

SUMMARY
Logistic regression produces odds ratios, predicted values, and regression coefficients.
Odds ratios are easily interpreted.
Predictors (x’s) are often categorical or dichotomous.