Unit 4c: Taxonomies of Logistic Regression Models
© Andrew Ho, Harvard Graduate School of Education
Unit 4c – Slide 1


Unit 4c – Slide 2
Course Roadmap: Unit 4c

Building the Logistic Regression Model • Dichotomous Predictors • Interactions • Post-Hoc GLH Tests

Multiple Regression Analysis (MRA)
• Do your residuals meet the required assumptions?
  • Test for residual normality.
  • Use influence statistics to detect atypical datapoints.
• If your residuals are not independent, replace OLS by GLS regression analysis.
  • Use individual growth modeling.
  • Specify a multi-level model.
• If time is a predictor, you need discrete-time survival analysis.
• If your outcome is categorical, you need to use:
  • Binomial logistic regression analysis (dichotomous outcome).
  • Multinomial logistic regression analysis (polytomous outcome).
• If you have more predictors than you can deal with:
  • Create taxonomies of fitted models and compare them.
  • Form composites of the indicators of any common construct.
  • Conduct a Principal Components Analysis.
  • Use Cluster Analysis.
  • Use Factor Analysis: EFA or CFA?
• If your outcome vs. predictor relationship is non-linear:
  • Use non-linear regression analysis.
  • Transform the outcome or predictor.

(Today’s Topic Area is highlighted on the roadmap.)

Unit 4c – Slide 3
The Bivariate Distribution of HOME on HUBSAL

RQ: In 1976, were married Canadian women who had children at home and husbands with higher salaries more likely to work at home rather than joining the labor force (when compared to their married peers with no children at home and husbands who earn less)?

Unit 4c – Slide 4
The Bivariate Distribution of HOME on CHILD

Scatterplots don’t work very well with dichotomous outcomes and dichotomous predictors. Instead, try a 2x2 table with the “tabulate” command. Note that (1,1) is in the lower right for tables but in the upper right for scatterplots.

• The command specifies conditional percentages by rows (and joint probabilities by cells):
• Given that there is a child present, the sample probability of being a homemaker is 86.58%.
• Given that there is no child present, the sample probability of being a homemaker is 35.29%.

Unit 4c – Slide 5
Sample Probabilities, Odds, Log-Odds, and Odds Ratios

Are Children Present in the Home? | Sample Probability Homemaker | Sample Log-Odds (Logit) | Sample Difference in Log-Odds | Sample Odds Ratio | Sample Log-Odds Ratio
No Child | 35.29% | | | |
Children | 86.58% | | | |

I recommend understanding the logit scale (nonlinear in probability): -2 is around 10%, -1 is around 25%, 0 is 50%, 1 is around 75%, and 2 is around 90%.

We note that an increment from No Child (0) to Children (1) increments the log-odds by 2.47.
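The conversions on this slide can be checked from the two sample probabilities alone. The following is a minimal Python sketch (the slides themselves work in Stata); the helper names `odds`, `logit`, and `inv_logit` are illustrative and are not taken from the course code.

```python
import math

def odds(p):
    """Convert a probability to odds, p / (1 - p)."""
    return p / (1.0 - p)

def logit(p):
    """Convert a probability to log-odds (the logit)."""
    return math.log(odds(p))

def inv_logit(x):
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

# Sample probabilities of being a homemaker, as reported on the slides.
p_no_child = 0.3529
p_children = 0.8658

logit_no_child = logit(p_no_child)
logit_children = logit(p_children)

# The difference in log-odds is the same quantity as the log of the odds ratio.
log_odds_ratio = logit_children - logit_no_child
odds_ratio = odds(p_children) / odds(p_no_child)

print(f"logit, No Child: {logit_no_child:.3f}")
print(f"logit, Children: {logit_children:.3f}")
print(f"difference in log-odds: {log_odds_ratio:.2f}")
print(f"odds ratio: {odds_ratio:.2f}")

# Landmarks on the logit scale, to compare against the rule of thumb.
for x in (-2, -1, 0, 1, 2):
    print(f"logit {x:+d} -> probability {inv_logit(x):.3f}")
```

The difference in log-odds comes out at about 2.47, matching the slide, and exponentiating it gives the odds ratio. The landmark loop shows that the rule of thumb is approximate away from 0.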

Unit 4c – Slide 6
Modeling a Dichotomous Outcome on a Dichotomous Predictor

Are Children Present in the Home? | Sample Probability Homemaker | Sample Log-Odds (Logit) | Sample Difference in Log-Odds
No Child | 35.29% | |
Children | 86.58% | 1.864 |
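With a single 0/1 predictor, a logistic model of the form logit(P(HOME = 1)) = b0 + b1 * CHILD has one parameter per group, so its maximum-likelihood fit reproduces the two sample proportions. The Python sketch below illustrates this using the slide's probabilities; the names `b0`, `b1`, and `fitted_probability` are assumptions made for this example, and the values are derived from the rounded percentages rather than from fitted Stata output.

```python
import math

def logit(p):
    """Probability to log-odds."""
    return math.log(p / (1.0 - p))

def inv_logit(x):
    """Log-odds to probability."""
    return 1.0 / (1.0 + math.exp(-x))

# Sample probabilities from the slide.
p_no_child, p_children = 0.3529, 0.8658

# Intercept: log-odds of being a homemaker when CHILD = 0.
b0 = logit(p_no_child)
# Slope: difference in log-odds between CHILD = 1 and CHILD = 0.
b1 = logit(p_children) - b0

def fitted_probability(child):
    """Fitted probability of HOME = 1 for CHILD = 0 or 1."""
    return inv_logit(b0 + b1 * child)

for child in (0, 1):
    print(f"CHILD={child}: fitted probability {fitted_probability(child):.4f}")
```

The intercept is the No Child logit and the slope is the difference in log-odds, which is why the fitted probabilities return the two sample percentages.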

Unit 4c – Slide 7
Building the logistic regression model

• Our old friend eststo:
• Beginning with the baseline model, no predictors, constant only (Model 1).
• Adding main effects separately (Models 2 and 3), together (Model 4), and an interaction (Model 5).
• At each step, save the “deviance” (-2*loglikelihood).
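Saved deviances are typically used to compare nested models: the drop in deviance between a reduced and a fuller model is referred to a chi-square distribution with degrees of freedom equal to the number of added parameters. The sketch below shows that calculation in plain Python. The function names are assumptions for this example, and the log-likelihoods in the usage lines are made-up placeholders, not results from the slides.

```python
import math

def deviance(log_likelihood):
    """Deviance as defined on the slide: -2 * log-likelihood."""
    return -2.0 * log_likelihood

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    t = x / 2.0
    # Closed-form starting points: df = 2 (even case) or df = 1 (odd case).
    if df % 2 == 0:
        a, q = 1.0, math.exp(-t)
    else:
        a, q = 0.5, math.erfc(math.sqrt(t))
    # Recurrence Q(a + 1, t) = Q(a, t) + t**a * exp(-t) / Gamma(a + 1).
    while a < df / 2.0:
        q += t ** a * math.exp(-t) / math.gamma(a + 1.0)
        a += 1.0
    return q

def deviance_test(deviance_reduced, deviance_full, df_diff):
    """Return (drop in deviance, p-value) for two nested models."""
    drop = deviance_reduced - deviance_full
    return drop, chi2_sf(drop, df_diff)

# Hypothetical log-likelihoods, for illustration only.
ll_reduced, ll_full = -300.0, -290.0
drop, p_value = deviance_test(deviance(ll_reduced), deviance(ll_full), df_diff=1)
print(f"drop in deviance = {drop:.1f}, p = {p_value:.6f}")
```

In a taxonomy like the one described, the same comparison could be applied to each pair of nested models, for example the main-effects model against the interaction model with one added parameter.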

Unit 4c – Slide 8
Interpretation of Main Effects

Unit 4c – Slide 9
Interpretation of Fit Statistics

Unit 4c – Slide 10
Graphical Representation of Model 4

• It is always good practice to only plot fitted curves in the range of the data whose relationships they describe.
• It is particularly important for graphing logistic regression models on the probability metric, where there are clearly nonlinear relationships.
• See today’s code for details. Label curves.

[Plot: fitted curves labeled No Children and Children]

• How do we interpret the varying gap? As an interaction?
• No! There is no interaction in Model 4.
• The scale is not what it seems. This is actually a linear model in the log-odds.
• The distance is just as large at the extremes as it is in the center; it just doesn’t seem that way, since we are plotting on the probability metric.
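The point about the varying gap can be illustrated numerically. The sketch below uses hypothetical main-effects coefficients (they are not the fitted Model 4 estimates, which this transcript does not show) and compares the gap between the Children and No Children curves on both metrics.

```python
import math

def inv_logit(x):
    """Log-odds to probability."""
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical coefficients, chosen only for illustration.
B0, B_HUBSAL, B_CHILD = -3.0, 0.2, 2.0

def log_odds(hubsal, child):
    """Main-effects model with no interaction term."""
    return B0 + B_HUBSAL * hubsal + B_CHILD * child

def gaps(hubsal):
    """Return (gap in log-odds, gap in probability) between CHILD = 1 and 0."""
    lo_child, lo_none = log_odds(hubsal, 1), log_odds(hubsal, 0)
    return lo_child - lo_none, inv_logit(lo_child) - inv_logit(lo_none)

for hubsal in (0, 10, 20, 30):
    logit_gap, prob_gap = gaps(hubsal)
    print(f"HUBSAL={hubsal:2d}  logit gap={logit_gap:.2f}  probability gap={prob_gap:.3f}")
```

On the log-odds metric the gap equals the CHILD coefficient at every value of HUBSAL, while on the probability metric it widens and narrows, even though the model contains no interaction.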

Unit 4c – Slide 11
Contrasting Graphical Representations of Model 4
[Plots: two panels, each with curves labeled No Children and Children]

Unit 4c – Slide 12
Interpretation of Model 5
[Plot: curves labeled No Children and Children]

Unit 4c – Slide 13
Contrasting Graphical Representations of Model 5
[Plots: two panels, each with curves labeled No Children and Children]

Unit 4c – Slide 14
Post-Hoc Tests
[Plots: two panels, each with curves labeled No Children and Children]