Do whatever is needed to finish…

Slides:



Advertisements
Similar presentations
- Word counts - Speech error counts - Metaphor counts - Active construction counts Moving further Categorical count data.
Advertisements

Analysis of Categorical Data Nick Jackson University of Southern California Department of Psychology 10/11/
Logistic Regression Psy 524 Ainsworth.
Logistic Regression I Outline Introduction to maximum likelihood estimation (MLE) Introduction to Generalized Linear Models The simplest logistic regression.
Logit & Probit Regression
Linear statistical models 2008 Binary and binomial responses The response probabilities are modelled as functions of the predictors Link functions: the.

Introduction to Logistic Regression. Simple linear regression Table 1 Age and systolic blood pressure (SBP) among 33 adult women.
FIN357 Li1 Binary Dependent Variables Chapter 12 P(y = 1|x) = G(  0 + x  )
Log-linear and logistic models
EPI 809/Spring Multiple Logistic Regression.
Analysis of Complex Survey Data Day 3: Regression.
Regression Model Building Setting: Possibly a large set of predictor variables (including interactions). Goal: Fit a parsimonious model that explains variation.
Generalized Linear Models
1 B. The log-rate model Statistical analysis of occurrence-exposure rates.
Logistic regression for binary response variables.
MODELS OF QUALITATIVE CHOICE by Bambang Juanda.  Models in which the dependent variable involves two ore more qualitative choices.  Valuable for the.
Education 795 Class Notes Applied Research Logistic Regression Note set 10.
1 G Lect 11W Logistic Regression Review Maximum Likelihood Estimates Probit Regression and Example Model Fit G Multiple Regression Week 11.
Biostatistics Case Studies 2015 Youngju Pak, PhD. Biostatistician Session 4: Regression Models and Multivariate Analyses.
Chapter 3: Generalized Linear Models 3.1 The Generalization 3.2 Logistic Regression Revisited 3.3 Poisson Regression 1.
Excepted from HSRP 734: Advanced Statistical Methods June 5, 2008.
HSRP 734: Advanced Statistical Methods June 19, 2008.
Session 10. Applied Regression -- Prof. Juran2 Outline Binary Logistic Regression Why? –Theoretical and practical difficulties in using regular (continuous)
Linear Model. Formal Definition General Linear Model.
When and why to use Logistic Regression?  The response variable has to be binary or ordinal.  Predictors can be continuous, discrete, or combinations.
April 4 Logistic Regression –Lee Chapter 9 –Cody and Smith 9:F.
Week 5: Logistic regression analysis Overview Questions from last week What is logistic regression analysis? The mathematical model Interpreting the β.
Forecasting Choices. Types of Variable Variable Quantitative Qualitative Continuous Discrete (counting) Ordinal Nominal.
Limited Dependent Variables Ciaran S. Phibbs. Limited Dependent Variables 0-1, small number of options, small counts, etc. 0-1, small number of options,
Introduction to logistic regression and Generalized Linear Models July 14, 2011 Introduction to Statistical Measurement and Modeling Karen Bandeen-Roche,
1 STA 617 – Chp10 Models for matched pairs Summary  Describing categorical random variable – chapter 1  Poisson for count data  Binomial for binary.
Log-linear Models HRP /03/04 Log-Linear Models for Multi-way Contingency Tables 1. GLM for Poisson-distributed data with log-link (see Agresti.
Applied Epidemiologic Analysis - P8400 Fall 2002 Labs 6 & 7 Case-Control Analysis ----Logistic Regression Henian Chen, M.D., Ph.D.
Qualitative and Limited Dependent Variable Models ECON 6002 Econometrics Memorial University of Newfoundland Adapted from Vera Tabakova’s notes.
Analysis of Experimental Data IV Christoph Engel.
1 Introduction to Modeling Beyond the Basics (Chapter 7)
1 Fighting for fame, scrambling for fortune, where is the end? Great wealth and glorious honor, no more than a night dream. Lasting pleasure, worry-free.
Variance Stabilizing Transformations. Variance is Related to Mean Usual Assumption in ANOVA and Regression is that the variance of each observation is.
Applied Epidemiologic Analysis - P8400 Fall 2002 Labs 6 & 7 Case-Control Analysis ----Logistic Regression Henian Chen, M.D., Ph.D.
Roger B. Hammer Assistant Professor Department of Sociology Oregon State University Conducting Social Research Logistic Regression Categorical Data Analysis.
Week 7: General linear models Overview Questions from last week What are general linear models? Discussion of the 3 articles.
Instructor: R. Makoto 1richard makoto UZ Econ313 Lecture notes.
Non-Linear Dependent Variables Ciaran S. Phibbs November 17, 2010.
Logistic Regression: Regression with a Binary Dependent Variable.
Chapter 13 LOGISTIC REGRESSION. Set of independent variables Categorical outcome measure, generally dichotomous.
Statistical Modelling
Logistic Regression When and why do we use logistic regression?
A priori violations In the following cases, your data violates the normality and homoskedasticity assumption on a priori grounds: (1) count data  Poisson.
CHAPTER 7 Linear Correlation & Regression Methods
Chapter 13 Nonlinear and Multiple Regression
M.Sc. in Economics Econometrics Module I
THE LOGIT AND PROBIT MODELS
Generalized Linear Models
Drop-in Sessions! When: Hillary Term - Week 1 Where: Q-Step Lab (TBC) Sign up with Alice Evans.
Generalized Linear Models
Logistic regression One of the most common types of modeling in the biomedical literature Especially case-control studies Used when the outcome is binary.
Generalized Linear Models (GLM) in R
Introduction to logistic regression a.k.a. Varbrul
Jeffrey E. Korte, PhD BMTRY 747: Foundations of Epidemiology II
THE LOGIT AND PROBIT MODELS
Quantitative Methods What lies beyond?.
Welcome to the class! set.seed(843) df <- tibble::data_frame(
Scatter Plots of Data with Various Correlation Coefficients
DCAL Stats Workshop Bodo Winter.
What is Regression Analysis?
Quantitative Methods What lies beyond?.
Introduction to Logistic Regression
MPHIL AdvancedEconometrics
Chapter 6 Logistic Regression: Regression with a Binary Dependent Variable Copyright © 2010 Pearson Education, Inc., publishing as Prentice-Hall.
Presentation transcript:

Do whatever is needed to finish…

Generalized Linear Models (GLM) EDUC 7610 Chapter 18 Generalized Linear Models (GLM) Fall 2018 Tyson S. Barrett, PhD

Lots of Types of Outcomes Not all outcomes are continuous and nicely distributed Type Example Method to Handle It Dichotomous Smoker/Non-Smoker Depressed/Not Depressed Logistic Regression Count Number of times visited hospital this month Poisson Regression, Negative Binomial Regression Ordinal Low, Mid, High levels of anxiety Ordinal Logistic Regression Time to Event Time until heart attack Survival Analysis library(tidyverse) set.seed(843) df <- data_frame( x1 = runif(100), x2 = rbinom(100, 1, .5), y = x1 + x2 + 2*x1*x2 + rnorm(100) ) df %>% mutate(x2_cat = factor(x2)) %>% ggplot(aes(x1, y, group = x2_cat, color = x2_cat)) + geom_point() + geom_smooth(method = "lm")

(same as in regular regression) The Gist of GLMs We model the expected value of the model in a different way than regular regression 𝑔( 𝑌 𝑖 )= 𝛽 0 + 𝛽 1 𝑋 1 + …+ 𝜖 𝑖 Link function Predictors (same as in regular regression) library(tidyverse) set.seed(843) df <- data_frame( x1 = runif(100), x2 = rbinom(100, 1, .5), y = x1 + x2 + 2*x1*x2 + rnorm(100) ) df %>% mutate(x2_cat = factor(x2)) %>% ggplot(aes(x1, y, group = x2_cat, color = x2_cat)) + geom_point() + geom_smooth(method = "lm") Outcome response

𝑔( 𝑌 𝑖 )= 𝛽 0 + 𝛽 1 𝑋 1 + …+ 𝜖 𝑖 The Gist of GLMs Model Link We model the expected value of the model in a different way than regular regression 𝑔( 𝑌 𝑖 )= 𝛽 0 + 𝛽 1 𝑋 1 + …+ 𝜖 𝑖 Model Link Distribution Linear Regression Identity Normal Logistic Regression Logit Binomial Poisson Regression Log Poisson Loglinear Probit Regression Probit library(tidyverse) set.seed(843) df <- data_frame( x1 = runif(100), x2 = rbinom(100, 1, .5), y = x1 + x2 + 2*x1*x2 + rnorm(100) ) df %>% mutate(x2_cat = factor(x2)) %>% ggplot(aes(x1, y, group = x2_cat, color = x2_cat)) + geom_point() + geom_smooth(method = "lm")

The Linear Models and GLMs There are so much in common between linear models and generalized linear models Model specification Diagnostics Simple and multiple library(tidyverse) set.seed(843) df <- data_frame( x1 = runif(100), x2 = rbinom(100, 1, .5), y = x1 + x2 + 2*x1*x2 + rnorm(100) ) df %>% mutate(x2_cat = factor(x2)) %>% ggplot(aes(x1, y, group = x2_cat, color = x2_cat)) + geom_point() + geom_smooth(method = "lm") Continuous and categorical predictors Similar assumptions

The most common GLM: Logistic Logistic regression is a particular type of GLM The outcome is binary (dichotomous) The predicted values (predicted probabilities) are along an S shaped curve Makes it so the predictions are never less than 0 or more than 1 Often matches how probabilities probably work library(tidyverse) set.seed(843) df <- data_frame( x1 = rnorm(100), x2 = runif(100), z = 2*x1 + x2, pr = 1/(1+exp(-z)), y = rbinom(100,1,pr) ) df %>% ggplot(aes(x1, y)) + geom_point(color = "chartreuse4", alpha = .7, size = 3) + geom_smooth(method = "glm", method.args = list(family = "binomial"), color = "coral2")

The most common GLM: Logistic Logistic regression is a particular type of GLM 𝑙𝑜𝑔𝑖𝑡 𝑌 𝑖 = 𝛽 0 + 𝛽 1 𝑋 1 + …+ 𝜖 𝑖 log 𝑃 1 −𝑃 The results then are in terms of ”logits” or the ”log-odds” of a positive response (gives us the S shape for the predicted values) For a one unit increase in 𝑋 1 , there is an associated 𝛽 1 change in the log odds of the outcome

Predicted Probabilities Logistic Regression Since log-odds is not all that intuitive, let’s talk about other ways to interpret the results Predicted Probabilities Odds Ratios Exponentiate the coefficients for OR  𝑒 𝛽 1 Can be slightly biased (Mood, 2010) but are the most common way to interpret logistic regression Average Marginal Effects Use AMEs to get the average effect in the sample Are less biased than odds ratios (Mood, 2010)

Some Additional Linear Models Poisson Regression Ordinal Logistic Regression For count outcomes For ordinal outcomes Survival Analysis Structural Equation Modeling For time to event outcomes A flexible, powerful framework for general purpose modeling (linear regression is a subset of SEM) Time Series Multilevel Modeling For outcomes where it is measured periodically many times For nested or longitudinal outcomes

Structural Equation Modeling A flexible, powerful framework for general purpose modeling (linear regression is a subset of SEM) M1 M2 M3 M X Y

Structural Equation Modeling A flexible, powerful framework for general purpose modeling (linear regression is a subset of SEM) Can do multiple “dependent” variables Latent variables to control for measurement error Interpreted like regular regression Several approaches (e.g., LCA) Many estimation routines (mostly based on ML like GLMs) More assumptions though M1 M2 M3 M X Y

By Debbie Pan on Unsplash.com