Generalized Linear Models

Generalized Linear Models (GLM)
A general class of linear models made up of 3 components: random, systematic, and link function
Random component: identifies the dependent variable (Y) and its probability distribution
Systematic component: identifies the set of explanatory variables (X1, ..., Xk)
Link function: identifies a function of the mean that is a linear function of the explanatory variables
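
The slides reference SPSS, SAS, R, and STATA; as a minimal sketch of how the three components map onto a model specification, here is an illustration in Python using statsmodels with simulated data (the variable names and coefficient values are made up).

```python
# Minimal sketch: the three GLM components in a statsmodels model specification.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.normal(size=200)                      # systematic component: explanatory variable
p = 1 / (1 + np.exp(-(-0.5 + 1.2 * x)))       # true success probabilities (assumed values)
y = rng.binomial(1, p)                        # random component: binomial response

X = sm.add_constant(x)                        # design matrix with an intercept
# family = random component; Binomial's default link (logit) = link function
result = sm.GLM(y, X, family=sm.families.Binomial()).fit()
print(result.summary())
```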

Random Component
Conditionally Normally distributed response with constant standard deviation: the regression models we have fit so far
Binary outcomes (Success or Failure): the random component has a Binomial distribution and the model is called Logistic Regression
Count data (number of events in a fixed area and/or length of time): the random component has a Poisson distribution and the model is called Poisson Regression
When count data have V(Y) > E(Y), the model fit can be Negative Binomial Regression
Continuous data with a skewed distribution and variation that increases with the mean can be modeled with a Gamma distribution
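
As a rough sketch (again assuming Python's statsmodels, which is not mentioned in the slides), the response types above correspond to the following distribution families for the random component:

```python
# Illustrative mapping from response type to the statsmodels family class.
import statsmodels.api as sm

families = {
    "normal response, constant spread": sm.families.Gaussian(),
    "binary outcome (logistic regression)": sm.families.Binomial(),
    "counts (Poisson regression)": sm.families.Poisson(),
    "overdispersed counts, V(Y) > E(Y)": sm.families.NegativeBinomial(),
    "skewed continuous, variance grows with mean": sm.families.Gamma(),
}
for description, family in families.items():
    print(f"{description:45s} -> {type(family).__name__}")
```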

Common Link Functions
Identity link (form used in normal and gamma regression models): g(μ) = μ
Log link (used when μ cannot be negative, as when data are Poisson counts): g(μ) = log(μ)
Logit link (used when μ is bounded between 0 and 1, as when data are binary): g(μ) = log(μ / (1 - μ))
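
A small numeric sketch of these links (plain numpy; the helper names are illustrative, not a library API): each link maps the mean onto the unbounded scale where a linear predictor lives.

```python
import numpy as np

def identity_link(mu):
    return mu

def log_link(mu):
    return np.log(mu)             # requires mu > 0, e.g. a Poisson mean

def logit_link(mu):
    return np.log(mu / (1 - mu))  # requires 0 < mu < 1, e.g. a probability

mu_counts = np.array([0.5, 2.0, 10.0])
mu_probs = np.array([0.1, 0.5, 0.9])
print(identity_link(mu_counts))   # unchanged
print(log_link(mu_counts))        # maps (0, infinity) onto the real line
print(logit_link(mu_probs))       # maps (0, 1) onto the real line
```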

Logistic Regression
Logistic regression: dichotomous response variable and numeric and/or categorical explanatory variable(s)
Goal: model the probability of a particular outcome as a function of the predictor variable(s)
Problem: probabilities are bounded between 0 and 1
Distribution of responses: Binomial
Link function: logit, g(p) = log(p / (1 - p))

Logistic Regression with 1 Predictor
Response: presence/absence of a characteristic
Predictor: numeric variable observed for each case
Model: p(x) = probability of presence at predictor level x, with
  p(x) = exp(β0 + β1x) / (1 + exp(β0 + β1x))
β1 = 0: P(Presence) is the same at each level of x
β1 > 0: P(Presence) increases as x increases
β1 < 0: P(Presence) decreases as x increases
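
A quick numeric check of the sign rules (plain Python with made-up coefficient values): p(x) rises with x when β1 > 0 and falls when β1 < 0.

```python
import numpy as np

def p_of_x(x, b0, b1):
    """Logistic model: p(x) = exp(b0 + b1*x) / (1 + exp(b0 + b1*x))."""
    eta = b0 + b1 * x
    return np.exp(eta) / (1 + np.exp(eta))

x = np.linspace(-3, 3, 7)
print(p_of_x(x, b0=0.0, b1=1.5))   # increasing in x (beta1 > 0)
print(p_of_x(x, b0=0.0, b1=-1.5))  # decreasing in x (beta1 < 0)
```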

Logistic Regression with 1 Predictor
β0, β1 are unknown parameters and must be estimated using statistical software such as SPSS, SAS, R, or STATA (or in a matrix language)
Primary interest is in estimating and testing hypotheses regarding β1
Large-sample test (Wald test):
  H0: β1 = 0    HA: β1 ≠ 0
  Test statistic: X² = (β̂1 / SE(β̂1))², compared to a chi-square distribution with 1 degree of freedom
Note: some software packages perform this as an equivalent Z-test or t-test
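
A hedged sketch of the Wald test (Python/statsmodels with simulated data, not the packages named in the slides): the statistic is computed by hand from the slope estimate and its standard error.

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(1)
x = rng.normal(size=300)
y = rng.binomial(1, 1 / (1 + np.exp(-(0.3 + 0.8 * x))))  # assumed true coefficients

fit = sm.GLM(y, sm.add_constant(x), family=sm.families.Binomial()).fit()
b1, se1 = fit.params[1], fit.bse[1]

z = b1 / se1                        # Wald z-statistic (the "equivalent Z-test")
wald_chi2 = z ** 2                  # chi-square form, 1 degree of freedom
p_value = stats.chi2.sf(wald_chi2, df=1)
print(z, wald_chi2, p_value)
```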

Odds Ratio
Interpretation of the regression coefficient (β):
In linear regression, the slope coefficient is the change in the mean response as x increases by 1 unit
In logistic regression, we can show that:
  odds(x) = p(x) / (1 - p(x)) = exp(β0 + β1x), so odds(x + 1) / odds(x) = exp(β1)
Thus exp(β) represents the (multiplicative) change in the odds of the outcome when x increases by 1 unit
If β = 0, the odds and probability are the same at all x levels (exp(β) = 1)
If β > 0, the odds and probability increase as x increases (exp(β) > 1)
If β < 0, the odds and probability decrease as x increases (exp(β) < 1)
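
A small numeric check of that statement (plain numpy, made-up coefficient values): raising x by 1 unit multiplies the odds by exp(β1).

```python
import numpy as np

b0, b1 = -0.5, 0.7   # assumed coefficient values

def odds(x):
    p = np.exp(b0 + b1 * x) / (1 + np.exp(b0 + b1 * x))
    return p / (1 - p)

x = 2.0
print(odds(x + 1) / odds(x))   # ratio of the odds at x+1 versus x
print(np.exp(b1))              # identical: the odds ratio exp(beta1)
```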

95% Confidence Interval for the Odds Ratio
Step 1: Construct a 95% CI for β: β̂ ± 1.96·SE(β̂)
Step 2: Raise e = 2.718... to the lower and upper bounds of that CI: (exp(β̂ - 1.96·SE(β̂)), exp(β̂ + 1.96·SE(β̂)))
If the entire interval is above 1, conclude a positive association
If the entire interval is below 1, conclude a negative association
If the interval contains 1, we cannot conclude that there is an association
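
The same two steps in code (a sketch assuming Python/statsmodels and simulated data): take the 95% CI for the slope and exponentiate its endpoints.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
x = rng.normal(size=400)
y = rng.binomial(1, 1 / (1 + np.exp(-(-0.2 + 0.6 * x))))  # assumed true coefficients

fit = sm.GLM(y, sm.add_constant(x), family=sm.families.Binomial()).fit()
lo, hi = np.asarray(fit.conf_int())[1]   # Step 1: 95% CI for the slope beta1
print((lo, hi))                          # CI on the log-odds scale
print((np.exp(lo), np.exp(hi)))          # Step 2: CI for the odds ratio; does it contain 1?
```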

Multiple Logistic Regression
Extension to more than one predictor variable (either numeric or dummy variables)
With k predictors, the model is written: log(p / (1 - p)) = β0 + β1x1 + ... + βkxk
Adjusted odds ratio for raising xi by 1 unit, holding all other predictors constant: exp(βi)
Many models have nominal/ordinal predictors and make wide use of dummy variables
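
A sketch of a multiple logistic regression with one numeric and one dummy-coded categorical predictor (Python/statsmodels formula interface; the data frame, variable names, and coefficients are made up): exponentiating the coefficients gives the adjusted odds ratios.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n = 500
df = pd.DataFrame({
    "age": rng.normal(50, 10, size=n),                    # numeric predictor
    "group": rng.choice(["control", "treated"], size=n),  # categorical predictor
})
eta = -4 + 0.07 * df["age"] + 0.9 * (df["group"] == "treated")  # assumed true model
df["y"] = rng.binomial(1, 1 / (1 + np.exp(-eta)))

fit = smf.logit("y ~ age + C(group)", data=df).fit()   # C() builds the dummy variable
print(np.exp(fit.params))   # adjusted odds ratios, each holding the other predictor constant
```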

Testing Regression Coefficients
Testing the overall model: H0: β1 = ... = βk = 0, using the likelihood-ratio statistic
  X² = -2·log(L0 / L1) = -2(log L0 - log L1), compared to a chi-square distribution with k degrees of freedom
L0, L1 are the values of the maximized likelihood function under the null and full models, computed by statistical software packages
This logic can also be used to compare full and reduced models based on subsets of predictors
Testing for individual terms is done as in the model with a single predictor
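
A sketch of the likelihood-ratio comparison of full and reduced models (Python/statsmodels with simulated data): the package reports the maximized log-likelihoods, and the statistic is twice their difference.

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(4)
n = 400
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = rng.binomial(1, 1 / (1 + np.exp(-(0.2 + 0.8 * x1))))   # x2 truly has no effect

full = sm.GLM(y, sm.add_constant(np.column_stack([x1, x2])),
              family=sm.families.Binomial()).fit()
reduced = sm.GLM(y, sm.add_constant(x1), family=sm.families.Binomial()).fit()

lr_stat = -2 * (reduced.llf - full.llf)   # -2*log(L0/L1) = 2*(log L1 - log L0)
p_value = stats.chi2.sf(lr_stat, df=1)    # one predictor (x2) was dropped
print(lr_stat, p_value)
```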

Poisson Regression
Generally used to model count data
Distribution: Poisson (restriction: E(Y) = V(Y))
Link function: can be the identity link, but typically the log link is used: log(μ) = β0 + β1x1 + ... + βkxk
Tests are conducted as in logistic regression
When the mean and variance are not equal (over-dispersion), the Poisson distribution is often replaced with the Negative Binomial distribution
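
A closing sketch (Python/statsmodels, simulated counts with assumed coefficients): a Poisson regression with the default log link, plus a Negative Binomial fit as the usual fallback when the counts are over-dispersed.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
x = rng.normal(size=300)
mu = np.exp(0.5 + 0.4 * x)       # log link: log(mu) = 0.5 + 0.4*x (assumed values)
y = rng.poisson(mu)

X = sm.add_constant(x)
poisson_fit = sm.GLM(y, X, family=sm.families.Poisson()).fit()
print(poisson_fit.params)        # estimates of beta0, beta1

# If V(Y) > E(Y) in the data, a Negative Binomial family is a common alternative.
nb_fit = sm.GLM(y, X, family=sm.families.NegativeBinomial()).fit()
print(nb_fit.params)
```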