STA 617 – Chp9 Loglinear/Logit Models: 9.7 Poisson regressions for rates

Presentation transcript:

1 STA 617 – Chp9 Loglinear/Logit Models 9.7 Poisson regressions for rates. In Section 4.3 we introduced Poisson regression for modeling counts. When outcomes occur over time, space, or some other index of size, it is more relevant to model their rate of occurrence than their raw number. We use a GLM with a Poisson distribution, a log link, and log(index) as an offset.

2 Analyzing Rates Using Loglinear Models with Offsets. When a response count $n_i$ has index equal to $t_i$, the sample rate is $n_i/t_i$. Its expected value is $\mu_i/t_i$. With an explanatory variable $x$, a loglinear model for the expected rate has form $\log(\mu_i/t_i) = \alpha + \beta x_i$. This model has equivalent representation $\log \mu_i - \log t_i = \alpha + \beta x_i$. The adjustment term, $-\log t_i$, to the log link of the mean is called an offset. The fit corresponds to using $\log t_i$ as a predictor on the right-hand side and forcing its coefficient to equal 1.0.
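As a minimal sketch of how the offset enters a fit, the SAS steps below use a made-up data set. The data set name `rates`, the variables `n`, `t`, `x`, `logt`, and all of the values are hypothetical and serve only as an illustration. The offset variable holds log t, and `offset=` adds it to the linear predictor with its coefficient fixed at 1.

```sas
/* Hypothetical toy data: n = event count, t = index (exposure), x = predictor */
data rates;
  input n t x;
  logt = log(t);   /* the offset must be on the log scale for a log link */
  datalines;
3 100 0
5 120 1
9 150 2
14 160 3
;
run;

/* Loglinear model for the rate: log(mu/t) = alpha + beta*x */
proc genmod data=rates;
  model n = x / dist=poisson link=log offset=logt;
run;
```

Leaving out `offset=logt` would model the raw count instead of the rate, so rows with a larger index would look like rows with a higher risk.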

3 Then $\mu_i = t_i \exp(\alpha + \beta x_i)$: the mean is proportional to the index, with proportionality constant depending on the value of $x$. Another option is the identity link, $\mu_i/t_i = \alpha + \beta x_i$. It is less useful because the fitting process may fail when it produces negative fitted values. However, the log link can also give a fitted rate greater than 1, which is a problem when the rate is a probability.
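A short worked step shows why the log-link coefficient is read as a rate ratio. It assumes the loglinear rate model log(mu/t) = alpha + beta x from the previous slide and compares the expected rate at x + 1 with the expected rate at x:

```latex
\frac{\mu(x+1)/t}{\mu(x)/t}
  = \frac{\exp\{\alpha + \beta (x+1)\}}{\exp\{\alpha + \beta x\}}
  = e^{\beta}
```

So each one-unit increase in x multiplies the expected rate by exp(beta), whatever the value of the index t.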

4 Modeling Death Rates for Heart Valve Operations. Laird and Olivier (1981) analyzed patient survival after heart valve replacement operations. A sample of 109 patients was classified by type of heart valve (aortic, mitral) and by age (<55, ≥55). Follow-up observations occurred until the patient died or the study ended. Operations occurred throughout the study period, and follow-up observations covered lengths of time varying from 3 to 97 months. Response: the number of deaths and the corresponding follow-up time.

5 The time at risk for a subject is their follow-up time of observation. For a given age and valve type, the total time at risk is the sum of the times at risk for all subjects in that cell (those who died and those censored).
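The aggregation described here can be sketched in SAS as follows. The subject-level data set `patients`, its variables (`died`, `months`), the age labels, and every value are hypothetical and invented for illustration; they are not the Laird and Olivier data.

```sas
/* Hypothetical subject-level data: one row per patient.
   died = 1 if the patient died, 0 if censored; months = follow-up time */
data patients;
  input age $ vtype $ died months;
  datalines;
young aortic 0 40
young aortic 1 12
young mitral 0 97
old aortic 1 3
old mitral 0 55
old mitral 1 20
;
run;

/* Collapse to one row per age-by-valve cell:
   deaths are summed, and time at risk is summed over ALL subjects,
   whether they died or were censored */
proc sql;
  create table cells as
  select age, vtype,
         sum(died)   as death,
         sum(months) as totaltime,
         calculated death / calculated totaltime as rate
  from patients
  group by age, vtype;
quit;
```

The resulting `cells` table has the same layout as the input to the Poisson rate model: a count, a total exposure, and the sample rate per unit of follow-up time.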

6 We now model effects of age and valve type on the rate: $\log(\mu_{ij}/t_{ij}) = \alpha + \beta_1 a_i + \beta_2 v_j$, where $a$ is age and $v$ is type of valve. Or, with the identity link, $\mu_{ij}/t_{ij} = \alpha + \beta_1 a_i + \beta_2 v_j$.
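To make the two links easier to compare, the models can be rewritten for the expected count. This sketch assumes that a_i and v_j are 0/1 indicators for age group and valve type (the coding is an assumption), and that t_ij is the total time at risk in the cell:

```latex
\begin{align*}
\text{log link:} \quad
  \log(\mu_{ij}/t_{ij}) &= \alpha + \beta_1 a_i + \beta_2 v_j
  &&\Longrightarrow\quad
  \mu_{ij} = t_{ij}\,\exp(\alpha + \beta_1 a_i + \beta_2 v_j) \\
\text{identity link:} \quad
  \mu_{ij}/t_{ij} &= \alpha + \beta_1 a_i + \beta_2 v_j
  &&\Longrightarrow\quad
  \mu_{ij} = \alpha\,t_{ij} + \beta_1 a_i t_{ij} + \beta_2 v_j t_{ij}
\end{align*}
```

Under the log link, exp(beta_1) is the ratio of death rates between the two age groups at a fixed valve type. Under the identity link, beta_1 is the difference between those rates.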

7 SAS code
data table9_11;
  input age $ vtype $ death totaltime;
  logtime = log(totaltime);
  cards;
<55 aortic <55 mitral aortic mitral
;

8 Model fit
/* log link: main effects of age and valve type */
proc genmod data=table9_11;
  class age vtype;
  model death = age vtype / dist=poi link=log offset=logtime lrci type3 obstats;
/* log link: age only */
proc genmod data=table9_11;
  class age vtype;
  model death = age / dist=poi link=log offset=logtime lrci type3 obstats;
/* log link: valve type only */
proc genmod data=table9_11;
  class age vtype;
  model death = vtype / dist=poi link=log offset=logtime lrci type3 obstats;
/* identity link */
proc genmod data=table9_11;
  class age vtype;
  model death/totaltime = age vtype / dist=poi link=identity lrci type3 obstats;
  ods output obstats=obstats Modelfit=Modelfit;
run;
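The identity-link rate model can also be written as a model for the count itself, with mean alpha*t + beta1*a*t + beta2*v*t and no separate intercept. The sketch below is one way to set that up and is not necessarily how the slides fit it. The data set `table9_11_id` and the variables `a`, `v`, `a_t`, `v_t` are hypothetical names, and the indicator coding assumes the younger group is labelled `<55` as in the data step above.

```sas
/* Build exposure-scaled predictors; a and v are hypothetical 0/1 indicators */
data table9_11_id;
  set table9_11;
  a   = (age ne '<55');       /* 1 = older age group */
  v   = (vtype = 'mitral');   /* 1 = mitral valve    */
  a_t = a * totaltime;
  v_t = v * totaltime;
run;

/* mu = alpha*t + beta1*a*t + beta2*v*t, so the intercept is suppressed;
   the coefficient of totaltime plays the role of alpha */
proc genmod data=table9_11_id;
  model death = totaltime a_t v_t / dist=poisson link=identity noint;
run;
```

In this form the coefficient of `a_t` is the estimated difference in death rates between the age groups, and the coefficient of `v_t` is the difference between valve types.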

9 Under the identity link, the estimated age effect is an estimated difference in death rates between the older and younger age groups for each valve type.

10 Another example. 2004 Florida birth vital statistics merged with death data. The predictors: smoking, drinking, education, marital status, Medicaid. The response: infant death. Purpose: to identify the maternal characteristics of Medicaid beneficiaries that are significantly associated with infant death, so that health care and related services can be focused on risk factors that contribute to the adverse outcome.

12 /*raw table*/
proc sql;
  create table rawtable as
  select 'smoking' as varlabel, smoking as varlevel,
         sum(total) as totalsumple, sum(infdth) as totalinfdth
    from birth2004 group by smoking
  union
  select 'drk' as varlabel, drk as varlevel,
         sum(total) as totalsumple, sum(infdth) as totalinfdth
    from birth2004 group by drk
  union
  select 'edu ' as varlabel, edu as varlevel,
         sum(total) as totalsumple, sum(infdth) as totalinfdth
    from birth2004 group by edu
  union
  select 'ms ' as varlabel, ms as varlevel,
         sum(total) as totalsumple, sum(infdth) as totalinfdth
    from birth2004 group by ms
  union
  select 'med ' as varlabel, med as varlevel,
         sum(total) as totalsumple, sum(infdth) as totalinfdth
    from birth2004 group by med;
data rawtable;
  set rawtable;
  percentage = totalinfdth/totalsumple*100;
proc print;
run;

14 /*backward model selection starting from main+2fis*/
proc genmod data=birth2004;
  class smoking drk edu ms med;
  model infdth = smoking drk edu ms med
                 smoking*drk smoking*edu smoking*ms smoking*med
                 drk*edu drk*ms drk*med
                 edu*ms edu*med
                 ms*med
        / dist=poi link=log offset=logtotal lrci type3;
  ods output type3=type3;
run;
proc sort data=type3;
  by ProbChiSq;
run;
proc print data=type3;
run;

15 Main effects + 2-factor interactions. The model shows no lack of fit, but it might be too complicated.
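One common way to check for lack of fit is to compare the residual deviance with a chi-square distribution on its degrees of freedom. The sketch below assumes the GENMOD step was run with an extra `ods output ModelFit=modelfit;` statement (the step shown earlier only saves the Type 3 table) and that the saved table has the columns `Criterion`, `DF`, and `Value`. The data set `gof` and the variable `p_value` are hypothetical names.

```sas
/* Assumes: ods output ModelFit=modelfit; was added to the GENMOD step */
data gof;
  set modelfit;
  where Criterion = 'Deviance';
  p_value = 1 - probchi(Value, DF);   /* upper-tail chi-square probability */
run;

proc print data=gof;
  var Criterion DF Value p_value;
run;
```

A large p-value gives no evidence of lack of fit. The chi-square approximation can be poor when many cells have small expected counts, so the result is best read as a rough guide.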

16 Backward model selection. Sort the Type 3 table by p-value, then delete drk*ms.

17 Continue the backward procedure, but keep a main effect that is included in an interaction even if it is not significant. Terms deleted, in order: smoking*med, drk*med, smoking*edu, edu*med, edu*ms, smoking*drk, smoking*ms, drk*edu, drk.
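Each deletion step can be read as a likelihood-ratio comparison of two nested models. As a sketch, with L the maximized log-likelihood, D the deviance, and df the residual degrees of freedom of each fit:

```latex
G^2 \;=\; 2\,\bigl(L_{\text{full}} - L_{\text{reduced}}\bigr)
    \;=\; D_{\text{reduced}} - D_{\text{full}}
    \;\overset{\text{approx.}}{\sim}\;
    \chi^2_{\,df_{\text{reduced}} - df_{\text{full}}}
```

A term is a candidate for removal when dropping it gives a small G^2, that is, a large p-value. The procedure stops when every remaining term is either significant or needed as a main effect under a retained interaction.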

18 Final model
proc genmod data=birth2004;
  class smoking drk edu ms med;
  model infdth = smoking edu ms med ms*med
        / dist=poi link=log offset=logtotal lrci type3;
  ods output type3=type3;
run;
proc sort data=type3;
  by ProbChiSq;
run;
proc print data=type3;
run;

21 lsmeans smoking edu ms med ms*med /diff;

22 Relative Risks
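With a log link, a relative risk (rate ratio) is the exponential of a difference in linear predictors. One way to obtain it in GENMOD is an ESTIMATE statement with the EXP option, sketched below for the smoking effect in the final model. The sketch assumes `smoking` has exactly two levels. The contrast label is hypothetical, and which level comes first depends on how the CLASS statement orders the levels, so the direction of the ratio should be checked against the class level information in the output.

```sas
proc genmod data=birth2004;
  class smoking drk edu ms med;
  model infdth = smoking edu ms med ms*med
        / dist=poi link=log offset=logtotal;
  /* Difference between the two smoking levels on the log scale;
     EXP also reports exp(estimate), i.e. the rate ratio, with its limits */
  estimate 'smoking: level 1 vs level 2' smoking 1 -1 / exp;
run;
```

Because `ms` and `med` interact in this model, a relative risk for marital status has to be stated separately for each Medicaid level, and the same holds in the other direction.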