© Department of Statistics 2012 STATS 330 Lecture 23: Slide 1 Stats 330: Lecture 23

© Department of Statistics 2012 STATS 330 Lecture 23: Slide 2 Plan of the day
In today's lecture we continue our discussion of the multiple logistic regression model.
Topics covered:
– Models and submodels
– Residuals for multiple logistic regression
– Diagnostics in multiple logistic regression
– No analogue of R²
Reference: Coursebook, section 5.2.3

© Department of Statistics 2012 STATS 330 Lecture 23: Slide 3 Comparison of models
Suppose model 1 and model 2 are two models, with model 2 a submodel of model 1.
If model 2 is in fact correct, then the difference in the deviances has approximately a chi-squared distribution, with df equal to the difference in the df of the two models.
The approximation is OK for both grouped and ungrouped data.
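A minimal sketch of the computation, using hypothetical nested fits fit1 (submodel) and fit2 (full model):
fit1 <- glm(y ~ x1, family=binomial, data=dat)        # submodel (hypothetical names)
fit2 <- glm(y ~ x1 + x2, family=binomial, data=dat)   # full model
dev.diff <- deviance(fit1) - deviance(fit2)           # drop in deviance
df.diff <- df.residual(fit1) - df.residual(fit2)      # difference in df
1 - pchisq(dev.diff, df.diff)                         # approximate p-value
# equivalently: anova(fit1, fit2, test="Chi")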

© Department of Statistics 2012 STATS 330 Lecture 23: Slide 4 Example: kyphosis data
Is age alone an adequate model?
> age.glm <- glm(Kyphosis ~ Age + I(Age^2), family=binomial, data=kyphosis.df)
    Null deviance: … on 80 degrees of freedom
Residual deviance: … on 78 degrees of freedom
AIC: …
The full model has deviance … on 76 df. The chi-squared statistic is the difference in deviances, 18.311, on 78 - 76 = 2 df:
> 1 - pchisq(18.311, 2)
[1] 0.0001056
Highly significant: we need at least one of Start and Number.

© Department of Statistics 2012 STATS 330 Lecture 23: Slide 5 Anova in R
> anova(age.glm, kyphosis.glm, test="Chi")
Analysis of Deviance Table
Model 1: Kyphosis ~ Age + I(Age^2)
Model 2: Kyphosis ~ Age + I(Age^2) + Start + Number
  Resid. Df Resid. Dev Df Deviance P(>|Chi|)
1        78          …
2        76          …  2   18.311 0.0001056 ***
This is the two-model form of anova: it compares the two fitted models directly.

© Department of Statistics 2012 STATS 330 Lecture 23: Slide 6 Residuals
Two kinds of residuals:
– Pearson residuals: useful for grouped data only; similar to residuals in linear regression, actual minus fitted value.
– Deviance residuals: useful for grouped and ungrouped data; they measure the contribution of each covariate pattern to the deviance.

© Department of Statistics 2012 STATS 330 Lecture 23: Slide 7 Pearson residuals
The Pearson residual for covariate pattern i is
$$ r_i = \frac{y_i - n_i\hat{\pi}_i}{\sqrt{n_i\hat{\pi}_i(1-\hat{\pi}_i)}} $$
where $\hat{\pi}_i$ is the probability predicted by the model.
The residuals are standardized to have approximately unit variance, so a residual is big if it is more than 2 in absolute value.
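A sketch checking the formula by hand, for a hypothetical grouped fit with y successes out of n trials per covariate pattern:
fit <- glm(cbind(y, n - y) ~ x, family=binomial, data=grouped.df)  # hypothetical names
pihat <- fitted(fit)                                 # model-predicted probabilities
r.pearson <- (grouped.df$y - grouped.df$n*pihat)/sqrt(grouped.df$n*pihat*(1 - pihat))
all.equal(r.pearson, unname(residuals(fit, type="pearson")))       # should be TRUE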

© Department of Statistics 2012 STATS 330 Lecture 23: Slide 8 Deviance residuals (i)
For grouped data, the deviance is
$$ D = \sum_{i=1}^{M} d_i^2, \qquad d_i = \operatorname{sign}(y_i - n_i\hat{\pi}_i)\sqrt{2\left[\, y_i\log\frac{y_i}{n_i\hat{\pi}_i} + (n_i - y_i)\log\frac{n_i - y_i}{n_i(1-\hat{\pi}_i)} \right]} $$

© Department of Statistics 2012 STATS 330 Lecture 23: Slide 9 Deviance residuals (ii)
Thus, the deviance can be written as the sum of squares of M quantities d_1, …, d_M, one for each covariate pattern.
Each d_i is the contribution to the deviance from the i-th covariate pattern.
If a deviance residual is big (more than about 2 in magnitude), then the covariate pattern has a big influence on the likelihood, and hence on the estimates.
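A quick numerical check that the deviance residuals square and sum to the deviance (any binomial fit will do, e.g. the hypothetical fit from the slide-7 sketch):
d <- residuals(fit, type="deviance")
all.equal(sum(d^2), deviance(fit))     # should be TRUE up to rounding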

© Department of Statistics 2012 STATS 330 Lecture 23: Slide 10 Calculating residuals
> pearson.residuals <- residuals(budworm.glm, type="pearson")
> deviance.residuals <- residuals(budworm.glm, type="deviance")
> par(mfrow=c(1,2))
> plot(pearson.residuals, ylab="residuals", main="Pearson")
> abline(h=0, lty=2)
> plot(deviance.residuals, ylab="residuals", main="Deviance")
> abline(h=0, lty=2)

© Department of Statistics 2012 STATS 330 Lecture 23: Slide 11
[Figure: index plots of the Pearson and deviance residuals for the budworm data.]

© Department of Statistics 2012 STATS 330 Lecture 23: Slide 12 Diagnostics: outlier detection
Large residuals indicate covariate patterns poorly fitted by the model.
For grouped data, large Pearson residuals indicate a poor match between the "maximum model" probabilities and the logistic model probabilities.
Large deviance residuals indicate influential points.
Example: budworm data.

© Department of Statistics 2012 STATS 330 Lecture 23: Slide 13 Diagnostics: detecting non-linear regression functions
For a single x, plot the logits of the maximal model probabilities against x.
For multiple x's, plot the Pearson residuals against the fitted probabilities and against the individual x's.
If the data have most n_i's equal to 1, so the data can't be grouped, try gam (cf. the kyphosis data).
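A sketch of the single-x plot using empirical logits, with the usual 0.5 correction to avoid log(0) (one common version of the maximal-model logits; names hypothetical):
emp.logit <- with(grouped.df, log((y + 0.5)/(n - y + 0.5)))
plot(grouped.df$x, emp.logit, xlab="x", ylab="empirical logit")
# a straight-line trend supports linearity in x on the logit scale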

© Department of Statistics 2012 STATS 330 Lecture 23: Slide 14 Example: budworms
[Figure: Pearson residuals plotted against dose; the plot shows a curve.]

© Department of Statistics 2012 STATS 330 Lecture 23: Slide 15 Diagnostics: influential points
We will look at 3 diagnostics:
– Hat matrix diagonals
– Cook's distance
– Leave-one-out deviance change

© Department of Statistics 2012 STATS 330 Lecture 23: Slide 16 Example: vaso-constriction data
Data from a study of reflex vaso-constriction (narrowing of the blood vessels) of the skin of the fingers.
– Can be caused by a sharp intake of breath.

© Department of Statistics 2012 STATS 330 Lecture 23: Slide 17 Example: vaso-constriction data
Variables measured:
– Response: 0/1; 1 = vaso-constriction occurs, 0 = doesn't occur
– Volume: volume of air breathed in
– Rate: rate of intake of breath

© Department of Statistics 2012 STATS 330 Lecture 23: Slide 18 Data
[Data listing: Volume, Rate and Response for all 39 observations.]

© Department of Statistics 2012 STATS 330 Lecture 23: Slide 19 Plot of data
> plot(Rate, Volume, type="n", cex=1.2)
> text(Rate, Volume, 1:39, col=ifelse(Response==1, "red", "blue"), cex=1.2)
> text(2.3, 3.5, "blue: no VS", col="blue", adj=0, cex=1.2)
> text(2.3, 3.0, "red: VS", col="red", adj=0, cex=1.2)

© Department of Statistics 2012 STATS 330 Lecture 23: Slide 20
[Figure: Volume plotted against Rate with cases labelled by observation number; note points 4 and 18.]

© Department of Statistics 2012 STATS 330 Lecture 23: Slide 21 Enhanced residual plots
> vaso.glm <- glm(Response ~ log(Volume) + log(Rate), family=binomial, data=vaso.df)
> pear.r <- residuals(vaso.glm, type="pearson")
> dev.r <- residuals(vaso.glm, type="deviance")
> par(mfrow=c(1,2))
> plot(pear.r, ylab="residuals", main="Pearson", type="n")
> text(pear.r, cex=0.7)
> abline(h=0, lty=2)
> abline(h=2, lty=2, lwd=2)
> abline(h=-2, lty=2, lwd=2)
> plot(dev.r, ylab="residuals", main="Deviance", type="h")
> text(dev.r, cex=0.7)
> abline(h=0, lty=2)
> abline(h=2, lty=2, lwd=2)
> abline(h=-2, lty=2, lwd=2)

© Department of Statistics 2012 STATS 330 Lecture 23: Slide 22
[Figure: Pearson and deviance residual plots for the vaso-constriction data, with reference lines at 0 and ±2.]

© Department of Statistics 2012 STATS 330 Lecture 23: Slide 23 Diagnostics: hat matrix diagonals
Hat matrix diagonals (HMDs) can be defined pretty much as in linear models.
An HMD is big if HMD > 3p/M, where p is the number of coefficients and M is the number of covariate patterns.
Draw an index plot of the HMDs.

© Department of Statistics 2012 STATS 330 Lecture 23: Slide 24 Plotting HMDs
> HMD <- hatvalues(vaso.glm)
> plot(HMD, ylab="HMD's", type="h")
> text(HMD, cex=0.7)
> abline(h=3*3/39, lty=2)   # cutoff 3p/M with p = 3, M = 39

© Department of Statistics 2012 STATS 330 Lecture 23: Slide 25
[Figure: index plot of the HMDs; observation 31 is high-leverage.]

© Department of Statistics 2012 STATS 330 Lecture 23: Slide 26 Hat matrix diagonals
In ordinary regression, the hat matrix diagonals measure how "outlying" the covariates for an observation are.
In logistic regression, the HMDs measure the same thing, but are down-weighted according to the estimated probability for the observation: the weights get small if the probability is close to 0 or 1.
In the vaso-constriction data, points 1, 2 and 17 had very small weights, since the probabilities are close to 1 for these points.
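A sketch of where the down-weighting comes from: the HMDs are the diagonals of the hat matrix of the final iteratively reweighted least squares fit, whose working weights (for 0/1 data) are pihat_i(1 - pihat_i), shrinking as pihat_i approaches 0 or 1.
X <- model.matrix(vaso.glm)
w <- vaso.glm$weights                        # IRLS working weights from the fit
WX <- sqrt(w)*X
H <- WX %*% solve(crossprod(WX)) %*% t(WX)   # W^(1/2) X (X'WX)^(-1) X' W^(1/2)
all.equal(unname(diag(H)), unname(hatvalues(vaso.glm)))   # should be TRUE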

© Department of Statistics 2012 STATS 330 Lecture 23: Slide 27
[Figure: note points 1, 2 and 17.]

© Department of Statistics 2012 STATS 330 Lecture 23: Slide 28 Diagnostics: Cook's distance
An analogue of Cook's distance can be defined for each point:
$$ CD_i = \frac{r_i^2\, h_i}{p\,(1 - h_i)^2} $$
where $r_i$ is the Pearson residual, $h_i$ the HMD, and p the number of coefficients.
CD is big if it is more than about the 10% quantile of the chi-squared distribution on p df, divided by p. Calculate the cutoff with qchisq(0.1, p)/p.
But Cook's distance is not all that reliable as a measure here.
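A sketch checking this formula against R's built-in function, which should agree up to numerical error for a binomial GLM:
h <- hatvalues(vaso.glm)
rp <- residuals(vaso.glm, type="pearson")
p <- length(coef(vaso.glm))                  # 3 coefficients here
CD.byhand <- rp^2*h/(p*(1 - h)^2)
all.equal(unname(CD.byhand), unname(cooks.distance(vaso.glm)))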

© Department of Statistics 2012 STATS 330 Lecture 23: Slide 29 Cook's D: calculating and plotting
p <- 3                        # number of coefficients
CD <- cooks.distance(vaso.glm)
plot(CD, ylab="Cook's D", type="h", main="index plot of Cook's distances")
text(CD, cex=0.7)
bigcook <- qchisq(0.1, p)/p   # cutoff from the previous slide
abline(h=bigcook, lty=2)

© Department of Statistics 2012 STATS 330 Lecture 23: Slide 30
[Figure: index plot of Cook's distances; points 4 and 18 are influential.]

© Department of Statistics 2012 STATS 330 Lecture 23: Slide 31 Diagnostics: leave-one-out deviance change
If the i-th covariate pattern is left out, the change in the deviance is approximately
$$ \Delta D_i \approx d_i^2 + \frac{r_i^2\, h_i}{1 - h_i} $$
where $d_i$ is the deviance residual, $r_i$ the Pearson residual and $h_i$ the HMD.
The change is big if it is more than about 4.

© Department of Statistics 2012 STATS 330 Lecture 23: Slide 32 Deviance change: calculating and plotting
> dev.r <- residuals(vaso.glm, type="deviance")
> Dev.change <- dev.r^2 + pear.r^2*HMD/(1-HMD)
> plot(Dev.change, ylab="Deviance change", type="h")
> text(Dev.change, cex=0.7)
> bigdev <- 4
> abline(h=bigdev, lty=2)

© Department of Statistics 2012 STATS 330 Lecture 23: Slide 33
[Figure: index plot of the deviance changes; points 4 and 18 are influential.]

© Department of Statistics 2012 STATS 330 Lecture 23: Slide 34 All together
> influenceplots(vaso.glm)
(influenceplots, like HLstat on slide 45, is presumably one of the "330 functions" supplied with the course.)

© Department of Statistics 2012 STATS 330 Lecture 23: Slide 35 Should we delete points?
How influential are the 3 points? We can delete each in turn and examine the changes in the coefficients and predicted probabilities.
First, the coefficients:
[Table: estimates of (Intercept), log(Volume) and log(Rate) with no points deleted, with points deleted in turn, and with all 3 deleted.]

© Department of Statistics 2012 STATS 330 Lecture 23: Slide 36 Should we delete points (2)?
Next, the fitted probabilities:
[Table: fitted probabilities at points 4, 18 and 31 with no points deleted, with 4 and 18 deleted, and with all 3 deleted.]
Conclusion: points 4 and 18 have a big effect.

© Department of Statistics 2012 STATS 330 Lecture 23: Slide 37 Should we delete points (3)?
Should we delete them? They could be genuine – there is no real evidence they are wrong.
If we delete them, we increase the regression coefficients and make the fitted probabilities more extreme, overstating the predictive ability of the model.

© Department of Statistics 2012 STATS 330 Lecture 23: Slide 38 Residuals for ungrouped data
If all cases have distinct covariate patterns, the residuals lie along two curves (corresponding to success and failure) and have little or no diagnostic value: there is a pattern even when everything is OK.
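A small simulation sketch of this effect: even when the fitted logistic model is exactly the true model, the ungrouped residuals fall on two curves.
set.seed(330)
x <- runif(200, -2, 2)
y <- rbinom(200, size=1, prob=plogis(0.5 + 1.5*x))   # data from a true logistic model
sim.glm <- glm(y ~ x, family=binomial)
plot(fitted(sim.glm), residuals(sim.glm, type="pearson"),
     xlab="fitted probability", ylab="Pearson residual")
# two bands appear: upper curve from the y = 1 cases, lower from the y = 0 cases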

© Department of Statistics 2012 STATS 330 Lecture 23: Slide 39 Formulas
Pearson residuals: for ungrouped data, the residual for the i-th case is
$$ r_i = \frac{y_i - \hat{\pi}_i}{\sqrt{\hat{\pi}_i(1-\hat{\pi}_i)}} $$

© Department of Statistics 2012 STATS 330 Lecture 23: Slide 40 Formulas (cont)
Deviance residuals: for ungrouped data, the residual for the i-th case is
$$ d_i = \operatorname{sign}(y_i - \hat{\pi}_i)\sqrt{-2\left[\, y_i\log\hat{\pi}_i + (1-y_i)\log(1-\hat{\pi}_i) \right]} $$
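A sketch checking both ungrouped formulas against residuals(), reusing the simulated fit from the slide-38 sketch:
pihat <- fitted(sim.glm)
rp <- (y - pihat)/sqrt(pihat*(1 - pihat))
rd <- sign(y - pihat)*sqrt(-2*(y*log(pihat) + (1 - y)*log(1 - pihat)))
all.equal(unname(rp), unname(residuals(sim.glm, type="pearson")))    # should be TRUE
all.equal(unname(rd), unname(residuals(sim.glm, type="deviance")))   # should be TRUE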

© Department of Statistics 2012 STATS 330 Lecture 23: Slide 41 Use of the plot function
> plot(kyphosis.glm)

© Department of Statistics 2012 STATS 330 Lecture 23: Slide 42 Analogue of R²?
There is no satisfactory analogue of R² for logistic regression.
For the "small m, big n" situation we can use the residual deviance, since we can obtain an approximate p-value.
For other situations we can use the Hosmer-Lemeshow statistic (next slide).
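A sketch of the deviance-based check for grouped data, using the hypothetical grouped fit from the slide-7 sketch:
1 - pchisq(deviance(fit), df.residual(fit))   # approximate goodness-of-fit p-value
# small p-values suggest lack of fit; this approximation is not
# trustworthy for ungrouped 0/1 data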

© Department of Statistics 2012 STATS 330 Lecture 23: Slide 43 Hosmer-Lemeshow statistic
How can we judge goodness of fit for ungrouped data? We can use the Hosmer-Lemeshow statistic, which groups the data into cases having similar fitted probabilities:
– Sort the cases in increasing order of fitted probability
– Divide them into 10 (almost) equal groups
– Do a chi-squared test to see if the number of successes in each group matches the estimated probability
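A minimal hand-rolled sketch of the idea (not the course's HLstat implementation; the usual reference distribution is chi-squared on g - 2 df):
hl.sketch <- function(fit, g = 10) {
  p <- fitted(fit)
  y <- fit$y                                     # observed 0/1 responses
  # decile groups of fitted probability (assumes no heavy ties in p)
  grp <- cut(p, quantile(p, seq(0, 1, length=g + 1)), include.lowest=TRUE)
  obs <- tapply(y, grp, sum)                     # observed successes per group
  expd <- tapply(p, grp, sum)                    # expected successes per group
  n <- tapply(y, grp, length)
  HL <- sum((obs - expd)^2/(expd*(1 - expd/n)))  # chi-squared-type statistic
  c(statistic=HL, p.value=1 - pchisq(HL, g - 2))
}
# e.g. hl.sketch(kyphosis.glm)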

© Department of Statistics 2012 STATS 330 Lecture 23: Slide 44 Kyphosis data
[Table: observed 0's, observed 1's, total observations and expected 1's for each of Classes 1-10.]
Note: Expected 1's = total obs × average prob. The fitted probabilities are divided into 10 classes: lowest 10%, next 10%, and so on.

© Department of Statistics 2012 STATS 330 Lecture 23: Slide 45 In R, using the kyphosis data
Result of fitting the model:
> HLstat(kyphosis.glm)
Value of HL statistic = …
P-value = …
A p-value of less than 0.05 indicates problems. No problem is indicated for the kyphosis data – the logistic model appears to fit OK.
The function HLstat is in the "330 functions".