1 Reg12W G89.2229 Multiple Regression Week 12 (Wednesday) Review of Regression Diagnostics Influence statistics Multicollinearity Examples.

Presentation transcript:

1 Reg12W G89.2229 Multiple Regression, Week 12 (Wednesday)
Review of Regression Diagnostics
» Influence statistics
» Multicollinearity
» Examples

2 Reg12W An issue for survey data: Influence
In balanced experimental designs, main effects of variables are a function of large proportions of observations.
In survey designs, one or a few observations may completely determine the direction and statistical significance of a main effect.
Example: lack of support in miscarriage study
» We found that more than 90% of the women were supported.
» Results depended on 4 women.

3 Reg12W Measures of Influence
Influence takes into account both leverage and discrepancy.
It is calculated by seeing the impact of dropping each observation in turn.
Diff B: How much does each regression weight change when an observation is deleted?
The numerator (the raw change in the weight, before any standardization) may itself be of practical interest, since it can affect whether an effect appears to be significant.
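The leave-one-out idea on this slide can be sketched in a few lines of plain Python. This is a minimal illustration, not the course's own code: the helper names `solve`, `ols`, and `diff_b` and the toy data are assumptions made for the example, and the output is the raw change in each regression weight (the unstandardized numerator), not a standardized statistic.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix (copy)
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ols(rows, y):
    """Least-squares weights; each row already contains the intercept term 1."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)


def diff_b(rows, y):
    """For each observation, the change in every weight when it is deleted."""
    full = ols(rows, y)
    changes = []
    for i in range(len(y)):
        dropped = ols(rows[:i] + rows[i + 1:], y[:i] + y[i + 1:])
        changes.append([bf - bd for bf, bd in zip(full, dropped)])
    return changes


# Hypothetical toy data: nine points on y = 1 + 2x plus one distant, discrepant point.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 20]
y = [3.0, 5.0, 7.0, 9.0, 11.0, 13.0, 15.0, 17.0, 19.0, 10.0]
rows = [[1.0, float(x)] for x in xs]

for i, (d_intercept, d_slope) in enumerate(diff_b(rows, y)):
    print(i, round(d_intercept, 3), round(d_slope, 3))
```

In this toy data set the last observation combines high leverage with a large discrepancy, so deleting it changes the slope far more than deleting any other point.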

4 Reg12W Global Influence
Cook’s distance can be interpreted as a global test of these differences.
It is related to an F distribution on (k+1, n-k-1) df.
It is also related to the square of another global measure, DFFITS.
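As a hedged sketch of how the two global measures relate, the code below computes Cook's distance and DFFITS from the residuals and leverages of an ordinary least-squares fit, using the standard textbook formulas rather than anything taken from the slides. The function names and the toy data are assumptions for illustration. With p = k + 1 weights, Cook's distance equals DFFITS squared times s²(i) / (p s²), where s²(i) is the error variance estimated with observation i deleted.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix (copy)
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def influence(rows, y):
    """Return Cook's D, DFFITS, s^2, and the deleted variances s^2(i)."""
    n, p = len(rows), len(rows[0])  # p = k + 1 (k predictors plus intercept)
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    b = solve(xtx, xty)
    resid = [yi - sum(bj * xj for bj, xj in zip(b, r)) for r, yi in zip(rows, y)]
    s2 = sum(e * e for e in resid) / (n - p)
    # Leverage h_i = x_i' (X'X)^{-1} x_i, the diagonal of the hat matrix.
    hat = [sum(xj * zj for xj, zj in zip(r, solve(xtx, r))) for r in rows]
    cooks, dffits, s2_del = [], [], []
    for e, h in zip(resid, hat):
        s2_i = ((n - p) * s2 - e * e / (1 - h)) / (n - p - 1)
        t_i = e / math.sqrt(s2_i * (1 - h))  # externally studentized residual
        cooks.append(e * e * h / (p * s2 * (1 - h) ** 2))
        dffits.append(t_i * math.sqrt(h / (1 - h)))
        s2_del.append(s2_i)
    return cooks, dffits, s2, s2_del


# Hypothetical toy data: a roughly linear trend with a discrepant last point.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1.2, 1.9, 3.2, 3.8, 5.1, 6.1, 6.8, 12.0]
rows = [[1.0, float(x)] for x in xs]

cooks, dffits, s2, s2_del = influence(rows, y)
for i, (d, f) in enumerate(zip(cooks, dffits)):
    print(i, round(d, 3), round(f, 3))
```

The last observation has both the largest residual and the highest leverage in this example, so it also has the largest Cook's distance.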

5 Reg12W Summary
Check regression diagnostics
» To exercise quality control of your data
» To understand the generalizability of your findings
» To explore new aspects of your data

6 Reg12W Example of Influence
Two predictors that are highly correlated
» Neither one has particular outliers
» Jointly there is a lone point

7 Reg12W List of Observations with Highest CooksD
The first is a point at the extreme of the correlated predictor space.
The second is the point that is isolated in the bivariate plot.
None of these values "ring the bell": F(3,27) = .8 for CooksD.

8 Reg12W Multicollinearity
Problems associated with highly correlated predictors
» In the extreme case, numerical instability
» Problem of interpretation
Indices depend on $R^2_{i|X}$, the multiple R-square of the i-th variable with the other Xs
» Variance Inflation Factor: $\mathrm{VIF}_i = 1/(1 - R^2_{i|X})$
» Tolerance: $1 - R^2_{i|X} = 1/\mathrm{VIF}_i$
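The two indices can be computed directly from their definitions: regress each predictor on the others, take the R-square, and convert it. The sketch below does this in plain Python; the helper names and the three toy predictors (two nearly collinear, one unrelated) are assumptions chosen for illustration.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix (copy)
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def r_squared(rows, y):
    """R-square of a least-squares fit; rows already include the intercept 1."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    b = solve(xtx, xty)
    mean_y = sum(y) / len(y)
    sse = sum((yi - sum(bj * xj for bj, xj in zip(b, r))) ** 2
              for r, yi in zip(rows, y))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - sse / sst


def vif_and_tolerance(columns):
    """Return (VIF, tolerance) for each predictor column (no intercept column)."""
    n = len(columns[0])
    results = []
    for i, target in enumerate(columns):
        others = [c for j, c in enumerate(columns) if j != i]
        rows = [[1.0] + [c[t] for c in others] for t in range(n)]
        tolerance = 1 - r_squared(rows, target)  # 1 - R^2 of X_i on the other Xs
        results.append((1 / tolerance, tolerance))
    return results


# Hypothetical predictors: x2 is x1 plus small noise, x3 is unrelated to both.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x2 = [1.1, 2.3, 2.9, 4.2, 4.8, 6.1, 7.2, 7.9]
x3 = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]

for name, (vif, tol) in zip(["x1", "x2", "x3"], vif_and_tolerance([x1, x2, x3])):
    print(name, round(vif, 2), round(tol, 4))
```

The nearly collinear pair x1 and x2 receive very large VIF values, while x3 stays much closer to 1.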

9 Reg12W Approaches to Multicollinearity
Conceptual: Rethink the set of variables.
» E.g., if measures of anxiety and depression are very highly correlated as predictors, think about whether one is simply interested in distress.
Statistical:
» Principal components analysis
» Factor analysis
» Structural equation analysis

10 Reg12W Regression Diagnostics in Logistic Models
Discrepancy
» There will not be “outliers” due to extreme Y values in logistic regression, since Y is binary.
» Residuals are the difference between Y and the fitted P(Y=1).
» If an event occurs when it is thought to be extremely unlikely, the discrepancy will be large.
Leverage and Influence
» One can study the H matrix as before to identify influential points.
» Cook’s distance has been generalized to logistic regression.
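To make the discrepancy idea concrete, the sketch below computes the raw residual Y - P(Y=1) and the Pearson residual (the raw residual divided by the binomial standard deviation) for a one-predictor logistic model. The coefficients `b0` and `b1` are hypothetical values assumed for illustration rather than estimates from any data in the slides, and the Pearson residual is one common scaling, not necessarily the one used in the course.

```python
import math


def logistic_residuals(xs, ys, b0, b1):
    """Return (fitted probability, raw residual, Pearson residual) per case."""
    results = []
    for x, y in zip(xs, ys):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))  # fitted P(Y=1)
        raw = y - p                                  # always strictly between -1 and 1
        pearson = raw / math.sqrt(p * (1.0 - p))     # scaled by the binomial SD
        results.append((p, raw, pearson))
    return results


# Hypothetical data: the last case is an event (y = 1) at an x value where the
# assumed model says events are very unlikely.
xs = [-3.0, -1.0, 0.0, 1.0, 3.0, -3.0]
ys = [0, 0, 1, 1, 1, 1]
b0, b1 = 0.0, 1.5  # assumed coefficients, for illustration only

for i, (p, raw, pearson) in enumerate(logistic_residuals(xs, ys, b0, b1)):
    print(i, round(p, 3), round(raw, 3), round(pearson, 3))
```

The raw residual can never exceed 1 in absolute value because Y is binary, but the Pearson residual for the unlikely event in the last row is far larger than any other, which is the sense in which the discrepancy "will be large".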