Chapter 4, Regression Diagnostics: Detection of Model Violation

Presentation transcript:

Chapter 4, Regression Diagnostics: Detection of Model Violation (Lecture 13)
In Chapter 2 we studied SLR (simple linear regression) models, and in Chapter 3 we studied MLR (multiple linear regression) models. We obtained many useful results about estimation and statistical inference. However, all these results are valid only when certain required assumptions are satisfied.
Questions:
1. What are these required assumptions?
2. How can we detect whether these assumptions are violated?
3. What happens if these assumptions are violated?
2/19/2019 ST3131, Lecture 13

Answer to Question 3: when some required assumptions are violated,
1). Theories are no longer valid.
2). Applications will lead to ____ results.
We will see some examples later. Thus we need to answer Questions 1 and 2.
Aims of Chapter 4:
1). State the standard regression assumptions.
2). Study the methods used to detect model violations.

Standard Regression Assumptions:
1. The Linearity Assumption (about the form of the model)
2. The Measurement Error Assumption (about the measurement errors)
3. The Predictor Assumption (about the predictor variables)
4. The Observation Assumption (about the observations)
1. The Linearity Assumption (the Form Assumption): the i-th observation can be written as $Y_i = \beta_0 + \beta_1 X_{i1} + \cdots + \beta_p X_{ip} + \varepsilon_i$, $i = 1, \ldots, n$.
Detection Method: For SLR (p=1), use a scatter plot of Y against X to check linearity. A roughly linear scatter plot supports the linearity assumption. For MLR (p>1), checking is more difficult; we may be able to use the scatter plots of Y against each of X1, X2, …, Xp.

2. The Measurement Error Assumption: the errors $\varepsilon_1, \ldots, \varepsilon_n$ are iid $N(0, \sigma^2)$, where iid means independently and identically distributed. This assumption implies 4 sub-assumptions:
1. Assumption: the errors are normally distributed. Detection Method: normal probability plot of the residuals.
2. Assumption: the errors have mean 0. Detection Method: ____.
3. Assumption: the errors have the same but unknown variance $\sigma^2$. When this assumption is violated, the problem is called the heterogeneity or heteroscedasticity problem. Detection Method: see Chapter 7.
4. Assumption: the errors are independent of each other. When this assumption is violated, the problem is called the autocorrelation problem. Detection Method: see Chapter 8.
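The normal probability plot mentioned for the normality assumption can be sketched as follows. This is only an illustration under stated assumptions: the residual values are hypothetical, the plotting positions (i - 0.5)/n are one common convention among several, and the helper names `normal_scores`, `normal_plot_points` and `correlation` are invented for this example. The idea is to pair the sorted residuals with standard normal quantiles; if the points fall close to a straight line (correlation near 1), the residuals look roughly normal.

```python
from statistics import NormalDist, mean


def normal_scores(n):
    """Standard normal quantiles at the plotting positions (i - 0.5) / n."""
    return [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]


def normal_plot_points(residuals):
    """Pair each sorted residual with its theoretical normal quantile."""
    ordered = sorted(residuals)
    return list(zip(normal_scores(len(ordered)), ordered))


def correlation(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    mu, mv = mean(u), mean(v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / (suu * svv) ** 0.5


# Hypothetical residuals, used only to illustrate the computation.
residuals = [-1.2, 0.3, 0.8, -0.4, 0.1, 1.5, -0.9, 0.2, -0.3, 0.6]

points = normal_plot_points(residuals)
# Straightness of the plot, summarised by a correlation coefficient.
straightness = correlation([q for q, _ in points], [e for _, e in points])
print(f"correlation of normal plot points: {straightness:.3f}")
```

Plotting `points` (quantile on the horizontal axis, residual on the vertical axis) gives the normal probability plot itself; the correlation is only a rough numerical companion to the visual check.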

3. The Predictor Assumption contains 3 sub-assumptions:
a). The Non-random Assumption: X1, X2, …, Xp are assumed to be non-random, that is, selected in advance.
Design data: ____. Non-design data or observational data: ____.
When this assumption is violated, all inferences are still valid, conditional on the observed data. In this course, we assume this condition is always satisfied. Detection Method: beyond our consideration in this course.
b). The Without-Measurement-Error Assumption: X1, X2, …, Xp can be accurately observed, that is, measured without errors. This assumption is rarely satisfied in practice. If it is violated, the measurement errors will affect the residual variance, the coefficient estimates and the fitted values. In this course, we assume this condition is always satisfied. Detection Method: beyond our consideration in this course.

c). The Linear Independence Assumption: X1, X2, …, Xp are assumed to be linearly independent of each other. This assumption guarantees the uniqueness of the least squares estimates of the regression coefficients. When this assumption is violated, there are multiple solutions for the least squares estimates of the regression coefficients. Detection Method: check whether the design matrix is of full rank.
4. The Observation Assumption: all observations are equally reliable and play approximately equal roles in determining the regression results and influencing the conclusions.
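The full-rank check for the linear independence assumption can be sketched as below. This is a minimal illustration, not the course's method: the function `matrix_rank`, the tolerance `tol`, and both small design matrices are assumptions made up for the example. The rank is found by Gaussian elimination with partial pivoting; a design matrix with an intercept column and p predictors has full rank when the rank equals p + 1.

```python
def matrix_rank(rows, tol=1e-10):
    """Rank of a matrix (list of rows) via Gaussian elimination."""
    a = [[float(v) for v in row] for row in rows]
    n_rows, n_cols = len(a), len(a[0])
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        # Choose the largest remaining entry in this column as the pivot.
        pivot = max(range(rank, n_rows), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < tol:
            continue  # no usable pivot: this column depends on earlier ones
        a[rank], a[pivot] = a[pivot], a[rank]
        for r in range(rank + 1, n_rows):
            factor = a[r][col] / a[rank][col]
            for c in range(col, n_cols):
                a[r][c] -= factor * a[rank][c]
        rank += 1
    return rank


# Hypothetical design matrices: columns are intercept, X1, X2.
independent = [[1, 1, 2], [1, 2, 1], [1, 3, 5], [1, 4, 3]]
collinear = [[1, 1, 2], [1, 2, 4], [1, 3, 6], [1, 4, 8]]  # X2 = 2 * X1

rank_independent = matrix_rank(independent)
rank_collinear = matrix_rank(collinear)
print(rank_independent, rank_collinear)
```

In the second matrix X2 is an exact multiple of X1, so the rank falls below the number of columns and the least squares estimates are not unique.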

Consequences of the Violations of the Assumptions: In general, small violations of the assumptions do not invalidate the inference or conclusions too much. However, gross violations will distort the conclusions. Thus, we should study how to detect these violations. Let us see some examples below: Anscombe's Quartet data.

   Y1    X1     Y2    X2     Y3    X3     Y4    X4
  8.04   10    9.14   10    7.46   10    6.58    8
  6.95    8    8.14    8    6.77    8    5.76    8
  7.58   13    8.74   13   12.74   13    7.71    8
  8.81    9    8.77    9    7.11    9    8.84    8
  8.33   11    9.26   11    7.81   11    8.47    8
  9.96   14    8.10   14    8.84   14    7.04    8
  7.24    6    6.13    6    6.08    6    5.25    8
  4.26    4    3.10    4    5.39    4   12.50   19
 10.84   12    9.13   12    8.15   12    5.56    8
  4.82    7    7.26    7    6.42    7    7.91    8
  5.68    5    4.74    5    5.73    5    6.89    8
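To see why summary statistics alone can hide violations, the following sketch fits a least squares line to each of the four Anscombe data sets from the table and prints the intercept, slope and correlation. The data values are copied from the table; the helper `fit_line` and the variable names are assumptions introduced for this illustration.

```python
from statistics import mean

x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]  # X1 = X2 = X3
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
quartet = {
    1: (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    2: (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    3: (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    4: (x4, [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def fit_line(x, y):
    """Least squares intercept, slope and correlation for simple regression."""
    xbar, ybar = mean(x), mean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    syy = sum((yi - ybar) ** 2 for yi in y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    r = sxy / (sxx * syy) ** 0.5
    return intercept, slope, r


summary = {k: fit_line(x, y) for k, (x, y) in quartet.items()}
for k, (b0, b1, r) in summary.items():
    print(f"set {k}: intercept={b0:.2f} slope={b1:.2f} r={r:.3f}")
```

The four fits agree to about two decimal places, even though scatter plots of the four data sets look very different.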

Methods of Detecting Violations:
a). Using graphical methods.
b). Using some statistical measures (we will learn some of them soon).
c). Combining a) and b).
Most of the above methods for detecting assumption violations are residual-based or use residual plots. Residual plots can reveal many features of the data that might be missed or overlooked when using summary statistics alone. For example, many widely used statistics, such as the correlation coefficients and the regression coefficients, are essentially the same for all 4 data sets of the Anscombe data. Thus, we need to study the residuals.
Various types of residuals:
a). Ordinary Residuals
b). Standardized Residuals
c). Studentized Residuals (Internal and External)

1. Ordinary Residuals

2. The Standardized Residuals

3. The Studentized Residuals: a). Internally Studentized Residuals b). Externally Studentized Residuals
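The slide formulas for these residuals are not preserved in the transcript, so the sketch below assumes the usual textbook definitions for simple linear regression (p = 1); they may differ in notation from the ones used in the lecture. The assumed definitions are: ordinary residual $e_i = y_i - \hat{y}_i$; leverage $p_{ii} = 1/n + (x_i - \bar{x})^2 / \sum_j (x_j - \bar{x})^2$; internally studentized residual $r_i = e_i / (\hat{\sigma}\sqrt{1 - p_{ii}})$ with $\hat{\sigma}^2 = SSE/(n - p - 1)$; externally studentized residual $r_i^* = e_i / (\hat{\sigma}_{(i)}\sqrt{1 - p_{ii}})$, where $\hat{\sigma}_{(i)}^2 = (SSE - e_i^2/(1 - p_{ii}))/(n - p - 2)$ is the variance estimate with observation i deleted. The function name `slr_residuals` is hypothetical, and the data are the third Anscombe set (Y3, X3).

```python
from math import sqrt
from statistics import mean

# Third Anscombe data set (X3, Y3); it contains one clear outlier.
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y = [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]


def slr_residuals(x, y):
    """Ordinary, internally and externally studentized residuals for SLR."""
    n, p = len(x), 1
    xbar, ybar = mean(x), mean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    intercept = ybar - slope * xbar

    ordinary = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    leverage = [1 / n + (xi - xbar) ** 2 / sxx for xi in x]
    sse = sum(e * e for e in ordinary)
    sigma_hat = sqrt(sse / (n - p - 1))

    internal = [e / (sigma_hat * sqrt(1 - h)) for e, h in zip(ordinary, leverage)]

    external = []
    for e, h in zip(ordinary, leverage):
        # SSE of the fit with this observation deleted (assumed identity).
        sse_deleted = sse - e * e / (1 - h)
        sigma_deleted = sqrt(sse_deleted / (n - p - 2))
        external.append(e / (sigma_deleted * sqrt(1 - h)))
    return ordinary, internal, external


ordinary, internal, external = slr_residuals(x, y)
worst = max(range(len(x)), key=lambda i: abs(internal[i]))
print(f"largest internally studentized residual at x={x[worst]}, y={y[worst]}")
```

Because the deleted-observation variance excludes the suspect point, the externally studentized residual of an outlier is larger in magnitude than its internally studentized residual, which is why it is often preferred for flagging outliers.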

Summary
I. Standard Regression Assumptions:
a). about the form of the model
b). about the measurement errors
c). about the predictor variables
d). about the observations
II. Examples from Anscombe's Quartet data show that
a). Gross violations of the assumptions will lead to serious problems.
b). Summary statistics may miss or overlook features of the data.
III. Types of Residuals
a). Ordinary
b). Standardized
c). Studentized