Validation of Regression Models

Chapter 11: Validation of Regression Models
Linear Regression Analysis 5E, Montgomery, Peck & Vining

11.1 Introduction
A regression equation is not always used for the purpose it was created for.
Model adequacy checking: residual analysis, lack-of-fit testing, and identifying influential observations. It checks the fit of the model to the available data.
Model validation: determining whether the model will behave or function as intended in its operating environment.

11.2 Validation Techniques
1. Analysis of model coefficients and predicted values
- Check for "inappropriate" signs on the coefficients.
- Check for unusual magnitudes of the coefficients.
- Check for stability in the coefficient estimates.
- Check the predicted values: do they make sense given the nature of the data?
2. Collection of new data
- Usually 15-20 new observations are adequate.

Example 11.1 The Hald Cement Data
The coefficients of x1 are very similar; the coefficients of x2 and the intercepts are moderately different. How different are the predicted values?

Which model would you prefer?

Example 11.2 The Delivery Time Data
Compare the residual mean square of the fitted model with the average squared prediction error on the new data.

New data: average squared prediction error
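The comparison between the residual mean square and the average squared prediction error can be sketched in a few lines of NumPy. This is a minimal illustration, not the textbook's computation: the data are synthetic (not the delivery time data), and the helper names fit_ols, predict, residual_mean_square and avg_squared_prediction_error are hypothetical.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares fit with an intercept; returns the coefficient vector."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

def predict(beta, X):
    """Predicted responses for the regressor matrix X."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ beta

def residual_mean_square(X, y, beta):
    """MS_Res = SS_Res / (n - p), computed on the data used for fitting."""
    resid = y - predict(beta, X)
    return float(resid @ resid) / (len(y) - len(beta))

def avg_squared_prediction_error(X_new, y_new, beta):
    """Mean of squared prediction errors on observations not used for fitting."""
    err = y_new - predict(beta, X_new)
    return float(err @ err) / len(y_new)

# Synthetic stand-in data (assumption): 25 fitting observations, 15 new ones.
rng = np.random.default_rng(0)
X_est = rng.uniform(0, 10, size=(25, 2))
y_est = 2.0 + 1.5 * X_est[:, 0] + 0.5 * X_est[:, 1] + rng.normal(0, 1, 25)
X_new = rng.uniform(0, 10, size=(15, 2))
y_new = 2.0 + 1.5 * X_new[:, 0] + 0.5 * X_new[:, 1] + rng.normal(0, 1, 15)

beta = fit_ols(X_est, y_est)
ms_res = residual_mean_square(X_est, y_est, beta)
aspe = avg_squared_prediction_error(X_new, y_new, beta)
print(f"MS_Res = {ms_res:.3f}, average squared prediction error = {aspe:.3f}")
```

An average squared prediction error much larger than MS_Res would suggest that the model predicts new data less well than it fits the original data.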

How does this compare with the R² for prediction based on PRESS?

11.2 Validation Techniques
3. Data splitting (also known as cross validation)
- Divide the data into two parts: estimation data and prediction data.
- The PRESS statistic is an estimate of performance based on data splitting.
- PRESS can also be used to compute an R²-type statistic for prediction: R²(prediction) = 1 - PRESS / SS_T, where SS_T is the total sum of squares.
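As a rough sketch of how PRESS and the associated R² for prediction can be computed, the code below uses the standard identity that the i-th PRESS residual equals e_i / (1 - h_ii), where h_ii is the i-th diagonal element of the hat matrix, so no model needs to be refitted n times. The function names press_statistic and r2_prediction are hypothetical, and the regressor matrix is assumed to have full column rank.

```python
import numpy as np

def press_statistic(X, y):
    """PRESS = sum of squared leave-one-out prediction errors."""
    A = np.column_stack([np.ones(len(y)), X])   # add intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta                        # ordinary residuals e_i
    hat = A @ np.linalg.pinv(A)                 # hat matrix H = A (A'A)^-1 A'
    h = np.diag(hat)                            # leverages h_ii
    press_resid = resid / (1.0 - h)             # PRESS residuals e_i / (1 - h_ii)
    return float(press_resid @ press_resid)

def r2_prediction(X, y):
    """R^2 for prediction: 1 - PRESS / SS_T."""
    ss_total = float(((y - y.mean()) ** 2).sum())
    return 1.0 - press_statistic(X, y) / ss_total
```

Because each PRESS residual is the error made when predicting observation i from a model fitted without it, the resulting R² indicates roughly how much of the variability in new observations the model might be expected to explain.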

11.2 Validation Techniques
3. Data splitting (also known as cross validation)
- If the time sequence is known, the data can be split by time order (common in time series and forecasting).
- The split can also follow other characteristics of the data (for example, grouping by operator, machine, or location).
- Double cross validation
- Drawbacks?
- A more formal approach? The DUPLEX algorithm
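Double cross validation on a time-ordered split can be sketched as follows: the data are divided into an earlier and a later half, a model is fitted on each half, and each model is used to predict the other half, giving two evaluations of performance. This is a minimal sketch under assumptions not stated on the slide: rows are already in time order, the split is at the midpoint, and the names fit_ols, predict and double_cross_validation are hypothetical. It does not implement DUPLEX.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares fit with an intercept; returns the coefficient vector."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

def predict(beta, X):
    """Predicted responses for the regressor matrix X."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ beta

def double_cross_validation(X, y):
    """Fit on each half of the time-ordered data and predict the other half."""
    half = len(y) // 2
    first, second = slice(0, half), slice(half, None)
    results = {}
    for name, est, val in [
        ("fit_first_predict_second", first, second),
        ("fit_second_predict_first", second, first),
    ]:
        beta = fit_ols(X[est], y[est])            # estimation data
        err = y[val] - predict(beta, X[val])      # prediction data
        results[name] = {
            "beta": beta,
            "avg_sq_pred_error": float(err @ err) / len(err),
        }
    return results
```

Comparing the two coefficient vectors gives a rough check on coefficient stability, and the two prediction errors show how well each half predicts the other. Each model is fitted on only half of the observations, so each half needs enough observations to estimate all coefficients.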

Example 11.3 The Delivery Time Data
A portion of Table 11.3, showing the prediction and estimation data determined with DUPLEX.

A portion of Table 11.4 is reproduced here.

Example 11.3 The Delivery Time Data
