Regression Analysis Part D Model Building

Presentation transcript:

Regression Analysis Part D Model Building Read Chapters 3, 4 and 5 of Forecasting and Time Series, An Applied Approach. L01D MGS 8110 - Regression - Model Building

Regression Analysis Modules
Part A – Basic Model & Parameter Estimation
Part B – Calculation Procedures
Part C – Inference: Confidence Intervals & Hypothesis Testing
Part D – Goodness of Fit
Part E – Model Building
Part F – Transformed Variables
Part G – Standardized Variables
Part H – Dummy Variables
Part I – Eliminating Intercept
Part J – Outliers
Part K – Regression Example #1
Part L – Regression Example #2
Part N – Non-linear Regression
Part P – Non-linear Example R
L01D MGS 8110 - Regression - Goodness of Fit

Overview of Goodness of Fit: Standard Error; Prediction Interval; Validation; C Statistic; R2adjusted, Adjusted Coefficient of Determination.

Goodness of Fit
Primary Measures: R2, Coefficient of Determination; se, Standard Error of Regression; Prediction Interval; Validation of Fit; C Statistic.
Secondary Measures: R2adjusted, Adjusted Coefficient of Determination; R, correlation between observed and predicted; Press Statistic.

Goodness of Fit: R2 – Coefficient of Determination. Varies between 0 and 1. “Is the proportion of the variability in the dependent variable that is accounted for by the regression equation.” Calculated as $R^2 = 1 - SSE/SST = SSR/SST$, where $SSE = \sum (y_i - \hat{y}_i)^2$, $SSR = \sum (\hat{y}_i - \bar{y})^2$ and $SST = \sum (y_i - \bar{y})^2$.
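
As a rough illustration of the R2 calculation, the sketch below fits a one-variable regression by least squares and computes R2 in the two usual ways, 1 - SSE/SST and SSR/SST. The data values and the function names (`fit_line`, `r_squared_both_ways`) are made up for this example and do not come from the slides.

```python
def fit_line(x, y):
    """Least-squares intercept b0 and slope b1 for one independent variable."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


def r_squared_both_ways(x, y):
    """Return (1 - SSE/SST, SSR/SST) for the line fitted to this same sample."""
    b0, b1 = fit_line(x, y)
    y_bar = sum(y) / len(y)
    y_hat = [b0 + b1 * xi for xi in x]
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # unexplained variation
    ssr = sum((fi - y_bar) ** 2 for fi in y_hat)           # explained variation
    sst = sum((yi - y_bar) ** 2 for yi in y)               # total variation
    return 1 - sse / sst, ssr / sst


# Illustrative data only (an assumption, not taken from the presentation).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r2_from_sse, r2_from_ssr = r_squared_both_ways(x, y)
print(r2_from_sse, r2_from_ssr)
```

On the sample used to fit the equation, the two forms agree up to floating-point rounding.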

Goodness of Fit: se – Standard Error of Regression. se is the standard deviation of the residuals. If the residuals are approximately normal, about 68%, 95% and 99.7% of them will lie within plus or minus 1, 2 and 3 se, respectively. Calculated as $s_e = \sqrt{SSE/(n - k)}$, where $SSE = \sum (y_i - \hat{y}_i)^2$, n is the sample size and k = p + 1 is the number of estimated coefficients (p independent variables plus the intercept). se does not necessarily get smaller as n gets larger.
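
A minimal sketch of the se calculation for a one-variable regression, assuming the usual definition se = sqrt(SSE / (n - k)) with k = 2 (slope plus intercept). The data and the function name `standard_error_of_regression` are illustrative assumptions.

```python
import math


def standard_error_of_regression(x, y):
    """se = sqrt(SSE / (n - 2)) for a simple (one-variable) regression."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    return math.sqrt(sse / (n - 2))  # two estimated coefficients: b0 and b1


# Illustrative data only (an assumption, not taken from the presentation).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
se = standard_error_of_regression(x, y)
print(round(se, 4))
```

Unlike R2, se is expressed in the units of the dependent variable, which is one reason the two measures are compared side by side.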

Goodness of Fit: Comparison of R2 and se

Goodness of Fit: Prediction Interval. The interval is computed around the forecast at Xf and includes se, so it is a more encompassing measure of goodness of fit than se alone. It is more difficult to calculate than se and is not a unique value (there is a prediction interval for every possible Xf).
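
The slide's own formula is not preserved in this transcript. As a hedged sketch, the textbook prediction interval for a one-variable regression is y_hat(Xf) ± t · se · sqrt(1 + 1/n + (Xf - x_bar)² / Sxx). The code below assumes that form; the critical value `t_crit` is supplied by the caller (for example from a t table with n - 2 degrees of freedom), and the data are invented.

```python
import math


def prediction_interval(x, y, x_f, t_crit):
    """Prediction interval at x_f for a simple regression (textbook form)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(sse / (n - 2))
    y_f = b0 + b1 * x_f
    # The half-width grows as x_f moves away from the mean of x.
    half_width = t_crit * se * math.sqrt(1 + 1 / n + (x_f - x_bar) ** 2 / sxx)
    return y_f - half_width, y_f + half_width


# Illustrative data; 3.182 is the two-sided 95% t value for 3 degrees of freedom.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
low_mid, high_mid = prediction_interval(x, y, x_f=3, t_crit=3.182)
low_edge, high_edge = prediction_interval(x, y, x_f=5, t_crit=3.182)
print((low_mid, high_mid), (low_edge, high_edge))
```

The interval at Xf = 5 is wider than the one at Xf = 3 (the mean of x), which illustrates why there is no single prediction interval for a model.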

Goodness of Fit: C Statistic. Must be calculated manually in Excel & SPSS, with k = p + 1, where p = # variables in the transformed database. Model selection criterion: among models with C <= k, select the one with the smallest C value; a model is poor if C > k. A less asymptotic measure of fit than R2. Explicitly takes into consideration the number of variables in the model. Does not consider the intrinsic characteristics of the variables.
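
The formula itself did not survive in the transcript. Assuming the slide's C statistic is Mallows' Cp, a common form is C = SSE_subset / MSE_full - (n - 2k), where MSE_full is the mean squared error of the model containing all candidate variables. The sketch below applies that assumed form together with the slide's C <= k rule; all numbers are hypothetical.

```python
def mallows_c(sse_subset, mse_full, n, k):
    """Mallows' Cp (assumed to be the slide's C): SSE_subset/MSE_full - (n - 2k)."""
    return sse_subset / mse_full - (n - 2 * k)


def passes_rule(c, k):
    """Slide's selection rule: a model is a candidate only if C <= k."""
    return c <= k


# Hypothetical values: n = 20 observations, full-model MSE = 2.0.
n, mse_full = 20, 2.0
c_two_vars = mallows_c(sse_subset=36.0, mse_full=mse_full, n=n, k=3)    # p = 2
c_three_vars = mallows_c(sse_subset=32.0, mse_full=mse_full, n=n, k=4)  # p = 3
print(c_two_vars, passes_rule(c_two_vars, 3))
print(c_three_vars, passes_rule(c_three_vars, 4))
```

Under this form the full model always has C equal to its own k, so the statistic is mainly informative for comparing subsets of the candidate variables.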

Goodness of Fit: Validation of Fit (1 of 5). Collect a sample of data and fit a regression equation to the data. Collect a second, comparable sample of data and see how well the previously derived regression equation agrees with this data. Do not actually fit a regression equation to the second set of data. Manually calculate the R2 (and se) for the second data set.

Goodness of Fit: Validation of Fit (2 of 5). Calculate the R2 of the second sample from its observed values and the predictions of the original regression equation, and see if this R2 is almost as good as the original R2.

Goodness of Fit: Validation of Fit (3 of 5). The two alternative formulas for R2 will give identical results for the original sample but different results for the validation sample. An infeasible R2 (less than 0 or greater than 1) may be obtained for the validation sample if the second formula is used. This is more likely to occur if the sample size for the validation sample is very small.
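
To illustrate the point about the two formulas, the sketch below fits a line on one sample and then scores a second sample with the same coefficients, without refitting. It assumes the two alternative forms are 1 - SSE/SST and SSR/SST, with the mean taken over the validation sample; the transcript does not preserve which form the slide calls the second one. Both data sets are invented.

```python
def fit_line(x, y):
    """Least-squares intercept b0 and slope b1 for one independent variable."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


def r2_two_forms(b0, b1, x, y):
    """Score (x, y) with fixed coefficients; return (1 - SSE/SST, SSR/SST)."""
    y_bar = sum(y) / len(y)
    y_hat = [b0 + b1 * xi for xi in x]
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    ssr = sum((fi - y_bar) ** 2 for fi in y_hat)
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - sse / sst, ssr / sst


# Original sample: used to estimate the equation.
x_orig, y_orig = [1, 2, 3, 4, 5], [2, 4, 5, 4, 5]
b0, b1 = fit_line(x_orig, y_orig)

# Validation sample: scored with the original b0 and b1, never refitted.
x_val, y_val = [1.5, 2.5, 3.5, 4.5], [3, 3, 5, 5]

orig_a, orig_b = r2_two_forms(b0, b1, x_orig, y_orig)
val_a, val_b = r2_two_forms(b0, b1, x_val, y_val)
print(orig_a, orig_b)  # the two forms agree on the original sample
print(val_a, val_b)    # the two forms disagree on the validation sample
```

The forms agree on the original sample only because least squares makes SST = SSR + SSE there; that identity does not hold for data the equation was not fitted to.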

Goodness of Fit: Validation of Fit (4 of 5). The sum of squares forms (shown on the right) of the deviation squared formulas are also not valid for the validation database. (The slide marks one formula “USE” and three formulas “Do NOT use”.)

Goodness of Fit: Validation of Fit (5 of 5). Likewise, the matrix form of the sum-of-squares formulas should NOT be used. (Both matrix formulas shown on the slide are marked “Do NOT use”.) None of the matrix formulations can be used.

Goodness of Fit: Validation of Fit – Numerical Example (1 of 5)

Goodness of Fit: Validation of Fit – Numerical Example (2 of 5). Using Algebraic Formulas on Original Data.

Goodness of Fit: Validation of Fit – Numerical Example (3 of 5). Using Matrix Formulas on Original Data. Same values, so same R-sq’s.

Goodness of Fit: Validation of Fit – Numerical Example (4 of 5). Using Algebraic Formulas on Verification Data.

Goodness of Fit: Validation of Fit – Numerical Example (5 of 5). Using Matrix Formulas on Verification Data.

Goodness of Fit: R2a – Adjusted (Corrected) Coefficient of Determination. A revised method of calculating an R2 “type” of metric. The advantage of this metric is that R2adjusted will decrease when a variable of marginal value is added to a regression equation. The disadvantage of this metric is that it does not have a meaningful physical interpretation. In particular, R2adjusted does NOT represent the percentage of the total variation that is explained by the regression equation. A second disadvantage is that the decrease in the R2adjusted value is very small and there are no decision rules to interpret the reduction in the R2adjusted value.

Goodness of Fit: R2a – Adjusted (Corrected) Coefficient of Determination. Alternative calculation procedures.
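
The slide's alternative procedures are not preserved in the transcript. As a sketch under that caveat, two standard and algebraically equivalent ways to compute the adjusted R2 are 1 - (1 - R2)(n - 1)/(n - k) and 1 - [SSE/(n - k)] / [SST/(n - 1)], with k = p + 1. The function names and input values are illustrative assumptions.

```python
def adjusted_r2_from_r2(r2, n, k):
    """Adjusted R2 from the ordinary R2, sample size n and k = p + 1."""
    return 1 - (1 - r2) * (n - 1) / (n - k)


def adjusted_r2_from_sums(sse, sst, n, k):
    """Adjusted R2 as one minus the ratio of two variance estimates."""
    return 1 - (sse / (n - k)) / (sst / (n - 1))


# Illustrative values: SSE = 2.4, SST = 6.0 (so R2 = 0.6), n = 5, one variable.
n, k = 5, 2
adj_a = adjusted_r2_from_r2(r2=0.6, n=n, k=k)
adj_b = adjusted_r2_from_sums(sse=2.4, sst=6.0, n=n, k=k)
print(adj_a, adj_b)
```

Both routes give the same value, which is lower than the unadjusted 0.6 because of the degrees-of-freedom penalty.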


Goodness of Fit: R, Multiple Correlation Coefficient. Correlation between the observed y’s and predicted y’s. Calculated as $R = \sqrt{R^2}$.
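
A small check of this relationship, assuming a least-squares fit with an intercept: the Pearson correlation between observed and predicted y values should equal the square root of R2. The data and helper names are invented for the illustration.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    a_bar, b_bar = sum(a) / n, sum(b) / n
    cov = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    var_a = sum((ai - a_bar) ** 2 for ai in a)
    var_b = sum((bi - b_bar) ** 2 for bi in b)
    return cov / math.sqrt(var_a * var_b)


def fitted_values(x, y):
    """Predicted y values from a one-variable least-squares fit."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return [b0 + b1 * xi for xi in x]


# Illustrative data only (an assumption, not taken from the presentation).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
y_hat = fitted_values(x, y)
multiple_r = pearson(y, y_hat)
print(multiple_r, multiple_r ** 2)
```

For this sample R2 is 0.6, so R comes out as the square root of 0.6.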

Goodness of Fit: Press Statistic. Not calculated in Excel & SPSS and very tedious to calculate manually. Does not provide a traditional measure of Goodness of Fit. Rather, it provides (1) a metric as to whether a model may be effective for predictions at extreme points in the database, and (2) an indication of which data points could be considered extreme points in the database.
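
The slides give no formula here. A minimal sketch, assuming the usual definition of PRESS as the sum of squared leave-one-out prediction errors: each observation is dropped in turn, the line is refitted on the rest, and the dropped point is predicted. The data and function names are illustrative.

```python
def fit_line(x, y):
    """Least-squares intercept b0 and slope b1 for one independent variable."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


def press_residuals(x, y):
    """Leave-one-out prediction error for every observation."""
    errors = []
    for i in range(len(x)):
        x_rest = x[:i] + x[i + 1:]
        y_rest = y[:i] + y[i + 1:]
        b0, b1 = fit_line(x_rest, y_rest)  # refit without observation i
        errors.append(y[i] - (b0 + b1 * x[i]))
    return errors


# Illustrative data only (an assumption, not taken from the presentation).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
loo_errors = press_residuals(x, y)
press = sum(e ** 2 for e in loo_errors)
print(press)
print([round(e, 3) for e in loo_errors])
```

The individual leave-one-out errors show which points are hardest to predict from the others; here the largest one belongs to the first observation, at the edge of the x range.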