Chapter 14, part C Goodness of Fit.

III. Coefficient of Determination We developed an equation, but we don't really know how well it fits the data. The coefficient of determination gives us a measure of the goodness of fit for an estimated regression equation. The difference between an actual value, yi, and its estimate, ŷi, is called a residual.

A. Total Sum of Squares (SST) If you had to estimate repair cost but had no knowledge of the car's age, what would be your best guess? Probably the mean repair cost. If we subtract the mean from each yi, we calculate the error involved in using the mean to estimate cost. I hope our regression equation does a better job of estimating repair cost than just using the mean!

The calculation of SST SST = Σ(yi − ȳ)². For the 4th observation, the deviation from the mean is 300 − 276 = 24. Compute this deviation for each observation, square it, and sum the squares to get SST = 62,870.
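As a sketch, SST can be computed directly from a list of observations. The data below are made up for illustration; the chapter's actual repair-cost observations are not reproduced in these slides.

```python
def sst(y):
    """Total sum of squares: squared deviations of each yi from the mean."""
    y_bar = sum(y) / len(y)
    return sum((yi - y_bar) ** 2 for yi in y)

# Hypothetical repair costs (not the chapter's data, which the slides omit)
y = [250, 360, 480, 300, 590]
print(sst(y))  # 76520.0
```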

B. Sum of Squares due to Error (SSE) Every observation has a residual. The method of least squares minimizes the sum of the squared residuals. Some observations will be overestimated, some underestimated. A predicted value that is $20 too high is just as large a "miss" as one that is $20 too low, so squaring each residual gives equal weight to positive and negative residuals of equal magnitude.

Take the 4th observation. The estimated repair cost for a 4-year-old car is $351.50, but the actual value is y4 = $300, so the residual is 300 − 351.50 = −51.50. Square the residual for every observation and sum them and you get SSE = 5,867.50. You can also see the variation around the mean, 276.
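A minimal sketch of the SSE computation, checked against the one residual the slide reports (actual $300 vs. predicted $351.50); the full data set is not shown in the slides.

```python
def sse(y, y_hat):
    """Sum of squared residuals between actual and predicted values."""
    return sum((yi - yhi) ** 2 for yi, yhi in zip(y, y_hat))

# The 4th observation from the slide: residual = 300 - 351.50 = -51.50
print(sse([300.0], [351.50]))  # 2652.25, the squared residual
```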

C. Sum of Squares due to Regression (SSR) SSE measures how closely observations are clustered around the regression line, while SST measures how closely they are clustered around the mean. What's left over is called SSR: SST = SSE + SSR, where SSR = Σ(ŷi − ȳ)². Since our regression model is designed to minimize SSE, I would hope that SSR makes up the bulk of the total variation in y. Here SSR = 62,870 − 5,867.50 = 57,002.50.
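The identity SST = SSE + SSR holds exactly when the line is fit by least squares. A sketch with made-up (x, y) data (not the chapter's), fitting the line with the usual least-squares formulas and checking the decomposition:

```python
def ols(x, y):
    """Least-squares slope and intercept for simple linear regression."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) \
         / sum((xi - x_bar) ** 2 for xi in x)
    b0 = y_bar - b1 * x_bar
    return b0, b1

x = [1, 2, 3, 4, 5]   # hypothetical ages
y = [2, 4, 5, 4, 5]   # hypothetical repair costs

b0, b1 = ols(x, y)
y_hat = [b0 + b1 * xi for xi in x]
y_bar = sum(y) / len(y)

sst = sum((yi - y_bar) ** 2 for yi in y)
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)
print(sst, sse + ssr)  # the two agree up to floating-point rounding
```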

D. Coefficient of Determination (R2) All of the variation in y is represented by SST, and since least squares is designed to minimize SSE, a very good model is one that explains most of the variation in y and thus has a very small SSE. Equivalently, you could think of a good model as having a large SSR relative to SSE. If so, SSR/SST is very close to 1.

This ratio of SSR to SST is called R2, the coefficient of determination: R2 = SSR/SST. A terrible model has a very large SSE and a very small SSR, so R2 is very close to zero. An excellent model has an R2 very close to 1.

Interpretation of R2 In the repair cost example, R2=.9067. This means that 90.67% of the total sum of squares can be explained by using the estimated regression equation between age and repair cost.
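Using the two sums of squares reported on the earlier slides, the R2 figure can be reproduced directly:

```python
SSR = 57_002.5   # sum of squares due to regression, from the slides
SST = 62_870.0   # total sum of squares, from the slides

r_squared = SSR / SST
print(round(r_squared, 4))  # 0.9067, matching the slide
```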

Excel Output I’ve highlighted the relevant information in the table of regression output. Can you pick out the important information that we have been discussing?