M23- Residuals & Minitab 1  Department of ISM, University of Alabama, 1992-2003. Residuals: A continuation of regression analysis.

Presentation transcript:

Lesson Objectives: (1) Continue to build on regression analysis. (2) Learn how residual plots help identify problems with the analysis.

Example 1: Sample of n = 5 students; Y = weight in pounds, X = height in inches. [Data table with columns Case, X, Y; the values are not preserved in this transcript.] Prediction equation: predicted Wt = −[…] + […] × Ht (numeric coefficients not preserved). r-square = ? Std. error = ? (To be found later.) Continued …

Example 1, continued. [Scatterplot of WEIGHT (Y) versus HEIGHT (X) with the fitted line $\hat{Y}$ = −[…] + […] X; numeric coefficients not preserved.] Residuals = distance from point to line, measured parallel to the Y-axis.

Calculation: For each case, residual = observed value − estimated mean. For the i-th case, $e_i = y_i - \hat{y}_i$.
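The residual calculation above can be sketched in a few lines of Python. This is a minimal illustration, not the slides' own computation: the five (height, weight) pairs are hypothetical stand-ins because the sample values are not preserved here, and the helper name `fit_line` is an assumption made for this example.

```python
# Hypothetical sample (NOT the slide's data): heights in inches, weights in pounds.
heights = [62.0, 65.0, 68.0, 71.0, 74.0]
weights = [120.0, 140.0, 155.0, 180.0, 200.0]


def fit_line(x, y):
    """Return (a, b) for the least squares line y_hat = a + b * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx            # slope
    a = y_bar - b * x_bar    # intercept
    return a, b


a, b = fit_line(heights, weights)
fitted = [a + b * xi for xi in heights]                 # estimated means, y_hat_i
resid = [yi - fi for yi, fi in zip(weights, fitted)]    # e_i = y_i - y_hat_i

for case, (xi, yi, fi, ei) in enumerate(zip(heights, weights, fitted, resid), start=1):
    print(f"case {case}: X={xi:.0f}  Y={yi:.0f}  fitted={fi:.1f}  residual={ei:+.1f}")
```

Each residual is positive when the observed weight lies above the line and negative when it lies below.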

Example 1, continued. Compute the fitted value and residual for the 4th person in the sample; i.e., X = 72 inches, Y = 207 lbs. Fitted value: $\hat{y}_4$ = _________. Residual: $e_4 = y_4 - \hat{y}_4$ = __________.

Residual Plots: a scatterplot of the residuals versus the predicted means of Y, $\hat{Y}$, or versus an X-variable.

Example 1, continued. [Same scatterplot of WEIGHT versus HEIGHT with the fitted line, now with the residual $e_4$ marked; its value is not preserved in this transcript.] Residuals = distance from point to line, measured parallel to the Y-axis.

Example 1, continued. [Residual plot: residuals versus HEIGHT.] $e_4$ is the residual for the 4th case. The regression line from the previous plot is rotated to horizontal.

Residual Plot: a scatterplot of the residuals versus the predicted means of Y, $\hat{Y}$; or an X-variable; or time. Expect random dispersion around a horizontal line at zero. Problems occur if there are unusual patterns or unusual cases.

Residuals versus X (or time): [plot of residuals around 0]. Good random pattern.

Residuals versus X (or time): [plot of residuals around 0]. Outliers? Next step: ________ to determine if a recording error has occurred.

Residuals versus X (or time): [plot of residuals around 0]. Nonlinear relationship. Next step: Add a “quadratic term,” or use “______.”

Residuals versus X (or time): [plot of residuals]. Variance is increasing. Next step: Stabilize variance by using “________.”

Residual plots help identify: (1) Unusual patterns: possible curvature in the data; variances that are not constant as X changes. (2) Unusual cases: outliers; high leverage cases; influential cases.
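The curvature case can be checked numerically as well as visually. The sketch below is only an illustration under stated assumptions: the data are hypothetical (Y = X² exactly), and printing the sign of each residual is a crude text stand-in for a residual plot, not a Minitab feature.

```python
# Hypothetical curved data: Y = X squared, so a straight line is the wrong model.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [xi ** 2 for xi in x]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b = sxy / sxx
a = y_bar - b * x_bar

# Residuals from the straight-line fit, in order of increasing X.
resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]

# "+" = point above the line, "-" = below, "0" = on the line.
signs = "".join("+" if e > 1e-9 else "-" if e < -1e-9 else "0" for e in resid)
print(signs)
```

A run of one sign in the middle flanked by the opposite sign at both ends is the systematic pattern that signals curvature; a good fit would show the signs mixed with no such ordering.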

Three properties of residuals, illustrated with some computations.

Property 1. Find the sum of the residuals. [Table for Y = Weight, X = Height with columns X, Y, $\hat{Y}$, and $e = Y - \hat{Y}$; the values are not preserved in this transcript.] The sum shown is .01, which differs from zero only by round-off error.

Properties of Least Squares Line: 1. Residuals always sum to zero: $\sum e_i = 0$.

Property 2. Find the sum of squares of the residuals. [Table for Y = Weight, X = Height with columns X, Y, $\hat{Y}$, $e = Y - \hat{Y}$, and $e^2$; the values are not preserved in this transcript.]

Properties of Least Squares Line: 1. Residuals always sum to zero. 2. This “least squares” line produces a smaller “sum of squared residuals” than any other straight line can: $\sum e_i^2 = \mathrm{SSE}$ < “SSE for any other line” (the numeric SSE is not preserved in this transcript).

Property 3. [Scatterplot of WEIGHT versus HEIGHT with the point of means marked: $\bar{X}$ = 68.4, $\bar{Y}$ = 159.]

Properties of Least Squares Line: 1. Residuals always sum to zero. 2. This “least squares” line produces a smaller “sum of squared residuals” than any other straight line can. 3. The line always passes through the point $(\bar{x}, \bar{y})$.
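All three properties can be verified numerically. The following sketch uses hypothetical data (not the slides' sample) and compares the least squares line against a small, arbitrary grid of other lines; the grid offsets are assumptions chosen only for illustration.

```python
# Hypothetical sample (NOT the slide's data).
x = [62.0, 65.0, 68.0, 71.0, 74.0]
y = [120.0, 140.0, 155.0, 180.0, 200.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b = sxy / sxx
a = y_bar - b * x_bar


def sse(intercept, slope):
    """Sum of squared residuals for the line y_hat = intercept + slope * x."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


# Property 1: the residuals sum to zero (up to round-off).
resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
sum_resid = sum(resid)

# Property 2: every other line tried has a larger SSE.
best_sse = sse(a, b)
other_sses = [
    sse(a + da, b + db)
    for da in (-5.0, 0.0, 5.0)
    for db in (-0.5, 0.0, 0.5)
    if (da, db) != (0.0, 0.0)
]

# Property 3: the line passes through (x_bar, y_bar).
passes_through_means = abs((a + b * x_bar) - y_bar) < 1e-9

print(f"sum of residuals = {sum_resid:.2e}")
print(f"least squares SSE = {best_sse:.2f}, smallest other SSE = {min(other_sses):.2f}")
print(f"passes through the means: {passes_through_means}")
```

The grid check is not a proof of Property 2; it only shows that moving the intercept or slope away from the least squares values increases the SSE for these data.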

Illustration of unusual cases: outliers, leverage, influential.

[Plot of Y versus X.] Outlier: the “unusual point” does not follow the pattern. It is near the X-mean, so the entire line is pulled toward it.

[Plot of Y versus X.] Outlier: the “unusual point” does not follow the pattern. The line is pulled down and twisted slightly.

[Plot of Y versus X.] High leverage: the “unusual point” is far from the X-mean, but still follows the pattern.

[Plot of Y versus X.] Leverage & outlier, influential: the “unusual point” is far from the X-mean and does not follow the pattern. The line really twists!

Definitions: High Leverage Case: an extreme X value relative to the other X values. Outlier: an unusual y-value relative to the pattern of the other cases; usually has a large residual.

Definitions, continued: Influential Case: has an unusually large effect on the slope of the least squares line.

Definitions, continued. Conclusion: High leverage: potentially influential. High leverage & outlier: influential!!
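The three kinds of unusual case can be compared with a small experiment. The sketch below rests on assumptions not found in the slides: the data are hypothetical (five points exactly on Y = 2X plus one added point), and leverage is quantified with the standard simple-regression formula $h_i = 1/n + (x_i - \bar{x})^2 / \sum_j (x_j - \bar{x})^2$, which the slides do not state.

```python
def fit_line(x, y):
    """Return (a, b) for the least squares line y_hat = a + b * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    return y_bar - b * x_bar, b


def leverage(x, i):
    """Leverage of case i: large when x[i] is far from the X-mean."""
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return 1.0 / n + (x[i] - x_bar) ** 2 / sxx


# Hypothetical base pattern: Y = 2X exactly, so slope 2 and intercept 0.
base_x = [1.0, 2.0, 3.0, 4.0, 5.0]
base_y = [2.0, 4.0, 6.0, 8.0, 10.0]

# One unusual point is added in each scenario.
scenarios = {
    "outlier near the X-mean": (3.0, 12.0),
    "high leverage, follows pattern": (15.0, 30.0),
    "high leverage and outlier": (15.0, 10.0),
}

results = {}
for name, (px, py) in scenarios.items():
    xs = base_x + [px]
    ys = base_y + [py]
    a, b = fit_line(xs, ys)
    h = leverage(xs, len(xs) - 1)  # leverage of the added point
    results[name] = (a, b, h)
    print(f"{name}: intercept={a:.2f}  slope={b:.2f}  leverage={h:.2f}")
```

In this toy setup the outlier near the X-mean shifts the whole line without changing its slope, the high leverage point that follows the pattern changes nothing, and only the point that is both high leverage and an outlier twists the slope substantially.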

Why do we care about identifying unusual cases? The least squares regression line is not resistant to unusual cases.

Regression Analysis in Minitab

Lesson Objectives: (1) Learn two ways to use Minitab to run a regression analysis. (2) Learn how to read output from Minitab.

Example 3, continued: Can height be predicted using shoe size? Step 1? DTDP

Example 3, continued: Can height be predicted using shoe size? Scatterplot (Graph > Plot …) with “jitter” added in the X-direction; points are labeled Female and Male. The scatter for each subpopulation is about the same; i.e., there is “constant variance.”

Example 3, continued. Method 1: Stat > Regression > Regression … (fits Y = a + bX).

Example 3, continued: Can height be predicted using shoe size? Minitab output, copied from the “Session Window” (numeric entries other than R-Sq are not preserved in this transcript):

Regression Analysis: Height versus Shoe Size

The regression equation is
Height = … + … Shoe Size

Predictor   Coef   SE Coef   T   P
Constant    …      …         …   …
Shoe Siz    …      …         …   …

S = …   R-Sq = 79.1%   R-Sq(adj) = 79.0%

Analysis of Variance
Source       DF   SS   MS   F   P
Regression   …    …    …    …   …
Error        …    …    …
Total        …    …

Example 3, continued: Can height be predicted using shoe size? (Same Minitab output for Height versus Shoe Size: R-Sq = 79.1%, R-Sq(adj) = 79.0%.) Annotations: The “Coef” column holds the least squares estimated coefficients. Total “Degrees of Freedom” = Number of cases − 1.

Example 3, continued: Can height be predicted using shoe size? (Same Minitab output for Height versus Shoe Size.) Annotation: R-Sq = SSR / TSS, reported in the output as 79.1%.

Example 3, continued: Can height be predicted using shoe size? (Same Minitab output for Height versus Shoe Size.) Annotations: S = √MSE is the Standard Error of Regression, a measure of variation around the regression line; the slide shows the value 3.8 beside this formula. In the Error row of the Analysis of Variance table, the SS entry is the sum of squared residuals and the MS entry is the Mean Squared Error (MSE).
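The two summary measures can be reproduced by hand from the sums of squares. The sketch below is illustrative only: the shoe size and height values are hypothetical (the Minitab example's data are not preserved here), and MSE is taken as SSE divided by the error degrees of freedom, n − 2 for simple regression.

```python
import math

# Hypothetical data (NOT the Minitab example's data).
shoe = [7.0, 8.0, 9.0, 10.0, 11.0, 12.0]
height = [64.0, 66.0, 67.0, 70.0, 71.0, 73.0]

n = len(shoe)
x_bar = sum(shoe) / n
y_bar = sum(height) / n
sxx = sum((xi - x_bar) ** 2 for xi in shoe)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(shoe, height))
b = sxy / sxx
a = y_bar - b * x_bar
fitted = [a + b * xi for xi in shoe]

# Sums of squares, as in the Analysis of Variance table.
ssr = sum((fi - y_bar) ** 2 for fi in fitted)                 # Regression
sse = sum((yi - fi) ** 2 for yi, fi in zip(height, fitted))   # Error
tss = sum((yi - y_bar) ** 2 for yi in height)                 # Total

r_sq = ssr / tss          # R-Sq = SSR / TSS
mse = sse / (n - 2)       # Mean Squared Error, error DF = n - 2
s = math.sqrt(mse)        # Standard Error of Regression, S = sqrt(MSE)

print(f"Height = {a:.2f} + {b:.2f} Shoe Size")
print(f"S = {s:.3f}   R-Sq = {100 * r_sq:.1f}%")
print(f"DF: regression = 1, error = {n - 2}, total = {n - 1}")
```

Note that SSR + SSE = TSS, so R-Sq can equally be computed as 1 − SSE / TSS.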

Example 3, continued: Can height be predicted using shoe size? [Plot; image not preserved.] No “jitter” added. Are there any problems visible in this plot? ___________

Example 3, continued: Can height be predicted using shoe size? Least squares regression equation: Height = … + … Shoe (numeric coefficients not preserved in this transcript). The two summary measures that should always be given with the equation: r-square = 79.1%, Std. error = … inches.

Example 3, continued: Can height be predicted using shoe size? Method 2: Stat > Regression > Fitted Line Plot … (fits Y = a + bX). This program gives a scatterplot with the regression line superimposed on it.

Example 3, continued: Can height be predicted using shoe size? [Fitted line plot.] The fit looks …

Example 3, continued: Can height be predicted using shoe size? (Same Minitab output for Height versus Shoe Size.) What information do the T, P, and F values provide?

How do you determine if the X-variable is a useful predictor? Step 1: Use the “t-statistic” or the F-statistic. “t” measures how many standard errors the estimated coefficient is from zero. “F” = $t^2$ for simple regression.
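The relationship “F” = t² can be confirmed with a short computation. The sketch below makes assumptions beyond the slides: the data are hypothetical, and the slope's standard error is computed with the standard simple-regression formula $SE(b) = S / \sqrt{\sum (x_i - \bar{x})^2}$, which the slides do not spell out.

```python
import math

# Hypothetical data (NOT the Minitab example's data).
shoe = [7.0, 8.0, 9.0, 10.0, 11.0, 12.0]
height = [64.0, 66.0, 67.0, 70.0, 71.0, 73.0]

n = len(shoe)
x_bar = sum(shoe) / n
y_bar = sum(height) / n
sxx = sum((xi - x_bar) ** 2 for xi in shoe)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(shoe, height))
b = sxy / sxx
a = y_bar - b * x_bar
fitted = [a + b * xi for xi in shoe]

ssr = sum((fi - y_bar) ** 2 for fi in fitted)
sse = sum((yi - fi) ** 2 for yi, fi in zip(height, fitted))
mse = sse / (n - 2)       # error mean square
msr = ssr / 1             # regression mean square (1 DF in simple regression)

se_b = math.sqrt(mse) / math.sqrt(sxx)   # standard error of the slope
t = b / se_b                             # standard errors between b and zero
f = msr / mse                            # F-statistic from the ANOVA table

print(f"slope = {b:.3f}, SE = {se_b:.4f}, t = {t:.2f}")
print(f"F = {f:.2f}, t squared = {t ** 2:.2f}")
```

Turning “t” or “F” into a P-value requires the t or F distribution, which is omitted here; statistical software such as Minitab reports it directly.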

How do you determine if the X-variable is a useful predictor? Step 2: A “P-value” is associated with “t” and “F”. The further “t” is from zero, in either direction (equivalently, the larger “F” is), the smaller the corresponding P-value will be. P-value: the probability of getting a “t” at least this far from zero if the true coefficient were zero; informally, a measure of how plausible it is that the true coefficient IS ZERO.

How do you determine if the X-variable is a useful predictor? Step 3: If the P-value IS SMALL (typically “< 0.10”), then conclude: 1. It is unlikely that the true coefficient is really zero, and therefore, 2. The X variable IS a useful predictor for the Y variable. Keep the variable! If the P-value is NOT SMALL (i.e., “> 0.10”), then conclude: 1. For all practical purposes the true coefficient MAY BE ZERO; therefore 2. The X variable IS NOT a useful predictor of the Y variable. Don’t use it.

Example 3, continued: Can height be predicted using shoe size? (Same Minitab output for Height versus Shoe Size.) “t” measures how many standard errors the estimated coefficient is from zero. P-value: the probability of getting a “t” at least this far from zero if the true coefficient were zero. Could “shoe size” have a true coefficient that is actually zero? The P-value for Shoe Size IS SMALL (< 0.10). Conclusion: The “shoe size” coefficient is NOT zero! “Shoe size” IS a useful predictor of the mean of “height”.

The logic just explained is statistical inference. This will be covered in more detail during the last three weeks of the course.