M23 - Residuals & Minitab. © Department of ISM, University of Alabama, 1992-2003. Residuals: A continuation of regression analysis.

2 Residuals: A continuation of regression analysis

3 Lesson Objectives
• Continue to build on regression analysis.
• Learn how residual plots help identify problems with the analysis.

4 Example 1: Sample of n = 5 students, Y = Weight in pounds, X = Height in inches.

Case   X    Y
1      73   175
2      68   158
3      67   140
4      72   207
5      62   115

Prediction equation: $\widehat{Wt} = -332.73 + 7.189\,Ht$
r-square = ? Std. error = ? To be found later.
continued …
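The prediction equation on this slide can be reproduced from the five cases. The sketch below is a minimal illustration using the standard least squares formulas; the function name `least_squares_line` and the variable names are assumptions for this example, not part of the original slides.

```python
# Example 1 data: heights (inches) and weights (pounds) of the 5 students.
heights = [73, 68, 67, 72, 62]
weights = [175, 158, 140, 207, 115]


def least_squares_line(x, y):
    """Return (intercept, slope) of the least squares line y = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-products and sum of squares about the means.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


intercept, slope = least_squares_line(heights, weights)
print(f"Wt-hat = {intercept:.3f} + {slope:.3f} Ht")
```

The unrounded intercept is about -332.736, which the slide reports as -332.73; the slope agrees with 7.189 to three decimals.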

5 Example 1, continued
[Scatterplot: WEIGHT (100 to 220) versus HEIGHT (60 to 76), with the fitted line $\hat{Y} = -332.7 + 7.189X$.]
Residuals = distance from point to line, measured parallel to the Y-axis.

6 Calculation: For each case, residual = observed value - estimated mean.
For the i-th case, $e_i = y_i - \hat{y}_i$.

7 Example 1, continued
Compute the fitted value and residual for the 4th person in the sample; i.e., X = 72 inches, Y = 207 lbs.
fitted value = $\hat{y}_4$ = -332.73 + 7.189 ( ) = _________
residual = $e_4 = y_4 - \hat{y}_4$ = __________
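One way to check the hand calculation for the 4th case is sketched below. It plugs X = 72 into the prediction equation using the coefficients as rounded on the slide; the variable names are illustrative only.

```python
# Coefficients as rounded on the slide.
intercept, slope = -332.73, 7.189

# 4th person in the sample.
height_4, weight_4 = 72, 207

# Fitted value: estimated mean weight at this height.
fitted_4 = intercept + slope * height_4

# Residual: observed value minus fitted value.
residual_4 = weight_4 - fitted_4

print(round(fitted_4, 2), round(residual_4, 2))
```

The residual is positive because this person weighs more than the line predicts for a height of 72 inches.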

8 Residual Plots
Scatterplot of residuals vs. the predicted means of Y, $\hat{Y}$; or an X-variable.

9 Example 1, continued
[Scatterplot: WEIGHT (100 to 220) versus HEIGHT (60 to 76), with the fitted line $\hat{Y} = -332.7 + 7.189X$.]
Residuals = distance from point to line, measured parallel to the Y-axis. $e_4 = +22.12$.

10 Example 1, continued
[Residual plot: Residuals (-24 to 24) versus HEIGHT (60 to 76).]
The regression line from the previous plot is rotated to horizontal. $e_4$ is the residual for the 4th case, = +22.12.

11 Residual Plot
Scatterplot of residuals versus the predicted means of Y, $\hat{Y}$; or an X-variable, or Time.
Expect random dispersion around a horizontal line at zero.
Problems occur if there are:
• Unusual patterns
• Unusual cases

12 Residuals versus X
[Residual plot: Residuals versus X, or time, with a horizontal line at 0.]
Good random pattern

13 Residuals versus X
[Residual plot: Residuals versus X, or time, with a horizontal line at 0.]
Outliers?
Next step: ________ to determine if a recording error has occurred.

14 Residuals versus X
[Residual plot: Residuals versus X, or time, with a horizontal line at 0.]
Nonlinear relationship
Next step: Add a “quadratic term,” or use “ ______.”

15 Residuals versus X
[Residual plot: Residuals versus X, or time, with a horizontal line at 0.]
Variance is increasing
Next step: Stabilize variance by using “________.”

16 Residual Plots help identify
Unusual patterns:
• Possible curvature in the data.
• Variances that are not constant as X changes.
Unusual cases:
• Outliers
• High leverage cases
• Influential cases

17 Three properties of Residuals, illustrated with some computations.

18 Property 1.
Y = Weight, X = Height; $\hat{Y} = -332.73 + 7.189X$

X     Y     $\hat{Y}$    Residuals $e = Y - \hat{Y}$
73    175   192.07       -17.07
68    158   156.12       1.88
67    140   ...          ...
72    207   ...          ...
62    115   ...          ...
Sum                      .01 (round-off error)

Find the sum of the residuals.

19 Properties of Least Squares Line
1. Residuals always sum to zero: $\sum e_i = 0$.

20 Property 2.
Y = Weight, X = Height; $\hat{Y} = -332.73 + 7.189X$

X     Y     $\hat{Y}$    $e = Y - \hat{Y}$    $e^2$
73    175   192.07       -17.07               291.38
68    158   156.12       1.88                 3.53
67    140   148.93       -8.93                79.74
72    207   184.88       22.12                489.29
62    115   112.99       2.01                 4.04
Sum                      .01                  867.98

Find the sum of squares of the residuals.
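The table above can be rebuilt in a few lines. This sketch assumes the slide's rounding convention (fitted values and residuals rounded to two decimals before summing), which is why the sum of residuals comes out as a small round-off value rather than exactly zero; the variable names are illustrative.

```python
heights = [73, 68, 67, 72, 62]
weights = [175, 158, 140, 207, 115]

# Coefficients as rounded on the slide.
intercept, slope = -332.73, 7.189

# Fitted values and residuals, rounded to 2 decimals as in the table.
fitted = [round(intercept + slope * x, 2) for x in heights]
residuals = [round(y - f, 2) for y, f in zip(weights, fitted)]

sum_resid = sum(residuals)              # Property 1: about zero, up to round-off
sse = sum(e ** 2 for e in residuals)    # Property 2: sum of squared residuals

for x, y, f, e in zip(heights, weights, fitted, residuals):
    print(f"{x:3d} {y:4d} {f:8.2f} {e:8.2f} {e ** 2:8.2f}")
print(f"sum of residuals = {sum_resid:.2f}, SSE = {sse:.2f}")
```

Small differences in the last digit of SSE relative to the slide are expected, since the slide also rounds each squared residual before adding.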

21 Properties of Least Squares Line
1. Residuals always sum to zero.
2. This “least squares” line produces a smaller “Sum of squared residuals” than any other straight line can: $\sum e_i^2 = SSE = 867.98$ < “SSE for any other line”.
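Property 2 can be illustrated numerically by comparing the least squares line with a few other straight lines. In the sketch below the alternative lines are arbitrary choices made up for this example, and the helper names `least_squares_line` and `sse` are assumptions. With unrounded coefficients the SSE is about 868.04; the slide's 867.98 comes from rounded residuals.

```python
heights = [73, 68, 67, 72, 62]
weights = [175, 158, 140, 207, 115]


def least_squares_line(x, y):
    """Return (intercept, slope) of the least squares line."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def sse(intercept, slope):
    """Sum of squared residuals for the line y = intercept + slope*x."""
    return sum((y - (intercept + slope * x)) ** 2
               for x, y in zip(heights, weights))


a, b = least_squares_line(heights, weights)
best_sse = sse(a, b)

# Hypothetical competing lines (not from the slides).
other_lines = [(a + 5, b), (a, b + 0.1), (a - 10, b + 0.2), (-300.0, 6.7)]
other_sses = [sse(ai, bi) for ai, bi in other_lines]

print(f"least squares SSE = {best_sse:.2f}")
for (ai, bi), s in zip(other_lines, other_sses):
    print(f"line {ai:.2f} + {bi:.3f} X has SSE = {s:.2f}")
```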

22 Property 3.
[Scatterplot: WEIGHT (100 to 220) versus HEIGHT (60 to 76), with the fitted line and the point of means marked: $\bar{X} = 68.4$, $\bar{Y} = 159$.]

23 Properties of Least Squares Line
1. Residuals always sum to zero.
2. This “least squares” line produces a smaller “Sum of squared residuals” than any other straight line can.
3. Line always passes through the point $(\bar{x}, \bar{y})$.
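Property 3 can be checked directly: evaluating the fitted line at the mean height should give back the mean weight. The sketch below recomputes the unrounded coefficients (so Property 1 also holds up to floating-point error); the names are illustrative, not from the slides.

```python
heights = [73, 68, 67, 72, 62]
weights = [175, 158, 140, 207, 115]

n = len(heights)
x_bar = sum(heights) / n   # mean height
y_bar = sum(weights) / n   # mean weight

# Unrounded least squares coefficients.
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(heights, weights))
sxx = sum((x - x_bar) ** 2 for x in heights)
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Property 3: the line evaluated at x_bar equals y_bar.
fitted_at_mean = intercept + slope * x_bar

# Property 1: residuals from the unrounded line sum to zero.
residual_sum = sum(y - (intercept + slope * x)
                   for x, y in zip(heights, weights))

print(x_bar, y_bar, fitted_at_mean, residual_sum)
```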

24 Illustration of unusual cases:
• Outliers
• Leverage
• Influential

25 [Scatterplot of Y versus X with one point marked as an outlier.]
“Unusual point” does not follow pattern. It’s near the X-mean; the entire line is pulled toward it.

26 [Scatterplot of Y versus X with one point marked as an outlier.]
“Unusual point” does not follow pattern. The line is pulled down and twisted slightly.

27 [Scatterplot of Y versus X with one point marked as high leverage.]
“Unusual point” is far from the X-mean, but still follows the pattern.

28 [Scatterplot of Y versus X with one point marked as leverage & outlier, influential.]
“Unusual point” is far from the X-mean, but does not follow the pattern. Line really twists!

29 Definitions:
High Leverage Case: An extreme X value relative to the other X values.
Outlier: An unusual y-value relative to the pattern of the other cases. Usually has a large residual.

30 Definitions: continued
Influential Case: has an unusually large effect on the slope of the least squares line.

31 Definitions: continued
Conclusion:
High leverage: potentially influential.
High leverage & Outlier: influential!!

32 Why do we care about identifying unusual cases?
The least squares regression line is not resistant to unusual cases.
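The lack of resistance can be seen by refitting the Example 1 data after adding a single unusual case. In this sketch the added case (height 80, weight 120) is hypothetical and invented purely for illustration: it is far from the X-mean and does not follow the pattern, so it is both high leverage and an outlier.

```python
def least_squares_line(x, y):
    """Return (intercept, slope) of the least squares line."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Example 1 data from the slides.
heights = [73, 68, 67, 72, 62]
weights = [175, 158, 140, 207, 115]
_, slope_original = least_squares_line(heights, weights)

# Hypothetical sixth case (not in the slides): very tall but light.
heights_plus = heights + [80]
weights_plus = weights + [120]
_, slope_with_case = least_squares_line(heights_plus, weights_plus)

print(f"slope without the unusual case: {slope_original:.3f}")
print(f"slope with the unusual case:    {slope_with_case:.3f}")
```

One added case is enough to pull the slope from roughly 7.2 pounds per inch down to under 1, which is the "line really twists" behavior described for influential cases.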

33 Regression Analysis in Minitab

34 Lesson Objectives
• Learn two ways to use Minitab to run a regression analysis.
• Learn how to read output from Minitab.

35 Can height be predicted using shoe size?
Example 3, continued …
Step 1? DTDP

36 Can height be predicted using shoe size?
Example 3, continued …
Scatterplot: Graph > Plot …
[Scatterplot with Female and Male cases marked; “Jitter” added in X-direction.]
The scatter for each subpopulation is about the same; i.e., there is “constant variance.”

37 Example 3, continued …
Method 1
Stat > Regression > Regression …
Y = a + bX

38 Can height be predicted using shoe size?
Example 3, continued …
Copied from “Session Window”:

Regression Analysis: Height versus Shoe Size

The regression equation is
Height = 50.5 + 1.87 Shoe Size

Predictor      Coef   SE Coef      T      P
Constant    50.5230    0.5912  85.45  0.000
Shoe Siz    1.87241   0.06033  31.04  0.000

S = 1.947   R-Sq = 79.1%   R-Sq(adj) = 79.0%

Analysis of Variance
Source       DF      SS      MS       F      P
Regression    1  3650.0  3650.0  963.26  0.000
Error       255   966.3     3.8
Total       256  4616.3

39 Can height be predicted using shoe size?
Example 3, continued …
Annotations on the same Minitab output (Regression Analysis: Height versus Shoe Size):
• Coef column (50.5230 and 1.87241): Least squares estimated coefficients.
• Total DF = 256: Total “Degrees of Freedom” = Number of cases - 1.

40 Can height be predicted using shoe size?
Example 3, continued …
Annotation on the same Minitab output (Regression Analysis: Height versus Shoe Size):
$\text{R-Sq} = \dfrac{SSR}{TSS} = \dfrac{3650.0}{4616.3} = 79.1\%$
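The R-Sq figure can be recomputed from the Analysis of Variance table. This is a small arithmetic check using the sums of squares printed in the Minitab output; the variable names are illustrative.

```python
# Sums of squares from the Minitab Analysis of Variance table.
ss_regression = 3650.0   # SSR
ss_total = 4616.3        # TSS

# R-Sq is the fraction of total variation explained by the regression.
r_sq = ss_regression / ss_total

print(f"R-Sq = {100 * r_sq:.1f}%")
```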

41 Can height be predicted using shoe size?
Example 3, continued …
Annotations on the same Minitab output (Regression Analysis: Height versus Shoe Size):
• Standard Error of Regression: $S = \sqrt{MSE} = \sqrt{3.8}$, reported in the output as S = 1.947. Measure of variation around the regression line.
• Error row, MS = 3.8: Mean Squared Error (MSE).
• Error row, SS = 966.3: Sum of squared residuals.
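The relationship between the Error row and S can be checked numerically. The sketch below takes MSE as the error sum of squares divided by its degrees of freedom, then takes the square root; the variable names are illustrative.

```python
import math

# Error row of the Minitab Analysis of Variance table.
ss_error = 966.3   # sum of squared residuals
df_error = 255     # error degrees of freedom

# Mean Squared Error, then the standard error of regression.
mse = ss_error / df_error
s = math.sqrt(mse)

print(f"MSE = {mse:.1f}, S = {s:.3f}")
```

Working from the unrounded MSE (about 3.789) rather than the displayed 3.8 is what reproduces the reported S = 1.947.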

42 Can height be predicted using shoe size?
Example 3, continued …
[Plot; no “Jitter” added.]
Are there any problems visible in this plot? ___________

43 Can height be predicted using shoe size?
Example 3, continued …
Least squares regression equation: Height = 50.52 + 1.872 Shoe
r-square = 79.1%, Std. error = 1.947 inches
These are the two summary measures that should always be given with the equation.

44 Can height be predicted using shoe size?
Example 3, continued …
Method 2
Stat > Regression > Fitted Line Plot …
Y = a + bX
This program gives a scatterplot with the regression line superimposed on it.

45 Can height be predicted using shoe size?
Example 3, continued …
The fit looks

46 Can height be predicted using shoe size?
Example 3, continued …
On the same Minitab output (Regression Analysis: Height versus Shoe Size): What information do these values provide?

47 How do you determine if the X-variable is a useful predictor? (1)
Use the “t-statistic” or the F-stat.
“t” measures how many standard errors the estimated coefficient is from “zero.”
“F” = $t^2$ for simple regression.
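Both statements can be checked against the Shoe Size row of the Minitab output. The sketch below divides the estimated coefficient by its standard error to get “t”, then squares it to compare with the reported F; the variable names are illustrative.

```python
# Shoe Size row of the Minitab output.
coef = 1.87241      # estimated coefficient
se_coef = 0.06033   # standard error of the coefficient

# t: how many standard errors the estimate is from zero.
t_stat = coef / se_coef

# For simple regression, F equals t squared.
f_from_t = t_stat ** 2

print(f"t = {t_stat:.2f}, t^2 = {f_from_t:.2f}")
```

The squared value differs from the printed F = 963.26 only in the decimals, which is consistent with rounding in the displayed output.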

48 How do you determine if the X-variable is a useful predictor? (2)
A “P-value” is associated with “t” and “F”.
The further “t” and “F” are from zero, in either direction, the smaller the corresponding P-value will be.
P-value: often described loosely as a measure of the “likelihood that the true coefficient IS ZERO”; strictly, it is the probability of getting a “t” or “F” at least this far from zero if the true coefficient really were zero.

49 How do you determine if the X-variable is a useful predictor? (3)
If the P-value IS SMALL (typically “< 0.10”), then conclude:
1. It is unlikely that the true coefficient is really zero, and therefore,
2. The X variable IS a useful predictor for the Y variable. Keep the variable!
If the P-value is NOT SMALL (i.e., “> 0.10”), then conclude:
1. For all practical purposes the true coefficient MAY BE ZERO; therefore
2. The X variable IS NOT a useful predictor of the Y variable. Don’t use it.
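The decision rule above can be written as a tiny function. This is only a sketch of the slide's rule of thumb: the function name `is_useful_predictor` and the `cutoff` parameter are assumptions for illustration, and the slide does not say how to treat a P-value of exactly 0.10, so this version treats it as not small.

```python
def is_useful_predictor(p_value, cutoff=0.10):
    """Apply the slide's rule: a small P-value means keep the X variable."""
    return p_value < cutoff


# P-value reported for Shoe Size in the Minitab output.
shoe_size_p = 0.000
print(is_useful_predictor(shoe_size_p))   # keep the variable

# A hypothetical large P-value (not from the slides).
print(is_useful_predictor(0.35))          # don't use the variable
```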

50 Can height be predicted using shoe size?
Example 3, continued …
On the same Minitab output (Regression Analysis: Height versus Shoe Size), the Shoe Size row has T = 31.04 and P = 0.000.
“t” measures how many standard errors the estimated coefficient is from “zero.”
P-value: loosely, a measure of the likelihood that the true coefficient is “zero”; strictly, the probability of a “t” at least this far from zero if the true coefficient were zero.
Could “shoe size” have a true coefficient that is actually “zero”?
The P-value for Shoe Size IS SMALL (< 0.10).
Conclusion: The “shoe size” coefficient is NOT zero! “Shoe size” IS a useful predictor of the mean of “height”.

51 The logic just explained is statistical inference. This will be covered in more detail during the last three weeks of the course.

