
1 Residuals, Residual Plots, & Influential points

2 Residuals (error) - The vertical deviation between the observations & the LSRL. The sum of the residuals is always zero. error = observed - expected
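A minimal sketch of the definition above, assuming a small made-up data set (the `xs` and `ys` values and the helper name `fit_lsrl` are illustrative and not from the slides). It fits the LSRL by the usual least-squares formulas, computes each residual as observed minus expected, and prints their sum, which should be zero up to floating-point rounding.

```python
# Hypothetical toy data, chosen only to illustrate the definition.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]


def fit_lsrl(xs, ys):
    """Return (intercept, slope) of the least-squares regression line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


intercept, slope = fit_lsrl(xs, ys)

# residual = observed - expected (y minus y-hat)
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

print(residuals)
print(sum(residuals))  # zero, apart from floating-point rounding
```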

3 Residual plot - A scatterplot of the (x, residual) pairs. Residuals can also be graphed against other statistics besides x. Purpose is to tell if a linear association exists between the x & y variables. If no pattern exists between the points in the residual plot, then the association is linear.

4 [Figure: two example residual plots, one labeled "Linear" and one labeled "Not linear"]

5 One measure of the success of knee surgery is post-surgical range of motion for the knee joint following a knee dislocation. Is there a linear relationship between age & range of motion? Sketch a residual plot.

Age   Range of Motion
35    154
24    142
40    137
31    133
28    122
25    126
26    135
16    135
14    108
20    120
21    127
30    122

Since there is no pattern in the residual plot, there is a linear relationship between age and range of motion. [Figure: residual plot, residuals vs. x]
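The residuals for this exercise can be computed before sketching the plot. The following is a sketch using the slide's twelve (age, range of motion) pairs; the function and variable names are illustrative assumptions. It fits the LSRL with the standard least-squares formulas and prints the (x, residual) pairs to plot.

```python
# Knee surgery data from slide 5.
ages = [35, 24, 40, 31, 28, 25, 26, 16, 14, 20, 21, 30]
motion = [154, 142, 137, 133, 122, 126, 135, 135, 108, 120, 127, 122]


def fit_lsrl(xs, ys):
    """Return (intercept, slope) of the least-squares regression line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


intercept, slope = fit_lsrl(ages, motion)
residuals = [y - (intercept + slope * x) for x, y in zip(ages, motion)]

print(f"y-hat = {intercept:.2f} + {slope:.4f} * age")
for x, e in sorted(zip(ages, residuals)):
    print(f"age {x:2d}  residual {e:7.2f}")
```

Plotting the printed pairs with age on the horizontal axis gives the residual plot the slide asks for.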

6 (Same age and range-of-motion data as slide 5.) Plot the residuals against the y-hats. How does this residual plot compare to the previous one? [Figure: residual plot, residuals vs. y-hat]

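7 Residual plots show the same pattern whether plotted against x or y-hat, because y-hat is a linear function of x (the plot is mirrored left to right when the slope is negative). [Figure: residual plot, residuals vs. x]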
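One way to check this claim on the knee surgery data is to confirm that the residuals appear in the same left-to-right order on both plots. The sketch below assumes the data from slide 5 and uses illustrative names; it relies on the fitted slope being positive and the ages being distinct, so sorting by x and sorting by y-hat should give the same sequence of residuals.

```python
# Knee surgery data from slide 5.
ages = [35, 24, 40, 31, 28, 25, 26, 16, 14, 20, 21, 30]
motion = [154, 142, 137, 133, 122, 126, 135, 135, 108, 120, 127, 122]

n = len(ages)
x_bar = sum(ages) / n
y_bar = sum(motion) / n
slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(ages, motion))
         / sum((x - x_bar) ** 2 for x in ages))
intercept = y_bar - slope * x_bar

y_hats = [intercept + slope * x for x in ages]
residuals = [y - y_hat for y, y_hat in zip(motion, y_hats)]

# Left-to-right order of the residuals on each plot.
order_by_x = [e for _, e in sorted(zip(ages, residuals))]
order_by_y_hat = [e for _, e in sorted(zip(y_hats, residuals))]

print(order_by_x == order_by_y_hat)
```

Only the horizontal scale differs between the two plots; the vertical positions of the points are identical.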

8 Coefficient of determination (r²) - Gives the proportion of variation in y that can be attributed to an approximate linear relationship between x & y. It remains the same no matter which variable is labeled x.
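The symmetry claim can be checked numerically. This sketch assumes the knee surgery data from slide 5 and an illustrative helper `r_squared`; it fits the regression in both directions and compares the two values of r², each computed as 1 minus the ratio of the squared error about the line to the squared error about the mean.

```python
# Knee surgery data from slide 5.
ages = [35, 24, 40, 31, 28, 25, 26, 16, 14, 20, 21, 30]
motion = [154, 142, 137, 133, 122, 126, 135, 135, 108, 120, 127, 122]


def r_squared(xs, ys):
    """Fit the LSRL of ys on xs and return 1 - SSE/SST."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    sst = sum((y - y_bar) ** 2 for y in ys)
    return 1 - sse / sst


r2_motion_on_age = r_squared(ages, motion)   # age labeled x
r2_age_on_motion = r_squared(motion, ages)   # range of motion labeled x

print(round(r2_motion_on_age, 3), round(r2_age_on_motion, 3))
```

The two regression lines themselves differ; only r² is shared.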

9 (Same data as slide 5.) Let's examine r². Suppose you were going to predict a future y but you didn't know the x-value. Your best guess would be the overall mean of the existing y's. Sum of the squared residuals (errors) using the mean of y: SS = 1564.917

10 (Same data as slide 5.) Now suppose you were going to predict a future y but you DO know the x-value. Your best guess would be the point on the LSRL for that x-value (y-hat). Sum of the squared residuals (errors) using the LSRL: SS = 1085.735

11 (Same data as slide 5.) By what percent did the sum of the squared error go down when you went from just an "overall mean" model to the "regression on x" model? Using the mean of y: SS = 1564.917. Using the LSRL: SS = 1085.735. The reduction is (1564.917 - 1085.735) / 1564.917 ≈ 0.306, or about 30.6%. This is r²: the proportion of the variation in the y-values that is explained by the linear relationship with the x-values.
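The two sums of squares and the percent reduction can be reproduced from the data. This is a sketch under the assumption that the slide-5 data are used as given; the names `ss_mean` and `ss_line` are illustrative labels for the two sums of squared errors.

```python
# Knee surgery data from slide 5.
ages = [35, 24, 40, 31, 28, 25, 26, 16, 14, 20, 21, 30]
motion = [154, 142, 137, 133, 122, 126, 135, 135, 108, 120, 127, 122]

n = len(ages)
x_bar = sum(ages) / n
y_bar = sum(motion) / n
slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(ages, motion))
         / sum((x - x_bar) ** 2 for x in ages))
intercept = y_bar - slope * x_bar

# "Overall mean" model: every prediction is the mean of y.
ss_mean = sum((y - y_bar) ** 2 for y in motion)

# "Regression on x" model: each prediction is y-hat on the LSRL.
ss_line = sum((y - (intercept + slope * x)) ** 2
              for x, y in zip(ages, motion))

# Proportional reduction in squared error.
r_sq = (ss_mean - ss_line) / ss_mean

print(round(ss_mean, 3), round(ss_line, 3), round(r_sq, 3))
```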

12 (Same data as slide 5.) How well does age predict the range of motion after knee surgery? Approximately 30.6% of the variation in range of motion after knee surgery can be explained by the linear regression of age and range of motion.

13 Interpretation of r²: Approximately r²% of the variation in y can be explained by the LSRL of x & y.

14 Computer-generated regression analysis of knee surgery data:

Predictor   Coef     Stdev    T      P
Constant    107.58   11.12    9.67   0.000
Age         0.8710   0.4146   2.10   0.062

s = 10.42   R-sq = 30.6%   R-sq(adj) = 23.7%

What is the equation of the LSRL? Find the slope & y-intercept. What are the correlation coefficient and the coefficient of determination? NEVER use adjusted r²! Be sure to convert r² to decimal before taking the square root!
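A short sketch of how the questions on this slide can be worked from the printed output alone. The numbers are copied from the output; the variable names and the `predict` helper are illustrative assumptions. The correlation coefficient is the square root of r² (as a decimal, not the adjusted value), and it takes the sign of the slope.

```python
import math

# Values copied from the regression output on slide 14.
intercept = 107.58      # "Constant" coefficient (y-intercept)
slope = 0.8710          # "Age" coefficient
r_sq_percent = 30.6     # R-sq, NOT R-sq(adj)


def predict(age):
    """Predicted range of motion from the LSRL."""
    return intercept + slope * age


# Convert the percent to a decimal before taking the square root.
r_sq = r_sq_percent / 100

# r has the same sign as the slope of the LSRL.
r = math.copysign(math.sqrt(r_sq), slope)

print(f"y-hat = {intercept} + {slope} * age")
print(f"r^2 = {r_sq:.3f}, r = {r:.3f}")
```

For example, `predict(30)` evaluates the LSRL at age 30.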

15 Outlier - In a regression setting, an outlier is a data point with a large residual.

16 Influential point - A point that influences where the LSRL is located. If removed, it will significantly change the slope of the LSRL.

17 One factor in the development of tennis elbow is the impact-induced vibration of the racket and arm at ball contact.

Racket   Resonance (Hz)   Acceleration (m/sec/sec)
1        105              36.0
2        106              35.0
3        110              34.5
4        111              36.8
5        112              37.0
6        113              34.0
7        113              34.2
8        114              33.8
9        114              35.0
10       119              35.0
11       120              33.6
12       121              34.2
13       126              36.2
14       189              30.0

Sketch a scatterplot of these data. Calculate the LSRL & correlation coefficient. Does there appear to be an influential point? If so, remove it and then calculate the new LSRL & correlation coefficient.

18 (189, 30) could be influential. Remove it & recalculate the LSRL.

19 (189, 30) was influential since it moved the LSRL.
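The racket exercise can be reproduced with the sketch below, which assumes the fourteen (resonance, acceleration) pairs from slide 17 and an illustrative helper `fit_and_correlate`. It computes the LSRL and the correlation coefficient twice, once with all rackets and once with racket 14 at (189, 30) removed, so the two fits can be compared.

```python
import math

# Racket data from slide 17: resonance (Hz), acceleration (m/sec/sec).
resonance = [105, 106, 110, 111, 112, 113, 113,
             114, 114, 119, 120, 121, 126, 189]
acceleration = [36.0, 35.0, 34.5, 36.8, 37.0, 34.0, 34.2,
                33.8, 35.0, 35.0, 33.6, 34.2, 36.2, 30.0]


def fit_and_correlate(xs, ys):
    """Return (intercept, slope, r) for the LSRL of ys on xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    slope = sxy / sxx
    r = sxy / math.sqrt(sxx * syy)
    return y_bar - slope * x_bar, slope, r


# All 14 rackets.
b0_all, b1_all, r_all = fit_and_correlate(resonance, acceleration)

# Racket 14, the point (189, 30.0), is the last entry; drop it.
b0_trim, b1_trim, r_trim = fit_and_correlate(resonance[:-1],
                                             acceleration[:-1])

print(f"with (189, 30):    y-hat = {b0_all:.2f} + {b1_all:.4f}x, r = {r_all:.3f}")
print(f"without (189, 30): y-hat = {b0_trim:.2f} + {b1_trim:.4f}x, r = {r_trim:.3f}")
```

Comparing the two printed lines shows how much a single point far from the others in x moves both the slope and the correlation.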

20 Which of these measures are resistant? LSRL, correlation coefficient, coefficient of determination. NONE: all are affected by outliers.

