Residuals Recall that the vertical distances from the points to the least-squares regression line are as small as possible.  Because those vertical distances.


1 Residuals Recall that the vertical distances from the points to the least-squares regression line are as small as possible. Because those vertical distances represent “left-over” variation in the response after fitting the regression line, these distances are called residuals.

2 Or in other words, a residual is the signed vertical distance from a point to the LSRL: positive when the point lies above the line, negative when it lies below.

3 Calculating a Residual One subject's NEA rose by 135 calories and he gained 2.7 kg of fat. The predicted gain for 135 calories comes from substituting x = 135 into the regression equation. The residual for this subject is therefore: residual = observed y - predicted y.

4 Fat Gain & NEA (yet again!) Here are the residuals for all 16 data values from the NEA experiment: Although residuals can be calculated from any model that is fitted to the data, the residuals from the least-squares line have a special property: the sum of the least-squares residuals is always zero. (Try adding the numbers above; they add up to zero, apart from rounding.)
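The zero-sum property can be checked with a short Python sketch. The data below are made up for illustration and are NOT the NEA measurements; the helper names fit_line and residuals are likewise assumptions, not part of the original slides.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line of y on x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def residuals(xs, ys):
    """Residual = observed y minus the y predicted by the fitted line."""
    a, b = fit_line(xs, ys)
    return [y - (a + b * x) for x, y in zip(xs, ys)]


# Illustrative data only (hypothetical, not from the NEA experiment).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

res = residuals(xs, ys)
print(res)       # individual residuals, some positive and some negative
print(sum(res))  # zero up to floating-point rounding
```

Because of floating-point arithmetic the printed sum may be a tiny number such as 1e-15 rather than exactly 0.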

5 The line y = 0 in the residual plot corresponds to the regression line, and also marks the mean of the residuals. The residual plot magnifies the deviations from the line to make patterns easier to see.

6 Residual Plots What to look for when examining a residual plot: 1. Residual plots should have no pattern.

7 Residual Plots What to look for when examining a residual plot: A curved pattern shows that the relationship may not be linear. Increasing spread about the line as x increases indicates the prediction will be less accurate for larger x values. Similarly, decreasing spread indicates the prediction will be less accurate for smaller x values.
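As a rough illustration of the curved pattern, the Python sketch below fits a straight line to data that actually follow y = x squared. The data and the helper name line_residuals are assumptions chosen for illustration. The residuals come out positive at both ends and negative in the middle, which is the kind of curve a residual plot would reveal.

```python
def line_residuals(xs, ys):
    """Fit the least-squares line of y on x and return observed - predicted."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


# Hypothetical curved data: y = x**2, so a straight line is the wrong model.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [x ** 2 for x in xs]

res = line_residuals(xs, ys)
print(res)  # prints [2.0, -1.0, -2.0, -1.0, 2.0]
```

The residuals still sum to zero, so the sum alone cannot detect the problem; the pattern in their signs is what signals that a linear model is a poor fit.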

8 Residual Plots What to look for when examining a residual plot: 1. The residual plot should show no pattern. 2. The residuals should be relatively small in size.

9 The role of r² in regression A residual plot is a graphical tool for evaluating how well a linear model fits the data. Look at the residual plot first to see if a linear model is a good fit. If the linear model is a good fit, then there is also a numerical quantity that tells us how well the LSRL does at predicting values of the response variable y. It is r², the coefficient of determination.

10 The role of r² in regression r² is actually the correlation squared, but there's more to the story... The idea of r² is this: how much better is the least-squares line at predicting responses y than if we just used the mean of y?

11 The role of r² in regression Is the LSRL better at predicting the data values than the mean? r² tells us how much better. Here's the line that represents the mean of the y values of our data. Here's our LSRL.

12 Note: Remember we defined the variance back when we talked about standard deviation. r² compares the variation of the y values about their mean (SST, the total sum of squares) with the variation left over in the residuals (SSE, the sum of squared residuals). Here's the formula: r² = (SST - SSE) / SST = 1 - SSE / SST.
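The comparison of SST and SSE can be sketched in Python as follows. The data are made up for illustration (not the NEA values), and the variable names are assumptions. The sketch also checks the earlier claim that r² equals the square of the correlation r.

```python
from math import sqrt

# Illustrative data only (hypothetical).
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.8, 4.1, 5.9, 8.4, 9.6, 12.3]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

sxx = sum((x - x_bar) ** 2 for x in xs)
syy = sum((y - y_bar) ** 2 for y in ys)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

# Least-squares line of y on x.
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# SST: squared error when every prediction is just the mean of y.
sst = syy
# SSE: squared error when predictions come from the regression line.
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

r_squared = 1 - sse / sst

# Correlation computed directly, to compare r**2 with r_squared.
r = sxy / sqrt(sxx * syy)

print(r_squared)
print(r ** 2)  # agrees with r_squared up to floating-point rounding
```

If SSE is nearly as large as SST, the line predicts little better than the mean and r² is close to 0; if SSE is close to 0, r² is close to 1.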

13 For example, if r² = 0.606 (as it does in the NEA example), then about 61% of the variation in fat gain among the individual subjects is explained by the straight-line relationship between fat gain and NEA. The other 39% is individual variation among subjects that is not explained by the linear relationship.

14 When you report a regression, give r² as a measure of how successful the regression was in explaining the response. When you see a correlation, square it to get a better feel for the strength of the linear relationship.

15 Review Facts About Least-Squares Regression The distinction between explanatory and response variables is essential in regression. In the regression setting you must know clearly which variable is explanatory!

16 Review Facts About Least-Squares Regression There is a close connection between correlation and the slope of the LSRL. The slope is b = r (s_y / s_x), where s_x and s_y are the standard deviations of x and y. This equation says that along the regression line, a change of one standard deviation in x corresponds to a change of r standard deviations in y.

17 Review Facts About Least-Squares Regression The least-squares regression line of y on x always passes through the point (mean of x values, mean of y values).
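Both review facts, the link between the slope and the correlation and the line passing through the point of means, can be checked numerically. The Python sketch below uses made-up data and assumed variable names; it computes the slope once from r and the standard deviations and once directly from the least-squares sums.

```python
from statistics import mean, stdev

# Illustrative data only (hypothetical).
xs = [2.0, 4.0, 5.0, 7.0, 9.0, 12.0]
ys = [3.1, 4.9, 6.2, 7.7, 10.4, 12.8]

n = len(xs)
x_bar, y_bar = mean(xs), mean(ys)
s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations

# Correlation r as the average product of standardized values.
r = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
        for x, y in zip(xs, ys)) / (n - 1)

# Slope from the correlation: b = r * s_y / s_x.
slope_from_r = r * s_y / s_x

# Slope directly from the least-squares sums, for comparison.
slope_direct = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
                / sum((x - x_bar) ** 2 for x in xs))

# The least-squares intercept is a = y_bar - b * x_bar, which is exactly
# what forces the line through the point (x_bar, y_bar).
intercept = y_bar - slope_from_r * x_bar
predicted_at_mean = intercept + slope_from_r * x_bar

print(slope_from_r, slope_direct)  # equal up to rounding
print(predicted_at_mean, y_bar)    # equal up to rounding
```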

18 The correlation r describes the strength of a straight-line relationship. The square of the correlation, r², is the fraction of the variation in the values of y that is explained by the least-squares regression of y on x.

