3.2 - Least-Squares Regression
Where else have we seen “residuals”? s_x = data point - mean (observed - predicted). z-scores = observed - expected. *Note: this is just the numerator of these calculations. Remember: AP
Below is the LSRL for sprint time (seconds) and long jump distance (inches). Find and interpret the residual for John, who had a time of 8.09 seconds and a jump of 151 inches. predicted long jump distance = (intercept) + (slope)(sprint time). residual = observed - predicted = 151 - (predicted distance at 8.09 seconds), which is almost 70 inches. John jumped much farther than our least-squares regression line predicted: almost 70 inches farther than expected based on his sprint time.
So why a least-squares regression line? bcs.whfreeman.com/tps4e/#628644__666392__
Residual Plots: a residual plot is a scatterplot of the residuals against the explanatory variable. Use it to help assess how well your regression line fits the data.
Residual Plots: With Normal probability plots, we want the graph to be linear to support the Normality of our data. With residual plots, we want the residuals to be randomly scattered with no pattern, which supports modeling our data with a linear regression. Remember: Correlation does NOT assess linearity, just strength and direction!
What’s a Good Residual Plot? (1) No obvious pattern - the LSRL would be in the middle of the data, with some data above and some below. (2) Relatively small residuals - the data points are close to the LSRL.
Do the following residual plots support or refute a linear model?
How to Graph? (1) Take each data point and determine its residual. (2) Plot the residuals versus the explanatory variable, i.e., the points (explanatory data, residual). The explanatory variable goes on the horizontal axis (use the same numbers as your scatterplot) and the residual goes on the vertical axis.
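The graphing steps above can be sketched in Python. This is only an illustration under stated assumptions: the data values are made up, and the helper names `least_squares_line` and `residual_plot_points` are hypothetical, not taken from the textbook or the calculator instructions.

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least-squares regression line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def residual_plot_points(xs, ys):
    """Return the points (explanatory value, residual) for a residual plot."""
    intercept, slope = least_squares_line(xs, ys)
    # residual = observed - predicted
    return [(x, y - (intercept + slope * x)) for x, y in zip(xs, ys)]


# Made-up data, for illustration only (not the sprint / long jump data).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

points = residual_plot_points(xs, ys)
for x, r in points:
    print(f"({x}, {r:+.2f})")
```

Each printed pair is one point on the residual plot. The horizontal coordinates are the same explanatory values used in the scatterplot, and the residuals from a least-squares line always sum to zero (up to rounding).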
Calculator Construction: If you have a lot of data, follow the instructions on page 178 to construct your residual plot (you will also need to have completed the Technology Corner on p. 170).
What is Standard Deviation? Roughly the average distance a data point is from the mean (the variance is the average squared distance). Is there an s_x? Is there an s_y? So why not s (the standard deviation of the residuals)?
Standard Deviation of Residuals: gives the approximate size of an “average” or “typical” prediction error from our LSRL. The formula is on page 177. Why divide by n-2?
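As a rough sketch of the calculation, the standard deviation of the residuals is usually defined as s = sqrt(sum of squared residuals / (n - 2)). The usual explanation for n - 2 is that two quantities, the slope and the intercept, were estimated from the data. The residual values below are made up for illustration, and the function name `residual_standard_deviation` is hypothetical.

```python
import math


def residual_standard_deviation(residuals):
    """s = sqrt(sum of squared residuals / (n - 2))."""
    n = len(residuals)
    if n <= 2:
        raise ValueError("need more than 2 residuals to divide by n - 2")
    return math.sqrt(sum(r ** 2 for r in residuals) / (n - 2))


# Made-up residuals, for illustration only.
residuals = [0.06, -0.13, 0.18, -0.21, 0.10]

s = residual_standard_deviation(residuals)
print(f"s = {s:.3f}")
```

The result is in the same units as the response variable, so it can be read as the size of a typical prediction error.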