Least-Squares Regression
Regression Line (Model): It has the form y = a + bx, where b is the slope, the amount by which y changes when x increases by 1 unit, and a is the intercept, the value of y when x = 0.
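The least-squares regression line is the line that makes the sum of the squares of the vertical distances of the data points from the line as small as possible. Slope: b = r(s_y / s_x). Intercept: a = ȳ - b·x̄. Here r is the correlation between x and y, s_x and s_y are the standard deviations of x and y, and x̄ and ȳ are their means.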
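As a rough illustration, the standard least-squares formulas b = r(s_y / s_x) and a = ȳ - b·x̄ can be computed directly. The sketch below uses only the Python standard library. The function name least_squares_line and the five data points are made up for illustration and are not from the lecture.

```python
import math

def least_squares_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x (hypothetical helper)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sample standard deviations (divide by n - 1).
    s_x = math.sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - y_bar) ** 2 for y in ys) / (n - 1))
    # Correlation coefficient r.
    r = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / ((n - 1) * s_x * s_y)
    b = r * s_y / s_x        # slope
    a = y_bar - b * x_bar    # intercept
    return a, b

# Made-up data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = least_squares_line(xs, ys)
print(f"y = {a:.2f} + {b:.2f}x")
```

For these made-up points the slope comes out to about 0.6 and the intercept to about 2.2.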
Linear Regression Purpose: To predict the value of a difficult-to-measure variable, Y, based on an easy-to-measure variable, X. Examples: predict state revenues; predict GPA based on SAT score; predict reaction time from blood alcohol level.
Extrapolation: Extrapolation is the use of a regression line for prediction far outside the range of values of the independent variable x that you used to obtain the line. Such predictions are often not accurate, because nothing in the data shows that the linear pattern continues there. GRE consideration? Be careful!
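One simple way to stay alert to extrapolation is to have the prediction step report whether x lies outside the range of x values used to fit the line. This is only a sketch: the function name predict, the coefficients, and the range are hypothetical. It flags any x outside the observed range, which is stricter than "far outside".

```python
def predict(a, b, x, x_min, x_max):
    """Return (prediction, is_extrapolation) for the line y = a + b*x.

    x_min and x_max are the smallest and largest x values in the data
    used to obtain the line (hypothetical helper, not from the lecture).
    """
    y_hat = a + b * x
    is_extrapolation = x < x_min or x > x_max
    return y_hat, is_extrapolation

# Made-up line fitted on x values between 1 and 5.
inside, inside_flag = predict(2.2, 0.6, 3, x_min=1, x_max=5)
outside, outside_flag = predict(2.2, 0.6, 50, x_min=1, x_max=5)
print(inside, inside_flag)    # prediction within the observed range
print(outside, outside_flag)  # flagged: treat this prediction with caution
```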
Interpreting Results: The regression line always passes through the point (x̄, ȳ), the means of x and y. The slope ‘says’ that, along the regression line, a change of one standard deviation in x corresponds to a change of r standard deviations in y. When r = 1 or -1, the change in standard units is the same size for y as for x.
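Both facts can be checked numerically. The sketch below fits the line from sums of squares, without using r, and then confirms that the line passes through the point of means and that a one-standard-deviation increase in x moves the prediction by r standard deviations of y. The data are made up for illustration.

```python
import math

# Made-up data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
n = len(xs)

x_bar = sum(xs) / n
y_bar = sum(ys) / n
s_xx = sum((x - x_bar) ** 2 for x in xs)
s_yy = sum((y - y_bar) ** 2 for y in ys)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

# Least-squares slope and intercept from the sums of squares.
b = s_xy / s_xx
a = y_bar - b * x_bar

# Correlation and sample standard deviations, computed separately.
r = s_xy / math.sqrt(s_xx * s_yy)
s_x = math.sqrt(s_xx / (n - 1))
s_y = math.sqrt(s_yy / (n - 1))

# 1. The prediction at x-bar equals y-bar.
prediction_at_mean = a + b * x_bar

# 2. Increasing x by one s_x changes the prediction by b * s_x,
#    which, measured in units of s_y, equals r.
change_in_sd_units = (b * s_x) / s_y

print(prediction_at_mean, y_bar)
print(change_in_sd_units, r)
```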
Variation: The R-squared value, r^2 (the square of the correlation r), is the percentage of the variation in Y explained by the model. The higher the value, the better the line fits the data.
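A common way to compute this quantity is 1 - SSE/SST, where SSE is the sum of squared residuals around the line and SST is the total sum of squares around the mean of y. The sketch below assumes a line that has already been fitted by least squares. The function name r_squared, the data, and the coefficients are made up for illustration.

```python
def r_squared(xs, ys, a, b):
    """Fraction of the variation in y explained by the line y = a + b*x."""
    y_bar = sum(ys) / len(ys)
    # Unexplained variation: squared vertical distances from the line.
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    # Total variation: squared distances from the mean of y.
    sst = sum((y - y_bar) ** 2 for y in ys)
    return 1 - sse / sst

# Made-up data and its least-squares line y = 2.2 + 0.6x.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r2 = r_squared(xs, ys, a=2.2, b=0.6)
print(f"{100 * r2:.0f}% of the variation in y is explained")
```

For a least-squares line with one predictor, this value equals the square of the correlation between x and y.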
No Straight Line? What if the scatterplot shows that a straight-line model is not appropriate? You might check whether some function of y is approximately linear in some function of x. Examples: plot y versus ln(x); plot 1/y versus 1/x. If so, fit a straight-line model in terms of the new variables.
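As a sketch of the y versus ln(x) case, the code below transforms x, fits a straight line to the transformed variable, and then predicts by applying the same transformation to a new x. The helper fit_line and the data are made up. The y values are constructed to be exactly linear in ln(x), so the fit recovers the constants used to build them.

```python
import math

def fit_line(us, vs):
    """Least-squares intercept and slope for v = a + b*u (hypothetical helper)."""
    n = len(us)
    u_bar = sum(us) / n
    v_bar = sum(vs) / n
    s_uv = sum((u - u_bar) * (v - v_bar) for u, v in zip(us, vs))
    s_uu = sum((u - u_bar) ** 2 for u in us)
    b = s_uv / s_uu
    a = v_bar - b * u_bar
    return a, b

# Made-up data: y is curved in x but exactly linear in ln(x).
xs = [1, 2, 4, 8, 16]
ys = [1 + 2 * math.log(x) for x in xs]

# Fit the straight-line model in terms of the new variable ln(x).
log_xs = [math.log(x) for x in xs]
a, b = fit_line(log_xs, ys)

# To predict, transform the new x the same way before using the line.
y_hat = a + b * math.log(10)
print(a, b, y_hat)
```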
Example: Let’s use the alcoholic beverage and recall data. How can we tell if it is reasonable to fit a linear regression model? Let’s run the analysis and interpret the results.