Linear Regression: The Science of Predicting Outcome
Least-Squares Regression
Least-squares regression (LSR) is a method for finding a line that summarizes the relationship between two variables.
A regression line is a straight line that describes how a response variable y changes as an explanatory variable x changes.
We often use a regression line to predict the value of y for a given value of x.
LSRL: Least-Squares Regression Line
y-hat = a + bx, where a is the y-intercept and b is the slope.
Example #1 - Finding the LSRL
Consider the following data:
Shoe Size (men’s U.S.)    Height (in)
7                         64
10                        69
12                        71
8                         68
9.5                       (missing)
10.5                      70
11                        72
12.5                      74
13.5                      77
With this data, find the LSRL. Start by entering the data into list 1 and list 2.
Example #1 - Finding the LSRL
We need our graphing calculator to work through the first case for today.
Example #1 - Finding the LSRL
You should then see the results of the regression:
a = 53.24
b = 1.65
r-squared = .8422
r = .9177
Here r is the correlation coefficient for the scatterplot.
So the LSRL is: Height = 53.24 + 1.65(Shoe Size)
Example #2 – Interpreting LSRL
Interpreting the intercept: When your shoe size is 0, you should be about 53.24 inches tall. (Of course, this does not make much sense in the context of the problem.)
Interpreting the slope: For each increase of 1 in shoe size, we would expect the height to increase by 1.65 inches.
Example #3 – Using LSRL
Making predictions: How tall might you expect someone with a shoe size of 12.5 to be?
Just plug in 12.5 for the shoe size in the LSRL, so...
Height = 53.24 + 1.65(12.5) = 73.865 inches
(This is a prediction and is therefore not exact.)
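The same plug-in step can be sketched in a few lines of Python. The coefficients come from the calculator output in Example #1; the function name predict_height is made up for this illustration.

```python
# Coefficients from the calculator output in Example #1.
A = 53.24  # y-intercept, in inches
B = 1.65   # slope, in inches of height per shoe size

def predict_height(shoe_size):
    """Predicted height in inches from the LSRL (hypothetical helper name)."""
    return A + B * shoe_size

print(f"{predict_height(12.5):.3f}")  # 73.865
```

The shoe sizes in the data run from 7 to 13.5, so predictions far outside that range (such as a shoe size of 0) should not be trusted.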
Practice
Student    Number of Beers    Blood Alcohol Level
1          5                  0.10
2          2                  0.03
3          9                  0.19
4          8                  0.12
5          3                  0.04
6          7                  0.095
7          3                  0.07
8          5                  0.06
9          3                  0.02
10         5                  0.05
11         4                  0.07
12         6                  0.10
13         5                  0.085
14         7                  0.09
15         1                  0.01
16         4                  0.05
A. Find the strength of correlation between the 2 variables.
B. Write the linear model for this data set.
C. What will your BAC level be if you drink 6 bottles of beer?
Coefficients a and b
The slope is: b = r(s_y / s_x)
s_y and s_x are the sample standard deviations of y and x, and r is the correlation coefficient (kind of like rise over run).
The intercept is: a = y-bar - b(x-bar)
y-bar and x-bar are the means of y and x, respectively.
The equation of the least-squares regression line is written as: y-hat = a + bx
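As a rough check on these formulas, here is a minimal Python sketch that computes r, then the slope b = r(s_y / s_x), then the intercept a = y-bar - b(x-bar). The function name lsrl is made up for this illustration. The data pairs are an assumption: they are the 16 (beers, BAC) values as commonly tabulated for this study, not values confirmed by these slides.

```python
from statistics import mean, stdev

def lsrl(xs, ys):
    """Return (a, b, r) for the line y-hat = a + b*x (hypothetical helper name)."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations (divide by n - 1)
    # Correlation r: sum of products of standardized values, divided by n - 1.
    r = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
            for x, y in zip(xs, ys)) / (n - 1)
    b = r * s_y / s_x      # slope
    a = y_bar - b * x_bar  # intercept
    return a, b, r

# Assumed data: (number of beers, BAC) for the 16 students.
beers = [5, 2, 9, 8, 3, 7, 3, 5, 3, 5, 4, 6, 5, 7, 1, 4]
bac = [0.10, 0.03, 0.19, 0.12, 0.04, 0.095, 0.07, 0.06,
       0.02, 0.05, 0.07, 0.10, 0.085, 0.09, 0.01, 0.05]

a, b, r = lsrl(beers, bac)
print(f"y-hat = {a:.5f} + {b:.6f}(x), r = {r:.3f}")
```

With these data the slope and intercept should land close to the values in the Minitab answers (about 0.017964 and -0.01270).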
This table describes a study that recorded data on the number of beers consumed and blood alcohol content (BAC) for 16 students. Here is some partial computer output from Minitab relating to these data:
[Minitab output, with the y-intercept and slope labeled]
(a) Use the computer output to write the equation of the least-squares line.
(b) Interpret the slope and y-intercept of the equation in this setting.
(c) What blood alcohol level would your equation predict for a student who consumed 6 beers?
Answers
(a) If y = blood alcohol content (BAC) and x = number of beers, BAC = −0.01270 + 0.017964(number of beers).
(b) Slope: for every extra beer consumed, the BAC will increase by an average of 0.017964. Intercept: if no beers are consumed, the BAC will be, on average, −0.01270 (obviously meaningless).
(c) Predicted BAC = −0.01270 + 0.017964(6) ≈ 0.0951
Here’s computer-generated output for a bivariate data set. Write a linear model that corresponds to this data set.
y-hat = -0.124 + 0.0179(x)
Class Activity: Arm Span vs. Height
“On predicting height given arm span”
Students will measure their height and arm span. Then they will write the LSRL from the data they collected and use it to predict a person’s height from their arm span.