Published by Giles Booker. Modified over 8 years ago.
1
Unit 3 – Association: Contingency, Correlation, and Regression Lesson 3-3 Linear Regression, Residuals, and Variation
2
3-3 Learning Objectives 1) Regression Line 2) Regression Equation 3) Residuals 4) Calculating the Regression Line/Equation 5) Slope vs. Correlation 6) Squared Correlation (Variation)
3
Objective 1: REGRESSION LINE A regression line predicts the value of the response variable (y) for any given value of the explanatory variable (x). In previous math classes we called it the line of best fit, because it is the best possible line to fit the given data. FYI: The line itself is calculated using the method of ‘least squares’, which minimizes the sum of the squared residuals (the differences between each actual y value and its predicted y value).
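As a rough illustration of the least-squares idea, the sketch below fits a line to a small made-up data set (the values are invented for this example, not taken from the lesson) and checks that nudging the intercept or slope in several directions makes the sum of squared residuals larger. The function names are illustrative only.

```python
# Hypothetical data, invented for illustration only (not from the lesson).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]


def least_squares_line(xs, ys):
    """Return (intercept a, slope b) of the least-squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def sum_squared_residuals(a, b, xs, ys):
    """Sum of (actual y - predicted y)^2 for the line y-hat = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


a, b = least_squares_line(xs, ys)
best = sum_squared_residuals(a, b, xs, ys)

# Shift the line slightly in several directions; each shifted line fits worse.
nudges = [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.1), (0.0, -0.1), (0.1, -0.05)]
worse = [sum_squared_residuals(a + da, b + db, xs, ys) for da, db in nudges]

print(f"least-squares line: y-hat = {a:.3f} + {b:.3f}x, SSR = {best:.4f}")
print("every nudged line has a larger SSR:", all(w > best for w in worse))
```

The check only tries a handful of nudges, so it illustrates the minimizing property rather than proving it.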
4
Objective 1: REGRESSION LINE EXAMPLE A: What is the predicted value of y when x = 45? Answer: y = 200 (approx.) NOT ON GUIDED NOTES!! The regression line runs through the calculated point $(\bar{x}, \bar{y})$, the mean of the x values paired with the mean of the y values.
5
Objective 2: REGRESSION EQUATION The regression equation is the exact equation of the regression line. It tells us the predicted value of y for any value of x: $\hat{y} = a + bx$, where $\hat{y}$ (‘y hat’) is the predicted value of y, a is the y-intercept, and b is the slope. Think about it: How does this differ from a similar equation you have learned in previous math courses?
6
Objective 2: REGRESSION EQUATION
The y-intercept (a): Tells us the predicted value of y when x = 0. Helps in plotting the line (to see where it starts). May not have any interpretive value if no observations had x values near 0.
The slope (b), which gets multiplied by the x value: Measures the change in the predicted value of y for a 1-unit increase in the explanatory variable (x).
7
Objective 2: REGRESSION EQUATION Example B: Anthropologists predict the height of a human using their remains with the following regression equation: $\hat{y} = 61.4 + 2.4x$. Understand the variables: $\hat{y}$ is the predicted height and x is the length of a femur (thighbone), both measured in centimeters. Interpret the slope in context: A 1 cm increase in femur length results in a 2.4 cm increase in predicted height. Interpret the y-intercept in context: A person with a femur length of 0 cm is predicted to be 61.4 cm tall. See, sometimes the y-intercept has no contextual value!
8
Objective 2: REGRESSION EQUATION Example B: Anthropologists predict the height of a human using their remains with the following regression equation: $\hat{y} = 61.4 + 2.4x$. Use the regression equation to predict the height of a person whose recovered femur length was 50 centimeters: $\hat{y} = 61.4 + 2.4(50) = 61.4 + 120 = 181.4$ cm. Since 1 cm ≈ .033 feet, 181.4(.033) ≈ 5.99 feet, which is just under 6 feet (about 5 feet 11 inches).
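The prediction in Example B can be sketched in a few lines, taking the regression equation to be ŷ = 61.4 + 2.4x (the intercept and slope given in the example's answers). The function names are made up for this sketch, and the unit conversion uses the exact 2.54 cm per inch rather than the slide's rounded .033 feet per cm.

```python
CM_PER_INCH = 2.54  # exact by definition


def predict_height_cm(femur_cm):
    """Predicted height (cm) from femur length (cm): y-hat = 61.4 + 2.4x."""
    return 61.4 + 2.4 * femur_cm


def cm_to_feet_inches(cm):
    """Convert centimeters to (whole feet, remaining inches)."""
    total_inches = cm / CM_PER_INCH
    feet = int(total_inches // 12)
    inches = total_inches - 12 * feet
    return feet, inches


height_cm = predict_height_cm(50)
feet, inches = cm_to_feet_inches(height_cm)
print(f"predicted height: {height_cm:.1f} cm (about {feet} ft {inches:.1f} in)")
```

With the exact conversion the result lands a little over 5 feet 11 inches, which agrees with the slide's rounded answer.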
9
Objective 2: REGRESSION EQUATION So regression lines and equations allow us to predict a single value of the response variable for a given value of x. But we should not expect all subjects at that value of x to have the same value of y. Variability occurs in the y values (levels of change). It’s only a prediction!
10
Objective 3: RESIDUALS A single regression line/equation is used to summarize the relationship among many data points, so some of the actual data will fall above the regression line and some below it. We call the difference between where a value is and where the regression equation says it should be its residual. (A residual is found by taking the actual y value and subtracting the predicted y value: residual = $y - \hat{y}$.)
11
Objective 3: RESIDUALS A few things to note about residuals: A residual measures the size of the prediction error. (Copy this plot onto your notes.) Is there error between the points and the line?
12
Objective 3: RESIDUALS A few things to note about residuals: It is the vertical distance between the point and the regression line. Some points are above the line, some below, at various distances.
13
Objective 3: RESIDUALS A few things to note about residuals: Each and every observation has a residual. All 5 points have one here. Is it possible for a point to have a residual of 0?
14
Objective 3: RESIDUALS A few things to note about residuals: A large residual indicates an unusual observation. Which point has the largest residual? What would we call it?
15
Objective 3: RESIDUALS A few things to note about residuals: All residuals when summed will equal 0. The residuals in the plot are 10, 3, -5, -2, and -6. Positive residuals: 10 + 3 = 13. Negative residuals: -5 + -2 + -6 = -13. Total: 13 + -13 = 0.
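A quick way to see the sum-to-zero property is to add up the slide's residuals and then repeat the check on a freshly fitted line. In the sketch below, the second data set is invented for illustration (it is not from the lesson), and the line is fitted with the usual least-squares formulas.

```python
# Residuals read off the slide's plot.
slide_residuals = [10, 3, -5, -2, -6]
print("slide residuals sum to", sum(slide_residuals))

# Hypothetical data, invented for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [12.0, 9.0, 11.0, 5.0, 4.0]

# Fit the least-squares line y-hat = a + b*x.
n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
    (x - x_bar) ** 2 for x in xs
)
a = y_bar - b * x_bar

# Residual = actual y minus predicted y, one per observation.
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
total = sum(residuals)

print("residuals:", [round(e, 2) for e in residuals])
print("sum is zero up to rounding error:", abs(total) < 1e-9)
```

The comparison uses a small tolerance because floating-point arithmetic can leave a tiny nonzero remainder.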
16
Objective 3: RESIDUALS Example C: From earlier we saw the predicted value for a person with femur bone length = 50 cm was 181.4 cm. If an actual person who had a 50 cm femur bone had a height of 192 cm, what is their residual? 192 - 181.4 = 10.6 cm. They were 10.6 cm taller than was predicted.
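Example C can be written as a short calculation, assuming the femur equation ŷ = 61.4 + 2.4x behind the 181.4 cm prediction. The helper names are illustrative, not from the lesson.

```python
def predict_height_cm(femur_cm):
    """Predicted height (cm) from femur length (cm): y-hat = 61.4 + 2.4x."""
    return 61.4 + 2.4 * femur_cm


def residual(actual_y, predicted_y):
    """Residual = actual y minus predicted y."""
    return actual_y - predicted_y


predicted = predict_height_cm(50)
example_residual = residual(192, predicted)

# A positive residual means the actual point sits above the regression line.
position = "above" if example_residual > 0 else "on or below"
print(f"predicted: {predicted:.1f} cm, residual: {example_residual:.1f} cm ({position} the line)")
```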
17
Objective 4: CALCULATING REGRESSION Your objective is to take all the given information and convert it into a prediction/regression equation. Slope: $b = r\left(\frac{s_y}{s_x}\right)$, the standard deviation of y divided by the standard deviation of x, multiplied by r, which is the correlation of the data. Y-intercept: $a = \bar{y} - b\bar{x}$; take b (from the slope formula), multiply it by the mean of the x values, and subtract the result from the mean of the y values.
18
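Objective 4: CALCULATING REGRESSION Example D: Generate the regression equation with the given information (pure math example). Slope: $b = 0.653\left(\frac{0.368}{0.0091}\right) \approx 26.4$. Y-intercept: $a = 4.979 - 26.4(0.275) = -2.281$. Regression equation: $\hat{y} = -2.281 + 26.4x$.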
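The arithmetic in Example D can be checked with a short script. It assumes the example's numbers are r = 0.653, s_y = 0.368, s_x = 0.0091, ȳ = 4.979, and x̄ = 0.275; that assignment is inferred from the arithmetic shown, since it is the reading that reproduces 26.4 and -2.281.

```python
# Assumed reading of Example D's given values (inferred from its arithmetic).
r = 0.653       # correlation
s_y = 0.368     # standard deviation of y
s_x = 0.0091    # standard deviation of x
y_bar = 4.979   # mean of y
x_bar = 0.275   # mean of x

# Slope: b = r * (s_y / s_x)
b = r * s_y / s_x

# The example rounds the slope to one decimal before computing the intercept.
b_rounded = round(b, 1)

# Intercept: a = y_bar - b * x_bar
a = y_bar - b_rounded * x_bar

print(f"slope b = {b:.3f} (rounded: {b_rounded})")
print(f"intercept a = {a:.3f}")
print(f"regression equation: y-hat = {a:.3f} + {b_rounded}x")
```

Carrying the unrounded slope through would shift the intercept slightly in the third decimal place, which is why the sketch rounds at the same step the example does.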
19
Objective 4: CALCULATING REGRESSION Using a calculator to find regression equations.
20
Objective 5: SLOPE VS CORRELATION
CORRELATION: Describes the strength of the linear association. Does not change when the units change (stuck between -1 and 1). Does not depend upon which variable is the response and which is the explanatory (flipping them won’t change the strength). Does not have specific predictive power.
SLOPE: Does not tell us the strength of the association (tells us the rate of change). Changes based on the units. The two variables must be identified as response and explanatory. Used to predict values of the response variable for given values of the explanatory variable.
BOTH: Give us the direction of the association (positive or negative).
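The contrast between slope and correlation can be tried out numerically. The sketch below uses invented femur and height measurements (not from the lesson) and a hypothetical helper, `slope_and_r`, to show that changing units or swapping the two variables changes the slope but leaves r alone.

```python
from math import sqrt


def slope_and_r(xs, ys):
    """Return (least-squares slope, correlation r) for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sxx, sxy / sqrt(sxx * syy)


# Hypothetical femur lengths (cm) and heights (cm), invented for illustration.
femur_cm = [40.0, 43.0, 45.0, 48.0, 50.0]
height_cm = [158.0, 164.0, 170.0, 175.0, 183.0]

# Original units: both variables in centimeters.
b_cm, r_cm = slope_and_r(femur_cm, height_cm)

# Change of units: femur length in meters instead of centimeters.
femur_m = [x / 100 for x in femur_cm]
b_m, r_m = slope_and_r(femur_m, height_cm)

# Swap the roles of explanatory and response variable.
b_swapped, r_swapped = slope_and_r(height_cm, femur_cm)

print(f"cm vs cm:  slope = {b_cm:.3f}, r = {r_cm:.4f}")
print(f"m  vs cm:  slope = {b_m:.3f}, r = {r_m:.4f}")
print(f"swapped:   slope = {b_swapped:.3f}, r = {r_swapped:.4f}")
```

All three slopes and r values share the same sign, which matches the BOTH entry: each gives the direction of the association.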
21
Objective 6: SQUARED CORRELATION Looking at the last calculator screen above (in your notes), which of those items have we not discussed yet? Answer: $r^2$. Squaring the correlation is a way to measure the proportion of the variation in the response variable (y-values) that is accounted for by the linear relationship with the explanatory variable (x-values). A correlation of r = .90 (strong), for example, gives $r^2 = .90^2 = .81$. So 81% of the variation in y-values can be explained by the x-values. (Squaring also removes negative values.)
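One way to make "proportion of variation accounted for" concrete is to compare r² with 1 minus the ratio of leftover (residual) variation to total variation in y. For a least-squares line the two agree. The data below are invented for illustration, not taken from the lesson.

```python
from math import sqrt

# Hypothetical data, invented for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.0, 4.0, 3.0, 6.0, 5.0, 8.0]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
sxx = sum((x - x_bar) ** 2 for x in xs)
syy = sum((y - y_bar) ** 2 for y in ys)

# Correlation and least-squares line y-hat = a + b*x.
r = sxy / sqrt(sxx * syy)
b = sxy / sxx
a = y_bar - b * x_bar

# Total variation in y, and the variation left over after using the line.
total_variation = syy
leftover_variation = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
proportion_explained = 1 - leftover_variation / total_variation

print(f"r = {r:.4f}, r^2 = {r ** 2:.4f}")
print(f"1 - leftover/total = {proportion_explained:.4f}")
```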
22
Objective 6: SQUARED CORRELATION Example E: Current GPA (response) was measured for 50 students, along with their self-reported study time, which had a correlation of r = .68 with GPA, and their number of absences, which had a correlation of r = -.48 with GPA. Interpret the direction and strength of the correlations (r): Study time: .68 is a strong, positive correlation. Absences: -.48 is a weak, negative correlation. Interpret the variation in response due to each explanatory variable ($r^2$): Study time: $.68^2 = .4624$, so 46% of the variation in GPA can be accounted for by study time. Absences: $(-.48)^2 = .2304$, so 23% of the variation in GPA can be accounted for by number of absences.
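Example E's squaring step is easy to reproduce. The sketch below also flags one thing to watch for when typing it into Python (and many calculators): a negative correlation needs parentheses, or it must be stored in a variable first, before squaring. The variable names are made up for this sketch.

```python
correlations = {"study time": 0.68, "absences": -0.48}

r_squared = {}
for name, r in correlations.items():
    direction = "positive" if r > 0 else "negative"
    r_squared[name] = r ** 2  # r is a variable here, so the sign is squared too
    print(
        f"{name}: r = {r} ({direction}), "
        f"r^2 = {r_squared[name]:.4f}, about {r_squared[name]:.0%} of the variation in GPA"
    )

# Pitfall: written as a literal, the exponent is applied before the minus sign.
without_parentheses = -0.48 ** 2    # evaluates as -(0.48 ** 2)
with_parentheses = (-0.48) ** 2

print("without parentheses:", round(without_parentheses, 4))
print("with parentheses:   ", round(with_parentheses, 4))
```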
23
‘Running’ Out of Interest: REVIEW AND APPLY The scatterplot (page 4 in your notes) represents the average number of hours per week a newly purchased treadmill is used by a single person, month by month, for the first year of ownership. (For example, in the first month they used it 10 hours the first week, 9 the second week, 10 the third week, and 11 the fourth week, for an average of 10 hours per week in the first month.) Work through each part, referring to the entire lesson. Check your work when finished.
SOLUTIONS:
A) Slope: For every 1 month of owning the treadmill, predicted hours of use decrease by .66 (about 40 minutes). Y-intercept: At x = 0 (brand new), predicted use is 9.86 hours.
B) (3, 7.86) and (13, 1.22)
C) 9.86/.665 = 14.8, so (14.8, 0)
D) Month 10 was really 2 hours; the prediction for month 10 is 3.21 hours. 2 - 3.21 = -1.21 (uses 1.21 hours less than predicted).
E) .808, or 81% of the variation in hours using the treadmill is accounted for by the time they have owned it.
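The treadmill solutions are consistent with a regression equation of roughly ŷ = 9.86 - 0.665x, where x is months of ownership and ŷ is predicted hours of use per week. That equation is not printed in the transcript; it is an assumption inferred from the 9.86 intercept and the 9.86/.665 step, so the sketch below should be read as a check on the arithmetic rather than as the lesson's own working.

```python
# Assumed coefficients, inferred from the solutions (not stated in the lesson).
INTERCEPT = 9.86   # predicted hours per week at x = 0 months
SLOPE = -0.665     # change in predicted hours per week for each extra month


def predict_hours(month):
    """Predicted weekly hours of treadmill use after `month` months."""
    return INTERCEPT + SLOPE * month


# Two points on the line, useful for drawing it on the scatterplot.
point_3 = (3, predict_hours(3))
point_13 = (13, predict_hours(13))

# x-intercept: the month at which predicted use reaches 0 hours.
x_intercept = -INTERCEPT / SLOPE

# Residual for month 10, where the actual value was 2 hours.
residual_month_10 = 2 - predict_hours(10)

print(f"points on the line: (3, {point_3[1]:.3f}) and (13, {point_13[1]:.3f})")
print(f"x-intercept: {x_intercept:.1f} months")
print(f"month 10: predicted {predict_hours(10):.2f}, residual {residual_month_10:.2f}")
```

The two plotted points agree with the solutions only to within rounding, which suggests the original coefficients carried more decimal places than the ones assumed here.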