Published by Lydia Harmon. Modified over 9 years ago.
1
Sullivan – Fundamentals of Statistics – 2nd Edition – Chapter 4 Section 2 – Slide 1 of 20
Chapter 4 Section 2
Least-Squares Regression
2
●Learning objectives
  1. Find the least-squares regression line and use the line to make predictions
  2. Interpret the slope and the y-intercept of the least-squares regression line
  3. Compute the sum of squared residuals
4
●If we have two variables X and Y, we often would like to model the relation as a line
●Draw a line through the scatter diagram
●We want to find the line that “best” describes the linear relationship: the regression line
5
●We want to use a linear model
●Linear models can be written in several different (equivalent) ways
  - $y = mx + b$
  - $y - y_1 = m(x - x_1)$
  - $y = b_1 x + b_0$
●Because the slope and the intercept are both important to analyze, we will use $y = b_1 x + b_0$
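The equivalence of these forms can be checked numerically. The sketch below is illustrative only: the function name, the slope, and the sample point are assumptions made up for the example, not part of the slides. It expands the point-slope form to recover the slope and intercept of the same line.

```python
def point_slope_to_intercept_form(m, x1, y1):
    """Return (b1, b0) such that y = b1*x + b0 is the same line
    as the point-slope form y - y1 = m*(x - x1)."""
    b1 = m
    b0 = y1 - m * x1  # expand m*(x - x1) + y1 and collect the constant term
    return b1, b0


# Hypothetical line: slope 2 through the point (3, 11).
b1, b0 = point_slope_to_intercept_form(m=2, x1=3, y1=11)

# Both forms give the same y at any x; x = 10 is an arbitrary check.
x = 10
y_point_slope = 11 + 2 * (x - 3)
y_intercept_form = b1 * x + b0
print(b1, b0, y_point_slope, y_intercept_form)
```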
6
●One difference between mathematics and statistics is that statistics assumes that the measurements are not exact: each observation carries an error, or residual
●The formula for the residual is always
  Residual = Observed – Predicted
●This relationship is not just for this chapter; it is the general way of defining error in statistics
7
●For example, say that we want to predict a value of y for a specific value of x
  - Assume that we are using $y = 10x + 25$ as our model
  - To predict the value of y when x = 3, the model gives us $y = 10 \cdot 3 + 25 = 55$, a predicted value of 55
  - Assume the actual (observed) value of y for x = 3 is 50
  - The actual value is 50 and the predicted value is 55, so the residual (or error) is 50 – 55 = –5
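The arithmetic of this example can be reproduced in a few lines. This is a minimal sketch: the model y = 10x + 25 and the observed value 50 come from the slide, while the function names `predict` and `residual` are chosen here for illustration.

```python
def predict(x, b1=10, b0=25):
    """Predicted y from the slide's example model y = 10x + 25."""
    return b1 * x + b0


def residual(observed, predicted):
    """Residual = Observed - Predicted."""
    return observed - predicted


predicted = predict(3)         # 10 * 3 + 25
res = residual(50, predicted)  # the slide's observed value is 50
print(predicted, res)          # prints: 55 -5
```

The negative sign indicates that the observed value lies below the model line at x = 3.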
8
●What the residual is on the scatter diagram
  [Figure: scatter diagram labeling the model line, the x value of interest, the observed value y, the predicted value y, and the residual between the two]
9
●We want to minimize the residuals, but we need to define what this means
●We use the method of least squares
  - We consider a possible linear model
  - We calculate the residual for each point
  - We add up the squares of the residuals
●The line that has the smallest sum of squared residuals is called the least-squares regression line
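The three steps above can be sketched as a small function that scores any candidate line. The data set and the two candidate lines below are made up for illustration and do not come from the slides. The least-squares line is the one with the smallest score among all possible lines; here only two candidates are compared.

```python
def sum_squared_residuals(xs, ys, b1, b0):
    """Add up (observed - predicted)^2 over all points for the line y = b1*x + b0."""
    total = 0.0
    for x, y in zip(xs, ys):
        predicted = b1 * x + b0
        total += (y - predicted) ** 2
    return total


# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

ssr_a = sum_squared_residuals(xs, ys, b1=1.0, b0=1.0)  # candidate A: y = x + 1
ssr_b = sum_squared_residuals(xs, ys, b1=0.5, b0=2.5)  # candidate B: y = 0.5x + 2.5
print(ssr_a, ssr_b)  # prints: 4.0 2.5
```

Candidate B has the smaller sum of squared residuals, so by the least-squares criterion it describes these points better than candidate A.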
10
●The equation for the least-squares regression line is given by $y = b_1 x + b_0$
  - $b_1$ is the slope of the least-squares regression line
  - $b_0$ is the y-intercept of the least-squares regression line
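The slides do not show how the slope and intercept are computed. One standard closed form, added here as background rather than taken from the slides, is slope = Σ(x – x̄)(y – ȳ) / Σ(x – x̄)² and intercept = ȳ – slope · x̄. The sketch below applies it to a made-up data set; the function name is an assumption for illustration.

```python
def least_squares_line(xs, ys):
    """Return (b1, b0) for the least-squares line y = b1*x + b0."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar  # the line passes through (x_bar, y_bar)
    return b1, b0


# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b1, b0 = least_squares_line(xs, ys)
print(round(b1, 4), round(b0, 4))
```

For this data the slope is 0.6 and the intercept is 2.2, up to floating-point rounding.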
11
●Finding the values of $b_1$ and $b_0$ by hand is a very tedious process
●You should use software for this
●Finding the coefficients $b_1$ and $b_0$ is only the first step of a regression analysis
  - We need to interpret the slope $b_1$
  - We need to interpret the y-intercept $b_0$
  - We need to do quite a bit more statistical analysis; this is covered in Section 4.3 and also in Chapter 14
12
●Learning objectives
  1. Find the least-squares regression line and use the line to make predictions
  2. Interpret the slope and the y-intercept of the least-squares regression line
  3. Compute the sum of squared residuals
13
●Interpreting the slope $b_1$
  - The slope is sometimes defined as rise / run
  - The slope is also sometimes defined as (change in y) / (change in x)
●The slope relates changes in y to changes in x
14
●For example, if $b_1 = 4$
  - If x increases by 1, then the predicted value of y increases by 4
  - If x decreases by 1, then the predicted value of y decreases by 4
  - This is a positive linear relationship
●For example, if $b_1 = -7$
  - If x increases by 1, then the predicted value of y decreases by 7
  - If x decreases by 1, then the predicted value of y increases by 7
  - This is a negative linear relationship
15
●For example, say that a researcher studies the population in a town (the y or response variable) in each year (the x or predictor variable)
  - To simplify the calculations, years are measured from 1900 (so x = 55 is the year 1955)
●The model used is $y = 300x + 12{,}000$
●A slope of 300 means that the model predicts that, on average, the population increases by 300 per year
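The slope interpretation can be checked by evaluating the slide's model for two consecutive years. This is a minimal sketch: the model y = 300x + 12,000 and the 1900 offset are from the slide, while the function name and the choice of 1955 and 1956 are illustrative.

```python
def predicted_population(year):
    """Slide's model y = 300x + 12,000, with x measured in years since 1900."""
    x = year - 1900
    return 300 * x + 12_000


p1955 = predicted_population(1955)  # x = 55
p1956 = predicted_population(1956)  # x = 56
print(p1955, p1956, p1956 - p1955)  # prints: 28500 28800 300
```

The difference between any two consecutive years is always the slope, 300.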
16
●Interpreting the y-intercept $b_0$
●Sometimes $b_0$ has an interpretation, and sometimes not
  - If 0 is a reasonable value for x, then $b_0$ can be interpreted as the value of y when x is 0
  - If 0 is not a reasonable value for x, then $b_0$ does not have an interpretation
●In general, we should not use the model for values of x that are much larger or much smaller than the observed values
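One way to enforce the last point in code is to refuse predictions outside the range of the observed x values. The sketch below is a hypothetical helper, not something from the slides: the function name, the `margin` parameter, the data, and the coefficients are all assumptions for illustration. How far beyond the data is "much larger or much smaller" remains a judgment call.

```python
def predict_within_range(x, b1, b0, observed_xs, margin=0.0):
    """Predict y = b1*x + b0, but only for x inside the observed range
    (optionally widened by `margin` on each side)."""
    lo, hi = min(observed_xs), max(observed_xs)
    if x < lo - margin or x > hi + margin:
        raise ValueError(
            f"x = {x} is outside the observed range [{lo}, {hi}]; "
            "the model should not be used there"
        )
    return b1 * x + b0


# Hypothetical observed x values and fitted coefficients.
xs = [1, 2, 3, 4, 5]
inside = predict_within_range(3, b1=0.6, b0=2.2, observed_xs=xs)
print(round(inside, 4))

try:
    predict_within_range(50, b1=0.6, b0=2.2, observed_xs=xs)
except ValueError as err:
    print("refused:", err)
```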
17
●For example, say that a researcher studies the population in a town (the y or response variable) in each year (the x or predictor variable)
  - To simplify the calculations, years are measured from 1900 (so x = 55 is the year 1955)
●The model used is $y = 300x + 12{,}000$
●An intercept of 12,000 means that the model predicts that the town had a population of 12,000 in the year 1900 (i.e. when x = 0)
18
●Learning objectives
  1. Find the least-squares regression line and use the line to make predictions
  2. Interpret the slope and the y-intercept of the least-squares regression line
  3. Compute the sum of squared residuals
19
●After finding the slope $b_1$ and the intercept $b_0$, it is very useful to compute the residuals, particularly the sum of squared residuals, $\sum (\text{observed} - \text{predicted})^2$
●Again, this is a tedious computation
●Least-squares regression software computes this quantity
●We will use it in future sections
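The residuals and their sum of squares can be computed directly once a line has been fitted. The sketch below uses a made-up data set and the standard closed-form fit (slope = Σ(x – x̄)(y – ȳ) / Σ(x – x̄)², intercept = ȳ – slope · x̄), neither of which comes from the slides. It also illustrates a known property of a least-squares line with an intercept: its residuals sum to zero, up to rounding.

```python
# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# Fit the least-squares line with the standard closed form.
n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
    (x - x_bar) ** 2 for x in xs
)
b0 = y_bar - b1 * x_bar

# Residual = Observed - Predicted, one per data point.
residuals = [y - (b1 * x + b0) for x, y in zip(xs, ys)]
ssr = sum(r ** 2 for r in residuals)

print([round(r, 4) for r in residuals])
print(round(sum(residuals), 4), round(ssr, 4))
```

For this data the sum of squared residuals is 2.4, up to floating-point rounding.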
20
Summary: Chapter 4 – Section 2
●We can find the least-squares regression line that is the “best” linear model for a set of data
●The slope can be interpreted as the change in y for every change of 1 in x
●The intercept can be interpreted as the value of y when x is 0, as long as a value of 0 for x is reasonable