Sullivan – Fundamentals of Statistics – 2nd Edition – Chapter 4 Section 2: Least-Squares Regression

Slide 2 of 20
● Learning objectives
  1. Find the least-squares regression line and use the line to make predictions
  2. Interpret the slope and the y-intercept of the least-squares regression line
  3. Compute the sum of squared residuals

Slide 4 of 20
● If we have two variables X and Y, we often would like to model the relation as a line
● Draw a line through the scatter diagram
● We want to find the line that “best” describes the linear relationship: the regression line

Slide 5 of 20
● We want to use a linear model
● Linear models can be written in several different (equivalent) ways
  ○ y = mx + b
  ○ y − y₁ = m(x − x₁)
  ○ y = b₁x + b₀
● Because the slope and the intercept are both important to analyze, we will use y = b₁x + b₀
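A short worked derivation may help show why these forms are equivalent. It expands the point-slope form and matches terms; the only assumption is that (x₁, y₁) is some fixed point on the line.

```latex
\begin{align*}
y - y_1 &= m\,(x - x_1) \\
y &= m x + (y_1 - m x_1) \\
\text{so}\quad b_1 &= m, \qquad b_0 = b = y_1 - m x_1 .
\end{align*}
```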

Slide 6 of 20
● One difference between mathematics and statistics is that statistics assumes that the measurements are not exact: there is an error, or residual
● The formula for the residual is always
  Residual = Observed − Predicted
● This relationship is not just for this chapter; it is the general way of defining error in statistics

Slide 7 of 20
● For example, say that we want to predict a value of y for a specific value of x
  ○ Assume that we are using y = 10x + 25 as our model
  ○ To predict the value of y when x = 3, the model gives us y = 10 × 3 + 25 = 55, or a predicted value of 55
  ○ Assume the actual value of y for x = 3 is equal to 50
  ○ The actual value is 50, the predicted value is 55, so the residual (or error) is 50 − 55 = −5
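The arithmetic in this example can be checked with a few lines of Python. This is a minimal sketch: the function names `predict` and `residual` are illustrative choices, not something taken from the slides.

```python
def predict(x, slope=10, intercept=25):
    """Predicted y from the example model y = 10x + 25."""
    return slope * x + intercept


def residual(observed, predicted):
    """Residual = Observed - Predicted."""
    return observed - predicted


predicted = predict(3)            # 10 * 3 + 25
error = residual(50, predicted)   # the observed value 50 comes from the example
print(predicted, error)
```

A negative residual means the model over-predicted: the observed value lies below the line.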

Slide 8 of 20
● What the residual is on the scatter diagram
[Figure: scatter diagram with labels for the model line, the x value of interest, the observed value y, the predicted value y, and the residual]

Slide 9 of 20
● We want to minimize the residuals, but we need to define what this means
● We use the method of least squares
  ○ We consider a possible linear model
  ○ We calculate the residual for each point
  ○ We add up the squares of the residuals
● The line that has the smallest sum of squared residuals is called the least-squares regression line
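The three steps of the method can be sketched in Python as follows. The data points and the two candidate lines are invented purely for illustration, and the function name `sum_squared_residuals` is an assumption rather than anything from the slides.

```python
def sum_squared_residuals(xs, ys, slope, intercept):
    """Add up (observed - predicted)^2 over all data points."""
    total = 0.0
    for x, y in zip(xs, ys):
        predicted = slope * x + intercept   # step 1: a possible linear model
        resid = y - predicted               # step 2: residual for this point
        total += resid ** 2                 # step 3: accumulate the squares
    return total


# Hypothetical data, made up for this sketch.
xs = [1, 2, 3, 4]
ys = [3, 5, 6, 9]

ssr_a = sum_squared_residuals(xs, ys, slope=2, intercept=1)  # candidate y = 2x + 1
ssr_b = sum_squared_residuals(xs, ys, slope=1, intercept=3)  # candidate y = 1x + 3
better = "y = 2x + 1" if ssr_a < ssr_b else "y = x + 3"
print(ssr_a, ssr_b, better)
```

Comparing two candidates only says which of them fits better. The least-squares regression line is the one with the smallest sum over all possible lines, so neither candidate here is guaranteed to be it.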

Slide 10 of 20
● The equation for the least-squares regression line is given by y = b₁x + b₀
  ○ b₁ is the slope of the least-squares regression line
  ○ b₀ is the y-intercept of the least-squares regression line

Slide 11 of 20
● Finding the values of b₁ and b₀ by hand is a very tedious process
● You should use software for this
● Finding the coefficients b₁ and b₀ is only the first step of a regression analysis
  ○ We need to interpret the slope b₁
  ○ We need to interpret the y-intercept b₀
  ○ We need to do quite a bit more statistical analysis; this is covered in Section 4.3 and also in Chapter 14
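As one illustration of letting software do the work, the sketch below computes the coefficients with the standard closed-form least-squares formulas (slope = Sxy / Sxx, intercept = ȳ − slope · x̄). The slides do not say which software or formula to use, so the function name `least_squares_line` and the small data set are assumptions made for this example.

```python
def least_squares_line(xs, ys):
    """Return (b1, b0) for the least-squares line y = b1*x + b0."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    b1 = s_xy / s_xx              # slope
    b0 = y_mean - b1 * x_mean     # y-intercept
    return b1, b0


# Hypothetical data, made up for this sketch.
xs = [1, 2, 3, 4]
ys = [3, 5, 6, 9]

b1, b0 = least_squares_line(xs, ys)
print(f"y = {b1:.2f} x + {b0:.2f}")
```

The function assumes at least two distinct x values; with identical x values `s_xx` is zero and no unique line exists.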

Slide 13 of 20
● Interpreting the slope b₁
  ○ The slope is sometimes defined as rise / run
  ○ The slope is also sometimes defined as (change in y) / (change in x)
● The slope relates changes in y to changes in x

Slide 14 of 20
● For example, if b₁ = 4
  ○ If x increases by 1, then y will increase by 4
  ○ If x decreases by 1, then y will decrease by 4
  ○ A positive linear relationship
● For example, if b₁ = −7
  ○ If x increases by 1, then y will decrease by 7
  ○ If x decreases by 1, then y will increase by 7
  ○ A negative linear relationship

Slide 15 of 20
● For example, say that a researcher studies the population in a town (the y or response variable) in each year (the x or predictor variable)
  ○ To simplify the calculations, years are measured from 1900 (e.g., x = 55 is the year 1955)
● The model used is y = 300x + 12,000
● A slope of 300 means that the model predicts that, on average, the population increases by 300 per year
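The slope interpretation can be checked numerically: predictions for two consecutive years should differ by exactly the slope. A minimal sketch, with `predicted_population` as an illustrative name:

```python
def predicted_population(x):
    """Example model y = 300x + 12,000, where x is years since 1900."""
    return 300 * x + 12_000


pop_1955 = predicted_population(55)   # x = 55 is the year 1955
pop_1956 = predicted_population(56)   # one year later
yearly_change = pop_1956 - pop_1955
print(pop_1955, pop_1956, yearly_change)
```

The difference is the same for any pair of consecutive years, which is what a constant slope means.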

Slide 16 of 20
● Interpreting the y-intercept b₀
● Sometimes b₀ has an interpretation, and sometimes not
  ○ If 0 is a reasonable value for x, then b₀ can be interpreted as the value of y when x is 0
  ○ If 0 is not a reasonable value for x, then b₀ does not have an interpretation
● In general, we should not use the model for values of x that are much larger or much smaller than the observed values

Slide 17 of 20
● For example, say that a researcher studies the population in a town (the y or response variable) in each year (the x or predictor variable)
  ○ To simplify the calculations, years are measured from 1900 (e.g., x = 55 is the year 1955)
● The model used is y = 300x + 12,000
● An intercept of 12,000 means that the model predicts that the town had a population of 12,000 in the year 1900 (i.e., when x = 0)
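The link between the intercept and the year 1900 comes from how x is coded. The sketch below makes that coding explicit by converting a calendar year to x before applying the model; the helper names are hypothetical.

```python
BASE_YEAR = 1900  # years are measured from 1900 in the example


def years_since_base(calendar_year):
    """Convert a calendar year to the model's x value."""
    return calendar_year - BASE_YEAR


def predicted_population_in(calendar_year):
    """Apply the example model y = 300x + 12,000 to a calendar year."""
    x = years_since_base(calendar_year)
    return 300 * x + 12_000


x_at_1900 = years_since_base(1900)
pop_at_1900 = predicted_population_in(1900)
print(x_at_1900, pop_at_1900)
```

If the years had been coded differently (for example, as full calendar years), x = 0 would not be a reasonable value and the intercept would lose this interpretation.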

Slide 19 of 20
● After finding the slope b₁ and the intercept b₀, it is very useful to compute the residuals, particularly the sum of squared residuals
● Again, this is a tedious computation
● Least-squares regression software computes this quantity
● We will use it in future sections
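The quantity referred to here is presumably the sum of squared residuals named in the learning objectives. Written out with the section's definition Residual = Observed − Predicted, it takes the form below; the symbols n (number of data points), (xᵢ, yᵢ) for the observed pairs, and ŷᵢ for the predicted value are notation assumed for this sketch and do not appear in the transcript.

```latex
\[
\sum_{i=1}^{n} \text{residual}_i^{2}
  = \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^{2}
  = \sum_{i=1}^{n} \bigl( y_i - (b_1 x_i + b_0) \bigr)^{2}
\]
```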

Slide 20 of 20
Summary: Chapter 4 – Section 2
● We can find the least-squares regression line that is the “best” linear model for a set of data
● The slope can be interpreted as the change in y for every change of 1 in x
● The intercept can be interpreted as the value of y when x is 0, as long as a value of 0 for x is reasonable
