Chapter 3 Describing Relationships Section 3.2
Least-Squares Regression
Least-Squares Regression
MAKE predictions using regression lines, keeping in mind the dangers of extrapolation.
CALCULATE and interpret a residual.
INTERPRET the slope and y intercept of a regression line.
DETERMINE the equation of a least-squares regression line using technology or computer output.
CONSTRUCT and INTERPRET residual plots to assess whether a regression model is appropriate.
INTERPRET the standard deviation of the residuals and \(r^2\) and use these values to assess how well a least-squares regression line models the relationship between two variables.
DESCRIBE how the least-squares regression line, standard deviation of the residuals, and \(r^2\) are influenced by outliers.
FIND the slope and y intercept of the least-squares regression line from the means and standard deviations of x and y and their correlation.
Regression Lines
Linear (straight-line) relationships between two quantitative variables are common. A regression line summarizes the relationship between two variables, but only in a specific setting: when one variable helps explain the other.
A regression line is a line that describes how a response variable y changes as an explanatory variable x changes. Regression lines are expressed in the form \(\hat{y} = b_0 + b_1 x\), where \(\hat{y}\) (pronounced "y-hat") is the predicted value of y for a given value of x.
Prediction
A random sample of 16 used Ford F-150 SuperCrew 4 × 4s was selected from among those listed for sale at autotrader.com. The data are shown in the table. For these data, the regression equation is \(\widehat{\text{price}} = 38257 - 0.1629(\text{miles driven})\). Predict the price of a Ford F-150 that has been driven 100,000 miles.
\(\widehat{\text{price}} = 38257 - 0.1629(\text{miles driven})\)
\(\widehat{\text{price}} = 38257 - 0.1629(100000)\)
\(\widehat{\text{price}} = \$21{,}967\)
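The prediction above can be checked with a few lines of Python. This is a minimal sketch: the function name `predict_price` is an illustrative choice, and the intercept 38257 and slope 0.1629 are taken from the regression equation on the slide.

```python
def predict_price(miles_driven: float) -> float:
    """Predicted price in dollars from the slide's regression line."""
    intercept = 38257.0  # y intercept from the slide
    slope = -0.1629      # slope from the slide, in dollars per mile
    return intercept + slope * miles_driven


# Predicted price for a truck driven 100,000 miles (about $21,967).
print(round(predict_price(100_000)))
```

Rounding to the nearest dollar matches the way the slide reports the answer.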
Extrapolation
Can we predict the price of a Ford F-150 with 300,000 miles driven?
\(\widehat{\text{price}} = 38257 - 0.1629(\text{miles driven})\)
\(\widehat{\text{price}} = 38257 - 0.1629(300000)\)
\(\widehat{\text{price}} = -\$10{,}613\)
Extrapolation is the use of a regression line for prediction far outside the interval of x values used to obtain the line. Such predictions are often not accurate.
CAUTION: Don't make predictions using values of x that are much larger or much smaller than those that actually appear in your data.
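One way to guard against extrapolation is to compare x with the range of the observed data before trusting a prediction. The sketch below is illustrative only: the data table is not reproduced here, so `observed_min_miles` and `observed_max_miles` are hypothetical placeholder values, and `is_extrapolation` is a made-up helper that flags any x outside that range (a stricter rule than "far outside").

```python
def predict_price(miles_driven: float) -> float:
    """Predicted price in dollars from the slide's regression line."""
    return 38257.0 - 0.1629 * miles_driven


def is_extrapolation(x: float, x_min: float, x_max: float) -> bool:
    """Return True when x lies outside the interval of observed x values."""
    return x < x_min or x > x_max


# HYPOTHETICAL range of miles driven; replace with the min and max of the real data.
observed_min_miles = 5_000.0
observed_max_miles = 150_000.0

miles = 300_000
prediction = predict_price(miles)
if is_extrapolation(miles, observed_min_miles, observed_max_miles):
    print(f"Warning: {miles} miles is outside the observed data; do not trust this prediction.")
print(round(prediction))  # a negative price, which is impossible for a real truck
```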
Residuals
In most cases, no line will pass exactly through all the points in a scatterplot. Because we use the line to predict y from x, the prediction errors we make are errors in y, the vertical direction in the scatterplot. These vertical distances are called residuals (the "leftover" variation in the response variable).
A residual is the difference between the actual value of y and the value of y predicted by the regression line.
\(\text{residual} = \text{actual } y - \text{predicted } y = y - \hat{y}\)
Residuals
A random sample of 16 used Ford F-150 SuperCrew 4 × 4s was selected from among those listed for sale at autotrader.com. The data are shown in the table. For these data, the regression equation is \(\widehat{\text{price}} = 38257 - 0.1629(\text{miles driven})\). Calculate and interpret the residual for the truck that was driven 70,583 miles.
Find the predicted price.
\(\widehat{\text{price}} = 38257 - 0.1629(\text{miles driven})\)
\(\widehat{\text{price}} = 38257 - 0.1629(70583)\)
\(\widehat{\text{price}} = \$26{,}759\)
Find the residual.
\(\text{residual} = \text{price} - \widehat{\text{price}}\)
\(\text{residual} = 21994 - 26759\)
\(\text{residual} = -\$4765\)
Interpret the residual.
The actual price of this truck is $4765 less than the price predicted by the regression line with x = miles driven.
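The residual calculation can be sketched in Python as follows. The actual price of 21994 used here is an assumption inferred from the slide's own results (a predicted price of $26,759 and a residual of -$4765); it is not read from the data table. The function name `predict_price` is illustrative.

```python
def predict_price(miles_driven: float) -> float:
    """Predicted price in dollars from the slide's regression line."""
    return 38257.0 - 0.1629 * miles_driven


# ASSUMPTION: actual price inferred from the slide (26759 + (-4765) = 21994).
actual_price = 21_994.0
miles = 70_583

predicted_price = predict_price(miles)
residual = actual_price - predicted_price  # residual = actual y - predicted y

print(round(predicted_price))  # predicted price, to the nearest dollar
print(round(residual))         # negative: the truck costs less than the line predicts
```

A negative residual means the point lies below the regression line; a positive residual means it lies above.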
Interpreting a Regression Line
A regression line is a model for the data, much like the density curves of Chapter 2. The y intercept and slope of the regression line describe what this model tells us about the relationship between the response variable y and the explanatory variable x.
In the regression equation \(\hat{y} = b_0 + b_1 x\):
\(b_0\) is the y intercept, the predicted value of y when x = 0.
\(b_1\) is the slope, the amount by which the predicted value of y changes when x increases by 1 unit.
Interpreting a Regression Line
Recall that for a random sample of 16 used Ford F-150 SuperCrew 4 × 4s, the regression equation is \(\widehat{\text{price}} = 38257 - 0.1629(\text{miles driven})\). Interpret the slope of the regression line. Does the value of the y intercept have meaning in this context? If so, interpret the y intercept. If not, explain why.
Interpret the slope. The predicted price of a used Ford F-150 goes down by $0.1629 (16.29 cents) for each additional mile that the truck has been driven.
Interpret the y intercept. The predicted price of a used Ford F-150 that has been driven 0 miles is $38,257. (The y intercept does have meaning in this case, as it is possible to have a number of miles driven near 0 miles.)
CAUTION: When asked to interpret the slope or y intercept, it is very important to include the word predicted (or equivalent) in your response. Otherwise, it might appear that you believe the regression equation provides actual values of y.
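The two interpretations can be confirmed numerically: the prediction at 0 miles equals the y intercept, and adding one mile changes the prediction by the slope. This is a small illustrative sketch; the names `INTERCEPT`, `SLOPE`, and `predict_price` are chosen here, and the mileage of 50,000 is an arbitrary example value.

```python
INTERCEPT = 38257.0  # y intercept from the slide, in dollars
SLOPE = -0.1629      # slope from the slide, in dollars per mile


def predict_price(miles_driven: float) -> float:
    """Predicted price in dollars from the slide's regression line."""
    return INTERCEPT + SLOPE * miles_driven


# The predicted price at 0 miles is the y intercept.
price_at_zero = predict_price(0)

# One additional mile changes the predicted price by the slope.
change_per_mile = predict_price(50_001) - predict_price(50_000)

print(price_at_zero)
print(round(change_per_mile, 4))
```

The change per mile is the same at every mileage because the model is a straight line.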
The Least-Squares Regression Line
There are many different lines we could use to model the association in a particular scatterplot. A good regression line makes the residuals as small as possible. The regression line we prefer is the one that minimizes the sum of the squared residuals.
The least-squares regression line is the line that makes the sum of the squared residuals as small as possible.
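To see "as small as possible" in action, the sketch below finds the least-squares line from the means, standard deviations, and correlation, using the standard formulas \(b_1 = r \cdot s_y / s_x\) and \(b_0 = \bar{y} - b_1 \bar{x}\), then compares its sum of squared residuals with that of a slightly different line. The five data points are made up for illustration and are not the truck data; the function names are also illustrative.

```python
from statistics import mean, stdev


def least_squares_line(xs, ys):
    """Return (b0, b1) using b1 = r * s_y / s_x and b0 = y_bar - b1 * x_bar."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    # Correlation r: average product of standardized values (dividing by n - 1).
    r = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys)) / (n - 1)
    b1 = r * s_y / s_x
    b0 = y_bar - b1 * x_bar
    return b0, b1


def sum_squared_residuals(xs, ys, b0, b1):
    """Sum of (actual y - predicted y)^2 for the line y-hat = b0 + b1 * x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


# HYPOTHETICAL data, invented for this example only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

b0, b1 = least_squares_line(xs, ys)
best = sum_squared_residuals(xs, ys, b0, b1)
nudged = sum_squared_residuals(xs, ys, b0, b1 + 0.1)  # same intercept, steeper slope

print(f"y-hat = {b0:.1f} + {b1:.1f}x")
print(f"least-squares line: {best:.2f}, nudged line: {nudged:.2f}")
```

Any other choice of slope or intercept gives a larger sum of squared residuals than the least-squares line does.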
Using Your Calculator (pp. 184 and 187)
We are going to practice using your calculator to make a scatterplot and a residual plot.
Determining if a Linear Model Is Appropriate: Residual Plots
One of the first principles of data analysis is to look for an overall pattern and for striking departures from the pattern. A regression line describes the overall pattern of a linear relationship between an explanatory variable and a response variable. We see departures from this pattern by looking at a residual plot.
A residual plot is a scatterplot that displays the residuals on the vertical axis and the explanatory variable on the horizontal axis.
Determining if a Linear Model Is Appropriate: Residual Plots
A residual plot magnifies the deviations of the points from the line, making it easier to see unusual observations and patterns.
If a regression model is appropriate:
The residual plot should show no obvious patterns.
The residuals should be relatively small in size.
A pattern in the residuals means that a linear model is not appropriate.
Determining if a Linear Model Is Appropriate: Residual Plots
How to Interpret a Residual Plot
To determine whether the regression model is appropriate, look at the residual plot.
If there is no leftover curved pattern in the residual plot, the regression model is appropriate.
If there is a leftover curved pattern in the residual plot, consider using a regression model with a different form.
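A leftover curved pattern is easy to see with a small example. The sketch below fits a least-squares line to made-up data that follow a curve (y = x squared) and prints the residuals that a residual plot would display. The data and the helper name `fit_line` are illustrative assumptions, not part of the truck example.

```python
from statistics import mean


def fit_line(xs, ys):
    """Return (b0, b1) for the least-squares line y-hat = b0 + b1 * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# HYPOTHETICAL curved data: y = x^2, so a straight line is the wrong form.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [x ** 2 for x in xs]

b0, b1 = fit_line(xs, ys)
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]

# These (x, residual) pairs are the points of the residual plot.
for x, e in zip(xs, residuals):
    print(f"x = {x:.0f}  residual = {e:+.1f}")
```

The residuals are positive at both ends and negative in the middle. That U shape is the kind of leftover curved pattern that signals a linear model is not appropriate.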
Residual Plots
Recall that for a random sample of 16 used Ford F-150 SuperCrew 4 × 4s, the least-squares regression equation is \(\widehat{\text{price}} = 38257 - 0.1629(\text{miles driven})\). For this model, technology produced the following residual plot. Is a linear model appropriate for these data? Explain.
Because there is no obvious pattern left over in the residual plot, the linear model is appropriate.
How Well the Line Fits the Data: The Role of s and \(r^2\) in Regression
Start here tomorrow.
Assignment 3.2 p #38-54 even (38, 40, 42, 44, 46, 48, 50, 52, 54). If you are stuck on any of these, look at the odd problem before or after it and the answer in the back of your book. If you are still not sure, text a friend or me for help (before 8pm). Wednesday we will check homework and continue the 3.2 notes. Tomorrow I will be out in meetings all day. You will work together to complete Barbie Bungee part 2 (front and back).