Published by Hannu-Pekka Karjalainen. Modified over 5 years ago.
1
Measures of Regression and Prediction Intervals
2
Variation About a Regression Line
There are three types of variation about a regression line:
- Total variation
- Explained variation
- Unexplained variation
To find these variations, first calculate the following for each ordered pair $(x_i, y_i)$ in the data set:
- The total deviation, $y_i - \bar{y}$
- The explained deviation, $\hat{y}_i - \bar{y}$
- The unexplained deviation, $y_i - \hat{y}_i$
4
Variation About a Regression Line
Total variation: the sum of the squares of the differences between the y-value of each ordered pair and the mean of y.
$$\text{Total variation} = \sum (y_i - \bar{y})^2$$
Explained variation: the sum of the squares of the differences between each predicted y-value and the mean of y.
$$\text{Explained variation} = \sum (\hat{y}_i - \bar{y})^2$$
5
Variation About a Regression Line
Unexplained variation: the sum of the squares of the differences between the y-value of each ordered pair and the corresponding predicted y-value.
$$\text{Unexplained variation} = \sum (y_i - \hat{y}_i)^2$$
The sum of the explained and unexplained variation is equal to the total variation:
Total variation = Explained variation + Unexplained variation
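The three sums of squares can be checked numerically. The following is a minimal sketch on a small hypothetical data set (not the gross domestic product and carbon dioxide data used in these slides); the variable names are illustrative assumptions. It fits the least-squares line and confirms that the explained and unexplained variation add up to the total variation.

```python
# Hypothetical toy data, chosen only for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Least-squares slope m and intercept b for the line y_hat = m*x + b.
m = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
    (x - x_bar) ** 2 for x in xs
)
b = y_bar - m * x_bar
y_hats = [m * x + b for x in xs]

# The three variations, each a sum of squared deviations.
total = sum((y - y_bar) ** 2 for y in ys)
explained = sum((y_hat - y_bar) ** 2 for y_hat in y_hats)
unexplained = sum((y - y_hat) ** 2 for y, y_hat in zip(ys, y_hats))

# The two printed values should agree up to floating-point rounding.
print(total, explained + unexplained)
```

The identity holds only when the predicted values come from the least-squares line; for an arbitrary line the two printed values generally differ.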
6
Coefficient of Determination
The ratio of the explained variation to the total variation, denoted by $r^2$:
$$r^2 = \frac{\text{Explained variation}}{\text{Total variation}}$$
7
Example: Coefficient of Determination
The correlation coefficient for the gross domestic products and carbon dioxide emissions data is r ≈ 0.874. Find the coefficient of determination. What does this tell you about the explained variation of the data about the regression line? About the unexplained variation?
Solution: $r^2 \approx (0.874)^2 \approx 0.764$
About 76.4% of the variation in the carbon dioxide emissions can be explained by the variation in the gross domestic products. About 23.6% of the variation is unexplained.
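The arithmetic in this solution can be reproduced in a few lines. This sketch uses the value 0.874 that appears in the solution; the variable names are illustrative.

```python
r = 0.874  # correlation coefficient used in the solution above

r_squared = r ** 2  # coefficient of determination
explained_pct = round(100 * r_squared, 1)          # share of variation explained
unexplained_pct = round(100 * (1 - r_squared), 1)  # share left unexplained

print(round(r_squared, 3), explained_pct, unexplained_pct)
```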
8
The Standard Error of Estimate
The standard error of estimate $s_e$ is the standard deviation of the observed $y_i$-values about the predicted $\hat{y}$-value for a given $x_i$-value. It is given by
$$s_e = \sqrt{\frac{\sum (y_i - \hat{y}_i)^2}{n - 2}}$$
where n is the number of ordered pairs in the data set. The closer the observed y-values are to the predicted y-values, the smaller the standard error of estimate will be.
9
The Standard Error of Estimate
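In Words / In Symbols
1. Make a table that includes the column headings shown: $x_i$, $y_i$, $\hat{y}_i$, $(y_i - \hat{y}_i)$, $(y_i - \hat{y}_i)^2$
2. Use the regression equation to calculate the predicted y-values: $\hat{y}_i = m x_i + b$
3. Calculate the sum of the squares of the differences between each observed y-value and the corresponding predicted y-value: $\sum (y_i - \hat{y}_i)^2$
4. Find the standard error of estimate: $s_e = \sqrt{\dfrac{\sum (y_i - \hat{y}_i)^2}{n - 2}}$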
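The procedure for the standard error of estimate can be sketched as a short function. This is an illustrative sketch, not code from the slides: the function name, its parameters, and the toy data are assumptions, and the slope m and intercept b are taken as already known.

```python
import math


def standard_error_of_estimate(xs, ys, m, b):
    """Return s_e = sqrt(sum((y_i - y_hat_i)^2) / (n - 2)) for y_hat = m*x + b."""
    n = len(xs)
    if n <= 2:
        raise ValueError("need more than two ordered pairs")
    # Sum of squared differences between observed and predicted y-values.
    sse = sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(sse / (n - 2))


# Hypothetical toy data: the first three points lie exactly on y = 2x + 1,
# and the last point is 1 unit above the line.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 10.0]
print(standard_error_of_estimate(xs, ys, m=2.0, b=1.0))
```

Here the only nonzero residual is 1, so the sum of squares is 1 and the result is the square root of 1/2.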
10
Example: Standard Error of Estimate
The regression equation for the gross domestic products and carbon dioxide emissions data is ŷ = […]x + […]. Find the standard error of estimate.
Solution: Use a table to calculate the sum of the squared differences between each observed y-value and the corresponding predicted y-value.
12
Solution: Standard Error of Estimate
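With $n = 10$ and $\sum (y_i - \hat{y}_i)^2 = 161{,}[\ldots]$, the standard error of estimate is
$$s_e = \sqrt{\frac{161{,}[\ldots]}{10 - 2}} \approx [\ldots]$$
The standard error of estimate of the carbon dioxide emissions for a specific gross domestic product is about […] million metric tons.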
13
Prediction Intervals
Two variables have a bivariate normal distribution if, for any fixed value of x, the corresponding values of y are normally distributed and, for any fixed value of y, the corresponding values of x are normally distributed.
14
Prediction Intervals
A prediction interval can be constructed for the true value of y. Given a linear regression equation $\hat{y} = mx + b$ and $x_0$, a specific value of x, a c-prediction interval for y is
$$\hat{y} - E < y < \hat{y} + E$$
where
$$E = t_c s_e \sqrt{1 + \frac{1}{n} + \frac{n(x_0 - \bar{x})^2}{n\sum x^2 - (\sum x)^2}}$$
The point estimate is $\hat{y}$ and the margin of error is E. The probability that the prediction interval contains y is c, assuming that the estimation process is repeated a large number of times.
15
Constructing a Prediction Interval for y for a Specific Value of x
In Words / In Symbols
1. Identify the number of ordered pairs in the data set, n, and the degrees of freedom: d.f. = n − 2
2. Use the regression equation and the given x-value to find the point estimate: $\hat{y} = m x_0 + b$
3. Find the critical value $t_c$ that corresponds to the given level of confidence c: use Table 5 in Appendix B.
16
Constructing a Prediction Interval for y for a Specific Value of x
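In Words / In Symbols
4. Find the standard error of estimate: $s_e = \sqrt{\dfrac{\sum (y_i - \hat{y}_i)^2}{n - 2}}$
5. Find the margin of error: $E = t_c s_e \sqrt{1 + \dfrac{1}{n} + \dfrac{n(x_0 - \bar{x})^2}{n\sum x^2 - (\sum x)^2}}$
6. Find the left and right endpoints and form the prediction interval.
   Left endpoint: $\hat{y} - E$
   Right endpoint: $\hat{y} + E$
   Interval: $\hat{y} - E < y < \hat{y} + E$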
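The whole construction can be sketched as one function. This is a minimal sketch under stated assumptions: the function name and the toy data are hypothetical, the line is fitted by least squares inside the function, and the critical value t_c is passed in by the caller after looking it up in a t-table for d.f. = n − 2 rather than being computed.

```python
import math


def prediction_interval(xs, ys, x0, t_c):
    """Return (y_hat, E) so that the interval is y_hat - E < y < y_hat + E.

    t_c is the t critical value for d.f. = n - 2, supplied by the caller.
    """
    n = len(xs)
    if n <= 2:
        raise ValueError("need more than two ordered pairs")
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    denom = n * sum_x2 - sum_x ** 2  # n*sum(x^2) - (sum(x))^2

    # Least-squares line y_hat = m*x + b and the point estimate at x0.
    m = (n * sum_xy - sum_x * sum_y) / denom
    b = sum_y / n - m * sum_x / n
    y_hat = m * x0 + b

    # Standard error of estimate.
    sse = sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))
    s_e = math.sqrt(sse / (n - 2))

    # Margin of error.
    x_bar = sum_x / n
    E = t_c * s_e * math.sqrt(1 + 1 / n + n * (x0 - x_bar) ** 2 / denom)
    return y_hat, E


# Hypothetical toy data with n = 10, so d.f. = 8 and t_c = 1.860 for c = 0.90.
xs = [float(i) for i in range(1, 11)]
ys = [2.3, 4.1, 6.2, 7.8, 10.1, 12.2, 13.9, 16.3, 17.8, 20.1]
y_hat, E = prediction_interval(xs, ys, x0=5.5, t_c=1.860)
print(f"{y_hat - E:.3f} < y < {y_hat + E:.3f}")
```

Because the margin of error contains the term $(x_0 - \bar{x})^2$, the interval is narrowest at the mean of x and widens as $x_0$ moves away from it.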
17
Example: Constructing a Prediction Interval
Construct a 90% prediction interval for the carbon dioxide emissions when the gross domestic product is $2.8 trillion. What can you conclude?
Solution: Because n = 10, there are d.f. = 10 − 2 = 8 degrees of freedom.
18
Solution: Constructing a Prediction Interval
Using the regression equation ŷ = […]x + […] and x = 2.8, the point estimate is ŷ = […](2.8) + […] = […].
19
Solution: Constructing a Prediction Interval
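From Table 5, the critical value is $t_c = 1.860$, and from Example 2, $s_e \approx$ […]. Recall that $\sum x = 22.9$ and $\sum x^2 = 65.49$. Also, $\bar{x} = 2.29$. Using these values, the margin of error is
$$E = t_c s_e \sqrt{1 + \frac{1}{n} + \frac{n(x_0 - \bar{x})^2}{n\sum x^2 - (\sum x)^2}} \approx (1.860)([\ldots])\sqrt{1 + \frac{1}{10} + \frac{10(2.8 - 2.29)^2}{10(65.49) - (22.9)^2}} \approx [\ldots]$$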
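The part of the margin of error that does not depend on the standard error of estimate can be evaluated from the numbers that survive in this solution (the 1.860, 2.8, 2.29, 65.49, and 22.9 in the substituted expression, with n = 10). The sketch below computes the square-root factor and the resulting multiplier of $s_e$; the variable names are illustrative, and $s_e$ is deliberately left out because its value is not available here.

```python
import math

n = 10
sum_x = 22.9       # sum of the x-values
sum_x2 = 65.49     # sum of the squared x-values
x_bar = sum_x / n  # mean of x, 2.29
x0 = 2.8           # gross domestic product of interest, in trillions of dollars
t_c = 1.860        # critical value used in the substituted expression

# Square-root factor of the margin-of-error formula.
root = math.sqrt(1 + 1 / n + n * (x0 - x_bar) ** 2 / (n * sum_x2 - sum_x ** 2))

# E equals this multiplier times the standard error of estimate s_e.
multiplier = t_c * root

print(round(root, 3), round(multiplier, 3))
```

Under these values the margin of error comes out to roughly 1.97 times the standard error of estimate.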
20
Solution: Constructing a Prediction Interval
Using ŷ = […] and E ≈ […], the prediction interval is constructed as shown.
Left endpoint: ŷ − E ≈ […] − […] = […]
Right endpoint: ŷ + E ≈ […] + […] = […]
Interval: […] < y < […]
You can be 90% confident that when the gross domestic product is $2.8 trillion, the carbon dioxide emissions will be between […] and […] million metric tons.