Copyright © 2015, 2012, and 2009 Pearson Education, Inc.
Chapter 9: Correlation and Regression

Chapter Outline
9.1 Correlation
9.2 Linear Regression
9.3 Measures of Regression and Prediction Intervals
9.4 Multiple Regression

Section 9.3 Measures of Regression and Prediction Intervals

Section 9.3 Objectives
• How to interpret the three types of variation about a regression line
• How to find and interpret the coefficient of determination
• How to find and interpret the standard error of the estimate for a regression line
• How to construct and interpret a prediction interval for y

Variation About a Regression Line
Three types of variation about a regression line:
• Total variation
• Explained variation
• Unexplained variation
To find the total variation, you must first calculate:
• The total deviation
• The explained deviation
• The unexplained deviation

Variation About a Regression Line
[Figure: x-y plot of the regression line marking the points (x_i, y_i) and (x_i, ŷ_i) and the total, explained, and unexplained deviations.]
Total deviation = $y_i - \bar{y}$
Explained deviation = $\hat{y}_i - \bar{y}$
Unexplained deviation = $y_i - \hat{y}_i$
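
For any single ordered pair, the total deviation splits exactly into the explained and unexplained deviations. The identity below is a short worked check, writing $\bar{y}$ for the mean of the y-values and $\hat{y}_i$ for the predicted value at $x_i$; the predicted value cancels when the two parts are added.

```latex
\underbrace{y_i - \bar{y}}_{\text{total deviation}}
  = \underbrace{(\hat{y}_i - \bar{y})}_{\text{explained deviation}}
  + \underbrace{(y_i - \hat{y}_i)}_{\text{unexplained deviation}}
```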

Variation About a Regression Line
Total variation: The sum of the squares of the differences between the y-value of each ordered pair and the mean of y.
Total variation = $\sum (y_i - \bar{y})^2$
Explained variation: The sum of the squares of the differences between each predicted y-value and the mean of y.
Explained variation = $\sum (\hat{y}_i - \bar{y})^2$

Variation About a Regression Line
Unexplained variation: The sum of the squares of the differences between the y-value of each ordered pair and each corresponding predicted y-value.
Unexplained variation = $\sum (y_i - \hat{y}_i)^2$
The sum of the explained and unexplained variation is equal to the total variation.
Total variation = Explained variation + Unexplained variation
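
The relationship between the three variations can be checked numerically. The sketch below is a minimal illustration, assuming the regression line is the least-squares line (the identity relies on that). The data set and the helper names `fit_line` and `variations` are hypothetical and are not the gross domestic product data from the slides.

```python
from statistics import mean


def fit_line(xs, ys):
    """Return the least-squares slope m and intercept b for y-hat = m*x + b."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    m = sxy / sxx
    b = y_bar - m * x_bar
    return m, b


def variations(xs, ys):
    """Return (total, explained, unexplained) variation about the fitted line."""
    m, b = fit_line(xs, ys)
    y_bar = mean(ys)
    y_hat = [m * x + b for x in xs]
    total = sum((y - y_bar) ** 2 for y in ys)
    explained = sum((yh - y_bar) ** 2 for yh in y_hat)
    unexplained = sum((y - yh) ** 2 for y, yh in zip(ys, y_hat))
    return total, explained, unexplained


# Hypothetical data, chosen only so the arithmetic is easy to follow.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

total, explained, unexplained = variations(xs, ys)
print(f"total={total:.3f} explained={explained:.3f} unexplained={unexplained:.3f}")
print("identity holds:", abs(total - (explained + unexplained)) < 1e-9)
```

For this data the fitted line is ŷ = 0.6x + 2.2, so the explained and unexplained parts come to 3.6 and 2.4 and add up to the total of 6.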

Coefficient of Determination
Coefficient of determination: The ratio of the explained variation to the total variation. Denoted by $r^2$.
$r^2 = \dfrac{\text{Explained variation}}{\text{Total variation}}$
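
As a rough illustration of this definition, the sketch below computes the coefficient of determination two ways: as the ratio of explained to total variation, and as the square of the correlation coefficient r. The data and the function names are hypothetical, and the fitted line is assumed to be the least-squares line.

```python
from math import sqrt
from statistics import mean


def correlation(xs, ys):
    """Pearson correlation coefficient r of the paired data."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def r_squared_by_ratio(xs, ys):
    """Explained variation divided by total variation for the least-squares line."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    m = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    b = y_bar - m * x_bar
    explained = sum((m * x + b - y_bar) ** 2 for x in xs)
    total = sum((y - y_bar) ** 2 for y in ys)
    return explained / total


# Hypothetical data, not taken from the slides.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

r = correlation(xs, ys)
r_squared_from_r = r ** 2
r_squared_ratio = r_squared_by_ratio(xs, ys)
print(f"r = {r:.4f}, r^2 = {r_squared_from_r:.4f}, ratio = {r_squared_ratio:.4f}")
```

Here both routes give 0.6, which would be read as: about 60% of the variation in y is explained by the variation in x, and about 40% is unexplained.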

Example: Coefficient of Determination
The correlation coefficient for the gross domestic products and carbon dioxide emissions data is r ≈ […]. Find the coefficient of determination. What does this tell you about the explained variation of the data about the regression line? About the unexplained variation?
Solution: Squaring r gives $r^2 \approx 0.78$. About 78% of the variation in the carbon dioxide emissions can be explained by the variation in the gross domestic products. About 22% of the variation is unexplained.

The Standard Error of Estimate
Standard error of estimate: The standard deviation of the observed y_i-values about the predicted ŷ-value for a given x_i-value. Denoted by s_e.
$s_e = \sqrt{\dfrac{\sum (y_i - \hat{y}_i)^2}{n - 2}}$
where n is the number of ordered pairs in the data set.
The closer the observed y-values are to the predicted y-values, the smaller the standard error of estimate will be.

The Standard Error of Estimate
1. Make a table that includes the column headings shown.
   In symbols: $x_i$, $y_i$, $\hat{y}_i$, $(y_i - \hat{y}_i)$, $(y_i - \hat{y}_i)^2$
2. Use the regression equation to calculate the predicted y-values.
   In symbols: $\hat{y}_i = m x_i + b$
3. Calculate the sum of the squares of the differences between each observed y-value and the corresponding predicted y-value.
   In symbols: $\sum (y_i - \hat{y}_i)^2$
4. Find the standard error of estimate.
   In symbols: $s_e = \sqrt{\dfrac{\sum (y_i - \hat{y}_i)^2}{n - 2}}$
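
The four steps can be sketched in code. This is a minimal illustration rather than the textbook's worked example: the data are hypothetical, the function name `standard_error_of_estimate` is made up for this sketch, and the slope 0.6 and intercept 2.2 are assumed to be the regression equation already found for that data.

```python
from math import sqrt


def standard_error_of_estimate(xs, ys, m, b):
    """Follow the four steps for the regression equation y-hat = m*x + b."""
    # Steps 1 and 2: build the table, one row per ordered pair, with the
    # columns x, y, y-hat, (y - y-hat), and (y - y-hat)^2.
    rows = []
    for x, y in zip(xs, ys):
        y_hat = m * x + b
        rows.append((x, y, y_hat, y - y_hat, (y - y_hat) ** 2))

    # Step 3: sum the squared differences (the unexplained variation).
    unexplained = sum(row[4] for row in rows)

    # Step 4: divide by n - 2 and take the square root.
    n = len(rows)
    s_e = sqrt(unexplained / (n - 2))
    return rows, unexplained, s_e


# Hypothetical data and its (assumed known) regression equation.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
rows, unexplained, s_e = standard_error_of_estimate(xs, ys, m=0.6, b=2.2)

for row in rows:
    print("  ".join(f"{value:7.3f}" for value in row))
print(f"sum of squares = {unexplained:.3f}, s_e = {s_e:.4f}")
```

Dividing by n - 2 rather than n reflects the two quantities (slope and intercept) estimated from the data.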

Example: Standard Error of Estimate
The regression equation for the gross domestic products and carbon dioxide emissions data, as calculated in Section 9.2, is ŷ = […]. Find the standard error of estimate.
Solution: Use a table to calculate the sum of the squared differences between each observed y-value and the corresponding predicted y-value.

Solution: Standard Error of Estimate
[Table: columns $x_i$, $y_i$, $\hat{y}_i$, and $(y_i - \hat{y}_i)^2$, one row per ordered pair.]
$\sum (y_i - \hat{y}_i)^2$ = 152,[…] (unexplained variation)

Solution: Standard Error of Estimate
n = 10, $\sum (y_i - \hat{y}_i)^2$ = 152,[…]
$s_e = \sqrt{\dfrac{\sum (y_i - \hat{y}_i)^2}{n - 2}} \approx$ […]
The standard error of estimate of the carbon dioxide emissions for a specific gross domestic product is about […] million metric tons.

Prediction Intervals
Two variables have a bivariate normal distribution if, for any fixed value of x, the corresponding values of y are normally distributed and, for any fixed value of y, the corresponding values of x are normally distributed.

Prediction Intervals
A prediction interval can be constructed for the true value of y.
Given a linear regression equation ŷ = mx + b and x_0, a specific value of x, a c-prediction interval for y is
ŷ – E < y < ŷ + E
where
$E = t_c \, s_e \sqrt{1 + \dfrac{1}{n} + \dfrac{n (x_0 - \bar{x})^2}{n \sum x^2 - (\sum x)^2}}$
The point estimate is ŷ and the margin of error is E. The probability that the prediction interval contains y is c.
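
The margin of error for a prediction interval is commonly written as $E = t_c s_e \sqrt{1 + 1/n + n(x_0 - \bar{x})^2 / (n\sum x^2 - (\sum x)^2)}$. The short derivation below rewrites it in an equivalent form that may be easier to interpret. It uses only the standard identity for the sum of squared deviations of x and is offered as a restatement, not as an additional formula from the slides.

```latex
\begin{aligned}
\sum (x_i - \bar{x})^2 &= \sum x^2 - \frac{(\sum x)^2}{n}
  \quad\Longrightarrow\quad
  n \sum x^2 - \Bigl(\sum x\Bigr)^2 = n \sum (x_i - \bar{x})^2, \\[4pt]
E &= t_c \, s_e \sqrt{1 + \frac{1}{n}
      + \frac{n (x_0 - \bar{x})^2}{n \sum x^2 - (\sum x)^2}}
   = t_c \, s_e \sqrt{1 + \frac{1}{n}
      + \frac{(x_0 - \bar{x})^2}{\sum (x_i - \bar{x})^2}}.
\end{aligned}
```

In this form it is easier to see that E is smallest when $x_0 = \bar{x}$ and grows as $x_0$ moves away from the mean of the x-values.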

Constructing a Prediction Interval for y for a Specific Value of x
1. Identify the number of ordered pairs in the data set n and the degrees of freedom.
   In symbols: d.f. = n – 2
2. Use the regression equation and the given x-value to find the point estimate ŷ.
   In symbols: $\hat{y}_i = m x_i + b$
3. Find the critical value t_c that corresponds to the given level of confidence c.
   Use Table 5 in Appendix B.

Constructing a Prediction Interval for y for a Specific Value of x
4. Find the standard error of estimate s_e.
   In symbols: $s_e = \sqrt{\dfrac{\sum (y_i - \hat{y}_i)^2}{n - 2}}$
5. Find the margin of error E.
   In symbols: $E = t_c \, s_e \sqrt{1 + \dfrac{1}{n} + \dfrac{n (x_0 - \bar{x})^2}{n \sum x^2 - (\sum x)^2}}$
6. Find the left and right endpoints and form the prediction interval.
   In symbols: Left endpoint: ŷ – E. Right endpoint: ŷ + E. Interval: ŷ – E < y < ŷ + E
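
The six steps of the procedure can be sketched as a single function. This is an illustrative sketch under stated assumptions: the data are hypothetical, the function name `prediction_interval` is made up, and the critical value t_c is passed in by the caller because it comes from a t-table lookup. For the five hypothetical pairs here, d.f. = 3, and the two-tailed 95% critical value is 3.182.

```python
from math import sqrt


def prediction_interval(xs, ys, x0, t_c):
    """Return (left, right, E) for a prediction interval at x0.

    t_c is the critical value for d.f. = n - 2 at the chosen level of
    confidence, looked up by the caller (for example in a t-table).
    """
    # Step 1: number of ordered pairs (d.f. = n - 2 is used to look up t_c).
    n = len(xs)

    # Least-squares slope and intercept from the usual sums.
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    denom = n * sum_x2 - sum_x ** 2
    m = (n * sum_xy - sum_x * sum_y) / denom
    b = sum_y / n - m * sum_x / n

    # Step 2: point estimate at x0.
    y_hat0 = m * x0 + b

    # Step 4: standard error of estimate.
    unexplained = sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))
    s_e = sqrt(unexplained / (n - 2))

    # Step 5: margin of error.
    x_bar = sum_x / n
    E = t_c * s_e * sqrt(1 + 1 / n + n * (x0 - x_bar) ** 2 / denom)

    # Step 6: endpoints of the interval.
    return y_hat0 - E, y_hat0 + E, E


# Hypothetical data; t_c = 3.182 is the 95% two-tailed value for d.f. = 3.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
left, right, E = prediction_interval(xs, ys, x0=3.5, t_c=3.182)
print(f"{left:.3f} < y < {right:.3f}  (E = {E:.3f})")
```

Passing t_c as a parameter keeps the sketch within the standard library; a statistics package that provides the inverse t-distribution could supply it instead.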

Example: Constructing a Prediction Interval
Construct a 95% prediction interval for the carbon dioxide emission when the gross domestic product is $3.5 trillion. What can you conclude?
Recall: n = 10, ŷ = […], s_e = […]
Solution:
Point estimate: ŷ = […](3.5) […] ≈ […]
Critical value: d.f. = n – 2 = 10 – 2 = 8, t_c = […]

Solution: Constructing a Prediction Interval
Left endpoint: ŷ – E ≈ […]
Right endpoint: ŷ + E ≈ […]
Interval: […] < y < […]
You can be 95% confident that when the gross domestic product is $3.5 trillion, the carbon dioxide emissions will be between […] and […] million metric tons.

Section 9.3 Summary
• Interpreted the three types of variation about a regression line
• Found and interpreted the coefficient of determination
• Found and interpreted the standard error of the estimate for a regression line
• Constructed and interpreted a prediction interval for y