Section 9-3 We already know how to calculate the correlation coefficient, r. The square of this coefficient is called the coefficient of determination. The coefficient of determination is equal to the ratio of the unexplained variation to the total variation. In other words, if 𝑟 2 = .81, then 81% of the variation between x and y can be explained by the relationship between x and y. The other 19% of the variation is unexplained and is due to other factors or to sampling error.
Section 9-3 How to find the Standard Error of Estimate: 1) Go to STAT – Edit – Put x values into L1 and y values into L2 2) STAT – TEST – F The Standard Error of Estimate is the standard deviation of the residuals (s). Scroll down the list of values given as the results of the test, and find s.
Section 9-3 Construct a Prediction Interval for a Specific x-value (x0). 1) Determine degrees of freedom (n – 2) 2) Use regression equation and given x (x0) to find 𝑦 . 3) Find the critical t value (tc) that corresponds to the level of confidence (c ) by using the calculator (InvT( 1−𝑐 2 ), with degrees of freedom being found at n – 2). 4) Use the tc value and the Se value, to calculate the margin of error (E).
Section 9-3 Construct a Prediction Interval for a Specific x-value (x0). 𝐸= 𝑡 𝑐 𝑆 𝑒 1+ 1 𝑛 + 𝑛 𝑥 0 − 𝑥 2 𝑛 𝑥 2 − 𝑥 2 𝑛= sample size 𝑥 0 is the x value that you used to find 𝑦 𝑥 is the sample mean. 𝑥 2 is total of all the squared x’s. Square first, then add them up. 𝑥 2 is the total of x’s squared. Add first, then square the answer. The values for 𝑥 , 𝑥 2 , and 𝑥 can all be found by going to STAT– Calc–1 (1-Var Stats) 5) Find the left and right endpoints by subtracting E from 𝑦 and then adding E to 𝑦 . These answers are your interval.
Example 1 (Page 526) The correlation coefficient for the advertising expenses and company sales data as calculated in Example 4 of Section 9-1 is 𝑟≈0.913. Find the coefficient of determination. What does this tell you about the explained variation of the data about the regression line? About the unexplained variation? 𝑟 2 = 0.913 2 ≈0.834 About 83.4% of the variation in the company sales can be explained by the variation in the advertising expenditures. About 16.6% (the rest) of the variation is unexplained and is due to chance or other variables.
Put x values into L1 and y values into L2 2) STAT–TEST–F Example 2 (Page 528) The regression equation for the advertising expenses and company sales data as calculated in Example 1 of Section 9-2 is 𝑦 =50.729𝑥+104.061. Find the standard error of estimate. 1) Go to STAT–Edit– Put x values into L1 and y values into L2 2) STAT–TEST–F Find s on the list of values given from this test. 3) The standard error of the estimate is 10.29 x 2.4 1.6 2.0 2.6 1.4 2.2 y 225 184 220 240 180 186 215
Example 3 (Page 530) Using the results of Example 2, construct a 95% prediction interval for the company sales when the advertising expenses are $2100. What can you conclude? We were told that 𝑦 =50.729𝑥+104.061, so we plug in 2.1 for x to find 𝑦 . 𝑦 =210.59. From here, we need to be able to use the formula for the margin of error. It’s not pretty, but it works. 𝐸= 𝑡 𝑐 𝑠 𝑒 1+ 1 𝑛 + 𝑛( 𝑥 0 − 𝑥 ) 2 𝑛( 𝑥 2 )−( 𝑥 ) 2 𝑡 𝑐 =2.447 (2nd VARS 4, (1 - .95)/2, with 6 degrees of freedom). 𝑠 𝑒 =10.29, from last example. 𝑛=8 𝑥 0 =2.1 (this is the x value we used to find 𝑦 . 𝑥 =1.975, ( 𝑥 2 )=32.44, ( 𝑥) =15.8 These values are from STAT- Calc–1 (1-Var Stats).
Example 3 (Page 530) Using the results of Example 2, construct a 95% prediction interval for the company sales when the advertising expenses are $2100. What can you conclude? 𝐸= 2.447 10.29 1+ 1 8 + 8( 2.1−1.975) 2 8 32.44 − 15.8 2 ≈26.857 We subtract 26.857 from 210.59 to get the low end of the estimate. We add 26.857 to 210.59 to get the upper end of the estimate. 183.733 < y < 237.447 We can be 95% confident that when advertising expenses are $2100, the company sales will be between $183,733 and $237,447.
Use the data given below to construct a 99% confidence interval for y when x = 2950, if possible. If not possible, tell why. Use 0.01 for α x y 1467 606 1728 659 2150 716 2651 774 2629 797 2619 860 2533 894 3080 959 3475 1024 Enter x-values into L1 and y-values into L2 STAT–Test–F Use L1 for Xlist and L2 for Ylist Leave Freq: at 1 Select ≠ for a two-tailed test For RegEQ, press Alpha Trace and then Enter This will automatically store the linear regression equation into the Y= screen for you. Calculate. The standardized test statistic (t) = 8.262 The p-value is 7.416E-5 (.00007) The Standard Error (s) is 44.981 The correlation coefficient (r) is .952 This indicates a strong positive correlation. The coefficient of determination (r2) is .907 This means that 90.7% of the variation between x and y is due to their relationship.
Use the data given below to construct a 99% confidence interval for y when x = 2950, if possible. If not possible, tell why. Use 0.01 for α x y 1467 606 1728 659 2150 716 2651 774 2629 797 2619 860 2533 894 3080 959 3475 1024 t = 8.262 p = 7.416E-5 (.00007) s = 44.981 r = .952 r2 = .907 n = 9 Because p < α, we reject the null and conclude that we have a significant relationship. We may proceed and use the equation to make predictions. y = .210x + 288.1799 OR 288.1799 + .210x Setting our Graph to start at 2950 (2nd Window) gives us a y-value of 908.428. (this works because the equation is already in y=) 908.428 is y-hat. STAT-Calc-1 on L1 (the x-values) gives us: x-bar = 2481.333 Σx = 22332 Σx2 = 58537290 2nd VARS 4 on (1-.99)/2 with 7 (n – 2) degrees of freedom gives us a tc value of 3.499.
Use the data given below to construct a 99% confidence interval for y when x = 2950, if possible. If not possible, tell why. Use 0.01 for α x y 1467 606 1728 659 2150 716 2651 774 2629 797 2619 860 2533 894 3080 959 3475 1024 t = 8.262 p = 7.416E-5 s = 44.981 r = .952 r2 = .907 n = 9 tc = 3.499 𝑦 =908.428 𝑥 =2481.333 𝑥=22332 𝑥 2 =58537290 Let’s plug the numbers into the formula to find our interval. 𝐸= 𝑡 𝑐 𝑠 𝑒 1+ 1 𝑛 + 𝑛( 𝑥 0 − 𝑥 ) 2 𝑛( 𝑥 2 )−( 𝑥 ) 2 becomes 𝐸=3.499(44.981) 1+ 1 9 + 9( 2950−2481.333) 2 9(58537290)− 22332 2 𝐸=171.070 Now, add and subtract E from 𝑦 to get our interval. 908.428 – 171.070 < y < 908.428 + 171.070 737.358 < y < 1079.498 We can be 99% confident that when x is 2950, the actual value of y will be between 737.358 and 1079.498.
Assignments: Classwork: Page 531 #4-8 All Homework: Pages 531-534, # 9-24 All