Copyright © Cengage Learning. All rights reserved. 13 Linear Correlation and Regression Analysis.

1 13 Linear Correlation and Regression Analysis

2 13.4 Inferences Concerning the Slope of the Regression Line

3 Inferences Concerning the Slope of the Regression Line: If random samples of size n are repeatedly taken from a bivariate population, then the calculated slopes, the $b_1$'s, will form a sampling distribution that is normally distributed with a mean of $\beta_1$, the population value of the slope, and with a variance of $\sigma_{b_1}^2$, where $\sigma_{b_1}^2 = \dfrac{\sigma_\varepsilon^2}{\sum (x - \bar{x})^2}$ (13.10), provided there is no lack of fit. An appropriate estimator for $\sigma_{b_1}^2$ is obtained by replacing $\sigma_\varepsilon^2$ by $s_e^2$, the estimate of the variance of the error about the regression line: $s_{b_1}^2 = \dfrac{s_e^2}{\sum (x - \bar{x})^2}$ (13.11)
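The sampling-distribution claim above can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: the population intercept (3.0), slope (2.0), error standard deviation (5.0), and the fixed x values 1 to 15 are made up for the demonstration and do not come from the slides. The helper names `fit_slope` and `simulate_slopes` are hypothetical.

```python
import random
import statistics


def fit_slope(x, y):
    """Least-squares slope b1 = SS(xy) / SS(x)."""
    x_bar = statistics.fmean(x)
    y_bar = statistics.fmean(y)
    ss_x = sum((xi - x_bar) ** 2 for xi in x)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return ss_xy / ss_x


def simulate_slopes(x, beta0, beta1, sigma, reps, seed=0):
    """Draw `reps` samples y = beta0 + beta1*x + normal error and fit each one."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    slopes = []
    for _ in range(reps):
        y = [beta0 + beta1 * xi + rng.gauss(0.0, sigma) for xi in x]
        slopes.append(fit_slope(x, y))
    return slopes


# Assumed (made-up) population: intercept 3, slope 2, error sd 5, x = 1..15.
x = list(range(1, 16))
slopes = simulate_slopes(x, beta0=3.0, beta1=2.0, sigma=5.0, reps=5000)

x_bar = statistics.fmean(x)
theoretical_var = 5.0 ** 2 / sum((xi - x_bar) ** 2 for xi in x)  # formula (13.10)

print("mean of simulated slopes:    ", statistics.fmean(slopes))
print("variance of simulated slopes:", statistics.variance(slopes))
print("sigma^2 / sum((x - x_bar)^2):", theoretical_var)
```

With these settings the mean of the simulated slopes should land close to the assumed population slope, and their variance close to the value given by formula (13.10).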

4 Inferences Concerning the Slope of the Regression Line: Formula (13.11) may be rewritten in the following, more manageable form: $s_{b_1}^2 = \dfrac{s_e^2}{SS(x)} = \dfrac{s_e^2}{\sum x^2 - (\sum x)^2 / n}$

5 Inferences Concerning the Slope of the Regression Line: Note: The “standard error of ___” is the standard deviation of the sampling distribution of ___. Therefore, the standard error of regression (slope) is $\sigma_{b_1}$ and is estimated by $s_{b_1}$.
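As a rough sketch of how the estimated variance and standard error of the slope could be computed from raw data, the function below uses the sum-of-squares shortcut $SS(x) = \sum x^2 - (\sum x)^2 / n$. The function name `slope_standard_error` and the five data points are hypothetical, chosen only so the arithmetic is easy to follow; they are not the commute data from Table 13.2.

```python
import math


def slope_standard_error(x, y):
    """Return (b1, s_e^2, s_b1^2, s_b1) for a simple linear regression."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    ss_x = sum(xi * xi for xi in x) - sum_x ** 2 / n  # SS(x)
    ss_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum_x * sum_y / n  # SS(xy)
    b1 = ss_xy / ss_x  # slope of the line of best fit
    b0 = (sum_y - b1 * sum_x) / n  # y-intercept
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    var_e = sse / (n - 2)  # s_e^2, variance of y about the line
    var_b1 = var_e / ss_x  # s_b1^2, estimated variance of the slope
    return b1, var_e, var_b1, math.sqrt(var_b1)


# Made-up illustration data (NOT the data from Table 13.2).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b1, var_e, var_b1, se_b1 = slope_standard_error(x, y)
print(round(b1, 4), round(var_e, 4), round(var_b1, 4))  # 0.6 0.8 0.08
```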

6 Inferences Concerning the Slope of the Regression Line: Assumptions for inferences about the linear regression: The set of (x, y) ordered pairs forms a random sample, and the y values at each x have a normal distribution. Since the population standard deviation is unknown and replaced with the sample standard deviation, the t-distribution will be used with n – 2 degrees of freedom.

7 Confidence Interval Procedure

8 The slope, $\beta_1$, of the regression line of the population can be estimated by means of a confidence interval: $b_1 \pm t(n - 2, \alpha/2) \cdot s_{b_1}$ (13.14)

9 Example 7 – Constructing a Confidence Interval for $\beta_1$, the Population Slope of the Line of Best Fit: Suppose you move to a new city and find a job. You will, of course, be concerned about the problems you will face commuting to and from work. For example, you would like to know how long it will take you to drive to work each morning. Let’s use “one-way distance to work” as a measure of where you live. You live x miles away from work and want to know how long it will take you to commute each day. Your new employer, foreseeing this question, has already collected a random sample of data to be used in answering your question. Fifteen of your new co-workers were asked to give their one-way travel times and distances to work.

10 Example 7 – Constructing a Confidence Interval for $\beta_1$, the Population Slope of the Line of Best Fit: The resulting data are shown in Table 13.2. (For convenience, the data have been arranged so that the x values are in numerical order.) Find the line of best fit and the variance of y about the line of best fit, $s_e^2$. Find the 95% confidence interval for the population’s slope, $\beta_1$. Table 13.2: Data on Commute Distances and Times [TA13-2]

11 Example 7 – Solution: Step 1 a. Parameter of interest: The slope, $\beta_1$, of the line of best fit for the population. Step 2 a. Assumptions: The ordered pairs form a random sample, and we will assume that the y values (minutes) at each x (miles) have a normal distribution. b. Probability distribution and formula: Student’s t-distribution and formula (13.14).

12 Example 7 – Solution: c. Level of confidence: $1 - \alpha = 0.95$. Step 3 Sample information: $n = 15$, $b_1 = 1.89$, $s_{b_1}^2 = 0.0813$. Step 4 a. Confidence coefficients: From Table 6 in Appendix B, we find $t(df, \alpha/2) = t(13, 0.025) = 2.16$.

13 Example 7 – Solution: b. Maximum error of estimate: We use formula (13.14) to find $E = t(n - 2, \alpha/2) \cdot s_{b_1}$: $E = (2.16)\sqrt{0.0813} = 0.6159$. c. Lower and upper confidence limits: $b_1 - E$ to $b_1 + E$, that is, $1.89 - 0.62$ to $1.89 + 0.62$. Thus, 1.27 to 2.51 is the 95% confidence interval for $\beta_1$.

14 Example 7 – Solution: Step 5 Confidence interval: We can say that the slope of the line of best fit of the population from which the sample was drawn is between 1.27 and 2.51 with 95% confidence. That is, we are 95% confident that, on average, every extra mile will add between 1.27 minutes (1 min, 16 sec) and 2.51 minutes (2 min, 31 sec) to the commute.
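The arithmetic of Example 7 can be reproduced from the summary values quoted in the solution ($n = 15$, $b_1 = 1.89$, $s_{b_1}^2 = 0.0813$, table value 2.16). This is a minimal sketch; the function name `slope_confidence_interval` is hypothetical, and the critical value is passed in by hand rather than computed, mirroring the table lookup in Step 4.

```python
import math


def slope_confidence_interval(b1, var_b1, t_crit):
    """Return (E, lower, upper) for b1 +/- t_crit * s_b1."""
    se_b1 = math.sqrt(var_b1)  # standard error of the slope
    max_error = t_crit * se_b1  # maximum error of estimate, E
    return max_error, b1 - max_error, b1 + max_error


# Values quoted in Example 7; t(13, 0.025) = 2.16 is the table lookup.
E, lower, upper = slope_confidence_interval(b1=1.89, var_b1=0.0813, t_crit=2.16)
print(round(E, 4), round(lower, 2), round(upper, 2))  # 0.6159 1.27 2.51
```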

15 Hypothesis-Testing Procedure

16 Hypothesis-Testing Procedure: We are now ready to test the hypothesis $\beta_1 = 0$. That is, we want to determine whether the equation of the line of best fit is of any real value in predicting y. For this hypothesis test, the null hypothesis is always $H_o: \beta_1 = 0$. It will be tested using Student’s t-distribution with df = n – 2 and the test statistic $t^\star$ found using formula (13.15): $t^\star = \dfrac{b_1 - \beta_1}{s_{b_1}}$ (13.15)
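A short sketch of the test statistic follows, assuming it takes the usual form (estimate minus hypothesized value) divided by the standard error of the slope. The function name `slope_t_statistic` is hypothetical; the numbers plugged in are the sample values quoted in the commute examples ($b_1 = 1.89$, $s_{b_1}^2 = 0.0813$).

```python
import math


def slope_t_statistic(b1, var_b1, beta1_null=0.0):
    """t statistic for H_o: beta_1 = beta1_null, with df = n - 2."""
    return (b1 - beta1_null) / math.sqrt(var_b1)


t_star = slope_t_statistic(b1=1.89, var_b1=0.0813)
print(round(t_star, 2))  # 6.63
```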

17 Example 9 – One-tailed Hypothesis Test for the Slope of the Regression Line: Suppose you move to a new city and find a job. You will, of course, be concerned about the problems you will face commuting to and from work. For example, you would like to know how long it will take you to drive to work each morning. Let’s use “one-way distance to work” as a measure of where you live. You live x miles away from work and want to know how long it will take you to commute each day. Your new employer, foreseeing this question, has already collected a random sample of data to be used in answering your question. Fifteen of your new co-workers were asked to give their one-way travel times and distances to work.

18 Example 9 – One-tailed Hypothesis Test for the Slope of the Regression Line: The resulting data are shown in Table 13.2. (For convenience, the data have been arranged so that the x values are in numerical order.) Find the line of best fit and the variance of y about the line of best fit, $s_e^2$. Table 13.2: Data on Commute Distances and Times [TA13-2]

19 Example 9 – One-tailed Hypothesis Test for the Slope of the Regression Line: Is the slope of the line of best fit significant enough to show that one-way distance is useful in predicting one-way travel time? Use $\alpha = 0.05$. Solution: Step 1 a. Parameter of interest: $\beta_1$, the slope of the line of best fit for the population. b. Statement of hypotheses: $H_o: \beta_1 = 0$ (This implies that x is of no use in predicting y; that is, $\hat{y} = \bar{y}$ would be as effective.) The alternative hypothesis can be either one-tailed or two-tailed. If we suspect that the slope is positive, a one-tailed test is appropriate.

20 Example 9 – Solution: $H_a: \beta_1 > 0$. (We expect travel time y to increase as the distance x increases.) Step 2 a. Assumptions: The ordered pairs form a random sample, and we will assume that the y values (minutes) at each x (miles) have a normal distribution. b. Probability distribution and test statistic: The t-distribution with df = n – 2 = 13, and the test statistic $t^\star$ from formula (13.15). c. Level of significance: $\alpha = 0.05$

21 Example 9 – Solution: Step 3 a. Sample information: $n = 15$, $b_1 = 1.89$, and $s_{b_1}^2 = 0.0813$. b. Test statistic: Using formula (13.15), we find the observed value of t: $t^\star = \dfrac{b_1 - \beta_1}{s_{b_1}} = \dfrac{1.89 - 0}{\sqrt{0.0813}} = 6.63$

22 Example 9 – Solution: Step 4 Probability Distribution: p-Value: a. Use the right-hand tail because $H_a$ expresses concern for values related to “positive.” $P = P(t^\star > 6.63$, with df = 13), as shown in the figure.

23 Example 9 – Solution: To find the p-value, use one of three methods: 1. Use Table 6 (Appendix B) to place bounds on the p-value: P < 0.005. 2. Use Table 7 (Appendix B) to place bounds on the p-value: P < 0.001. 3. Use a computer or calculator to find the p-value: P = 0.0000082. b. The p-value is smaller than the level of significance, $\alpha$.

24 Example 9 – Solution: Classical: a. The critical region is the right-hand tail because $H_a$ expresses concern for values related to “positive.” The critical value is found in Table 6: $t(13, 0.05) = 1.77$. b. $t^\star$ is in the critical region, as shown in red in the figure.

25 Example 9 – Solution: Step 5 a. Decision: Reject $H_o$. b. Conclusion: At the 0.05 level of significance, we conclude that the slope of the line of best fit in the population is greater than zero. The evidence indicates that there is a linear relationship and that the one-way distance (x) is useful in predicting the travel time to work (y).
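Both decision routes in Example 9 can be sketched in code. The block below is an illustration, not the textbook's method: it approximates the right-tail area of Student's t-distribution by integrating the density with Simpson's rule (standard library only), where a statistics package would normally be used. The helper names `t_pdf` and `t_upper_tail` are hypothetical, and the critical value 1.77 is typed in from a standard t table as an assumption.

```python
import math


def t_pdf(t, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(t * t / df))


def t_upper_tail(t, df, steps=20000):
    """P(T > t), via Simpson's rule on [0, |t|]; steps must be even."""
    if t < 0:
        return 1.0 - t_upper_tail(-t, df, steps)
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3  # half the area lies to the right of 0


# Sample values quoted in Example 9.
df = 15 - 2
alpha = 0.05
t_star = (1.89 - 0) / math.sqrt(0.0813)

# p-value approach: reject H_o when the right-tail area is below alpha.
p_value = t_upper_tail(t_star, df)
reject_by_p = p_value < alpha

# Classical approach: critical value assumed from a t table, t(13, 0.05).
critical_value = 1.77
reject_by_critical = t_star > critical_value

print("t* =", round(t_star, 2))
print("p-value =", p_value)
print("reject H_o (p-value approach):", reject_by_p)
print("reject H_o (classical approach):", reject_by_critical)
```

Both approaches lead to the same decision for these sample values, which is expected since they are two views of the same tail area.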

