1 Lecture 4
Main Tasks Today
1. Review of Lecture 3
2. Accuracy of the LS Estimators
3. Significance Tests of the Parameters
4. Confidence Interval
5. Prediction and Prediction Interval
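2 Review of Lecture 3
The LS estimators of $\beta_0$ and $\beta_1$ in the SLR model $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$:
$$\hat\beta_1 = \frac{\sum_{i=1}^{n}(x_i-\bar x)(y_i-\bar y)}{\sum_{i=1}^{n}(x_i-\bar x)^2}, \qquad \hat\beta_0 = \bar y - \hat\beta_1 \bar x.$$
Properties: (1) Linearity (2) Unbiasedness (3) Normality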
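3 Review of Lecture 3 (cont.)
(4) Best Linear Unbiased Estimators. For any other linear unbiased estimators $\tilde\beta_0$ and $\tilde\beta_1$ of $\beta_0$ and $\beta_1$, we have
$$\mathrm{Var}(\hat\beta_0) \le \mathrm{Var}(\tilde\beta_0), \qquad \mathrm{Var}(\hat\beta_1) \le \mathrm{Var}(\tilde\beta_1).$$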
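4 Accuracy of the LS-Estimates
The noise variance $\sigma^2$ is usually unknown. It has a natural estimator using the residuals $e_i = y_i - \hat y_i$, given by
$$\hat\sigma^2 = \frac{SSE}{n-2} = \frac{\sum_{i=1}^{n} e_i^2}{n-2},$$
where SSE = Sum of Squared Errors (Residuals) and n-2 = degrees of freedom of SSE = sample size - number of coefficients.
Using the estimated noise variance, we can obtain the measures of accuracy (standard errors, s.e.) of the estimators $\hat\beta_0$ and $\hat\beta_1$:
$$s.e.(\hat\beta_0) = \hat\sigma\sqrt{\frac{1}{n} + \frac{\bar x^2}{\sum (x_i-\bar x)^2}}, \qquad s.e.(\hat\beta_1) = \frac{\hat\sigma}{\sqrt{\sum (x_i-\bar x)^2}}.$$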
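A minimal sketch of these accuracy measures, assuming the standard SLR formulas $\hat\sigma^2 = SSE/(n-2)$, $s.e.(\hat\beta_1) = \hat\sigma/\sqrt{\sum (x_i-\bar x)^2}$ and $s.e.(\hat\beta_0) = \hat\sigma\sqrt{1/n + \bar x^2/\sum (x_i-\bar x)^2}$. The five data points are hypothetical toy values (not the computer repair data), and all variable names are illustrative.

```python
import math

# Hypothetical toy data, chosen only so the arithmetic is easy to follow.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Least-squares estimates of the slope and intercept.
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Residuals, SSE, and the estimated noise variance with n - 2 degrees of freedom.
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
sse = sum(e ** 2 for e in residuals)
sigma2_hat = sse / (n - 2)

# Standard errors of the two estimators.
se_b1 = math.sqrt(sigma2_hat / s_xx)
se_b0 = math.sqrt(sigma2_hat * (1 / n + x_bar ** 2 / s_xx))

print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")
print(f"SSE = {sse:.3f}, sigma^2 hat = {sigma2_hat:.3f}")
print(f"s.e.(b0) = {se_b0:.4f}, s.e.(b1) = {se_b1:.4f}")
```

For this toy data the slope estimate is 0.6 and the intercept estimate is 2.2, with SSE = 2.4 on 3 degrees of freedom.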
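5 Significance Test of Parameters
In the SLR model, we often want to know if
(1) "Y and X are linearly uncorrelated" (X has no effect on Y) ($\beta_1 = 0$)
(2) "the linear regression line passes through the origin" (no intercept is needed) ($\beta_0 = 0$)
A hypothesis test has two hypotheses: $H_0$ (Null) and $H_1$ (Alternative). A general test about the slope (intercept) can be written as
$$H_0: \beta_1 = \beta_1^0 \ \text{ versus } \ H_1: \beta_1 \ne \beta_1^0 \qquad \left(H_0: \beta_0 = \beta_0^0 \ \text{ versus } \ H_1: \beta_0 \ne \beta_0^0\right),$$
where $\beta_1^0$ (or $\beta_0^0$) is a predetermined constant (belief of the experts). Testing (1) [or (2)] is equivalent to testing with $\beta_1^0 = 0$ [or $\beta_0^0 = 0$]:
(a) $H_0: \beta_1 = 0$ versus $H_1: \beta_1 \ne 0$
(b) $H_0: \beta_0 = 0$ versus $H_1: \beta_0 \ne 0$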
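6 Test Statistics and 2-Sided Tests
To test the hypotheses, we need to construct test statistics. A test statistic can usually be constructed as
$$t = \frac{\text{estimate} - \text{hypothesized value}}{s.e.(\text{estimate})}.$$
For example, to test (1) and (2), we have the test statistics
$$t_1 = \frac{\hat\beta_1}{s.e.(\hat\beta_1)}, \qquad t_0 = \frac{\hat\beta_0}{s.e.(\hat\beta_0)},$$
respectively, which have t-distributions with n-2 degrees of freedom (dfs) under $H_0$.
Let $t_{(n-2,\alpha/2)}$ be the upper $\alpha/2$ percentile of the t-distribution with n-2 dfs. Then the level $\alpha$ two-sided test for (1) or (2) is: reject $H_0$ if
$$|t_1| \ge t_{(n-2,\alpha/2)} \quad \text{or} \quad |t_0| \ge t_{(n-2,\alpha/2)}.$$
$t_{(n-2,\alpha/2)}$ can be found in the Appendix (Table A.2, Page 339).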
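The two-sided test can be sketched in a few lines of Python. This is an illustration under stated assumptions: the data are hypothetical toy values, `slr_fit` is a helper name introduced here, and the critical value 3.182 is the upper 2.5% point of the t-distribution with 3 dfs taken from a standard t-table.

```python
import math

def slr_fit(x, y):
    """Return (b0, b1, se_b0, se_b1) for a simple linear regression of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    sigma_hat = math.sqrt(sse / (n - 2))
    se_b1 = sigma_hat / math.sqrt(s_xx)
    se_b0 = sigma_hat * math.sqrt(1 / n + x_bar ** 2 / s_xx)
    return b0, b1, se_b0, se_b1

# Hypothetical toy data (n = 5, so n - 2 = 3 dfs).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1, se_b0, se_b1 = slr_fit(x, y)

# Test statistics for H0: beta1 = 0 and H0: beta0 = 0.
t1 = b1 / se_b1
t0 = b0 / se_b0

# Upper 2.5% point of the t-distribution with 3 dfs (standard t-table value).
t_crit = 3.182

reject_slope = abs(t1) >= t_crit
reject_intercept = abs(t0) >= t_crit
print(f"t1 = {t1:.4f}, reject H0: beta1 = 0 at the 5% level? {reject_slope}")
print(f"t0 = {t0:.4f}, reject H0: beta0 = 0 at the 5% level? {reject_intercept}")
```

With only five toy observations neither statistic exceeds the critical value, so neither null hypothesis is rejected at the 5% level.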
7 P-value of the Test
Alternatively, we can compute the p-value of the test statistic to perform the test. The p-value is the probability, under $H_0$, of observing a test statistic at least as extreme as the one actually observed. If the p-value $< \alpha$, then the test is significant at level $\alpha$. The smaller the p-value, the stronger the evidence against $H_0$. In general, we say
1) when p-value < .05, the test is significant;
2) when p-value < .01, the test is very significant;
3) when p-value > .20, the test is not significant.
Example: For the computer repair data, the observed test statistic is 6.948. Since 6.948 > 2.18, the test is significant at the 5% level. The p-value < .005, so the test is highly significant.
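A two-sided p-value can be computed directly from the t density. The sketch below is one possible approach using only the Python standard library: it integrates the Student t density with Simpson's rule. The function names `t_pdf` and `two_sided_p_value` are introduced here for illustration; in practice a statistics library would normally be used instead.

```python
import math

def t_pdf(t, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)

def two_sided_p_value(t_obs, df, steps=2000):
    """Return P(|T| >= |t_obs|) using Simpson's rule on [0, |t_obs|]."""
    a = abs(t_obs)
    if a == 0:
        return 1.0
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    central = total * h / 3  # P(0 <= T <= |t_obs|)
    return 2 * (0.5 - central)

# Sanity check: the upper 2.5% point for 3 dfs (3.182) should give p close to 0.05.
p_at_critical = two_sided_p_value(3.182, 3)

# Hypothetical test statistic with 3 dfs, for illustration only.
p_example = two_sided_p_value(2.1213, 3)

print(f"p-value at t = 3.182, 3 dfs: {p_at_critical:.4f}")
print(f"p-value at t = 2.1213, 3 dfs: {p_example:.4f}")
```

The first call checks the routine against a tabulated critical value; the second shows a statistic whose p-value lies between .10 and .20, so it would not be called significant under the rules of thumb above.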
8 One-Sided Tests: (a) (b) (c) (d) ( ) ( )
9 Minitab Output
Results for: P027.txt
Regression Analysis: Minutes versus Units
The regression equation is Minutes = Units
Predictor Coef SE Coef T P-value
Constant
Units
S = R-Sq = 98.7% R-Sq(adj) = 98.6%
Note that the T-values calculated here are for testing "the true parameters are zero".
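10 A Test Using Correlation Coefficient
Since $\beta_1 = 0$ is equivalent to Cor(Y,X) = 0, the following two tests are equivalent to each other:
(3) $H_0: \beta_1 = 0$ versus $H_1: \beta_1 \ne 0$
(4) $H_0: \rho = 0$ versus $H_1: \rho \ne 0$
where $\rho$ denotes the true correlation between Y and X. To test (4), we can use the following test statistic
$$t_1 = \frac{\mathrm{Cor}(Y,X)\sqrt{n-2}}{\sqrt{1-[\mathrm{Cor}(Y,X)]^2}},$$
which has a t-distribution with n-2 dfs under $H_0$. We can also compute the p-value to perform the test.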
11 Example:
For the computer repair data, since Cor(Y,X) = .996, we have
$$t_1 = \frac{.996\sqrt{n-2}}{\sqrt{1-.996^2}},$$
which is much larger than the critical value $t_{(n-2,.025)}$. So the correlation is highly significantly different from 0; that is, X has a strong impact on Y. We reject $H_0$ of test (4) or (3) at the 5% significance level. The corresponding p-value is very small, so the significance is very high.
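The equivalence of the correlation test and the slope test can be checked numerically. The sketch below uses hypothetical toy data and illustrative variable names, and relies on the SLR identity $SSE = S_{yy} - \hat\beta_1 S_{xy}$ to get the slope's standard error.

```python
import math

# Hypothetical toy data (not the computer repair data).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Test statistic built from the sample correlation coefficient.
r = s_xy / math.sqrt(s_xx * s_yy)
t_from_r = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Test statistic built from the slope estimate and its standard error.
b1 = s_xy / s_xx
sse = s_yy - b1 * s_xy  # SSE identity for simple linear regression
se_b1 = math.sqrt(sse / (n - 2) / s_xx)
t_from_slope = b1 / se_b1

print(f"r = {r:.4f}")
print(f"t from correlation = {t_from_r:.4f}")
print(f"t from slope       = {t_from_slope:.4f}")
```

Both routes give the same value of the statistic, which is why tests (3) and (4) always lead to the same decision.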
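12 Confidence Interval
For SLR Model: Since
$$\frac{\hat\beta_1 - \beta_1}{s.e.(\hat\beta_1)} \sim t_{n-2},$$
we have (similarly for $\beta_0$):
$$P\left(\hat\beta_1 - t_{(n-2,\alpha/2)}\, s.e.(\hat\beta_1) \le \beta_1 \le \hat\beta_1 + t_{(n-2,\alpha/2)}\, s.e.(\hat\beta_1)\right) = 1-\alpha,$$
so the $(1-\alpha)\times 100\%$ confidence interval for $\beta_1$ is
$$\hat\beta_1 \pm t_{(n-2,\alpha/2)} \times s.e.(\hat\beta_1).$$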
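A short sketch of the interval computation, assuming the usual form estimate $\pm\ t_{(n-2,\alpha/2)} \times$ s.e. for both coefficients. The data are hypothetical toy values, the helper name `conf_interval` is introduced here, and 3.182 is the tabulated upper 2.5% point of the t-distribution with 3 dfs.

```python
import math

def conf_interval(estimate, std_err, t_crit):
    """Return (lower, upper) limits: estimate -/+ t_crit * std_err."""
    half_width = t_crit * std_err
    return estimate - half_width, estimate + half_width

# Hypothetical toy data (n = 5, so n - 2 = 3 dfs).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

x_bar, y_bar = sum(x) / n, sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar
sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
sigma_hat = math.sqrt(sse / (n - 2))
se_b1 = sigma_hat / math.sqrt(s_xx)
se_b0 = sigma_hat * math.sqrt(1 / n + x_bar ** 2 / s_xx)

# Upper 2.5% point of the t-distribution with 3 dfs (standard t-table value).
t_crit = 3.182

slope_lo, slope_hi = conf_interval(b1, se_b1, t_crit)
icpt_lo, icpt_hi = conf_interval(b0, se_b0, t_crit)
print(f"95% CI for beta1: ({slope_lo:.3f}, {slope_hi:.3f})")
print(f"95% CI for beta0: ({icpt_lo:.3f}, {icpt_hi:.3f})")
```

Here both intervals contain 0, which matches the fact that the two-sided 5% tests of $\beta_1 = 0$ and $\beta_0 = 0$ do not reject for this toy data: a level $\alpha$ two-sided test rejects exactly when the hypothesized value falls outside the $(1-\alpha)\times 100\%$ interval.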
13 Interpretation of CI: Example
14 For SLR Model Prediction Interval
15 Prediction Interval (cont.)
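16 Standard Errors
The standard error of the predicted response $\hat y_0$ is larger than that of the estimated mean response $\hat\mu_0$. Reasons: (1) the future response $y_0$ is a random variable; (2) the mean response $\mu_0$ is a constant.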
17 Prediction Interval/Prediction Limits/Forecast Intervals Remark:
18 Example
19 Note that the PI bands (blue) are wider than the CI bands (red)
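The difference in band width can be reproduced numerically. The sketch below assumes the standard SLR formulas at a point $x_0$: $s.e.(\hat\mu_0) = \hat\sigma\sqrt{1/n + (x_0-\bar x)^2/\sum (x_i-\bar x)^2}$ for the mean response and $s.e.(\hat y_0) = \hat\sigma\sqrt{1 + 1/n + (x_0-\bar x)^2/\sum (x_i-\bar x)^2}$ for a new observation. The data are hypothetical toy values and the function names are illustrative.

```python
import math

# Hypothetical toy data (n = 5, so n - 2 = 3 dfs).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

x_bar, y_bar = sum(x) / n, sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar
sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
sigma_hat = math.sqrt(sse / (n - 2))

def se_mean_response(x0):
    """Standard error of the estimated mean response at x0 (used for the CI)."""
    return sigma_hat * math.sqrt(1 / n + (x0 - x_bar) ** 2 / s_xx)

def se_prediction(x0):
    """Standard error of a predicted new response at x0 (used for the PI)."""
    return sigma_hat * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / s_xx)

# Upper 2.5% point of the t-distribution with 3 dfs (standard t-table value).
t_crit = 3.182

for x0 in (1, 3, 5):
    fit = b0 + b1 * x0  # same point estimate for both intervals
    ci_half = t_crit * se_mean_response(x0)
    pi_half = t_crit * se_prediction(x0)
    print(f"x0 = {x0}: fit = {fit:.2f}, CI half-width = {ci_half:.3f}, "
          f"PI half-width = {pi_half:.3f}")
```

Both intervals are centered at the same fitted value and are narrowest at $x_0 = \bar x$; the extra 1 under the square root, which accounts for the noise in a new observation, is what makes the PI wider at every $x_0$.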
20 Remark