Lecture 11: Prediction and Prediction Intervals
Outline:
  Review of Lecture 10
  Prediction and Prediction Intervals
  More Examples about Model Comparison
11/21/2018 ST3131, Lecture 11
Steps for Model Comparison
Test H0: the RM is adequate vs H1: the FM is adequate.
Step 1: Fit the FM and get SSE(F) and df(F) (in the ANOVA table) and R_sq (under the Coefficient Table).
Step 2: Fit the RM and get SSE(R), df(R), and R_sq.
Step 3: Compute the F-statistic: F = [(SSE(R) - SSE(F)) / r] / [SSE(F) / df(F)], where r = df(R) - df(F).
Step 4: Conclusion: reject H0 if F > F(r, df(F), alpha); otherwise H0 cannot be rejected.
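The steps above collect both SSE and R_sq for each model, and the test statistic can be written with either one. The sketch below is a minimal illustration of that equivalence, assuming both models are fitted to the same response so that they share the same SST. The function names and all numbers are hypothetical and chosen only for illustration.

```python
def f_from_sse(sse_r, df_r, sse_f, df_f):
    """Partial F-statistic from the residual sums of squares of the RM and FM."""
    r = df_r - df_f  # number of restrictions imposed by the reduced model
    return ((sse_r - sse_f) / r) / (sse_f / df_f)


def f_from_rsq(rsq_r, df_r, rsq_f, df_f):
    """Same statistic written with R_sq (valid when both models share one SST)."""
    r = df_r - df_f
    return ((rsq_f - rsq_r) / r) / ((1.0 - rsq_f) / df_f)


# Hypothetical numbers, not taken from the lecture.
sst = 1000.0
sse_f, df_f = 400.0, 45  # full model
sse_r, df_r = 520.0, 48  # reduced model

f_sse = f_from_sse(sse_r, df_r, sse_f, df_f)
f_rsq = f_from_rsq(1 - sse_r / sst, df_r, 1 - sse_f / sst, df_f)
print(round(f_sse, 4), round(f_rsq, 4))
```

The two values agree because R_sq = 1 - SSE/SST, so SST cancels in the ratio.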
Special Case: ANOVA Table (Analysis of Variance)

Source       Sum of Squares   df      Mean Square       F-test      P-value
Regression   SSR              p       MSR=SSR/p         F=MSR/MSE
Residuals    SSE              n-p-1   MSE=SSE/(n-p-1)
Total        SST              n-1
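As a quick check of the table layout, the following sketch fills in the derived columns from SSR, SSE, n and p. The helper name anova_table is hypothetical. The input numbers are the sums of squares of the reduced model in Table 3.15 of this lecture (n=93, p=1), so the F value can be compared with the 18.60 printed there.

```python
def anova_table(ssr, sse, n, p):
    """Derive df, mean squares, F and R_sq from the two sums of squares."""
    df_reg, df_res = p, n - p - 1
    msr = ssr / df_reg
    mse = sse / df_res
    sst = ssr + sse
    return {
        "df_reg": df_reg,
        "df_res": df_res,
        "df_tot": n - 1,
        "MSR": msr,
        "MSE": mse,
        "SST": sst,
        "F": msr / mse,
        "R_sq": ssr / sst,
    }


# Sums of squares from Table 3.15 (Salary regressed on Education only).
table = anova_table(ssr=7862535, sse=38460756, n=93, p=1)
for name, value in table.items():
    print(name, round(value, 4))
```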
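Predictions
Recall the prediction for the SLR model at a new value x0:
Predicted value of a single response: y0_hat = b0_hat + b1_hat*x0, with s.e.(y0_hat) = s*sqrt(1 + 1/n + (x0 - x_bar)^2 / sum((xi - x_bar)^2)).
Estimated mean response: mu0_hat = b0_hat + b1_hat*x0, with s.e.(mu0_hat) = s*sqrt(1/n + (x0 - x_bar)^2 / sum((xi - x_bar)^2)).
Prediction interval: y0_hat +/- t(n-2, alpha/2)*s.e.(y0_hat). Confidence interval for the mean response: mu0_hat +/- t(n-2, alpha/2)*s.e.(mu0_hat).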
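Prediction for the MLR Model
At a new point x0 = (1, x01, ..., x0p)', the predicted value y0_hat and the estimated mean response mu0_hat are both b0_hat + b1_hat*x01 + ... + bp_hat*x0p.
Standard Errors
s.e.(y0_hat) = s*sqrt(1 + x0'(X'X)^(-1) x0) for a single response and s.e.(mu0_hat) = s*sqrt(x0'(X'X)^(-1) x0) for the mean response, where X is the n x (p+1) design matrix with a leading column of ones. The intervals use t(n-p-1, alpha/2).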
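To make the difference between predicting a single response and estimating a mean response concrete, here is a minimal sketch for the SLR case, assuming the usual least-squares formulas. The function name slr_prediction and the five data points are hypothetical. The critical value t(3, .025) = 3.182 is read from a t table rather than computed.

```python
import math


def slr_prediction(x, y, x0):
    """Return (fitted value at x0, s.e. for a single response, s.e. for the mean response)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))  # residual standard error
    fit = b0 + b1 * x0
    h0 = 1 / n + (x0 - x_bar) ** 2 / sxx
    return fit, s * math.sqrt(1 + h0), s * math.sqrt(h0)


# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

fit, se_pred, se_mean = slr_prediction(x, y, x0=6)
t_crit = 3.182  # t(n-2, .025) with n-2 = 3, taken from a t table
pred_interval = (fit - t_crit * se_pred, fit + t_crit * se_pred)
mean_interval = (fit - t_crit * se_mean, fit + t_crit * se_mean)
print(fit, pred_interval, mean_interval)
```

Both intervals are centred at the same fitted value. The prediction interval is always the wider one because of the extra 1 under the square root.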
Problem 3.5 (Page 76, textbook)
Table 3.11 shows the regression output, with some numbers erased, when a simple regression model relating a response variable Y to a predictor variable X1 is fitted based on 20 observations. Complete the 13 missing numbers, then compute Var(Y) and Var(X1).

ANOVA Table
Source       Sum of Squares   df   Mean Square   F-test
Regression   1848.76
Residual
Total

Coefficient Table
Variable   Coefficients   s.e.    T-test   P-value
Constant   -23.4325       12.74            .0824
X1                        .1528   8.32     <.0001

n=____   R^2=____   Ra^2=____   S=____   df=____
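One possible route to the missing entries is sketched below. It assumes the standard SLR identities: F equals the square of the slope's t-statistic, s.e.(b1) = s/sqrt(Sxx), and the sample variances use the divisor n-1. The variable names are hypothetical, and only the numbers printed in Table 3.11 are taken as given.

```python
import math

# Given in the problem statement and Table 3.11.
n = 20
ssr = 1848.76
b0, se_b0 = -23.4325, 12.74
se_b1, t_b1 = 0.1528, 8.32

# Degrees of freedom for a simple regression (p = 1).
df_reg, df_res, df_tot = 1, n - 2, n - 1

# ANOVA table.
msr = ssr / df_reg
f_stat = t_b1 ** 2   # in SLR, F is the square of the slope's t-statistic
mse = msr / f_stat
sse = mse * df_res
sst = ssr + sse

# Coefficient table.
b1 = t_b1 * se_b1
t_b0 = b0 / se_b0

# Summary line.
r_sq = ssr / sst
ra_sq = 1 - (sse / df_res) / (sst / df_tot)
s = math.sqrt(mse)

# Sample variances, assuming the n-1 divisor.
var_y = sst / df_tot
sxx = mse / se_b1 ** 2   # from s.e.(b1) = s / sqrt(Sxx)
var_x1 = sxx / df_tot

print(round(f_stat, 2), round(mse, 2), round(sse, 2), round(sst, 2))
print(round(b1, 4), round(t_b0, 2), round(r_sq, 4), round(ra_sq, 4), round(s, 3))
print(round(var_y, 1), round(var_x1, 1))
```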
Problem 3.12 (Page 78, textbook)
Table 3.14 shows the regression output of an MLR model relating the beginning salaries in dollars of employees in a given company to the following predictor variables:
  Sex (X1): An indicator variable (man=1, woman=0)
  Education (X2): Years of schooling at the time of hire
  Experience (X3): Number of months of previous work experience
  Months (X4): Number of months with the company
In (a)-(b) below, specify the null and alternative hypotheses, the test used, and your conclusion using a 5% level of significance.
Table 3.14

ANOVA Table
Source       Sum of Squares   df   Mean Square   F-test
Regression   23665352         4    5916338       22.98
Residual     22657938         88   257477
Total        46323290         92

Coefficient Table
Variable     Coefficients   s.e.    T-test   P-value
Constant     3526.4         327.7   10.76    .000
Sex          722.5          117.8   6.13
Education    90.02          24.69   3.65
Experience   1.2690         .5877   2.16     .034
Months       23.406         5.201   4.50

n=93   R^2=.515   Ra^2=.489   S=507.4   df=88
(a) Conduct the F-test for the overall fit of the regression (F(4,88,.05)<2.53).
Test H0: ____ vs H1: ____
Statistic F=____ df=( , )
Conclusion: ____ H0; the overall fit is significant.
(b) Is there a positive linear relationship between Salary and Experience, after accounting for the effect of the variables Sex, Education, and Months?
Test H0: ____ vs H1: ____
Statistic T=____ P-value=____
Conclusion: ____ H0. The positive relationship is significant at the 5% significance level.
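The arithmetic behind (a) and (b) can be checked with the short sketch below, which uses only the entries of Table 3.14. Two assumptions are made here and are not stated on the slide: in (b) the alternative is taken to be one-sided (coefficient of Experience greater than zero), and the one-sided P-value is obtained by halving the two-sided .034 printed in the table. The variable names are hypothetical.

```python
# Part (a): overall F-test, H0: all four slope coefficients are zero.
ssr, sse = 23665352, 22657938
p, df_res = 4, 88
f_stat = (ssr / p) / (sse / df_res)
f_crit_upper = 2.53  # the slide states F(4,88,.05) < 2.53
reject_overall = f_stat > f_crit_upper  # exceeding the bound implies exceeding the critical value

# Part (b): t-test for the Experience coefficient, assumed one-sided (H1: beta3 > 0).
b3, se_b3 = 1.2690, 0.5877
t_stat = b3 / se_b3
p_two_sided = 0.034  # from the Coefficient Table
p_one_sided = p_two_sided / 2 if t_stat > 0 else 1 - p_two_sided / 2
reject_b = p_one_sided < 0.05

print(round(f_stat, 2), reject_overall)
print(round(t_stat, 2), p_one_sided, reject_b)
```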
(c) What salary would you forecast for a man with 12 years of Education, 10 months of Experience, and 15 months with the company?
(d) What salary would you forecast, on average, for a man with 12 years of Education, 10 months of Experience, and 15 months with the company?
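The point forecast is the same for a single employee and for the average over such employees. The two cases differ only in their standard errors, which need (X'X)^(-1) and so cannot be recovered from Table 3.14 alone. The sketch below evaluates the fitted equation with the coefficients of Table 3.14. The function name forecast_salary and the dictionary keys are hypothetical.

```python
# Coefficients copied from the Coefficient Table in Table 3.14.
COEF = {
    "const": 3526.4,
    "sex": 722.5,
    "education": 90.02,
    "experience": 1.2690,
    "months": 23.406,
}


def forecast_salary(sex, education, experience, months):
    """Fitted beginning salary in dollars (sex: man=1, woman=0)."""
    return (
        COEF["const"]
        + COEF["sex"] * sex
        + COEF["education"] * education
        + COEF["experience"] * experience
        + COEF["months"] * months
    )


man = forecast_salary(sex=1, education=12, experience=10, months=15)
print(round(man, 2))
```

Calling the function with sex=0 and the same other values gives the corresponding forecast for a woman, which is lower by exactly the Sex coefficient, 722.5.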
(e) What salary would you forecast, on average, for a woman with 12 years of Education, 10 months of Experience, and 15 months with the company?

Problem 3.13 (Page 79, textbook)
Consider the regression model that generated the output in Table 3.14 to be a Full Model. Now consider the Reduced Model in which Salary is regressed on Education only. The ANOVA table obtained when fitting this model is shown in Table 3.15. Conduct a single test to compare the Full and Reduced Models. What conclusion can be drawn from the result of the test? (Use a 5% significance level.)
Table 3.15
ANOVA Table
Source       Sum of Squares   df   Mean Square   F-test
Regression   7862535          1    7862535       18.60
Residual     38460756         91   422646
Total        46323291         92

Test H0: ____ vs H1: ____
Statistic SSE(R)=____ df(R)=____ SSE(F)=____ df(F)=____ F=____ df=( , ).
Conclusion: ____ H0. The Reduced Model is not adequate.
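A minimal sketch of the computation for this comparison, using the Residual rows of Tables 3.14 and 3.15. The critical value 2.71 is an approximate F(3,88,.05) read from an F table. It is an assumption, not a number given in the lecture, and the variable names are hypothetical.

```python
# Residual rows of the two ANOVA tables.
sse_r, df_r = 38460756, 91  # Reduced Model (Table 3.15)
sse_f, df_f = 22657938, 88  # Full Model (Table 3.14)

r = df_r - df_f  # number of coefficients set to zero by the Reduced Model
f_stat = ((sse_r - sse_f) / r) / (sse_f / df_f)

f_crit = 2.71  # approximate F(3,88,.05) from an F table (assumption)
reject_h0 = f_stat > f_crit

print(r, round(f_stat, 2), reject_h0)
```

The statistic is far above any plausible 5% critical value, so the decision does not depend on the exact table value used.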
After-class Questions:
Why can the ANOVA table be used to test whether R_sq=0?
Why can the F-test be used to test whether the effect of a predictor variable is significant?