
Solution 9


1. a) From the matrix plot:
1) The linearity assumption seems OK.
2) The assumptions about the measurement errors cannot be checked at this level.
3) The assumption about the predictor variables seems to be violated, since there is strong collinearity between some predictor variables, e.g., between X3 and X6 and between X1 and X6.
4) The assumption about the observations may be violated: there seem to be some outliers.
1/1/2019 ST Solution 9

The assumptions about the measurement errors may be checked via the residual plots on the right-hand side.
a) From the normal probability plot and the histogram of the standardized residuals, it seems that the normality assumption is violated.
b) From the index plot of the standardized residuals, the homogeneity (constant variance) assumption seems slightly violated: the variances at the left end of the plot appear smaller than those at the right end.
c) The mean-0 assumption is never checked.
d) The independence assumption seems OK, judging from the index plot of the standardized residuals. However, we cannot be 100% sure based on the picture alone.
From the index plot, it seems that observations 34 and 38 are outliers. Thus, the assumption that the observations are equally reliable is violated.

b) The table is omitted here.
c) The plots are as below. From the index plot of SRES, we can see that observations 34 and 38 are outliers. From the index plot of Cook's distance, observations 34 and 38 stand out as influential points. However, the cutoff value 4(p+1)/(n-p-1) = 4*7/(40-6-1) = 0.8485 fails to identify any influential points.

From the index plot of DFIT, we can see that observations 34 and 38 are influential points. Here the cutoff value 2((p+1)/(n-p-1))^{1/2} works. From the Hadi measure, we fail to detect any influential points.
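The two cutoffs quoted for Cook's distance and DFIT can be evaluated directly. The short sketch below assumes n = 40 and p = 6, the values that appear in the Cook's distance arithmetic 4*7/(40-6-1); the variable names are illustrative only.

```python
import math

# Assumed from the Cook's distance arithmetic 4*7/(40-6-1):
n, p = 40, 6  # number of observations, number of predictors

# Cutoff used for Cook's distance: 4(p+1)/(n-p-1)
cook_cutoff = 4 * (p + 1) / (n - p - 1)

# Cutoff used for DFIT: 2((p+1)/(n-p-1))^{1/2}
dfits_cutoff = 2 * math.sqrt((p + 1) / (n - p - 1))

print(f"Cook's distance cutoff: {cook_cutoff:.4f}")
print(f"DFIT cutoff:            {dfits_cutoff:.4f}")
```

Under these assumptions the Cook's distance cutoff is about 0.8485 and the DFIT cutoff about 0.9211; an observation would be flagged when the absolute value of its measure exceeds the corresponding cutoff.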

The potential-residual plot
From the HHi-axis, observations 8, 9, and 15 should be identified as high leverage points, but they are not outliers. From the Ddi-axis, observations 34 and 38 are outliers, but they are not high leverage points.
d) Observations 34 and 38 are outliers (in the Y-direction) but not high leverage points. Observations 8, 9, and 15 are high leverage points but not outliers in the Y-direction.

Regression Analysis: Y versus X1, X2, X3
Regression output (numeric entries lost in extraction): predictors Constant, X1, X2, X3; R-Sq = 94.1%, R-Sq(adj) = 93.6%.
(a) Beta3 = Sum(u_i v_i)/Sum(v_i^2), which agrees with the fitted coefficient of X3; verified.
(b) SE(Beta3) = S/(Sum(v_i^2))^{1/2} = 31.63/(Sum(v_i^2))^{1/2}, as desired.
2. From the SRES-axis, we can see that observations 7 and 18 are outliers, but observation 18 is not a high leverage point. From the Pii-axis, we can see that observations 7 and 11 are high leverage points, but observation 11 is not an outlier in the Y-direction.
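The identities in parts (a) and (b) above can be checked numerically. The sketch below assumes that u_i are the residuals from regressing Y on X1 and X2, and that v_i are the residuals from regressing X3 on X1 and X2; the source does not define them, so this reading is an assumption. The data are synthetic (not the homework data), and `solve` and `ols` are small hypothetical helpers written for this illustration.

```python
import math
import random

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * q for a, q in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def ols(X, y):
    """Least-squares coefficients and residuals via the normal equations."""
    k = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    beta = solve(XtX, Xty)
    resid = [yi - sum(b * x for b, x in zip(beta, r)) for r, yi in zip(X, y)]
    return beta, resid

# Synthetic data for illustration only.
random.seed(0)
n = 20
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [random.gauss(0, 1) for _ in range(n)]
x3 = [0.5 * a + random.gauss(0, 1) for a in x1]
y = [1 + 2 * a - b + 3 * c + random.gauss(0, 1) for a, b, c in zip(x1, x2, x3)]

full = [[1.0, a, b, c] for a, b, c in zip(x1, x2, x3)]
reduced = [[1.0, a, b] for a, b in zip(x1, x2)]

beta_full, e_full = ols(full, y)   # Y on X1, X2, X3
_, u = ols(reduced, y)             # u_i: residuals of Y on X1, X2
_, v = ols(reduced, x3)            # v_i: residuals of X3 on X1, X2

# (a) Beta3 = Sum(u_i v_i) / Sum(v_i^2)
beta3 = sum(ui * vi for ui, vi in zip(u, v)) / sum(vi * vi for vi in v)

# (b) SE(Beta3) = S / (Sum(v_i^2))^{1/2}, with S from the full model
S = math.sqrt(sum(e * e for e in e_full) / (n - 4))
se_beta3 = S / math.sqrt(sum(vi * vi for vi in v))

print(beta_full[3], beta3, se_beta3)
```

The first two printed numbers should agree up to rounding error, which is the content of (a); this is also why the slope in an added-variable plot equals the coefficient the new variable would receive in the enlarged model.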

4. a) The added-variable plot is drawn and put on the right-hand side. The fitted results are as below. From the F-test in the ANOVA table, we can see that the overall fit is highly significant. It follows that we should add X4 to the model.
Regression Analysis: R(YoX123) versus R(X4oX123)
R-Sq = 24.2%, R-Sq(adj) = 22.2% (the fitted equation and the ANOVA entries were lost in extraction).

2. b) The added-variable plot is put on the right-hand side. The fitted line is almost flat. The F-test for the overall fit is not significant, with p-value 0.434 (from the ANOVA table). It follows that we should not add X5 to the model.
Regression Analysis: R(YoX1234) versus R(X5oX1234)
R-Sq = 1.6%, R-Sq(adj) = 0.0% (the fitted equation and the ANOVA entries were lost in extraction).

2. c) The added-variable plot is put on the right-hand side. The fitted line is almost flat. The F-test for the overall fit is not significant, with p-value 0.849 (from the ANOVA table). It follows that we should not add X6 to the model.
Regression Analysis: R(YoX1234) versus R(X6oX1234)
R-Sq = 0.1%, R-Sq(adj) = 0.0% (the fitted equation and the ANOVA entries were lost in extraction).
2. d) Since we cannot add X5 or X6 to the model, the best model contains at most 4 predictor variables. Since all coefficients except the intercept in the model Y vs X1, X2, X3, and X4 are significant, the best model should be Y vs X1, X2, X3, and X4.
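Since the ANOVA tables did not survive, the F statistics behind the three added-variable fits can be roughly reconstructed from the reported R-Sq values. The sketch below assumes n = 40 observations (the value used in the Cook's distance cutoff of part 1 c) and a simple regression of one residual vector on the other, so the error degrees of freedom are n - 2. The function name `f_and_adj` is hypothetical, and the results are approximate because R-Sq is rounded in the output.

```python
def f_and_adj(r2, n):
    """F statistic and adjusted R-Sq for a one-predictor regression."""
    df_error = n - 2
    f = r2 / (1 - r2) * df_error              # F = (R^2 / 1) / ((1 - R^2) / (n - 2))
    adj = 1 - (1 - r2) * (n - 1) / df_error   # adjusted R-Sq
    return f, adj

n = 40  # assumed sample size

# Reported R-Sq values for the added-variable regressions of X4, X5, X6.
for name, r2 in [("X4", 0.242), ("X5", 0.016), ("X6", 0.001)]:
    f, adj = f_and_adj(r2, n)
    print(f"{name}: F(1, {n - 2}) = {f:.2f}, R-Sq(adj) = {100 * adj:.1f}%")
```

Under these assumptions the X4 fit gives an F statistic of about 12, well above the 5% critical value of F(1, 38) (roughly 4.1), while the X5 and X6 fits give F statistics below 1, in line with the conclusions above. The adjusted R-Sq for X4 comes out at about 22.2%, matching the reported value; for X5 and X6 the formula yields slightly negative numbers, which the output presumably shows as 0.0%.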

