Model selection: Stepwise regression
Statement of problem A common problem is that there is a large set of candidate predictor variables. (Note: The examples herein are really not that large.) The goal is to choose a small subset from the larger set so that the resulting regression model is simple yet has good predictive ability.
Example: Cement data Response y: heat evolved in calories during hardening of cement, on a per-gram basis Predictor x1: % of tricalcium aluminate Predictor x2: % of tricalcium silicate Predictor x3: % of tetracalcium alumino ferrite Predictor x4: % of dicalcium silicate
Two basic methods of selecting predictors Stepwise regression: Enter and remove predictors, in a stepwise manner, until there is no justifiable reason to enter or remove more. Best subsets regression: Select the subset of predictors that does best at meeting some well-defined objective criterion.
Stepwise regression: the idea Start with no predictors in the “stepwise model.” At each step, enter or remove a predictor based on the partial F-tests (that is, the t-tests). Stop when no more predictors can be justifiably entered into or removed from the stepwise model.
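The parenthetical above, that the partial F-tests are the t-tests, is the standard single-coefficient identity (a textbook fact, not derived on these slides): for one candidate coefficient,

```latex
F = \frac{\bigl(SSE(\text{reduced}) - SSE(\text{full})\bigr) / 1}{MSE(\text{full})} = t^2
```

so the P-value of the partial F-test equals the two-sided P-value of the t-test for that coefficient, and either one can be compared with the entry or removal level.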
Stepwise regression: Preliminary steps 1. Specify an Alpha-to-Enter significance level (αE = 0.15). 2. Specify an Alpha-to-Remove significance level (αR = 0.15).
Stepwise regression: Step #1 1. Fit each of the one-predictor models, that is, regress y on x1, regress y on x2, …, regress y on xp-1. 2. The first predictor put in the stepwise model is the predictor that has the smallest t-test P-value (below αE = 0.15). 3. If no P-value < 0.15, stop.
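A minimal sketch of Step #1 in Python with statsmodels, assuming the data sit in a pandas DataFrame df with a response column and candidate predictor columns; the function name, DataFrame, and column layout are illustrative assumptions, not part of the slides:

```python
import statsmodels.api as sm

def step1_best_single_predictor(df, response, candidates, alpha_enter=0.15):
    """Fit each one-predictor model and return the predictor with the
    smallest t-test P-value, or None if no P-value is below alpha_enter."""
    best_pred, best_p = None, alpha_enter
    for x in candidates:
        X = sm.add_constant(df[[x]])        # intercept + single predictor
        fit = sm.OLS(df[response], X).fit()
        p = fit.pvalues[x]                  # P-value for H0: beta_x = 0
        if p < best_p:
            best_pred, best_p = x, p
    return best_pred                        # None means: stop, nothing enters
```

Each candidate model is refit from scratch, mirroring the slide: only the single predictor's t-test P-value decides whether it enters.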
Stepwise regression: Step #2 1. Suppose x1 was the “best” single predictor. 2. Fit each of the two-predictor models with x1 in the model, that is, regress y on (x1, x2), regress y on (x1, x3), …, and regress y on (x1, xp-1). 3. The second predictor put in the stepwise model is the predictor that has the smallest t-test P-value (below αE = 0.15). 4. If no P-value < 0.15, stop.
Stepwise regression: Step #2 (continued) 1. Suppose x2 was the “best” second predictor. 2. Step back and check the P-value for β1 = 0. If the P-value for β1 = 0 has become not significant (above αR = 0.15), remove x1 from the stepwise model.
Stepwise regression: Step #3 1. Suppose both x1 and x2 made it into the two-predictor stepwise model. 2. Fit each of the three-predictor models with x1 and x2 in the model, that is, regress y on (x1, x2, x3), regress y on (x1, x2, x4), …, and regress y on (x1, x2, xp-1).
Stepwise regression: Step #3 (continued) 1. The third predictor put in the stepwise model is the predictor that has the smallest t-test P-value (below αE = 0.15). 2. If no P-value < 0.15, stop. 3. Step back and check the P-values for β1 = 0 and β2 = 0. If either P-value has become not significant (above αR = 0.15), remove the corresponding predictor from the stepwise model.
Stepwise regression: Stopping the procedure The procedure is stopped when adding an additional predictor does not yield a t-test P-value below αE = 0.15.
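Putting Steps #1 through #3 and the stopping rule together, here is a hedged sketch of the whole procedure in Python with statsmodels; stepwise_select is a hypothetical helper written to mimic the P-value entry/removal rules above, not a packaged routine:

```python
import statsmodels.api as sm

def stepwise_select(df, response, candidates, alpha_enter=0.15, alpha_remove=0.15):
    """Forward entry with backward removal, both driven by t-test P-values."""
    included = []
    while True:
        changed = False

        # Entry step: try each excluded predictor alongside the current model.
        excluded = [x for x in candidates if x not in included]
        entry_p = {}
        for x in excluded:
            X = sm.add_constant(df[included + [x]])
            entry_p[x] = sm.OLS(df[response], X).fit().pvalues[x]
        if entry_p:
            best = min(entry_p, key=entry_p.get)
            if entry_p[best] < alpha_enter:
                included.append(best)
                changed = True

        # Removal step: re-check every predictor already in the model.
        if included:
            X = sm.add_constant(df[included])
            pvals = sm.OLS(df[response], X).fit().pvalues.drop("const")
            worst = pvals.idxmax()
            if pvals[worst] > alpha_remove:
                included.remove(worst)
                changed = True

        if not changed:          # nothing entered or removed: stop
            return included
```

Each pass first tries to enter the excluded predictor with the smallest P-value (if it is below αE) and then re-checks every predictor already in the model, dropping the one with the largest P-value if it exceeds αR; the loop ends when a pass changes nothing.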
Example: Cement data
[Minitab output, numeric values lost in extraction: coefficient tables (Predictor, Coef, SE Coef, T, P) for the candidate one-predictor, two-predictor, and three-predictor models fit at each step of the stepwise search on the cement data.]
Stepwise Regression: y versus x1, x2, x3, x4 Alpha-to-Enter: 0.15 Alpha-to-Remove: 0.15 Response is y on 4 predictors, with N = 13 [Step-by-step summary table (Constant, coefficients, T-Values, P-Values, S, R-Sq, R-Sq(adj), C-p) lost in extraction.]
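For instance, the stepwise_select sketch from earlier could be pointed at the cement data; the file name cement.csv and the column labels y, x1, …, x4 below are assumptions about how the data are stored, and the printed result depends on the data actually supplied:

```python
import pandas as pd

# Hypothetical CSV holding the cement data with columns y, x1, x2, x3, x4.
cement = pd.read_csv("cement.csv")

chosen = stepwise_select(cement, "y", ["x1", "x2", "x3", "x4"],
                         alpha_enter=0.15, alpha_remove=0.15)
print(chosen)  # predictors retained by the stepwise procedure
```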
Caution about stepwise regression! Do not jump to the conclusion that all the important predictor variables for predicting y have been identified, or that all the unimportant predictor variables have been eliminated.
Caution about stepwise regression! Many t-tests for βk = 0 are conducted in a stepwise regression procedure, so the probability is high that we included some unimportant predictors and that we excluded some important predictors.
Drawbacks of stepwise regression The final model is not guaranteed to be optimal in any specified sense. The procedure yields a single final model, although in practice there are often several equally good models. It doesn’t take into account a researcher’s knowledge about the predictors.
Example: Modeling PIQ
Stepwise Regression: PIQ versus MRI, Height, Weight Alpha-to-Enter: 0.15 Alpha-to-Remove: 0.15 Response is PIQ on 3 predictors, with N = 38 [Two-step summary table with rows for MRI and Height; T-Values, P-Values, S, R-Sq, R-Sq(adj), and C-p values lost in extraction.]
[Minitab output for the final model, PIQ regressed on MRI and Height; R-Sq = 29.5%, R-Sq(adj) = 25.5%. The coefficient table (Constant, MRI, Height), S, the analysis of variance table, and the sequential sums of squares were lost in extraction.]
Example: Modeling BP
Stepwise Regression: BP versus Age, Weight, BSA, Duration, Pulse, Stress Alpha-to-Enter: 0.15 Alpha-to-Remove: 0.15 Response is BP on 6 predictors, with N = 20 [Step summary table with rows for Weight, Age, and BSA (BSA coefficient 4.6, T-Value 3.04); the remaining T-Values, P-Values, S, R-Sq, R-Sq(adj), and C-p values lost in extraction.]
[Minitab output for the final model, BP regressed on Age, Weight, and BSA; R-Sq = 99.5%, R-Sq(adj) = 99.4%. The coefficient table (Constant, Age, Weight, BSA), S, the analysis of variance table, and the sequential sums of squares were lost in extraction.]
Stepwise regression in Minitab Stat >> Regression >> Stepwise … Specify the response and all possible predictors. If desired, specify predictors that must be included in every model. (This is where the researcher’s knowledge helps!) Select OK. Results appear in the session window.
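Outside Minitab, scikit-learn offers a loosely similar search in SequentialFeatureSelector; note that it picks predictors by cross-validated score rather than by the t-test P-values used on these slides, so it is shown only as a rough analogue (the DataFrame and file name are the same illustrative assumptions as before):

```python
import pandas as pd
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LinearRegression

# Assumed layout: response column "y", all other columns are candidate predictors.
df = pd.read_csv("cement.csv")          # hypothetical file, as before
X, y = df.drop(columns="y"), df["y"]

# Forward selection that stops when the CV score stops improving by more than tol;
# this is a score-based criterion, not the alpha-to-enter/remove rule above.
sfs = SequentialFeatureSelector(LinearRegression(), direction="forward",
                                n_features_to_select="auto", tol=1e-4, cv=5)
sfs.fit(X, y)
print(X.columns[sfs.get_support()])     # predictors chosen by CV score
```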