1 Stat 324 – Day 28 Model Validation (Ch. 11)

2 Announcements
Submit lab assignment tonight
Be working on project 2
Feedback on project 1
Exam Thursday
Posting review handout, problems
Submit questions Tuesday night

3 Previously: Variable selection
Forward selection
Backward selection
Stepwise (mixed) selection
Best subsets

4 Practice Problem – Participation
After running best subsets through Minitab, the best model appears to be the two-variable model with debt and part-time work: it has the smallest S, the smallest Mallows' Cp, the smallest PRESS, and the largest adjusted R², all indicators of a well-fitting model. This agrees with the principal components analysis, since the first component correlates most strongly with debt and the second with part-time work.

After running the best subsets procedure in JMP, I found that one variable, debt, already explains roughly 96% of the variability in participation. Adding part-time work to the model does increase R² to 99%, but the jump in variability explained is not impressive enough to justify another variable. Since debt alone does such a good job predicting participation, I would recommend sticking with the single-variable model for ease of interpretation.
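The answers above rely on Minitab and JMP output. For readers without either package, here is a minimal best-subsets sketch in Python with statsmodels, ranking subsets by adjusted R²; the column names debt, parttime, tuition, and participation are assumptions standing in for the course data, which is not reproduced here.

```python
# Best-subsets regression: fit every subset of predictors and compare
# adjusted R^2 (Minitab/JMP additionally report S, Mallows' Cp, and PRESS).
from itertools import combinations

import pandas as pd
import statsmodels.api as sm

def best_subsets(X: pd.DataFrame, y: pd.Series):
    """Fit OLS on every predictor subset; rank by adjusted R^2."""
    results = []
    for k in range(1, len(X.columns) + 1):
        for subset in combinations(X.columns, k):
            model = sm.OLS(y, sm.add_constant(X[list(subset)])).fit()
            results.append((subset, model.rsquared_adj, model.rsquared))
    return sorted(results, key=lambda r: r[1], reverse=True)

# Usage (assumes a data frame `df` with these hypothetical columns):
# ranked = best_subsets(df[["debt", "parttime", "tuition"]], df["participation"])
# for subset, adj_r2, r2 in ranked[:5]:
#     print(subset, round(adj_r2, 3), round(r2, 3))
```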

5 Previously – Penalized regression
Bias vs. Variance

6 Previously – Penalized regression
Original goal: more “robust” estimates of the slope coefficients
More recently: can be used for variable selection

                  Variable selection: Yes    Variable selection: No
Shrinkage: Yes    Lasso, Elastic Net         Ridge
Shrinkage: No     Forward selection          Ordinary least squares

7 Which method?
Shrinkage methods are very helpful when p is close to, or even larger than, n.
They also help when there is a lot of multicollinearity.
Prefer ridge if you want to keep all the variables in the model (using some information from all of the predictors) rather than eliminate variables (using none of the information from some).
Use the lasso if you believe only a few of the predictors should be important (selection); see the sketch below.
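A minimal sketch of the ridge/lasso contrast with scikit-learn; the synthetic data and the alpha values are illustrative assumptions (in practice alpha would be chosen by cross-validation, e.g. with RidgeCV or LassoCV).

```python
# Ridge shrinks all coefficients toward zero; the lasso can set some
# exactly to zero, performing variable selection along the way.
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
n, p = 50, 10
X = rng.normal(size=(n, p))
X[:, 1] = X[:, 0] + 0.05 * rng.normal(size=n)   # induce multicollinearity
y = 3 * X[:, 0] + rng.normal(size=n)            # only one predictor matters

ridge = Ridge(alpha=1.0).fit(X, y)
lasso = Lasso(alpha=0.1).fit(X, y)

print("ridge keeps all coefficients nonzero:", np.count_nonzero(ridge.coef_))
print("lasso zeroes some of them out:", np.count_nonzero(lasso.coef_))
```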

8 Previously: Piecewise Linear
One knot at C:
E(Y) = b0 + b1x1 + b2(x1 − C)+
For x1 ≤ C: E(Y) = b0 + b1x1
For x1 > C: E(Y) = (b0 − Cb2) + (b1 + b2)x1

Two knots at C1 < C2:
E(Y) = b0 + b1x1 + b2(x1 − C1)+ + b3(x1 − C2)+
For x1 ≤ C1: E(Y) = b0 + b1x1
For C1 < x1 ≤ C2: E(Y) = (b0 − C1b2) + (b1 + b2)x1
For x1 > C2: E(Y) = b0 + b1x1 + b2(x1 − C1) + b3(x1 − C2) = (b0 − C1b2 − C2b3) + (b1 + b2 + b3)x1
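A sketch of fitting the one-knot model above: build the truncated term (x1 − C)+ by hand, then run ordinary least squares. The knot C and the simulated data are illustrative assumptions; in practice the knot comes from subject-matter context.

```python
# Piecewise-linear ("broken stick") regression via a truncated predictor.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
C = 5.0
x1 = rng.uniform(0, 10, size=100)
y = 2 + 1.5 * x1 + 3 * np.clip(x1 - C, 0, None) + rng.normal(size=100)

# Design matrix: intercept, x1, and the positive part (x1 - C)+
X = sm.add_constant(np.column_stack([x1, np.clip(x1 - C, 0, None)]))
fit = sm.OLS(y, X).fit()
b0, b1, b2 = fit.params

# Slope is b1 below the knot and (b1 + b2) above it, matching the algebra above.
print(f"slope below C: {b1:.2f}, slope above C: {b1 + b2:.2f}")
```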

9 Previously: Cubic Splines
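A cubic spline extends the piecewise-linear idea to smooth curves. A minimal sketch of the truncated power basis for a cubic regression spline; the knot locations in the usage comment are illustrative assumptions, not values from the course.

```python
# Cubic regression spline via the truncated power basis:
# columns 1, x, x^2, x^3, and (x - C_k)^3_+ for each knot C_k.
import numpy as np

def cubic_spline_basis(x: np.ndarray, knots) -> np.ndarray:
    cols = [np.ones_like(x), x, x**2, x**3]
    cols += [np.clip(x - c, 0.0, None) ** 3 for c in knots]
    return np.column_stack(cols)

# Fit with ordinary least squares on the basis, e.g.:
# beta, *_ = np.linalg.lstsq(cubic_spline_basis(x, [2.5, 5.0, 7.5]), y, rcond=None)
```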

10 Previously: More smoothing…

11 Example: predicting diabetes

12 Validation Measures
(Root) mean squared prediction error
R² prediction
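A minimal sketch of both measures on a holdout set. Scikit-learn's bundled diabetes data is used as a stand-in for the diabetes example on the previous slide; it is not assumed to be the same dataset.

```python
# Validation measures: RMSPE and R^2 prediction, computed on held-out data.
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

y_hat = LinearRegression().fit(X_train, y_train).predict(X_test)

# Root mean squared prediction error on the holdout set
rmspe = np.sqrt(np.mean((y_test - y_hat) ** 2))

# R^2 prediction: 1 - SSE(test) / SST(test)
r2_pred = 1 - np.sum((y_test - y_hat) ** 2) / np.sum((y_test - np.mean(y_test)) ** 2)

print(f"RMSPE = {rmspe:.1f}, R^2 prediction = {r2_pred:.3f}")
```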

13 Recap
Although the lasso will typically have a lower R² on the training data than ordinary least squares, its regression coefficients are much more robust, and its predictions hold up nearly as well on the test data: improved prediction and interpretability.

