Best subsets regression


Model selection: Best subsets regression

Statement of problem
A common problem is that there is a large set of candidate predictor variables. The goal is to choose a small subset from the larger set so that the resulting regression model is simple yet has good predictive ability.

Example: Cement data
Response y: heat evolved, in calories per gram of cement, during hardening
Predictor x1: % of tricalcium aluminate
Predictor x2: % of tricalcium silicate
Predictor x3: % of tetracalcium alumino ferrite
Predictor x4: % of dicalcium silicate

Example: Cement data (the data for the n = 13 cement batches are displayed on this slide)

Two basic methods of selecting predictors
Stepwise regression: enter and remove predictors, in a stepwise manner, until there is no justifiable reason to enter or remove any more.
Best subsets regression: select the subset of predictors that does best at meeting some well-defined objective criterion.

Why best subsets regression?

# of predictors (p−1)    # of regression models
        1                 2:  ( ), (x1)
        2                 4:  ( ), (x1), (x2), (x1, x2)
        3                 8:  ( ), (x1), (x2), (x3), (x1, x2), (x1, x3), (x2, x3), (x1, x2, x3)
        4                16:  1 with no predictors, 4 with one, 6 with two, 4 with three, 1 with four

Why best subsets regression?
If there are p−1 possible predictors, then there are 2^(p−1) possible regression models containing subsets of those predictors. For example, 10 predictors yields 2^10 = 1024 possible regression models. A best subsets algorithm determines the best subsets of each size, so that the choice of the final model can be made by the researcher.
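To make the counting concrete, here is a minimal Python sketch (not tied to any particular dataset) that enumerates every possible model for three candidate predictors, reproducing the 2^3 = 8 models in the table above:

```python
from itertools import combinations

predictors = ["x1", "x2", "x3"]

# Each regression model corresponds to one subset of the predictors,
# including the empty subset (the intercept-only model).
models = [combo
          for k in range(len(predictors) + 1)
          for combo in combinations(predictors, k)]

for combo in models:
    print("(" + ", ".join(combo) + ")")  # "()" is the intercept-only model

print(len(models))  # 2**3 = 8 possible regression models
```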

What is used to judge “best”?
R-squared
Adjusted R-squared
MSE (or S, the square root of MSE)
Mallows’ Cp

R-squared
Use the R-squared values to find the point where adding more predictors is not worthwhile, because doing so yields only a very small increase in R-squared.

Adjusted R-squared or MSE
Adjusted R-squared increases only if MSE decreases, so adjusted R-squared and MSE provide equivalent information. Find a few subsets for which MSE is smallest (or adjusted R-squared is largest), or so close to the smallest (largest) that adding more predictors is not worthwhile.
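The equivalence follows directly from the definition of adjusted R-squared (SSTO denotes the total sum of squares, which is fixed for a given response):

R^2_{adj} = 1 - \frac{SSE/(n-p)}{SSTO/(n-1)} = 1 - \frac{MSE}{SSTO/(n-1)}

Since SSTO/(n−1) does not depend on which predictors are in the model, adjusted R-squared increases exactly when MSE decreases.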

Mallows’ Cp criterion
The goal is to minimize the total standardized mean squared error of prediction:

\Gamma_p = \frac{1}{\sigma^2} \sum_{i=1}^{n} E\left[ \hat{Y}_i - E(Y_i) \right]^2

which equals:

\Gamma_p = \frac{1}{\sigma^2} \left[ \sum_{i=1}^{n} \left( E(\hat{Y}_i) - E(Y_i) \right)^2 + \sum_{i=1}^{n} \operatorname{Var}(\hat{Y}_i) \right]

which in English is: the sum of the squared biases plus the sum of the prediction variances, divided by the true error variance \sigma^2.

Mallows’ Cp criterion
Mallows’ Cp statistic estimates \Gamma_p:

C_p = \frac{SSE_p}{MSE(X_1, \ldots, X_{p-1})} - (n - 2p)

where:
- SSE_p is the error sum of squares for the fitted (subset) regression model with p parameters.
- MSE(X_1, …, X_{p-1}) is the MSE of the model containing all p−1 candidate predictors; it is an unbiased estimator of σ².
- p is the number of parameters in the (subset) model, including the intercept.
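As a minimal sketch, the statistic is a one-liner once the pieces are in hand. The function name and arguments below are illustrative, and the worked values in the comment are back-computed from the Minitab output shown later (S² = MSE, and SSE = MSE × error degrees of freedom):

```python
def mallows_cp(sse_p: float, mse_full: float, n: int, p: int) -> float:
    """Mallows' Cp for a subset model with p parameters.

    sse_p    -- error sum of squares of the fitted subset model
    mse_full -- MSE of the model with all candidate predictors
                (the unbiased estimate of sigma^2)
    n        -- number of observations
    p        -- number of parameters in the subset model (incl. intercept)
    """
    return sse_p / mse_full - (n - 2 * p)

# Cement data, subset (x1, x2): S = 2.4063, so SSE ~= 10 * 2.4063**2 = 57.90;
# full-model S = 2.4460, so MSE ~= 5.983; n = 13, p = 3.
print(round(mallows_cp(57.90, 5.983, 13, 3), 1))  # ~2.7, matching the output
```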

Facts about Mallows’ Cp
Subset models with small Cp values have a small total standardized MSE of prediction.
When the Cp value is:
- near p, the bias is small (next to none);
- much greater than p, the bias is substantial;
- below p, the shortfall is due to sampling error, and the model is interpreted as unbiased.
For the model containing all possible predictors, Cp = p (always); a one-line derivation follows.
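The last fact is immediate from the definition of Cp, since for the full model MSE = SSE_{full}/(n − p) by construction:

C_p = \frac{SSE_{full}}{SSE_{full}/(n-p)} - (n - 2p) = (n - p) - (n - 2p) = p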

Using the Cp criterion
Identify subsets of predictors for which:
- the Cp value is smallest, and
- the Cp value is near p (if possible).
In general, though, don’t choose the model with all possible predictors merely because it is guaranteed to have Cp = p. A script that computes these criteria for every subset is sketched below.
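Outside Minitab, the same kind of table can be generated with a short script. The following Python sketch (assuming a pandas DataFrame `df` holding the response and candidate predictor columns; statsmodels supplies the OLS fits) enumerates every subset and tabulates the four criteria:

```python
from itertools import combinations

import pandas as pd
import statsmodels.api as sm

def best_subsets(df: pd.DataFrame, response: str, candidates: list[str]) -> pd.DataFrame:
    """Fit every subset of `candidates` and tabulate model-selection criteria."""
    y, n = df[response], len(df)
    # The full model's MSE serves as the estimate of sigma^2 in Mallows' Cp.
    full = sm.OLS(y, sm.add_constant(df[candidates])).fit()
    mse_full = full.mse_resid

    rows = []
    for k in range(1, len(candidates) + 1):
        for combo in combinations(candidates, k):
            fit = sm.OLS(y, sm.add_constant(df[list(combo)])).fit()
            p = k + 1  # parameters, including the intercept
            rows.append({
                "Vars": k,
                "Model": ", ".join(combo),
                "R-Sq": 100 * fit.rsquared,
                "R-Sq(adj)": 100 * fit.rsquared_adj,
                "C-p": fit.ssr / mse_full - (n - 2 * p),
                "S": fit.mse_resid ** 0.5,
            })
    return (pd.DataFrame(rows)
              .sort_values(["Vars", "R-Sq"], ascending=[True, False])
              .reset_index(drop=True))

# e.g. best_subsets(cement, "y", ["x1", "x2", "x3", "x4"]) should reproduce
# the Minitab table below, up to rounding ("cement" is a hypothetical frame).
```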

Best Subsets Regression: y versus x1, x2, x3, x4
Response is y

Vars   R-Sq   R-Sq(adj)    C-p        S   x1  x2  x3  x4
   1   67.5        64.5  138.7   8.9639                 X
   1   66.6        63.6  142.5   9.0771         X
   2   97.9        97.4    2.7   2.4063    X    X
   2   97.2        96.7    5.5   2.7343    X             X
   3   98.2        97.6    3.0   2.3087    X    X        X
   3   98.2        97.6    3.0   2.3121    X    X    X
   4   98.2        97.4    5.0   2.4460    X    X    X   X

Stepwise Regression: y versus x1, x2, x3, x4
Alpha-to-Enter: 0.15   Alpha-to-Remove: 0.15
Response is y on 4 predictors, with N = 13

Step           1         2         3         4
Constant  117.57    103.10     71.65     52.58
x4        -0.738    -0.614    -0.237
T-Value    -4.77    -12.62     -1.37
P-Value    0.001     0.000     0.205
x1                    1.44      1.45      1.47
T-Value              10.40     12.41     12.10
P-Value              0.000     0.000     0.000
x2                             0.416     0.662
T-Value                         2.24     14.44
P-Value                        0.052     0.000
S           8.96      2.73      2.31      2.41
R-Sq       67.45     97.25     98.23     97.87
R-Sq(adj)  64.50     96.70     97.64     97.44
C-p        138.7       5.5       3.0       2.7
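For comparison, here is a bare-bones stepwise routine in the same spirit (alpha-to-enter and alpha-to-remove of 0.15 to match the output above). This is a simplified sketch of the idea, not Minitab's exact algorithm, and it omits safeguards such as detecting enter/remove cycles:

```python
import pandas as pd
import statsmodels.api as sm

def stepwise(df: pd.DataFrame, response: str, candidates: list[str],
             alpha_enter: float = 0.15, alpha_remove: float = 0.15) -> list[str]:
    """Greedy stepwise selection based on individual t-test p-values."""
    y, selected = df[response], []
    while True:
        # Forward step: add the remaining candidate with the smallest
        # p-value, provided it clears the alpha-to-enter threshold.
        remaining = [c for c in candidates if c not in selected]
        p_values = {
            c: sm.OLS(y, sm.add_constant(df[selected + [c]])).fit().pvalues[c]
            for c in remaining
        }
        if not p_values or min(p_values.values()) >= alpha_enter:
            break
        selected.append(min(p_values, key=p_values.get))

        # Backward step: drop the least significant selected predictor
        # if it now fails the alpha-to-remove threshold.
        fit = sm.OLS(y, sm.add_constant(df[selected])).fit()
        worst = fit.pvalues[selected].idxmax()
        if fit.pvalues[worst] > alpha_remove:
            selected.remove(worst)
    return selected
```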

Example: Modeling PIQ
Response PIQ (performance IQ score) is modeled as a function of MRI (a measure of brain size), Height, and Weight for a sample of N = 38 college students.

Best Subsets Regression: PIQ versus MRI, Height, Weight
Response is PIQ

Vars   R-Sq   R-Sq(adj)   C-p        S   MRI  Height  Weight
   1   14.3        11.9   7.3   21.212    X
   1    0.9         0.0  13.8   22.810           X
   2   29.5        25.5   2.0   19.510    X      X
   2   19.3        14.6   6.9   20.878    X              X
   3   29.5        23.3   4.0   19.794    X      X       X

Stepwise Regression: PIQ versus MRI, Height, Weight
Alpha-to-Enter: 0.15   Alpha-to-Remove: 0.15
Response is PIQ on 3 predictors, with N = 38

Step           1         2
Constant   4.652   111.276
MRI         1.18      2.06
T-Value     2.45      3.77
P-Value    0.019     0.001
Height               -2.73
T-Value              -2.75
P-Value              0.009
S           21.2      19.5
R-Sq       14.27     29.49
R-Sq(adj)  11.89     25.46
C-p          7.3       2.0

Example: Modeling BP
Response BP (blood pressure) is modeled as a function of Age, Weight, BSA (body surface area), Duration, Pulse, and Stress for a sample of N = 20 individuals.

Best Subsets Regression: BP versus Age, Weight, BSA, Duration, Pulse, Stress
Response is BP

Vars   R-Sq   R-Sq(adj)    C-p         S   Age  Weight  BSA  Duration  Pulse  Stress
   1   90.3        89.7  312.8   1.7405           X
   1   75.0        73.6  829.1   2.7903     X
   2   99.1        99.0   15.1   0.53269    X     X
   2   92.0        91.0  256.6   1.6246           X      X
   3   99.5        99.4    6.4   0.43705    X     X      X
   3   99.2        99.1   14.1   0.52012    X     X                      X
   4   99.5        99.4    6.4   0.42591    X     X      X      X
   4   99.5        99.4    7.1   0.43500    X     X      X               X
   5   99.6        99.4    7.0   0.42142    X     X      X      X        X
   5   99.5        99.4    7.7   0.43078    X     X      X      X               X
   6   99.6        99.4    7.0   0.40723    X     X      X      X        X      X

Stepwise Regression: BP versus Age, Weight, BSA, Duration, Pulse, Stress
Alpha-to-Enter: 0.15   Alpha-to-Remove: 0.15
Response is BP on 6 predictors, with N = 20

Step           1         2         3
Constant   2.205   -16.579   -13.667
Weight     1.201     1.033     0.906
T-Value    12.92     33.15     18.49
P-Value    0.000     0.000     0.000
Age                  0.708     0.702
T-Value             13.23     15.96
P-Value              0.000     0.000
BSA                              4.6
T-Value                         3.04
P-Value                        0.008
S           1.74     0.533     0.437
R-Sq       90.26     99.14     99.45
R-Sq(adj)  89.72     99.04     99.35
C-p        312.8      15.1       6.4

Best subsets regression in Minitab
1. Select Stat >> Regression >> Best subsets …
2. Specify the response and all candidate predictors.
3. If desired, specify predictors that must be included in every model (this is where the researcher’s knowledge comes in!).
4. Select OK. The results appear in the session window.