Model selection: Stepwise regression

Statement of the problem A common situation is that there is a large set of candidate predictor variables. The goal is to choose a small subset from the larger set so that the resulting regression model is simple yet has good predictive ability.

Stepwise regression: the idea Start with no predictors in the “stepwise model.” At each step, enter or remove a predictor based on t-tests of the individual regression coefficients. Stop when no more predictors can be justifiably entered into or removed from the stepwise model.

Stepwise regression: Preliminary steps Specify an Alpha-to-Enter significance level (here αE = 0.15). Specify an Alpha-to-Remove significance level (here αR = 0.15).

Stepwise regression: Step #1 Fit each of the one-predictor models, that is, regress y on x1, regress y on x2, …, regress y on xp-1. The first predictor put into the stepwise model is the one with the smallest P-value (provided it is below αE = 0.15). If no P-value is below 0.15, stop.
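Step #1 can be sketched in code. This is an illustrative implementation, assuming numpy and scipy are available; the function names (`slope_pvalue`, `step_1`) are made up for this sketch and do not come from any package.

```python
import numpy as np
from scipy import stats

def slope_pvalue(y, x):
    """Two-sided t-test P-value for the slope in a simple
    regression of y on x (intercept included)."""
    X = np.column_stack([np.ones(len(y)), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    df = len(y) - 2                       # n - 2 parameters
    sigma2 = resid @ resid / df           # residual variance estimate
    se = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
    t = beta[1] / se
    return 2 * stats.t.sf(abs(t), df)

def step_1(y, X, alpha_enter=0.15):
    """Fit each one-predictor model; return the index of the predictor
    with the smallest P-value, or None if none falls below alpha_enter."""
    pvals = [slope_pvalue(y, X[:, j]) for j in range(X.shape[1])]
    best = int(np.argmin(pvals))
    return best if pvals[best] < alpha_enter else None
```

For example, on data where y is driven by the first column of X, `step_1(y, X)` would return 0.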

Stepwise regression: Step #2 Suppose x1 was the “best” single predictor. Fit each of the two-predictor models with x1 in the model, that is, regress y on (x1, x2), regress y on (x1, x3), …, and regress y on (x1, xp-1). The second predictor put into the stepwise model is the one with the smallest P-value (provided it is below αE = 0.15). If no P-value is below 0.15, stop.

Stepwise regression: Step #2 (continued) Suppose x2 was the “best” second predictor. Step back and check the P-value for testing β1 = 0. If it is no longer significant (above αR = 0.15), remove x1 from the stepwise model.
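The "step back" rule reduces to a simple check over the current model's P-values. A minimal sketch (the function name and dict-based interface are invented for illustration):

```python
ALPHA_REMOVE = 0.15  # the slides' Alpha-to-Remove

def back_check(pvalues, alpha_remove=ALPHA_REMOVE):
    """Given {predictor: P-value} for the coefficients currently in the
    stepwise model, return the predictors that are no longer significant
    and should therefore be removed."""
    return [name for name, p in pvalues.items() if p > alpha_remove]
```

For example, `back_check({"x1": 0.24, "x2": 0.01})` flags `"x1"` for removal, while `back_check({"x1": 0.05, "x2": 0.10})` flags nothing.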

Stepwise regression: Step #3 Suppose both x1 and x2 made it into the two-predictor stepwise model. Fit each of the three-predictor models with x1 and x2 in the model, that is, regress y on (x1, x2, x3), regress y on (x1, x2, x4), …, and regress y on (x1, x2, xp-1).

Stepwise regression: Step #3 (continued) The third predictor put into the stepwise model is the one with the smallest P-value (provided it is below αE = 0.15). If no P-value is below 0.15, stop. Step back and check the P-values for testing β1 = 0 and β2 = 0. If either is no longer significant (above αR = 0.15), remove that predictor from the stepwise model.

Stepwise regression: Stopping the procedure The procedure stops when no additional predictor yields a P-value below αE = 0.15.
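The full procedure described above — forward entry followed by a backward check after each entry — can be put together as follows. This is an illustrative sketch assuming numpy and scipy, not the code behind Minitab or any other package; the function names are invented for this example.

```python
import numpy as np
from scipy import stats

def ols_slope_pvalues(y, X):
    """Two-sided t-test P-values for each slope in an OLS fit of y
    on the columns of X (intercept included, its P-value dropped)."""
    Xd = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    resid = y - Xd @ beta
    df = len(y) - Xd.shape[1]
    sigma2 = resid @ resid / df
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(Xd.T @ Xd)))
    t = beta / se
    return 2 * stats.t.sf(np.abs(t), df)[1:]  # drop the intercept

def stepwise(y, X, alpha_enter=0.15, alpha_remove=0.15):
    """Stepwise selection as in the slides: enter the remaining predictor
    with the smallest P-value (if below alpha_enter), then remove any
    entered predictor whose P-value has risen above alpha_remove.
    Returns the selected column indices of X."""
    selected, remaining = [], list(range(X.shape[1]))
    while remaining:
        # Forward step: P-value of each candidate, current predictors held in.
        cand = [ols_slope_pvalues(y, X[:, selected + [j]])[-1]
                for j in remaining]
        best = int(np.argmin(cand))
        if cand[best] >= alpha_enter:
            break  # no predictor can be justifiably entered: stop
        selected.append(remaining.pop(best))
        # Backward step: drop predictors that are no longer significant.
        while len(selected) > 1:
            p = ols_slope_pvalues(y, X[:, selected])
            worst = int(np.argmax(p))
            if p[worst] <= alpha_remove:
                break
            remaining.append(selected.pop(worst))
    return selected
```

Note that with αE equal to αR the alternation can in principle cycle (a predictor removed and later re-entered); real implementations guard against this, for example by requiring αE ≤ αR or capping the number of steps.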

Drawbacks of stepwise regression The final model is not guaranteed to be optimal in any specified sense. The procedure yields a single final model, although in practice there are often several equally good models. It doesn’t take into account a researcher’s knowledge about the predictors.