
Automated Regression Modeling

Descriptive vs. Predictive Regression Models
Four common automated modeling procedures
  Forward Modeling
  Backward Modeling
  Forward Stepwise Modeling
  All-subsets Regression
Problems & Difficulties with automated modeling

Regression Model Building -- we’ve talked about two kinds...

Descriptive Modeling -- explication of the relationships between the set of predictors and the criterion variable (90-95%). Completeness is the goal, often requiring the combination of information from simple correlations and various multiple regression models.

Predictive Modeling -- we are going to compute y’ values to make decisions about people (5-10%). Efficiency is the goal: we want the “best model” with the “fewest predictors.”

Descriptive modeling commonly combines the interpretation of simple correlations, of full models, and of comparisons of nested and non-nested models -- remember, the interest is in completely explicating the patterns of relationship between the criterion variable and the predictors. Theory often plays a large part, directing the variables we choose and the models we compare. When interested in predictive modeling, we want one model that we will use to compute y’ values for future individuals. Theory can help, cost analysis is often important, and collinearity is to be avoided (because it reduces efficiency). The various “automated regression procedures” were designed for predictive model building, not for descriptive modeling.

The four most commonly used “automated” procedures are...

Forward Inclusion -- start with the best predictor and add predictors to get a “best model”
Backward Deletion -- start with the full model and delete predictors to get a “best model”
Forward Stepwise Inclusion -- a combination of the first two
All-subsets Regression -- literally fitting all 1-, 2-, 3-, ..., k-predictor models for comparison

Forward Modeling

Step 1 -- the first predictor in the model is the “best single predictor.” Select the predictor with the numerically largest simple correlation with the criterion -- if it is a significant correlation:

r_{y,x1} vs. r_{y,x2} vs. r_{y,x3} vs. r_{y,x4}

By using this procedure we are sure that the initial model “works.”
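A minimal sketch of Step 1 in Python, assuming a pandas DataFrame `X` of candidate predictors and a Series `y` as the criterion (hypothetical names, not from the slides):

```python
import pandas as pd
from scipy.stats import pearsonr

def best_single_predictor(X: pd.DataFrame, y: pd.Series, alpha: float = .05):
    """Return the predictor with the numerically largest simple correlation
    with the criterion, provided that correlation is significant."""
    best_name, best_r, best_p = None, 0.0, 1.0
    for name in X.columns:
        r, p = pearsonr(X[name], y)          # simple correlation r_{y,x}
        if abs(r) > abs(best_r):
            best_name, best_r, best_p = name, r, p
    # the initial model must "work": the winning correlation must be significant
    return best_name if best_p < alpha else None
```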

Step 2 -- the next predictor in the model is the one that will “contribute the most,” with two equivalent definitions:

1. Add the predictor producing the 2-predictor model (including the first predictor) with the numerically largest R² -- if that R² is significant and significantly larger than the r² from the first step:

R²_{y.x3,x1} vs. R²_{y.x3,x2} vs. R²_{y.x3,x4}

2. Add the predictor with the highest semi-partial correlation with the criterion, controlling that predictor for the predictor already in the model -- if the semi-partial is significant:

r_{y(x1.x3)} vs. r_{y(x2.x3)} vs. r_{y(x4.x3)}

By using this procedure we are sure the 2-predictor model “works” and “works better than the 1-predictor model.”
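The equivalence of the two definitions can be checked numerically: the squared semi-partial correlation of the added predictor equals the R² increase from adding it. A hedged illustration with made-up simulated data (names and data are hypothetical):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = 0.5 * x1 + rng.normal(size=200)          # deliberately collinear with x1
y = x1 + x2 + rng.normal(size=200)

r2_reduced = sm.OLS(y, sm.add_constant(x1)).fit().rsquared
r2_full = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit().rsquared

# semi-partial r_{y(x2.x1)}: correlate y with x2 residualized on x1
x2_resid = sm.OLS(x2, sm.add_constant(x1)).fit().resid
semi_partial = np.corrcoef(y, x2_resid)[0, 1]

print(round(r2_full - r2_reduced, 6), round(semi_partial**2, 6))  # equal
```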

All subsequent steps -- the next predictor in the model is the one that will “contribute the most,” again with two equivalent definitions:

1. Add the predictor producing the enlarged model (including the predictors already in the model) with the numerically largest R² -- if that R² is significant and significantly larger than the R² from the previous step:

R²_{y.x3,x2,x1} vs. R²_{y.x3,x2,x4}

2. Add the predictor with the highest semi-partial correlation with the criterion, controlling that predictor for the predictors already in the model -- if the semi-partial is significant:

r_{y(x1.x3,x2)} vs. r_{y(x4.x3,x2)}

By using this procedure, we are sure that each model “works” and “works better than the one before it.”
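A sketch of the full forward-inclusion loop, under the same hypothetical `X`/`y` names. Each step adds the candidate giving the largest R², kept only if the R² gain is significant (tested here with the incremental F test for one added predictor, one reasonable choice of significance rule):

```python
import pandas as pd
import statsmodels.api as sm
from scipy.stats import f as f_dist

def forward_inclusion(X: pd.DataFrame, y, alpha: float = .05):
    included, r2_prev, n = [], 0.0, len(y)
    while True:
        candidates = [c for c in X.columns if c not in included]
        if not candidates:
            return included
        # find the candidate whose addition yields the numerically largest R²
        fits = {c: sm.OLS(y, sm.add_constant(X[included + [c]])).fit()
                for c in candidates}
        best = max(fits, key=lambda c: fits[c].rsquared)
        r2_new, k = fits[best].rsquared, len(included) + 1
        # F test for the R² increment (1 numerator df: one added predictor)
        F = (r2_new - r2_prev) / ((1 - r2_new) / (n - k - 1))
        if f_dist.sf(F, 1, n - k - 1) >= alpha:
            return included          # quit: no predictor adds significantly
        included.append(best)
        r2_prev = r2_new
```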

When to quit? When no additional predictor will significantly increase the R² (equivalently, when no remaining semi-partial correlation is significant).

Difficulties with forward inclusion...
The major potential problem is “over-inclusion” -- a predictor that contributes to a smaller (earlier) model fails to continue contributing as the model gets larger (with increased collinearity), but the predictor stays in the model.
Fairly small variations in the correlation matrix can lead to very different final models -- models often differ across two “splits” of the same sample.
The resulting model may not be the “best” -- there may be another model with the same number of predictors but a larger R².
All of these problems are exacerbated by increased collinearity!

Backward Deletion

Step 1 -- start with the full model (all predictors) -- if its R² is significant -- and consider the regression weights of this model.

Step 2 -- remove from the model the predictor that “contributes the least.” Delete the predictor with the largest p-value associated with its regression (b) weight -- if that p-value is greater than .05. (The idea: the predictor with the largest p-value is the one most likely not to be contributing to the model in the population.)

b_{x1} (p = .08) vs. b_{x2} (p = .02) vs. b_{x3} (p = .02) vs. b_{x4} (p = .27)

By using this procedure, we know that each model works as well as the previous one (R² numerically, but not statistically, smaller).

On all subsequent steps -- the next predictor dropped from the model is the one with the largest (non-significant) p-value for its regression weight:

b_{x1} (p = .21) vs. b_{x2} (p = .14) vs. b_{x3} (p = .012)

When to quit? When all the predictors remaining in the model are contributing to it.

Difficulties with backward deletion...
The major potential problem is “under-inclusion” -- a predictor that is deleted from a larger (earlier) model would contribute to a smaller model, but isn’t re-included.
Fairly small variations in the correlation matrix can lead to very different final models -- models often differ across two “splits” of the same sample.
The resulting model may not be the “best” -- there may be another model with the same number of predictors but a larger R².
All of these problems are exacerbated by increased collinearity!
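A sketch of backward deletion under the same hypothetical `X`/`y` names: start from the full model, repeatedly drop the predictor whose b weight has the largest p-value, and stop once every remaining p-value is below .05:

```python
import pandas as pd
import statsmodels.api as sm

def backward_deletion(X: pd.DataFrame, y, alpha: float = .05):
    included = list(X.columns)
    while included:
        fit = sm.OLS(y, sm.add_constant(X[included])).fit()
        pvals = fit.pvalues.drop("const")      # p-values of the b weights
        worst = pvals.idxmax()
        if pvals[worst] <= alpha:
            return included                    # every predictor contributes
        included.remove(worst)                 # delete the weakest predictor
    return included
```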

Forward Stepwise Modeling

Step 1 -- the first predictor in the model is the “best single predictor” (same as in forward inclusion). Select the predictor with the numerically largest simple correlation with the criterion -- if it is a significant correlation. By using this procedure we are sure that the initial model “works.”

Step 2 -- the next predictor in the model is the one that will “contribute the most” (same as in forward inclusion), with two equivalent definitions:

1. Add the predictor producing the 2-predictor model (including the first predictor) with the numerically largest R² -- if that R² is significant and significantly larger than the r² from the first step.
2. Add the predictor with the highest semi-partial correlation with the criterion, controlling that predictor for the predictor already in the model -- if the semi-partial is significant.

By using this procedure we are sure the 2-predictor model “works” and “works better than the 1-predictor model.”

All subsequent steps have two parts.

a. Remove from the model the predictor that “contributes the least” (same as in backward deletion). Delete the predictor with the largest p-value associated with its regression (b) weight -- if that p-value is greater than .05. (The idea: the predictor with the largest p-value is the one most likely not to be contributing to the model in the population.) If a predictor is deleted, look for a second (third, etc.) that should also be deleted before moving on to part b. By using this procedure, we are sure that all the predictors in the model are contributing before adding any additional predictors.

b. The next predictor added to the model is the one that will “contribute the most” (same as in forward inclusion), with two equivalent definitions:

1. Add the predictor producing the enlarged model with the numerically largest R² -- if that R² is significant and significantly larger than the R² from the previous step.
2. Add the predictor with the highest semi-partial correlation with the criterion, controlling that predictor for the predictors already in the model -- if the semi-partial is significant.

By using this procedure we are sure the model with the added predictor “works” and “works better than the model without it.”

When to quit? When BOTH of two conditions hold...
1. All predictors included in the model are contributing to it.
2. None of the predictors not in the model would contribute if they were added.

By using this procedure we avoid both over-inclusion and under-inclusion.

Difficulty with the forward stepwise model...
The resulting model may not be the “best” -- there may be another model with the same number of predictors but a larger R². The procedure assumes that the best model is found by starting with the best single predictor. This problem is exacerbated by increased collinearity!
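A sketch of the whole forward-stepwise loop, combining the deletion rule (part a) with the inclusion rule (part b); `X` and `y` remain hypothetical names, and the inclusion test is the same incremental F test used in the forward-inclusion sketch above:

```python
import pandas as pd
import statsmodels.api as sm
from scipy.stats import f as f_dist

def forward_stepwise(X: pd.DataFrame, y, alpha_in=.05, alpha_out=.05):
    included, n = [], len(y)
    while True:
        # part a: delete included predictors whose b weights are non-significant,
        # one at a time, largest p-value first
        while included:
            fit = sm.OLS(y, sm.add_constant(X[included])).fit()
            pvals = fit.pvalues.drop("const")
            if pvals.max() <= alpha_out:
                break                          # all included predictors contribute
            included.remove(pvals.idxmax())
        # part b: add the candidate giving the largest significant R² gain
        r2_prev = (sm.OLS(y, sm.add_constant(X[included])).fit().rsquared
                   if included else 0.0)
        candidates = [c for c in X.columns if c not in included]
        if not candidates:
            return included
        fits = {c: sm.OLS(y, sm.add_constant(X[included + [c]])).fit()
                for c in candidates}
        best = max(fits, key=lambda c: fits[c].rsquared)
        r2_new, k = fits[best].rsquared, len(included) + 1
        F = (r2_new - r2_prev) / ((1 - r2_new) / (n - k - 1))
        if f_dist.sf(F, 1, n - k - 1) >= alpha_in:
            return included                    # neither rule fires: quit
        included.append(best)
```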

All-subsets Regression

Just what it sounds like... get all 1-predictor models, all 2-predictor models, all 3-predictor models, etc.

Difficulties with all-subsets regression...
It generates an awful lot of models (none of which were theoretically determined).
Many of the models will be very comparable in R² but have very different theoretical interpretations.
It is not available in most stats packages and is wearisome to do by hand.
All of these problems are exacerbated by increased collinearity!
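All-subsets regression is brute force, which is easy to see in code. A sketch under the same hypothetical `X`/`y` names; note that k predictors means 2^k - 1 models, so this is only feasible for small k:

```python
from itertools import combinations
import pandas as pd
import statsmodels.api as sm

def all_subsets(X: pd.DataFrame, y):
    results = []
    for size in range(1, len(X.columns) + 1):
        for subset in combinations(X.columns, size):
            r2 = sm.OLS(y, sm.add_constant(X[list(subset)])).fit().rsquared
            results.append((subset, size, r2))
    # sort by size, then descending R², so the best model of each size comes
    # first -- useful for spotting "same # predictors, larger R²" rivals
    return sorted(results, key=lambda t: (t[1], -t[2]))
```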

Problems of forward, backward and stepwise models...

All of these procedures were designed for the construction of non-theoretical models, and they rarely provide direct tests of the sorts of hypotheses we have!
Each procedure has a different definition of “best model” that is tied to the particular selection and/or deletion rules that are used.
All decisions about which variables are “potential contributors” are made based on significance tests (not really a problem, but...).
All selection/deletion decisions between two or more “potential contributors” are made NUMERICALLY, and so small differences due to sampling variability may yield very differently interpreted models.

Taken together, these problems often lead to...

Different procedures (forward, backward, stepwise) often producing different “best models” -- even when starting from the same set of predictors.
Different models “performing less differently” (in terms of R²) than their differing predictor memberships would suggest.
So... all-subsets models often provide a useful “wake-up call” for recognizing the limited performance differences among models that may seem very different.
Finally... none of these procedures is very useful for DESCRIPTIVE purposes, which usually require careful description of the simple correlations, the full model, and various model comparisons to be usefully complete!