Multiple Regression MARE 250 Dr. Jason Turner

Linear Regression
y = b0 + b1x
Urchin density = b0 + b1(salinity)
y = dependent variable
b0 and b1 are constants
b0 = y-intercept
b1 = slope
x = independent variable
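A minimal sketch of fitting this line in Python; the salinity and urchin-density numbers below are hypothetical, made up for illustration:

```python
import numpy as np

# Hypothetical measurements: salinity (x) and urchin density (y)
salinity = np.array([30.1, 31.5, 32.0, 33.2, 34.8, 35.0])
density = np.array([12.0, 10.5, 9.8, 8.1, 6.9, 6.5])

# polyfit with degree 1 returns [slope, intercept], i.e. [b1, b0]
b1, b0 = np.polyfit(salinity, density, deg=1)
print(f"Urchin density = {b0:.2f} + {b1:.2f} * salinity")
```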

Multiple Regression
y = b0 + b1x
y = b0 + b1x1 + b2x2 + … + bnxn
Multiple regression allows us to learn more about the relationship between several independent or predictor variables and a dependent or criterion variable.
For example, we might be looking for a reliable way to estimate the age of AHI at the dock instead of waiting for laboratory analyses.
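A sketch of estimating the constants b0, b1, b2 by least squares, assuming two hypothetical predictors x1, x2 and a simulated response:

```python
import numpy as np

rng = np.random.default_rng(0)
x1, x2 = rng.normal(size=50), rng.normal(size=50)
y = 1.0 + 0.5 * x1 - 0.3 * x2 + rng.normal(scale=0.1, size=50)

# Design matrix with a column of ones so b0 (the intercept) is estimated too
X = np.column_stack([np.ones_like(x1), x1, x2])
b, *_ = np.linalg.lstsq(X, y, rcond=None)  # least-squares coefficients
print(f"y = {b[0]:.2f} + {b[1]:.2f}*x1 + {b[2]:.2f}*x2")
```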

Multiple Regression
Multiple regression procedures are very widely used in research in the social and natural sciences.
Multiple regression allows the researcher to ask "what is the best predictor of ...?"
For example, educational researchers might want to learn what the best predictors of success in high school are.
Psychologists may want to determine which personality variable best predicts social adjustment.
Sociologists may want to find out which of multiple social indicators best predict whether or not a new immigrant group will adapt and be absorbed into society.

Multiple Regression
The general computational problem that needs to be solved in multiple regression analysis is to fit a straight line to a number of points.
In the simplest case there is one dependent and one independent variable.
This can be visualized in a scatterplot.
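For instance, a short sketch (with simulated points, not real data) of that scatterplot with the fitted straight line:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(3)
x = rng.uniform(0, 10, size=30)
y = 2 + 0.8 * x + rng.normal(size=30)

b1, b0 = np.polyfit(x, y, deg=1)  # fitted slope and intercept
xs = np.linspace(0, 10, 2)        # two points suffice to draw the line

plt.scatter(x, y, label="data")
plt.plot(xs, b0 + b1 * xs, color="red", label="fitted line")
plt.legend()
plt.show()
```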

The Regression Equation
A line in a two-dimensional or two-variable space is defined by the equation Y = a + b*X.
[Animation: a two-dimensional regression line plotted with three different confidence intervals (90%, 95%, 99%).]
In the multivariate case, when there is more than one independent variable, the regression line cannot be visualized in two-dimensional space, but it can be computed rather easily.

Residual Variance and R-square
The smaller the variability of the residual values around the regression line relative to the overall variability, the better our prediction.
Coefficient of determination (r²): if we have an r² of 0.4, we have explained 40% of the original variability and are left with 60% residual variability.
Ideally, we would like to explain most if not all of the original variability.
Therefore, the r² value is an indicator of how well the model fits the data (e.g., an r² close to 1.0 indicates that we have accounted for almost all of the variability with the variables specified in the model).
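A minimal sketch of this r² computation, where y holds hypothetical observed values and y_hat the model's predictions:

```python
import numpy as np

def r_squared(y, y_hat):
    """1 - (residual sum of squares) / (total sum of squares):
    the share of the original variability explained by the model."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1 - ss_res / ss_tot

# Toy usage: predictions close to the observations give r² near 1.0
print(r_squared(np.array([1.0, 2.0, 3.0]), np.array([1.1, 1.9, 3.0])))
```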

Assumptions, Assumptions…
Assumption of Linearity
It is assumed that the relationship between variables is linear. Always look at a bivariate scatterplot of the variables of interest.
Normality Assumption
It is assumed in multiple regression that the residuals (observed minus predicted values) are distributed normally (i.e., follow the normal distribution).
Most tests (specifically the F-test) are quite robust with regard to violations of this assumption.
Review the distributions of the major variables with histograms.
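One way to review the normality assumption in code; this is a sketch with simulated residuals, and the Shapiro-Wilk test is a standard check not named on the slide:

```python
import numpy as np
from scipy import stats

# Stand-in for the residuals of a fitted model (simulated here)
residuals = np.random.default_rng(1).normal(size=84)

# A formal test to complement the histogram: a large p-value gives
# no evidence against normality of the residuals
stat, p = stats.shapiro(residuals)
print(f"Shapiro-Wilk W = {stat:.3f}, p = {p:.3f}")
```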

Effects of Outliers
Outliers may be influential observations: a data point whose removal causes the regression equation (line) to change considerably.
Consider removing an influential observation much as you would an outlier.
If there is no explanation for the point, removal is up to the researcher.
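A sketch of flagging influential observations; Cook's distance is a standard diagnostic for "how much does the line move if this point is removed," though it is not named on the slide, and the data below are simulated:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
x = rng.normal(size=40)
y = 2 + 3 * x + rng.normal(size=40)

fit = sm.OLS(y, sm.add_constant(x)).fit()
cooks_d, _ = fit.get_influence().cooks_distance

# A common rule of thumb flags points with Cook's distance above 4/n
print(np.where(cooks_d > 4 / len(y))[0])
```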

Stepwise Regression: When is too much too much?
Building Models via Stepwise Regression
Stepwise model-building techniques for regression. The basic procedure involves:
- identifying an initial model
- iteratively "stepping," that is, repeatedly altering the model from the previous step by adding or removing a predictor variable in accordance with the "stepping criteria"
- terminating the search when stepping is no longer possible given the stepping criteria (a sketch follows below)
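A forward-only sketch of this procedure (the full algorithm also removes predictors at each step, which is omitted here); the 0.15 entry threshold matches the Alpha-to-Enter in the output later in this section:

```python
import statsmodels.api as sm

def forward_select(X, y, alpha_enter=0.15):
    """Forward stepwise sketch. X is a pandas DataFrame of candidate
    predictors and y the response; repeatedly add the predictor with
    the smallest p-value until none falls below alpha_enter."""
    chosen, remaining = [], list(X.columns)
    while remaining:
        pvals = {}
        for name in remaining:
            fit = sm.OLS(y, sm.add_constant(X[chosen + [name]])).fit()
            pvals[name] = fit.pvalues[name]
        best = min(pvals, key=pvals.get)
        if pvals[best] > alpha_enter:
            break  # no remaining predictor meets the entry criterion
        chosen.append(best)
        remaining.remove(best)
    return chosen
```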

For Example…
We are interested in predicting values for Y based upon several X's: the age of AHI based upon SL, BM, OP, and PF.
We run multiple regression and get the equation:
Age = -2.64 + 0.0382 SL + 0.209 BM + 0.136 OP + 0.467 PF
We then run a STEPWISE regression to determine the best subset of these variables.
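A sketch of obtaining such an equation with statsmodels; the file name "ahi.csv" and its columns (Age, SL, BM, OP, PF) are assumptions for illustration:

```python
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("ahi.csv")  # hypothetical data file with the fish records

fit = smf.ols("Age ~ SL + BM + OP + PF", data=df).fit()
print(fit.params)    # intercept and the four slope coefficients
print(fit.summary()) # coefficients with t- and p-values, R-Sq, etc.
```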

How does it work…
Best Subsets Regression: response is Age

                                      S  B  O  P
Vars  R-Sq  R-Sq(adj)   C-p        S  L  M  P  F
   1  77.7       77.4   8.0  0.96215     X
   1  60.3       59.8  76.6   1.2839  X
   2  78.9       78.3   5.4  0.94256     X  X
   2  78.6       78.0   6.6  0.94962  X  X
   3  79.8       79.1   3.6  0.92641  X  X  X
   3  79.1       78.3   6.5  0.94353     X  X  X
   4  80.0       79.0   5.0  0.92897  X  X  X  X

How does it work…
Stepwise Regression: Age versus SL, BM, OP, PF

Alpha-to-Enter: 0.15   Alpha-to-Remove: 0.15
Response is Age on 4 predictors, with N = 84

Step              1         2         3
Constant    -0.8013   -1.1103   -5.4795

BM            0.355     0.326     0.267
T-Value       16.91     13.17      6.91
P-Value       0.000     0.000     0.000

OP                      0.096     0.101
T-Value                  2.11      2.26
P-Value                 0.038     0.027

SL                                0.087
T-Value                            1.96
P-Value                           0.053

S             0.962     0.943     0.926
R-Sq          77.71     78.87     79.84
R-Sq(adj)     77.44     78.35     79.08
Mallows C-p     8.0       5.4       3.6

Who Cares?
Stepwise analysis allows you (i.e., the computer) to determine which predictor variables (or combinations thereof) best explain (can be used to predict) Y.
This becomes much more important as the number of predictor variables increases.
It helps to make better sense of complicated multivariate data.