
Multiple regression

Example: Are brain and body size predictive of intelligence?

Sample: n = 38 college students
Response (Y): intelligence, measured by PIQ (performance IQ) scores from the (revised) Wechsler Adult Intelligence Scale
Predictor (X1): brain size based on MRI scans (given as count/10,000)
Predictor (X2): height in inches
Predictor (X3): weight in pounds

Scatter matrix plots

Scatter plots of the response versus each predictor help in determining the nature and strength of the relationships. Scatter plots of predictor versus predictor help in studying the relationships among the predictors, as well as in identifying the scope of the model and outliers.

Scatter matrix plot

Matrix plot in Minitab

Select Graph >> Matrix Plot…
Specify all of the variables (response and predictors) you want graphed.
Select OK.

Correlation matrix

Correlations: PIQ, MRI, Height, Weight

           PIQ    MRI  Height
MRI
Height
Weight

Cell Contents: Pearson correlation

Correlation matrix in Minitab

Stat >> Basic Statistics >> Correlation…
Select all of the variables (response and predictors).
To get a "crisper" table, de-select the default "Display p-values" option.
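Outside Minitab, the same kind of correlation matrix can be computed directly. A minimal sketch with NumPy — the data values below are invented stand-ins for the four worksheet columns, not the actual study data:

```python
import numpy as np

# Hypothetical stand-ins for the PIQ, MRI, Height, and Weight columns
rng = np.random.default_rng(0)
piq = rng.normal(110, 20, 38)
mri = rng.normal(90, 7, 38)
height = rng.normal(68, 4, 38)
weight = rng.normal(150, 20, 38)

data = np.column_stack([piq, mri, height, weight])
corr = np.corrcoef(data, rowvar=False)  # 4x4 matrix of Pearson correlations
print(np.round(corr, 3))
```

Each entry corr[i, j] plays the role of one cell in Minitab's table; the diagonal is always 1.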

Linear regression model with 3 predictors

Yi = β0 + β1 Xi1 + β2 Xi2 + β3 Xi3 + εi

where:
Yi = intelligence (PIQ) of student i
Xi1 = brain size of student i (MRI)
Xi2 = height of student i (Height)
Xi3 = weight of student i (Weight)
and εi is the random error term.
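The same model can be fit by ordinary least squares outside Minitab. A sketch with NumPy on synthetic data — the β values below are made up to demonstrate the mechanics, not estimates from the study:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 38
# Hypothetical predictors standing in for MRI, Height, Weight
X1 = rng.normal(90, 7, n)
X2 = rng.normal(68, 4, n)
X3 = rng.normal(150, 20, n)

# Generate a response from known (made-up) coefficients plus noise
beta = np.array([10.0, 2.0, -2.5, 0.01])       # β0, β1, β2, β3
X = np.column_stack([np.ones(n), X1, X2, X3])  # design matrix with intercept column
y = X @ beta + rng.normal(0, 1, n)

# Least-squares estimates b0..b3
b, sse, rank, _ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(b, 2))
```

With low noise, the estimated slopes land near the coefficients used to generate the data, which is a useful sanity check on the fitting code.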

Fitting a multiple regression model in Minitab

It's basically the same as fitting a simple linear regression model:
Stat >> Regression >> Regression…
Select the response and all predictors.
Specify all desired options as you would for simple linear regression.

The regression equation is PIQ = MRI Height Weight

Predictor  Coef  SE Coef  T  P
Constant
MRI
Height
Weight

How likely is it that b3 would be as extreme as it is if β3 = 0?

Confidence intervals for βk

Sample estimate ± margin of error:
bk ± t(1−α/2; n−p) × se(bk)

Predictor  Coef  SE Coef  T  P
Weight
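The interval bk ± t(1−α/2; n−p) × se(bk) is easy to compute once the coefficient and its standard error are known. A sketch using SciPy's t distribution — the coefficient and standard error below are placeholders, not the actual Weight estimates:

```python
from scipy import stats

n, p = 38, 4            # 38 students, 4 parameters (intercept + 3 slopes)
bk, se_bk = 0.001, 0.2  # hypothetical coefficient and its standard error

t_crit = stats.t.ppf(0.975, df=n - p)  # t(0.975; 34) for a 95% interval
margin = t_crit * se_bk
ci = (bk - margin, bk + margin)
print(round(t_crit, 3), tuple(round(v, 3) for v in ci))
```

The degrees of freedom are n − p, the error degrees of freedom from the ANOVA table.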

The regression equation is PIQ = MRI Height

Predictor  Coef  SE Coef  T  P
Constant
MRI
Height

S =    R-Sq = 29.5%   R-Sq(adj) = 25.5%

R-Sq is the coefficient of (multiple) determination; R-Sq(adj) is the adjusted coefficient of (multiple) determination.

Coefficient of (multiple) determination

Basically the same as before: R2 = SSR/SSTO = the proportionate reduction in the total variation in Y associated with using the set of variables X1, …, Xp−1. Again, a large R2 value does not necessarily imply that the fitted model is a useful one.

Adjusted coefficient of multiple determination

Problem: adding more X variables can only increase R2, because SSTO never changes for a given set of data, while the remaining error (quantified by SSE) can only get smaller (or stay the same) when more predictor variables are included. Solution: adjust R2 to take into account the number of predictors in the model.

Adjusted coefficient of multiple determination

R2(adj) = 1 − ((n−1)/(n−p)) × (SSE/SSTO) = 1 − ((n−1)/(n−p)) × (1 − R2)

where n is the number of observations and p is the number of parameters (including the intercept).

PIQ = MRI Height

S =    R-Sq = 29.5%   R-Sq(adj) = 25.5%

Analysis of Variance
Source      DF  SS  MS  F  P
Regression
Error
Total

Calculation of R2(adj): with n = 38 and p = 3, R2(adj) = 1 − (37/35)(1 − 0.295) ≈ 0.255.
Interpretation of R2(adj): about 25.5% of the variation in PIQ is explained by MRI and Height, after adjusting for the number of predictors in the model.
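The adjustment formula R2(adj) = 1 − ((n−1)/(n−p))(1 − R2) reproduces the Minitab values above; a quick check in Python:

```python
def adj_r2(r2, n, p):
    """Adjusted R-squared for n observations and p parameters (incl. intercept)."""
    return 1 - (n - 1) / (n - p) * (1 - r2)

n = 38
# Two-predictor model (MRI, Height): p = 3, R-Sq = 29.5%
print(round(adj_r2(0.295, n, 3), 3))  # 0.255, matching R-Sq(adj) = 25.5%
# Three-predictor model (MRI, Height, Weight): p = 4, R-Sq = 29.5%
print(round(adj_r2(0.295, n, 4), 3))  # 0.233, matching R-Sq(adj) = 23.3%
```

Note that the same R2 gives a smaller R2(adj) when p grows: this is exactly the penalty for adding a predictor that does not reduce SSE.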

Impact of the adjustment

It's a trade-off: R-Sq(adj) may even become smaller when another predictor variable is introduced into the model.

The regression equation is PIQ = MRI Height
S =    R-Sq = 29.5%   R-Sq(adj) = 25.5%

The regression equation is PIQ = MRI Height Weight
S =    R-Sq = 29.5%   R-Sq(adj) = 23.3%

The regression equation is PIQ = MRI Height

Analysis of Variance
Source      DF  SS  MS  F  P
Regression
Error
Total

Is there a relationship between the response variable and the set of predictor variables? How likely is it that the sample would yield such an extreme F-statistic if the null hypothesis were true?

Caution when predicting or estimating response

What is the scope of the model?

Predicted Values for New Observations
New Obs  Fit  SE Fit  95.0% CI            95.0% PI
                      (106.64, 119.68)    (73.02, 153.30)
                      (100.19, 117.78)    (68.41, 149.56)

Values of Predictors for New Observations
New Obs  MRI  Height

S = 19.51

Diagnostics and remedial measures

Most procedures carry over directly (with minor modification) from simple linear regression to multiple linear regression. But some procedures are specific to multiple linear regression (Chapters 9 and 10).

Residuals against each predictor Gives an indication of the adequacy of the regression function with respect to each specific predictor variable.

Unusual Observations
Obs  MRI  PIQ  Fit  SE Fit  Residual  St Resid
                                       R

R denotes an observation with a large standardized residual.

Residuals versus omitted predictors

As usual. In addition, consider plotting residuals against interaction terms, such as X1X2, because they too are potentially important omitted variables.

Regression interaction terms in Minitab

Use the calculator to create a new variable (MRI*Ht):
Select Calc >> Calculator.
Specify "Store result in variable": MRI*Ht
Specify Expression: MRI*Height
Select OK. The new (interaction) predictor variable will appear in the worksheet.
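The same interaction column is just an elementwise product of the two predictor columns. A minimal sketch with NumPy — the arrays below are illustrative values standing in for the MRI and Height worksheet columns:

```python
import numpy as np

mri = np.array([81.69, 103.84, 96.54])   # illustrative MRI values (count/10,000)
height = np.array([64.5, 73.3, 68.8])    # illustrative heights in inches

mri_ht = mri * height  # elementwise product: the MRI*Height interaction term
print(mri_ht)
```

This new column is then included in the design matrix alongside MRI and Height when fitting the interaction model.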

Modified Levene test

Groups formed by splitting on MRI — Group 1: MRI ≤ 90.5; Group 2: MRI > 90.5.

LOF Test

Requires that there are at least some repeats (replicates) of the same values across all predictor variables. (X1 = 59, X2 = 63) and (X1 = 59, X2 = 63) is an example of a repeat. (X1 = 59, X2 = 63) and (X1 = 59, X2 = 66) is not an example of a repeat.
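Checking for replicates programmatically amounts to grouping observations by their full predictor vector and looking for any group of size greater than one. A sketch using the toy values from the slide:

```python
from collections import Counter

# Each tuple is the full predictor vector (X1, X2) for one observation
rows_with_repeat = [(59, 63), (59, 63), (59, 66)]  # (59, 63) occurs twice
rows_without = [(59, 63), (59, 66)]                # all vectors distinct

def has_replicates(rows):
    """True if any complete predictor vector occurs more than once."""
    return any(count > 1 for count in Counter(rows).values())

print(has_replicates(rows_with_repeat))  # True
print(has_replicates(rows_without))      # False
```

Only exact matches on every predictor count; a match on X1 alone, as in the second list, is not a replicate.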

[Data listing: Row, MRI, Height — shown in two columns]

Attempted LOF Test

The regression equation is PIQ = MRI Height

Analysis of Variance
Source      DF  SS  MS  F  P
Regression
Error
Total

No replicates, so the pure error lack-of-fit test cannot be carried out.