Lab 4 Multiple Linear Regression

Meaning
- An extension of simple linear regression
- It models the mean of a response variable as a linear function of several explanatory variables
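Written out in symbols, the statement above might look as follows. The notation (k explanatory variables X_1, ..., X_k and coefficients beta_0, ..., beta_k) is an assumption, since the slide gives the model only in words.

```latex
\mu(Y \mid X_1, \dots, X_k) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k
```

Simple linear regression is the special case k = 1.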

Ways of analysis
- Matrix of scatterplots
- Matrix of correlations
- Regression:
  - fit the model (variable selection)
  - interpret the model, t-test and F-test in regression
  - prediction
  - diagnostics (linearity, constant variance, normality, independence, outliers)

The response and the independent variables
- The response: iq
- The independent variables:
  - MILK: 0 = no breast milk, 1 = yes
  - FEM: 0 = male child, 1 = female
  - WEEKS: weeks in ventilation
  - SOCIAL: mum's social class (1, 2, 3, 4, with 1 being the highest)
  - RANK: birth order of the child
  - EDUC: mum's education level (1, 2, 3, 4, 5, with 5 being the highest)

Matrix of scatterplots

Correlation among iq, weeks, social, educ, rank

Matrix of correlations
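Outside SPSS, a matrix of correlations can be sketched with NumPy. The data below are simulated stand-ins (the lab data file is not reproduced here), and only three of the lab's variable names are borrowed for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Hypothetical data: the values are simulated, not taken from the lab file.
weeks = rng.exponential(scale=3.0, size=n)
educ = rng.integers(1, 6, size=n).astype(float)
iq = 95 + 2.0 * educ - 1.5 * np.log1p(weeks) + rng.normal(0, 10, size=n)

names = ["iq", "weeks", "educ"]
data = np.column_stack([iq, weeks, educ])

# rowvar=False: each column is a variable, each row is an observation.
corr = np.corrcoef(data, rowvar=False)

# Print the matrix with one labelled row per variable.
for name, row in zip(names, corr):
    print(f"{name:>6}", " ".join(f"{r:6.2f}" for r in row))
```

The diagonal is always 1 and the matrix is symmetric, so only the entries above (or below) the diagonal need to be read.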

Regression: fit the model
- Procedure: Analyze → Regression → Linear
- Methods of determining independent variables

Methods (details in Instruction 4, p. 18)
- Enter: The model is obtained with all specified variables. This is the default method.
- Stepwise
- Remove
- Backward: The variables are removed from the model one by one if they meet the criterion for removal (a maximum significance level or a minimum F value).
- Forward
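The Backward method can be illustrated with a minimal sketch of backward elimination using a minimum F value as the removal criterion. This is not the SPSS implementation: the helper names, the simulated data, and the threshold `f_remove=2.71` are all assumptions chosen for illustration.

```python
import numpy as np

def partial_f(design, y):
    """F-to-remove (squared t statistic) for every column of the design matrix."""
    n, p = design.shape
    xtx_inv = np.linalg.inv(design.T @ design)
    beta = xtx_inv @ design.T @ y
    resid = y - design @ beta
    sigma2 = resid @ resid / (n - p)          # residual variance estimate
    se = np.sqrt(sigma2 * np.diag(xtx_inv))   # standard errors of coefficients
    return (beta / se) ** 2

def backward_eliminate(X, y, names, f_remove=2.71):
    """Drop the weakest predictor while its F value is below f_remove.

    X holds the predictors without an intercept column.
    f_remove is an illustrative threshold (assumption).
    """
    keep = list(range(X.shape[1]))
    while keep:
        design = np.column_stack([np.ones(len(y)), X[:, keep]])
        f_vals = partial_f(design, y)[1:]     # skip the intercept
        weakest = int(np.argmin(f_vals))
        if f_vals[weakest] >= f_remove:
            break                             # every remaining variable stays
        del keep[weakest]
    return [names[i] for i in keep]

# Simulated example: only x1 truly affects y.
rng = np.random.default_rng(5)
X = rng.normal(size=(100, 3))
y = 3.0 * X[:, 0] + rng.normal(size=100)
names = ["x1", "x2", "x3"]
selected = backward_eliminate(X, y, names)
print(selected)
```

For a single coefficient the F-to-remove equals the square of its t statistic, which is why one fit per step is enough here.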

Regression: interpret the model
- Interpretation of the output
  1. Variables entered/removed
  2. Model summaries (R, R^2)
  3. ANOVA test (F-test)

Note on F-test
- To test the overall significance of the model
- Its null distribution: F-distribution
- To further construct the extra-sum-of-squares F-test
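For reference, the extra-sum-of-squares F statistic is commonly written as below. The notation is an assumption: RSS denotes the residual sum of squares and df the residual degrees of freedom of the full and reduced models.

```latex
F = \frac{\left(\mathrm{RSS}_{\text{reduced}} - \mathrm{RSS}_{\text{full}}\right) / \left(\mathrm{df}_{\text{reduced}} - \mathrm{df}_{\text{full}}\right)}{\mathrm{RSS}_{\text{full}} / \mathrm{df}_{\text{full}}}
```

Under the null hypothesis that the reduced model is adequate, F follows an F-distribution with (df_reduced - df_full, df_full) degrees of freedom. The overall F-test in the ANOVA table is the special case in which the reduced model contains only the intercept.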

  4. Coefficients (estimation, t-test, CI of coefficients)
     - t-test in the i-th row
     - CI of coefficients

Note on t-test and CI of coefficients
- t-test: tests the significance of a single independent variable, given the other variables in the model
  - can be one-sided
  - its null distribution: t-distribution
- 95% CI of a coefficient: a range that contains the true coefficient with 95% confidence, i.e. the plausible range for the change in the mean of Y per 1-unit increase in the corresponding X, with the other variables held fixed
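A sketch of how the t statistics and 95% confidence intervals in a coefficients table could be computed by hand with NumPy and SciPy. The data are simulated and the variable names only mimic the lab's; nothing here reproduces the actual SPSS output.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 120

# Hypothetical data standing in for the lab file.
milk = rng.integers(0, 2, size=n).astype(float)
educ = rng.integers(1, 6, size=n).astype(float)
iq = 90 + 5.0 * milk + 2.0 * educ + rng.normal(0, 8, size=n)

X = np.column_stack([np.ones(n), milk, educ])   # intercept, MILK, EDUC
p = X.shape[1]                                  # number of coefficients

xtx_inv = np.linalg.inv(X.T @ X)
beta = xtx_inv @ X.T @ iq                       # least-squares estimates
resid = iq - X @ beta
sigma2 = resid @ resid / (n - p)                # residual variance estimate
se = np.sqrt(sigma2 * np.diag(xtx_inv))         # standard errors

# t-test of H0: beta_i = 0, with n - p degrees of freedom.
t_stat = beta / se
p_two_sided = 2 * stats.t.sf(np.abs(t_stat), df=n - p)

# 95% confidence interval: estimate +/- t critical value * standard error.
t_crit = stats.t.ppf(0.975, df=n - p)
ci_low = beta - t_crit * se
ci_high = beta + t_crit * se

for name, b, lo, hi, pv in zip(["const", "MILK", "EDUC"], beta, ci_low, ci_high, p_two_sided):
    print(f"{name:>5}: b={b:7.3f}  95% CI=({lo:7.3f}, {hi:7.3f})  p={pv:.4f}")
```

A one-sided p-value is half of the two-sided one when the estimate lies in the direction of the alternative hypothesis.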

Regression: prediction
- Point estimation
- Confidence interval of the mean (CI)
- Prediction interval of one observation (PI)
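The difference between the CI of the mean and the PI of one observation can be seen in a small sketch. The data are simulated, and the new case (EDUC = 3) is a hypothetical input chosen only for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n = 100

# Hypothetical data: one predictor keeps the example short.
educ = rng.integers(1, 6, size=n).astype(float)
iq = 92 + 2.5 * educ + rng.normal(0, 9, size=n)

X = np.column_stack([np.ones(n), educ])
p = X.shape[1]
xtx_inv = np.linalg.inv(X.T @ X)
beta = xtx_inv @ X.T @ iq
resid = iq - X @ beta
s = np.sqrt(resid @ resid / (n - p))        # residual standard deviation
t_crit = stats.t.ppf(0.975, df=n - p)

x0 = np.array([1.0, 3.0])                   # intercept term, EDUC = 3
y_hat = x0 @ beta                           # point estimate
h0 = x0 @ xtx_inv @ x0                      # leverage-like term for x0

se_mean = s * np.sqrt(h0)                   # uncertainty in the mean response
se_pred = s * np.sqrt(1.0 + h0)             # adds the noise of one new case

ci = (y_hat - t_crit * se_mean, y_hat + t_crit * se_mean)
pi = (y_hat - t_crit * se_pred, y_hat + t_crit * se_pred)

print(f"point estimate: {y_hat:.2f}")
print(f"95% CI of the mean: ({ci[0]:.2f}, {ci[1]:.2f})")
print(f"95% PI of one observation: ({pi[0]:.2f}, {pi[1]:.2f})")
```

The PI is always wider than the CI because it covers both the uncertainty in the estimated mean and the variation of a single observation around that mean.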

Multiple Regression: Diagnostics
Obtain plots to check the validity of the assumptions
- Linearity: residuals vs predicted value (Ŷ) / explanatory variable (X)
- Constant variance: residuals vs predicted value (Ŷ) / explanatory variable (X)
- Normality: QQ plot of residuals
- Independence: residuals versus the time order of the observations
- Outliers and influential observations:

What is an influential observation?
- An observation is influential if removing it markedly changes the estimated coefficients of the regression model.
- An outlier may be an influential observation.

To identify outliers and/or influential observations
- Studentized residuals: A case may be considered an outlier if the absolute value of its studentized residual exceeds 2.
- Leverage values: A leverage larger than 2p/n (p regression coefficients, n observations) implies that the observation has a high potential for influence.
- Cook's distance: If Cook's distance is close to or larger than 1, the case may be considered influential.
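The three rules of thumb above can be sketched with NumPy. The data are simulated with one deliberately planted unusual case. The residuals are internally studentized, which is an assumption because the slide does not say which version is meant. Leverage here is the diagonal of the hat matrix, which may differ from the leverage value a given package reports.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50

# Hypothetical data with one planted unusual case (index 0).
x = rng.normal(0, 1, size=n)
y = 1.0 + 2.0 * x + rng.normal(0, 1, size=n)
x[0], y[0] = 6.0, -5.0                      # far out in x and far off the line

X = np.column_stack([np.ones(n), x])
p = X.shape[1]                              # number of regression coefficients

H = X @ np.linalg.inv(X.T @ X) @ X.T        # hat matrix
leverage = np.diag(H)
resid = y - H @ y
s2 = resid @ resid / (n - p)                # residual variance estimate

# Internally studentized residuals (assumption, see lead-in).
stud = resid / np.sqrt(s2 * (1.0 - leverage))

# Cook's distance from studentized residuals and leverage.
cooks = stud**2 * leverage / (p * (1.0 - leverage))

flag_outlier = np.abs(stud) > 2
flag_leverage = leverage > 2 * p / n
flag_influence = cooks >= 1

print("possible outliers:   ", np.flatnonzero(flag_outlier))
print("high leverage cases: ", np.flatnonzero(flag_leverage))
print("influential cases:   ", np.flatnonzero(flag_influence))
```

The leverages always sum to p, which is where the 2p/n cutoff (twice the average leverage) comes from.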

Miscellanies
- Multicollinearity: considered present if the correlation between two independent variables is close to or higher than 0.85
- Remember to use Ln(WEEKS) from Question 5
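A quick screen for multicollinearity, using the 0.85 rule above together with the log transform of WEEKS, might look like this sketch. All values are simulated, and `educ` is deliberately constructed to track `social` so that the example has a pair to flag. This says nothing about the correlations in the real lab data.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 150

# Hypothetical data; weeks is kept strictly positive so the log is defined.
weeks = rng.exponential(scale=4.0, size=n) + 0.5
ln_weeks = np.log(weeks)                    # Ln(WEEKS)
social = rng.integers(1, 5, size=n).astype(float)
educ = 6.0 - social + rng.normal(0, 0.3, size=n)   # built to track social

names = ["ln_weeks", "social", "educ"]
X = np.column_stack([ln_weeks, social, educ])
corr = np.corrcoef(X, rowvar=False)

threshold = 0.85
flagged = [
    (names[i], names[j], corr[i, j])
    for i in range(len(names))
    for j in range(i + 1, len(names))
    if abs(corr[i, j]) >= threshold         # strong positive or negative
]

for a, b, r in flagged:
    print(f"{a} and {b}: r = {r:.2f}")
```

The absolute value is used because a strong negative correlation between two predictors is just as problematic as a strong positive one.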

Miscellanies
- Understand the meaning of the 95% CI of coefficients
- Identify the "full model" and the "reduced model" when doing the extra-sum-of-squares F-test