Basic Biostat, Chapter 15: Multiple Linear Regression

Presentation transcript:


In Chapter 15:
15.1 The General Idea
15.2 The Multiple Regression Model
15.3 Categorical Explanatory Variables
15.4 Regression Coefficients
[15.5 ANOVA for Multiple Linear Regression]
[15.6 Examining Conditions]
(The bracketed sections are not covered in the recorded presentation.)

15.1 The General Idea
Simple regression considers the relation between a single explanatory variable and a response variable.

The General Idea, continued
Multiple regression simultaneously considers the influence of multiple explanatory variables on a response variable Y. The intent is to look at the independent effect of each variable while "adjusting out" the influence of potential confounders.

Regression Modeling
A simple regression model (one independent variable) fits a regression line in 2-dimensional space. A multiple regression model with two explanatory variables fits a regression plane in 3-dimensional space.

Simple Regression Model
Regression coefficients are estimated by minimizing ∑residuals² (i.e., the sum of the squared residuals) to derive the fitted model. The standard error of the regression, s_{Y|x}, is likewise based on the squared residuals.
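In standard notation (a reconstruction consistent with the definitions above; the transcript does not include the slide's equations), the fitted line and the standard error of the regression are

\hat{y} = a + bX

s_{Y|x} = \sqrt{ \dfrac{\sum \text{residuals}^2}{n - 2} }

where a is the estimated intercept, b the estimated slope, and n the number of observations.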

Multiple Regression Model
Again, estimates for the multiple slope coefficients are derived by minimizing ∑residuals² to obtain the multiple regression model. Again, the standard error of the regression is based on ∑residuals².
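Likewise, for two explanatory variables a standard form of the fitted model and its standard error (a reconstruction, consistent with the df = n − k − 1 used later for inference, with k = 2 predictors) is

\hat{y} = a + b_1 X_1 + b_2 X_2

s_{Y|x_1 x_2} = \sqrt{ \dfrac{\sum \text{residuals}^2}{n - k - 1} }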

Multiple Regression Model, interpretation
The intercept α predicts where the regression plane crosses the Y axis. The slope for variable X1 (β1) predicts the change in Y per unit X1, holding X2 constant. The slope for variable X2 (β2) predicts the change in Y per unit X2, holding X1 constant.

Multiple Regression Model, general case
A multiple regression model with k independent variables fits a regression "surface" in (k + 1)-dimensional space, which cannot be visualized when k > 2.

15.3 Categorical Explanatory Variables in Regression Models
Categorical independent variables can be incorporated into a regression model by converting them into 0/1 ("dummy") variables. For binary variables, code the dummy "0" for "no" and "1" for "yes".

Dummy Variables, More Than Two Levels
For a categorical variable with k categories, use k − 1 dummy variables. SMOKE2 has three levels, initially coded 0 = non-smoker, 1 = former smoker, 2 = current smoker. Use k − 1 = 3 − 1 = 2 dummy variables to code this information, for example:
  non-smoker:     dummy1 = 0, dummy2 = 0 (reference group)
  former smoker:  dummy1 = 1, dummy2 = 0
  current smoker: dummy1 = 0, dummy2 = 1
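A minimal sketch of this coding in Python/pandas; the data values and column names here are hypothetical, not from the survey:

import pandas as pd

# Hypothetical SMOKE2 values (0 = non-smoker, 1 = former smoker, 2 = current smoker).
df = pd.DataFrame({"SMOKE2": [0, 1, 2, 0, 2]})

# k - 1 = 2 dummy variables; the non-smoker group (0) is the reference, coded (0, 0).
df["former_smoker"] = (df["SMOKE2"] == 1).astype(int)
df["current_smoker"] = (df["SMOKE2"] == 2).astype(int)
print(df)

Only the two dummies enter the regression; the reference category is absorbed into the intercept.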

Illustrative Example
Childhood respiratory health survey. The binary explanatory variable (SMOKE) is coded 0 for non-smoker and 1 for smoker. The response variable, forced expiratory volume (FEV), is measured in liters/second. The mean FEV in nonsmokers is 2.566; the mean FEV in smokers is 3.277.

Example, continued
Regress FEV on SMOKE. Least squares regression line: ŷ = 2.566 + 0.711X.
The intercept (2.566) is the mean FEV of group 0 (non-smokers). The slope (0.711) is the mean difference in FEV, 3.277 − 2.566. The t statistic for the slope (652 df) and its P-value are the same as those from the equal-variance t test, and the 95% CI for the slope β is the same as the 95% CI for μ1 − μ0.
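A sketch of the same fit in Python/statsmodels (the file name and column names are assumptions about how the survey data might be stored):

import pandas as pd
import statsmodels.formula.api as smf

data = pd.read_csv("fev.csv")   # assumed columns: FEV and SMOKE (0/1)
fit = smf.ols("FEV ~ SMOKE", data=data).fit()

print(fit.params)      # Intercept ~ mean FEV of non-smokers; SMOKE slope ~ mean difference
print(fit.conf_int())  # 95% CI for the SMOKE slope = 95% CI for the difference in means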

Dummy Variable SMOKE
The regression line passes through the two group means: b = 3.277 − 2.566 = 0.711.

Smoking increases FEV?
Children who smoked had higher mean FEV. How can this be true given what we know about the deleterious respiratory effects of smoking? Answer: the smokers were older than the nonsmokers, so AGE confounded the relationship between SMOKE and FEV. A multiple regression model can be used to adjust for AGE in this situation.

15.4 Multiple Regression Coefficients
Rely on software to calculate multiple regression statistics.

Example
The fitted multiple regression model is FEV = a − 0.209(SMOKE) + 0.231(AGE), where the intercept a, the SMOKE slope b1, and the AGE slope b2 are read from the SPSS output for the example.
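A sketch of the corresponding two-predictor fit in Python/statsmodels (the slide shows SPSS output; the file and column names below are assumptions):

import pandas as pd
import statsmodels.formula.api as smf

data = pd.read_csv("fev.csv")   # assumed columns: FEV, SMOKE (0/1), AGE (years)
fit = smf.ols("FEV ~ SMOKE + AGE", data=data).fit()

print(fit.summary())                 # intercept, slopes, t statistics, P-values
print(fit.conf_int().loc["SMOKE"])   # age-adjusted 95% CI for the SMOKE coefficient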

Multiple Regression Coefficients, continued
The slope coefficient for SMOKE is −0.209, suggesting that smokers have 0.209 less FEV on average than non-smokers (after adjusting for age). The slope coefficient for AGE is 0.231, suggesting that each year of age is associated with an increase of 0.231 FEV units on average (after adjusting for SMOKE).

Inference About the Coefficients
Inferential statistics are calculated for each regression coefficient. For example, in testing H0: β1 = 0 (the SMOKE coefficient, controlling for AGE), t stat = −2.588 with df = n − k − 1 = 654 − 2 − 1 = 651 (P ≈ 0.01).

Inference About the Coefficients, continued
The 95% confidence interval for the slope of SMOKE, controlling for AGE, is −0.368 to −0.050.
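In standard notation (not reproduced from the slide), the test statistic and confidence interval for a single coefficient are

t = \dfrac{b_1}{SE(b_1)}, \qquad df = n - k - 1

b_1 \pm t_{0.975,\, n-k-1} \cdot SE(b_1)

As a rough check against the numbers above: SE(b_1) ≈ 0.209 / 2.588 ≈ 0.081, so the 95% interval is about −0.209 ± 1.96 × 0.081 ≈ (−0.368, −0.050).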

15.5 ANOVA for Multiple Regression
pp. 343–346 (not covered in some courses).

15.6 Examining Regression Conditions
Conditions for multiple regression mirror those of simple regression:
– Linearity
– Independence
– Normality
– Equal variance
These are evaluated by analyzing the pattern of the residuals, as sketched below.
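A sketch of how the residual checks on the next two slides could be produced in Python/statsmodels (the file and column names are the same assumptions as in the earlier sketches):

import matplotlib.pyplot as plt
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

data = pd.read_csv("fev.csv")                  # assumed columns: FEV, SMOKE, AGE
fit = smf.ols("FEV ~ SMOKE + AGE", data=data).fit()

resid_std = fit.get_influence().resid_studentized_internal   # standardized residuals
fitted = fit.fittedvalues

plt.scatter(fitted, resid_std)   # curvature suggests nonlinearity; fanning suggests unequal variance
plt.axhline(0)
plt.xlabel("Fitted values")
plt.ylabel("Standardized residuals")
plt.show()

sm.qqplot(resid_std, line="45")  # a roughly straight diagonal suggests approximate Normality
plt.show()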

Residual Plot
Figure: standardized residuals plotted against standardized predicted values for the illustration (FEV regressed on AGE and SMOKE). Roughly equal numbers of points above and below the horizontal line at 0 → no major departures from linearity. Higher variability at higher values of Y → unequal variance (biologically reasonable).

Examining Conditions
Figure: Normal Q-Q plot of standardized residuals. A fairly straight diagonal suggests no major departures from Normality.