Multiple Regression [ Cross-Sectional Data ]

Presentation transcript:

Regression Analysis: Multiple Regression [Cross-Sectional Data]

Learning Objectives: As a result of this class, you will be able to (1) explain the linear multiple regression model [for cross-sectional data], (2) interpret linear multiple regression computer output, (3) explain multicollinearity, and (4) describe the types of multiple regression models.

Regression Modeling Steps: (1) Define the problem or question. (2) Specify the model. (3) Collect data. (4) Do descriptive data analysis. (5) Estimate unknown parameters. (6) Evaluate the model. (7) Use the model for prediction.

Simple vs. Multiple: In simple regression, β represents the unit change in Y per unit change in X; it does not take into account any variable other than the single independent variable. In multiple regression, βi represents the unit change in Y per unit change in Xi, taking into account the effect of the other X variables; it is the “net regression coefficient.”

Assumptions Linearity - the Y variable is linearly related to the value of the X variable. Independence of Error - the error (residual) is independent for each value of X. Homoscedasticity - the variation around the line of regression is constant for all values of X. Normality - the values of Y are normally distributed at each value of X.

Goal Develop a statistical model that can predict the values of a dependent (response) variable based upon the values of the independent (explanatory) variables.

Simple Regression A statistical model that utilizes one quantitative independent variable “X” to predict the quantitative dependent variable “Y.”

Multiple Regression A statistical model that utilizes two or more quantitative and qualitative explanatory variables (x1, ..., xp) to predict a quantitative dependent variable Y. Caution: have at least two quantitative explanatory variables (rule of thumb).

Multiple Regression Model [Figure: Y plotted against X1 and X2, with error term e.]

Hypotheses H0: β1 = β2 = β3 = ... = βP = 0 H1: At least one regression coefficient is not equal to zero

Hypotheses (alternate format) H0: βi = 0 H1: βi ≠ 0

Types of Models [figure panels]: positive linear relationship; negative linear relationship; no relationship between X and Y; positive curvilinear relationship; U-shaped curvilinear relationship; negative curvilinear relationship.

Multiple Regression Models

Multiple Regression Equations This is too complicated! You’ve got to be kiddin’!

Linear Model: The relationship between one dependent and two or more independent variables is a linear function: Y = β0 + β1X1 + β2X2 + ... + βPXP + ε, where Y is the dependent (response) variable, X1, ..., XP are the independent (explanatory) variables, β0 is the population Y-intercept, β1, ..., βP are the population slopes, and ε is the random error.

Method of Least Squares: Find the straight line that best fits the data, that is, the line for which the sum of the squared differences between the actual values (Y) and the values predicted from the fitted line of regression (Y-hat) is as small as possible.
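The least-squares idea can be sketched in a few lines of Python. This is a minimal illustration rather than anything taken from the slides: the data are made up and noise-free (generated from y = 10 + 2*x1 - 3*x2), NumPy is assumed to be available, and the variable names are hypothetical.

```python
import numpy as np

# Illustrative, noise-free data (not from the slides): y = 10 + 2*x1 - 3*x2
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
y = 10.0 + 2.0 * x1 - 3.0 * x2

# Design matrix: a column of ones for the intercept, then one column per X
X = np.column_stack([np.ones_like(x1), x1, x2])

# Least squares picks the coefficients b that minimize sum((y - X @ b) ** 2)
b, *_ = np.linalg.lstsq(X, y, rcond=None)

residuals = y - X @ b
sse = float(np.sum(residuals ** 2))  # error sum of squares at the optimum
print(np.round(b, 4), round(sse, 8))
```

Because the made-up data contain no noise, the fitted coefficients recover the generating values and the error sum of squares is essentially zero; with real data the residuals would not vanish.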

Measures of Variation: Total sum of squares (SST) = explained variation (SSR, sum of squares due to regression) + unexplained variation (SSE, error sum of squares).
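Written out, the three measures take the standard forms below. The slides name the quantities but do not show the formulas, so the notation here (Y-bar for the sample mean, Y-hat for the fitted value, n observations) is an assumption; the decomposition holds for a least-squares fit that includes an intercept.

```latex
\begin{align*}
SST &= \sum_{i=1}^{n} (Y_i - \bar{Y})^2       && \text{total variation} \\
SSR &= \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2 && \text{explained variation} \\
SSE &= \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2     && \text{unexplained variation} \\
SST &= SSR + SSE
\end{align*}
```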

Coefficient of Multiple Determination When the null hypothesis is rejected, a relationship between Y and the X variables exists. Its strength is measured by R² [several types].

Coefficient of Multiple Determination R²(Y.12...P): the proportion of the variation in Y that is explained by the set of explanatory variables selected.

Standard Error of the Estimate Syx: the measure of variability around the line of regression.

Confidence interval estimates: for the true mean of Y at given X; for an individual Y-hat(i).

Interval Bands [from simple regression] Note: 1. As we move farther from the mean, the bands get wider. 2. The prediction interval bands are wider. Why? (extra Syx)

Multiple Regression Equation Y-hat = β0 + β1x1 + β2x2 + ... + βPxP where: β0 = y-intercept {a constant value}; β1 = slope of Y with variable x1, holding the effects of variables x2, x3, ..., xP constant; βP = slope of Y with variable xP, holding all other variables’ effects constant.

Who is in Charge?

Mini-Case Predict the consumption of home heating oil during January for homes located around Screne Lakes. Two explanatory variables are selected: average daily atmospheric temperature (°F) and the amount of attic insulation (inches).

Mini-Case Develop a model for estimating heating oil used for a single family home in the month of January based on average temperature (°F) and amount of insulation in inches.

Mini-Case What preliminary conclusions can home owners draw from the data? What could a home owner expect heating oil consumption (in gallons) to be if the outside temperature is 15 °F when the attic insulation is 10 inches thick?

Multiple Regression Equation [mini-case]

Dependent variable: Gallons Consumed
------------------------------------------------------------------
                             Standard      T
Parameter       Estimate     Error         Statistic     P-Value
------------------------------------------------------------------
CONSTANT        562.151      21.0931       26.6509       0.0000
Insulation      -20.0123     2.34251       -8.54313      0.0000
Temperature     -5.43658     0.336216      -16.1699      0.0000
------------------------------------------------------------------
R-squared = 96.561 percent
R-squared (adjusted for d.f.) = 95.9879 percent
Standard Error of Est. = 26.0138

Multiple Regression Equation [mini-case] Y-hat = 562.15 - 5.44x1 - 20.01x2 where: x1 = temperature [degrees F] x2 = attic insulation [inches]

Multiple Regression Equation [mini-case] Y-hat = 562.15 - 5.44x1 - 20.01x2 thus: For a home with zero inches of attic insulation and an outside temperature of 0 °F, 562.15 gallons of heating oil would be consumed. [ caution .. data boundaries .. extrapolation ]

Extrapolation: prediction outside the range of X values used to develop the equation. Interpolation: prediction within the range of X values used to develop the equation. The range is based on the smallest and largest X values.

Multiple Regression Equation [mini-case] Y-hat = 562.15 - 5.44x1 - 20.01x2 For a home with zero inches of attic insulation and an outside temperature of 0 °F, 562.15 gallons of heating oil would be consumed. [ caution … ] For each 1 °F increase in temperature, at a given amount of attic insulation, heating oil consumption drops by 5.44 gallons. For each additional inch of attic insulation, at a given temperature, heating oil consumption drops by 20.01 gallons.

Multiple Regression Prediction [mini-case] Y-hat = 562.15 - 5.44x1 - 20.01x2 with x1 = 15 °F and x2 = 10 inches: Y-hat = 562.15 - 5.44(15) - 20.01(10) = 280.45 gallons consumed
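The same prediction can be reproduced with a short Python function. This is only a sketch of the arithmetic on the slide; the function name `predict_gallons` is made up for illustration, and the coefficients are the rounded values from the fitted equation.

```python
def predict_gallons(temp_f, insulation_in):
    """Fitted mini-case equation: Y-hat = 562.15 - 5.44*x1 - 20.01*x2.

    temp_f        -- average daily temperature in degrees F (x1)
    insulation_in -- attic insulation in inches (x2)
    """
    return 562.15 - 5.44 * temp_f - 20.01 * insulation_in


# The case asked about on the slide: 15 degrees F and 10 inches of insulation
print(round(predict_gallons(15, 10), 2))  # 280.45
```

The function will return a number for any inputs, so the caution about extrapolation still applies: the result is only meaningful for temperatures and insulation depths within the range of the sample data.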

Coefficient of Multiple Determination [mini-case] R²(Y.12) = 0.9656: 96.56 percent of the variation in heating oil can be explained by the variation in temperature and insulation.

Coefficient of Multiple Determination: Proportion of variation in Y ‘explained’ by all X variables taken together. R²(Y.12) = explained variation / total variation = SSR / SST. Never decreases when a new X variable is added to the model, because only the Y values determine SST; this is a disadvantage when comparing models. SSR is the sum of squares due to regression (not residual; that’s SSE).

Coefficient of Multiple Determination (adjusted): Proportion of variation in Y ‘explained’ by all X variables taken together, adjusted to reflect the sample size and the number of independent variables. Smaller [more conservative] than R²(Y.12). Used to compare models.

Coefficient of Multiple Determination (adjusted) R²adj(Y.12...P): the proportion of the variation in Y that is explained by the set of independent [explanatory] variables selected, adjusted for the number of independent variables and the sample size.

Coefficient of Multiple Determination (adjusted) [Mini-Case] R²adj = 0.9599: 95.99 percent of the variation in heating oil consumption can be explained by the model, adjusted for the number of independent variables and the sample size.
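The adjustment can be sketched with the usual formula, adjusted R² = 1 - (1 - R²)(n - 1)/(n - p - 1), where n is the sample size and p the number of explanatory variables. The slides do not state n; the value n = 15 below is an assumption, chosen because it reproduces the adjusted figure in the computer output.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R^2 for n observations and p explanatory variables."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


# R^2 = 0.96561 comes from the mini-case output.
# n = 15 is an ASSUMPTION (the sample size is not stated on the slides).
print(round(adjusted_r2(0.96561, n=15, p=2), 4))  # 0.9599
```

Because the penalty factor (n - 1)/(n - p - 1) grows with p, adding a weak explanatory variable can lower the adjusted value even though plain R² never decreases.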

Coefficient of Partial Determination: Proportion of variation in Y ‘explained’ by variable XP holding all others constant. Must estimate separate models. Denoted R²(Y1.2) in the two-X-variable case: the coefficient of partial determination of X1 with Y holding X2 constant. Useful in selecting X variables.

Coefficient of Partial Determination [p. 878] R²(Y1.234...P): the coefficient of partial determination of variable Y with x1, holding constant the effects of variables x2, x3, x4, ..., xP.

Coefficient of Partial Determination [Mini-Case] R²(Y1.2) = 0.9561: For a fixed (constant) amount of insulation, 95.61 percent of the variation in heating oil can be explained by the variation in average atmospheric temperature. [p. 879]

Coefficient of Partial Determination [Mini-Case] R²(Y2.1) = 0.8588: For a fixed (constant) temperature, 85.88 percent of the variation in heating oil can be explained by the variation in amount of insulation.
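The slides obtain these values by estimating separate models. As a cross-check, a coefficient of partial determination can also be recovered from a coefficient's t statistic in the full model, using the identity partial R² = t² / (t² + residual degrees of freedom). The sketch below applies it to the t statistics in the computer output; the sample size n = 15 is an assumption not stated on the slides, and the function name is hypothetical.

```python
def partial_r2_from_t(t_stat, resid_df):
    """Partial R^2 of one X variable, given its t statistic in the full model."""
    return t_stat ** 2 / (t_stat ** 2 + resid_df)


# Residual degrees of freedom = n - p - 1, with n = 15 ASSUMED and p = 2
resid_df = 15 - 2 - 1

temperature = partial_r2_from_t(-16.1699, resid_df)  # holding insulation constant
insulation = partial_r2_from_t(-8.54313, resid_df)   # holding temperature constant
print(round(temperature, 4), round(insulation, 4))   # 0.9561 0.8588
```

Under that assumption both values agree with the figures quoted on the slides.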

Testing Overall Significance: Shows whether there is a linear relationship between all X variables together and Y. Uses the p-value. Hypotheses: H0: β1 = β2 = ... = βP = 0 (no linear relationship); H1: at least one coefficient is not 0 (at least one X variable affects Y). Less chance of error than separate t-tests on each coefficient: doing a series of t-tests leads to a higher overall Type I error than α.
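The test statistic behind this p-value is the overall F statistic, which can be written in terms of R² as F = (R²/p) / ((1 - R²)/(n - p - 1)). The sketch below computes it for the mini-case; n = 15 is an assumption (the slides do not give the sample size), and the p-value itself, which comes from the F distribution with p and n - p - 1 degrees of freedom, is not computed here.

```python
def overall_f(r2, n, p):
    """F statistic for H0: all p slope coefficients are zero."""
    return (r2 / p) / ((1.0 - r2) / (n - p - 1))


# R^2 from the mini-case output; n = 15 is an ASSUMPTION, p = 2 X variables
f_stat = overall_f(0.96561, n=15, p=2)
print(round(f_stat, 1))
```

A large F relative to the critical value of the F distribution leads to rejecting H0, which is consistent with the very small p-values shown for both coefficients in the output.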

Testing Model Portions: Examines the contribution of a set of X variables to the relationship with Y. Null hypothesis: the variables in the set do not significantly improve the model when all other variables are included. Must estimate separate models. Used in selecting X variables.

Diagnostic Checking: Retain or reject H0. If H0 is rejected {p-value ≤ 0.05}, examine: R²adj; correlation matrix; partial correlation matrix.

Multicollinearity: High correlation between X variables. Coefficients measure a combined effect. Leads to unstable coefficients, depending on which X variables are in the model. Always exists; it is a matter of degree. Example: using both total number of rooms and number of bedrooms as explanatory variables in the same model.

Detecting Multicollinearity: Examine the correlation matrix; a warning sign is that correlations between pairs of X variables are higher than their correlations with the Y variable. Few remedies: obtain new sample data, or eliminate one correlated X variable.
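Examining the correlation matrix can be sketched as follows, using the rooms/bedrooms example from the previous slide. The house data are made up for illustration, the third variable `lot_size` and the 0.8 flagging threshold are arbitrary assumptions, and NumPy is assumed to be available.

```python
import numpy as np

# Made-up house data, for illustration only
names = ["rooms", "bedrooms", "lot_size"]
rooms = np.array([5, 6, 7, 8, 9, 10, 6, 8], dtype=float)
bedrooms = np.array([2, 3, 3, 4, 4, 5, 2, 4], dtype=float)
lot_size = np.array([0.30, 0.22, 0.45, 0.25, 0.40, 0.28, 0.50, 0.35])

# Correlation matrix of the X variables (each column is one variable)
X = np.column_stack([rooms, bedrooms, lot_size])
corr = np.corrcoef(X, rowvar=False)

# Flag pairs of X variables whose correlation exceeds an arbitrary threshold
threshold = 0.8
flagged = [
    (names[i], names[j])
    for i in range(len(names))
    for j in range(i + 1, len(names))
    if abs(corr[i, j]) > threshold
]
print(np.round(corr, 2))
print(flagged)
```

With these numbers only the rooms/bedrooms pair is flagged, which matches the remedy on the slide: drop one of the two correlated variables, or collect new data.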

Evaluating Multiple Regression Model Steps: (1) Examine variation measures. (2) Do residual analysis. (3) Test parameter significance: overall model, portions of model, individual coefficients. (4) Test for multicollinearity.

Multiple Regression Models

Dummy-Variable Regression Model: Involves a categorical X variable with two levels, e.g., female-male, employed-not employed, etc. Variable levels are coded 0 and 1. Assumes only the intercept is different; slopes are constant across categories.

Dummy-Variable Model Relationships [Figure: Y against X1, two parallel lines with the same slope b1; intercept b0 for males and b0 + b2 for females.]
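The parallel-lines picture can be reproduced with a small least-squares fit. This is a sketch on made-up, noise-free data (generated from y = 5 + 2*x1 + 3*d), not data from the slides; NumPy is assumed, and `d` is a hypothetical 0/1 dummy column.

```python
import numpy as np

# Made-up data: y = 5 + 2*x1 + 3*d, where d is the 0/1 dummy variable
x1 = np.array([1.0, 2.0, 3.0, 4.0, 1.5, 2.5, 3.5, 4.5])
d = np.array([0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0])
y = 5.0 + 2.0 * x1 + 3.0 * d

# Columns: intercept, quantitative X, dummy
X = np.column_stack([np.ones_like(x1), x1, d])
b0, b1, b2 = np.linalg.lstsq(X, y, rcond=None)[0]

# Same slope b1 for both groups; only the intercept shifts by b2
print(f"group coded 0: Y-hat = {b0:.2f} + {b1:.2f}*X1")
print(f"group coded 1: Y-hat = {b0 + b2:.2f} + {b1:.2f}*X1")
```

The coefficient on the dummy, b2, is the vertical distance between the two lines, which is why this model cannot represent groups whose slopes differ.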

Dummy Variables: Permit the use of qualitative data (e.g., seasonal, class standing, location, gender). 0, 1 coding (nominal data). As part of diagnostic checking, they can also be used to incorporate outliers (i.e., large residuals) and influence measures.

Multiple Regression Models

Interaction Regression Model: Hypothesizes interaction between pairs of X variables. The response to one X variable varies at different levels of another X variable. Contains two-way cross-product terms: Y = β0 + β1x1 + β2x2 + β3x1x2 + ε. Can be combined with other models, e.g., dummy-variable models.

Effect of Interaction: Given Y = β0 + β1X1 + β2X2 + β3X1X2 + ε. Without the interaction term, the effect of X1 on Y is measured by β1. With the interaction term, the effect of X1 on Y is measured by β1 + β3X2. The effect increases as X2 increases (when β3 > 0).

Interaction Example: Y = 1 + 2X1 + 3X2 + 4X1X2. When X2 = 1: Y = 1 + 2X1 + 3(1) + 4X1(1) = 4 + 6X1. When X2 = 0: Y = 1 + 2X1 + 3(0) + 4X1(0) = 1 + 2X1. The effect (slope) of X1 on Y depends on the value of X2. [Figure: Y (axis ticks 4, 8, 12) against X1 (axis ticks 0.5, 1, 1.5), one line for each value of X2.]
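The interaction example can be checked numerically. The sketch below hard-codes the coefficients of the example equation; the function names `y_hat` and `slope_x1` are made up for illustration.

```python
def y_hat(x1, x2):
    """Example model: Y = 1 + 2*X1 + 3*X2 + 4*X1*X2."""
    return 1 + 2 * x1 + 3 * x2 + 4 * x1 * x2


def slope_x1(x2):
    """Effect of X1 on Y: beta1 + beta3*X2, with beta1 = 2 and beta3 = 4."""
    return 2 + 4 * x2


for x2 in (0, 1):
    intercept = y_hat(0, x2)  # value of Y when X1 = 0
    print(f"X2 = {x2}: Y = {intercept} + {slope_x1(x2)}*X1")
```

The two printed lines have different intercepts and different slopes, which is the difference from the dummy-variable model, where only the intercept changes.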

Multiple Regression Models

Inherently Linear Models: Non-linear models that can be expressed in linear form. Can be estimated by least squares in linear form. Require data transformation.

Curvilinear Model Relationships

Logarithmic Transformation: Y = β0 + β1 ln x1 + β2 ln x2 + ε [Figure: curves for β1 > 0 and β1 < 0.]
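Estimating such a model amounts to transforming the X columns and then running ordinary least squares on the transformed data. The sketch below uses made-up, noise-free data generated from Y = 3 + 2 ln x1 - 1.5 ln x2; NumPy is assumed, and none of the numbers come from the slides.

```python
import numpy as np

# Made-up data: y = 3 + 2*ln(x1) - 1.5*ln(x2), with no noise
x1 = np.array([1.0, 2.0, 4.0, 8.0, 16.0, 32.0])
x2 = np.array([3.0, 1.0, 5.0, 2.0, 9.0, 4.0])
y = 3.0 + 2.0 * np.log(x1) - 1.5 * np.log(x2)

# Transform the X variables, then fit a model that is linear in the transformed columns
X = np.column_stack([np.ones_like(x1), np.log(x1), np.log(x2)])
b = np.linalg.lstsq(X, y, rcond=None)[0]
print(np.round(b, 4))
```

The model is non-linear in x1 and x2 but linear in the coefficients, which is what makes it "inherently linear" and lets least squares estimate it directly.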

Square-Root Transformation [Figure: curves for β1 > 0 and β1 < 0.]

Reciprocal Transformation [Figure: curves for β1 < 0 and β1 > 0, each approaching an asymptote.]

Exponential Transformation [Figure: curves for β1 > 0 and β1 < 0.]

Overview: Explained the linear multiple regression model. Interpreted linear multiple regression computer output. Explained multicollinearity. Described the types of multiple regression models.

Source of Elaborate Slides: Prentice Hall, Inc.; Levine, et al., First Edition

Regression Analysis [Multiple Regression] *** End of Presentation *** Questions?