1
BA 201 Lecture 14: Multiple Regression Model
2
Topics
Developing the Multiple Linear Regression
Inferences on Population Regression Coefficients
Pitfalls in Multiple Regression and Ethical Issues
3
The Multiple Regression Model
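The relationship between 1 dependent and 2 or more independent variables is a linear function.
Population model: $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \dots + \beta_k X_{ki} + \varepsilon_i$, where $\beta_0$ is the population Y-intercept, $\beta_1, \dots, \beta_k$ are the population slopes, and $\varepsilon_i$ is the random error.
Sample model: $Y_i = b_0 + b_1 X_{1i} + b_2 X_{2i} + \dots + b_k X_{ki} + e_i$, where $Y_i$ is the dependent (response) variable, $X_{1i}, \dots, X_{ki}$ are the independent (explanatory) variables, and $e_i$ is the residual.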
4
Simple Linear Regression Model Revisited
[Figure: plot of Y against X with an observed value marked.]
5
Population Multiple Regression Model
Bivariate model (2 Independent Variables: X1 and X2)
6
Sample Multiple Regression Model
Bivariate model
[Figure: sample regression plane.]
7
Multiple Linear Regression Equation
Too complicated by hand! Ouch!
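Since the slide notes that the coefficients are too tedious to compute by hand, here is a minimal sketch of what software does behind the scenes: it solves the normal equations (X'X) b = X'y. The function names `solve_linear_system` and `fit_ols` and the data are hypothetical, chosen only for illustration; this is not the PHStat implementation.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so that row operations act on both sides at once.
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("X'X is singular; the X variables may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(xs, y):
    """Return [b0, b1, ..., bk] from the normal equations (X'X) b = X'y."""
    design = [[1.0] + list(row) for row in xs]  # leading 1 for the intercept
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Hypothetical data lying exactly on the plane Y = 10 - 2*X1 + 3*X2,
# so the fit should recover those coefficients up to rounding error.
xs = [(1, 2), (2, 1), (3, 5), (4, 3), (5, 8), (6, 4)]
y = [10 - 2 * x1 + 3 * x2 for x1, x2 in xs]
b0, b1, b2 = fit_ols(xs, y)
print(round(b0, 6), round(b1, 6), round(b2, 6))
```

With real data the points do not lie exactly on a plane, and the same computation returns the least-squares estimates b0, b1 and b2.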
8
Multiple Regression Model: Example
Develop a model for estimating the heating oil used by a single-family home in the month of January, based on average temperature (°F) and amount of insulation (inches).
9
Multiple Regression in PHStat
PHStat | Regression | Multiple Regression …
Excel spreadsheet for the heating oil example.
10
Sample Multiple Regression Equation: Example
Excel output.
For each one-degree increase in temperature, the estimated average amount of heating oil used decreases by |b1| gallons, holding insulation constant.
For each one-inch increase in insulation, the estimated average use of heating oil decreases by |b2| gallons, holding temperature constant.
11
Interpretation of Estimated Coefficients
Slope (bi): the average value of Y is estimated to change by bi for each 1-unit increase in Xi, holding all other variables constant (ceteris paribus).
Example: if b1 = -2, then fuel oil usage (Y) is expected to decrease by an estimated 2 gallons for each 1-degree increase in temperature (X1), holding the inches of insulation (X2) constant.
Y-intercept (b0): the estimated average value of Y when all Xi = 0.
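The "holding all other variables constant" reading can be checked numerically. The sketch below uses b1 = -2 from the slide's example; the values of b0 and b2 and the helper `predict` are hypothetical and serve only to make the arithmetic concrete.

```python
def predict(b, x):
    """Evaluate b0 + b1*x1 + ... + bk*xk for one observation."""
    return b[0] + sum(bi * xi for bi, xi in zip(b[1:], x))


# b1 = -2 is the slide's example; b0 = 500 and b2 = -15 are made-up values.
b = [500.0, -2.0, -15.0]

base = predict(b, [30, 6])    # 30 degrees, 6 inches of insulation
warmer = predict(b, [31, 6])  # one degree warmer, insulation unchanged
change = warmer - base        # equals b1, whatever b0 and b2 are

intercept = predict(b, [0, 0])  # estimated Y when every Xi is 0, i.e. b0
print(change, intercept)
```

The difference between the two predictions equals b1 regardless of the values chosen for b0 and b2, which is what the ceteris paribus interpretation asserts.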
12
Simple and Multiple Regression Compared
Coefficients in a simple regression pick up the impact of that variable plus the impacts of other variables that are correlated with it and with the dependent variable but are excluded from the model.
Coefficients in a multiple regression net out the impacts of the other variables in the equation. Hence they are called the net regression coefficients.
They still pick up the effects of other variables that are excluded from the model but are correlated with the included variables and the dependent variable.
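A small numerical sketch may make the distinction concrete. The data below are hypothetical: Y is built exactly as 100 - 2*X1 - 5*X2, and X1 and X2 are positively correlated. The helper names are illustrative; the two-variable slopes use the standard closed-form solution of the centered normal equations.

```python
def centered_cross(u, v):
    """Sum of cross-products of deviations from the means."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


def simple_slope(x, y):
    """Slope of the simple regression of y on x."""
    return centered_cross(x, y) / centered_cross(x, x)


def two_variable_slopes(x1, x2, y):
    """Slopes b1, b2 of the regression of y on x1 and x2."""
    s11, s22 = centered_cross(x1, x1), centered_cross(x2, x2)
    s12 = centered_cross(x1, x2)
    s1y, s2y = centered_cross(x1, y), centered_cross(x2, y)
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


# Hypothetical, correlated explanatory variables and an exact linear response.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [1, 1, 2, 3, 3, 5]
y = [100 - 2 * a - 5 * b for a, b in zip(x1, x2)]

b1_simple = simple_slope(x1, y)               # also absorbs the effect of x2
b1_net, b2_net = two_variable_slopes(x1, x2, y)
print(b1_simple, b1_net, b2_net)
```

Here the simple-regression slope on X1 is far more negative than -2 because it also carries the effect of the omitted X2, while the multiple regression recovers the net coefficients -2 and -5.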
13
Simple and Multiple Regression Compared:Example
Two simple regressions (Oil on Temp; Oil on Insulation) compared with the multiple regression (Oil on Temp and Insulation).
14
Simple and Multiple Regression Compared: Excel Output
15
Simple and Multiple Regression Compared: Excel Output
16
Venn Diagrams and Explanatory Power of a Simple Regression
[Venn diagram: overlapping circles for Oil and Temp.]
Oil only: variation in Oil explained by the error term.
Temp only: variation in Temp not used in explaining variation in Oil.
Overlap: variation in Oil explained by Temp, or equivalently the variation in Temp used in explaining variation in Oil.
17
Venn Diagrams and Explanatory Power of a Simple Regression
(continued)
[Venn diagram: Oil and Temp.]
18
Venn Diagrams and Explanatory Power of a Multiple Regression
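[Venn diagram: overlapping circles for Oil, Temp, and Insulation.]
Part of Oil outside both other circles: variation NOT explained by Temp or Insulation.
Region where all three overlap: the overlapping variation in both Temp and Insulation is used in explaining the variation in Oil but NOT in the estimation of $b_1$ or $b_2$.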
19
Coefficient of Multiple Determination
Proportion of total variation in Y explained by all X variables taken together: $r^2 = SSR / SST$.
Never decreases when a new X variable is added to the model, which is a disadvantage when comparing models.
20
Venn Diagrams and Explanatory Power of Regression
[Venn diagram: Oil, Temp, and Insulation.]
21
Adjusted Coefficient of Multiple Determination
Proportion of variation in Y explained by all X variables, adjusted for the number of X variables used and the sample size: $r^2_{adj} = 1 - (1 - r^2)\frac{n - 1}{n - k - 1}$.
Penalizes excessive use of independent variables.
Smaller than $r^2$.
Useful in comparing among models.
Could decrease if an insignificant new X variable is added to the model.
22
Coefficient of Multiple Determination
Excel output: adjusted r2 reflects the number of explanatory variables and the sample size, and is smaller than r2.
23
Interpretation of Coefficient of Multiple Determination
96.56% of the total variation in heating oil use can be explained by the variation in temperature and in the amount of insulation.
95.99% of the total variation in heating oil use can be explained by the variation in temperature and in the amount of insulation, after adjusting for the number of explanatory variables and the sample size.
24
Example: Adjusted r2 Can Decrease
Adjusted r2 decreases when k increases from 2 to 3.
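The adjustment can be sketched with the standard formula, adjusted r2 = 1 - (1 - r2)(n - 1)/(n - k - 1). The sample size n = 15 is an inference from the error degrees of freedom reported later in the deck (n - k - 1 = 12 with k = 2), and the r2 value for the three-variable model is hypothetical; the slide's actual third variable and its output are not reproduced here.

```python
def adjusted_r2(r2, n, k):
    """Adjusted coefficient of multiple determination."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


n = 15  # assumed: inferred from df = n - k - 1 = 12 with k = 2

# Two explanatory variables, r2 = 0.9656 as reported for the heating oil model.
adj_two = adjusted_r2(0.9656, n, 2)

# Hypothetical third variable that barely raises r2 (0.9656 -> 0.9657).
adj_three = adjusted_r2(0.9657, n, 3)

print(round(adj_two, 4), round(adj_three, 4))
```

With k = 2 the result rounds to 0.9599, matching the 95.99% quoted for the heating oil model. In the hypothetical k = 3 case r2 rises slightly while adjusted r2 falls, because the penalty factor (n - 1)/(n - k - 1) grows faster than the unexplained share shrinks.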
25
Using The Model to Make Predictions
Predict the amount of heating oil used for a home if the average temperature is 30°F and the insulation is 6 inches.
The predicted heating oil used, in gallons, is found by substituting X1 = 30 and X2 = 6 into the sample regression equation.
26
Predictions in PHStat
PHStat | Regression | Multiple Regression …
Check the “Confidence and Prediction Interval Estimate” box.
Excel spreadsheet for the heating oil example.
27
Another Example
The Excel spreadsheet that contains the multiple regression result of regressing mid-term scores on quiz scores and attendance scores.
28
Residual Plots
Residuals vs. predicted Y: may need to transform the Y variable.
Residuals vs. X1: may need to transform the X1 variable.
Residuals vs. X2: may need to transform the X2 variable.
Residuals vs. time: may have autocorrelation.
29
Residual Plots: Example
[Figure: two residual plots, one annotated "Maybe some non-linear relationship" and the other "No discernible pattern".]
30
Testing for Overall Significance
Shows if there is a linear relationship between all of the X variables together and Y.
Shows if Y depends linearly on all of the X variables together as a group.
Use the F test statistic.
Hypotheses:
$H_0: \beta_1 = \beta_2 = \dots = \beta_k = 0$ (no linear relationship)
$H_1$: at least one $\beta_i \neq 0$ (at least one independent variable affects Y)
The null hypothesis is a very strong statement; it is almost always rejected.
31
Testing for Overall Significance
(continued)
Test statistic: $F = \frac{MSR}{MSE} = \frac{SSR / k}{SSE / (n - k - 1)}$, where F has k numerator and (n - k - 1) denominator degrees of freedom.
32
Test for Overall Significance Excel Output: Heating Oil Example
[Excel ANOVA output, annotated: p-value; regression df = k = 2, the number of explanatory variables; total df = n - 1.]
33
Test for Overall Significance Example Solution
$H_0: \beta_1 = \beta_2 = \dots = \beta_k = 0$ (here k = 2)
$H_1$: at least one $\beta_i \neq 0$
$\alpha = .05$; df = 2 and 12
Critical value: F = 3.89
Test statistic: F = 168.47 (Excel output)
Decision: reject $H_0$ at $\alpha = 0.05$.
Conclusion: there is evidence that at least one independent variable affects Y.
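As a rough cross-check, the F statistic can also be written in terms of r2, since SSR/SST = r2 and SSE/SST = 1 - r2: F = (r2/k) / ((1 - r2)/(n - k - 1)). The sketch below assumes n = 15 (from df = 12 with k = 2) and uses the rounded r2 = 0.9656 quoted earlier, so it only approximates the Excel value; the function name `overall_f` is illustrative.

```python
def overall_f(r2, n, k):
    """Overall F statistic expressed through r2."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))


n, k = 15, 2   # n is inferred from df = n - k - 1 = 12
r2 = 0.9656    # rounded value from the deck
f_crit = 3.89  # critical value quoted for df = 2 and 12, alpha = 0.05

f_stat = overall_f(r2, n, k)
reject_h0 = f_stat > f_crit
print(round(f_stat, 2), reject_h0)
```

The result lands close to the 168.47 reported by Excel (the small gap comes from rounding r2) and is far above the critical value, so the decision to reject the null hypothesis is the same.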
34
Test for Significance: Individual Variables
Shows if there is a linear relationship between the variable Xi and Y while holding the effects of the other X's fixed.
Shows if Y depends linearly on a single Xi individually while holding the effects of the other X's fixed.
Use the t test statistic.
Hypotheses:
$H_0: \beta_i = 0$ (no linear relationship)
$H_1: \beta_i \neq 0$ (linear relationship between Xi and Y)
35
t Test Statistic Excel Output: Example
[Excel output, annotated: t test statistic for X1 (Temperature); t test statistic for X2 (Insulation).]
36
t Test : Example Solution
Does temperature have a significant effect on monthly consumption of heating oil? Test at $\alpha = 0.05$.
$H_0: \beta_1 = 0$
$H_1: \beta_1 \neq 0$
df = 12
Critical values: $t = \pm 2.1788$ (rejection regions of area .025 in each tail)
Test statistic: the t test statistic for b1 from the Excel output.
Decision: reject $H_0$ at $\alpha = 0.05$.
Conclusion: there is evidence of a significant effect of temperature on oil consumption.
37
Confidence Interval Estimate for the Slope
Provide the 95% confidence interval for the population slope $\beta_1$ (the effect of temperature on oil consumption).
$b_1 \pm t_{n-k-1} S_{b_1}$, giving $-6.17 \le \beta_1 \le -4.7$.
The estimated average consumption of oil is reduced by between 4.7 and 6.17 gallons for each 1°F increase in temperature.
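Because the interval has the form b1 ± t × S_b1, its endpoints can be worked backwards to approximate the point estimate, the standard error and the t statistic. This is only a sketch based on the rounded endpoints quoted on the slide (4.7 and 6.17 gallons) and the critical value 2.1788 from the t test; the variable names are illustrative and the results are approximations, not the Excel output.

```python
t_crit = 2.1788              # critical t for df = 12, two-tailed alpha = 0.05
lower, upper = -6.17, -4.70  # rounded interval endpoints for the slope

b1 = (lower + upper) / 2     # midpoint of the interval is the point estimate
margin = (upper - lower) / 2 # half-width equals t_crit * standard error
se_b1 = margin / t_crit
t_stat = b1 / se_b1          # t statistic for H0: beta1 = 0

contains_zero = lower <= 0 <= upper
print(round(b1, 3), round(se_b1, 3), round(t_stat, 1), contains_zero)
```

The implied t statistic is roughly -16, well beyond ±2.1788, and the interval excludes 0. Both observations agree with the earlier decision to reject the null hypothesis for temperature.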
38
Additional Pitfalls and Ethical Issues
Failing to understand that the estimated regression coefficients are interpreted holding all other independent variables constant.
Failing to evaluate residual plots for each independent variable.
39
Summary
Developed the Multiple Regression Model
Addressed Testing the Significance of the Multiple Regression Model
Discussed Inferences on Population Regression Coefficients
Addressed Pitfalls in Multiple Regression and Ethical Issues