Published by Chester Lynch. Modified over 8 years ago.
2
Multiple Regression
3
Learning Objectives
- Explain the Linear Multiple Regression Model
- Interpret Linear Multiple Regression Computer Output
- Test for Overall Significance
- Explain Multicollinearity
- Describe the Types of Multiple Regression Models
4
Multiple Regression Models
[Diagram: model types grouped under Linear and Non-Linear. Types shown: Linear, Dummy Variable, Interaction, Polynomial, Logit, Square Root, Log, Reciprocal, Exponential]
5
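Linear Multiple Regression Model
Relationship Between 1 Dependent & 2 or More Independent Variables Is a Linear Function:
$Y = \alpha + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k + \varepsilon$
- $Y$: Dependent (Response) Variable
- $X_1, \dots, X_k$: Independent (Explanatory) Variables
- $\alpha$: Population Y-Intercept
- $\beta_1, \dots, \beta_k$: Population Slopes
- $\varepsilon$: Random Error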
6
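Population Bivariate Linear Multiple Regression Model
[Figure: response plane above the ($X_1$, $X_2$) plane, with $Y$ on the vertical axis]
- Response Plane: $\mu_{Y|X} = \alpha + \beta_1 X_1 + \beta_2 X_2$
- Observed Y: $Y = \alpha + \beta_1 X_1 + \beta_2 X_2 + \varepsilon$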
7
Sample Bivariate Linear Multiple Regression Model
[Figure: response plane above the ($X_1$, $X_2$) plane, with $Y$ on the vertical axis]
- Response Plane: $\hat{Y} = a + b_1 X_1 + b_2 X_2$
- Observed Y: $Y = a + b_1 X_1 + b_2 X_2 + e$
8
Regression Modeling Steps
1. Define Problem or Question
2. Specify Model
3. Collect Data
4. Do Descriptive Data Analysis
5. Estimate Unknown Parameters
6. Evaluate Model
7. Use Model for Prediction
9
Multiple Linear Regression Coefficient Equations
Too Complicated By Hand! Ouch!
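Although the equations are tedious by hand, they are compact in matrix form. The notation below is the standard least-squares form and is not taken from the slides: $\mathbf{X}$ is assumed to be the $n \times (k+1)$ matrix whose first column is all 1's and whose remaining columns hold $X_1, \dots, X_k$, and $\mathbf{y}$ is the column of observed $Y$ values.

```latex
\mathbf{b} =
\begin{pmatrix} a \\ b_1 \\ \vdots \\ b_k \end{pmatrix}
= \left(\mathbf{X}^{\top}\mathbf{X}\right)^{-1}\mathbf{X}^{\top}\mathbf{y}
```

This is what regression software solves, usually by a numerically safer method than forming the inverse explicitly.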
10
Parameter Estimation Example
You work in advertising for the New York Times. You want to find the effect of ad size (sq. in.) & newspaper circulation (000) on the number of ad responses (00). You've collected the following data:

Resp  Size  Circ
1     1     2
4     8     8
1     3     1
3     5     7
2     6     4
4     10    6
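A minimal sketch of how software might estimate the parameters for this example, assuming ordinary least squares solved through the normal equations. The function name `fit_ols` and the Gauss-Jordan solver are illustrative choices, not something the slides specify.

```python
def fit_ols(x_rows, y):
    """Return [a, b1, ..., bk] minimizing the sum of squared errors."""
    rows = [[1.0] + [float(v) for v in r] for r in x_rows]  # leading 1 = intercept
    p = len(rows[0])
    # Augmented normal equations [X'X | X'y]
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(p):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [aug[i][p] for i in range(p)]


# Data from the slide: responses (00), ad size (sq. in.), circulation (000)
resp = [1, 4, 1, 3, 2, 4]
size = [1, 8, 3, 5, 6, 10]
circ = [2, 8, 1, 7, 4, 6]

a, b1, b2 = fit_ols(list(zip(size, circ)), resp)
print(f"a = {a:.4f}, b1 = {b1:.4f}, b2 = {b2:.4f}")
```

The two slopes printed by this sketch round to .2049 and .2805, the values interpreted later in the deck.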
11
Parameter Estimation Computer Output
[Computer output not reproduced; annotations mark the estimates $a$, $b_1$, and $b_2$]
12
Interpretation of Coefficients Solution
- Slope ($b_1 = .2049$): For each 1 sq. in. increase in ad size, the number of responses to the ad is expected to increase by 20.49 (.2049 hundred), holding circulation constant.
- Slope ($b_2 = .2805$): For each 1,000-paper (1 unit) increase in circulation, the number of responses to the ad is expected to increase by 28.05 (.2805 hundred), holding ad size constant.
13
Regression Modeling Steps
1. Define Problem or Question
2. Specify Model
3. Collect Data
4. Do Descriptive Data Analysis
5. Estimate Unknown Parameters
6. Evaluate Model
7. Use Model for Prediction
14
Evaluating the Model
- How Well Does the Model Describe the Relationship Between the Variables?
- Closeness of 'Best Fit'
- Assumptions Met
- Significance of Parameter Estimates
- Correlation Between X Variables
- Outliers (Unusual Observations)
15
Evaluating Multiple Regression Model Steps
- Examine Variation Measures
- Do Residual Analysis
- Test Parameter Significance
  - Overall Model
  - Individual Coefficients
- Test for Multicollinearity
- Do Influence Analysis
[Slide callouts: "New!", "Expanded!"]
16
Coefficient of Multiple Determination
- Proportion of Variation in Y 'Explained' by All X Variables Taken Together
- $r^2_{Y.12} = \dfrac{\text{Explained Variation}}{\text{Total Variation}} = \dfrac{SSR}{SST}$
- Never Decreases When New X Variable Is Added to Model
  - Only Y Values Determine SST
  - Disadvantage When Comparing Models
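To illustrate why SSR/SST cannot decrease when an X variable is added, the sketch below compares the ad-response model with Size alone against the model with Size and Circ. It assumes ordinary least squares and uses the closed-form two-predictor solution from sums of cross-products; the helper names are hypothetical.

```python
def cross(u, v):
    """Sum of cross-products of deviations from the means."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v))


def r2_one_predictor(y, x1):
    """SSR/SST for the simple regression of y on x1."""
    return cross(x1, y) ** 2 / (cross(x1, x1) * cross(y, y))


def r2_two_predictors(y, x1, x2):
    """SSR/SST for the regression of y on x1 and x2."""
    s11, s22, s12 = cross(x1, x1), cross(x2, x2), cross(x1, x2)
    s1y, s2y = cross(x1, y), cross(x2, y)
    det = s11 * s22 - s12 ** 2
    b1 = (s1y * s22 - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    ssr = b1 * s1y + b2 * s2y  # explained variation
    sst = cross(y, y)          # total variation: depends on y only
    return ssr / sst


# Ad-response data: responses (00), ad size (sq. in.), circulation (000)
resp = [1, 4, 1, 3, 2, 4]
size = [1, 8, 3, 5, 6, 10]
circ = [2, 8, 1, 7, 4, 6]

r2_size = r2_one_predictor(resp, size)
r2_both = r2_two_predictors(resp, size, circ)
print(f"Size only: {r2_size:.4f}   Size + Circ: {r2_both:.4f}")
```

SST is identical in both fits because it depends only on the Y values, so any change in the ratio comes from SSR alone.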
17
Adjusted Coefficient of Multiple Determination
- Proportion of Variation in Y 'Explained' by All X Variables Taken Together
- Reflects
  - Sample Size
  - Number of Independent Variables
- Smaller Than $r^2_{Y.12}$
- Used to Compare Models
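The slide does not give the formula. The usual definition, stated here as an assumption about what the software reports, with $n$ observations and $k$ independent variables, is:

```latex
r^2_{\text{adj}} = 1 - \left(1 - r^2_{Y.12}\right)\frac{n - 1}{n - k - 1}
```

Because $(n - 1)/(n - k - 1) > 1$ whenever $k \ge 1$, the adjusted value falls below $r^2_{Y.12}$ unless the fit is perfect. For the ad-response example, $n = 6$ and $k = 2$, so the multiplier is $5/3$.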
19
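Coefficient of Determination Computer Output
[Computer output not reproduced]
$r^2_{\text{adj}}$ means 95.61% of the variation in Y is 'explained' by ad size & circulation, after adjusting for sample size and the number of X variables.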
20
Coefficients of Correlation
21
Correlation Matrix Computer Output
[Computer output not reproduced; annotations mark $r_{Y1}$, $r_{Y2}$, and $r_{12}$, and the diagonal entries are all 1's]
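A correlation matrix like the one annotated on this slide can be sketched as follows for the ad-response data. The `pearson` helper is an illustrative hand-rolled Pearson correlation, assumed to match what the software computes.

```python
from math import sqrt


def pearson(u, v):
    """Pearson correlation coefficient between two equal-length lists."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / sqrt(suu * svv)


# Ad-response data: responses (00), ad size (sq. in.), circulation (000)
data = {
    "Resp": [1, 4, 1, 3, 2, 4],
    "Size": [1, 8, 3, 5, 6, 10],
    "Circ": [2, 8, 1, 7, 4, 6],
}
names = list(data)
corr = [[pearson(data[r], data[c]) for c in names] for r in names]

for name, row in zip(names, corr):
    print(f"{name:>5}", "  ".join(f"{v:6.3f}" for v in row))
```

Here `corr[0][1]` plays the role of $r_{Y1}$, `corr[0][2]` of $r_{Y2}$, and `corr[1][2]` of $r_{12}$.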
22
Evaluating Multiple Regression Model Steps
1. Examine Variation Measures
2. Do Residual Analysis
3. Test Parameter Significance
   - Overall Model
   - Individual Coefficients
4. Test for Multicollinearity
5. Do Influence Analysis
[Slide callouts: "New!", "Expanded!"]
23
Testing Overall Significance
- Tests If There Is a Linear Relationship Between All X Variables Together & Y
- Uses F Test Statistic
- Hypotheses
  - $H_0: \beta_1 = \beta_2 = \dots = \beta_k = 0$ (No Linear Relationship)
  - $H_1$: At Least One Coefficient Is Not 0 (At Least One X Variable Affects Y)
24
Analysis of Variance

Source of Variation   Sum of Squares   Degrees of Freedom    Mean Square
Regression            SSR              ν1 = k                MSR = SSR / k
Residual (Error)      SSE              ν2 = n - k - 1        MSE = SSE / (n - k - 1)
Total                 SST              n - 1
25
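[Computer output not reproduced; annotations mark the degrees of freedom $k$ and $n - k - 1$, the test statistic $F = MSR/MSE$, and the P-value]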
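The overall F test can be sketched for the ad-response example as follows. The slopes are the rounded values reported in the deck (.2049 and .2805), and the intercept is recovered from the sample means, which is a property of least squares with an intercept. The closed-form P-value is a special case that holds only when the numerator degrees of freedom equal 2; for other values of k an F-distribution routine would be needed.

```python
# Ad-response data: responses (00), ad size (sq. in.), circulation (000)
resp = [1, 4, 1, 3, 2, 4]
size = [1, 8, 3, 5, 6, 10]
circ = [2, 8, 1, 7, 4, 6]

b1, b2 = 0.2049, 0.2805  # rounded slopes as reported in the deck


def mean(values):
    return sum(values) / len(values)


n, k = len(resp), 2
a = mean(resp) - b1 * mean(size) - b2 * mean(circ)  # least-squares intercept

fitted = [a + b1 * s + b2 * c for s, c in zip(size, circ)]
sst = sum((y - mean(resp)) ** 2 for y in resp)       # total variation
sse = sum((y - f) ** 2 for y, f in zip(resp, fitted))  # unexplained variation
ssr = sst - sse  # explained variation (approximate: slopes are rounded)

msr = ssr / k            # df1 = k
mse = sse / (n - k - 1)  # df2 = n - k - 1
f_stat = msr / mse

# Upper-tail P-value of F(2, df2); valid ONLY because df1 = k = 2 here
df2 = n - k - 1
p_value = (1 + 2 * f_stat / df2) ** (-df2 / 2)

print(f"F = {f_stat:.2f} on ({k}, {df2}) df, P-value = {p_value:.4f}")
```

A small P-value leads to rejecting $H_0$, that is, concluding that at least one X variable is linearly related to Y.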
26
Evaluating Multiple Regression Model Steps
1. Examine Variation Measures
2. Do Residual Analysis
3. Test Parameter Significance
   - Overall Model
   - Individual Coefficients
4. Test for Multicollinearity
5. Do Influence Analysis
[Slide callouts: "New!", "Expanded!"]
27
Multicollinearity
- High Correlation Between X Variables
- Coefficients Measure Combined Effect
- Leads to Unstable Coefficients Depending on X Variables in Model
- Always Exists - Matter of Degree
- Example - Using Both Sales & Profit as Explanatory Variables in Same Model
28
When independent variables are highly correlated, odd results may occur and the net regression coefficients become unreliable.
29
Odd Things Happen
- Examine Correlation Matrix
  - Correlations Between Pairs of X Variables Are More than With Y Variable
- Sign of slope changes from simple to multiple regression equation
- Model passes F-test, but not individual t-tests
- Correlation matrix shows different sign than net coefficient (slope)
30
Detecting Multicollinearity
- Examine Correlation Matrix
  - Correlations Between Pairs of X Variables Are More than With Y Variable
  - Rule of Thumb: Potential problem if r > .7 for any 2 independent variables
- Few Remedies
  - Obtain New Sample Data
  - Eliminate One Correlated X Variable
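The rule of thumb can be sketched as a scan over the correlation matrix of the X variables. The function name and the correlation values below are hypothetical, chosen only for illustration. Comparing the absolute value of r to the threshold is an assumption beyond the slide's wording, made so that strong negative correlations are flagged as well.

```python
from itertools import combinations


def flag_collinear_pairs(names, corr, threshold=0.7):
    """Return (name_i, name_j, r) for each pair of X variables with |r| > threshold."""
    flagged = []
    for i, j in combinations(range(len(names)), 2):
        if abs(corr[i][j]) > threshold:
            flagged.append((names[i], names[j], corr[i][j]))
    return flagged


# Hypothetical correlations among three X variables (illustrative values only)
names = ["X1", "X2", "X3"]
corr = [
    [1.00, 0.85, 0.10],
    [0.85, 1.00, -0.75],
    [0.10, -0.75, 1.00],
]

flagged = flag_collinear_pairs(names, corr)
for first, second, r in flagged:
    print(f"Potential problem: {first} and {second} (r = {r:+.2f})")
```

A flagged pair is only a prompt to look closer; the remedies listed on the slide still require judgment about which variable to keep.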
31
Age Experience Salary
35
Possible Solutions
- Delete one or more of the correlated variables
  - Drop variable if its t statistic for its net regression coefficient < 1
  - Drop the variable if $R_c^2$ increases upon its deletion
- Change form of one or more of the independent variables
  - Change actual salary to real salary
  - Divide income by population for a per capita income
- Exclude variable with lower correlation with y
40
Multiple Regression Models
[Diagram: model types grouped under Linear and Non-Linear. Types shown: Linear, Dummy Variable, Interaction, Polynomial, Logit, Square Root, Log, Reciprocal, Exponential]
41
Dummy-Variable Regression Model
1. Involves Categorical X Variable with 2 Levels (e.g., Male-Female; College-No College)
2. Variable Levels Coded 0 & 1
3. Assumes Only Intercept Is Different; Slopes Are Constant Across Categories
4. Dummy-Variable Model: $Y = \alpha + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k + \varepsilon$
42
Dummy-Variable Model Relationships
[Figure: $Y$ plotted against $X_1$ with one line for Category 1 and one for Category 2; Same Slope, Different Y-Intercepts]
43
Dummy-Variable Model Worksheet
X2 Levels: 0 = Group 1, 1 = Group 2. Run Regression with Y, X1, X2.

Case  Y   X1  X2
1     1   1   1
2     4   8   0
3     1   3   1
4     3   5   1
:     :   :   :
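Before the regression is run, the categorical variable has to be coded 0/1. A minimal sketch of that step, assuming the labels "Group 1" and "Group 2" from the worksheet; the function name `encode_dummy` is hypothetical.

```python
def encode_dummy(labels, reference):
    """Code the reference level as 0 and the other level as 1."""
    levels = sorted(set(labels))
    if len(levels) != 2 or reference not in levels:
        raise ValueError("expected exactly two levels, one of them the reference")
    return [0 if label == reference else 1 for label in labels]


# Group membership of the first four worksheet cases
groups = ["Group 2", "Group 1", "Group 2", "Group 2"]
x2 = encode_dummy(groups, reference="Group 1")
print(x2)  # [1, 0, 1, 1]
```

The resulting column is then used as X2 alongside Y and X1 in an ordinary multiple regression.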
44
Interpreting Dummy-Variable Model Equation
Given: $\hat{Y} = b_0 + b_1 X_1 + b_2 X_2$, where $Y$ = Starting Salary of College Grads, $X_1$ = GPA, and $X_2$ = 0 if Male, 1 if Female.
- Males ($X_2 = 0$): $\hat{Y} = b_0 + b_1 X_1 + b_2(0) = b_0 + b_1 X_1$
- Females ($X_2 = 1$): $\hat{Y} = b_0 + b_1 X_1 + b_2(1) = (b_0 + b_2) + b_1 X_1$
Same Slopes
45
Dummy-Variable Model Example
Computer Output: $\hat{Y} = 3 + 5X_1 + 7X_2$, where $X_2$ = 0 if Male, 1 if Female.
- Males ($X_2 = 0$): $\hat{Y} = 3 + 5X_1 + 7(0) = 3 + 5X_1$
- Females ($X_2 = 1$): $\hat{Y} = 3 + 5X_1 + 7(1) = (3 + 7) + 5X_1 = 10 + 5X_1$
Same Slopes
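The fitted equation from this example can be checked numerically. The sketch below assumes the coding X2 = 0 for male and 1 for female, as inferred from the order of the two equations on the slide. The GPA values are hypothetical inputs, and the slide does not state the salary units.

```python
def predict_salary(gpa, female):
    """Fitted value from the slide's output: Y-hat = 3 + 5*X1 + 7*X2."""
    x2 = 1 if female else 0  # assumed coding: 0 = male, 1 = female
    return 3 + 5 * gpa + 7 * x2


for gpa in (2.0, 3.0, 4.0):  # hypothetical GPA values
    male = predict_salary(gpa, female=False)
    fem = predict_salary(gpa, female=True)
    print(f"GPA {gpa:.1f}: male {male:.1f}, female {fem:.1f}, gap {fem - male:.1f}")
```

The gap between the two groups stays at 7 for every GPA, and each one-point rise in GPA adds 5 for both groups, which is the "same slopes, different intercepts" pattern.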
46
Conclusion
- Explained the Linear Multiple Regression Model
- Interpreted Linear Multiple Regression Computer Output
- Explained Multicollinearity
- Described the Types of Multiple Regression Models