Multiple Regression. Learning Objectives: Explain the Linear Multiple Regression Model; Interpret Linear Multiple Regression Computer Output; Test.

1

2 Multiple Regression

3 Learning Objectives
- Explain the Linear Multiple Regression Model
- Interpret Linear Multiple Regression Computer Output
- Test for Overall Significance
- Explain Multicollinearity
- Describe the Types of Multiple Regression Models

4 Multiple Regression Models
[Diagram: types of multiple regression models, grouped as Linear and Non-Linear. Types shown: Linear, Dummy Variable, Interaction, Polynomial, Logit, Square Root, Log, Reciprocal, Exponential.]

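5 Linear Multiple Regression Model
The relationship between 1 dependent and 2 or more independent variables is a linear function:
$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k + \varepsilon$
- $Y$: dependent (response) variable
- $X_1, \dots, X_k$: independent (explanatory) variables
- $\beta_1, \dots, \beta_k$: population slopes
- $\beta_0$: population Y-intercept
- $\varepsilon$: random error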

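6 Population Bivariate Linear Multiple Regression Model
[Figure: response plane above the $(X_1, X_2)$ plane, with $Y$ on the vertical axis and intercept $\beta_0$.]
Response plane: $\mu_{Y|X} = \beta_0 + \beta_1 X_1 + \beta_2 X_2$
Observed Y: $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \varepsilon$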

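7 Sample Bivariate Linear Multiple Regression Model
[Figure: response plane above the $(X_1, X_2)$ plane, with $Y$ on the vertical axis; the intercept is labeled $b_0$ in the figure and written $a$ in the equations.]
Response plane: $\hat{Y} = a + b_1 X_1 + b_2 X_2$
Observed Y: $Y = a + b_1 X_1 + b_2 X_2 + e$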

8 Regression Modeling Steps
1. Define Problem or Question
2. Specify Model
3. Collect Data
4. Do Descriptive Data Analysis
5. Estimate Unknown Parameters
6. Evaluate Model
7. Use Model for Prediction

9 Multiple Linear Regression Coefficient Equations Too Complicated By Hand! Ouch!
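
Although the scalar formulas are unwieldy, the least-squares estimates have a compact matrix form, which is essentially what regression software evaluates. This is a standard result rather than something stated on the slide. The design matrix $X$ (a column of ones followed by the $k$ predictor columns) and the response vector $\mathbf{y}$ are notation introduced here for illustration.

```latex
\mathbf{b} = (X^{\top} X)^{-1} X^{\top} \mathbf{y},
\qquad
\mathbf{b} = (a,\; b_1,\; \dots,\; b_k)^{\top}
```

The inverse exists only when no predictor is an exact linear combination of the others, which is the extreme case of the multicollinearity discussed later in the deck.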

10 Parameter Estimation Example
You work in advertising for the New York Times. You want to find the effect of ad size (sq. in.) and newspaper circulation (000) on the number of ad responses (00). You've collected the following data:

Resp  Size  Circ
1     1     2
4     8     8
1     3     1
3     5     7
2     6     4
4     10    6
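
A minimal sketch of how the estimates for this example could be computed, using only the Python standard library. It uses the two-predictor least-squares formulas written in terms of deviation sums. The helper names `mean`, `cross`, and `fit_two_predictors` are illustrative choices, not part of the presentation.

```python
# Data from the slide: responses (00), ad size (sq. in.), circulation (000).
resp = [1, 4, 1, 3, 2, 4]
size = [1, 8, 3, 5, 6, 10]
circ = [2, 8, 1, 7, 4, 6]


def mean(values):
    return sum(values) / len(values)


def cross(u, v):
    """Sum of cross-products of deviations from the means."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


def fit_two_predictors(y, x1, x2):
    """Least-squares a, b1, b2 for Y = a + b1*X1 + b2*X2 + e."""
    s11, s22, s12 = cross(x1, x1), cross(x2, x2), cross(x1, x2)
    s1y, s2y = cross(x1, y), cross(x2, y)
    det = s11 * s22 - s12 ** 2  # zero only if X1 and X2 are perfectly collinear
    b1 = (s1y * s22 - s12 * s2y) / det
    b2 = (s2y * s11 - s12 * s1y) / det
    a = mean(y) - b1 * mean(x1) - b2 * mean(x2)
    return a, b1, b2


a, b1, b2 = fit_two_predictors(resp, size, circ)
print(f"a = {a:.4f}, b1 = {b1:.4f}, b2 = {b2:.4f}")
```

On this data the slopes round to .2049 and .2805, the values interpreted later in the deck.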

11 Parameter Estimation Computer Output
[Regression output; callouts mark the estimates $a$, $b_1$, and $b_2$.]

12 Interpretation of Coefficients Solution
- Slope ($b_1 = .2049$): for each 1 sq. in. increase in ad size, the number of responses to the ad is expected to increase by 20.49 (.2049 hundred), holding circulation constant.
- Slope ($b_2 = .2805$): for each 1,000-paper (1 unit) increase in circulation, the number of responses to the ad is expected to increase by 28.05 (.2805 hundred), holding ad size constant.

13 Regression Modeling Steps
1. Define Problem or Question
2. Specify Model
3. Collect Data
4. Do Descriptive Data Analysis
5. Estimate Unknown Parameters
6. Evaluate Model
7. Use Model for Prediction

14 Evaluating the Model
- How Well Does the Model Describe the Relationship Between the Variables?
- Closeness of 'Best Fit'
- Assumptions Met
- Significance of Parameter Estimates
- Correlation Between X Variables
- Outliers (Unusual Observations)

15 Evaluating Multiple Regression Model Steps
- Examine Variation Measures
- Do Residual Analysis
- Test Parameter Significance
  - Overall Model
  - Individual Coefficients
- Test for Multicollinearity
- Do Influence Analysis

16 Coefficient of Multiple Determination
- Proportion of Variation in Y 'Explained' by All X Variables Taken Together
- $r^2_{Y.12} = \dfrac{\text{Explained Variation}}{\text{Total Variation}} = \dfrac{SSR}{SST}$
- Never Decreases When New X Variable Is Added to Model
  - Only Y Values Determine SST
  - Disadvantage When Comparing Models

17 Adjusted Coefficient of Multiple Determination
- Proportion of Variation in Y 'Explained' by All X Variables Taken Together
- Reflects
  - Sample Size
  - Number of Independent Variables
- Smaller Than $r^2_{Y.12}$
- Used to Compare Models

18

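19 Coefficient of Determination Computer Output
[Regression output; callout marks $r^2_{adj}$.]
$r^2_{adj}$ means that 95.61% of the variation in Y is explained by ad size and circulation, after adjusting for the sample size and the number of independent variables.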
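
As a rough check on the 95.61% figure, the sketch below recomputes the coefficient of multiple determination and its adjusted version from the ad-response data on slide 10. The adjustment used, $1 - (1 - r^2)(n - 1)/(n - k - 1)$, is the usual textbook formula and is assumed here because the slides do not spell it out. Helper names are illustrative.

```python
# Data from slide 10: responses (00), ad size (sq. in.), circulation (000).
resp = [1, 4, 1, 3, 2, 4]
size = [1, 8, 3, 5, 6, 10]
circ = [2, 8, 1, 7, 4, 6]
n, k = len(resp), 2  # k = number of independent variables


def mean(values):
    return sum(values) / len(values)


def cross(u, v):
    """Sum of cross-products of deviations from the means."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


# Two-predictor least-squares estimates.
s11, s22, s12 = cross(size, size), cross(circ, circ), cross(size, circ)
s1y, s2y = cross(size, resp), cross(circ, resp)
det = s11 * s22 - s12 ** 2
b1 = (s1y * s22 - s12 * s2y) / det
b2 = (s2y * s11 - s12 * s1y) / det
a = mean(resp) - b1 * mean(size) - b2 * mean(circ)

# Variation measures.
fitted = [a + b1 * x1 + b2 * x2 for x1, x2 in zip(size, circ)]
sst = cross(resp, resp)                                   # total variation
sse = sum((y - f) ** 2 for y, f in zip(resp, fitted))     # unexplained
ssr = sst - sse                                           # explained

r2 = ssr / sst
r2_adj = 1 - (1 - r2) * (n - 1) / (n - k - 1)
print(f"r2 = {r2:.4f}, adjusted r2 = {r2_adj:.4f}")
```

The adjusted value rounds to .9561, matching the computer output, and is smaller than the unadjusted one, as slide 17 states.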

20 Coefficients of Correlation

21 Correlation Matrix Computer Output
[Correlation matrix output; callouts mark $r_{Y1}$, $r_{Y2}$, and $r_{12}$, and the diagonal entries, which are all 1's.]
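
A small sketch of how such a correlation matrix could be produced for the ad-response data on slide 10, assuming the ordinary Pearson correlation. The function name `pearson` and the variable labels are illustrative.

```python
from math import sqrt

# Data from slide 10: responses (00), ad size (sq. in.), circulation (000).
resp = [1, 4, 1, 3, 2, 4]
size = [1, 8, 3, 5, 6, 10]
circ = [2, 8, 1, 7, 4, 6]


def pearson(u, v):
    """Pearson correlation coefficient between two equal-length lists."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / sqrt(suu * svv)


names = ["Resp", "Size", "Circ"]
columns = [resp, size, circ]
matrix = [[pearson(u, v) for v in columns] for u in columns]

# Print the matrix with row and column labels.
print("      " + "  ".join(f"{name:>6}" for name in names))
for name, row in zip(names, matrix):
    print(f"{name:>6}" + "  ".join(f"{r:6.3f}" for r in row))
```

The off-diagonal entries in the first row are $r_{Y1}$ and $r_{Y2}$. The Size/Circ entry is $r_{12}$, the correlation between the two independent variables.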

22 Evaluating Multiple Regression Model Steps
1. Examine Variation Measures
2. Do Residual Analysis
3. Test Parameter Significance
   - Overall Model
   - Individual Coefficients
4. Test for Multicollinearity
5. Do Influence Analysis

23 Testing Overall Significance
- Tests If There Is a Linear Relationship Between All X Variables Together & Y
- Uses F Test Statistic
- Hypotheses
  - $H_0: \beta_1 = \beta_2 = \dots = \beta_k = 0$ (no linear relationship)
  - $H_1$: at least one coefficient is not 0 (at least one X variable affects Y)

24 Analysis of Variance

Source of Variation   Degrees of Freedom    Sum of Squares   Mean Square
Regression            $\nu_1 = k$           SSR              MSR = SSR / k
Residual (Error)      $\nu_2 = n - k - 1$   SSE              MSE = SSE / (n - k - 1)
Total                 $n - 1$               SST

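25 [Analysis of variance computer output; callouts mark the degrees of freedom $k$ and $n - k - 1$, the test statistic F = MSR/MSE, and the P-value.]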
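
The quantities in the analysis of variance table can be sketched in a few lines for the ad-response data on slide 10. This is an illustration using only the standard library, so it stops at the F statistic. The P-value would come from the upper tail of the F distribution with $k$ and $n - k - 1$ degrees of freedom, for example `scipy.stats.f.sf(f_stat, k, n - k - 1)` if SciPy happens to be installed (an assumption, not something the slides use).

```python
# Data from slide 10: responses (00), ad size (sq. in.), circulation (000).
resp = [1, 4, 1, 3, 2, 4]
size = [1, 8, 3, 5, 6, 10]
circ = [2, 8, 1, 7, 4, 6]
n, k = len(resp), 2


def mean(values):
    return sum(values) / len(values)


def cross(u, v):
    """Sum of cross-products of deviations from the means."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


# Two-predictor least-squares estimates.
s11, s22, s12 = cross(size, size), cross(circ, circ), cross(size, circ)
s1y, s2y = cross(size, resp), cross(circ, resp)
det = s11 * s22 - s12 ** 2
b1 = (s1y * s22 - s12 * s2y) / det
b2 = (s2y * s11 - s12 * s1y) / det
a = mean(resp) - b1 * mean(size) - b2 * mean(circ)

# Sums of squares for the ANOVA table.
fitted = [a + b1 * x1 + b2 * x2 for x1, x2 in zip(size, circ)]
sst = cross(resp, resp)
sse = sum((y - f) ** 2 for y, f in zip(resp, fitted))
ssr = sst - sse

# Mean squares and the overall F statistic.
msr = ssr / k              # regression: k degrees of freedom
mse = sse / (n - k - 1)    # residual: n - k - 1 degrees of freedom
f_stat = msr / mse

print(f"Regression  df={k}  SS={ssr:.4f}  MS={msr:.4f}")
print(f"Residual    df={n - k - 1}  SS={sse:.4f}  MS={mse:.4f}")
print(f"Total       df={n - 1}  SS={sst:.4f}")
print(f"F = {f_stat:.2f}")
```

A large F relative to the critical value for $(k, n - k - 1)$ degrees of freedom leads to rejecting $H_0$ that all slopes are zero.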

26 Evaluating Multiple Regression Model Steps
1. Examine Variation Measures
2. Do Residual Analysis
3. Test Parameter Significance
   - Overall Model
   - Individual Coefficients
4. Test for Multicollinearity
5. Do Influence Analysis

27 Multicollinearity
- High Correlation Between X Variables
- Coefficients Measure Combined Effect
- Leads to Unstable Coefficients Depending on X Variables in Model
- Always Exists: Matter of Degree
- Example: Using Both Sales & Profit as Explanatory Variables in Same Model

28 When independent variables are highly correlated, odd results may occur such that net coefficients are unreliable.

29 Odd Things Happen
- Examine Correlation Matrix
  - Correlations Between Pairs of X Variables Are More than With Y Variable
- Sign of slope changes from simple to multiple regression equation
- Model passes F-test, but not individual t-tests
- Correlation matrix shows different sign than net coefficient (slope)

30 Detecting Multicollinearity
- Examine Correlation Matrix
  - Correlations Between Pairs of X Variables Are More than With Y Variable
- Rule of Thumb: Potential problem if r > .7 for any 2 independent variables
- Few Remedies
  - Obtain New Sample Data
  - Eliminate One Correlated X Variable
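
The rule of thumb above can be sketched as a simple screen over all pairs of independent variables. The function name `flag_collinear_pairs` is hypothetical, and comparing the absolute value of r to the threshold is an assumption made here so that strong negative correlations are caught too. The data are the two predictors from slide 10.

```python
from itertools import combinations
from math import sqrt


def pearson(u, v):
    """Pearson correlation coefficient between two equal-length lists."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / sqrt(suu * svv)


def flag_collinear_pairs(predictors, threshold=0.7):
    """Return (name1, name2, r) for every predictor pair with |r| > threshold."""
    flagged = []
    for (name1, x1), (name2, x2) in combinations(predictors.items(), 2):
        r = pearson(x1, x2)
        if abs(r) > threshold:  # assumption: screen on |r|, not just r
            flagged.append((name1, name2, r))
    return flagged


# Independent variables from slide 10.
predictors = {
    "Size": [1, 8, 3, 5, 6, 10],
    "Circ": [2, 8, 1, 7, 4, 6],
}

flagged = flag_collinear_pairs(predictors)
for name1, name2, r in flagged:
    print(f"Potential multicollinearity: {name1} and {name2}, r = {r:.3f}")
```

On the ad-response data the Size/Circ pair is just over the .7 threshold, so even that small example would be flagged as a potential problem under this rule.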

31 Age Experience Salary

32

33

34

35 Possible Solutions
- Delete one or more of the correlated variables
  - Drop the variable if the t statistic for its net regression coefficient is < 1
  - Drop the variable if $R_c^2$ increases upon its deletion
- Change the form of one or more of the independent variables
  - Change actual salary to real salary
  - Divide income by population for a per capita income
- Exclude the variable with the lower correlation with y

36

37

38

39

40 Multiple Regression Models
[Diagram: types of multiple regression models, grouped as Linear and Non-Linear. Types shown: Linear, Dummy Variable, Interaction, Polynomial, Logit, Square Root, Log, Reciprocal, Exponential.]

41 Dummy-Variable Regression Model
1. Involves Categorical X Variable with 2 Levels (e.g., Male-Female; College-No College)
2. Variable Levels Coded 0 & 1
3. Assumes Only Intercept Is Different (Slopes Are Constant Across Categories)
4. Dummy-Variable Model: $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k + \varepsilon$

42 Dummy-Variable Model Relationships
[Plot of Y against $X_1$: one line for Category 1 and one for Category 2, with the same slope and different Y-intercepts.]

43 Dummy-Variable Model Worksheet
$X_2$ levels: 0 = Group 1, 1 = Group 2. Run regression with Y, $X_1$, $X_2$.

Case  Y   X1  X2
1     1   1   1
2     4   8   0
3     1   3   1
4     3   5   1
:     :   :   :

44 Interpreting Dummy-Variable Model Equation
Given: $\hat{Y}_i = b_0 + b_1 X_{1i} + b_2 X_{2i}$, where Y = starting salary of college grads, $X_1$ = GPA, and $X_2$ = 0 if male, 1 if female.
Males ($X_2 = 0$): $\hat{Y}_i = b_0 + b_1 X_{1i} + b_2(0) = b_0 + b_1 X_{1i}$
Females ($X_2 = 1$): $\hat{Y}_i = b_0 + b_1 X_{1i} + b_2(1) = (b_0 + b_2) + b_1 X_{1i}$
Same slopes.

45 Dummy-Variable Model Example
Computer output: $\hat{Y}_i = 3 + 5 X_{1i} + 7 X_{2i}$, where $X_2$ = 0 if male, 1 if female.
Males ($X_2 = 0$): $\hat{Y}_i = 3 + 5 X_{1i} + 7(0) = 3 + 5 X_{1i}$
Females ($X_2 = 1$): $\hat{Y}_i = 3 + 5 X_{1i} + 7(1) = (3 + 7) + 5 X_{1i} = 10 + 5 X_{1i}$
Same slopes.
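
To make the "same slope, different intercept" reading concrete, here is a small sketch that evaluates the fitted equation from the example above, with coefficients 3, 5, and 7. The function name `predict_salary` and the sample GPA values are illustrative. The slides do not give the salary units, so the outputs are in whatever units the original data used.

```python
def predict_salary(gpa, female):
    """Fitted value from the example output: Y-hat = 3 + 5*X1 + 7*X2.

    gpa    -- X1, grade point average
    female -- X2, dummy variable coded 0 for male, 1 for female
    """
    return 3 + 5 * gpa + 7 * female


male_at_3 = predict_salary(3.0, 0)     # 3 + 5*3 + 7*0
female_at_3 = predict_salary(3.0, 1)   # 3 + 5*3 + 7*1

# The gap between the two groups equals the dummy coefficient at every GPA,
# which is what "same slopes, different intercepts" means.
gaps = [predict_salary(g, 1) - predict_salary(g, 0) for g in (2.0, 3.0, 4.0)]

print(male_at_3, female_at_3, gaps)
```

The coefficient on the dummy variable shifts the whole line up or down, while a one-point change in GPA moves the prediction by 5 for both groups.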

46 Conclusion
- Explained the Linear Multiple Regression Model
- Interpreted Linear Multiple Regression Computer Output
- Explained Multicollinearity
- Described the Types of Multiple Regression Models

