Published by Clara Barber. Modified over 9 years ago.
1
Multiple Regression and Model Building
Chapter 15
Copyright © 2014 by The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin
2
Multiple Regression and Model Building
15.1 The Multiple Regression Model and the Least Squares Point Estimate
15.2 Model Assumptions and the Standard Error
15.3 $R^2$ and Adjusted $R^2$ (This section can be read anytime after reading Section 15.1)
15.4 The Overall F Test
15.5 Testing the Significance of an Independent Variable
15.6 Confidence and Prediction Intervals
3
Multiple Regression and Model Building Continued
15.7 The Sales Territory Performance Case
15.8 Using Dummy Variables to Model Qualitative Independent Variables
15.9 Using Squared and Interaction Variables
15.10 Model Building and the Effects of Multicollinearity
15.11 Residual Analysis in Multiple Regression
15.12 Logistic Regression
4
15.1 The Multiple Regression Model and the Least Squares Point Estimate
Simple linear regression used one independent variable to explain the dependent variable
  ◦ Some relationships are too complex to be described using a single independent variable
Multiple regression uses two or more independent variables to describe the dependent variable
  ◦ This allows multiple regression models to handle more complex situations
  ◦ There is no limit to the number of independent variables a model can use
Multiple regression has only one dependent variable
LO15-1: Explain the multiple regression model and the related least squares point estimates.
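As an illustration of the least squares point estimates, the sketch below solves the normal equations $(X'X)b = X'y$ with plain Gaussian elimination. The function name `least_squares` and the small data set are hypothetical, chosen only for this example; in practice a numerical library would normally do this step.

```python
def least_squares(X, y):
    """Return [b0, b1, ..., bk] minimizing the sum of squared residuals.

    X is a list of rows, each holding the k independent-variable values.
    Assumes the columns are not perfectly collinear, so X'X is invertible.
    """
    A = [[1.0] + list(row) for row in X]  # leading 1 carries the intercept b0
    p = len(A[0])
    # Augmented matrix [X'X | X'y] of the normal equations.
    M = [
        [sum(r[i] * r[j] for r in A) for j in range(p)]
        + [sum(r[i] * yi for r, yi in zip(A, y))]
        for i in range(p)
    ]
    # Forward elimination with partial pivoting.
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(M[r][c]))
        M[c], M[pivot] = M[pivot], M[c]
        for r in range(c + 1, p):
            f = M[r][c] / M[c][c]
            for j in range(c, p + 1):
                M[r][j] -= f * M[c][j]
    # Back substitution.
    b = [0.0] * p
    for i in reversed(range(p)):
        tail = sum(M[i][j] * b[j] for j in range(i + 1, p))
        b[i] = (M[i][p] - tail) / M[i][i]
    return b


# Made-up data generated exactly from y = 1 + 2*x1 + 3*x2.
X = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
y = [9, 8, 19, 18, 26]
b = least_squares(X, y)
```

Because the example data contain no noise, the estimates should recover the generating coefficients up to rounding error.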
5
15.2 Model Assumptions and the Standard Error
The model is $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + \varepsilon$
Assumptions for multiple regression are stated about the model error terms, the $\varepsilon$'s
LO15-2: Explain the assumptions behind multiple regression and calculate the standard error.
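The slide names the standard error without giving its formula. A minimal sketch, assuming the usual definition $s = \sqrt{SSE / (n-(k+1))}$, where $SSE$ is the sum of squared residuals, $n$ the number of observations, and $k$ the number of independent variables. The function name and the numbers are illustrative only.

```python
import math


def standard_error(y, y_hat, k):
    """Standard error s = sqrt(SSE / (n - (k + 1))) for a model with k predictors."""
    n = len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    return math.sqrt(sse / (n - (k + 1)))


# Made-up observed and fitted values for a model with k = 2 predictors.
y = [10, 12, 15, 19, 20, 25]
y_hat = [11, 12, 14, 18, 21, 25]
s = standard_error(y, y_hat, k=2)  # SSE = 4, degrees of freedom = 6 - 3 = 3
```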
6
15.3 $R^2$ and Adjusted $R^2$
1. Total variation is given by the formula $\sum (y_i - \bar{y})^2$
2. Explained variation is given by the formula $\sum (\hat{y}_i - \bar{y})^2$
3. Unexplained variation is given by the formula $\sum (y_i - \hat{y}_i)^2$
4. Total variation is the sum of explained and unexplained variation
This section can be covered anytime after reading Section 15.1
LO15-3: Calculate and interpret the multiple and adjusted multiple coefficients of determination.
7
$R^2$ and Adjusted $R^2$ Continued
5. The multiple coefficient of determination $R^2$ is the ratio of explained variation to total variation
6. $R^2$ is the proportion of the total variation that is explained by the overall regression model
7. The multiple correlation coefficient $R$ is the square root of $R^2$
LO15-3
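The ratios in items 5 to 7 can be sketched as follows. The slides do not spell out adjusted $R^2$; the code assumes the common form $1 - (1 - R^2)\,\frac{n-1}{n-(k+1)}$. Function names and input numbers are hypothetical.

```python
import math


def r_squared(explained, total):
    """Multiple coefficient of determination: explained / total variation."""
    return explained / total


def adjusted_r_squared(r2, n, k):
    """Adjusted R^2 (assumed form): penalizes R^2 for the k predictors used."""
    return 1 - (1 - r2) * (n - 1) / (n - (k + 1))


# Made-up variations: total = explained + unexplained = 90 + 10.
r2 = r_squared(explained=90.0, total=100.0)
adj_r2 = adjusted_r_squared(r2, n=13, k=2)
multiple_r = math.sqrt(r2)  # multiple correlation coefficient R
```

With these inputs $R^2$ is 0.9 and the adjusted value is slightly lower, reflecting the two predictors used.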
8
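15.4 The Overall F Test
To test $H_0: \beta_1 = \beta_2 = \dots = \beta_k = 0$ versus $H_a$: at least one of $\beta_1, \beta_2, \dots, \beta_k \neq 0$
The test statistic is $F(\text{model}) = \dfrac{(\text{Explained variation})/k}{(\text{Unexplained variation})/[n-(k+1)]}$
Reject $H_0$ in favor of $H_a$ if $F(\text{model}) > F_\alpha$ or p-value $< \alpha$
$F_\alpha$ is based on $k$ numerator and $n-(k+1)$ denominator degrees of freedom
LO15-4: Test the significance of a multiple regression model by using an F test.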
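A minimal sketch of the overall F test, assuming the statistic is the explained variation per numerator degree of freedom divided by the unexplained variation per denominator degree of freedom. The critical value is passed in by the caller (read from an F table); the function names and numbers are illustrative.

```python
def f_model(explained, unexplained, n, k):
    """Overall F statistic with k and n - (k + 1) degrees of freedom."""
    return (explained / k) / (unexplained / (n - (k + 1)))


def reject_h0(f_stat, f_critical):
    """Reject H0 (all slopes are zero) when F(model) exceeds the critical value."""
    return f_stat > f_critical


# Made-up variations for n = 13 observations and k = 2 predictors.
f_stat = f_model(explained=90.0, unexplained=10.0, n=13, k=2)
# 4.10 is the tabled F value for alpha = 0.05 with 2 and 10 degrees of freedom.
decision = reject_h0(f_stat, f_critical=4.10)
```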
9
15.5 Testing the Significance of an Independent Variable
A variable in a multiple regression model is not likely to be useful unless there is a significant relationship between it and $y$
To test significance, we use the null hypothesis $H_0: \beta_j = 0$
Versus the alternative hypothesis $H_a: \beta_j \neq 0$
LO15-5: Test the significance of a single independent variable.
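The slide states the hypotheses but not the test statistic. The sketch below assumes the standard t statistic $t = b_j / s_{b_j}$, compared with a critical value $t_{\alpha/2}$ based on $n-(k+1)$ degrees of freedom. The estimate, its standard error, and the function names are made up for illustration.

```python
def t_statistic(b_j, se_b_j):
    """t statistic for H0: beta_j = 0, given the estimate and its standard error."""
    return b_j / se_b_j


def is_significant(t_stat, t_critical):
    """Two-sided test: reject H0 when |t| exceeds the critical value."""
    return abs(t_stat) > t_critical


# Hypothetical estimate b_j = 2.5 with standard error 0.5.
t_stat = t_statistic(2.5, 0.5)
# 2.228 is the tabled t value for alpha/2 = 0.025 with 10 degrees of freedom.
significant = is_significant(t_stat, t_critical=2.228)
```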
10
15.6 Confidence and Prediction Intervals
The point on the regression line corresponding to particular values $x_{01}, x_{02}, \dots, x_{0k}$ of the independent variables is $\hat{y} = b_0 + b_1 x_{01} + b_2 x_{02} + \dots + b_k x_{0k}$
It is unlikely that this value will equal the mean value of $y$ for these $x$ values
Therefore, we need to place bounds on how far the predicted value might be from the actual value
We can do this by calculating a confidence interval for the mean value of $y$ and a prediction interval for an individual value of $y$
LO15-6: Find and interpret a confidence interval for a mean value and a prediction interval for an individual value.
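The slide does not give the interval formulas. A sketch under the usual assumptions: the confidence interval for the mean is $\hat{y} \pm t_{\alpha/2}\, s \sqrt{d}$ and the prediction interval for an individual value is $\hat{y} \pm t_{\alpha/2}\, s \sqrt{1 + d}$, where $s$ is the standard error and $d$ is the distance value $x_0'(X'X)^{-1}x_0$. All inputs below are hypothetical and would normally come from regression software output.

```python
import math


def confidence_interval(y_hat, s, distance_value, t_critical):
    """Interval for the mean value of y at the given x values."""
    half = t_critical * s * math.sqrt(distance_value)
    return (y_hat - half, y_hat + half)


def prediction_interval(y_hat, s, distance_value, t_critical):
    """Interval for an individual value of y; wider by the extra '1 +' term."""
    half = t_critical * s * math.sqrt(1 + distance_value)
    return (y_hat - half, y_hat + half)


# Made-up inputs: point prediction 20, standard error 2, distance value 0.25.
ci = confidence_interval(20.0, 2.0, 0.25, t_critical=2.0)
pi = prediction_interval(20.0, 2.0, 0.25, t_critical=2.0)
```

The prediction interval is always the wider of the two, because it also accounts for the error term of a single observation.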
11
15.8 Using Dummy Variables to Model Qualitative Independent Variables
So far, we have only looked at including quantitative data in a regression model
However, we may wish to include descriptive qualitative data as well
  ◦ For example, we might want to include the gender of respondents
We can model the effects of different levels of a qualitative variable by using what are called dummy variables
  ◦ Also known as indicator variables
LO15-7: Use dummy variables to model qualitative independent variables.
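One common way to build dummy variables, sketched here as an assumption rather than the deck's own procedure, is to pick a baseline level and create one 0/1 column for each remaining level. The function name `dummy_encode` and the store-location data are hypothetical.

```python
def dummy_encode(values, baseline):
    """Return (levels, rows): one 0/1 column per level other than the baseline."""
    levels = sorted(set(values) - {baseline})
    rows = [[1 if v == level else 0 for level in levels] for v in values]
    return levels, rows


# Hypothetical qualitative variable with three levels.
locations = ["mall", "street", "downtown", "mall"]
levels, rows = dummy_encode(locations, baseline="street")
# The baseline level is represented by a row of all zeros.
```

Using one fewer column than the number of levels avoids a dummy column that is an exact linear combination of the others and the intercept.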
12
15.9 Using Squared and Interaction Variables
The quadratic regression model relating $y$ to $x$ is: $y = \beta_0 + \beta_1 x + \beta_2 x^2 + \varepsilon$
Where:
1. $\beta_0 + \beta_1 x + \beta_2 x^2$ is the mean value of the dependent variable $y$ for a given $x$
2. $\beta_0$, $\beta_1$, and $\beta_2$ are regression parameters relating the mean value of $y$ to $x$
3. $\varepsilon$ is an error term that describes the effects on $y$ of all factors other than $x$ and $x^2$
LO15-8: Use squared and interaction variables.
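A short sketch of how squared and interaction variables are typically constructed: they are extra columns computed from the original variables, after which the model is fit as an ordinary multiple regression. The interaction term $x_1 x_2$ is the usual form and is an assumption here, since the slide shows only the quadratic model. Names and coefficients are hypothetical.

```python
def quadratic_mean(x, b0, b1, b2):
    """Mean value of y under the quadratic model: b0 + b1*x + b2*x^2."""
    return b0 + b1 * x + b2 * x ** 2


def with_squared_and_interaction(x1, x2):
    """Return rows (x1, x2, x1^2, x1*x2) ready to use as regression columns."""
    return [(a, b, a * a, a * b) for a, b in zip(x1, x2)]


mean_y = quadratic_mean(3.0, b0=1.0, b1=2.0, b2=0.5)  # 1 + 6 + 4.5
rows = with_squared_and_interaction([1, 2, 3], [4, 5, 6])
```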
13
15.10 Model Building and the Effects of Multicollinearity
Multicollinearity is the condition where the independent variables are dependent on, related to, or correlated with each other
Effects
  ◦ Hinders ability to use t statistics and p-values to assess the relative importance of predictors
  ◦ Does not hinder ability to predict the dependent (or response) variable
Detection
  ◦ Scatter plot matrix
  ◦ Correlation matrix
  ◦ Variance inflation factors (VIF)
LO15-9: Describe multicollinearity and build a multiple regression model.
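The slide lists variance inflation factors without defining them. A minimal sketch, assuming the standard definition $VIF_j = 1/(1 - R_j^2)$, where $R_j^2$ comes from regressing $x_j$ on the other independent variables. With only two predictors, $R_j^2$ reduces to the squared correlation between them, which is the case shown. Function names and data are illustrative.

```python
import math


def correlation(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    s_ab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    s_aa = sum((x - mean_a) ** 2 for x in a)
    s_bb = sum((y - mean_b) ** 2 for y in b)
    return s_ab / math.sqrt(s_aa * s_bb)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor model: 1 / (1 - r^2)."""
    r = correlation(x1, x2)
    return 1 / (1 - r ** 2)


# Made-up predictors with correlation 0.8.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]
r = correlation(x1, x2)
vif = vif_two_predictors(x1, x2)
```

A VIF near 1 indicates little correlation with the other predictors; larger values indicate stronger multicollinearity.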
14
15.11 Residual Analysis in Multiple Regression
For an observed value $y_i$, the residual is $e_i = y_i - \hat{y}_i = y_i - (b_0 + b_1 x_{i1} + \dots + b_k x_{ik})$
If the regression assumptions hold, the residuals should look like a random sample from a normal distribution with mean 0 and variance $\sigma^2$
LO15-10: Use residual analysis to check the assumptions of multiple regression.
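The residual formula above translates directly into code. In this sketch the coefficients `b` and the data are made up; in practice `b` would be the least squares point estimates.

```python
def residuals(X, y, b):
    """Residuals e_i = y_i - (b0 + b1*x_i1 + ... + bk*x_ik)."""
    fitted = [b[0] + sum(bj * xj for bj, xj in zip(b[1:], row)) for row in X]
    return [yi - fi for yi, fi in zip(y, fitted)]


# Hypothetical estimates b0 = 1, b1 = 2, b2 = 3 and two observations.
b = [1.0, 2.0, 3.0]
X = [(1, 2), (2, 1)]
y = [10.0, 8.0]
e = residuals(X, y, b)  # fitted values are 9 and 8
```

Plotting these residuals against the fitted values or each independent variable is a common way to check the assumptions visually.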
15
15.12 Logistic Regression
Logistic regression and least squares regression are very similar
  ◦ Both produce prediction equations
The $y$ variable is what makes logistic regression different
  ◦ With least squares regression, the $y$ variable is a quantitative variable
  ◦ With logistic regression, it is usually a dummy 0/1 variable
LO15-11: Use a logistic model to estimate probabilities and odds ratios.
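The slide does not show the logistic prediction equation. The sketch below assumes the standard form $p = \dfrac{e^{z}}{1 + e^{z}}$ with $z = b_0 + b_1 x_1 + \dots + b_k x_k$, and the usual odds ratio $e^{b_j}$ for a one-unit increase in $x_j$. The coefficients are hypothetical, not fitted values.

```python
import math


def logistic_probability(b, x):
    """Estimated probability that y = 1, given coefficients [b0, b1, ...] and x values."""
    z = b[0] + sum(bj * xj for bj, xj in zip(b[1:], x))
    return 1 / (1 + math.exp(-z))  # algebraically equal to e^z / (1 + e^z)


def odds_ratio(b_j):
    """Multiplicative change in the odds p / (1 - p) per one-unit increase in x_j."""
    return math.exp(b_j)


# Made-up coefficients b0 = -1.0 and b1 = 0.5.
b = [-1.0, 0.5]
p = logistic_probability(b, [2.0])  # z = -1 + 0.5 * 2 = 0
ratio = odds_ratio(b[1])
```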