HAWKES LEARNING SYSTEMS Students Matter. Success Counts. Copyright © 2013 by Hawkes Learning Systems/Quant Systems, Inc. All rights reserved. Section 12.4 Multiple Regression Equations
Objectives

Determine and analyze the equation of a multiple regression model for a given set of data.
Multiple Regression Model

A multiple regression model is a linear regression model that uses two or more explanatory variables to predict a response variable. It is given by $\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_k x_k$, where $x_1, x_2, \ldots, x_k$ are the explanatory variables in the model and $b_1, b_2, \ldots, b_k$ are the coefficients of the explanatory variables.
Multiple Regression Model (cont.)

The coefficients, $b_1, b_2, \ldots, b_k$, of the explanatory variables are the sample estimates of the corresponding population parameters, $\beta_1, \beta_2, \ldots, \beta_k$. As before, the y-intercept of the multiple regression equation is $b_0$, which is the sample estimate of the population parameter $\beta_0$.
Null and Alternative Hypotheses for an ANOVA Test

The null and alternative hypotheses for an ANOVA test to analyze the statistical significance of the linear relationship between the variables in a multiple regression model are as follows.

$H_0\colon \beta_1 = \beta_2 = \cdots = \beta_k = 0$
$H_a\colon$ At least one coefficient does not equal 0.

Here $\beta_1, \beta_2, \ldots, \beta_k$ are the coefficients of the explanatory variables, and $k$ is the number of explanatory variables in the model.
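For reference, the test statistic usually associated with this ANOVA test can be written as an F-ratio. The symbols $SSR$ (regression sum of squares), $SSE$ (error sum of squares), and $n$ (sample size) are assumptions introduced here; the slides do not define them.

```latex
F = \frac{SSR / k}{SSE / (n - k - 1)},
\qquad \text{with } k \text{ and } n - k - 1 \text{ degrees of freedom.}
```

A large value of $F$, and hence a small p-value, is evidence against $H_0$.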
Example 12.20: Constructing and Analyzing a Multiple Regression Model

Construct and analyze a multiple regression equation for predicting a child’s reading level based on the following sample data, which omits the variable of teacher’s experience from the multiple regression model that we have been discussing. Use this new model to predict the reading level for a child who is 10 years old and has parents with an average of 17.2 years of education. Which of the two multiple regression models is better at predicting a child’s reading level?
[Table: sample data for Example 12.20.]
Solution

We will use Microsoft Excel to construct and analyze a multiple regression model for these data. Begin by entering the data as they appear in columns A, B, and C. Next, under the Data tab, choose Data Analysis. Select Regression from the options listed. Enter the necessary information into the Regression menu as shown in the following screenshot. Then click OK.
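The same kind of fit can be sketched outside Excel. The following is a minimal pure-Python illustration that solves the least-squares normal equations; it is not Excel's implementation, the function name `fit_multiple_regression` is hypothetical, and the data are made-up placeholders rather than the sample from this example.

```python
def fit_multiple_regression(rows, y):
    """Return [b0, b1, ..., bk] minimizing the sum of squared residuals.

    rows: one tuple of explanatory values (x1, ..., xk) per observation.
    y:    the observed response values, in the same order.
    """
    # Design matrix: a leading 1 for the intercept, then the explanatory values.
    X = [[1.0] + [float(v) for v in row] for row in rows]
    p = len(X[0])
    # Normal equations (X^T X) b = X^T y, stored as an augmented matrix.
    A = [[sum(r[i] * r[j] for r in X) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(X, y))]
         for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("explanatory variables are linearly dependent")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * c for a, c in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


# Hypothetical placeholder data, NOT the sample from Example 12.20.
ages = [6, 7, 8, 9, 10, 11, 12]
parent_education = [12.0, 16.0, 14.0, 18.0, 13.0, 17.0, 15.0]
reading_level = [1.4, 2.9, 3.3, 4.8, 4.6, 6.1, 6.4]

b0, b1, b2 = fit_multiple_regression(
    list(zip(ages, parent_education)), reading_level
)
print(f"y-hat = {b0:.3f} + {b1:.3f}*x1 + {b2:.3f}*x2")
```

Solving the normal equations directly keeps the sketch dependency-free; for real work, a numerical library's least-squares routine is generally the more stable choice.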
[Screenshot: Excel Regression menu with the required information entered.]
The output is shown in the following figure.
To begin, notice that the p-value for the regression equation is reported in scientific notation with the exponent E-08, that is, on the order of $10^{-8}$, which is extremely small. Thus, the linear relationship described by this multiple regression equation is statistically significant. The adjusted $R^2 \approx 0.991$, which is close to 1, shows that the equation fits the sample data extremely well. Notice that the p-values for the coefficients of the individual explanatory variables have changed from those in the three-variable model. These p-values are both small, which indicates that both of these variables are useful in predicting the value of the response variable.
Using the coefficients listed in the table, the multiple regression equation for predicting a child’s reading level based on the child’s age, $x_1$, and the parents’ education, $x_2$, has the form $\hat{y} = b_0 + b_1 x_1 + b_2 x_2$, with $b_0$, $b_1$, and $b_2$ taken from the Coefficients column of the output.
We can now use this new equation to predict the reading level of a 10-year-old child with parents who have an average of 17.2 years of education. Note that $x_1 = 10$ and $x_2 = 17.2$. Substituting these values into the regression equation yields the predicted reading level.
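The substitution step can be sketched in a few lines of Python. The helper `predict` is a hypothetical name, and the coefficients below are placeholders chosen only to show the arithmetic; they are not the values from the Excel output.

```python
def predict(coefficients, x_values):
    """Evaluate y-hat = b0 + b1*x1 + ... + bk*xk."""
    b0, *slopes = coefficients
    if len(slopes) != len(x_values):
        raise ValueError("need one x value per slope coefficient")
    return b0 + sum(b * x for b, x in zip(slopes, x_values))


# Placeholder coefficients [b0, b1, b2] for illustration only;
# substitute the values from the Coefficients column of the output.
example_coefficients = [1.0, 0.5, 0.1]

# x1 = 10 (child's age), x2 = 17.2 (parents' average years of education).
print(predict(example_coefficients, [10, 17.2]))
```

Keeping the intercept as the first list entry mirrors the order in which Excel lists the coefficients, with the intercept first.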
Thus, we would again predict that this child would be reading at a fifth-grade reading level, but not quite as far along as the first prediction. This regression equation and its predicted value of the response variable are very similar to those we calculated with three explanatory variables. So which regression equation does a better job of predicting the value of the response variable? Let's begin by looking at the overall picture. The first model includes three variables, but one was found to be not statistically significant.
The second model uses only two variables, both of which are statistically significant. Another consideration is the adjusted $R^2$-values of the two models. For this new model, the adjusted $R^2 \approx 0.991$, which is higher than the adjusted $R^2$ of the first model with three explanatory variables. By either criterion, it appears that the second model would do a better job of predicting the value of the response variable. There are certainly more robust techniques for determining which model is better, but we will leave those for a higher-level course.
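To see why an extra explanatory variable can lower the adjusted $R^2$, here is a short sketch using the standard adjustment $1 - (1 - R^2)(n - 1)/(n - k - 1)$. The function name and every number below are hypothetical; they are not taken from either model in this example.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjusted R^2 for n observations and k explanatory variables."""
    if n - k - 1 <= 0:
        raise ValueError("need more observations than coefficients")
    return 1.0 - (1.0 - r_squared) * (n - 1) / (n - k - 1)


# Hypothetical values: a third variable raises R^2 only slightly.
n = 20
two_variable_model = adjusted_r_squared(0.9000, n, k=2)
three_variable_model = adjusted_r_squared(0.9005, n, k=3)

# The penalty for the extra variable outweighs the tiny gain in R^2.
print(two_variable_model > three_variable_model)
```

The ordinary $R^2$ never decreases when a variable is added, which is why the adjusted version is the one compared across models with different numbers of explanatory variables.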