Published by Eugene Oswald Jacobs. Modified over 9 years ago.
7.1 Multiple Regression
More than one explanatory/independent variable. This makes a slight change to the interpretation of the coefficients, changes the measure of degrees of freedom, and requires us to modify one of the assumptions.
EXAMPLE: tr_t = β1 + β2 p_t + β3 a_t + e_t
EXAMPLE: qd_t = β1 + β2 p_t + β3 inc_t + e_t
EXAMPLE: gpa_t = β1 + β2 SAT_t + β3 STUDY_t + e_t
7.2 Interpretation of Coefficients
β2 measures the change in y from a change in x2, holding x3 constant. β3 measures the change in y from a change in x3, holding x2 constant.
7.3 Assumptions of the Multiple Regression Model
1. The regression model is linear in the parameters and error term: y_t = β1 + β2 x2t + β3 x3t + … + βk xkt + e_t
2. The error term has a mean of zero: E(e) = 0, so E(y_t) = β1 + β2 x2t + β3 x3t + … + βk xkt
3. The error term has constant variance: Var(e) = E(e²) = σ²
4. The error term is not correlated with itself (no serial correlation): Cov(e_i, e_j) = E(e_i e_j) = 0 for i ≠ j
5. Data on the x's are not random (and thus are uncorrelated with the error term: Cov(x, e) = E(xe) = 0), and they are NOT exact linear functions of other explanatory variables.
6. (Optional) The error term has a normal distribution: e ~ N(0, σ²)
7.4 Estimation of the Multiple Regression Model
Let's use a model with 2 independent variables: y_t = β1 + β2 x2t + β3 x3t + e_t. A scatterplot of points is now a scatter "cloud". We want to fit the best "line" through these points; in 3 dimensions, the line becomes a plane. The estimated "line" and a residual are defined as before. The idea is to choose values for b1, b2, and b3 such that the sum of squared residuals is minimized.
7.5 From here, we minimize this expression with respect to b1, b2, and b3. We set these three derivatives equal to zero and solve for b1, b2, and b3. We get the following formulas:
b2 = [Σ y*_t x*_2t · Σ (x*_3t)² − Σ y*_t x*_3t · Σ x*_2t x*_3t] / [Σ (x*_2t)² · Σ (x*_3t)² − (Σ x*_2t x*_3t)²]
b3 = [Σ y*_t x*_3t · Σ (x*_2t)² − Σ y*_t x*_2t · Σ x*_2t x*_3t] / [Σ (x*_2t)² · Σ (x*_3t)² − (Σ x*_2t x*_3t)²]
b1 = ȳ − b2 x̄2 − b3 x̄3
where the starred variables are deviations from their means: y*_t = y_t − ȳ, x*_2t = x2t − x̄2, x*_3t = x3t − x̄3.
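The least squares solution above can be sketched directly in code. This is a minimal illustration of the two-regressor deviation-form formulas, assuming the standard closed-form expressions (the function name and example data are mine, not the slide's):

```python
def ols_two_regressors(y, x2, x3):
    """Closed-form least squares for y = b1 + b2*x2 + b3*x3 + e,
    using sums of squares and cross-products in deviation form."""
    n = len(y)
    ybar, x2bar, x3bar = sum(y) / n, sum(x2) / n, sum(x3) / n
    # S(a, b) = sum of (a_t - abar)(b_t - bbar): deviation cross-products
    s22 = sum((v - x2bar) ** 2 for v in x2)
    s33 = sum((v - x3bar) ** 2 for v in x3)
    s23 = sum((a - x2bar) * (b - x3bar) for a, b in zip(x2, x3))
    s2y = sum((a - x2bar) * (b - ybar) for a, b in zip(x2, y))
    s3y = sum((a - x3bar) * (b - ybar) for a, b in zip(x3, y))
    denom = s22 * s33 - s23 ** 2  # zero only if x2, x3 are exactly collinear
    b2 = (s2y * s33 - s3y * s23) / denom
    b3 = (s3y * s22 - s2y * s23) / denom
    b1 = ybar - b2 * x2bar - b3 * x3bar
    return b1, b2, b3

# Data generated exactly as y = 1 + 2*x2 + 3*x3, so the fit recovers (1, 2, 3)
b1, b2, b3 = ols_two_regressors([9, 8, 19, 18], [1, 2, 3, 4], [2, 1, 4, 3])
```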
7.6 [What is going on here? In the formula for b2, notice that if x3 were omitted from the model, the formula reduces to the familiar formula from Chapter 3.] You may wonder why the multiple regression formulas on slide 7.5 aren't simply the simple-regression formula applied to each variable separately (e.g. b2 = Σ y*_t x*_2t / Σ (x*_2t)²).
7.7 We can use a Venn diagram to illustrate the idea of regression as analysis of variance. [Diagrams omitted: one for bivariate (simple) regression, showing the overlap between y and x; one for multiple regression, showing the overlaps among y, x2, and x3.]
7.8 Example of Multiple Regression Suppose we want to estimate a model of home prices using data on the size of the house (sqft), the number of bedrooms (bed) and the number of bathrooms (bath). We get the following results: How does a negative coefficient estimate on bed and bath make sense?
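One way a negative coefficient can make sense: with square footage held constant, an extra bedroom means the same space is carved into more, smaller rooms, which buyers may value less. A minimal sketch with hypothetical data (NOT the slide's actual sample or estimates, which are in the omitted output) illustrates this:

```python
# Hypothetical data, constructed so that price = 100 + 0.1*sqft - 5*bed
# exactly (price in $1000s). Bath is dropped to keep the sketch to 2 regressors.
sqft = [1000, 1000, 1500, 2000]
bed = [2, 3, 3, 4]
price = [190, 185, 235, 280]

def fit(y, x2, x3):
    """Least squares for y = b1 + b2*x2 + b3*x3 + e (deviation form)."""
    n = len(y)
    ybar, x2bar, x3bar = sum(y) / n, sum(x2) / n, sum(x3) / n
    s22 = sum((v - x2bar) ** 2 for v in x2)
    s33 = sum((v - x3bar) ** 2 for v in x3)
    s23 = sum((a - x2bar) * (b - x3bar) for a, b in zip(x2, x3))
    s2y = sum((a - x2bar) * (b - ybar) for a, b in zip(x2, y))
    s3y = sum((a - x3bar) * (b - ybar) for a, b in zip(x3, y))
    d = s22 * s33 - s23 ** 2
    b2 = (s2y * s33 - s3y * s23) / d
    b3 = (s3y * s22 - s2y * s23) / d
    return ybar - b2 * x2bar - b3 * x3bar, b2, b3

b1, b_sqft, b_bed = fit(price, sqft, bed)
# b_sqft is positive, but b_bed is negative: holding size fixed,
# more bedrooms means smaller rooms in this constructed sample.
```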
7.9 Expected Value
We will omit the proofs. The least squares estimator for multiple regression is unbiased, regardless of the number of independent variables.
Variance Formulas (with 2 independent variables):
Var(b2) = σ² / [(1 − r23²) Σ (x2t − x̄2)²]
where r23 is the correlation between x2 and x3 and the parameter σ² is the variance of the error term. We need to estimate σ² using the formula
σ̂² = Σ ê_t² / (T − k)
This estimate has T − k degrees of freedom.
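A minimal sketch of these two formulas, assuming the standard two-regressor variance expression (the function names are mine):

```python
import math

def sigma2_hat(residuals, k):
    """Estimated error variance: sum of squared residuals over T - k."""
    T = len(residuals)
    return sum(e * e for e in residuals) / (T - k)

def var_b2(sigma2, x2, x3):
    """Var(b2) = sigma^2 / [(1 - r23^2) * sum((x2t - x2bar)^2)]."""
    n = len(x2)
    x2bar, x3bar = sum(x2) / n, sum(x3) / n
    s22 = sum((v - x2bar) ** 2 for v in x2)
    s33 = sum((v - x3bar) ** 2 for v in x3)
    s23 = sum((a - x2bar) * (b - x3bar) for a, b in zip(x2, x3))
    r23 = s23 / math.sqrt(s22 * s33)  # sample correlation of x2 and x3
    return sigma2 / ((1 - r23 ** 2) * s22)
```

Note how the variance blows up as r23 approaches 1: highly correlated regressors make the individual coefficients hard to estimate precisely (collinearity).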
7.10 Gauss-Markov Theorem
Under assumptions 1–5 of the linear regression model (the 6th assumption isn't needed for the theorem to be true), the least squares estimators b1, b2, …, bk have the smallest variance of all linear and unbiased estimators of β1, β2, …, βk. They are BLUE (Best Linear Unbiased Estimators).
7.11 Confidence Intervals and Hypothesis Testing
The methods for constructing confidence intervals and conducting hypothesis tests are the same as they were for simple regression. The format for a confidence interval is:
b_i ± t_c · se(b_i)
where t_c depends on the level of confidence and has T − k degrees of freedom. T is the number of observations and k is the number of independent variables plus one for the intercept.
Hypothesis Tests: H0: βi = c versus H1: βi ≠ c. Use the value of c for βi when calculating t. If t > t_c or t < −t_c, reject H0. If c is 0, then we call it a test of significance.
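These steps can be sketched as small helper functions (a minimal illustration, assuming the coefficient, standard error, and critical value are already computed):

```python
def confidence_interval(b, se, t_c):
    """Interval estimate b_i +/- t_c * se(b_i); t_c has T - k df."""
    return (b - t_c * se, b + t_c * se)

def t_stat(b, se, c=0.0):
    """t statistic for H0: beta_i = c (c = 0 gives a test of significance)."""
    return (b - c) / se

def reject_h0(t, t_c):
    """Two-tailed decision rule: reject H0 when t > t_c or t < -t_c."""
    return abs(t) > t_c

# e.g. b = 2.0, se = 0.5, critical value 2.0:
lo, hi = confidence_interval(2.0, 0.5, 2.0)   # (1.0, 3.0)
t = t_stat(2.0, 0.5)                          # 4.0: significant at this t_c
```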
7.12 Goodness of Fit
R² measures the proportion of the variance in the dependent variable that is explained by the independent variables. Recall that least squares chooses the line that produces the smallest sum of squared residuals; it therefore also produces the line with the largest R². The inclusion of additional independent variables will never increase, and will often lower, the sum of squared residuals, meaning that R² will never fall and will often rise when new independent variables are added, even if the variables have no economic justification. Adjusted R² corrects R² for degrees of freedom:
Adjusted R² = 1 − [SSE/(T − k)] / [SST/(T − 1)]
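A minimal sketch of both measures, assuming the residuals from a fitted model are in hand:

```python
def r_squared(y, residuals):
    """R^2 = 1 - SSE/SST: share of variance in y explained by the model."""
    n = len(y)
    ybar = sum(y) / n
    sst = sum((v - ybar) ** 2 for v in y)   # total sum of squares
    sse = sum(e * e for e in residuals)     # sum of squared residuals
    return 1 - sse / sst

def adjusted_r_squared(y, residuals, k):
    """Adjusted R^2 = 1 - (1 - R^2)(T - 1)/(T - k); k includes the intercept.
    Unlike R^2, this can FALL when a useless regressor is added."""
    T = len(y)
    r2 = r_squared(y, residuals)
    return 1 - (1 - r2) * (T - 1) / (T - k)
```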
7.13 Example: Grades at JMU
Three models were estimated:
gpa_t = β1 + β2 SAT_t + e_t
gpa_t = β1 + β2 SAT_t + β3 CREDITS_t + β4 STUDY_t + β5 JOB_t + β6 EC_t + e_t
gpa_t = β1 + β2 SAT_t + β3 CREDITS_t + β4 STUDY_t + β5 JOB_t + e_t
A sample of 55 JMU students was taken in Fall 2002, with data on GPA, SAT scores, credit hours completed, hours of study per week, hours at a job per week, and hours at extracurricular activities.
7.14 Here is our simple regression model, and here is our multiple regression model [estimation output omitted]. Both R² and adjusted R² have increased with the inclusion of 4 additional independent variables.
7.15 Notice that the exclusion of EC increases adjusted R² but reduces R².