Multiple Regression (Lecture 12)
Today's plan
- Moving from the bi-variate to the multivariate model
- How the multivariate equation relates to the bi-variate equation
- Derivation
- The difference between true and estimated models
Introduction
- In a multivariate regression, the number of X variables is restricted by n - k > 0, where k is the number of parameters in the model
- In the bi-variate case we had k = 2 (an intercept and one slope)
- If n - k <= 0 we would not be able to calculate test statistics
- We will use an example where earnings is the dependent variable, years of schooling (YRS_SCH) is X1, and age is X2
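As a quick sanity check of the n - k > 0 restriction, here is a minimal sketch; the sample size is made up for illustration and is not the L12.xls data:

```python
# Degrees-of-freedom check for the earnings example: with n observations and
# k estimated parameters (intercept plus one coefficient per X variable),
# test statistics require n - k > 0.
n = 20                 # hypothetical sample size
num_x_vars = 2         # YRS_SCH and age
k = num_x_vars + 1     # parameters: a, b1, b2
df = n - k
assert df > 0, "n - k <= 0: cannot compute test statistics"
print(df)  # 17
```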
Derivation
- The rules for deriving the parameters are the same as in the bi-variate world
- Our g function will be g(a, b1, b2)
- We will still want to minimize the sum of squared errors, Σe²
- Our model will be Y = a + b1X1 + b2X2 + e
- We can rewrite this in terms of deviations from mean values (coded variables): y = b1x1 + b2x2 + e
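The minimization above can be sketched numerically. This is a hedged illustration with made-up data (not the L12.xls numbers): the slopes come from the normal equations in the coded variables, and the intercept from the means.

```python
import numpy as np

# Hypothetical data standing in for the earnings example
Y  = np.array([10.0, 12.0, 15.0, 11.0, 14.0, 13.0])   # earnings
X1 = np.array([8.0, 10.0, 14.0, 9.0, 12.0, 11.0])     # years of schooling
X2 = np.array([30.0, 35.0, 45.0, 28.0, 40.0, 38.0])   # age

# "Coded" variables: deviations from the mean
y, x1, x2 = Y - Y.mean(), X1 - X1.mean(), X2 - X2.mean()

# Normal equations for the slopes in the coded variables
A = np.array([[np.sum(x1 * x1), np.sum(x1 * x2)],
              [np.sum(x1 * x2), np.sum(x2 * x2)]])
c = np.array([np.sum(x1 * y), np.sum(x2 * y)])
b1, b2 = np.linalg.solve(A, c)

# The intercept follows from the FOC that the errors sum to zero
a = Y.mean() - b1 * X1.mean() - b2 * X2.mean()
print(a, b1, b2)
```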
Derivation (2)
- We can rearrange our model in terms of e: e = Y - a - b1X1 - b2X2
- Differentiating the sum of squared errors with respect to each of the parameters gives us the first-order conditions
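The three first-order conditions appeared as an image on the original slide; reconstructed in standard form, they are:

```latex
\begin{align*}
\frac{\partial \sum e^2}{\partial a}   &= -2\sum\left(Y - a - b_1 X_1 - b_2 X_2\right) = 0 \\
\frac{\partial \sum e^2}{\partial b_1} &= -2\sum X_1\left(Y - a - b_1 X_1 - b_2 X_2\right) = 0 \\
\frac{\partial \sum e^2}{\partial b_2} &= -2\sum X_2\left(Y - a - b_1 X_1 - b_2 X_2\right) = 0
\end{align*}
```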
Derivation (3)
- To get our estimate of a, we use the first-order condition that the sum of the errors equals zero
- We substitute in for e and solve for a
- As we include more variables, we need more terms to calculate the intercept
- Calculating b1 and b2 is more complicated
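The solved intercept, reconstructed from the standard result (the equation was an image on the slide):

```latex
\hat{a} = \bar{Y} - \hat{b}_1 \bar{X}_1 - \hat{b}_2 \bar{X}_2
```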
Derivation (4)
- We have the first-order conditions for b1 and b2
- The multivariate case is much more complicated than the bi-variate case, but the pattern remains the same
- The denominator still considers the variation in X
- The numerator still considers the variation of X1, X2, and Y
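The slope estimators in the coded variables, reconstructed from the standard two-regressor formulas (shown as images on the original slides):

```latex
\begin{align*}
\hat{b}_1 &= \frac{\left(\sum x_1 y\right)\left(\sum x_2^2\right) - \left(\sum x_2 y\right)\left(\sum x_1 x_2\right)}{\left(\sum x_1^2\right)\left(\sum x_2^2\right) - \left(\sum x_1 x_2\right)^2} \\
\hat{b}_2 &= \frac{\left(\sum x_2 y\right)\left(\sum x_1^2\right) - \left(\sum x_1 y\right)\left(\sum x_1 x_2\right)}{\left(\sum x_1^2\right)\left(\sum x_2^2\right) - \left(\sum x_1 x_2\right)^2}
\end{align*}
```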
Matrix of products & cross-products
- This matrix collects the sums of squares (Σx1², Σx2², Σy²) and cross-products (Σx1x2, Σx1y, Σx2y)
- It will help us calculate b1 and b2, as well as other test statistics we'll need
- The matrix of products and cross-products is symmetric
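A minimal sketch of building this matrix, assuming hypothetical data in place of the L12.xls sheet:

```python
import numpy as np

# Hypothetical data standing in for the earnings example
Y  = np.array([10.0, 12.0, 15.0, 11.0, 14.0, 13.0])   # earnings
X1 = np.array([8.0, 10.0, 14.0, 9.0, 12.0, 11.0])     # years of schooling
X2 = np.array([30.0, 35.0, 45.0, 28.0, 40.0, 38.0])   # age

# Deviations from means, one column per variable (x1, x2, y)
D = np.column_stack([X1 - X1.mean(), X2 - X2.mean(), Y - Y.mean()])

# Matrix of products and cross-products: entry (i, j) = sum of d_i * d_j
M = D.T @ D
print(M)  # symmetric 3x3 matrix
```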
Example
- L12.xls contains an example of the matrix of products and cross-products we're interested in
- The spreadsheet also shows that LINEST can accommodate a multivariate regression
- From the spreadsheet we can read off the sums of products and cross-products we need
Example (2)
- We can then calculate b1 and b2 from the matrix entries
- We can also calculate the intercept a
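Excel's LINEST fits the same least-squares problem; a numpy equivalent, sketched here with made-up data rather than the L12.xls values:

```python
import numpy as np

# Hypothetical earnings data (stand-in for the L12.xls sheet)
Y  = np.array([10.0, 12.0, 15.0, 11.0, 14.0, 13.0])   # earnings
X1 = np.array([8.0, 10.0, 14.0, 9.0, 12.0, 11.0])     # YRS_SCH
X2 = np.array([30.0, 35.0, 45.0, 28.0, 40.0, 38.0])   # age

# LINEST fits Y = a + b1*X1 + b2*X2 by least squares; lstsq does the same
# given a design matrix with a constant column
X = np.column_stack([np.ones_like(X1), X1, X2])
(a, b1, b2), *_ = np.linalg.lstsq(X, Y, rcond=None)
print(a, b1, b2)
```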
Example (3)
- So now we can ask: what was the effect of including age?
- Had we not included age, our bi-variate regression equation would be: Y = 4.53 + 0.097 X1, where X1 is years of schooling
- Including age, the multivariate regression equation is Y = a + b1X1 + b2X2, with the estimates shown in L12.xls
- By including age, we reduce the coefficient on education (X1) by nearly a half!
True & estimated models
- A true model can come from:
- 1) Economic theory
  - an example is the Cobb-Douglas production function Y = A L^α K^β
  - the form is provided by economic theory
  - we want to test if α + β = 1
- 2) Ad-hoc variable inclusion
  - the justification for the variables comes from economic theory, but we include variables on the basis of significance in statistical tests
  - an example: the Phillips Curve
Omitted Variable Bias
- Let's go back to the returns-to-education example in L12.xls and examine omitted variable bias
- Let's assume that the true model is: Y = a + b1X1 + b2X2 + e
- But what if we instead estimate the following model: Y = a + b1X1 + u, where X1 is still years of education
True & estimated models (3)
- Reasons why we might not estimate the true model:
  - we might not be able to collect the necessary data
  - we might simply forget to include other variables, such as age, in the regression
- Let's rewrite our equations in terms of deviations from the mean:
  - True model: y = b1x1 + b2x2 + e
  - Estimated model: y = b1x1 + u
Omitted variable bias
- Our estimate of the slope coefficient in the bi-variate model will be: b̂1 = Σx1y / Σx1²
- If we know the true model, we can plug it into this expression and take the expectation
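The substitution step, reconstructed from the standard derivation (the equations were images on the slide): plug the true model y = b1x1 + b2x2 + e into the bi-variate estimator,

```latex
\hat{b}_1 = \frac{\sum x_1 y}{\sum x_1^2}
          = \frac{\sum x_1 \left(b_1 x_1 + b_2 x_2 + e\right)}{\sum x_1^2}
          = b_1 + b_2 \frac{\sum x_1 x_2}{\sum x_1^2} + \frac{\sum x_1 e}{\sum x_1^2}
```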
Omitted variable bias (2)
- We can multiply out the terms and simplify the expression
- Recall that one of our CLR assumptions is E(x1 e) = 0, so the last term drops out and E(b̂1) = b1 + b2 (Σx1x2 / Σx1²)
- The term b2 (Σx1x2 / Σx1²) represents the omitted variable bias
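A small Monte Carlo sketch of this result, with made-up parameters rather than the L12.xls/L13.xls numbers: the true model includes X2, but we regress Y on X1 alone and watch the bi-variate slope absorb the bias term.

```python
import numpy as np

# True model: Y = a + b1*X1 + b2*X2 + e, with X2 correlated with X1.
# All parameter values are hypothetical, for illustration only.
rng = np.random.default_rng(0)
a, b1, b2 = 1.0, 0.05, 0.10
n, reps = 200, 2000

estimates = []
for _ in range(reps):
    X1 = rng.normal(12, 2, n)                   # years of schooling
    X2 = 20 + 1.5 * X1 + rng.normal(0, 2, n)    # age, correlated with schooling
    Y = a + b1 * X1 + b2 * X2 + rng.normal(0, 1, n)
    x1, y = X1 - X1.mean(), Y - Y.mean()
    estimates.append(np.sum(x1 * y) / np.sum(x1 * x1))  # bi-variate slope

# E(b1_hat) = b1 + b2 * (sum x1x2 / sum x1^2); here that ratio is about 1.5,
# so the average estimate is near 0.05 + 0.10 * 1.5 = 0.20, not 0.05
print(np.mean(estimates))
```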
Omitted variable bias example
- Returning to the L13.xls example, we have the bi-variate estimate of b1 from regressing earnings on schooling alone
- If we think that b2 ≠ 0 and that x1 and x2 are correlated, then the bias term b2 (Σx1x2 / Σx1²) is non-zero
- This leads to a biased estimate of b1
Recap / what's to come
- We learned that deriving the multivariate regression equation is similar to deriving the bi-variate case
- We worked with a matrix of products and cross-products
- We looked at the difference between true and estimated regression models
- We learned to calculate the omitted variable bias
- In the next few lectures we'll be doing some more with multivariate models and applications
Unnecessary Variables
- What happens if variables included in the estimated model are not relevant under the 'true' model?
  - Estimated model: y = b1x1 + b2x2 + e
  - True model: y = b1x1 + u
- If variables are unnecessary, their estimated coefficients should be statistically indistinguishable from zero
- How to detect this: t-ratio hypothesis tests, or joint hypothesis tests using the F-distribution
- Dropping unnecessary variables helps to make models parsimonious
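The t-ratio check above can be sketched as follows; this is a hedged illustration with simulated data, where x2 is unnecessary by construction:

```python
import numpy as np

# Fit y = a + b1*x1 + b2*x2 where x2 is truly irrelevant (b2 = 0 in the
# true model), then inspect the t-ratio on b2.  All numbers are made up.
rng = np.random.default_rng(1)
n = 100
x1 = rng.normal(0, 1, n)
x2 = rng.normal(0, 1, n)                    # the unnecessary variable
y = 2.0 + 0.5 * x1 + rng.normal(0, 1, n)    # true model omits x2

X = np.column_stack([np.ones(n), x1, x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
s2 = resid @ resid / (n - 3)                # k = 3 parameters
cov = s2 * np.linalg.inv(X.T @ X)           # estimated covariance of beta-hat
t_b2 = beta[2] / np.sqrt(cov[2, 2])
print(t_b2)  # typically |t| < 2 when the variable is irrelevant
```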