
1 Econometrics
Chengyuan Yin, School of Mathematics

2 10. Prediction in the Classical Regression Model

3 Forecasting
Objective: Forecast.
Distinction: ex post vs. ex ante forecasting.
Ex post: RHS data are observed.
Ex ante: RHS data must be forecasted.
Prediction vs. model validation: within-sample prediction vs. a “hold-out sample.”

4 Prediction Intervals
Given x0, predict y0. Two cases:
Estimate E[y0|x0] = x0'β; predict y0 = x0'β + ε0.
The obvious predictor is b'x0 + an estimate of ε0. Forecast ε0 as 0, but allow for its variance.
Alternative view: when we predict y0 with b'x0, what is the forecast error? ŷ0 − y0 = b'x0 − x0'β − ε0, so the variance of the forecast error is x0'Var[b − β]x0 + σ².
How do we estimate this? Form a confidence interval. Two cases:
If x0 is a vector of constants, the variance term is just x0'Var[b]x0, so the forecast-error variance is σ² + x0'Var[b]x0. Form the confidence interval as usual.
If x0 itself had to be estimated, then b'x0 is a product of random variables. What is the variance of the product? (Ouch!) One possibility: use bootstrapping.
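A minimal sketch of the constant-x0 case on simulated data (the data-generating process and all variable names are illustrative, not from the slides): it forms the point prediction b'x0 and a 95% interval from the forecast-error variance σ² + x0'Var[b]x0.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, sigma = 100, 2.0
x = rng.uniform(0, 10, n)
X = np.column_stack([np.ones(n), x])          # regressors with constant term
y = 1.0 + 0.5 * x + rng.normal(0, sigma, n)

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y                         # least squares coefficients
e = y - X @ b
s2 = e @ e / (n - X.shape[1])                 # estimate of sigma^2

x0 = np.array([1.0, 8.0])                     # x0 is a vector of constants here
y0_hat = x0 @ b                               # point prediction b'x0
# forecast-error variance: s2 + x0' Var[b] x0, with Var[b] = s2 (X'X)^{-1}
se_forecast = np.sqrt(s2 + s2 * (x0 @ XtX_inv @ x0))
t = stats.t.ppf(0.975, n - X.shape[1])
print(f"95% prediction interval: {y0_hat - t*se_forecast:.3f} "
      f"to {y0_hat + t*se_forecast:.3f}")
```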

5 Forecast Variance
The variance of the forecast error is
σ² + x0'Var[b]x0 = σ² + σ²[x0'(X'X)⁻¹x0]
If the model contains a constant term, this can be written in terms of squares and cross products of deviations from the means; in the one-regressor case,
Var[forecast error] = σ²[1 + 1/n + (x0 − x̄)² / Σi(xi − x̄)²]
Interpretation: the forecast variance is smallest in the middle of our “experience” and increases as we move outside it.
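A small illustration of that interpretation, assuming a one-regressor model on simulated data (names illustrative): evaluating the forecast variance at the mean of x and at points farther away shows it growing as x0 leaves the center of the sample.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
x = rng.normal(5, 2, n)
y = 2 + 3 * x + rng.normal(0, 1, n)
xbar, Sxx = x.mean(), ((x - x.mean())**2).sum()
b1 = ((x - xbar) * (y - y.mean())).sum() / Sxx   # slope
b0 = y.mean() - b1 * xbar                        # intercept
s2 = ((y - b0 - b1 * x)**2).sum() / (n - 2)      # estimate of sigma^2

for x0 in (xbar, xbar + 2, xbar + 4):            # move away from the mean
    var_f = s2 * (1 + 1/n + (x0 - xbar)**2 / Sxx)
    print(f"x0 = {x0:5.2f}  forecast variance = {var_f:.4f}")
```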

6 Butterfly Effect (5.1 in the 6th edition)

7 Salkever’s Algebraic Trick
Salkever’s method of computing the forecasts and forecast variances:
Stack the data as y* = (y', 0')' and X* = [X, 0; X0, −I], adding one row and one dummy column per forecast observation. The multiple regression of y* on X* produces the least squares coefficient vector followed by the predictions.
Residuals are 0 for the predictions, so s²(X*'X*)⁻¹ gives the covariance matrix for the coefficient estimates and the variances for the forecasts.
(Very clever, useful for understanding. Not actually used in modern software.)
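A sketch of the trick under the assumed layout above, on simulated data with illustrative names: the dummy coefficients reproduce the forecasts X0·b, and the lower diagonal block of s²(X*'X*)⁻¹ reproduces the forecast-error variances.

```python
import numpy as np

rng = np.random.default_rng(2)
n, k, m = 60, 3, 2                         # sample size, regressors, forecast periods
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = X @ np.array([1.0, 0.5, -0.3]) + rng.normal(0, 1, n)
X0 = np.column_stack([np.ones(m), rng.normal(size=(m, k - 1))])  # forecast regressors

# Augmented ("Salkever") system: y* = [y; 0], X* = [[X, 0], [X0, -I]]
y_star = np.concatenate([y, np.zeros(m)])
X_star = np.block([[X, np.zeros((n, m))],
                   [X0, -np.eye(m)]])
coef = np.linalg.lstsq(X_star, y_star, rcond=None)[0]
b, forecasts = coef[:k], coef[k:]          # OLS slopes, then the m predictions

b_direct = np.linalg.lstsq(X, y, rcond=None)[0]
print(np.allclose(b, b_direct))               # True: same coefficient vector
print(np.allclose(forecasts, X0 @ b_direct))  # True: dummy coefficients = forecasts

e = y - X @ b_direct
s2 = e @ e / (n - k)
V = s2 * np.linalg.inv(X_star.T @ X_star)  # joint covariance matrix
print(np.diag(V)[k:])                      # forecast variances s2*(1 + x0'(X'X)^-1 x0)
```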

8 Dummy Variable for One Observation
A dummy variable that isolates a single observation. What does this do?
Define d to be the dummy variable in question, Z = all other regressors, and X = [Z, d].
Run the multiple regression of y on X. We know that X'e = 0, where e is the column vector of residuals. That means d'e = 0, which says that ej = 0 for the observation j that the dummy isolates.
A fairly important result, and worth knowing.
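A quick check on simulated data (names illustrative): regressing y on [Z, d], where d picks out observation j, drives that observation's residual to exactly zero.

```python
import numpy as np

rng = np.random.default_rng(3)
n, j = 40, 7                                  # j indexes the isolated observation
Z = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = Z @ np.array([1.0, 2.0, -1.0]) + rng.normal(0, 1, n)

d = np.zeros(n); d[j] = 1.0                   # dummy for observation j only
X = np.column_stack([Z, d])
b = np.linalg.lstsq(X, y, rcond=None)[0]
e = y - X @ b
print(e[j])                                   # ~0: X'e = 0 implies d'e = e_j = 0
```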

9 Oaxaca Decomposition
Two groups, two regression models (two time periods, men vs. women, two countries, etc.):
y1 = X1β1 + ε1 and y2 = X2β2 + ε2
Consider the mean values:
y1* = E[y1|mean x1] = x1*'β1
y2* = E[y2|mean x2] = x2*'β2
Now, explain why y1* is different from y2*. (I.e., departing from y2, why is y1 different? One could reverse the roles of groups 1 and 2.)
y1* − y2* = x1*'β1 − x2*'β2 = x1*'(β1 − β2) + (x1* − x2*)'β2
The first term is the change in model (coefficients); the second is the change in conditions (characteristics).
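A minimal sketch of the decomposition on simulated data for two groups (the data-generating process and names are illustrative): the two pieces sum exactly to the difference in predicted means.

```python
import numpy as np

rng = np.random.default_rng(4)
n1, n2 = 200, 200
X1 = np.column_stack([np.ones(n1), rng.normal(1.0, 1, n1)])
X2 = np.column_stack([np.ones(n2), rng.normal(0.5, 1, n2)])
y1 = X1 @ np.array([1.0, 0.8]) + rng.normal(0, 1, n1)
y2 = X2 @ np.array([0.5, 0.6]) + rng.normal(0, 1, n2)

b1 = np.linalg.lstsq(X1, y1, rcond=None)[0]   # group 1 coefficients
b2 = np.linalg.lstsq(X2, y2, rcond=None)[0]   # group 2 coefficients
x1bar, x2bar = X1.mean(axis=0), X2.mean(axis=0)

gap = x1bar @ b1 - x2bar @ b2
model_part = x1bar @ (b1 - b2)                # change in model (coefficients)
conditions_part = (x1bar - x2bar) @ b2        # change in conditions (characteristics)
print(gap, model_part + conditions_part)      # the two parts sum to the gap
```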

10 The Oaxaca Decomposition

