Linear Regression Basics III: Violating Assumptions
Fin250f: Lecture 7.2, Spring 2010
Brooks, chapter 4 (skim): 4.1-2, 4.4, 4.5, 4.7
Outline
- Violating assumptions
- Parameter stability
- Model building
OLS Assumptions
- Error variances
- Error correlations
- Error normality
- Functional forms and linearity
- Omitting variables
- Adding irrelevant variables
Error Variance
Error Variance
- Which is a bigger error?
[Figure: scatter of data points, Y plotted against X]
Error Correlations
- Patterns in residuals
- Plot residuals/residual diagnostics
- Further modeling necessary
- If you can forecast u(t+1), you need to work harder
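One simple way to ask whether u(t+1) can be forecast is to regress each residual on the one before it. The sketch below does this on simulated data; the helper names `ols_simple` and `lag1_residual_slope`, the AR(1) coefficient of 0.7, and the sample size are all assumptions chosen for illustration, not part of the lecture.

```python
import random

def ols_simple(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def lag1_residual_slope(x, y):
    """Fit y = a + b*x, then regress the residual u(t) on u(t-1)."""
    a, b = ols_simple(x, y)
    u = [yi - a - b * xi for xi, yi in zip(x, y)]
    _, rho = ols_simple(u[:-1], u[1:])
    return rho

# Simulated data (an assumption for illustration only).
random.seed(0)
T = 2000
x = [random.gauss(0, 1) for _ in range(T)]

# Errors that follow an AR(1) process with coefficient 0.7.
u, ar_errors = 0.0, []
for _ in range(T):
    u = 0.7 * u + random.gauss(0, 1)
    ar_errors.append(u)

y_corr = [1.0 + 2.0 * xi + ei for xi, ei in zip(x, ar_errors)]
y_iid = [1.0 + 2.0 * xi + random.gauss(0, 1) for xi in x]

rho_corr = lag1_residual_slope(x, y_corr)
rho_iid = lag1_residual_slope(x, y_iid)
print(f"lag-1 slope, correlated errors:  {rho_corr:.2f}")
print(f"lag-1 slope, independent errors: {rho_iid:.2f}")
```

A lag-1 slope well away from zero means the residuals still contain forecastable structure, so the model needs further work.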
Error Normality
- Skewness and kurtosis in residuals
- Testing
  - Plots
  - Bera-Jarque test
- How can this impact results?
Bera-Jarque Test for Normality
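As a rough sketch of how the test can be computed, the usual textbook form of the statistic is W = T[b1^2/6 + (b2 - 3)^2/24], where b1 is the skewness and b2 the kurtosis of the residuals, and W is compared with a chi-squared(2) distribution. The function name `bera_jarque` and the simulated samples are assumptions for illustration.

```python
import math
import random

def bera_jarque(u):
    """Return (W, p_value) for a list of residuals u.

    Under the null of normality, W is asymptotically chi-squared with 2
    degrees of freedom, whose upper-tail probability is exp(-W / 2).
    """
    T = len(u)
    mean = sum(u) / T
    m2 = sum((v - mean) ** 2 for v in u) / T
    m3 = sum((v - mean) ** 3 for v in u) / T
    m4 = sum((v - mean) ** 4 for v in u) / T
    skew = m3 / m2 ** 1.5      # b1: zero for a normal distribution
    kurt = m4 / m2 ** 2        # b2: three for a normal distribution
    W = T * (skew ** 2 / 6 + (kurt - 3) ** 2 / 24)
    return W, math.exp(-W / 2)

# Simulated residuals (assumptions for illustration only).
random.seed(0)
T = 1000
normal_u = [random.gauss(0, 1) for _ in range(T)]
# Fat tails: about 5% of draws come from a much wider distribution.
fat_u = [random.gauss(0, 5 if random.random() < 0.05 else 1) for _ in range(T)]

W_normal, p_normal = bera_jarque(normal_u)
W_fat, p_fat = bera_jarque(fat_u)
print(f"normal errors:     W = {W_normal:.2f}, p = {p_normal:.3f}")
print(f"fat-tailed errors: W = {W_fat:.2f}, p = {p_fat:.3g}")
```

The 5% critical value of a chi-squared(2) is about 5.99, so values of W above that reject normality at the 5% level.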
Nonnormal Errors: Impact
- For some theory: no problem
- In practice: can be a big problem
  - Many extreme data points
  - Forecasting models work hard to fit these extreme outliers
- Some solutions:
  - Drop data points
  - Robust forecast objectives (absolute errors)
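To illustrate the absolute-error objective, the sketch below compares an ordinary least-squares fit with an approximate least-absolute-errors fit on data containing a few extreme points. The absolute-error fit is obtained by iteratively reweighted least squares, which is one common approximation and not necessarily the method used in the course. The helper names `wls_simple` and `lad_simple` and the simulated data are assumptions.

```python
import random

def wls_simple(x, y, w):
    """Weighted least-squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return my - b * mx, b

def lad_simple(x, y, iters=100, eps=1e-6):
    """Approximate the least-absolute-errors fit by reweighted least squares."""
    w = [1.0] * len(x)
    for _ in range(iters):
        a, b = wls_simple(x, y, w)
        # Weighting each squared residual by 1/|residual| mimics |residual|.
        w = [1.0 / max(abs(yi - a - b * xi), eps) for xi, yi in zip(x, y)]
    return a, b

# Simulated data (assumption): true line y = 1 + 2x with unit-variance noise.
random.seed(2)
n = 200
x = [random.uniform(0, 10) for _ in range(n)]
y = [1.0 + 2.0 * xi + random.gauss(0, 1) for xi in x]
# Contaminate the 10 observations with the largest x using large shocks.
for i in sorted(range(n), key=lambda i: x[i])[-10:]:
    y[i] += 50.0

a_ols, b_ols = wls_simple(x, y, [1.0] * n)  # equal weights give plain OLS
a_lad, b_lad = lad_simple(x, y)
print(f"OLS slope: {b_ols:.2f}")
print(f"LAD slope: {b_lad:.2f}  (true slope is 2)")
```

The squared-error fit bends toward the outliers, while the absolute-error fit stays much closer to the line that generated most of the data.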
Functional Forms
- Y = a + bX
- Actual function is nonlinear
- Several types of diagnostics
  - Higher order (squared) terms (RESET)
- Think about specific nonlinear models
  - Neural networks
  - Threshold models
- Tricky: more later
Omitting Variables
- Leave out x(2)
- If it is correlated with x(1), this is a problem
- Beta(1) will be biased and inconsistent
- Forecast will not be optimal
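A small simulation can make the bias concrete. In the sketch below the true model is y = 1 + 2 x(1) + 3 x(2) + error, and y is regressed on x(1) alone. All coefficients, the 0.8 link between x(2) and x(1), and the helper name `slope` are assumptions chosen for illustration.

```python
import random

def slope(x, y):
    """Slope from the simple regression of y on a constant and x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx

random.seed(1)
n = 5000
x1 = [random.gauss(0, 1) for _ in range(n)]
x2_corr = [0.8 * v + random.gauss(0, 1) for v in x1]   # correlated with x1
x2_indep = [random.gauss(0, 1) for _ in range(n)]      # unrelated to x1

def make_y(x2):
    """True model: beta(1) = 2 on x1 and beta(2) = 3 on x2."""
    return [1.0 + 2.0 * a + 3.0 * b + random.gauss(0, 1) for a, b in zip(x1, x2)]

# Regress y on x1 only, leaving x2 out.
b_corr = slope(x1, make_y(x2_corr))
b_indep = slope(x1, make_y(x2_indep))
print(f"x2 correlated with x1:   estimated beta(1) = {b_corr:.2f}")
print(f"x2 uncorrelated with x1: estimated beta(1) = {b_indep:.2f}")
```

With a correlated omitted variable the estimate settles near 2 + 3 x 0.8 = 4.4 rather than 2, and more data does not help. When x(2) is unrelated to x(1) the estimate stays near 2.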
Irrelevant Variables
- Overfitting/data snooping
- Model fits to noise
- Impacts standard errors for coefficients
- Coefficients still consistent and unbiased
Parameter Stability
- Known break point
  - Chow test
  - Predictive failure test
- Unknown break
  - Quandt likelihood ratio test
  - Recursive least squares
Chow Test
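A minimal sketch of the Chow test for a known break date, using the standard form of the statistic: [(RSS - (RSS1 + RSS2)) / k] / [(RSS1 + RSS2) / (T - 2k)], compared with an F(k, T - 2k) distribution. Here RSS comes from the full sample, RSS1 and RSS2 from the two sub-samples, and k is the number of parameters (2 for Y = a + bX). The function names and the simulated data are assumptions.

```python
import random

def rss_simple(x, y):
    """Residual sum of squares from the OLS fit of y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    return sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))

def chow_stat(x, y, split, k=2):
    """Chow statistic for a break after observation `split`."""
    T = len(x)
    rss_all = rss_simple(x, y)
    rss_1 = rss_simple(x[:split], y[:split])
    rss_2 = rss_simple(x[split:], y[split:])
    return ((rss_all - rss_1 - rss_2) / k) / ((rss_1 + rss_2) / (T - 2 * k))

# Simulated data (assumption): slope moves from 1 to 2 halfway through.
random.seed(3)
T, split = 400, 200
x = [random.gauss(0, 1) for _ in range(T)]
y_break = [1.0 + (1.0 if t < split else 2.0) * x[t] + random.gauss(0, 1)
           for t in range(T)]
y_stable = [1.0 + 1.0 * x[t] + random.gauss(0, 1) for t in range(T)]

chow_break = chow_stat(x, y_break, split)
chow_stable = chow_stat(x, y_stable, split)
print(f"Chow statistic with a break: {chow_break:.2f}")
print(f"Chow statistic, no break:    {chow_stable:.2f}")
```

The statistic is then compared with a critical value from F tables; a large value says the two sub-samples are fit much better by separate parameters than by one set.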
Predictive Failure
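A minimal sketch of the predictive failure test in its standard form: estimate over the full sample of T observations to get RSS, estimate over the first T1 observations to get RSS1, and compute (RSS - RSS1) / RSS1 x (T1 - k) / T2, where T2 = T - T1 is the number of observations being predicted. The statistic is compared with an F(T2, T1 - k) distribution. Function names and simulated data are assumptions.

```python
import random

def rss_simple(x, y):
    """Residual sum of squares from the OLS fit of y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    return sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))

def predictive_failure_stat(x, y, T1, k=2):
    """Predictive failure statistic for the last len(x) - T1 observations."""
    T2 = len(x) - T1
    rss_all = rss_simple(x, y)
    rss_1 = rss_simple(x[:T1], y[:T1])
    return (rss_all - rss_1) / rss_1 * (T1 - k) / T2

# Simulated data (assumption): the intercept jumps by 3 in the last 20 points.
random.seed(4)
T1, T2 = 200, 20
T = T1 + T2
x = [random.gauss(0, 1) for _ in range(T)]
y_break = [(1.0 if t < T1 else 4.0) + 2.0 * x[t] + random.gauss(0, 1)
           for t in range(T)]
y_stable = [1.0 + 2.0 * x[t] + random.gauss(0, 1) for t in range(T)]

pf_break = predictive_failure_stat(x, y_break, T1)
pf_stable = predictive_failure_stat(x, y_stable, T1)
print(f"predictive failure statistic with a break: {pf_break:.2f}")
print(f"predictive failure statistic, no break:    {pf_stable:.2f}")
```

Unlike the Chow test, this version still works when the second period is too short to estimate its own regression.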
Unknown Breaks
- Search for break
- Look for maximum Chow level
- Distribution is tricky
- Monte Carlo/bootstrap
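The search can be sketched as follows: compute the Chow statistic at each candidate break date, keep the maximum, and then simulate data with no break to see how large that maximum tends to be by chance. The trimming of 15 observations at each end, the grid step of 5, the 200 replications, and all function names are assumptions made to keep the example small.

```python
import random

def rss_simple(x, y):
    """Residual sum of squares from the OLS fit of y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    return sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))

def chow_stat(x, y, split, k=2):
    """Chow statistic for a break after observation `split`."""
    rss_all = rss_simple(x, y)
    rss_sub = rss_simple(x[:split], y[:split]) + rss_simple(x[split:], y[split:])
    return ((rss_all - rss_sub) / k) / (rss_sub / (len(x) - 2 * k))

def max_chow(x, y, trim=15, step=5):
    """Return (largest Chow statistic, candidate date that attains it)."""
    T = len(x)
    return max((chow_stat(x, y, s), s) for s in range(trim, T - trim + 1, step))

def simulate(T, break_at=None):
    """Slope is 1 before `break_at` and 3 from then on (no break if None)."""
    x = [random.gauss(0, 1) for _ in range(T)]
    y = [1.0 + (3.0 if break_at is not None and t >= break_at else 1.0) * x[t]
         + random.gauss(0, 1) for t in range(T)]
    return x, y

random.seed(4)
T = 100
x, y = simulate(T, break_at=60)
stat, date = max_chow(x, y)

# Monte Carlo: distribution of the maximum when there is no break at all.
null = sorted(max_chow(*simulate(T))[0] for _ in range(200))
crit95 = null[int(0.95 * len(null))]
print(f"max Chow = {stat:.2f} at date {date}; simulated 95% level = {crit95:.2f}")
```

Because the maximum is taken over many dates, it must be compared with the simulated critical value and not with the ordinary F table, which would reject far too often.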
Recursive/Rolling Estimation
- Recursive
  - Estimate over (1,T1), then move T1 up to the full sample T
  - See if parameters converge
- Rolling
  - Roll bands (t-T,t) through data
  - Watch parameters move through time
- We’ll use some of these
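Both schemes can be sketched in a few lines. The helper names, the starting sample of 20 observations, and the rolling window of 50 observations are assumptions for illustration; the data are simulated with a constant slope of 2.

```python
import random

def slope(x, y):
    """Slope from the simple regression of y on a constant and x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx

def recursive_slopes(x, y, start):
    """Slope estimated on observations 1..T1 for T1 = start, ..., T."""
    return [slope(x[:t1], y[:t1]) for t1 in range(start, len(x) + 1)]

def rolling_slopes(x, y, window):
    """Slope estimated on the latest `window` observations ending at each t."""
    return [slope(x[t - window:t], y[t - window:t])
            for t in range(window, len(x) + 1)]

# Simulated data (assumption): stable relationship y = 1 + 2x + noise.
random.seed(5)
T = 300
x = [random.gauss(0, 1) for _ in range(T)]
y = [1.0 + 2.0 * xi + random.gauss(0, 1) for xi in x]

rec = recursive_slopes(x, y, start=20)
roll = rolling_slopes(x, y, window=50)
print(f"recursive: first {rec[0]:.2f}, last {rec[-1]:.2f}")
print(f"rolling:   min {min(roll):.2f}, max {max(roll):.2f}")
```

With stable parameters the recursive estimates settle down as the sample grows, while the rolling estimates keep fluctuating because each uses only a fixed number of observations.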
Pure Out of Sample Tests
- Estimate parameters over (1,T1)
- Get errors over (T1+1,T)
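The two steps can be sketched as below: fit on the first T1 observations only, then compute forecast errors on the remaining ones without re-estimating. The helper name `ols_simple`, the split at T1 = 200, and the simulated data are assumptions.

```python
import random

def ols_simple(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

# Simulated data (assumption): y = 1 + 2x + noise with unit variance.
random.seed(6)
T, T1 = 300, 200
x = [random.gauss(0, 1) for _ in range(T)]
y = [1.0 + 2.0 * xi + random.gauss(0, 1) for xi in x]

# Step 1: estimate parameters over (1, T1) only.
a, b = ols_simple(x[:T1], y[:T1])

# Step 2: forecast errors over (T1+1, T), using the fixed estimates.
errors = [y[t] - a - b * x[t] for t in range(T1, T)]
rmse = (sum(e * e for e in errors) / len(errors)) ** 0.5
print(f"out-of-sample RMSE over {len(errors)} observations: {rmse:.2f}")
```

Because the evaluation data played no part in estimation, the resulting error measure is not flattered by overfitting.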
Model Construction
- General -> specific
  - Less financial theory
  - More statistics
  - Problems: large unwieldy models
- Simple -> general
  - More theory at the start
  - Problems: can leave out important stuff