Chapter 5 Heteroskedasticity.

Chapter 5 Heteroskedasticity

A regression line

What is in this Chapter? How do we detect this problem? What are the consequences of this problem? What are the solutions?

What is in this Chapter? First, we discuss detection: tests based on OLS residuals, the likelihood ratio test, the Goldfeld-Quandt (G-Q) test, and the Breusch-Pagan (B-P) test. The last one is an LM test. Regarding consequences, we show that the OLS estimators are unbiased but inefficient and that the standard errors are also biased, thus invalidating tests of significance.

What is in this Chapter? Regarding solutions, we discuss solutions depending on particular assumptions about the error variance and general solutions. We also discuss transformation of variables to logs and the problems associated with deflators, both of which are commonly used as solutions to the heteroskedasticity problem.

5.1 Introduction Homoskedasticity: the variance of the error terms is constant. Heteroskedasticity: the variance of the error terms is non-constant. Illustrative Example: Table 5.1 presents consumption expenditures (y) and income (x) for 20 families. Suppose that we estimate the equation by ordinary least squares.

5.1 Introduction We get (figures in parentheses are standard errors): $y = 0.847 + 0.899x$, with standard errors (0.703) for the intercept and (0.0253) for the slope, $R^2 = 0.986$, and RSS = 31.074. (See also Section 5.4.)
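As a minimal sketch of how quantities like those reported above are computed, the following plain-Python function returns the OLS intercept, slope, their standard errors, $R^2$, and RSS for a simple regression. The data are hypothetical placeholders, not the values in Table 5.1, and the function name `simple_ols` is an illustrative choice.

```python
import math

def simple_ols(x, y):
    """OLS for y = a + b*x; returns coefficients, standard errors, R^2, RSS."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    tss = sum((yi - my) ** 2 for yi in y)
    s2 = rss / (n - 2)  # unbiased estimate of the error variance
    return {
        "a": a,
        "b": b,
        "se_a": math.sqrt(s2 * (1 / n + mx ** 2 / sxx)),
        "se_b": math.sqrt(s2 / sxx),
        "r2": 1 - rss / tss,
        "rss": rss,
    }

# Hypothetical income (x) and consumption (y) figures, for illustration only.
income = [10.0, 15.0, 20.0, 25.0, 30.0, 35.0]
consumption = [9.5, 14.2, 18.1, 23.9, 26.0, 33.1]
fit = simple_ols(income, consumption)
print(fit)
```

The standard errors printed here are the usual OLS ones, which are only valid under homoskedasticity; that is exactly the assumption this chapter questions.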

5.1 Introduction The residuals from this equation are presented in Table 5.3. In this situation there is no perceptible increase in the magnitudes of the residuals as the value of x increases. Thus there does not appear to be a heteroskedasticity problem.

5.2 Detection of Heteroskedasticity In the illustrative example in Section 5.1 we plotted the estimated residuals $\hat{u}_i$ against $\hat{y}_i$ to see whether we notice any systematic pattern in the residuals that suggests heteroskedasticity in the errors. Note, however, that by virtue of the normal equations, $\hat{u}_i$ and $\hat{y}_i$ are uncorrelated, though $\hat{u}_i^2$ could be correlated with $\hat{y}_i$.

5.2 Detection of Heteroskedasticity Thus if we are using a regression procedure to test for heteroskedasticity, we should use a regression of $\hat{u}_i$ on $\hat{y}_i^2, \hat{y}_i^3, \ldots$ or a regression of $|\hat{u}_i|$ or $\hat{u}_i^2$ on $\hat{y}_i, \hat{y}_i^2, \ldots$ In the case of multiple regression, we should use powers of $\hat{y}_i$, the predicted value of $y_i$, or powers of all the explanatory variables.

5.2 Detection of Heteroskedasticity The test suggested by Anscombe and a test called RESET suggested by Ramsey both involve regressing $\hat{u}_i$ on $\hat{y}_i^2, \hat{y}_i^3, \ldots$ and testing whether or not the coefficients are significant. The test suggested by White involves regressing $\hat{u}_i^2$ on all the explanatory variables and their squares and cross products. For instance, with explanatory variables $x_1, x_2, x_3$, it involves regressing $\hat{u}_i^2$ on $x_1, x_2, x_3, x_1^2, x_2^2, x_3^2, x_1 x_2, x_2 x_3, x_3 x_1$.

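5.2 Detection of Heteroskedasticity Glejser suggested estimating regressions of the type $|\hat{u}_i| = \alpha + \beta x_i$, $|\hat{u}_i| = \alpha + \beta / x_i$, $|\hat{u}_i| = \alpha + \beta \sqrt{x_i}$, and so on, and testing the hypothesis $\beta = 0$.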
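A minimal sketch of Glejser-type regressions, assuming a single regressor x with positive values: the absolute OLS residuals are regressed on x, 1/x, and the square root of x, and the t-ratio of each slope is reported. The data are hypothetical (an error whose magnitude grows with x), and the helper names `slope_t` and `glejser` are illustrative.

```python
import math

def slope_t(z, v):
    """Regress v on z with a constant; return (intercept, slope, t-ratio of slope)."""
    n = len(z)
    mz, mv = sum(z) / n, sum(v) / n
    szz = sum((zi - mz) ** 2 for zi in z)
    b = sum((zi - mz) * (vi - mv) for zi, vi in zip(z, v)) / szz
    a = mv - b * mz
    rss = sum((vi - a - b * zi) ** 2 for zi, vi in zip(z, v))
    se_b = math.sqrt(rss / (n - 2) / szz)
    return a, b, b / se_b

def glejser(x, y):
    """t-ratios of the slope in regressions of |residual| on x, 1/x and sqrt(x)."""
    a, b, _ = slope_t(x, y)
    abs_resid = [abs(yi - a - b * xi) for xi, yi in zip(x, y)]
    forms = {"x": lambda v: v, "1/x": lambda v: 1 / v, "sqrt(x)": math.sqrt}
    return {name: slope_t([h(xi) for xi in x], abs_resid)[2]
            for name, h in forms.items()}

# Hypothetical data: the disturbance alternates in sign and grows with x.
x = [float(i) for i in range(1, 13)]
y = [2 + 0.9 * xi + 0.1 * xi * (-1) ** i for i, xi in enumerate(x, start=1)]
t_ratios = glejser(x, y)
print(t_ratios)
```

A large t-ratio in any of the three regressions is taken as evidence against a constant error variance.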

5.2 Detection of Heteroskedasticity The implicit assumption behind all these tests is that $\operatorname{var}(u_i) = \sigma_i^2 = \sigma^2 f(z_i)$, where $z_i$ is an unknown variable and the different tests use different proxies or surrogates for the unknown function $f(z)$.

5.2 Detection of Heteroskedasticity Thus there is evidence of heteroskedasticity even in the log-linear form, although, casually looking at the residuals in Table 5.3, we concluded earlier that the errors were homoskedastic. The Goldfeld-Quandt test, to be discussed later in this section, also did not reject the hypothesis of homoskedasticity. The Glejser tests, however, show significant heteroskedasticity in the log-linear form.

Assignment Redo this illustrative example: (1) plot the absolute value of the residuals against the x variable, for the linear form and the log-linear form; (2) run the three types of tests, for the linear form and the log-linear form; (3) report the EViews table; (4) reject or do not reject the null hypothesis of homogeneous variance.

5.2 Detection of Heteroskedasticity Some Other Tests (General Tests): the Likelihood Ratio Test, the Goldfeld and Quandt Test, and the Breusch-Pagan Test.

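5.2 Detection of Heteroskedasticity Likelihood Ratio Test If the number of observations is large, one can use a likelihood ratio test. Divide the residuals (estimated from the OLS regression) into k groups with $n_i$ observations in the $i$th group, $\sum n_i = n$. Estimate the error variances in each group by $\hat{\sigma}_i^2$. Let the estimate of the error variance from the entire sample be $\hat{\sigma}^2$. Then if we define $\lambda$ as $\lambda = \prod_{i=1}^{k} \hat{\sigma}_i^{n_i} / \hat{\sigma}^{n}$, $-2 \log_e \lambda$ has a $\chi^2$-distribution with d.f. $(k - 1)$.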
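A minimal sketch of the likelihood ratio statistic for grouped residuals, assuming the standard form $-2\log\lambda = n \log \hat{\sigma}^2 - \sum_i n_i \log \hat{\sigma}_i^2$ with variances estimated as mean squared residuals (divisor $n_i$, not centered, which is an assumption of this sketch). The residual groups below are hypothetical.

```python
import math

def lr_statistic(groups):
    """-2 log(lambda) for k groups of OLS residuals.

    Each variance is the mean squared residual in its group (an assumption
    of this sketch); compare the result with chi-square, d.f. k - 1.
    """
    n = sum(len(g) for g in groups)
    pooled_var = sum(e * e for g in groups for e in g) / n
    stat = n * math.log(pooled_var)
    for g in groups:
        group_var = sum(e * e for e in g) / len(g)
        stat -= len(g) * math.log(group_var)
    return stat

# Hypothetical residuals split into a low-variance and a high-variance group.
equal = lr_statistic([[1.0, -1.0], [1.0, -1.0]])
unequal = lr_statistic([[1.0, -1.0], [3.0, -3.0]])
print(equal, unequal)
```

When all group variances coincide the statistic is zero, and it grows as the group variances move apart.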

5.2 Detection of Heteroskedasticity Goldfeld and Quandt Test If we do not have large samples, we can use the Goldfeld and Quandt test. In this test we split the observations into two groups, one corresponding to large values of x and the other corresponding to small values of x.

5.2 Detection of Heteroskedasticity Fit separate regressions for each and then apply an F-test to test the equality of error variances. Goldfeld and Quandt suggest omitting some observations in the middle to increase our ability to discriminate between the two error variances.
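The procedure can be sketched as follows for a single regressor: sort by x, optionally drop `omit` middle observations, fit a separate line to each half, and form the ratio of the two error-variance estimates. The data are hypothetical and the function names are illustrative; the resulting ratio is compared with an F critical value taken from tables.

```python
def residual_ss(x, y):
    """Residual sum of squares from a simple OLS regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    a = my - b * mx
    return sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))

def goldfeld_quandt(x, y, omit=0):
    """Return (F-ratio, (df_num, df_den)) with the high-x group in the numerator."""
    pairs = sorted(zip(x, y))            # order observations by x
    m = (len(pairs) - omit) // 2         # size of each group after omitting the middle
    low, high = pairs[:m], pairs[-m:]
    df = m - 2                           # two parameters estimated per group
    s2_low = residual_ss(*zip(*low)) / df
    s2_high = residual_ss(*zip(*high)) / df
    return s2_high / s2_low, (df, df)

# Hypothetical data: the disturbance alternates in sign and grows with x.
x = [float(i) for i in range(1, 13)]
y = [2 + 0.9 * xi + 0.1 * xi * (-1) ** i for i, xi in enumerate(x, start=1)]
f_ratio, dfs = goldfeld_quandt(x, y, omit=2)
print(f_ratio, dfs)
```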

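5.2 Detection of Heteroskedasticity Breusch-Pagan Test Suppose that $\operatorname{var}(u_i) = \sigma_i^2$. If there are some variables $z_1, z_2, \ldots, z_r$ that influence the error variance and if $\sigma_i^2 = f(\alpha_0 + \alpha_1 z_{1i} + \alpha_2 z_{2i} + \cdots + \alpha_r z_{ri})$, then the Breusch and Pagan test is a test of the hypothesis $H_0: \alpha_1 = \alpha_2 = \cdots = \alpha_r = 0$. The function $f(\cdot)$ can be any function.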

5.2 Detection of Heteroskedasticity For instance, $f(x)$ can be $x$, $x^2$, $e^x$, and so on. The Breusch and Pagan test does not depend on the functional form. Let $S_0$ = regression sum of squares from a regression of $\hat{u}_i^2$ on $z_1, z_2, \ldots, z_r$. Then $\lambda = S_0 / (2\hat{\sigma}^4)$ has a $\chi^2$-distribution with d.f. $r$. This test is an asymptotic test. An intuitive justification for the test will be given after an illustrative example.
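A minimal sketch of the Breusch-Pagan statistic for the simplest case of one regressor x and one variance driver z (so r = 1). It assumes $\hat{\sigma}^2$ is the mean squared OLS residual and computes $S_0 / (2\hat{\sigma}^4)$, where $S_0$ is the regression sum of squares from regressing the squared residuals on z. The data and the function name are hypothetical.

```python
def breusch_pagan(x, y, z):
    """Breusch-Pagan statistic with a single variance driver z (d.f. r = 1)."""
    n = len(x)
    # Step 1: OLS of y on x, keep the squared residuals.
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    a = my - b * mx
    u2 = [(yi - a - b * xi) ** 2 for xi, yi in zip(x, y)]
    sigma2 = sum(u2) / n                 # assumed estimator: RSS / n
    # Step 2: regress squared residuals on z; S0 is the explained sum of squares.
    mz, mu = sum(z) / n, sum(u2) / n
    szz = sum((zi - mz) ** 2 for zi in z)
    g = sum((zi - mz) * (ui - mu) for zi, ui in zip(z, u2)) / szz
    s0 = g * g * szz
    return s0 / (2 * sigma2 ** 2)

# Hypothetical data: the disturbance alternates in sign and grows with x.
x = [float(i) for i in range(1, 13)]
y = [2 + 0.9 * xi + 0.1 * xi * (-1) ** i for i, xi in enumerate(x, start=1)]
bp = breusch_pagan(x, y, z=x)            # here z is taken to be x itself
print(bp)  # compare with the chi-square critical value for 1 d.f. (3.84 at 5%)
```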

5.2 Detection of Heteroskedasticity Illustrative Example Consider the data in Table 5.1. To apply the Goldfeld-Quandt test we consider two groups of 10 observations each, ordered by the values of the variable x. The first group consists of observations 6, 11, 9, 4, 14, 15, 19, 20, 1, and 16. The second group consists of the remaining 10.

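5.2 Detection of Heteroskedasticity Illustrative Example The estimated equations were (standard errors in parentheses) Group 1: $y = 1.0533 + 0.876x$, standard errors (0.616) and (0.038), $R^2 = 0.985$, $\hat{\sigma}^2 = 0.475$. Group 2: $y = 3.279 + 0.835x$, standard errors (3.443) and (0.096), $R^2 = 0.904$, $\hat{\sigma}^2 = 3.154$.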

5.2 Detection of Heteroskedasticity The F-ratio for the test is $F = 3.154 / 0.475 = 6.64$. The 1% point for the F-distribution with d.f. 8 and 8 is 6.03. Thus the F-value is significant at the 1% level and we reject the hypothesis of homoskedasticity.

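5.2 Detection of Heteroskedasticity For the log-linear form the estimated equations were Group 1: $\log y = 0.128 + 0.934 \log x$, standard errors (0.079) and (0.030), $R^2 = 0.992$, $\hat{\sigma}^2 = 0.001596$. Group 2: $\log y = 0.276 + 0.902 \log x$, standard errors (0.352) and (0.099), $R^2 = 0.912$, $\hat{\sigma}^2 = 0.002789$. The F-ratio for the test is $F = 0.002789 / 0.001596 = 1.75$.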

5.2 Detection of Heteroskedasticity For d.f. 8 and 8, the 5% point from the F-tables is 3.44. Thus if we use the 5% significance level, we reject the hypothesis of homoskedasticity if we consider the linear form but do not reject it in the log-linear form. Note that the White test rejected the hypothesis in both forms.
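The arithmetic behind the two Goldfeld-Quandt decisions can be checked directly from the group variance estimates and the tabulated critical values quoted in this section. The variable names are illustrative.

```python
# Error-variance estimates reported for the two groups in each functional form.
linear = {"group1": 0.475, "group2": 3.154}
log_linear = {"group1": 0.001596, "group2": 0.002789}

# Critical values of F with d.f. (8, 8), as quoted from the F-tables.
crit_5pct, crit_1pct = 3.44, 6.03

f_linear = linear["group2"] / linear["group1"]
f_log = log_linear["group2"] / log_linear["group1"]

print(round(f_linear, 2), "reject at 5%:", f_linear > crit_5pct)
print(round(f_log, 2), "reject at 5%:", f_log > crit_5pct)
```

The linear form gives a ratio above even the 1% point, while the log-linear form gives a ratio below the 5% point.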

5.3 Consequences of Heteroskedasticity

5.4 Solutions to the Heteroskedasticity Problem There are two types of solutions that have been suggested in the literature for the problem of heteroskedasticity: (1) solutions dependent on particular assumptions about $\sigma_i$; (2) general solutions. We first discuss category 1: weighted least squares (WLS).

5.4 Solutions to the Heteroskedasticity Problem WLS

5.4 Solutions to the Heteroskedasticity Problem Thus the constant term in this equation is the slope coefficient in the original equation.
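To illustrate why the roles of the constant and the slope swap, here is a minimal sketch under one particular assumption that is not taken from the slides: $\operatorname{var}(u_i) = \sigma^2 x_i^2$. Dividing $y_i = \alpha + \beta x_i + u_i$ by $x_i$ gives $y_i / x_i = \alpha (1 / x_i) + \beta + u_i / x_i$, which has a constant error variance. The function name and data are hypothetical.

```python
def wls_proportional(x, y):
    """WLS assuming var(u_i) = sigma^2 * x_i^2: regress y/x on 1/x with a constant."""
    inv_x = [1 / xi for xi in x]
    ratio = [yi / xi for xi, yi in zip(x, y)]
    n = len(x)
    mp, mr = sum(inv_x) / n, sum(ratio) / n
    spp = sum((p - mp) ** 2 for p in inv_x)
    slope = sum((p - mp) * (r - mr) for p, r in zip(inv_x, ratio)) / spp
    const = mr - slope * mp
    # The slope on 1/x estimates alpha; the constant estimates beta.
    return {"alpha": slope, "beta": const}

# Hypothetical noiseless data generated from y = 2 + 0.9 x.
x = [1.0, 2.0, 4.0, 5.0, 8.0]
y = [2 + 0.9 * xi for xi in x]
est = wls_proportional(x, y)
print(est)
```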

5.4 Solutions to the Heteroskedasticity Problem Prais and Houthakker found in their analysis of family budget data that the errors from the equation had variance increasing with household income. They considered a model $\sigma_i^2 = \sigma^2 [E(y_i)]^2$, that is, $\sigma_i^2 = \sigma^2 (\alpha + \beta x_i)^2$. In this case we cannot divide the whole equation by a known constant as before. For this model we can consider a two-step procedure as follows.

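5.4 Solutions to the Heteroskedasticity Problem First estimate $\alpha$ and $\beta$ by OLS. Let these estimators be $\hat{\alpha}$ and $\hat{\beta}$. Now use the WLS procedure as outlined earlier, that is, regress $y_i / (\hat{\alpha} + \hat{\beta} x_i)$ on $1 / (\hat{\alpha} + \hat{\beta} x_i)$ and $x_i / (\hat{\alpha} + \hat{\beta} x_i)$ with no constant term. The limitation of the two-step procedure is that the error involved in the first step will affect the second step.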

5.4 Solutions to the Heteroskedasticity Problem This procedure is called a two-step weighted least squares procedure. The standard errors we get for the estimates of $\alpha$ and $\beta$ from this procedure are valid only asymptotically. They are asymptotic standard errors because the weights have been estimated.

5.4 Solutions to the Heteroskedasticity Problem One can iterate this WLS procedure further, that is, use the new estimates of $\alpha$ and $\beta$ to construct new weights and then use the WLS procedure, and repeat this procedure until convergence. This procedure is called the iterated weighted least squares procedure. However, there is no gain in (asymptotic) efficiency by iteration.
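The two-step and iterated procedures can be sketched as below, assuming the model $\sigma_i^2 = \sigma^2 (\alpha + \beta x_i)^2$ and positive fitted values. With `steps=1` this is the two-step procedure; larger values iterate a fixed number of times rather than testing for convergence, which is a simplification. Function names and data are hypothetical.

```python
def ols(x, y):
    """Simple OLS of y on x with a constant; returns (alpha, beta)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - b * mx, b

def wls_step(x, y, w):
    """Regress y/w on 1/w and x/w with no constant term; returns (alpha, beta)."""
    p = [1 / wi for wi in w]
    q = [xi / wi for xi, wi in zip(x, w)]
    r = [yi / wi for yi, wi in zip(y, w)]
    spp = sum(pi * pi for pi in p)
    spq = sum(pi * qi for pi, qi in zip(p, q))
    sqq = sum(qi * qi for qi in q)
    spr = sum(pi * ri for pi, ri in zip(p, r))
    sqr = sum(qi * ri for qi, ri in zip(q, r))
    det = spp * sqq - spq ** 2           # determinant of the 2x2 normal equations
    return (spr * sqq - spq * sqr) / det, (spp * sqr - spq * spr) / det

def iterated_wls(x, y, steps=1):
    """steps=1 is two-step WLS; steps>1 rebuilds the weights and repeats."""
    a, b = ols(x, y)
    for _ in range(steps):
        w = [a + b * xi for xi in x]     # weights from the current estimates
        a, b = wls_step(x, y, w)
    return a, b

# Hypothetical data: the disturbance alternates in sign and grows with x.
x = [float(i) for i in range(1, 13)]
y = [2 + 0.9 * xi + 0.1 * xi * (-1) ** i for i, xi in enumerate(x, start=1)]
print(iterated_wls(x, y, steps=1))
print(iterated_wls(x, y, steps=5))
```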

5.4 Solutions to the Heteroskedasticity Problem Illustrative Example As an illustration, again consider the data in Table 5.1. We saw earlier that regressing the absolute values of the residuals on x (in Glejser’s tests) gave the following estimates: [...] Now we regress $y_i / w_i$ on $1 / w_i$ and $x_i / w_i$ (with no constant term), where $w_i$ is the value of $|\hat{u}_i|$ predicted by those estimates.

5.4 Solutions to the Heteroskedasticity Problem The resulting equation is If we assume that , the two-step WLS procedure would be as follows. Section 5.1

5.4 Solutions to the Heteroskedasticity Problem Next we compute $w_i$ and regress $y_i / w_i$ on $1 / w_i$ and $x_i / w_i$. The results were [...] The $R^2$ values in these equations are not comparable. But our interest is in estimates of the parameters in the consumption function.

Assignment Use the data of Table 5.1 to do the WLS. Consider the log-linear form. Run Glejser’s tests to check whether the log-linear regression model still has non-constant variance. Estimate the non-constant variance and run the WLS. Write a one-step program in GAUSS.

5.5 Heteroskedasticity and the Use of Deflators There are two remedies often suggested and used for solving the heteroskedasticity problem: (1) transforming the data to logs; (2) deflating the variables by some measure of "size."

5.5 Heteroskedasticity and the Use of Deflators One important thing to note is that the purpose in all these procedures of deflation is to get more efficient estimates of the parameters. But once those estimates have been obtained, one should make all inferences (calculation of the residuals, prediction of future values, etc.) from the original equation, not the equation in the deflated variables.

5.5 Heteroskedasticity and the Use of Deflators Another point to note is that since the purpose of deflation is to get more efficient estimates, it is tempting to argue about the merits of the different procedures by looking at the standard errors of the coefficients. However, this is not correct, because in the presence of heteroskedasticity the standard errors themselves are biased, as we showed earlier.

5.5 Heteroskedasticity and the Use of Deflators For instance, in the five equations presented above, the second and third are comparable and so are the fourth and fifth. In both cases if we look at the standard errors of the coefficient of X, the coefficient in the undeflated equation has a smaller standard error than the corresponding coefficient in the deflated equation. However, if the standard errors are biased, we should be careful not to make too much of these differences.

5.5 Heteroskedasticity and the Use of Deflators In the preceding example we have considered miles M as a deflator and also as an explanatory variable. In this context we should mention some discussion in the literature on "spurious correlation" between ratios.

5.5 Heteroskedasticity and the Use of Deflators The argument simply is that even if we have two variables X and Y that are uncorrelated, if we deflate both the variables by another variable Z, there could be a strong correlation between X/Z and Y/Z because of the common denominator Z. It is wrong to infer from this correlation that there exists a close relationship between X and Y.
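A small simulation makes the common-denominator effect concrete. It is only a sketch: X, Y, and Z are drawn independently from uniform distributions whose ranges are arbitrary choices, and the seed is fixed so the run is repeatable.

```python
import math
import random

def pearson(a, b):
    """Sample Pearson correlation coefficient between two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    va = sum((ai - ma) ** 2 for ai in a)
    vb = sum((bi - mb) ** 2 for bi in b)
    return cov / math.sqrt(va * vb)

rng = random.Random(0)                   # fixed seed for repeatability
n = 5000
X = [rng.uniform(1, 2) for _ in range(n)]
Y = [rng.uniform(1, 2) for _ in range(n)]
Z = [rng.uniform(0.5, 5) for _ in range(n)]  # deflator varies far more than X or Y

r_xy = pearson(X, Y)
r_ratio = pearson([x / z for x, z in zip(X, Z)], [y / z for y, z in zip(Y, Z)])
print(r_xy, r_ratio)
```

X and Y are independent by construction, so their sample correlation is close to zero, while the two ratios are strongly correlated purely because they share the denominator Z.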

5.5 Heteroskedasticity and the Use of Deflators Of course, if our interest is in fact the relationship between X/Z and Y/Z, there is no reason why this correlation need be called "spurious." As Kuh and Meyer point out, "The question of spurious correlation quite obviously does not arise when the hypothesis to be tested has initially been formulated in terms of ratios, for instance, in problems involving relative prices.

5.5 Heteroskedasticity and the Use of Deflators Similarly, when a series such as money value of output is divided by a price index to obtain a 'constant dollar' estimate of output, no question of spurious correlation need arise. Thus, spurious correlation can only exist when a hypothesis pertains to undeflated variables and the data have been divided through by another series for reasons extraneous to but not in conflict with the hypothesis framed as an exact, i.e., nonstochastic relation."

5.5 Heteroskedasticity and the Use of Deflators In summary, often in econometric work deflated or ratio variables are used to solve the heteroskedasticity problem. Deflation can sometimes be justified on pure economic grounds, as in the case of the use of "real" quantities and relative prices. In this case all the inferences from the estimated equation will be based on the equation in the deflated variables.

5.5 Heteroskedasticity and the Use of Deflators However, if deflation is used to solve the heteroskedasticity problem, any inferences we make have to be based on the original equation, not the equation in the deflated variables. In any case, deflation may increase or decrease the resulting correlations, but this is beside the point. Since the correlations are not comparable anyway, one should not draw any inferences from them.

5.5 Heteroskedasticity and the Use of Deflators Illustrative Example In Table 5.5 we present data on y = population density and x = distance from the central business district for 39 census tracts in the Baltimore area in 1970. It has been suggested (this is called the “density gradient model”) that population density follows the relationship $y = A e^{-\beta x}$, where A is the density of the central business district.

5.5 Heteroskedasticity and the Use of Deflators The basic hypothesis is that as you move away from the central business district population density drops off. For estimation purposes we take logs and write $\log y_i = \alpha - \beta x_i + u_i$

5.5 Heteroskedasticity and the Use of Deflators where $\alpha = \log A$. Estimation of this equation by OLS gave the following results (figures in parentheses are t-values, not standard errors):

5.5 Heteroskedasticity and the Use of Deflators The t-values are very high and the coefficients $\alpha$ and $\beta$ are significantly different from zero (with a significance level of less than 1%). The sign of the estimated slope is negative, as expected. With cross-sectional data like these we expect heteroskedasticity, and this could result in an underestimation of the standard errors (and thus an overestimation of the t-ratios).
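The log transformation used for the density gradient model can be sketched as follows. The data are synthetic and noiseless, generated from $y = A e^{-\beta x}$ with hypothetical values of A and $\beta$ (they are not the Table 5.5 data), so that regressing $\log y$ on x recovers $\log A$ as the intercept and $-\beta$ as the slope.

```python
import math

# Hypothetical parameter values, chosen only for illustration.
A_true, beta_true = 20000.0, 0.25

x = [float(d) for d in range(1, 11)]                  # distance from the centre
y = [A_true * math.exp(-beta_true * d) for d in x]    # density, no error term
log_y = [math.log(v) for v in y]

# Simple OLS of log y on x.
n = len(x)
mx, ml = sum(x) / n, sum(log_y) / n
sxx = sum((xi - mx) ** 2 for xi in x)
slope = sum((xi - mx) * (li - ml) for xi, li in zip(x, log_y)) / sxx
intercept = ml - slope * mx

A_hat = math.exp(intercept)   # the intercept estimates log A
beta_hat = -slope             # the slope estimates -beta, hence the negative sign
print(A_hat, beta_hat)
```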

5.5 Heteroskedasticity and the Use of Deflators To check whether there is heteroskedasticity, we have to analyze the estimated residuals $\hat{u}_i$. A plot of $|\hat{u}_i|$ against $x_i$ showed a positive relationship and hence Glejser’s tests were applied.

5.5 Heteroskedasticity and the Use of Deflators Defining by , the following equations were estimated:

5.5 Heteroskedasticity and the Use of Deflators We choose the specification that gives the highest $R^2$ [or equivalently the highest t-value, since $R^2 = t^2 / (t^2 + \text{d.f.})$ in the case of only one regressor].
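For a regression with a single regressor, $R^2$ and the slope's t-ratio are linked by $R^2 = t^2 / (t^2 + \text{d.f.})$ with d.f. $= n - 2$, so ranking specifications by one is the same as ranking by the other. As a quick check, the figures reported for the consumption function in Section 5.1 (slope 0.899, standard error 0.0253, 20 families) can be plugged in; the variable names are illustrative.

```python
# Figures reported for the consumption function in Section 5.1.
slope, se_slope, n_obs = 0.899, 0.0253, 20

t_ratio = slope / se_slope
df = n_obs - 2                      # one regressor plus a constant
r2_from_t = t_ratio ** 2 / (t_ratio ** 2 + df)

print(round(t_ratio, 2), round(r2_from_t, 3))
```

The value implied by the t-ratio agrees, up to rounding, with the $R^2$ of 0.986 reported there.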

5.5 Heteroskedasticity and the Use of Deflators The estimated regressions with t-values in parentheses were

5.5 Heteroskedasticity and the Use of Deflators All the t-statistics are significant, indicating the presence of heteroskedasticity. Based on the highest t-ratio, we chose the second specification (although the fourth specification is equally valid).

5.5 Heteroskedasticity and the Use of Deflators Deflating throughout by gives the regression equations to be estimated as The estimates were (figures in parentheses are t-ratios)