Lecture 16 Preview: Heteroskedasticity

Lecture 16 Preview: Heteroskedasticity
- Regression Model
- Standard Ordinary Least Squares (OLS) Premises
- Estimation Procedures Embedded within the Ordinary Least Squares (OLS) Estimation Procedure
- What Is Heteroskedasticity?
- Heteroskedasticity and the Ordinary Least Squares (OLS) Estimation Procedure: The Consequences
- The Mathematics: Our Suspicions
- Confirming Our Suspicions: A Simulation
- Accounting for Heteroskedasticity: An Example
- Justifying the Generalized Least Squares (GLS) Estimation Procedure
- Robust Standard Errors: An Alternative Approach

Regression Model
yt = βConst + βxxt + et    t = 1, 2, …, T
yt = dependent variable; xt = explanatory variable; et = error term; βConst and βx are the parameters.
The error term is a random variable representing random influences: Mean[et] = 0.

Standard Ordinary Least Squares (OLS) Premises
- Error Term Equal Variance Premise: The variance of the error term's probability distribution for each observation is the same.
- Error Term/Error Term Independence Premise: The error terms are independent.
- Explanatory Variable/Error Term Independence Premise: The explanatory variables, the xt's, and the error terms, the et's, are not correlated.

OLS Estimation Procedure Includes Three Estimation Procedures
- Values of the parameters, βConst and βx:
  bx = Σ(xt − x̄)(yt − ȳ) / Σ(xt − x̄)²    bConst = ȳ − bx·x̄
- Variance of the error term's probability distribution, Var[e]:
  EstVar[e] = SSR / Degrees of Freedom
- Variance of the coefficient estimate's probability distribution, Var[bx]:
  EstVar[bx] = EstVar[e] / Σ(xt − x̄)²

Question: What happens when the error term equal variance premise is violated?
Good News: When the standard premises are satisfied, each of these procedures is unbiased.
Good News: When the standard premises are satisfied, the OLS estimation procedure for the coefficient value is the best linear unbiased estimation procedure (BLUE).
Crucial Point: When the ordinary least squares (OLS) estimation procedure performs its calculations, it implicitly assumes that the three standard (OLS) premises are satisfied.
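The three embedded estimation procedures can be checked by hand. Below is a minimal numerical sketch with made-up data (the five x and y values are invented purely for illustration), using NumPy:

```python
import numpy as np

# Illustrative data: five observations, values invented for this sketch
x = np.array([5.0, 10.0, 15.0, 20.0, 25.0])
y = np.array([66.0, 73.0, 84.0, 97.0, 100.0])
T = len(x)

# 1) Values of the parameters
b_x = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b_const = y.mean() - b_x * x.mean()

# 2) Variance of the error term's probability distribution
res = y - (b_const + b_x * x)          # residuals
ssr = np.sum(res ** 2)                  # sum of squared residuals
est_var_e = ssr / (T - 2)               # SSR / Degrees of Freedom

# 3) Variance of the coefficient estimate's probability distribution
est_var_bx = est_var_e / np.sum((x - x.mean()) ** 2)
```

Here bx and bConst come out to 1.84 and 56.4, and both variance estimates follow mechanically from SSR; note that EstVar[bx] is built entirely on the single EstVar[e].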

What Is Heteroskedasticity?
Error Term Equal Variance Premise: The variance of the error term's probability distribution for each observation is the same; all the variances equal Var[e]:
Var[e1] = Var[e2] = … = Var[eT] = Var[e]
Let us review precisely what this means. Consider the error terms of Professor Lord's three students (Lab 16.1, Heter = 0: no heteroskedasticity). For each student, the mean equals 0. This indicates that each student's error term indeed reflects random influences. The variances are equal: no heteroskedasticity is present, and the error term equal variance premise is satisfied.

Error Term Equal Variance Premise: The variance of the error term's probability distribution for each observation is the same; all the variances equal Var[e]:
Var[e1] = Var[e2] = … = Var[eT] = Var[e]
Now set Heter = 1 in Lab 16.1. For each student, the mean still equals 0, indicating that each student's error term indeed reflects random influences. The variances, however, are not equal: heteroskedasticity is present, and the error term equal variance premise is violated. Question: Why might heteroskedasticity exist in this case?
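What the Heter factor does can be mimicked in a few lines. This sketch is not Lab 16.1 itself; the student x values and the variance formula are made up, but the pattern is the one described above: means near zero in both cases, variances equal only when the factor is zero.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 100_000                               # draws per student
x = np.array([5.0, 15.0, 25.0])           # three students' x values (illustrative)

results = {}
for heter in (0.0, 1.0):
    var_e = 500.0 + heter * 20.0 * x      # Var[e_t]: constant only when heter = 0
    e = rng.normal(0.0, np.sqrt(var_e), size=(T, 3))
    # Means are near 0 either way (random influences); variances split apart
    # only when heter > 0, violating the equal variance premise.
    results[heter] = (e.mean(axis=0), e.var(axis=0))
```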

Consequences of Heteroskedasticity
How does the presence of heteroskedasticity affect the estimation procedure for the:
- value of the coefficient?  bx = Σ(xt − x̄)(yt − ȳ) / Σ(xt − x̄)²
- variance of the error term's probability distribution?  EstVar[e] = SSR / Degrees of Freedom
- variance of the coefficient estimate's probability distribution?  EstVar[bx] = EstVar[e] / Σ(xt − x̄)²
More specifically, are the three estimation procedures embedded in the ordinary least squares (OLS) estimation procedure still unbiased in the presence of heteroskedasticity?
Estimation Procedure for the Value of the Coefficient
Question: In the presence of heteroskedasticity, is the OLS estimation procedure for the value of the coefficient unbiased? That is, does Mean[bx] still equal βx?
Review: Arithmetic of Means
- Mean of a constant plus a variable: Mean[c + x] = c + Mean[x]
- Mean of a constant times a variable: Mean[cx] = c·Mean[x]
- Mean of the sum of two variables: Mean[x + y] = Mean[x] + Mean[y]

Mean of the Coefficient Estimate's Probability Distribution
Begin with the decomposition of the coefficient estimate:
bx = βx + Σ(xt − x̄)et / Σ(xt − x̄)²
Mean[bx] = Mean[βx + Σ(xt − x̄)et / Σ(xt − x̄)²]
= βx + Mean[Σ(xt − x̄)et / Σ(xt − x̄)²]          Mean[c + x] = c + Mean[x]
= βx + (1/Σ(xt − x̄)²)·Mean[Σ(xt − x̄)et]         Rewrite fraction as a product; Mean[cx] = c·Mean[x]
= βx + (1/Σ(xt − x̄)²)·Σ(xt − x̄)Mean[et]         Mean[x + y] = Mean[x] + Mean[y]; Mean[cx] = c·Mean[x]
= βx                                              since Mean[e1] = Mean[e2] = Mean[e3] = 0
Question: Have we relied on the error term equal variance premise to show that the OLS estimation procedure for the coefficient value is unbiased? No.
Question: In the presence of heteroskedasticity, should we expect the OLS estimation procedure for the coefficient value still to be unbiased? Yes.
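A quick Monte Carlo confirms the algebra. The parameter values, the x values, and the way the variance grows with x are all illustrative (this is not Lab 16.2):

```python
import numpy as np

rng = np.random.default_rng(1)
beta_const, beta_x = 50.0, 2.0              # actual parameter values (made up)
x = np.array([5.0, 10.0, 25.0])             # fixed explanatory variable values
ssxd = np.sum((x - x.mean()) ** 2)          # sum of squared x deviations
reps = 20_000

b_x = np.empty(reps)
for r in range(reps):
    e = rng.normal(0.0, np.sqrt(20.0 * x))  # Var[e_t] grows with x_t: heteroskedasticity
    y = beta_const + beta_x * x + e
    b_x[r] = np.sum((x - x.mean()) * (y - y.mean())) / ssxd

mean_bx = b_x.mean()                         # close to 2.0: still unbiased
```

Despite the unequal error variances, the average of the coefficient estimates across repetitions settles at the actual value of 2.0.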

OLS Estimation Procedure: Variance of the Coefficient Estimate's Probability Distribution
Question: In the presence of heteroskedasticity, is the OLS estimation procedure for the variance of the coefficient estimate's probability distribution unbiased?
Recall the two-step strategy we used to estimate the variance of the coefficient estimate's probability distribution:
Step 1: Estimate the variance of the error term's probability distribution from the available information:
EstVar[e] = SSR / Degrees of Freedom
Step 2: Apply the relationship between the variances of the coefficient estimate's and error term's probability distributions:
Var[bx] = Var[e] / Σ(xt − x̄)², so EstVar[bx] = EstVar[e] / Σ(xt − x̄)²
Each equation is estimating a "single" Var[e]. What does Var[e] equal? Var[e1] = … = Var[eT] = Var[e].
Strategy: The strategy the ordinary least squares (OLS) estimation procedure uses is based on the premise that there is a "single" Var[e].
Question: Has the OLS estimation procedure relied on the error term equal variance premise to estimate the variance of the coefficient estimate's probability distribution? Yes.
Question: In the presence of heteroskedasticity, might the OLS estimation procedure for the coefficient estimate's variance be flawed? Yes.

Our Suspicions:
- The OLS estimation procedure for estimating the coefficient value should be unbiased.
- The variance calculation is based on a false premise, so the OLS procedure for estimating the variance of the coefficient estimate's probability distribution may be flawed.
We check both suspicions with a simulation in which the actual coefficient value equals 2.
Is the estimation procedure for the coefficient value unbiased? Unbiased estimation procedure: after many, many repetitions of the experiment, the average (mean) of the estimates equals the actual value. So we compare the mean (average) of the coefficient estimates from all repetitions with the actual value.
Is the estimation procedure for the variance of the coefficient estimate's probability distribution unbiased? In each repetition, the simulation computes the OLS variance estimate, EstVar[bx] = EstVar[e] / Σ(xt − x̄)² with EstVar[e] = SSR / Degrees of Freedom (the "single" Var[e] premise). We compare the average of these variance estimates from all repetitions with the actual variance of the estimated coefficient values from all repetitions.

Simulation Results (Lab 16.2)
Is the OLS estimation procedure for the coefficient's value unbiased? Is the OLS estimation procedure for the variance of the coefficient estimate's probability distribution unbiased?

Heter     Actual Value    Mean (Average) of the     Variance of the Estimated    Average of Estimated
Factor    of βx           Estimated Values, bx,     Coef Values, bx,             Variances, EstVar[bx],
                          from All Repetitions      from All Repetitions         from All Repetitions
0         2.0             2.0                       2.5                          2.5
1         2.0             2.0                       3.6                          2.9

When heteroskedasticity is absent: nothing but good news.
When heteroskedasticity is present:
- Good news: the OLS estimation procedure for the coefficient value is unbiased.
- Bad news: the OLS procedure for estimating the variance of the coefficient estimate's probability distribution is flawed because it is based on a false premise. Consequently, all calculations based on the variance of the coefficient estimate's probability distribution will be flawed: standard errors, t-statistics, and tail probabilities.
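The simulation logic can be reproduced in miniature. The numbers below are made up (three observations, Var[et] proportional to xt), so the magnitudes differ from the lab's 3.6 versus 2.9, but the qualitative result is the same: the mean of the estimates matches the actual value while the average reported variance falls short of the actual variance.

```python
import numpy as np

rng = np.random.default_rng(2)
beta_const, beta_x = 50.0, 2.0              # actual parameter values (made up)
x = np.array([5.0, 10.0, 25.0])
T, dof = len(x), len(x) - 2
ssxd = np.sum((x - x.mean()) ** 2)
reps = 20_000

b_x = np.empty(reps)
est_var_bx = np.empty(reps)
for r in range(reps):
    e = rng.normal(0.0, np.sqrt(20.0 * x))  # heteroskedastic errors
    y = beta_const + beta_x * x + e
    bx = np.sum((x - x.mean()) * (y - y.mean())) / ssxd
    bc = y.mean() - bx * x.mean()
    ssr = np.sum((y - (bc + bx * x)) ** 2)
    b_x[r] = bx
    est_var_bx[r] = (ssr / dof) / ssxd       # the OLS "single Var[e]" formula

actual_var = b_x.var()                        # actual variance of the estimates
avg_est_var = est_var_bx.mean()               # what the OLS formula reports on average
```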

Accounting for Heteroskedasticity
Step 1: Apply the Ordinary Least Squares (OLS) Estimation Procedure. Estimate the model's parameters with the ordinary least squares (OLS) estimation procedure.
Step 2: Consider the Possibility of Heteroskedasticity.
- Ask whether there is reason to suspect that heteroskedasticity may be present.
- Use the ordinary least squares (OLS) regression results to "get a sense" of whether heteroskedasticity is a problem by examining the residuals.
- If the presence of heteroskedasticity is suspected, formulate a model to explain it.
- Use the Breusch-Pagan-Godfrey approach by estimating an artificial regression to test for the presence of heteroskedasticity.
Step 3: Apply the Generalized Least Squares (GLS) Estimation Procedure.
- Apply the model of heteroskedasticity and algebraically manipulate the original model to derive a new, tweaked model in which the error terms do not suffer from heteroskedasticity.
- Use the ordinary least squares (OLS) estimation procedure to estimate the parameters of the tweaked model.
An Example: GDP and Internet Use
Theory: Higher per capita GDP increases Internet use.
Model: LogUsersInternett = βConst + βGDP·GdpPCt + et    Theory: βGDP > 0.
1992 Internet Data: Cross-section data of Internet use and gross domestic product for 29 countries in 1992.
LogUsersInternett = log of Internet users per 1,000 people in nation t
GdpPCt = per capita GDP (1,000s of real "international" dollars) in nation t

Step 1: Apply the Ordinary Least Squares (OLS) Estimation Procedure. (EViews)
Theory: Higher per capita GDP increases Internet use.
Model: LogUsersInternett = βConst + βGDP·GdpPCt + et

Ordinary Least Squares (OLS)
Dependent Variable: LogUsersInternet
Explanatory Variable(s)    Estimate      SE          t-Statistic    Prob
GdpPC                      0.100772      0.032612    3.090019       0.0046
Const                      -0.486907     0.631615    -0.770891      0.4475
Number of Observations: 29

Estimated Equation: EstLogUsersInternet = -.487 + .101GdpPC
Interpretation: We estimate that a $1,000 increase in real per capita GDP results in a 10.1 percent increase in Internet users.
Critical Result: The GdpPC coefficient estimate equals .101. The positive sign of the coefficient estimate suggests that higher per capita GDP increases Internet use. This evidence supports the theory.
H0: βGDP = 0   Per capita GDP does not affect Internet use
H1: βGDP > 0   Higher per capita GDP increases Internet use
Prob[Results IF H0 True]: What is the probability that the GdpPC estimate from one repetition of the experiment would be .101 or more, if H0 were true (that is, if per capita GDP has no effect on Internet use, if βGDP actually equals 0)?

Prob[Results IF H0 True]: What is the probability that the GdpPC estimate from one repetition of the experiment would be .101 or more, if H0 were true?
H0: βGDP = 0   H1: βGDP > 0
Under H0, bGDP follows a t-distribution with Mean = 0, SE = .0326, and DF = 27.
Using the tails probability reported for GdpPC:
Prob[Results IF H0 True] = .0046 / 2 = .0023
Would we reject H0 at the traditional significance levels? Yes.
Question: Could there be a potential problem?
Question: What do we know about EstVar[bGDP] when heteroskedasticity is present? Answer: It is based on a false premise; EstVar[bGDP] could be inaccurate.
Question: What is SE[bGDP]? Answer: The square root of EstVar[bGDP].
The tails probability is based on SE[bGDP]. Consequently, the tails probability could be inaccurate also; our calculation of Prob[Results IF H0 True] could be misleading us.

Ordinary Least Squares (OLS)
Dependent Variable: LogUsersInternet
Explanatory Variable(s)    Estimate      SE          t-Statistic    Prob
GdpPC                      0.100772      0.032612    3.090019       0.0046
Const                      -0.486907     0.631615    -0.770891      0.4475
Number of Observations: 29    Degrees of Freedom: 27

Step 2: Consider the Possibility of Heteroskedasticity. (EViews)
Is there reason to suspect that heteroskedasticity may be present? Yes. When per capita GDP is low, individuals have little to spend on any goods other than the basic necessities; they have little to spend on Internet use, and consequently Internet use will be low. When per capita GDP is high, individuals can afford to purchase more goods. Naturally, consumer tastes vary from nation to nation: in some high per capita GDP nations individuals will opt to spend much on Internet use, while in other high per capita GDP nations individuals will opt to spend little.
Model: LogUsersInternett = βConst + βGDP·GdpPCt + et
Two nations with virtually the same level of per capita GDP can therefore have quite different rates of Internet use. The error term in the model captures these differences. As per capita GDP increases, we would expect the variance of the error term's probability distribution to increase.

Use the ordinary least squares (OLS) regression results to "get a sense" of whether heteroskedasticity is a problem by examining the residuals. We can think of the residuals as the estimated errors. The error terms, the et's, are unobservable; the residuals, the Rest's, are observable:
yt = βConst + βxxt + et   so   et = yt − (βConst + βxxt)
Estyt = bConst + bxxt     so   Rest = yt − Estyt = yt − (bConst + bxxt)
Plotting the residuals, our suspicions appear to be borne out. (EViews)

If the presence of heteroskedasticity is suspected, formulate a model to explain it.
Heteroskedasticity Model: (et − Mean[et])² = αConst + αGDP·GdpPCt + vt    Theory: αGDP > 0
Since Mean[et] = 0, the left-hand side is simply et².
Use the Breusch-Pagan-Godfrey approach by estimating an artificial regression to test for the presence of heteroskedasticity. We can think of the residuals as the estimated errors, so we regress the squared residuals on GdpPC:
ResSqrt = αConst + αGDP·GdpPCt + vt
Aside: Statistical software makes it easy to do this. (EViews)

Ordinary Least Squares (OLS)
Dependent Variable: ResSqr
Explanatory Variable(s)    Estimate      SE          t-Statistic    Prob
GdpPC                      0.086100      0.031863    2.702189       0.0118
Const                      0.702317      0.617108    1.138078       0.2651
Number of Observations: 29

Critical Result: The GdpPC coefficient estimate equals .086. The positive sign of the coefficient estimate suggests that higher per capita GDP increases the squared deviation of the error term from its mean. This evidence supports the view that heteroskedasticity is present.
H0: αGDP = 0   Per capita GDP does not affect the squared deviation of the residual
H1: αGDP > 0   Higher per capita GDP increases the squared deviation of the residual
Prob[Results IF H0 True] = .0118 / 2 = .0059
Based on these results, we assume that the variance of the error term's probability distribution is proportional to GdpPC:
Heteroskedasticity Model: Var[et] = V·GdpPCt, where V equals a constant
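The two regressions of the Breusch-Pagan-Godfrey approach are easy to script. The sketch below uses synthetic data with made-up parameter values and a much larger sample than the 29-nation data set, so the pattern is unmistakable:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1000
gdp_pc = rng.uniform(1.0, 25.0, n)                 # hypothetical per capita GDP values
e = rng.normal(0.0, np.sqrt(0.1 * gdp_pc))         # Var[e_t] proportional to GdpPC_t
log_users = 0.5 + 0.1 * gdp_pc + e                 # made-up parameter values

# Step 1: OLS on the original model; keep the squared residuals as estimated errors
X = np.column_stack([np.ones(n), gdp_pc])
b, *_ = np.linalg.lstsq(X, log_users, rcond=None)
res_sqr = (log_users - X @ b) ** 2

# Step 2: artificial regression of ResSqr_t on GdpPC_t
g, *_ = np.linalg.lstsq(X, res_sqr, rcond=None)
slope = g[1]                                        # positive: evidence of heteroskedasticity
```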

Step 3: Apply the Generalized Least Squares (GLS) Estimation Procedure. Apply the model of heteroskedasticity and algebraically manipulate the original model to derive a new, tweaked model in which the error terms do not suffer from heteroskedasticity.
Original Model: LogUsersInternett = βConst + βGDP·GdpPCt + et
Heteroskedasticity Model: Var[et] = V·GdpPCt, where V equals a constant
Divide both sides of the original model by √GdpPCt (for now, do not worry about why; it will become clear shortly):
LogUsersInternett/√GdpPCt = βConst·(1/√GdpPCt) + βGDP·(GdpPCt/√GdpPCt) + et/√GdpPCt
Arithmetic of variances: Var[cx] = c²Var[x]. Hence
Var[et/√GdpPCt] = (1/GdpPCt)·Var[et] = (1/GdpPCt)·V·GdpPCt = V
Crucial Point: The tweaked model does not suffer from heteroskedasticity; the variance of its error term is the constant V. That is why we divided by √GdpPCt.
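The tweak is equally mechanical in code. Again the data are synthetic (V and the parameter values are invented); the point is that after dividing through by √GdpPCt the error variance no longer depends on GdpPCt, and OLS on the tweaked variables, with no separate intercept, recovers the parameters:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 1000
gdp_pc = rng.uniform(1.0, 25.0, n)                  # hypothetical regressor values
e = rng.normal(0.0, np.sqrt(0.1 * gdp_pc))          # Var[e_t] = V * GdpPC_t with V = 0.1
y = 0.5 + 0.1 * gdp_pc + e                          # made-up parameter values

w = np.sqrt(gdp_pc)
adj_y = y / w                                       # AdjLogUsersInternet
adj_const = 1.0 / w                                 # AdjConst: the constant becomes a variable
adj_gdp = gdp_pc / w                                # AdjGdpPC
tweaked_e = e / w                                   # variance = (1/GdpPC) * V * GdpPC = V

# The tweaked error variance no longer depends on GdpPC: compare low- and high-GDP halves
lo = tweaked_e[gdp_pc < gdp_pc.mean()].var()
hi = tweaked_e[gdp_pc >= gdp_pc.mean()].var()

# OLS on the tweaked model; note it has no intercept of its own
X = np.column_stack([adj_const, adj_gdp])
b_gls, *_ = np.linalg.lstsq(X, adj_y, rcond=None)
```

Statistical packages typically offer a weighted least squares option that performs this division automatically.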

Use the ordinary least squares (OLS) estimation procedure to estimate the parameters of the tweaked model. NB: The tweaked model does not include a constant term. (EViews)

Ordinary Least Squares (OLS)
Dependent Variable: AdjLogUsersInternet
Explanatory Variable(s)    Estimate      SE          t-Statistic    Prob
AdjGdpPC                   0.113716      0.026012    4.371628       0.0002
AdjConst                   0.726980      0.450615    1.613306       0.1183
Number of Observations: 29

H0: βGDP = 0   H1: βGDP > 0
Prob[Results IF H0 True] = .0002 / 2 = .0001

The Ordinary Least Squares (OLS) and Generalized Least Squares (GLS) Estimates:
                                    βGDP Estimate    SE      t-Statistic    Tails Prob
Ordinary Least Squares (OLS)        .101             .033    3.09           .0046
Generalized Least Squares (GLS)     .114             .026    4.37           .0002

Justifying the Generalized Least Squares (GLS) Estimation Procedure
Is the estimation procedure for the coefficient's value unbiased? Is the estimation procedure for the variance of the coefficient estimate's probability distribution unbiased? Recall our simulation (Lab 16.4):

Heter     Estim    Actual Value    Mean (Average) of the     Variance of the Estimated    Average of Estimated
Factor    Proc     of βx           Estimated Values, bx,     Coef Values, bx,             Variances, EstVar[bx],
                                   from All Repetitions      from All Repetitions         from All Repetitions
0         OLS      2.0             2.0                       2.5                          2.5
1         OLS      2.0             2.0                       3.6                          2.9
1         GLS      2.0             2.0                       2.3                          2.3

Questions: Is the estimation procedure:                        Std Premises    Hetero    Hetero
                                                               OLS             OLS       GLS
an unbiased estimation procedure for the coefficient value?    Yes             Yes       Yes
an unbiased estimation procedure for the variance of the
coefficient estimate's probability distribution?               Yes             No        Yes
for the coefficient value the best linear unbiased
estimation procedure (BLUE)?                                   Yes             No        Yes

Justifying the Generalized Least Squares (GLS) Estimation Procedure
Two issues emerge with the ordinary least squares (OLS) estimation procedure when heteroskedasticity is present:
- The standard error calculations made by the ordinary least squares (OLS) estimation procedure are flawed.
- While the ordinary least squares (OLS) estimation procedure for the coefficient value is unbiased, it is not the best linear unbiased estimation procedure (BLUE).
Recall the simulation results and summary from the previous slide (Lab 16.4): under heteroskedasticity, the generalized least squares (GLS) estimation procedure is unbiased both for the coefficient value and for the variance of the coefficient estimate's probability distribution, and it is BLUE, whereas OLS, though unbiased for the coefficient value, is neither unbiased for the variance nor BLUE.

Robust Standard Errors: An Alternative Approach
Two issues emerge with the ordinary least squares (OLS) estimation procedure when heteroskedasticity is present:
- The standard error calculations made by the ordinary least squares (OLS) estimation procedure are flawed.
- While the ordinary least squares (OLS) estimation procedure for the coefficient value is unbiased, it is not the best linear unbiased estimation procedure (BLUE).
Robust standard errors address the first issue. (EViews)

Huber-White robust standard errors (White heteroskedasticity-consistent SEs):
Dependent Variable: LogUsersInternet
Explanatory Variable(s)    Estimate      SE          t-Statistic    Prob
GdpPC                      0.100772      0.032428    3.107552       0.0044
Const                      -0.486907     0.525871    -0.925906      0.3627
Number of Observations: 29

Standard errors based on the equal error term variance premise:
Ordinary Least Squares (OLS)
Dependent Variable: LogUsersInternet
Explanatory Variable(s)    Estimate      SE          t-Statistic    Prob
GdpPC                      0.100772      0.032612    3.090019       0.0046
Const                      -0.486907     0.631615    -0.770891      0.4475
Number of Observations: 29
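The White calculation itself is a short sandwich formula. This hand-rolled sketch on synthetic data (all numbers invented) compares the conventional SEs, built on the single-Var[e] premise, with the robust ones; with the error variance rising steeply in x, the robust slope SE comes out larger:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 5000
x = rng.uniform(1.0, 25.0, n)
e = rng.normal(0.0, 0.02 * x**2)                    # standard deviation grows with x
y = 0.5 + 0.1 * x + e                               # made-up parameter values

X = np.column_stack([np.ones(n), x])
xtx_inv = np.linalg.inv(X.T @ X)
b = xtx_inv @ X.T @ y
res = y - X @ b

# Conventional OLS SEs: built on the equal error term variance premise
sigma2 = res @ res / (n - 2)
se_ols = np.sqrt(np.diag(sigma2 * xtx_inv))

# Huber-White robust SEs: each observation keeps its own squared residual
# in the "meat" of the sandwich (X'X)^-1 X' diag(res^2) X (X'X)^-1
meat = X.T @ (X * (res ** 2)[:, None])
se_robust = np.sqrt(np.diag(xtx_inv @ meat @ xtx_inv))
```

Note that the coefficient estimates themselves are unchanged; only the standard errors, and hence the t-statistics and tail probabilities, are recomputed.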