Multicollinearity
Susanta Nag, Assistant Professor
Department of Economics, Central University of Jammu

The Concept of Multicollinearity
The term multicollinearity was first introduced by Ragnar Frisch in 1934. It is essentially a problem of the explanatory variables: multicollinearity means the existence of a linear relationship among some or all explanatory variables of a regression model. The classical linear regression model assumes that there are no perfect linear relationships among the explanatory variables.

Causes or Sources of Multicollinearity
1. The data collection method employed.
2. The use of lagged variables.
3. Most economic time-series variables tend to move together.
4. Constraints on the model or in the population being sampled.
5. Wrong specification of the model.

Examples of Multicollinearity
Example 1. Regression model: Y = β0 + β1X1 + β2X2 + u ...(i)
where Y = consumption expenditure, X1 = income, X2 = wealth, and u = the error term.
Example 2. Y = β0 + β1X1 + β2X2 + β3X3 + u ...(ii)
where Y = consumption expenditure, X1 = income, X2 = wealth, and X3 = liquid assets.

Nature of Multicollinearity
Case 1: Perfect or exact multicollinearity.
Case 2: Less than perfect or near-exact multicollinearity.
Case 3: Orthogonality, i.e., no multicollinearity.

Nature of Multicollinearity
Case 1. The X variables are perfectly linearly related. For the k-variable regression model:
λ1X1 + λ2X2 + ... + λkXk = 0 ...(1)
where λ1, λ2, ..., λk are constants that are not all zero simultaneously.
Case 2. The X variables are inter-correlated, but not perfectly:
λ1X1 + λ2X2 + ... + λkXk + vi = 0 ...(2)
where vi is a stochastic error term.

Difference between Perfect and Less-than-Perfect Multicollinearity
Assume, for example, that λ2 ≠ 0. Then Eq. (1) can be written as:
X2i = −(λ1/λ2)X1i − (λ3/λ2)X3i − ... − (λk/λ2)Xki ...(i)
which shows that X2 is an exact linear combination of the other X variables.
Similarly, if λ2 ≠ 0, Eq. (2) can be written as:
X2i = −(λ1/λ2)X1i − (λ3/λ2)X3i − ... − (λk/λ2)Xki − (1/λ2)vi ...(ii)
which shows that X2 is not an exact linear combination of the other X's, because it is also determined by the stochastic error term vi.

Numerical Example

X2    X3    X3*
10    50    52
15    75    75
18    90    97
24   120   129
30   150   152

The relation between X3 and X2 is X3i = 5X2i, so there is perfect collinearity between X2 and X3: the coefficient of correlation r23 is unity. Between X2 and X3* there is no longer perfect collinearity (X3*i = 5X2i + vi, with vi = 2, 0, 7, 9, 2). However, the two variables remain highly correlated: their coefficient of correlation is 0.9959.
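These correlations are easy to verify directly. A minimal sketch in Python (numpy assumed available; the code is illustrative and not part of the original slides):

```python
import numpy as np

X2 = np.array([10., 15., 18., 24., 30.])
X3 = 5 * X2                                      # exact relation: X3i = 5*X2i
X3_star = np.array([52., 75., 97., 129., 152.])  # X3* = 5*X2 + v, with v = (2, 0, 7, 9, 2)

print(np.corrcoef(X2, X3)[0, 1])       # 1.0    -> perfect collinearity
print(np.corrcoef(X2, X3_star)[0, 1])  # 0.9959 -> near-perfect collinearity
```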

Continued: Case 3 (No Multicollinearity)
Two vectors X2 and X3 are said to be orthogonal if their dot product is zero. Graphically, the two vectors are perpendicular.
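A quick numerical check of orthogonality, using two hypothetical vectors chosen purely for illustration:

```python
import numpy as np

x2 = np.array([1., 1., -1., -1.])
x3 = np.array([1., -1., 1., -1.])
print(x2 @ x3)  # 0.0 -> orthogonal regressors carry no overlapping information
```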

Consequences of Multicollinearity
Case 1 (Perfect Multicollinearity)
1. The OLS estimators are indeterminate, so the question of the BLUE property does not even arise.
2. The variances and covariances (and hence the standard errors) of the OLS estimators are infinite.
3. Individual OLS estimators are statistically insignificant.
4. The confidence intervals of the parameters are infinite.

Continued: Case 2 (Near-Perfect Multicollinearity)
1. The OLS estimators are determinate, but:
2. The variances and covariances (and hence the standard errors) of the OLS estimators are very large.
3. Individual OLS estimators tend to be statistically insignificant.
4. The confidence intervals of the parameters are wide.

Continued: Case 3 (No Multicollinearity)
1. The OLS estimators are determinate and satisfy the BLUE property.
2. The variances and covariances (and hence the standard errors) of the OLS estimators are finite and minimal.
3. Individual OLS estimators are estimated precisely, so their t-tests are reliable.
4. The confidence intervals of the parameters are narrow.
(A simulation sketch contrasting these cases follows below.)
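The contrast between Cases 2 and 3 can be made concrete with a small simulation. The sketch below uses simulated data (all variable names are illustrative, not from the slides): y is regressed on two regressors whose correlation is pushed toward one, and the OLS standard errors grow accordingly.

```python
import numpy as np

def ols_se(X, y):
    """OLS coefficient standard errors for a design matrix X that includes a constant."""
    n, k = X.shape
    beta = np.linalg.solve(X.T @ X, X.T @ y)
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - k)          # unbiased error-variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)     # variance-covariance matrix of the estimators
    return np.sqrt(np.diag(cov))

rng = np.random.default_rng(0)
n = 100
x2 = rng.normal(size=n)

for noise_sd in (1.0, 0.1, 0.01):             # smaller noise -> stronger collinearity
    x3 = x2 + noise_sd * rng.normal(size=n)   # x3 tracks x2 ever more closely
    X = np.column_stack([np.ones(n), x2, x3])
    y = 1 + 2 * x2 + 3 * x3 + rng.normal(size=n)
    print(f"corr = {np.corrcoef(x2, x3)[0, 1]:.4f}, se = {ols_se(X, y).round(2)}")
```

As the correlation approaches one, the standard errors of the x2 and x3 coefficients explode even though the overall fit of the regression remains good — exactly the Case 2 symptoms listed above.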

Detection of Multicollinearity
1. Kmenta's warning: Multicollinearity is a question of degree, not of kind. The meaningful distinction is not between the presence and the absence of multicollinearity, but between its various degrees. Moreover, multicollinearity is a feature of the sample, not of the population. Therefore we do not "test for multicollinearity"; we measure its degree in any particular sample.
2. There are several rules of thumb for detecting or measuring its strength (next slide).

Detection of Multicollinearity: Rules of Thumb
1. High R² but few significant t-ratios.
2. High pair-wise correlations among the regressors.
3. Examination of partial correlations.
4. Auxiliary regressions.
5. Condition number and condition index.
6. Tolerance and the variance inflation factor: VIF = 1/(1 − r²23) in the two-regressor case; more generally, VIFj = 1/(1 − R²j), where R²j is the R² from the auxiliary regression of Xj on the other regressors.
A sketch of the auxiliary-regression/VIF computation follows below.
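A hedged sketch of items 4 and 6, using numpy only (the function name and the cutoff illustration are my own, not from the slides): each regressor is regressed on the others, and the auxiliary R² is converted to a VIF.

```python
import numpy as np

def vif(X):
    """Variance inflation factors for the columns of X (X should not contain a constant)."""
    n, k = X.shape
    out = []
    for j in range(k):
        # Auxiliary regression: X_j on a constant and all the other regressors.
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, X[:, j], rcond=None)
        fitted = others @ beta
        r2 = 1 - np.sum((X[:, j] - fitted) ** 2) / np.sum((X[:, j] - X[:, j].mean()) ** 2)
        out.append(1.0 / (1.0 - r2))          # VIF_j = 1 / (1 - R²_j)
    return np.array(out)

rng = np.random.default_rng(0)
x2 = rng.normal(size=100)
x3 = x2 + 0.1 * rng.normal(size=100)          # near-collinear pair
print(vif(np.column_stack([x2, x3])))         # both far above the common cutoff of 10
```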

Remedies for Multicollinearity
(A) Do nothing. As Blanchard put it, "Multicollinearity is God's will, not a problem with OLS or statistical technique in general." Multicollinearity is essentially a data-deficiency problem, and sometimes we have no choice over the data. Moreover, even if we cannot estimate one or more regression coefficients with great precision, a linear combination of them (an estimable function) can often be estimated relatively efficiently.

(B) Some Rules of Thumb
(1) A priori information.
(2) Combining cross-sectional and time-series data.
(3) Dropping variables.
(4) Using ratios or first differences.
(5) Acquiring additional or new data.
(6) Multivariate techniques such as principal components, factor analysis, and ridge regression.

Rules of Thumb: (1) A Priori Information
Yi = β1 + β2X2i + β3X3i + ui
where Y = consumption, X2 = income, and X3 = wealth. Suppose a priori we believe that β3 = 0.10β2. We can then run the regression:
Yi = β1 + β2X2i + 0.10β2X3i + ui = β1 + β2Xi + ui
where Xi = X2i + 0.1X3i. Once we obtain β̂2, we can estimate β̂3 from the postulated relationship between β2 and β3.
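A sketch of this remedy with simulated data in which the restriction β3 = 0.10β2 holds by construction (the numbers are illustrative only):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
income = rng.normal(50, 10, n)
wealth = 10 * income + rng.normal(0, 20, n)                 # wealth nearly collinear with income
y = 5 + 0.8 * income + 0.08 * wealth + rng.normal(0, 5, n)  # note 0.08 = 0.10 * 0.8

X = income + 0.1 * wealth                          # the single restricted regressor
A = np.column_stack([np.ones(n), X])
b1_hat, b2_hat = np.linalg.lstsq(A, y, rcond=None)[0]
b3_hat = 0.10 * b2_hat                             # recover beta3 from the restriction
print(b2_hat, b3_hat)                              # close to the true 0.8 and 0.08
```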

Continued: (2) Combining Cross-Sectional and Time-Series Data
ln Yt = β1 + β2 ln Pt + β3 ln It + ut
where Y = number of cars sold, P = average price, I = income, and t = time. Our objective is to estimate the price elasticity β2 and the income elasticity β3. In time-series data the price and income variables generally tend to be highly collinear. Tobin suggested a way out: if we have cross-sectional data, we can obtain a fairly reliable estimate of the income elasticity β3, because in such data, which refer to a single point in time, prices do not vary much. Let the cross-sectionally estimated income elasticity be β̂3. Using this estimate, we may rewrite the time-series regression as:
Y*t = β1 + β2 ln Pt + ut
where Y*t = ln Yt − β̂3 ln It, i.e., Y* is the value of Y after the income effect has been removed from it. We can now obtain an estimate of the price elasticity β2 from this regression.
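A sketch of Tobin's two-step idea with simulated data (the cross-sectional estimate β̂3 is simply assumed here, since no real cross-section is available):

```python
import numpy as np

rng = np.random.default_rng(2)
T = 60
lnI = np.linspace(3, 4, T) + rng.normal(0, 0.02, T)   # trending income
lnP = 0.9 * lnI + rng.normal(0, 0.02, T)              # price nearly collinear with income
lnY = 2 - 1.2 * lnP + 1.5 * lnI + rng.normal(0, 0.05, T)

b3_cross = 1.48                     # assumed income elasticity from a cross-section
Y_star = lnY - b3_cross * lnI       # purge the income effect from lnY
A = np.column_stack([np.ones(T), lnP])
b2_hat = np.linalg.lstsq(A, Y_star, rcond=None)[0][1]
print(b2_hat)                       # about -1.18, near the true -1.2; the gap reflects
                                    # the error in the assumed cross-section estimate
```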

Continued: (3) Dropping a Variable
Yi = β1 + β2X2i + β3X3i + ui
where Y = consumption, X2 = income, and X3 = wealth. When we drop the wealth variable, we obtain a regression in which the income variable, statistically insignificant in the original model, now becomes "highly" significant.

Continued: (4) Transformation of Variables
Yt = β1 + β2X2t + β3X3t + ut ...(i)
One reason for high multicollinearity between income and wealth in time-series data is that over time both variables tend to move in the same direction. One way of minimizing this dependence is to proceed as follows. If relation (i) holds at time t, it must also hold at time t − 1, because the origin of time is arbitrary. Therefore:
Yt−1 = β1 + β2X2,t−1 + β3X3,t−1 + ut−1 ...(ii)
Subtracting (ii) from (i) gives:
Yt − Yt−1 = β2(X2t − X2,t−1) + β3(X3t − X3,t−1) + vt ...(iii)
where vt = ut − ut−1. Equation (iii) is known as the first-difference form. It reduces the severity of multicollinearity because, although the levels of X2 and X3 may be highly correlated, there is no a priori reason to believe that their differences will also be highly correlated.
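A sketch of why first-differencing helps, with two simulated series sharing a common trend (illustrative data, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(3)
t = np.arange(100, dtype=float)
x2 = t + rng.normal(0, 3, 100)       # e.g. income: trend plus noise
x3 = 2 * t + rng.normal(0, 3, 100)   # e.g. wealth: same trend, so levels are collinear

print(np.corrcoef(x2, x3)[0, 1])                     # close to 1 in levels
print(np.corrcoef(np.diff(x2), np.diff(x3))[0, 1])   # near 0 in first differences
```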

Continued: (5) Additional or New Data
Since multicollinearity is a sample feature, it is possible that in another sample involving the same variables collinearity may not be as serious as in the first. Sometimes simply increasing the sample size (if possible) can attenuate the collinearity problem.
(6) Other Multivariate Techniques
Methods such as factor analysis, principal components, or ridge regression can also be used.
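A minimal principal-components sketch (numpy only; illustrative data): replace the two correlated regressors with their first principal component and regress on it. This yields a stable, precisely estimated coefficient for the combined effect, at the cost of not separating β2 from β3.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 150
x2 = rng.normal(size=n)
x3 = x2 + 0.05 * rng.normal(size=n)       # nearly collinear pair
y = 1 + 2 * x2 + 3 * x3 + rng.normal(size=n)

Z = np.column_stack([x2, x3])
Z = (Z - Z.mean(axis=0)) / Z.std(axis=0)  # standardize before extracting components
_, _, Vt = np.linalg.svd(Z, full_matrices=False)
pc1 = Z @ Vt[0]                           # first principal-component scores
A = np.column_stack([np.ones(n), pc1])
print(np.linalg.lstsq(A, y, rcond=None)[0])  # intercept and a precisely estimated slope
```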

References
Maddala, G. S., and Lahiri, K. (2012): Introduction to Econometrics, Fourth Edition, Wiley India Pvt. Ltd.
Gujarati, D. N., et al. (2011): Basic Econometrics, Fifth Edition, McGraw Hill Education India.
Wooldridge, J. M. (2012): Introductory Econometrics: A Modern Approach, Fifth Edition, Cengage Learning.