APPLIED ECONOMETRICS (Session #2) Instructor: Dr. Vera Lisna, S.Si, M.Phil
MULTIPLE LINEAR REGRESSION MODEL FOR PANEL DATA (Single Equation)
Econometric Methodology by Data Type
- CS (cross-section): Correlation; Regression; Multivariate analysis
- TS (time series), univariate: Regression; AR, MA; ARMA; ARIMA; ARCH, GARCH
- TS (time series), multivariate: Correlation; Regression; Granger causality; VAR; ECM, VECM
- PANEL: Pooled; Fixed Effect; Random Effect
Some Well-Known Panel Data Sets
Names used for panel data:
- Pooled data: pooling of TS and CS data
- Combination of TS and CS data
- Micropanel data
- Longitudinal data: a study over time of a variable or group of subjects
- Event history analysis: studying the movement over time of subjects through successive states or conditions
- Cohort analysis: e.g. following the career path of 1965 graduates of a business school
All of these are PANEL DATA, analyzed with the PANEL DATA REGRESSION MODEL.
Advantages of panel data
- Panel data relate to individuals over time, so there is bound to be heterogeneity → controlling for individual heterogeneity.
- By combining TS and CS data → more informative data, more variability, less collinearity among the variables, more df, and more efficiency. Note: as df increases, the distribution approaches the normal.
- By studying repeated CS of observations → better able to study the dynamics of adjustment.
- Better able to identify and measure effects that are simply not detectable in pure CS or TS data.
- Allow us to construct and test more complicated behavioral models than purely CS or TS data.
- By making data available for several thousand units → minimize bias.
PANEL DATA CAN ENRICH EMPIRICAL ANALYSIS
Limitations of panel data
- Design and data collection problems: coverage (incomplete account of individuals or periods), nonresponse (lack of respondent cooperation or interviewer error), recall (respondent not remembering correctly), frequency of interviewing, interview spacing, etc.
- Distortions from measurement errors: unclear questions, memory errors, inappropriate informants, misrecording of responses, and interviewer effects.
- Selectivity problems (due to self-selectivity, nonresponse, and attrition) → the data shrink.
- Short time-series dimensions.
- Cross-section dependence → panel unit root tests that account for CS dependence are suggested.
Types of panel data
- By completeness: Balanced panel; Unbalanced panel
- By dimensions: Short panel (N > T); Long panel (T > N)
Panel data estimation methods
- Pooled OLS: pool all observations.
- LSDV (fixed effects): pool all observations, but allow each CS unit to have its own (intercept) dummy variable.
- FE Within Group: pool all observations, but express each variable in each CS unit as a deviation from its mean value, then estimate an OLS regression on the mean-corrected values.
- REM: pool all observations and assume that the intercept values are a random drawing from a much bigger population of CS units.
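The LSDV and within-group descriptions above can be checked numerically. The following is a minimal sketch on simulated data, not the course's own computation: the panel dimensions, the intercepts, and the slope `beta_true` are hypothetical values chosen for illustration. It shows that the dummy-variable regression and the mean-deviation regression give the same slope.

```python
import numpy as np

rng = np.random.default_rng(0)
n_units, n_periods = 4, 20                 # hypothetical panel dimensions
beta_true = 0.5                            # hypothetical common slope
alphas = np.array([1.0, 3.0, -2.0, 5.0])   # hypothetical unit intercepts

# Stack the panel: one row per (unit, period) observation.
unit = np.repeat(np.arange(n_units), n_periods)
x = rng.normal(size=n_units * n_periods) + alphas[unit]
y = alphas[unit] + beta_true * x + rng.normal(scale=0.1, size=x.size)

# LSDV: one intercept dummy per unit (no common constant) plus x.
dummies = (unit[:, None] == np.arange(n_units)[None, :]).astype(float)
X_lsdv = np.column_stack([dummies, x])
b_lsdv = np.linalg.lstsq(X_lsdv, y, rcond=None)[0][-1]

# Within group: subtract each unit's mean, then OLS without an intercept.
def demean(v):
    means = np.bincount(unit, weights=v) / n_periods
    return v - means[unit]

x_w, y_w = demean(x), demean(y)
b_within = (x_w @ y_w) / (x_w @ x_w)

print("LSDV slope:  ", b_lsdv)
print("Within slope:", b_within)
```

The two slopes coincide up to floating-point error, which is why LSDV and the within-group estimator are usually treated as two ways of computing the same fixed-effects estimate.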
Panel Data Estimation Methods
- FEM: PLS; WG; LSDV
- REM: B/W estimator; GLS; 2-way error component
Static Panel Data Theory
Weakness of CS data:
- Observed at only one point in time. Example, economic growth analysis: GRDP, investment, and the consumption level are observed at a single point only → economic development over time cannot be seen.
Weakness of TS data:
- Variables are observed in aggregate for a single individual unit → estimates may be biased.
Advantages of panel data:
- Verbeek (2004): combining CS and TS data → a larger number of observations; in a panel data model the explanatory variables are viewed along two dimensions → the estimated parameters are more accurate.
- Hsiao (2004): more informative; reduces collinearity among the explanatory variables; increases df → increases efficiency; able to control for individual heterogeneity.
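Static Panel Data Theory
Two approaches to applying panel data:
- Fixed Effect Model (FEM)
- Random Effect Model (REM)
Difference between FEM and REM: the assumption on whether the error (ε) is correlated with the explanatory variables (X). FEM allows such correlation; REM assumes there is none.
Example: y_it = α_i + β X_it + ε_it
Error components:
- One-way error component model: y_it = α_i + β X_it + λ_i + u_it
- Two-way error component model: y_it = α_i + β X_it + λ_i + μ_t + u_it
Here λ_i is an individual-specific component, μ_t a time-specific component, and u_it the remaining disturbance.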
FEM - PLS Method
Uses all the data combined (pooled) → number of observations = n x t; n = number of CS units, t = number of time periods.
Model: y_it = α_i + β X_it + ε_it ; α_i = α for all i
Calculation formula: [formula not recoverable from the slide]
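Since the slide's own formula is not available, the following is only a sketch of what pooled least squares computes, assuming a single explanatory variable and the standard textbook OLS estimator applied to all n x t stacked observations. The overall means \bar{x} and \bar{y} are notation introduced here, not taken from the slide.

```latex
\hat{\beta} = \frac{\sum_{i=1}^{n}\sum_{t=1}^{T} (x_{it}-\bar{x})(y_{it}-\bar{y})}
                   {\sum_{i=1}^{n}\sum_{t=1}^{T} (x_{it}-\bar{x})^{2}},
\qquad
\hat{\alpha} = \bar{y} - \hat{\beta}\,\bar{x},
\qquad
\bar{x} = \frac{1}{nT}\sum_{i=1}^{n}\sum_{t=1}^{T} x_{it},
\quad
\bar{y} = \frac{1}{nT}\sum_{i=1}^{n}\sum_{t=1}^{T} y_{it}
```

Because the means are taken over the whole pool, differences between units in their average x and y enter the slope estimate directly.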
FEM - PLS Method
Weakness: the parameter β is biased → the PLS slope is not parallel to the regression line of each individual (PLS cannot distinguish observations of the same individual in different periods).
[Figure: y_it plotted against x_it, with parallel group lines α_1 + βx_it (Group 1) and α_2 + βx_it (Group 2), and a pooled line whose slope is biased]
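The slope bias described above can be reproduced with a small simulation. This is an illustrative sketch only: the two groups, their intercepts `alpha_1` and `alpha_2`, the x ranges, and the true slope `beta` are all hypothetical. Both groups share the same slope, but the group with the higher intercept has lower x values, so the pooled line tilts away from the parallel group lines.

```python
import numpy as np

rng = np.random.default_rng(1)
beta = 1.0                         # hypothetical common slope
alpha_1, alpha_2 = 0.0, 10.0       # hypothetical group intercepts
t_periods = 20

# Group 1: low intercept, high x. Group 2: high intercept, low x.
x1 = rng.uniform(5.0, 10.0, size=t_periods)
x2 = rng.uniform(0.0, 5.0, size=t_periods)
y1 = alpha_1 + beta * x1 + rng.normal(scale=0.2, size=t_periods)
y2 = alpha_2 + beta * x2 + rng.normal(scale=0.2, size=t_periods)

# np.polyfit with degree 1 returns [slope, intercept].
slope_1 = np.polyfit(x1, y1, 1)[0]
slope_2 = np.polyfit(x2, y2, 1)[0]

# Pooled regression ignores which group each observation belongs to.
x_all = np.concatenate([x1, x2])
y_all = np.concatenate([y1, y2])
slope_pooled = np.polyfit(x_all, y_all, 1)[0]

print("Group 1 slope:", slope_1)
print("Group 2 slope:", slope_2)
print("Pooled slope: ", slope_pooled)
```

In this setup each group's slope is close to the true value while the pooled slope even takes the wrong sign, because the pooled fit mixes within-group variation with the differences between group means.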
An illustrative example of panel data
The data are taken from the study of investment theory by Y. Grunfeld (1958, “The Determinants of Corporate Investment”, unpublished Ph.D. thesis).
Grunfeld Investment Function
Variables: Y = real gross investment; X2 = real value of the firm; X3 = real capital stock
Y_it = β1 + β2 X2_it + β3 X3_it + u_it
i = 1, 2, 3, 4 (CS identifier); t = 1, 2, …, 20 (TS identifier) → 80 observations, balanced panel
Initial assumptions:
1) X_kit are nonstochastic
2) u_it ~ N(0, σ²)
Panel data:
- Balanced: t_i = t for all i
- Unbalanced: not all t_i = t
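The balanced/unbalanced distinction and the observation count can be checked mechanically. The helper below is a hypothetical sketch (`describe_panel` is not part of any library). It assumes each observation is identified by a (unit, period) pair with no duplicates, and it labels the panel short or long using the N > T and T > N rule from the earlier slide. The years 1935 to 1954 are taken from the sample range of the estimation output.

```python
from collections import Counter

def describe_panel(index_pairs):
    """Summarize a panel given one (unit_id, period) tuple per observation."""
    pairs = list(index_pairs)
    periods_per_unit = Counter(unit for unit, _ in pairs)
    n_units = len(periods_per_unit)
    n_periods = len({period for _, period in pairs})
    # Balanced: every unit is observed in every period.
    balanced = all(count == n_periods for count in periods_per_unit.values())
    if n_units > n_periods:
        shape = "short"
    elif n_periods > n_units:
        shape = "long"
    else:
        shape = None  # N == T is not classified on the slide
    return {
        "units": n_units,
        "periods": n_periods,
        "observations": len(pairs),
        "balanced": balanced,
        "shape": shape,
    }

# Grunfeld-style index: 4 firms observed over 20 years.
grunfeld_index = [(i, t) for i in range(1, 5) for t in range(1935, 1955)]
summary = describe_panel(grunfeld_index)
print(summary)

# Dropping a single firm-year makes the panel unbalanced.
unbalanced_summary = describe_panel(grunfeld_index[:-1])
print(unbalanced_summary)
```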
Estimation of Grunfeld Investment Function
Y_it = β1 + β2 X2_it + β3 X3_it + u_it ; i = 1, 2, 3, 4 ; t = 1, 2, …, 20
Further assumptions (intercept, slope, error term):
1) The intercept and slope coefficients are constant across time and space, and the error term captures differences over time and individuals: β_jit = β_j for all j, i, t, and not all u_it = u
2) The slope coefficients are constant but the intercept varies over individuals
3) The slope coefficients are constant but the intercept varies over individuals and time
4) All coefficients (the intercept and slope) vary over individuals
5) The intercept and slope coefficients vary over individuals and time
1) ALL COEFFICIENTS CONSTANT ACROSS TIME AND INDIVIDUALS

Dependent Variable: Y?
Method: Pooled Least Squares
Date: 10/24/14   Time: 09:06
Sample: 1935 1954
Included observations: 20
Cross-sections included: 4
Total pool (balanced) observations: 80

Variable    Coefficient    Std. Error    t-Statistic    Prob.
C           -63.30414      29.61420      -2.137628      0.0357
X1?           0.110096      0.013730      8.018809      0.0000
X2?           0.303393      0.049296      6.154553

R-squared             0.756528     Mean dependent var       290.9154
Adjusted R-squared    0.750204     S.D. dependent var       284.8528
S.E. of regression    142.3682     Akaike info criterion    12.79149
Sum squared resid     1560690.     Schwarz criterion        12.88081
Log likelihood        -508.6596    Hannan-Quinn criter.     12.82730
F-statistic           119.6292     Durbin-Watson stat       0.218717
Prob(F-statistic)     0.000000

Fitted equation (X1? and X2? in the output correspond to X2 and X3 in the model):
Ŷ = -63.3041 + 0.1101 X2 + 0.3034 X3
se = (29.6142) (0.0137) (0.0493)
t = (-2.1376) (8.0188) (6.1545)
R² = 0.7565   DW = 0.2187   n = 80   df = n - 3 = 77

- All coefficients are individually statistically significant
- All slope coefficients have positive signs
- The R² value is high
- DW is quite low → perhaps there is autocorrelation
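The summary statistics in this output are linked by simple arithmetic, which can be verified from the reported numbers. The sketch below uses only values printed in the output above and the standard OLS definitions for a model with an intercept. The variable names are chosen here for illustration, and small differences in the last digits come from rounding in the printed standard errors.

```python
import math

n_obs, n_params = 80, 3   # 80 pooled observations; intercept plus two slopes
coef = {"C": -63.30414, "X1?": 0.110096, "X2?": 0.303393}
se = {"C": 29.61420, "X1?": 0.013730, "X2?": 0.049296}
r2 = 0.756528             # reported R-squared
ssr = 1560690.0           # reported sum of squared residuals

df = n_obs - n_params                                 # residual degrees of freedom
t_stats = {name: coef[name] / se[name] for name in coef}
adj_r2 = 1 - (1 - r2) * (n_obs - 1) / df              # adjusted R-squared
f_stat = (r2 / (n_params - 1)) / ((1 - r2) / df)      # overall F-statistic
se_regression = math.sqrt(ssr / df)                   # S.E. of regression

print("df =", df)
for name, value in t_stats.items():
    print(f"t({name}) = {value:.4f}")
print(f"adjusted R2 = {adj_r2:.6f}")
print(f"F = {f_stat:.4f}")
print(f"S.E. of regression = {se_regression:.4f}")
```

Each recomputed value can be compared with the corresponding entry in the output as a consistency check on how the table is read.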
Pooled Least Squares output (Y?, X1?, X2?): same as the output shown under 1) ALL COEFFICIENTS CONSTANT ACROSS TIME AND INDIVIDUALS. The Panel Least Squares output for the same model follows; the two agree up to small numerical differences.

Dependent Variable: Y
Method: Panel Least Squares
Date: 10/24/14   Time: 09:19
Sample: 1935 1954
Periods included: 20
Cross-sections included: 4
Total panel (balanced) observations: 80

Variable    Coefficient    Std. Error    t-Statistic    Prob.
C           -63.30245      29.61417      -2.137573      0.0357
X1            0.110095      0.013730      8.018798      0.0000
X2            0.303392      0.049296      6.154541

R-squared             0.756528     Mean dependent var       290.9163
Adjusted R-squared    0.750204     S.D. dependent var       284.8522
S.E. of regression    142.3681     Akaike info criterion    12.79149
Sum squared resid     1560687.     Schwarz criterion        12.88081
Log likelihood        -508.6595    Hannan-Quinn criter.     12.82730
F-statistic           119.6289     Durbin-Watson stat       0.218716
Prob(F-statistic)     0.000000