Esman M. Nyamongo, Central Bank of Kenya. Econometrics course organized by the COMESA Monetary Institute (CMI), 2-11 June 2014, KSMS, Nairobi, Kenya.
There are three types of data: cross-sectional data, time series data and panel data. The cross-section and the time series are the primary building blocks of panel data.
A time series is a set of observations on the values that a variable takes at different times. Such data may be collected at regular time intervals:
◦ Minutely and hourly - collected literally continuously (the so-called real-time quote)
◦ Daily - e.g., financial time series (stock prices, exchange rates); weather reports (rainfall, temperature)
◦ Weekly - e.g., money supply
◦ Monthly - e.g., consumer price index
◦ Quarterly - e.g., GDP
◦ Semi-annually - e.g., fiscal data
◦ Annually - e.g., fiscal data
◦ Quinquennially (every 5 years) - e.g., manufacturing survey
◦ Decennially (every 10 years) - e.g., population census data
Illustration of a time series.
The model setup: y t = α + β X t + u t, where the subscript t indexes the time series observations.
Cross-section data are data on one or more variables collected at a particular point in time. Examples include: survey data, where a questionnaire is designed to capture all the variables a researcher is looking for; macro data relating to different economic entities (countries, banks) at a particular point in time, e.g., 2010; and other data.
The model setup: y i = α + β X i + u i, where the subscript i indexes the cross-section units.
Panel data are a combination of both time series and cross-section data. A specialized type is longitudinal or micropanel data, where a cross-sectional unit (say, an individual, family or firm) is surveyed over time. Surveying the same individuals over time provides useful information on the dynamics of individual, household or firm behavior.
Time series + cross-section: y it = α + β X it + u it, where i indexes the cross-section units and t indexes the time series observations.
Pooled data.
Longitudinal/micropanel data.
Controlling for individual heterogeneity: panel data suggest that individuals, firms, states or countries are heterogeneous.
Panel data give more informative data, more variability, less collinearity among the variables, more degrees of freedom and more efficiency.
Panel data are better able to study the dynamics of adjustment.
Panel data are better able to identify and measure effects that are simply not detectable in pure cross-section or pure time-series data.
Panel data models allow us to construct and test more complicated behavioral models than purely cross-section or time-series data.
Micro panel data gathered on individuals, firms and households may be more accurately measured than similar variables measured at the macro level. Biases resulting from aggregation over firms or individuals may be reduced or eliminated (see Blundell, 1988; Klevmarken, 1989).
Macro panel data, on the other hand, have a longer time series dimension and, unlike unit root tests in time-series analysis, which suffer from nonstandard distributions, panel unit root tests have standard asymptotic distributions.
Design and data collection problems.
Distortions from measurement errors: measurement errors may arise because of faulty responses due to unclear questions, memory errors, deliberate distortion of responses (e.g., prestige bias), inappropriate informants, misrecording of responses and interviewer effects.
Selectivity problems: self-selectivity, nonresponse, attrition.
Short time-series dimension.
Cross-section dependence.
The pooled model emphasizes the joint estimation of the coefficients and ignores the panel structure of the data: y it = α + β X it + u it, where y is the dependent variable and the Xs are the regressors, for all i and t. It assumes that, for a given cross-section unit, the observations are serially uncorrelated, and that across cross-sections and time the errors are homoscedastic.
These classical assumptions therefore suggest we estimate the equation by OLS. Pooling increases the degrees of freedom, potentially lowering the standard errors on the coefficients. It involves stacking the cross-sections in the data set. This form assumes the same intercept and the same slope coefficients for all units. But how realistic is it to ignore the panel structure of the data?
Estimation of the pooled model: preparation of the data for use in panel estimation.
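As an illustration (not part of the original slides), here is a minimal Python sketch using a hypothetical simulated panel with columns country, year, x1, x2 and y: the data are stacked in long format and the pooled model is estimated by OLS. The same simulated DataFrame is reused in the later sketches.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Hypothetical long-format panel: one row per country-year observation
rng = np.random.default_rng(0)
countries = [f"C{i}" for i in range(10)]
years = range(2000, 2011)
panel = pd.DataFrame([(c, t) for c in countries for t in years],
                     columns=["country", "year"])
panel["x1"] = rng.normal(size=len(panel))
panel["x2"] = rng.normal(size=len(panel))
panel["y"] = 1.0 + 0.5 * panel["x1"] - 0.3 * panel["x2"] + rng.normal(size=len(panel))

# Pooled OLS: one intercept and one slope vector for every cross-section unit;
# the panel structure of the errors is ignored
X = sm.add_constant(panel[["x1", "x2"]])
pooled = sm.OLS(panel["y"], X).fit()
print(pooled.summary())
```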
Here the same slope and intercept are assumed for all units, i.e., homogeneity of the cross-section units.
Recall y it = α + β X it + u it. In this model u it may follow three different schemes:
(i) u it consists of three individual shocks, each assumed to be independent of the others: u it = λ t + v i + ε it, where λ t is a cross-section-invariant shock (time effect), v i is a time-invariant shock (individual effect), and ε it is the error term with the usual properties, uncorrelated with X it.
(ii) u it = v i + ε it, which yields the cross-section fixed effects model.
(iii) u it = λ t + ε it, which yields the time fixed effects model.
Schemes (ii) and (iii) are referred to as the ONE-WAY ERROR COMPONENT MODEL; scheme (i) is the TWO-WAY ERROR COMPONENT MODEL.
Recall the restrictions imposed by the pooled model, i.e., a joint intercept and slope for i = 1, 2, …, N and t = 1, 2, …, T. The one-way error component model relaxes this by allowing cross-section heterogeneity in the error term: the error term u it becomes the sum of an individual-specific effect (v i, time invariant) and a 'well behaved' disturbance (ε it).
In this formulation: the first part (v i) varies across cross-section units but is constant across time; the second part (ε it) varies unsystematically (independently) across time and individuals.
There are two ways to estimate a regression model whose error term is assumed to consist of several error components:
Fixed-effects model: (1) each equation's constant is a separate parameter; (2) the values of v i are potentially correlated with the other regressors.
Random-effects model: (1) the differences in the v i are randomly distributed between units; (2) the values of v i are uncorrelated with the other regressors.
Recall the fixed-effects model: (1) each equation's constant is a separate parameter; (2) the values of v i are potentially correlated with the other regressors.
The main question is whether X it is correlated with u it.
If not, then we have a seemingly unrelated regression.
If so, then we have a multi-equation system with common coefficients and endogenous regressors.
How, then, do we account for this endogeneity? In time series we use instrumental variable estimation methods (2SLS, 3SLS, etc.). In a panel, however, we can deal with this, under certain assumptions, without using instruments. How?
There are three approaches to doing this in a panel:
The least squares dummy variables (LSDV) estimator
The within-group estimator
The first difference estimator
These are discussed in turn; a brief sketch of the first difference estimator appears below.
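Since the first difference estimator is only named here, the following illustrative sketch (not from the slides, and reusing the simulated panel DataFrame from the pooled example) shows how differencing within each country removes the time-invariant individual effect v i:

```python
import statsmodels.api as sm

# First differences within each country: (y_it - y_i,t-1) = β(x_it - x_i,t-1) + (ε_it - ε_i,t-1),
# so the time-invariant effect v_i drops out
panel = panel.sort_values(["country", "year"])
diffed = panel.groupby("country")[["y", "x1", "x2"]].diff().dropna()

# OLS on the differenced data; no constant because the intercept differences out
fd = sm.OLS(diffed["y"], diffed[["x1", "x2"]]).fit()
print(fd.params)
```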
The LSDV method applies OLS to the levels with group-specific dummies added to the list of regressors. This explains why the estimator is called the Least-Squares Dummy Variables (LSDV) estimator. Consider the general model: y it = v i + β X it + ε it. Stacking the observations over t for each unit i gives: y i = v i 1 T + X i β + ε i.
The pooled regression is then: y = (I N ⊗ 1 T)v + Xβ + ε, where I N ⊗ 1 T is a Kronecker product (the NT×N matrix of individual dummies). Since 1 T is a regressor alongside X it, we then expect:
This model is appealing, but consider the number of parameters to be estimated: K + 1 + (N − 1) = K + N, where K is the number of parameters on the original X-regressors, 1 is the parameter for the intercept, and N − 1 are the parameters for the cross-section fixed effects (the omitted cross-section is captured by the intercept). That is too many parameters, especially with large N! Is there another way? Yes.
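An illustrative LSDV sketch (not from the slides), continuing with the simulated panel: country dummies are added as regressors, so K + N parameters are estimated by OLS.

```python
import pandas as pd
import statsmodels.api as sm

# One dummy per country, dropping the first so the common intercept is its baseline
dummies = pd.get_dummies(panel["country"], prefix="d", drop_first=True, dtype=float)
X_lsdv = sm.add_constant(pd.concat([panel[["x1", "x2"]], dummies], axis=1))

lsdv = sm.OLS(panel["y"], X_lsdv).fit()
print(lsdv.params[["x1", "x2"]])   # slope estimates; the dummy coefficients are the fixed effects
```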
But we need to proceed! Even though OLS would be valid as an estimation method, it is inappropriate, as the parameter space is too large: there are N + K regression parameters, and N is usually very large. We therefore estimate β by OLS following the Frisch-Waugh-Lovell (FWL) theorem on partitioned regressions (see the digression on FWL and partitioned regressions at the end).
Based on this result we can show that the fixed-effects estimator of β is the partitioned OLS estimator in the pooled regression: β is estimated by regressing Qy on QX, where Q is the annihilator of the individual dummies. It then follows that premultiplying the pooled regression by Q wipes out the individual effects. Therefore the estimator is (X'QX)^-1 X'Qy.
Whence it follows that the fixed-effects estimator is (X'QX)^-1 X'Qy. This is basically a pooled OLS estimator on transformed (demeaned) data.
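A quick numerical check of this FWL logic (illustrative, reusing the simulated panel and the LSDV fit above): partialling the country dummies out of y and the X-regressors and running OLS on the residuals reproduces the LSDV slope estimates.

```python
import numpy as np
import pandas as pd

# Annihilator of the full dummy matrix D: M_D = I - D(D'D)^-1 D'
D = pd.get_dummies(panel["country"], dtype=float).to_numpy()
M_D = np.eye(len(panel)) - D @ np.linalg.solve(D.T @ D, D.T)

Xmat = panel[["x1", "x2"]].to_numpy()
y = panel["y"].to_numpy()

# FWL: regress M_D y on M_D X; the slopes match the x1, x2 coefficients from the LSDV fit
beta_fwl, *_ = np.linalg.lstsq(M_D @ Xmat, M_D @ y, rcond=None)
print(beta_fwl)
```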
The within estimator still assumes individual effects, although we no longer estimate them directly. We demean the data, which wipes out the individual effects, so that we estimate only β. How do we wipe out the individual effects? We define a matrix Q such that Qy stacks the deviations from the individual means, i.e., each y it is replaced by y it − ȳ i, where ȳ i is the mean of y it over t for unit i.
Consider a simple regression: y it = v i + β x it + ε it. The trick is to remove the fixed effect, v i. How? Step 1: average over time t for each i: ȳ i = v i + β x̄ i + ε̄ i.
Step 2: subtract the time averages to get the transformed regression: y it − ȳ i = β(x it − x̄ i) + (ε it − ε̄ i). Then stack by observation for t = 1, …, T, resulting in the 'giant' (pooled) regression on the demeaned data.
The within-group fixed-effects estimator is pooled OLS on the transformed regression that has been stacked by observations. Degrees of freedom of the FE estimator: nT − k − n = n(T − 1) − k. Why? We lose one degree of freedom for each fixed effect estimated, and there are k βs to be estimated as well.
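An illustrative within-estimator sketch (continuing the simulated panel): demean by country, apply pooled OLS to the deviations, and correct the degrees of freedom for the n fixed effects.

```python
import statsmodels.api as sm

# Deviations from the country means wipe out the fixed effects
cols = ["y", "x1", "x2"]
demeaned = panel[cols] - panel.groupby("country")[cols].transform("mean")

within = sm.OLS(demeaned["y"], demeaned[["x1", "x2"]]).fit()
print(within.params)   # same slopes as the LSDV regression

# Correct degrees of freedom: nT - k - n = n(T - 1) - k
n, T, k = panel["country"].nunique(), panel["year"].nunique(), 2
print(n * T - k - n)
```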
Consider the general model: y it = v i + β x it + ε it. Stack an individual i's observations for t = 1, …, T, giving: y i = v i 1 T + X i β + ε i, for i = 1, 2, …, n, where 1 T is a (T×1) vector of ones.
The fixed-effects estimator is applied to a T-equation system transformed from the original system above. The matrix used for the transformation is the so-called annihilator associated with 1 T: Q T = I T − P T, where P T = (1/T) 1 T 1 T' and Q T is a T×T matrix.
The matrices Q T and P T are such that P T 1 T = 1 T and Q T 1 T = 0. What does this mean? P T averages over time for each unit, and Q T takes you to the transformed (demeaned) y: Q T y i stacks the deviations y it − ȳ i.
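A small numerical check of these properties (illustrative), taking T = 5: P T replaces each element by the time average, Q T gives the deviations from that average, and both matrices are idempotent.

```python
import numpy as np

T = 5
ones = np.ones((T, 1))
P_T = ones @ ones.T / T          # averaging matrix
Q_T = np.eye(T) - P_T            # annihilator of 1_T

y_i = np.arange(1.0, T + 1)      # a toy time series for one cross-section unit
print(P_T @ y_i)                 # every element equals y_i.mean()
print(Q_T @ y_i)                 # y_i minus its mean
print(np.allclose(Q_T @ Q_T, Q_T), np.allclose(Q_T @ ones, 0))   # idempotent, Q_T 1_T = 0
```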
The transformed error-components model is then: Q T y i = Q T X i β + Q T ε i, since Q T 1 T v i = 0. The pooled regression is stated by stacking these transformed equations over i: Qy = QXβ + Qε, with Q = I N ⊗ Q T.
The fixed-effects estimator is again obtained by pooled OLS on the transformed system: (X'QX)^-1 X'Qy. Why? Q T is idempotent, i.e., Q T Q T = Q T; in other words, OLS on the transformed data gives (X'Q'QX)^-1 X'Q'Qy = (X'QX)^-1 X'Qy.
P is the matrix that averages across time for each individual cross-section. Thus pre-multiplying this regression by Q obtains deviations from the means WITHIN each cross-section; Qy is an NT×1 vector with 'stacked' deviations. The OLS estimator is therefore (X'QX)^-1 X'Qy. Demeaning the data will not change the estimates of β; it is similar to running a regression with the line of best fit passing through the origin.
Thus the 'WITHIN' model becomes a simple regression on the demeaned data: (y it − ȳ i) = β(x it − x̄ i) + (ε it − ε̄ i). The individual effects can then be solved for (not estimated), but we need the assumption that the time-averaged disturbances ε̄ i are zero in expectation. Solving: v i = ȳ i − β x̄ i, evaluated at the within estimate of β.
Notice that the fixed effects are not estimated; they are computed afterwards.
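Continuing the illustrative within sketch, the fixed effects can be computed from the group means and the within slope estimates (v i = ȳ i − x̄ i'β):

```python
# Group means by country (reusing the simulated panel and the within fit above)
means = panel.groupby("country")[["y", "x1", "x2"]].mean()

# Compute (not estimate) each country's fixed effect
v_hat = means["y"] - means[["x1", "x2"]] @ within.params
print(v_hat)
```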
The computed fixed effects are indicated in the output. But what are these fixed effects?
Disadvantage of the method: demeaning the data means that X-regressors which are themselves time-invariant dummy variables (sex, religion, etc.) cannot be used, since they are wiped out.
Both coefficients are positive and significant, and therefore consistent with some theory! Which model do we choose?
The null hypothesis: all individual effects are jointly zero. Alternative HA: not all equal to 0. We test the null hypothesis of no individual effects with an applied Chow or F-test, combining the residual sums of squares from the regression both with the constraints imposed (under the null) and without them (under the alternative). The recipe: RSS from OLS on the pooled model (constant intercept); URSS from OLS on the LSDV model.
The F-statistic is stated as: F = [(RSS − URSS)/(N − 1)] / [URSS/(NT − N − K)]. If N is 'large enough', the 'WITHIN' estimation can be used instead of LSDV to obtain the URSS. The decision rule: if the p-value < 0.05, we reject the null hypothesis, so the FE are not redundant; if the p-value > 0.05, we fail to reject the null hypothesis, so the FE are redundant, suggesting the pooled model is valid.
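An illustrative computation of this redundant fixed effects F-test, reusing the pooled and LSDV fits on the simulated panel (the F formula above is assumed):

```python
from scipy import stats

N, T, K = panel["country"].nunique(), panel["year"].nunique(), 2

rss = (pooled.resid ** 2).sum()    # restricted: pooled OLS, common intercept
urss = (lsdv.resid ** 2).sum()     # unrestricted: LSDV, one intercept per country

F = ((rss - urss) / (N - 1)) / (urss / (N * T - N - K))
p_value = stats.f.sf(F, N - 1, N * T - N - K)
print(F, p_value)   # small p-value => fixed effects are not redundant
```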
Recall: y it = α + β X it + u it, with u it = v i + λ t + ε it, for i = 1, …, N and t = 1, …, T, where v i is the unobservable individual effect, λ t is the unobservable time effect, and ε it is the stochastic disturbance. The individual effects enter through a selector matrix of ones and zeros, and the time effects through a matrix of time dummies.
Here we still assume that v i and λ t are fixed parameters to be estimated, and that ε it is a well-behaved disturbance. Estimation with LSDV requires the estimation of {(N − 1) + (T − 1)} dummies. This can introduce a rather severe loss of degrees of freedom. Once again, to avoid this problem we perform the 'WITHIN' transformation (similar to the one-way model); now, however, we must demean across both dimensions.
Here we work with a transformation built from J N, the N×N matrix of ones (and J T, its T×T counterpart): deviations across i are taken with (I N − J N/N), deviations across t with (I T − J T/T), and the two are combined into Q. Transforming with Q sweeps out both the time and individual effects.
Here we have a simple regression (one X-regressor): y it = α + β x it + v i + λ t + ε it. The within transformation replaces each observation by y it − ȳ i. − ȳ .t + ȳ .. (and similarly for x). We now need two constraints to capture the individual and time effects, namely that the v i sum to zero and the λ t sum to zero. Then we can compute the intercept using α = ȳ .. − β x̄ .., evaluated at the within estimate of β. Again, as in the one-way model, we cannot use time-invariant or individual-invariant (dummy) regressors, as Q wipes them out.
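An illustrative two-way within sketch (simulated panel again): take deviations from the country means and the year means, add back the grand mean, run OLS, and recover the intercept from the grand means.

```python
import statsmodels.api as sm

cols = ["y", "x1", "x2"]
grand = panel[cols].mean()
by_i = panel.groupby("country")[cols].transform("mean")
by_t = panel.groupby("year")[cols].transform("mean")

# Doubly demeaned data: y_it - ybar_i. - ybar_.t + ybar_..
dd = panel[cols] - by_i - by_t + grand

two_way = sm.OLS(dd["y"], dd[["x1", "x2"]]).fit()
alpha_hat = grand["y"] - grand[["x1", "x2"]] @ two_way.params   # recovered intercept
print(two_way.params, alpha_hat)
```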
How do we interpret these results?
Same results.
Consider the partitioned regression equation: y = X 1 β 1 + X 2 β 2 + u. The least-squares estimators for β 1 and β 2 can be expressed as (X 1'M 2 X 1)^-1 X 1'M 2 y and (X 2'M 1 X 2)^-1 X 2'M 1 y respectively, where M j = I − X j(X j'X j)^-1 X j' is the residual maker associated with X j.
The residual maker and the hat matrix. Some useful matrices: we know that the OLS residuals are e = y − X(X'X)^-1 X'y, meaning e = My, where M = I − X(X'X)^-1 X' is called the residual maker, since it makes residuals out of y. The matrix M is idempotent [MM = M].
M is a square matrix and idempotent (show this). The matrix has the further properties MX = 0 and M' = M. The hat matrix (H) makes ŷ, the fitted values, out of y: ŷ = Hy, where H = X(X'X)^-1 X'.
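A small numerical check of these matrices (illustrative, using the regressors from the simulated pooled example above):

```python
import numpy as np
import statsmodels.api as sm

Xc = sm.add_constant(panel[["x1", "x2"]]).to_numpy()
H = Xc @ np.linalg.solve(Xc.T @ Xc, Xc.T)   # hat matrix: H @ y gives the fitted values
M = np.eye(len(panel)) - H                  # residual maker: M @ y gives the OLS residuals

y = panel["y"].to_numpy()
print(np.allclose(M @ M, M), np.allclose(H @ H, H))   # both idempotent
print(np.allclose(M @ Xc, 0))                         # M annihilates X
print(np.allclose(M @ y, pooled.resid))               # M @ y equals the pooled OLS residuals
```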