Econometrics III Evgeniya Anatolievna Kolomak, Professor
Theme 1. The classical multiple linear regression
The classical multiple linear regression. Model specification
Multiple linear regression model:
$y_i = f(x_{i1}, x_{i2}, \dots, x_{iK}) + \varepsilon_i$
$y_i = \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_K x_{iK} + \varepsilon_i, \quad i = 1, \dots, n$
where
$y$ is the dependent (explained) variable, or regressand;
$x_1, x_2, \dots, x_K$ are the independent (explanatory) variables, or regressors;
$\varepsilon_i$ is the random disturbance (error term).
The classical multiple linear regression. Model specification
We assume that each observation $(y_i, x_{i1}, x_{i2}, \dots, x_{iK})$, $i = 1, \dots, n$, is generated by the process
$y_i = \underbrace{\beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_K x_{iK}}_{\text{deterministic part}} + \underbrace{\varepsilon_i}_{\text{random part}}$
The classical multiple linear regression. Assumptions of the model
1. Linear functional form
2. Identifiability of the model parameters
3. Expected value of the disturbance given observed information
4. Variance and covariance of the disturbances given observed information
5. Nature of the independent variables
6. Probability distribution of the stochastic part of the model
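Assumptions of the model. 1. Linear functional form
Let
$y$ be the column vector ($n \times 1$) of the observations $y_1, y_2, \dots, y_n$;
$x_k$ be the column vector ($n \times 1$) of the observations on variable $k$;
$X$ be the matrix ($n \times K$) assembled from the columns $x_1, \dots, x_K$;
$\varepsilon$ be the column vector ($n \times 1$) containing the $n$ disturbances.
The model $y_i = \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_K x_{iK} + \varepsilon_i$ can be written
$y = x_1 \beta_1 + \dots + x_K \beta_K + \varepsilon$
$y = X\beta + \varepsilon$
Our primary interest is estimation of and inference about $\beta$.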
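To make the stacking into matrix form concrete, here is a minimal sketch in Python with NumPy. The sample size, the coefficient values and the simulated data are all made up for illustration; only the notation $y$, $X$, $\beta$, $\varepsilon$ comes from the slide.

```python
import numpy as np

rng = np.random.default_rng(0)        # fixed seed, illustration only
n, K = 6, 3                           # hypothetical sample size and number of regressors

# Columns x_1, ..., x_K placed side by side give the (n x K) matrix X.
x1 = np.ones(n)                       # constant term
x2 = rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])

beta = np.array([1.0, 2.0, -0.5])     # made-up "true" coefficients
eps = rng.normal(scale=0.1, size=n)   # made-up disturbances

# Observation-by-observation form: y_i = sum_k beta_k * x_ik + eps_i
y_scalar = np.array(
    [sum(beta[k] * X[i, k] for k in range(K)) + eps[i] for i in range(n)]
)

# Matrix form: y = X beta + eps
y_matrix = X @ beta + eps
```

Both constructions produce the same vector, which is all the matrix notation claims.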
Assumptions of the model. 2. Identifiability of the model parameters
Identification condition: $X$ is an $n \times K$ matrix with rank $K$.
That is, the columns of $X$ are linearly independent and there are at least $K$ observations ($n \ge K$).
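The rank condition can be checked numerically. The sketch below uses `numpy.linalg.matrix_rank`; the helper name `has_full_column_rank` and the two small design matrices are hypothetical examples, not part of the lecture.

```python
import numpy as np

def has_full_column_rank(X):
    """Return True when the n x K matrix X has rank K (and n >= K)."""
    n, K = X.shape
    return bool(n >= K and np.linalg.matrix_rank(X) == K)

ones = np.ones(5)
x2 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x3 = np.array([2.0, 1.0, 4.0, 3.0, 6.0])

# Three linearly independent columns: the identification condition holds.
X_ok = np.column_stack([ones, x2, x3])

# The third column equals 3 * ones + 2 * x2, so the columns are
# perfectly collinear and the rank drops to 2.
X_bad = np.column_stack([ones, x2, 3.0 * ones + 2.0 * x2])

ok = has_full_column_rank(X_ok)
bad = has_full_column_rank(X_bad)
```

In the collinear case $X^T X$ is singular, so $\beta$ cannot be recovered uniquely from the data.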
Assumptions of the model. 3. Expected value of the disturbance $\varepsilon_i$
The disturbance, conditioned on the observations $X$, is assumed to have expected value 0 at every observation:
$E[\varepsilon_i \mid X] = 0$ for all $i = 1, \dots, n$
The assumption implies
$E[y \mid X] = X\beta$
Assumptions of the model. 4. Variance and covariance of the disturbances
Constant variance, or homoscedasticity:
$\mathrm{Var}[\varepsilon_i \mid X] = \sigma^2$ for all $i = 1, \dots, n$
Absence of correlation across observations, or absence of autocorrelation:
$\mathrm{Cov}[\varepsilon_i, \varepsilon_j \mid X] = 0$ for all $i \ne j$
Summarizing:
$E[\varepsilon \varepsilon^T \mid X] = \sigma^2 I$, where $I$ is the identity matrix.
Assumptions of the model. 5. Non-stochastic regressors
$X$ is a known $n \times K$ matrix of constants, that is, $X$ is a non-stochastic matrix.
Consequently, $X$ and $\varepsilon$ are uncorrelated.
Assumptions of the model. 6. Probability distribution of $\varepsilon$
The disturbances $\varepsilon_i$ are normally distributed:
$\varepsilon \mid X \sim N[0, \sigma^2 I]$
Normality makes it possible to obtain several exact statistical results and to construct test statistics.
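Assumptions of the classical regression model. Summary
Number   Assumption
A1       $y = X\beta + \varepsilon$
A2       $X$ is an $n \times K$ matrix with rank $K$
A3       $E[\varepsilon \mid X] = 0$
A4       $E[\varepsilon \varepsilon^T \mid X] = \sigma^2 I$
A5       $X$ is a non-stochastic matrix
A6       $\varepsilon \mid X \sim N[0, \sigma^2 I]$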
Least squares regression
The population quantities are $\beta$ and $\varepsilon_i$.
The sample estimates are $b$ and $e_i$.
For any value of $b$, the residual is
$e_i = y_i - x_i^T b$, so that $y_i = x_i^T b + e_i$,
where $x_i$ is the $K \times 1$ vector of regressors for observation $i$.
The vector $b$ is chosen so that the fitted line $x_i^T b$ is close to the data points.
The fitting criterion is least squares.
Least squares regression
The problem to solve is:
$\min_b \; e^T e = (y - Xb)^T (y - Xb)$
The solution $b$ satisfies the least squares normal equations:
$X^T X b = X^T y$
If the inverse of $X^T X$ exists, which follows from the full rank assumption, then
$b = (X^T X)^{-1} X^T y$
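A minimal numerical sketch of the normal equations, assuming NumPy and a tiny made-up data set. The helper name `ols` is hypothetical. It solves $X^T X b = X^T y$ with `numpy.linalg.solve` rather than forming the inverse explicitly, which gives the same $b$ under full rank and is usually better conditioned.

```python
import numpy as np

def ols(X, y):
    """Least squares coefficients from the normal equations X'X b = X'y."""
    return np.linalg.solve(X.T @ X, X.T @ y)

# Made-up data: a constant and one regressor, four observations.
X = np.column_stack([np.ones(4), np.array([0.0, 1.0, 2.0, 3.0])])
y = np.array([1.1, 2.9, 5.2, 6.8])

b = ols(X, y)          # coefficient vector (intercept, slope)
e = y - X @ b          # least squares residuals

# Cross-check against NumPy's own least squares routine.
b_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
```

The normal equations are equivalent to $X^T e = 0$: the residuals are orthogonal to every column of $X$.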
Least squares regression
Algebraic properties of least squares
Let the first column of $X$ be a column of ones. Then:
1. The least squares residuals sum to zero: $\sum_i e_i = 0$.
2. The regression passes through the point of means of the data: $\bar{y} = \bar{x}^T b$.
3. The mean of the fitted values from the regression equals the mean of the actual values: $\bar{\hat{y}} = \bar{y}$, where $\hat{y} = Xb$.
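The three algebraic properties can be verified on simulated data. This is a sketch under assumed values: the sample size, coefficients and random seed are arbitrary, and the first column of `X` is a column of ones as the slide requires.

```python
import numpy as np

rng = np.random.default_rng(1)        # fixed seed, illustration only
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # constant + 2 regressors
y = X @ np.array([0.5, 1.0, -2.0]) + rng.normal(size=n)     # made-up coefficients

b = np.linalg.solve(X.T @ X, X.T @ y)
y_hat = X @ b                          # fitted values
e = y - y_hat                          # residuals

resid_sum = e.sum()                    # property 1: should be ~0
x_bar = X.mean(axis=0)                 # vector of regressor means
gap_means = y.mean() - x_bar @ b       # property 2: should be ~0
gap_fitted = y_hat.mean() - y.mean()   # property 3: should be ~0
```

All three quantities are zero up to floating point error. Dropping the column of ones from `X` generally breaks each of them.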
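Partitioned regression
Suppose the regression involves two sets of variables, $X_1$ and $X_2$:
$y = X_1 \beta_1 + X_2 \beta_2 + \varepsilon$
However, we are interested in estimates of $\beta_2$ only.
The normal equations are:
$\begin{bmatrix} X_1^T X_1 & X_1^T X_2 \\ X_2^T X_1 & X_2^T X_2 \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \end{bmatrix} = \begin{bmatrix} X_1^T y \\ X_2^T y \end{bmatrix}$
We first solve for $b_1$:
$b_1 = (X_1^T X_1)^{-1} X_1^T (y - X_2 b_2)$
So $b_1$ is the coefficient vector in the regression of $(y - X_2 b_2)$ on $X_1$.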
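Partitioned regression
Given the solution for $b_1$, we solve the second normal equation, $X_2^T X_1 b_1 + X_2^T X_2 b_2 = X_2^T y$. Substituting $b_1 = (X_1^T X_1)^{-1} X_1^T (y - X_2 b_2)$ gives
$X_2^T M_1 X_2 \, b_2 = X_2^T M_1 y$
$b_2 = (X_2^T M_1 X_2)^{-1} X_2^T M_1 y$
where $M_1 = I - X_1 (X_1^T X_1)^{-1} X_1^T$ and $I$ is the identity matrix.
The matrix $M_1$ is:
1) symmetric ($M_1^T = M_1$) and
2) idempotent ($M_1 M_1 = M_1$).
Proof:
1. $M_1^T = I - \left( X_1 (X_1^T X_1)^{-1} X_1^T \right)^T = I - X_1 (X_1^T X_1)^{-1} X_1^T = M_1$
2. $M_1 M_1 = I - 2 X_1 (X_1^T X_1)^{-1} X_1^T + X_1 (X_1^T X_1)^{-1} X_1^T X_1 (X_1^T X_1)^{-1} X_1^T = I - X_1 (X_1^T X_1)^{-1} X_1^T = M_1$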
Partitioned regression
Properties of the matrices $M_1$ and $P$
$M_1 y$ is the vector of residuals in the regression of $y$ on $X_1$.
$M_1 X_2$ is the matrix of residuals from the regressions of the columns of $X_2$ on $X_1$.
Let $X_2^* = M_1 X_2$ and $y^* = M_1 y$. Then
$b_2 = (X_2^{*T} X_2^*)^{-1} X_2^{*T} y^*$
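The properties of $M_1$ can be checked numerically. The sketch assumes that $M_1$ is the usual residual-maker matrix $I - X_1 (X_1^T X_1)^{-1} X_1^T$ and that the projection matrix is $P_1 = X_1 (X_1^T X_1)^{-1} X_1^T$; the names `P1` and `M1` and the simulated data are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(2)        # fixed seed, illustration only
n = 8
X1 = np.column_stack([np.ones(n), rng.normal(size=n)])   # made-up regressors
y = rng.normal(size=n)                                   # made-up dependent variable

# Assumed definitions (standard, but not spelled out on the slide):
P1 = X1 @ np.linalg.inv(X1.T @ X1) @ X1.T   # projection onto the columns of X1
M1 = np.eye(n) - P1                         # "residual maker" for X1

# Residuals from an explicit regression of y on X1, for comparison.
b1 = np.linalg.solve(X1.T @ X1, X1.T @ y)
resid = y - X1 @ b1

is_symmetric = np.allclose(M1, M1.T)
is_idempotent = np.allclose(M1 @ M1, M1)
makes_residuals = np.allclose(M1 @ y, resid)
annihilates_X1 = np.allclose(M1 @ X1, 0.0)
```

Forming the $n \times n$ matrix `M1` explicitly is only sensible for small $n$. In practice the residuals are computed from the regression directly.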
Partitioned regression
Frisch-Waugh theorem. The sub-vector $b_2$ is the set of coefficients obtained when the residuals from a regression of $y$ on $X_1$ are regressed on the set of residuals obtained when each column of $X_2$ is regressed on $X_1$.
The algorithm is as follows:
1. Regress $y$ on $X_1$ and compute the residuals $e_y$.
2. Regress each column of $X_2$ on $X_1$ and compute the residuals $e_{x_2}$.
3. Regress $e_y$ on $e_{x_2}$ to obtain the coefficients $b_2$ and the residuals $e$.
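The three steps of the algorithm can be sketched as follows and compared with the full regression of $y$ on $[X_1, X_2]$. The data are simulated with made-up coefficients, and the helper `ols` is a hypothetical name for solving the normal equations.

```python
import numpy as np

def ols(X, y):
    """Least squares coefficients; y may be a vector or a matrix of columns."""
    return np.linalg.solve(X.T @ X, X.T @ y)

rng = np.random.default_rng(3)        # fixed seed, illustration only
n = 200
X1 = np.column_stack([np.ones(n), rng.normal(size=n)])
# Make X2 correlated with X1 so that partialling out actually matters.
X2 = rng.normal(size=(n, 2)) + 0.5 * X1[:, [1]]
y = X1 @ np.array([1.0, -1.0]) + X2 @ np.array([2.0, 0.5]) + rng.normal(size=n)

# Full regression of y on [X1, X2]; the last two coefficients are b2.
X = np.column_stack([X1, X2])
b_full = ols(X, y)
b2_full = b_full[2:]
e_full = y - X @ b_full

# Step 1: residuals of y on X1.
e_y = y - X1 @ ols(X1, y)
# Step 2: residuals of each column of X2 on X1.
e_x2 = X2 - X1 @ ols(X1, X2)
# Step 3: regress e_y on e_x2.
b2_fw = ols(e_x2, e_y)
e_fw = e_y - e_x2 @ b2_fw
```

`b2_fw` coincides with `b2_full`, and the residuals from step 3 coincide with the residuals of the full regression.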
Goodness of fit and the analysis of variance
The total variation in $y$:
$TSS = \sum_{i=1}^{n} (y_i - \bar{y})^2$
In terms of the regression equation:
$y_i = x_i^T b + e_i$
$y_i - \bar{y} = \underbrace{(x_i - \bar{x})^T b}_{\text{regression part}} + \underbrace{e_i}_{\text{error part}}$
For the full set of observations:
Total sum of squares = regression sum of squares + error sum of squares
$TSS = RSS + ESS$
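Goodness of fit and the analysis of variance
Coefficient of determination:
$R^2 = \dfrac{RSS}{TSS} = 1 - \dfrac{ESS}{TSS}$
where $RSS$ is the regression sum of squares and $ESS$ is the error sum of squares.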
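A short numerical sketch of the variance decomposition and the coefficient of determination, following the naming used in these slides (RSS for the regression sum of squares, ESS for the error sum of squares; some textbooks swap these labels). The simulated data and coefficients are made up, and the regression includes a constant term.

```python
import numpy as np

rng = np.random.default_rng(4)        # fixed seed, illustration only
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # constant + 2 regressors
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)     # made-up coefficients

b = np.linalg.solve(X.T @ X, X.T @ y)
y_hat = X @ b
e = y - y_hat

TSS = np.sum((y - y.mean()) ** 2)       # total sum of squares
RSS = np.sum((y_hat - y.mean()) ** 2)   # regression sum of squares
ESS = np.sum(e ** 2)                    # error sum of squares

R2 = 1.0 - ESS / TSS                    # coefficient of determination
```

With a constant in the regression, `TSS` equals `RSS + ESS` and `R2` lies between 0 and 1. Without a constant the decomposition generally fails.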
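Goodness of fit and the analysis of variance
Source       Degrees of freedom                      Mean square
Regression   $K - 1$ (assuming a constant term)      $RSS / (K - 1)$
Residuals    $n - K$                                 $ESS / (n - K)$
Total        $n - 1$                                 $TSS / (n - 1)$
Coefficient of determination: $R^2 = 1 - \dfrac{e^T e}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$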
Statistical properties of the Least Squares Estimator in finite sample
Gauss-Markov Theorem. In the classical linear regression model $y = X\beta + \varepsilon$, the least squares estimator $b$ is the minimum variance linear unbiased estimator of $\beta$.
Statistical properties of the Least Squares Estimator in finite sample
If the disturbances $\varepsilon_i$ are normally distributed, $\varepsilon \mid X \sim N[0, \sigma^2 I]$, then
$b \mid X \sim N[\beta, \sigma^2 (X^T X)^{-1}]$
$b_k \mid X \sim N[\beta_k, \sigma^2 (X^T X)^{-1}_{kk}]$
Rao-Blackwell Theorem. In the classical linear regression model $y = X\beta + \varepsilon$ with normally distributed disturbances, the least squares estimator $b$ has the minimum variance of all unbiased estimators.
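The sampling distribution of $b$ can be illustrated with a small Monte Carlo experiment: hold $X$ fixed, redraw normal disturbances many times, and compare the simulated mean and covariance of $b$ with $\beta$ and $\sigma^2 (X^T X)^{-1}$. This is a sketch only. The design matrix, $\beta$, $\sigma$ and the number of replications are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(6)        # fixed seed, illustration only
n, K, sigma = 40, 3, 1.0              # hypothetical design
reps = 5000                           # number of simulated samples

X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])  # held fixed
beta = np.array([1.0, 2.0, -1.0])                               # made-up coefficients
XtX_inv = np.linalg.inv(X.T @ X)

# Each row of eps is one draw of the disturbance vector.
eps = rng.normal(scale=sigma, size=(reps, n))
Y = X @ beta + eps                    # shape (reps, n): one sample per row

# Row-wise least squares: b' = y' X (X'X)^{-1}, using symmetry of (X'X)^{-1}.
B = Y @ X @ XtX_inv                   # shape (reps, K)

mean_b = B.mean(axis=0)               # should be close to beta (unbiasedness)
emp_cov = np.cov(B, rowvar=False)     # simulated covariance matrix of b
theo_cov = sigma**2 * XtX_inv         # theoretical Var[b | X]
```

The agreement is approximate and improves as `reps` grows. The experiment illustrates the result and is not a proof of it.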
Statistical properties of the Least Squares Estimator in finite sample
If we wish to test hypotheses about $\beta$ or to construct confidence intervals, we need an estimate of the covariance matrix
$\mathrm{Var}[b] = \sigma^2 (X^T X)^{-1}$
Since $e_i$ is an estimate of $\varepsilon_i$, the estimator of $\sigma^2$ is
$s^2 = \dfrac{e^T e}{n - K}$
and
$\mathrm{Est.Var}[b] = s^2 (X^T X)^{-1}$
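A sketch of the estimated covariance matrix and the resulting standard errors, assuming the usual degrees-of-freedom correction $s^2 = e^T e / (n - K)$. The data are simulated and the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)        # fixed seed, illustration only
n, K = 120, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
y = X @ np.array([1.0, 0.5, -0.5]) + rng.normal(scale=2.0, size=n)  # made-up model

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y                 # least squares coefficients
e = y - X @ b                         # residuals

s2 = (e @ e) / (n - K)                # estimate of sigma^2
est_var_b = s2 * XtX_inv              # Est.Var[b] = s^2 (X'X)^{-1}
std_err = np.sqrt(np.diag(est_var_b)) # standard errors of b_1, ..., b_K
```

The square roots of the diagonal elements of `est_var_b` are the standard errors used in t-statistics and confidence intervals for the individual coefficients.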