
1 Classical Linear Regression Model: Normality Assumption, Hypothesis Testing Under Normality, Maximum Likelihood Estimator, Generalized Least Squares

2 Normality Assumption
Assumption 5: ε|X ~ N(0, σ²I_n)
Implications of the Normality Assumption:
– (b − β)|X ~ N(0, σ²(X′X)⁻¹)
– (b_k − β_k)|X ~ N(0, σ²[(X′X)⁻¹]_kk)
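A minimal simulation sketch of this implication (synthetic data; the sample size, design, and parameter values below are illustrative assumptions, not from the slides): with normal errors and a fixed design X, repeated draws of b − β have a covariance matrix close to σ²(X′X)⁻¹.

```python
import numpy as np

rng = np.random.default_rng(0)
n, K = 50, 2
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # fixed design, reused across draws
beta, sigma = np.array([1.0, 2.0]), 0.5

draws = []
for _ in range(5000):
    eps = rng.normal(scale=sigma, size=n)        # Assumption 5: eps | X ~ N(0, sigma^2 I)
    y = X @ beta + eps
    b = np.linalg.solve(X.T @ X, X.T @ y)        # OLS estimator
    draws.append(b - beta)
draws = np.asarray(draws)

# Sample covariance of (b - beta) should be close to sigma^2 (X'X)^{-1}
print(np.cov(draws.T))
print(sigma**2 * np.linalg.inv(X.T @ X))
```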

3 Hypothesis Testing under Normality
Implications of the Normality Assumption:
– Because ε_i/σ ~ N(0,1), the sum of squared residuals satisfies e′e/σ² = ε′Mε/σ² ~ χ²(n−K), where M = I − X(X′X)⁻¹X′ and trace(M) = n−K.

4 Hypothesis Testing under Normality
If σ² is not known, replace it with s² = e′e/(n−K).
The standard error of the OLS estimator b_k is SE(b_k) = √(s²[(X′X)⁻¹]_kk).
Suppose A.1-5 hold. Under H0: β_k = β̄_k (the hypothesized value), the t-statistic defined as t_k = (b_k − β̄_k)/SE(b_k) is distributed as t(n−K).
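A minimal sketch of this calculation from scratch (synthetic data; the hypothesized value β̄_k = 0 and all names below are illustrative assumptions): compute s², SE(b_k), and the t-statistic.

```python
import numpy as np

rng = np.random.default_rng(1)
n, K = 60, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
y = X @ np.array([1.0, 0.5, 0.0]) + rng.normal(scale=0.8, size=n)

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y
e = y - X @ b
s2 = e @ e / (n - K)                     # s^2 = e'e / (n - K)
se = np.sqrt(s2 * np.diag(XtX_inv))      # SE(b_k) = sqrt(s^2 [(X'X)^{-1}]_kk)

beta_k0 = 0.0                            # hypothesized value under H0
k = 2
t_k = (b[k] - beta_k0) / se[k]           # ~ t(n - K) under A.1-5 and H0
print(b[k], se[k], t_k)
```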

5 Hypothesis Testing under Normality
Proof sketch of t_k ~ t(n−K):
– Write t_k = (b_k − β̄_k)/SE(b_k) = [(b_k − β̄_k)/√(σ²[(X′X)⁻¹]_kk)] / √(s²/σ²).
– Under H0 the numerator is N(0,1), and (n−K)s²/σ² = e′e/σ² ~ χ²(n−K).
– The numerator and denominator are independent because b and e are independent given X, so the ratio is t(n−K).

6 Hypothesis Testing under Normality
Testing a hypothesis about an individual regression coefficient, H0: β_k = β̄_k
– If σ² is known, use z_k = (b_k − β̄_k)/√(σ²[(X′X)⁻¹]_kk) ~ N(0,1).
– If σ² is not known, use t_k = (b_k − β̄_k)/SE(b_k) ~ t(n−K).
Given a level of significance α, Prob(−t_{α/2}(n−K) < t < t_{α/2}(n−K)) = 1 − α.

7 Hypothesis Testing under Normality
– Confidence interval: [b_k − SE(b_k)·t_{α/2}(n−K), b_k + SE(b_k)·t_{α/2}(n−K)]
– p-value: p = Prob(t > |t_k|) × 2, since Prob(−|t_k| < t < |t_k|) = 1 − p
– Accept H0 if p > α (equivalently, if |t_k| < t_{α/2}(n−K)). Reject otherwise.
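A minimal sketch of the confidence interval and two-sided p-value (scipy assumed available; the numeric values of b_k, SE(b_k), and n−K below are illustrative placeholders, not results from the slides).

```python
from scipy import stats

# Suppose b_k, se_k, and the degrees of freedom n - K have been computed by OLS.
b_k, se_k, dof = 0.47, 0.12, 57          # illustrative numbers
alpha = 0.05

t_crit = stats.t.ppf(1 - alpha / 2, dof)                 # t_{alpha/2}(n - K)
ci = (b_k - se_k * t_crit, b_k + se_k * t_crit)          # confidence interval

t_k = b_k / se_k                                         # t-ratio for H0: beta_k = 0
p_value = 2 * (1 - stats.t.cdf(abs(t_k), dof))           # p = 2 * Prob(t > |t_k|)
reject = p_value < alpha                                 # reject H0 if p < alpha
print(ci, p_value, reject)
```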

8 Hypothesis Testing under Normality
Linear hypotheses H0: Rβ = q

9 Hypothesis Testing under Normality
Let m = Rb − q, where b is the unrestricted least squares estimator of β.
– E(m|X) = E(Rb − q|X) = Rβ − q = 0 under H0
– Var(m|X) = Var(Rb − q|X) = R·Var(b|X)·R′ = σ²R(X′X)⁻¹R′
Wald Principle: W = m′[Var(m|X)]⁻¹m = (Rb − q)′[σ²R(X′X)⁻¹R′]⁻¹(Rb − q) ~ χ²(J), where J is the number of restrictions.
Define F = (W/J)/(s²/σ²) = (Rb − q)′[s²R(X′X)⁻¹R′]⁻¹(Rb − q)/J.

10 Hypothesis Testing under Normality
Suppose A.1-5 hold. Under H0: Rβ = q, where R is J×K with rank(R) = J, the F-statistic defined as
F = (Rb − q)′[s²R(X′X)⁻¹R′]⁻¹(Rb − q)/J
is distributed as F(J, n−K), the F distribution with J and n−K degrees of freedom.
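A minimal sketch of this F-test in its Wald form (synthetic data; the particular R and q below, encoding the illustrative restrictions β_2 = β_3 and β_4 = 0, are assumptions for the example).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, K = 80, 4
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
y = X @ np.array([1.0, 0.5, 0.5, 0.0]) + rng.normal(size=n)

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y
e = y - X @ b
s2 = e @ e / (n - K)

# H0: beta_2 = beta_3 and beta_4 = 0  (J = 2 restrictions)
R = np.array([[0., 1., -1., 0.],
              [0., 0., 0., 1.]])
q = np.zeros(2)
J = R.shape[0]

m = R @ b - q
F = m @ np.linalg.solve(s2 * R @ XtX_inv @ R.T, m) / J   # Wald form of the F-statistic
p_value = 1 - stats.f.cdf(F, J, n - K)
print(F, p_value)
```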

11 Discussions
Residuals: e = y − Xb ~ N(0, σ²M) if σ² is known, where M = I − X(X′X)⁻¹X′.
If σ² is unknown and estimated by s², the standardized residuals e_i/(s√m_ii) no longer have an exact normal distribution.

12 Discussions
Wald Principle vs. Likelihood Principle: By comparing the restricted (R) and unrestricted (UR) least squares fits, the F-statistic can be shown to equal
F = [(SSR_R − SSR_UR)/J] / [SSR_UR/(n−K)].
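A minimal sketch of this restricted-vs-unrestricted computation (synthetic data; the single illustrative restriction is H0: β_3 = 0, so the restricted fit simply drops that regressor).

```python
import numpy as np

def ssr(X, y):
    """Sum of squared residuals from an OLS fit of y on X."""
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    e = y - X @ b
    return e @ e

rng = np.random.default_rng(3)
n, K = 80, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
y = X @ np.array([1.0, 0.5, 0.2]) + rng.normal(size=n)

ssr_ur = ssr(X, y)                 # unrestricted: all K regressors
ssr_r = ssr(X[:, :2], y)           # restricted: column for beta_3 dropped
J = 1
F = ((ssr_r - ssr_ur) / J) / (ssr_ur / (n - K))
print(F)                           # equals the squared t-ratio on beta_3
```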

13 Discussions
Testing R² = 0: Equivalently, H0: β_2 = β_3 = … = β_K = 0, written as Rβ = q with J = K−1 restrictions, where β_1 is the unrestricted constant term.
The F-statistic F = [R²/(K−1)] / [(1−R²)/(n−K)] follows F(K−1, n−K).

14 Discussions
Testing β_k = 0: Equivalently, H0: Rβ = q, where R is the k-th unit row vector and q = 0.
– F-statistic: F(1, n−K) = b_k([Est Var(b)]_kk)⁻¹b_k, where Est Var(b) = s²(X′X)⁻¹
– t-ratio: t(n−K) = b_k/SE(b_k)

15 Discussions
t vs. F:
– t²(n−K) = F(1, n−K) under H0: Rβ = q when J = 1
– For J > 1, the F test is preferred to multiple t tests.
Durbin-Watson test statistic for time series models:
– The conditional distribution, and hence the critical values, of DW depend on X…

16 Maximum Likelihood
Assumptions 1, 2, 4, and 5 imply y|X ~ N(Xβ, σ²I_n).
The conditional density, or likelihood, of y given X is
f(y|X) = (2πσ²)^(−n/2) exp[−(y − Xβ)′(y − Xβ)/(2σ²)].

17 Maximum Likelihood
Likelihood function: L(β, σ²) = f(y|X; β, σ²)
Log likelihood function: log L(β, σ²) = −(n/2) log(2πσ²) − (y − Xβ)′(y − Xβ)/(2σ²)

18 Maximum Likelihood
ML estimator of (β, σ²) = argmax_(β,γ) log L(β, γ), where we set γ = σ².

19 Maximum Likelihood
Suppose Assumptions 1-5 hold. Then the ML estimator of β is the OLS estimator b, and the ML estimator of γ or σ² is e′e/n = SSR/n.
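A minimal numerical check of this result (scipy assumed available; synthetic data): maximizing the normal log likelihood over (β, γ) reproduces the OLS estimator b and the biased variance estimate e′e/n.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(4)
n, K = 100, 2
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, -0.7]) + rng.normal(scale=0.5, size=n)

def neg_loglik(theta):
    beta, log_gamma = theta[:K], theta[K]          # parameterize gamma = exp(log_gamma) > 0
    gamma = np.exp(log_gamma)
    resid = y - X @ beta
    return 0.5 * (n * np.log(2 * np.pi * gamma) + resid @ resid / gamma)

res = minimize(neg_loglik, x0=np.zeros(K + 1), method="BFGS")
beta_ml, sigma2_ml = res.x[:K], np.exp(res.x[K])

b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b
print(beta_ml, b)                  # ML beta matches OLS b (up to optimizer tolerance)
print(sigma2_ml, e @ e / n)        # ML variance matches e'e/n, not e'e/(n-K)
```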

20 Maximum Likelihood
Maximum Likelihood Principle
– Let θ = (β, γ)
– Score: s(θ) = ∂ log L(θ)/∂θ
– Information matrix: I(θ) = E(s(θ)s(θ)′|X)
– Information matrix equality: I(θ) = −E(∂² log L(θ)/∂θ∂θ′|X)

21 Maximum Likelihood
Maximum Likelihood Principle
– Cramer-Rao bound: I(θ)⁻¹. That is, for an unbiased estimator θ̂ of θ with a finite variance-covariance matrix, Var(θ̂|X) ≥ I(θ)⁻¹.

22 Maximum Likelihood
Under Assumptions 1-5, the ML or OLS estimator b of β with variance σ²(X′X)⁻¹ attains the Cramer-Rao bound.
The ML estimator of σ² is biased, so the Cramer-Rao bound does not apply.
The OLS estimator of σ², s² = e′e/(n−K) with E(s²|X) = σ² and Var(s²|X) = 2σ⁴/(n−K), does not attain the Cramer-Rao bound 2σ⁴/n.

23 Discussions
Concentrated log likelihood function: substituting γ = SSR(β)/n gives
log L_c(β) = −(n/2)[1 + log(2π) + log(SSR(β)/n)].
Therefore, argmax_β log L_c(β) = argmin_β SSR(β).

24 Discussions
Hypothesis testing H0: Rβ = q
– Likelihood ratio test: LR = 2(log L_UR − log L_R) = n log(SSR_R/SSR_UR)
– F test as a likelihood ratio test: F = [(n−K)/J][SSR_R/SSR_UR − 1] = [(n−K)/J][exp(LR/n) − 1], a monotone transformation of LR

25 Discussions
Quasi-Maximum Likelihood
– Without normality (Assumption 5), there is no guarantee that the ML estimator of β is OLS or that the OLS estimator b achieves the Cramer-Rao bound.
– However, b is a quasi- (or pseudo-) maximum likelihood estimator, an estimator that maximizes a misspecified (normal) likelihood function.

26 Generalized Least Squares
Assumption 4 revisited: E(εε′|X) = Var(ε|X) = σ²I_n
Assumption 4 relaxed (Assumption 4′): E(εε′|X) = Var(ε|X) = σ²V(X), with V(X) nonsingular and known
– The OLS estimator of β, b = (X′X)⁻¹X′y, is not efficient, although it is still unbiased.
– The t-test and F-test are no longer valid.

27 Generalized Least Squares
Since V = V(X) is known, write V⁻¹ = C′C.
Let y* = Cy, X* = CX, ε* = Cε, so that y = Xβ + ε becomes y* = X*β + ε*.
– Checking A.2: E(ε*|X*) = E(ε*|X) = 0
– Checking A.4: E(ε*ε*′|X*) = E(ε*ε*′|X) = σ²CVC′ = σ²I_n
GLS: OLS applied to the transformed model y* = X*β + ε*

28 Generalized Least Squares
b_GLS = (X*′X*)⁻¹X*′y* = (X′V⁻¹X)⁻¹X′V⁻¹y
Var(b_GLS|X) = σ²(X*′X*)⁻¹ = σ²(X′V⁻¹X)⁻¹
If V = V(X) = Var(ε|X)/σ² is known,
– b_GLS = (X′[Var(ε|X)]⁻¹X)⁻¹X′[Var(ε|X)]⁻¹y
– Var(b_GLS|X) = (X′[Var(ε|X)]⁻¹X)⁻¹
– The GLS estimator b_GLS of β is BLUE.
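A minimal sketch of GLS as OLS on transformed data, using a Cholesky factor of V⁻¹ as the transformation matrix C (so that C′C = V⁻¹). The heteroskedastic V below is an illustrative assumption, treated as known.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 60
X = np.column_stack([np.ones(n), rng.normal(size=n)])
v = np.exp(X[:, 1])                        # assumed known pattern: Var(eps_i|X) = sigma^2 * v_i
V = np.diag(v)
eps = rng.normal(size=n) * np.sqrt(v)
y = X @ np.array([1.0, 2.0]) + eps

L = np.linalg.cholesky(np.linalg.inv(V))   # V^{-1} = L L'
C = L.T                                    # so C'C = V^{-1}
y_star, X_star = C @ y, C @ X              # transformed model y* = X* beta + eps*

b_gls = np.linalg.solve(X_star.T @ X_star, X_star.T @ y_star)
# Equivalent closed form: (X'V^{-1}X)^{-1} X'V^{-1} y
V_inv = np.linalg.inv(V)
b_gls_direct = np.linalg.solve(X.T @ V_inv @ X, X.T @ V_inv @ y)
print(b_gls, b_gls_direct)
```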

29 Generalized Least Squares
Under Assumptions 1-3, E(b_GLS|X) = β.
Under Assumptions 1-3 and 4′, Var(b_GLS|X) = σ²(X′V(X)⁻¹X)⁻¹.
Under Assumptions 1-3 and 4′, the GLS estimator is efficient in that the conditional variance of any unbiased estimator that is linear in y is greater than or equal to Var(b_GLS|X).

30 Discussions
Weighted Least Squares (WLS)
– Assumption 4″: V(X) is a diagonal matrix, i.e., E(ε_i²|X) = Var(ε_i|X) = σ²v_i(X)
– Then each observation is divided by √v_i(X): y_i* = y_i/√v_i(X) and x_i* = x_i/√v_i(X), as sketched below.
– WLS is a special case of GLS.
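A minimal WLS sketch for this diagonal case (synthetic data; the variance pattern v_i(X) = x_i² is an illustrative assumption treated as known).

```python
import numpy as np

rng = np.random.default_rng(6)
n = 60
X = np.column_stack([np.ones(n), rng.uniform(1, 5, size=n)])
v = X[:, 1] ** 2                                  # assumed known: Var(eps_i|X) = sigma^2 * x_i^2
y = X @ np.array([1.0, 0.5]) + rng.normal(size=n) * np.sqrt(v)

w = 1 / np.sqrt(v)                                # divide each observation by sqrt(v_i)
X_star, y_star = X * w[:, None], y * w
b_wls = np.linalg.solve(X_star.T @ X_star, X_star.T @ y_star)   # OLS on transformed data
print(b_wls)
```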

31 Discussions
If V = V(X) is not known, we can estimate its functional form from the sample. This approach is called Feasible GLS.
Because V then becomes a random variable, very little is known about the distribution and finite-sample properties of the resulting feasible GLS estimator.

32 Example
Cobb-Douglas cost function for electricity generation (Christensen and Greene [1976])
Data: Greene's Table F4.3
– Id = Observation, 123 + 35 holding companies
– Year = 1970 for all observations
– Cost = Total cost
– Q = Total output
– Pl = Wage rate
– Sl = Cost share for labor
– Pk = Capital price index
– Sk = Cost share for capital
– Pf = Fuel price
– Sf = Cost share for fuel

33 Example
Cobb-Douglas cost function for electricity generation (Christensen and Greene [1976])
– ln(Cost) = β1 + β2 ln(PL) + β3 ln(PK) + β4 ln(PF) + β5 ln(Q) + ½β6 ln(Q)² + β7 ln(Q)·ln(PL) + β8 ln(Q)·ln(PK) + β9 ln(Q)·ln(PF) + ε
– Linear homogeneity in prices: β2 + β3 + β4 = 1, β7 + β8 + β9 = 0
– Imposing the restrictions: ln(Cost/PF) = β1 + β2 ln(PL/PF) + β3 ln(PK/PF) + β5 ln(Q) + ½β6 ln(Q)² + β7 ln(Q)·ln(PL/PF) + β8 ln(Q)·ln(PK/PF) + ε
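A hedged sketch of estimating this specification and F-testing the homogeneity restrictions with statsmodels. The file name "TableF4-3.csv" and the column names Cost, Q, PL, PK, PF are assumptions about how Greene's Table F4.3 is stored; adjust them to match the actual file.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("TableF4-3.csv")          # assumed file name and layout

d = pd.DataFrame({
    "lnCost": np.log(df["Cost"]),
    "lnPL": np.log(df["PL"]), "lnPK": np.log(df["PK"]), "lnPF": np.log(df["PF"]),
    "lnQ": np.log(df["Q"]),
})
d["lnQ2_half"] = 0.5 * d["lnQ"] ** 2
d["lnQ_lnPL"] = d["lnQ"] * d["lnPL"]
d["lnQ_lnPK"] = d["lnQ"] * d["lnPK"]
d["lnQ_lnPF"] = d["lnQ"] * d["lnPF"]

unrestricted = smf.ols(
    "lnCost ~ lnPL + lnPK + lnPF + lnQ + lnQ2_half + lnQ_lnPL + lnQ_lnPK + lnQ_lnPF",
    data=d).fit()

# F-test of linear homogeneity in prices (J = 2 restrictions)
print(unrestricted.f_test("lnPL + lnPK + lnPF = 1, lnQ_lnPL + lnQ_lnPK + lnQ_lnPF = 0"))
```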

