
1 Econometrics I
Professor William Greene, Stern School of Business, Department of Economics

2 Econometrics I, Part 11 – Asymptotic Distribution Theory

3 Received October 6, 2012
Dear Prof. Greene, I am AAAAAA, an assistant professor of Finance at the xxxxx university of xxxxx, xxxxx. I would be grateful if you could answer my question regarding the parameter estimates and the marginal effects in the Multinomial Logit (MNL). After running my estimations, the parameter estimate of my variable of interest is statistically significant, but its marginal effect, evaluated at the mean of the explanatory variables, is not. Can I just rely on the parameter estimates' results to say that the variable of interest is statistically significant? How can I reconcile the parameter estimates and the marginal effects' results? Thank you very much in advance! Best, AAAAAA

4 Asymptotics: Setting
Most modeling situations involve stochastic regressors, nonlinear models, or nonlinear estimation techniques. Very few exact statistical results, such as the exact expected value or the exact sampling distribution, can be obtained in these cases. We rely, instead, on approximate results based on what we know about the behavior of certain statistics in large samples. Example from basic statistics: what can we say about 1/x̄? We know a lot about x̄. What do we know about its reciprocal?

5 Convergence
Definitions: kinds of convergence as n grows large.
1. To a constant; example: the sample mean converges to the population mean.
2. To a random variable; example: a t statistic with n - 1 degrees of freedom converges to a standard normal random variable.

6 Convergence to a Constant
Sequences and limits. Sequence of constants, indexed by n. Ordinary limit:
lim(n to infinity) [n(n+1)/2 + 3n + 5] / [(1/2)(n² + 2n + 1)] = ?
(Use the leading terms: the numerator and denominator both behave like n²/2, so the limit is 1.)
Convergence of a random variable: what does it mean for a random variable to converge to a constant? Convergence of the variance to zero. The random variable converges to something that is not random.

7 Convergence Results
Convergence of a sequence of random variables to a constant: convergence in mean square. The mean converges to a constant and the variance converges to zero. (Far from the most general condition, but definitely sufficient for our purposes.)
A convergence theorem for sample moments: sample moments converge in probability to their population counterparts. This is the general form of the Law of Large Numbers. (There are many forms; see Appendix D in your text. This is the "weak" law of large numbers.)
Note the great generality of the preceding result: (1/n) Σ_i g(z_i) converges to E[g(z_i)].
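As a quick check of that last line, a minimal Python sketch (the choice g(z) = z² with z distributed Exponential(1), so that E[g(z)] = Var[z] + (E[z])² = 2, is purely illustrative and not from the deck):

import numpy as np

rng = np.random.default_rng(seed=42)

# Sample moment (1/n) sum_i g(z_i) should approach E[g(z_i)] as n grows.
# Here g(z) = z^2 and z ~ Exponential(1), so E[g(z)] = 2.
for n in (10, 100, 10_000, 1_000_000):
    z = rng.exponential(scale=1.0, size=n)
    print(n, np.mean(z**2))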

8 Extending the Law of Large Numbers

9 Probability Limit

10 Mean Square Convergence

11 Probability Limits and Expectations
What is the difference between E[x_n] and plim x_n?

12 Consistency of an Estimator
If the random variable in question, x_n, is an estimator (such as the sample mean), and if plim x_n = θ, then x_n is a consistent estimator of θ.
Estimators can be inconsistent for two reasons: (1) they are consistent for something other than the quantity that interests us; (2) they do not converge to constants, so they are not consistent estimators of anything. We will study examples of both.

13 The Slutsky Theorem
Assumptions: x_n is a random variable such that plim x_n = θ (for now, we assume θ is a constant); g(·) is a continuous function with continuous derivatives; g(·) is not a function of n.
Conclusion: plim[g(x_n)] = g[plim(x_n)], assuming g[plim(x_n)] exists. (VVIR: a very, very important result!) This works for probability limits. It does not work for expectations.
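A minimal simulation makes the distinction concrete. With g(x) = 1/x and x̄ the mean of Exponential draws with mean μ = 2 (choices assumed here purely for illustration), 1/x̄ converges to 1/μ = 0.5 even though E[1/x̄] exceeds 1/E[x̄] in small samples:

import numpy as np

rng = np.random.default_rng(seed=0)

# Slutsky: plim 1/xbar = 1/plim xbar = 1/mu, although E[1/xbar] != 1/E[xbar].
mu = 2.0
for n in (5, 50, 500):
    xbar = rng.exponential(scale=mu, size=(20_000, n)).mean(axis=1)
    print(n, np.mean(1.0 / xbar))   # approaches 1/mu = 0.5 as n grows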

14 Slutsky Corollaries

15 Slutsky Results for Matrices
Functions of matrices are continuous functions of the elements of the matrices. Therefore, if plim A_n = A and plim B_n = B (element by element), then plim(A_n⁻¹) = [plim A_n]⁻¹ = A⁻¹ (assuming A is nonsingular) and plim(A_n B_n) = plim A_n · plim B_n = AB.

16 Limiting Distributions
Convergence to a kind of random variable instead of to a constant. x_n is a random sequence with cdf F_n(x_n). If plim x_n = θ (a constant), then F_n(x_n) collapses to a point. But x_n may instead converge to a specific random variable. The distribution of that random variable is the limiting distribution of x_n, denoted x_n →d x.

17 Limiting Distribution

18 A Slutsky Theorem for Random Variables (Continuous Mapping)

19 An Extension of the Slutsky Theorem

20 Application of the Slutsky Theorem

21 Central Limit Theorems
Central limit theorems describe the large-sample behavior of random variables that involve sums of variables. "Tendency toward normality." Generality: when you find sums of random variables, the CLT shows up eventually. Note that the CLT does not state that means of samples have normal distributions; it concerns the limiting distribution of standardized sums.
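A short simulation of the "tendency toward normality" (the Exponential distribution and the sample sizes are illustrative choices, not from the slides): standardized means of badly skewed data reproduce normal tail probabilities as n grows.

import numpy as np

rng = np.random.default_rng(seed=1)

# sqrt(n)*(xbar - mu)/sigma for skewed Exponential(1) data (mu = sigma = 1)
# approaches N(0,1); P[Z > 1.96] should approach 0.025.
for n in (2, 10, 100):
    xbar = rng.exponential(scale=1.0, size=(100_000, n)).mean(axis=1)
    z = np.sqrt(n) * (xbar - 1.0) / 1.0
    print(n, np.mean(z > 1.96))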

22 A Central Limit Theorem

23 Lindeberg-Levy vs. Lindeberg-Feller
Lindeberg-Levy assumes random sampling: observations have the same mean and the same variance. Lindeberg-Feller allows variances to differ across observations, with some necessary assumptions about how they vary. Most econometric estimators require Lindeberg-Feller (and extensions such as Lyapunov).

24 Order of a Sequence
'Little oh' o(·): a sequence h_n is o(n^δ) (order less than n^δ) iff n^(-δ) h_n → 0. Example: h_n = n^1.4 is o(n^1.5), since n^(-1.5) h_n = 1/n^0.1 → 0.
'Big oh' O(·): a sequence h_n is O(n^δ) iff n^(-δ) h_n → a finite nonzero constant. Example 1: h_n = n² + 2n + 1 is O(n²). Example 2: Σ_i x_i² is usually O(n¹), since this is n times the mean of x_i², and the mean of x_i² generally converges to E[x_i²], a finite constant.
What if the sequence is a random variable? Then the order is in terms of the variance. Example: what is the order of x̄ in random sampling? Var[x̄] = σ²/n, which is O(1/n).

25 Cornwell and Rupert Panel Data
Cornwell and Rupert returns-to-schooling data: 595 individuals, 7 years. Variables in the file:
EXP = work experience
WKS = weeks worked
OCC = 1 if blue-collar occupation
IND = 1 if manufacturing industry
SOUTH = 1 if resides in the South
SMSA = 1 if resides in a city (SMSA)
MS = 1 if married
FEM = 1 if female
UNION = 1 if wage set by union contract
ED = years of education
LWAGE = log of wage (the dependent variable in the regressions)
These data were analyzed in Cornwell, C. and Rupert, P., "Efficient Estimation with Panel Data: An Empirical Comparison of Instrumental Variable Estimators," Journal of Applied Econometrics, 3, 1988, pp. 149-155.


28 Histogram for LWAGE

29 The kernel density estimator is a histogram (of sorts).

30 Kernel Density Estimator
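The formula on this slide appears in the deck as an image. The standard Rosenblatt-Parzen form is f̂(x*) = (1/n) Σ_i (1/h) K[(x* - x_i)/h] for a kernel K and bandwidth h. A minimal Python sketch, assuming a Gaussian kernel and the normal reference-rule bandwidth (common defaults, not necessarily the ones used in the deck), with simulated stand-in data in place of LWAGE:

import numpy as np

def kernel_density(x_star, x, h=None):
    # f_hat(x*) = (1/(n*h)) * sum_i K((x* - x_i)/h), Gaussian kernel K
    n = len(x)
    if h is None:
        h = 1.06 * np.std(x) * n ** (-0.2)   # normal reference rule
    u = (x_star[:, None] - x[None, :]) / h
    k = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)
    return k.mean(axis=1) / h

# Stand-in data (hypothetical); the deck applies this to LWAGE.
x = np.random.default_rng(2).normal(loc=6.7, scale=0.5, size=4165)
grid = np.linspace(x.min(), x.max(), 100)
f_hat = kernel_density(grid, x)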

31 Kernel Estimator for LWAGE

32 Asymptotic Distribution
An asymptotic distribution is an approximation to the true finite-sample distribution of a random variable that is good for large samples, but not necessarily for small samples.
Stabilizing transformation to obtain a limiting distribution: multiply the random variable x_n by some power, a, of n such that the limiting distribution of n^a x_n has a finite, nonzero variance. Example: x̄ has a limiting variance of zero, since its variance is σ²/n. But the variance of √n x̄ is σ². However, this alone does not stabilize the distribution, because E[√n x̄] = √n μ, which diverges. The stabilizing transformation is √n(x̄ - μ).

33 Asymptotic Distribution
Obtaining an asymptotic distribution from a limiting distribution:
Obtain the limiting distribution via a stabilizing transformation.
Assume the limiting distribution applies reasonably well in finite samples.
Invert the stabilizing transformation to obtain the asymptotic distribution.
This is the sense in which we speak of asymptotic normality of a distribution.

34 Asymptotic Efficiency
Comparison of asymptotic variances. How do we compare consistent estimators? If both converge to constants, both variances go to zero.
Example: random sampling from the normal distribution. The sample mean is asymptotically normal[μ, σ²/n]; the sample median is asymptotically normal[μ, (π/2)σ²/n]. The mean is asymptotically more efficient.
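The π/2 factor is easy to check by simulation; a minimal sketch (the sample size and replication count are arbitrary choices):

import numpy as np

rng = np.random.default_rng(seed=3)

# For normal samples, Var[median]/Var[mean] should be near pi/2 = 1.5708.
x = rng.normal(size=(50_000, 101))
print(np.var(np.median(x, axis=1)) / np.var(np.mean(x, axis=1)))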

35 Limiting Distribution: Testing for Normality

36 Extending the Law of Large Numbers

37 Elements of a Proof for Bowman-Shenton

38 Bera and Jarque

39 Expectation vs. Probability Limit

40 The Delta Method
The delta method (combines most of these concepts). Nonlinear transformation of a random variable: f(x_n) such that plim x_n = θ but √n(x_n - θ) is asymptotically normally distributed. What is the asymptotic behavior of f(x_n)?
Taylor series approximation: f(x_n) ≈ f(θ) + f′(θ)(x_n - θ).
By the Slutsky theorem, plim f(x_n) = f(θ), and √n[f(x_n) - f(θ)] ≈ f′(θ)[√n(x_n - θ)].
The large-sample behaviors of the LHS and RHS are the same (generally; this requires f(·) to be nicely behaved). The RHS is a constant times something familiar: the large-sample variance is [f′(θ)]² times the large-sample Var[√n(x_n - θ)]. Return to the asymptotic variance of x_n. This gives us the VIR (very important result) for the asymptotic distribution of a function.
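A quick numerical check of the result (the function f(x) = exp(x) and the normal sampling design are illustrative choices): the simulated variance of f(x̄) should match [f′(μ)]²σ²/n.

import numpy as np

rng = np.random.default_rng(seed=4)

# Delta method: Var[f(xbar)] ~ [f'(mu)]^2 * sigma^2 / n for f(x) = exp(x).
mu, sigma, n = 1.0, 0.5, 200
xbar = rng.normal(mu, sigma, size=(50_000, n)).mean(axis=1)
print("simulated   :", np.var(np.exp(xbar)))
print("delta method:", np.exp(2.0 * mu) * sigma**2 / n)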

41 Delta Method

42 Delta Method – Applications

43 Delta Method Application

44 Delta Method

45 Midterm Question

46 Confidence Intervals?

47 Krinsky and Robb vs. the Delta Method

48 Krinsky and Robb

49 Krinsky and Robb
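These slides develop the method graphically. As a hedged sketch of the general Krinsky and Robb procedure: draw many coefficient vectors from the estimated asymptotic normal distribution, evaluate the nonlinear function at each draw, and use the standard deviation of the simulated values in place of the delta-method standard error. The point estimates below anticipate the age-income application that follows; the off-diagonal covariance is a hypothetical value chosen for illustration, since the slides do not print it.

import numpy as np

rng = np.random.default_rng(seed=5)

# Point estimates for (AGE, AGESQ) from the regression on slide 51;
# the covariance term -5.1e-8 is hypothetical, for illustration only.
b = np.array([0.06225, -0.00074])
V = np.array([[0.00213**2, -5.1e-8],
              [-5.1e-8,    2.42482e-5**2]])

draws = rng.multivariate_normal(b, V, size=10_000)
astar = -draws[:, 0] / (2.0 * draws[:, 1])   # age at the income maximum
print(astar.mean(), astar.std())             # compare with the delta method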

50 Delta Method – More than One Parameter

51 Log Income Equation
----------------------------------------------------------------------
Ordinary least squares regression
LHS = LOGY    Mean                = -1.15746
              Standard deviation  =   .49149
              Number of observs.  =  27322
Model size    Parameters          =      7
              Degrees of freedom  =  27315
Residuals     Sum of squares      = 5462.03686
              Standard error of e =   .44717
Fit           R-squared           =   .17237
--------+-------------------------------------------------------------
Variable| Coefficient  Standard Error  b/St.Er.  P[|Z|>z]  Mean of X
--------+-------------------------------------------------------------
AGE     |   .06225***     .00213         29.189    .0000    43.5272
AGESQ   |  -.00074***     .242482D-04   -30.576    .0000    2022.99
Constant| -3.19130***     .04567        -69.884    .0000
MARRIED |   .32153***     .00703         45.767    .0000     .75869
HHKIDS  |  -.11134***     .00655        -17.002    .0000     .40272
FEMALE  |  -.00491        .00552          -.889    .3739     .47881
EDUC    |   .05542***     .00120         46.050    .0000    11.3202
--------+-------------------------------------------------------------
(The slide also highlights the estimated Cov[b1,b2] between the AGE and AGESQ coefficients.)

52 Age-Income Profile: Married=1, Kids=1, Educ=12, Female=1

53 Application: Maximum of a Function
AGE  |   .06225***   .00213        29.189   .0000   43.5272
AGESQ|  -.00074***   .242482D-04  -30.576   .0000   2022.99
The estimated quadratic age profile reaches its maximum where the derivative with respect to AGE is zero, at AGE* = -b_AGE / (2 b_AGESQ).

54 Delta Method Using Visible Digits
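The slide's worked computation appears as an image; the following sketch redoes the same calculation from the printed digits (again with the unseen AGE/AGESQ covariance entered as a hypothetical value):

import numpy as np

b1, b2 = 0.06225, -0.00074       # visible digits for AGE and AGESQ
astar = -b1 / (2.0 * b2)         # maximizing age, about 42.06

g = np.array([-1.0 / (2.0 * b2),     # d astar / d b1
              b1 / (2.0 * b2**2)])   # d astar / d b2

# Covariance matrix of (b1, b2); the off-diagonal entry is hypothetical.
V = np.array([[0.00213**2, -5.1e-8],
              [-5.1e-8,    2.42482e-5**2]])

se = np.sqrt(g @ V @ g)          # delta-method standard error
print(astar, se)

Compare with the next slide, which reports AGESTAR = 41.9809 with standard error .19193 using all internal digits of the regression results.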

55 Delta Method Results
-----------------------------------------------------------
WALD procedure.
--------+--------------------------------------------------
Variable| Coefficient  Standard Error  b/St.Er.  P[|Z|>z]
--------+--------------------------------------------------
G1      |  674.399***    22.05686       30.575    .0000
G2      |  56623.8***    1797.294       31.505    .0000
AGESTAR |  41.9809***      .19193      218.727    .0000
--------+--------------------------------------------------
(Computed using all internal digits of the regression results.)


57 Krinsky and Robb
Descriptive Statistics
============================================================
Variable |   Mean      Std.Dev.   Minimum    Maximum
---------+--------------------------------------------------
ASTAR    |  42.0607    .191563    41.2226    42.7720

58 More than One Function and More than One Coefficient

59 Application: Partial Effects
(The deck re-displays the Multinomial Logit e-mail from slide 3: the parameter estimate is statistically significant, but the marginal effect evaluated at the means is not. How can the two results be reconciled?)

60 Application: Doctor Visits
German individual health care data: N = 27,236.
Model for the number of visits to the doctor:
True E[V|Income] = exp(1.412 - .0745*income)
Linear regression: g*(Income) = 3.917 - .208*income
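In the exponential model the partial effect varies with income: dE[V|Income]/dIncome = -.0745 exp(1.412 - .0745*income). The linear slope, -.208, is a single-number summary of that varying effect. A small sketch (the income values in the grid are hypothetical):

import numpy as np

def partial_effect(income):
    # d E[V|income] / d income in the exponential model
    return -0.0745 * np.exp(1.412 - 0.0745 * income)

for inc in (0.5, 1.0, 3.0):     # hypothetical income values
    print(inc, partial_effect(inc))
# Compare with the single linear-regression slope, -.208.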

61 A Nonlinear Model

62 Interesting Partial Effects

63 Necessary Derivatives (Jacobian)


66 Partial Effects at Means vs. Mean of Partial Effects

67 Partial Effect for a Dummy Variable?

68 Application: CES Function

69 Application: CES Function

70 Application: CES Function
Using Spanish dairy farm data:
Create  ; x1 = 1 ; x2 = logcows ; x3 = logfeed
        ; x4 = -.5*(logcows-logfeed)^2 $
Regress ; lhs = logmilk ; rh1 = x1,x2,x3,x4 $
Calc    ; b1 = b(1) ; b2 = b(2) ; b3 = b(3) ; b4 = b(4) $
Calc    ; gamma = exp(b1) ; delta = b2/(b2+b3)
        ; nu = b2+b3 ; rho = b4*(b2+b3)/(b2*b3) $
Calc    ; g11 = exp(b1) ; g12 = 0 ; g13 = 0 ; g14 = 0
        ; g21 = 0 ; g22 = b3/(b2+b3)^2 ; g23 = -b2/(b2+b3)^2 ; g24 = 0
        ; g31 = 0 ; g32 = 1 ; g33 = 1 ; g34 = 0
        ; g41 = 0 ; g42 = -b3*b4/(b2^2*b3) ; g43 = -b2*b4/(b2*b3^2)
        ; g44 = (b2+b3)/(b2*b3) $
Matrix  ; g = [g11,g12,g13,g14 / g21,g22,g23,g24 /
               g31,g32,g33,g34 / g41,g42,g43,g44] $
Matrix  ; VDelta = G*VARB*G' $
Matrix  ; theta = [gamma/delta/nu/rho] ; Stat(theta,vdelta) $
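For readers working outside NLOGIT, here is a numpy translation of the same delta-method step; the coefficient vector b and covariance matrix VARB below are placeholders, to be replaced by the actual regression results.

import numpy as np

# Placeholders; take these from the estimated regression in practice.
b = np.array([0.10, 0.60, 0.40, -0.05])       # b1..b4 (hypothetical)
VARB = np.diag([1e-4, 1e-4, 1e-4, 1e-4])      # coefficient covariance

b1, b2, b3, b4 = b
theta = np.array([np.exp(b1),                     # gamma
                  b2 / (b2 + b3),                 # delta
                  b2 + b3,                        # nu
                  b4 * (b2 + b3) / (b2 * b3)])    # rho

# Jacobian G = d theta / d b, same entries as the Calc block above
# (g42 and g43 simplified: -b3*b4/(b2^2*b3) = -b4/b2^2, etc.).
G = np.array([[np.exp(b1), 0.0, 0.0, 0.0],
              [0.0,  b3 / (b2 + b3)**2, -b2 / (b2 + b3)**2, 0.0],
              [0.0,  1.0, 1.0, 0.0],
              [0.0, -b4 / b2**2, -b4 / b3**2, (b2 + b3) / (b2 * b3)]])

VDelta = G @ VARB @ G.T            # delta-method covariance of theta
se = np.sqrt(np.diag(VDelta))      # standard errors for the estimates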

71 Estimated CES Function
--------+--------------------------------------------------------------------
        |              Standard            Prob.       95% Confidence
Matrix  | Coefficient    Error       z    |z|>Z*          Interval
--------+--------------------------------------------------------------------
THETA_1 |  105981***    475.2458   223.00  .0000     105049     106912
THETA_2 |   .56819***     .01286    44.19  .0000     .54299     .59340
THETA_3 |  1.06781***     .00864   123.54  .0000    1.05087    1.08475
THETA_4 |  -.31956**      .12857    -2.49  .0129    -.57155    -.06758
--------+--------------------------------------------------------------------

72 Asymptotics for Least Squares: Looking Ahead…
Assumptions: convergence of X′X/n (this holds whether X is nonstochastic or stochastic) and convergence of X′ε/n to 0. These are sufficient for consistency.
Assumption: convergence of (1/√n)X′ε to a normal vector gives asymptotic normality.
What about asymptotic efficiency?
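To preview where these assumptions lead (a standard derivation, sketched here for orientation): write the least squares estimator as
  b = β + (X′X/n)⁻¹ (X′ε/n).
If plim X′X/n = Q, a finite nonsingular matrix, and plim X′ε/n = 0, then the Slutsky results give plim b = β, which is consistency. If in addition (1/√n)X′ε converges in distribution to N(0, σ²Q), then
  √n(b - β) = (X′X/n)⁻¹ (1/√n)X′ε →d N(0, σ²Q⁻¹),
so b is asymptotically normal with mean β and asymptotic covariance matrix (σ²/n)Q⁻¹.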

