William Greene, New York University. True Random Effects in Stochastic Frontier Models.

Presentation transcript:

1/76

2/76 William Greene New York University True Random Effects in Stochastic Frontier Models

3/76 Agenda
- Skew normality – Adelchi Azzalini
- Stochastic frontier model
- Panel Data: Time varying and time invariant inefficiency models
- Panel Data: True random effects models
- Maximum Simulated Likelihood Estimation
- Applications of true random effects
  - Persistent and transient inefficiency in Swiss railroads
  - A panel data sample selection corrected stochastic frontier model
  - Spatial effects in a stochastic frontier model

4/76 Skew Normality

5/76 The Stochastic Frontier Model

6/76 Log Likelihood Skew Normal Density

7/76 Birnbaum (1950) Wrote About Skew Normality Effect of Linear Truncation on a Multinormal Population

8/76 Weinstein (1964) Found $f(\varepsilon)$. Query 2: The Sum of Values from a Normal and a Truncated Normal Distribution. See also Nelson (Technometrics, 1964), Roberts (JASA, 1966)

9/76 Resembles $f(\varepsilon)$. O’Hagan and Leonard (1976) Found Something Like $f(\varepsilon)$: Bayes Estimation Subject to Uncertainty About Parameter Constraints

10/76 ALS (1977) Discovered How to Make Great Use of $f(\varepsilon)$. See also Forsund and Hjalmarsson (1974), Battese and Corra (1976), Poirier,… Timmer, … several others.

11/76 Azzalini (1985) Figured Out $f(\varepsilon)$ And Noticed the Connection to ALS © 2014

12/76

13/76 ALS

14/76 A Useful FAQ About the Skew Normal

15/76 Random Number Generator

16/76 How Many Applications of SF Are There?

17/76 W. D. Walls (2006) On Skewness in the Movies Cites Azzalini.

18/76 “The skew-normal distribution developed by Sahu et al. (2003)…” Does not know Azzalini. SNARCH Model for Financial Crises (2013)

19/76 A Skew Normal Mixed Logit Model (2010) Greene (2010, knows Azzalini and ALS), Bhat (2011, knows not Azzalini … or ALS)

20/76 Skew Normal Applications
- Foundation: An Entire Field, the Stochastic Frontier Model
- Occasional Modeling Strategy
  - Culture: Skewed Distribution of Movie Revenues
  - Finance: Crisis and Contagion
  - Choice Modeling: The Mixed Logit Model
- How can these people find each other?
- Where else do applications appear?

21/76 Stochastic Frontier

22/76 The Cross Section Departure Point: 1977

23/76 The Panel Data Models Appear: 1981 Time fixed

24/76 Reinterpreting the Within Estimator: 1984 Time fixed

25/76 Misgivings About Time Fixed Inefficiency: 1990-

26/76 Are the systematically time varying models more like time fixed or freely time varying?

27/76

28/76 Skepticism About Time Varying Inefficiency Models: Greene (2004)  

29/76 True Random Effects

30/76 True Random and Fixed Effects: 2004 Time varying Time fixed

31/76 Estimation of TFE and TRE Models: 2004

32/76

33/76

34/76 The Most Famous Frontier Study Ever

35/76 The Famous WHO Model
- $\log \mathrm{COMP} = \alpha + \beta_1 \log \mathrm{PerCapitaHealthExpenditure} + \beta_2 \log \mathrm{YearsEduc} + \beta_3 \log^2 \mathrm{YearsEduc} + \varepsilon$, with $\varepsilon = v - u$
- Schmidt/Sickles FEM
- 191 Countries. 140 of them observed

36/76 The Notorious WHO Results 37

37/76 No, it doesn’t. August 12,

38/76 Huffington Post, April 17, 2014

39/76 we are #37

40/76 Greene, W., Distinguishing Between Heterogeneity and Inefficiency: Stochastic Frontier Analysis of the World Health Organization’s Panel Data on National Health Care Systems, Health Economics, 13, 2004, pp

41/76

42/76 Three Extensions of the True Random Effects Model

43/76 Generalized True Random Effects Model

44/76
- A Stochastic Frontier Model with Short-Run and Long-Run Inefficiency: Colombi, R., Kumbhakar, S., Martini, G., Vittadini, G., University of Bergamo, WP, 2011, JPA 2014, forthcoming.
- Tsionas, G. and Kumbhakar, S., Firm Heterogeneity, Persistent and Transient Technical Inefficiency: A Generalized True Random Effects Model, Journal of Applied Econometrics. Published online, November,
- Extremely involved Bayesian MCMC procedure. Efficiency components estimated by data augmentation.

45/76

46/76 Estimating Efficiency in the CSN Model

47/76 Estimating the GTRE Model

48/76 “From the sampling theory perspective, the application of the model is computationally prohibitive when T is large. This is because the likelihood function depends on a (T+1)-dimensional integral of the normal distribution.” [Tsionas and Kumbhakar (2012, p. 6)]

49/76 Kumbhakar, Lien, Hardaker, Technical Efficiency in Competing Panel Data Models: A Study of Norwegian Grain Farming, JPA, Published online, September,
Three steps based on GLS:
(1) RE/FGLS to estimate $(\alpha, \beta)$.
(2) Decompose time varying residuals using MoM and SF.
(3) Decompose estimates of time invariant residuals.

50/76

51/76 WHO Results: 2014

52/76

53/76 Empirical application Cost Efficiency of Swiss Railway Companies

54/76 Model Specification
$TC = f(Y_1, Y_2, P_L, P_C, P_E, N, NS, d_t)$
- $TC$: Total costs
- $Y_1$: Passenger-km
- $Y_2$: Ton-km
- $P_L$: Price of labor (wage per FTE)
- $P_C$: Price of capital (capital costs / total number of seats)
- $P_E$: Price of electricity
- $N$: Network length
- $NS$: Number of stations
- $d_t$: Time dummies

55/76 Data
- 50 railway companies
- Period 1985 to 1997
- Unbalanced panel with number of periods ($T_i$) varying from 1 to 13 and with 45 companies with 12 or 13 years, resulting in 605 observations
- Data source: Swiss federal transport office
- Data set available at
- Data set used in: Farsi, Filippini, Greene (2005), Efficiency measurement in network industries: application to the Swiss railway companies, Journal of Regulatory Economics

56/76

57/76

58/76 Cost Efficiency Estimates

59/76 Correlations

60/76 MSL Estimation

61/76 Why is the MSL method so computationally efficient compared to classical FIML and Bayesian MCMC for this model?
- Conditioned on the persistent effects, the group observations are independent.
- The joint conditional distribution is simple and easy to compute, in closed form.
- The full likelihood is obtained by integrating over only one dimension. (This was discovered by Butler and Moffitt in 1982.)
- Neither of the other methods takes advantage of this result. Both integrate over T+1 dimensions.
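The one-dimensional integration result can be illustrated with a deliberately simplified sketch. It assumes a plain normal random effect and normal noise rather than the composed error of the frontier model, and it uses Gauss-Hermite quadrature instead of simulation; the function names and parameter values are hypothetical. Under those assumptions the T-variate joint density of one group's residuals is known in closed form, so the one-dimensional integral can be checked against it.

```python
import numpy as np
from scipy.stats import norm, multivariate_normal


def group_likelihood_1d(resid, sigma_v, sigma_w, n_nodes=64):
    """Integrate the conditional likelihood over the single random effect w ~ N(0,1)."""
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    w = np.sqrt(2.0) * nodes  # change of variable so the weight function matches N(0,1)
    # Given w, the T observations are independent: multiply their normal densities.
    cond = norm.pdf(resid[None, :] - sigma_w * w[:, None], scale=sigma_v).prod(axis=1)
    return float(np.sum(weights * cond) / np.sqrt(np.pi))


def group_likelihood_joint(resid, sigma_v, sigma_w):
    """Same likelihood from the T-variate normal with equicorrelated covariance."""
    T = resid.size
    cov = sigma_v**2 * np.eye(T) + sigma_w**2 * np.ones((T, T))
    return float(multivariate_normal.pdf(resid, mean=np.zeros(T), cov=cov))


# Illustrative residuals for one group with T = 5 (made-up numbers).
resid = np.array([0.3, -0.2, 0.5, 0.1, -0.4])
one_dim = group_likelihood_1d(resid, sigma_v=1.0, sigma_w=0.5)
joint = group_likelihood_joint(resid, sigma_v=1.0, sigma_w=0.5)
print(one_dim, joint)
```

The cost of the one-dimensional version grows only linearly in T, which is the point the slide makes about conditioning on the persistent effect.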

62/76

63/76 Equivalent Log Likelihood – Identical Outcome
One Dimensional Integration over $\delta_i$
T+1 Dimensional Integration over Re i.

64/76 Simulated [over (w,h)] Log Likelihood Very Fast – with T=13, one minute or so

65/76 Also Simulated Log Likelihood GHK simulator is used to approximate the T+1 variate normal integrals. Very Slow – Huge amount of unnecessary computation.

66/76 Computation of the GTRE Model is Actually Fast and Easy
Farms, 6 years. 100 Halton draws. Computation time: 35 seconds including computing efficiencies.

67/76 Simulation Variance

68/76 Does the simulation chatter degrade the econometric efficiency of the MSL estimator?
- Hajivassiliou, V., “Some practical issues in maximum simulated likelihood,” Simulation-based Inference in Econometrics: Methods and Applications, Mariano, R., Weeks, M. and Schuermann, T., Cambridge University Press, 2008
- Speculated that Asy.Var[estimator] = V + (1/R)C
- The contribution of the chatter would be of second or third order. R is typically in the hundreds or thousands.
- No other evidence on this subject.

69/76 An Experiment: Pooled Spanish Dairy Farms Data
- Stochastic frontier using FIML.
- Random constant term linear regression with constant term equal to $\alpha - \sigma_u |w|$, $w \sim N[0,1]$. This is equivalent to the stochastic frontier model.
- Maximum simulated likelihood
- 500 random draws for the simulation for the base case. Uses Mersenne Twister for the RNG
- 50 repetitions of estimation based on 500 random draws to suggest variation due to simulation chatter.
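A rough sketch of how simulation chatter can be measured, under stated assumptions: it evaluates the simulated log likelihood at fixed, made-up parameter values on synthetic residuals (not the Spanish dairy data), reuses one set of draws across observations for brevity, and repeats the calculation with different seeds. It does not re-estimate the model in each repetition as the experiment on the slide does; all names and numbers are hypothetical.

```python
import numpy as np
from scipy.stats import norm


def simulated_loglik(eps, sigma_u, sigma_v, n_draws, seed):
    """Simulated log likelihood of a normal/half-normal frontier at fixed parameters."""
    rng = np.random.default_rng(seed)
    abs_w = np.abs(rng.standard_normal(n_draws))
    # Given |w|, the remaining disturbance eps + sigma_u*|w| is N(0, sigma_v^2).
    v = eps[:, None] + sigma_u * abs_w[None, :]
    dens = norm.pdf(v, scale=sigma_v).mean(axis=1)  # average over the draws
    return float(np.log(dens).sum())


def chatter_sd(eps, sigma_u, sigma_v, n_draws, n_reps=50):
    """Standard deviation of the simulated log likelihood across n_reps seeds."""
    values = [simulated_loglik(eps, sigma_u, sigma_v, n_draws, seed)
              for seed in range(n_reps)]
    return float(np.std(values, ddof=1))


# Synthetic composed errors eps = v - u (illustrative values only).
data_rng = np.random.default_rng(12345)
eps = 0.3 * data_rng.standard_normal(200) - 0.4 * np.abs(data_rng.standard_normal(200))

sd_small = chatter_sd(eps, 0.4, 0.3, n_draws=50)
sd_large = chatter_sd(eps, 0.4, 0.3, n_draws=5000)
print(sd_small, sd_large)
```

The spread across seeds is the chatter; it should shrink as the number of draws R grows.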

70/76

71/76 Chatter Simulation Noise in Standard Errors of Coefficients

72/76 Quasi-Monte Carlo Integration Based on Halton Sequences
For example, using base $p=5$, the integer $r=37$ has $b_0 = 2$, $b_1 = 2$, and $b_2 = 1$ ($37 = 1 \times 5^2 + 2 \times 5^1 + 2 \times 5^0$). Then $H(37|5) = 2 \times 5^{-1} + 2 \times 5^{-2} + 1 \times 5^{-3} = 0.488$.
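The digit-reversal construction in the example can be written as a short function. This is a minimal sketch; the names `halton` and `halton_normal_draws` are hypothetical, and discarding the first few elements of the sequence (the `skip` argument) is a common practice assumed here, not something stated on the slide.

```python
from scipy.stats import norm


def halton(r, base):
    """Radical inverse of the integer r in the given prime base."""
    value, scale = 0.0, 1.0 / base
    while r > 0:
        r, digit = divmod(r, base)  # peel off the base-p digits b_0, b_1, ...
        value += digit * scale      # b_k contributes b_k * base**-(k+1)
        scale /= base
    return value


def halton_normal_draws(n_draws, base, skip=10):
    """Standard normal draws obtained by inverting the normal CDF at Halton points."""
    uniforms = [halton(r, base) for r in range(skip + 1, skip + n_draws + 1)]
    return norm.ppf(uniforms)


print(halton(37, 5))  # the slide's example, 2/5 + 2/25 + 1/125
draws = halton_normal_draws(100, base=5)
```

Because the points are deterministic, repeating the calculation gives exactly the same draws, which is why no seed appears anywhere.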

73/76 Is It Really Simulation?
- Halton or Sobol sequences are not random
- Far more stable than random draws, by a factor of about 10.
- There is no simulation chatter
- View the same as numerical quadrature
- There may be some approximation error. How would we know?

74/76 Halton Sequences

75/76 Haltonized Log Likelihood

76/76 Summary
- The skew normal distribution
- Two useful models for panel data (and one potentially useful model pending development)
  - Extension of TRE model that allows both transient and persistent random variation and inefficiency
  - Sample selection corrected stochastic frontier
  - Spatial autocorrelation stochastic frontier model
- Methods: Maximum simulated likelihood as an alternative to received brute force methods
  - Simpler
  - Faster
  - Accurate
  - Simulation “chatter” is a red herring – use Halton sequences

77/76 Sample Selection

78/76 TECHNICAL EFFICIENCY ANALYSIS CORRECTING FOR BIASES FROM OBSERVED AND UNOBSERVED VARIABLES: AN APPLICATION TO A NATURAL RESOURCE MANAGEMENT PROJECT
Empirical Economics: Volume 43, Issue 1 (2012), Pages
Boris Bravo-Ureta, University of Connecticut
Daniel Solis, University of Miami
William Greene, New York University

79/76 The MARENA Program in Honduras
- Several programs have been implemented to address resource degradation while also seeking to improve productivity, managerial performance and reduce poverty (and in some cases make up for lack of public support).
- One such effort is the Programa Multifase de Manejo de Recursos Naturales en Cuencas Prioritarias or MARENA in Honduras focusing on small scale hillside farmers.

80/76 Expected Impact Evaluation

81/76 Methods
- A matched group of beneficiaries and control farmers is determined using Propensity Score Matching techniques to mitigate biases that would stem from selection on observed variables.
- In addition, we deal with possible self-selection arising from unobserved variables using a selectivity correction model for stochastic frontiers introduced by Greene (2010).

82/76 A Sample Selected SF Model
$d_i = 1[\alpha' z_i + h_i > 0]$, $h_i \sim N[0, 1^2]$
$y_i = \alpha + \beta' x_i + \varepsilon_i$, $\varepsilon_i \sim N[0, \sigma_\varepsilon^2]$
$(y_i, x_i)$ observed only when $d_i = 1$.
$\varepsilon_i = v_i - u_i$
$u_i = \sigma_u |U_i|$ where $U_i \sim N[0, 1^2]$
$v_i = \sigma_v V_i$ where $V_i \sim N[0, 1^2]$
$(h_i, v_i) \sim N_2[(0, 1), (1, \rho \sigma_v, \sigma_v^2)]$
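To make the structure of the model concrete, here is a sketch of a data generating process with one selection covariate and one frontier covariate. It assumes the selection disturbance $h_i$ and the frontier noise $v_i$ are bivariate normal with correlation $\rho$, as in Greene (2010); the function name, the scalar coefficients (`alpha_sel` in particular) and all numerical values are hypothetical.

```python
import numpy as np


def simulate_selected_sf(n, alpha_sel, alpha, beta, sigma_u, sigma_v, rho, seed=0):
    """Draw one sample from a sample-selected stochastic frontier model."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)  # selection covariate
    x = rng.standard_normal(n)  # frontier covariate
    h = rng.standard_normal(n)  # selection disturbance, N(0, 1)
    # Build V with Corr(h, V) = rho, so Cov(h, v) = rho * sigma_v.
    V = rho * h + np.sqrt(1.0 - rho**2) * rng.standard_normal(n)
    v = sigma_v * V
    u = sigma_u * np.abs(rng.standard_normal(n))  # half-normal inefficiency
    d = (alpha_sel * z + h > 0).astype(int)       # selection indicator
    y = alpha + beta * x + v - u
    y_obs = np.where(d == 1, y, np.nan)           # y is seen only when d = 1
    return d, z, x, y_obs, h, v


d, z, x, y_obs, h, v = simulate_selected_sf(
    n=50_000, alpha_sel=0.8, alpha=1.0, beta=0.5,
    sigma_u=0.4, sigma_v=0.3, rho=0.5, seed=42,
)
print(d.mean(), np.corrcoef(h, v)[0, 1])
```

When $\rho \neq 0$, the noise in the selected subsample no longer has mean zero, which is what the selectivity correction addresses.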

83/76 Simulated logL for the Standard SF Model
This is simply a linear regression with a random constant term, $\alpha_i = \alpha - \sigma_u |U_i|$
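The random constant term interpretation can be checked numerically. The sketch below assumes the normal/half-normal frontier, evaluates the log density of the composed error both in its closed (skew normal) form and by averaging normal densities over draws of $|U_i|$, and compares the two. Function names, parameter values and the number of draws are hypothetical choices for illustration.

```python
import numpy as np
from scipy.stats import norm


def sf_logdensity_closed(eps, sigma_u, sigma_v):
    """Closed-form log density of eps = v - u for the normal/half-normal model."""
    sigma = np.hypot(sigma_u, sigma_v)
    lam = sigma_u / sigma_v
    return (np.log(2.0 / sigma) + norm.logpdf(eps / sigma)
            + norm.logcdf(-eps * lam / sigma))


def sf_logdensity_simulated(eps, sigma_u, sigma_v, draws):
    """Average the normal density of v = eps + sigma_u*|U| over draws of U."""
    v = eps[:, None] + sigma_u * np.abs(draws)[None, :]
    return np.log(norm.pdf(v, scale=sigma_v).mean(axis=1))


eps = np.array([-1.0, -0.5, 0.0, 0.5])
draws = np.random.default_rng(2014).standard_normal(200_000)
closed = sf_logdensity_closed(eps, sigma_u=0.5, sigma_v=0.5)
simulated = sf_logdensity_simulated(eps, sigma_u=0.5, sigma_v=0.5, draws=draws)
print(np.column_stack([closed, simulated]))
```

The two columns should agree up to simulation error, which falls as the number of draws increases.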

84/76 Likelihood For a Sample Selected SF Model

85/76 Simulated Log Likelihood for a Selectivity Corrected Stochastic Frontier Model The simulation is over the inefficiency term.

86/76 JLMS Estimator of u i

87/76 Closed Form for the Selection Model
- The selection model can be estimated without simulation
- “The stochastic frontier model with correction for sample selection revisited.” Lai, Hung-pin. Forthcoming, JPA
- Based on closed skew normal distribution
- Similar to Maddala’s 1982 result for the linear selection model. See slide 42.
- Not more computationally efficient.
- Statistical properties identical.
- Suggested possibility that simulation chatter is an element of inefficiency in the maximum simulated likelihood estimator.

88/76 Spanish Dairy Farms: Selection based on being farm # periods The theory works. Closed Form vs. Simulation

89/76 Variables Used in the Analysis Production Participation

90/76 Findings from the First Wave

91/76 A Panel Data Model Selection takes place only at the baseline. There is no attrition.

92/76 Simulated Log Likelihood

93/76 Main Empirical Conclusions from Waves 0 and 1
- Benefit group is more efficient in both years
- The gap is wider in the second year
- Both means increase from year 0 to year 1
- Both variances decline from year 0 to year 1

94/76

95/76 Spatial Autocorrelation

96/76 True Random Spatial Effects
- Spatial Stochastic Frontier Models: Accounting for Unobserved Local Determinants of Inefficiency: A.M. Schmidt, A.R.B. Moreira, S.M. Helfand, T.C.O. Fonseca, Journal of Productivity Analysis, 31, 2009, pp
- Simply redefines the random effect to be a ‘region effect.’ Just a reinterpretation of the ‘group.’ No spatial decay with distance.
- True REM does not “perform” as well as several other specifications. (“Performance” has nothing to do with the frontier model.)