Stochastic Frontier Models 0 Introduction 1 Efficiency Measurement 2 Frontier Functions 3 Stochastic Frontiers 4 Production and Cost 5 Heterogeneity 6 Model Extensions 7 Panel Data 8 Applications William Greene Stern School of Business New York University
Range of Applications Regulated industries – railroads, electricity, public services Health care delivery – nursing homes, hospitals, health care systems (WHO) Banking and Finance Many, many (many) other industries. See Lovell and Schmidt survey…
Discrete Variables Count data frontier Outcomes inside the frontier: Preserve discrete outcome Patents (Hofler, R., “A Count Data Stochastic Frontier Model”) Infant Mortality (Fe, E., “On the Production of Economic Bads…”)
Count Frontier P(y*|x)=Poisson Model for optimal outcome Inefficiency affects the distribution: P(y|y*,x)=P(y*-u|x)= a different count model for the mixture of two count variables Inefficiency affects the mean: E[y*|x]=λ(x) while E[y|x]=u λ(x) with 0 < u < 1. (A mixture model) Other formulations.
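The "affects the mean" formulation above can be illustrated numerically. The following is a minimal sketch, not the specification used in the cited papers: it assumes the efficiency factor u follows a Beta(a, b) distribution on (0, 1) (an assumption made here purely for illustration) and integrates it out with a midpoint rule to obtain the mixture pmf of the observed count. The function names and the value λ = 6 are hypothetical.

```python
import math

def poisson_pmf(y, mean):
    # Poisson probability computed in logs for numerical stability.
    return math.exp(-mean + y * math.log(mean) - math.lgamma(y + 1))

def beta_pdf(u, a, b):
    # Beta(a, b) density on (0, 1); ASSUMED distribution for the efficiency factor u.
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(u) + (b - 1) * math.log(1 - u))

def mixed_count_pmf(y, lam, a=2.0, b=2.0, n_grid=2000):
    # P(y|x) = integral over u of Poisson(y; u*lam) * g(u), by the midpoint rule.
    h = 1.0 / n_grid
    total = 0.0
    for k in range(n_grid):
        u = (k + 0.5) * h
        total += poisson_pmf(y, u * lam) * beta_pdf(u, a, b) * h
    return total

lam = 6.0  # frontier mean E[y*|x], illustrative value
pmf = [mixed_count_pmf(y, lam) for y in range(60)]
observed_mean = sum(y * p for y, p in enumerate(pmf))
print(f"frontier mean = {lam:.3f}, observed mean = {observed_mean:.3f}")
```

Under these assumptions the observed mean equals E[u]·λ(x), so it lies strictly inside the frontier mean while the outcome remains a count.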
Alvarez, Arias, Greene Fixed Management Yit = f(xit,mi*) where mi* = “management” Actual mi = mi* - ui. Actual falls short of “ideal” Translates to a random coefficients stochastic frontier model Estimated by simulation Application to Spanish dairy farms
Fixed Management as an Input Implies Time Variation in Inefficiency
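A simplified one-input sketch may help show why a time-invariant management gap produces time-varying inefficiency. It assumes a log-quadratic (translog-style) technology in a single log input x_it and management; the coefficient names α, β, γ, δ, θ are illustrative and are not taken from the paper.

```latex
\begin{align*}
\ln y_{it} &= \alpha + \beta x_{it} + \gamma m_i + \tfrac{1}{2}\delta m_i^{2}
              + \theta x_{it} m_i + v_{it}, \\
\ln y_{it}^{*} &= \alpha + \beta x_{it} + \gamma m_i^{*} + \tfrac{1}{2}\delta m_i^{*2}
              + \theta x_{it} m_i^{*} + v_{it}, \\
\ln y_{it}^{*} - \ln y_{it} &= (\gamma + \theta x_{it})\,(m_i^{*} - m_i)
              + \tfrac{1}{2}\delta\,\bigl(m_i^{*2} - m_i^{2}\bigr).
\end{align*}
```

With m_i = m_i* - u_i, the gap becomes (γ + θ x_it) u_i + ½ δ u_i (2 m_i* - u_i). Both m_i* and u_i are fixed over time, but the gap moves with x_it through the interaction term. In the same sketch, the coefficient on x_it in the frontier is β + θ m_i*, which varies across firms with the unobserved m_i*; this is the sense in which the model becomes a random coefficients frontier.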
Random Coefficients Frontier Model [Chamberlain/Mundlak: Correlation of mi* (not mi-mi*) with xit]
Estimated Model First order production coefficients (standard errors). Quadratic terms not shown.
Inefficiency Distributions Without Fixed Management With Fixed Management
Holloway, Tomberlin, Irz: Coastal Trawl Fisheries Application of frontier to coastal fisheries Hierarchical Bayes estimation Truncated normal model and exponential Panel data application Time varying inefficiency The “good captain” effect vs. inefficiency
Sports Kahane: Hiring practices in hockey Output=payroll, Inputs=coaching, franchise measures Efficiency in payroll related to team performance Battese/Coelli panel data translog model Koop: Performance of baseball players Aggregate output: singles, doubles, etc. Inputs = year, league, team Policy relevance? (Just for fun)
Macro Performance Koop et al. Productivity Growth in a stochastic frontier model Country, year, Y_it = f_t(K_it, L_it) E_it w_it Bayesian estimation OECD Countries, 1979-1988
Mutual Fund Performance Standard CAPM Stochastic frontier added Excess return = a + b*Beta + v – u Sub-model for determinants of inefficiency Bayesian framework Pooled estimates from various different distributions
Energy Consumption Derived input to household and community production Cost analogy Panel data, statewide electricity consumption: Filippini, Farsi, et al.
Hospitals Usually cost studies Multiple outputs – case mix “Quality” is a recurrent theme Complexity – unobserved variable Endogeneity Rosko: US Hospitals, multiple outputs, panel data, determinants of inefficiency = HMO penetration, payment policies, also includes indicators of heterogeneity Australian hospitals: Fit both production and cost frontiers. Finds large cost savings from removing inefficiency.
Law Firms Stochastic frontier applied to service industry Output=Revenue Inputs=Lawyers, associates/partners ratio, paralegals, average legal experience, national firm Analogy drawn to hospitals literature – quality aspect of output is a difficult problem
Farming Hundreds of applications Major proving ground for new techniques Many high quality, very low level micro data sets O’Donnell/Griffiths – Philippine rice farms Latent class – favorable or unfavorable climate Panel data production model Bayesian – has a difficult time with latent class models. Classical is a better approach
Railroads and other Regulated Industries Filippini – Maggi: Swiss railroads, scale effects etc. Also studied effect of different panel data estimators Coelli – Perelman, European railroads. Distance function. Developed methodology for distance functions Many authors: Electricity (C&G). Used as the standard test data for Bayesian estimators
Banking Dozens of studies Wheelock and Wilson, U.S. commercial banks Turkish Banking system Banks in transition countries U.S. Banks – Fed studies (hundreds of studies) Typically multiple output cost functions Development area for new techniques Many countries have very high quality data available
Sewers New York State sewage treatment plants 200+ statewide, several thousand employees Used fixed coefficients technology lnE = a + b*lnCapacity + v – u; b < 1 implies economies of scale (almost certain) Fit as frontier functions, but the effect of market concentration was the main interest
Summary
Inefficiency
Methodologies Data Envelopment Analysis HUGE User base Largely atheoretical Applications in management, consulting, etc. Stochastic Frontier Modeling More theoretically based – “model” based More active technique development literature Equally large applications pool
SFA Models Normal – Half Normal Truncation Heteroscedasticity Heterogeneity in the distribution of ui Normal-Gamma, Exponential, Rayleigh Classical vs. Bayesian applications Flexible functional forms for inefficiency There are yet others in the literature
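As a concrete reference point for the normal-half normal case, here is a minimal sketch of the composed-error log density and the conditional mean estimator of inefficiency (usually attributed to Jondrow, Lovell, Materov and Schmidt) for a production frontier with ε = v - u. It uses only the Python standard library; the function names and the parameter values σv = 0.2, σu = 0.3 are illustrative assumptions, not estimates from any application listed here.

```python
import math

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def sf_log_density(eps, sigma_v, sigma_u):
    # Log density of eps = v - u, with v ~ N(0, sigma_v^2) and u ~ |N(0, sigma_u^2)|.
    sigma = math.hypot(sigma_v, sigma_u)   # sqrt(sigma_v^2 + sigma_u^2)
    lam = sigma_u / sigma_v
    z = eps / sigma
    log_phi = -0.5 * math.log(2.0 * math.pi) - 0.5 * z * z
    return math.log(2.0 / sigma) + log_phi + math.log(norm_cdf(-lam * z))

def jlms(eps, sigma_v, sigma_u):
    # Conditional mean E[u | eps] for the production (v - u) case.
    sigma2 = sigma_v ** 2 + sigma_u ** 2
    mu_star = -eps * sigma_u ** 2 / sigma2
    sigma_star = sigma_u * sigma_v / math.sqrt(sigma2)
    z = mu_star / sigma_star
    return sigma_star * (norm_pdf(z) / norm_cdf(z) + z)

# Numerical sanity check on a grid (illustrative parameter values).
sigma_v, sigma_u = 0.2, 0.3
h = 0.001
grid = [-3.0 + (k + 0.5) * h for k in range(6000)]
dens = [math.exp(sf_log_density(e, sigma_v, sigma_u)) for e in grid]
total_mass = sum(d * h for d in dens)
mean_eps = sum(e * d * h for e, d in zip(grid, dens))
print(total_mass, mean_eps, jlms(-0.5, sigma_v, sigma_u), jlms(0.5, sigma_v, sigma_u))
```

Summing `sf_log_density` over the residuals gives the sample log-likelihood that a numerical optimizer would maximize. The other distributions on this slide change the density but not this overall structure.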
Modeling Settings Production and Cost Models Multiple output models Cost functions Distance functions, profits and revenue functions
Modeling Issues Appropriate model framework Cost, production, etc. Functional form How to handle observable heterogeneity – “where do we put the zs?” Panel data Is inefficiency time invariant? Separating heterogeneity from inefficiency Dealing with endogeneity Allocative inefficiency and the Greene problem
Range of Applications Regulated industries – railroads, electricity, public services Health care delivery – nursing homes, hospitals, health care systems (WHO, AHRQ) Banking and Finance Many other industries. See Lovell and Schmidt “Efficiency and Productivity” 27 page bibliography. Table of over 200 applications since 2000