Econometrics in Health Economics Discrete Choice Modeling and Frontier Modeling and Efficiency Estimation Professor William Greene Stern School of Business New York University September 2-4, 2007
Frontier and Efficiency Estimation
  Session 5: Efficiency Analysis; Stochastic Frontier Model; Efficiency Estimation
  Session 6: Panel Data Models and Heterogeneity; Fixed and Random Effects; Bayesian and Classical Estimation
  Session 7: Efficiency Models; Stochastic Frontier and Data Envelopment Analysis; Student Presentation: Silvio Daidone and Francesco D’Amico
  Session 8: Computer Exercises and Applications
The Production Function “A single output technology is commonly described by means of a production function f(z) that gives the maximum amount q of output that can be produced using input amounts (z_1,…,z_{L-1}) > 0.” Mas-Colell, Whinston, Green, “Microeconomic Theory,” Oxford, 1995. See also Samuelson (1938) and Shephard (1953).
Thoughts on Inefficiency
  Failure to achieve the theoretical maximum
  Hicks (ca. 1935) on the benefits of monopoly
  Leibenstein (ca. 1966): X inefficiency
  Debreu, Farrell (1950s) on management inefficiency
  All related to firm behavior in the absence of market restraint – the exercise of market power.
A History of Empirical Investigation Cobb-Douglas (1927) Arrow, Chenery, Minhas, Solow (1963) Joel Dean (1940s, 1950s) Johnston (1950s) Nerlove (1960) Christensen et al. (1972)
Inefficiency in the “Real” World Measurement of inefficiency in “markets” – heterogeneous production outcomes: Aigner and Chu (1968) Timmer (1971) Aigner, Lovell, Schmidt (1977) Meeusen, van den Broeck (1977)
Production Functions Production is a process of transformation of a set of inputs, denoted x, into a set of outputs, y. Transformation of inputs to outputs is via the transformation function: T(y,x) = 0.
Defining the Production Set Level set: The Production function is defined by the isoquant The efficient subset is defined in terms of the level sets:
Isoquants and Level Sets
The Distance Function
Inefficiency
Production Function Model with Inefficiency
Cost Inefficiency
  y* = f(x)
  C* = g(y*,w) (Samuelson – Shephard duality results)
  Cost inefficiency: If y < f(x), then C must be greater than g(y,w).
  This implies the idea of a cost frontier: ln C = ln g(y,w) + u, u > 0.
Specification
Corrected Ordinary Least Squares
Modified OLS An alternative approach that requires a parametric model of the distribution of u_i is modified OLS (MOLS). The OLS residuals, save for the constant displacement, are pointwise consistent estimates of their population counterparts, -u_i. Suppose that u_i has an exponential distribution with mean λ. Then the variance of u_i is λ^2, so the standard deviation of the OLS residuals is a consistent estimator of E[u_i] = λ. Since this is a one-parameter distribution, the entire model for u_i can be characterized by this parameter and functions of it. The estimated frontier function can now be displaced upward by this estimate of E[u_i].
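The two displacement rules can be illustrated with a minimal sketch. It assumes a single regressor, a deterministic frontier (no symmetric noise term), and simulated data. The function names `ols_simple` and `cols_mols`, the true parameter values, and the simulation design are all hypothetical and chosen only for illustration. COLS shifts the OLS intercept by the largest residual; MOLS, under the exponential assumption, shifts it by the standard deviation of the residuals.

```python
import random
import statistics


def ols_simple(x, y):
    """OLS intercept and slope for a single regressor."""
    xbar, ybar = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def cols_mols(x, y):
    """Displace the OLS intercept upward by the COLS and MOLS rules."""
    a, b = ols_simple(x, y)
    resid = [yi - a - b * xi for xi, yi in zip(x, y)]
    # COLS: shift so that the largest residual is exactly zero.
    a_cols = a + max(resid)
    # MOLS with exponential u: sd of the residuals estimates E[u] = lambda.
    a_mols = a + statistics.pstdev(resid)
    return {"slope": b, "ols": a, "cols": a_cols, "mols": a_mols}


# Simulated deterministic frontier (assumed design, not from the lecture):
# ln y = 1.0 + 0.5 ln x - u, with u exponential with mean 0.3.
rng = random.Random(0)
x = [rng.uniform(0.0, 2.0) for _ in range(2000)]
y = [1.0 + 0.5 * xi - rng.expovariate(1 / 0.3) for xi in x]
fit = cols_mols(x, y)
print(fit)
```

In this design the OLS intercept should settle near 1.0 - 0.3, while both displaced intercepts should be close to the frontier intercept of 1.0.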
COLS and MOLS
Deterministic Frontier: Programming Estimators
Estimating Inefficiency
Statistical Problems with Programming Estimators
  They do correspond to MLEs.
  The likelihood functions are “irregular.”
  There are no known statistical properties – no estimable covariance matrix for estimates.
  They might be “robust,” like LAD. No one knows for sure. Never demonstrated.
A Model with a Statistical Basis
Extensions
  Cost frontiers, based on duality results:
  ln y = f(x) – u, u > 0.
  ln C = g(y,w) + u’, u’ > 0.
  Economies of scale and allocative inefficiency blur the relationship.
  Corrected and modified least squares estimators based on the deterministic frontiers are easily constructed.
Data Envelopment Analysis
Methodological Problems Measurement error Outliers Specification errors The overall problem with the deterministic frontier approach
Stochastic Frontier Models
  Motivation:
  Factors not under control of the firm
  Measurement error
  Differential rates of adoption of technology
  The frontier is randomly placed by the whole collection of stochastic elements that might enter the model outside the control of the firm.
  Aigner, Lovell, Schmidt (1977), Meeusen, van den Broeck (1977)
Stochastic Frontier Model y_i = α + β’x_i + v_i - u_i. u_i > 0, but v_i may take any value. A symmetric distribution, such as the normal distribution, is usually assumed for v_i. Thus, the stochastic frontier is α + β’x_i + v_i and, as before, u_i represents the inefficiency.
Least Squares Estimation Average inefficiency is embodied in the third moment of the disturbance ε_i = v_i - u_i. So long as E[v_i - u_i] is constant, the OLS estimates of the slope parameters of the frontier function are unbiased and consistent. (The constant term estimates α - E[u_i].) The average inefficiency present in the distribution is reflected in the asymmetry of the distribution, which can be estimated using the third central moment of the OLS residuals: m_3 = (1/N) Σ_i (e_i - ē)^3.
Application to Spanish Dairy Farms
Input | Units | Mean | Std. Dev. | Minimum | Maximum
Milk | Milk production (liters) | 131,108 | 92,539 | 14,110 | 727,281
Cows | # of milking cows | | | |
Labor | # man-equivalent units | | | |
Land | Hectares of land devoted to pasture and crops | | | |
Feed | Total amount of feedstuffs fed to dairy cows (tons) | 57,941 | 47,981 | 3, | ,732
N = 247 farms, T = 6 years ( )
Example: Dairy Farms
The Normal-Half Normal Model
Normal-Half Normal Variable
Decomposition
Standard Form
Estimation: Least Squares/MoM
  OLS estimator of β is consistent.
  E[u_i] = (2/π)^{1/2} σ_u, so the OLS constant estimates α - (2/π)^{1/2} σ_u.
  Second and third moments of OLS residuals estimate
  m_2 = σ_v^2 + [1 - 2/π] σ_u^2 and m_3 = (2/π)^{1/2} [1 - 4/π] σ_u^3.
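A Problem with Method of Moments
  Estimator of σ_u is [m_3 / ((2/π)^{1/2} (1 - 4/π))]^{1/3} = [m_3 / (-0.21801)]^{1/3}.
  Theoretical m_3 is < 0. Sample m_3 may be > 0. If so, the term in brackets is negative and there is no (positive) solution for σ_u. (Negative to 1/3 power.)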
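A minimal sketch of the method-of-moments calculation for the normal-half normal production frontier follows, including the wrong-skewness check. It relies on the standard half-normal moment results m_2 = σ_v^2 + (1 - 2/π)σ_u^2 and m_3 = (2/π)^{1/2}(1 - 4/π)σ_u^3. The function name `mom_normal_half_normal` and its return format are hypothetical, chosen for illustration only.

```python
import math


def mom_normal_half_normal(resid, ols_constant):
    """Method-of-moments estimates from OLS residuals (production frontier).

    Returns None when the sample third moment has the wrong sign, in which
    case no positive sigma_u solves the moment equation.
    """
    n = len(resid)
    mean = sum(resid) / n
    m2 = sum((e - mean) ** 2 for e in resid) / n
    m3 = sum((e - mean) ** 3 for e in resid) / n
    if m3 >= 0:
        return None  # residuals are not negatively skewed
    c = math.sqrt(2 / math.pi) * (1 - 4 / math.pi)  # about -0.218
    sigma_u = (m3 / c) ** (1 / 3)
    # Can come out negative in small samples; it is reported as computed.
    sigma_v2 = m2 - (1 - 2 / math.pi) * sigma_u ** 2
    # Undo the downward displacement of the OLS constant by E[u].
    alpha = ols_constant + math.sqrt(2 / math.pi) * sigma_u
    return {"sigma_u": sigma_u, "sigma_v2": sigma_v2, "alpha": alpha}


print(mom_normal_half_normal([3.0, -1.0, -1.0, -1.0], 0.0))  # wrong skew
print(mom_normal_half_normal([-3.0, 1.0, 1.0, 1.0], 0.0))
```

The sketch returns `None` for positively skewed residuals rather than forcing a value.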
Likelihood Function Waldman (1982) result on skewness of OLS residuals: If the OLS residuals are positively skewed, rather than negatively skewed, then OLS maximizes the log likelihood, and there is no evidence of inefficiency in the data.
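For reference, the normal-half normal log likelihood can be evaluated as in the sketch below. It assumes the usual parameterization σ^2 = σ_u^2 + σ_v^2 and λ = σ_u/σ_v, with ε_i = v_i - u_i, so that ln f(ε_i) = (1/2) ln(2/π) - ln σ - (1/2)(ε_i/σ)^2 + ln Φ(-ε_i λ/σ). The function name `loglik_half_normal` is illustrative, and the floor on the normal CDF is only a crude guard against taking the log of zero.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal


def loglik_half_normal(eps, sigma, lam):
    """Log likelihood of composed errors eps_i = v_i - u_i.

    sigma = sqrt(sigma_u^2 + sigma_v^2), lam = sigma_u / sigma_v.
    With lam = 0 this reduces to the normal (OLS) log likelihood.
    """
    total = 0.0
    for e in eps:
        tail = max(_STD.cdf(-e * lam / sigma), 1e-300)  # avoid log(0)
        total += (0.5 * math.log(2 / math.pi) - math.log(sigma)
                  - 0.5 * (e / sigma) ** 2 + math.log(tail))
    return total


print(loglik_half_normal([0.0], 1.0, 0.0))
print(loglik_half_normal([-1.0], 1.0, 1.0))
```

At λ = 0 the value coincides with the normal log likelihood, which is the point OLS maximizes in the wrong-skewness case.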
Alternative Model: Exponential
Normal-Exponential Likelihood
Truncated Normal Model
Normal-Truncated Normal
Other Models Other Parametric Models (we will examine gamma later in the course) Semiparametric and nonparametric – the recent outer reaches of the theoretical literature Other variations including heterogeneity in the frontier function and in the distribution of inefficiency
Estimating u_i
  No direct estimate of u_i.
  Data permit estimation of y_i – β’x_i. Can this be used?
  ε_i = y_i – β’x_i = v_i – u_i
  Indirect estimate of u_i, using E[u_i | v_i – u_i].
  v_i – u_i is estimable with e_i = y_i – b’x_i.
Fundamental Tool - JLMS (Jondrow, Lovell, Materov, Schmidt, 1982) We can insert our maximum likelihood estimates of all parameters. Note: This estimates E[u_i | v_i – u_i], not u_i.
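A minimal sketch of the JLMS conditional mean for the normal-half normal production frontier follows. It uses the standard result that u_i given ε_i is normal with mean μ* = -ε_i σ_u^2/σ^2 and standard deviation σ* = σ_u σ_v/σ, truncated at zero, so that E[u_i | ε_i] = μ* + σ* φ(μ*/σ*)/Φ(μ*/σ*). The function names and the cutoff used for the tail approximation are assumptions made for illustration.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal


def jlms_half_normal(eps, sigma_u, sigma_v):
    """E[u | eps] for eps = v - u, v ~ N(0, sigma_v^2), u half normal."""
    sigma2 = sigma_u ** 2 + sigma_v ** 2
    mu_star = -eps * sigma_u ** 2 / sigma2
    sigma_star = sigma_u * sigma_v / math.sqrt(sigma2)
    ratio = mu_star / sigma_star
    if ratio < -30.0:
        # Far in the tail the CDF underflows; use the first-order
        # approximation phi(r)/Phi(r) ~ |r| + 1/|r| (assumed cutoff).
        return sigma_star / -ratio
    return mu_star + sigma_star * _STD.pdf(ratio) / _STD.cdf(ratio)


def technical_efficiency(eps, sigma_u, sigma_v):
    """A common efficiency summary: exp(-E[u | eps])."""
    return math.exp(-jlms_half_normal(eps, sigma_u, sigma_v))


print(jlms_half_normal(0.0, 1.0, 1.0))
print(technical_efficiency(-0.5, 1.0, 1.0))
```

In practice `eps` would be the residual e_i and the two standard deviations would be the maximum likelihood estimates. A more negative residual gives a larger estimated inefficiency.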
Other Distributions
Efficiency
Application: Electricity Generation
Estimated Translog Production Frontiers
Inefficiency Estimates
Estimated Inefficiency Distribution
Confidence Region
Application (Based on Costs)
Multiple Output Frontier The formal theory of production departs from the transformation function that links the vector of outputs, y to the vector of inputs, x; T(y,x) = 0. As it stands, some further assumptions are obviously needed to produce the framework for an empirical model. By assuming homothetic separability, the function may be written in the form A(y) = f(x).
Multiple Output Production Function Inefficiency in this setting reflects the failure of the firm to achieve the maximum aggregate output attainable. Note that the model does not address the economic question of whether the chosen output mix is optimal with respect to the output prices and input costs. That would require a profit function approach. Berger (1993) and Adams et al. (1999) apply the method to a panel of U.S. banks – 798 banks, ten years.
Duality Between Production and Cost
Implied Cost Frontier Function
Stochastic Cost Frontier
Cobb-Douglas Cost Frontier
Translog Cost Frontier
Restricted Translog Cost Function
Cost Application to C&G Data
Estimates of Economic Efficiency
Duality – Production vs. Cost
Multiple Output Cost Frontier
Allocative Inefficiency and Economic Inefficiency Technical inefficiency: Off the isoquant. Allocative inefficiency: Wrong input mix.
Cost Structure – Demand System
Cost Frontier Model
The Greene Problem
  Factor shares are derived from the cost function by differentiation.
  Where does e_k come from?
  Any nonzero value of e_k, which can be positive or negative, must translate into higher costs. Thus, u must be a function of e_1,…,e_K such that ∂u/∂e_k > 0.
  No one had derived a complete, internally consistent equation system: the Greene problem.
  Solution: Kumbhakar in several recent papers. Very complicated – near to impractical. Apparently not of interest to practitioners.
Observable Heterogeneity
  As opposed to unobservable heterogeneity
  Observe: Y or C (outcome) and X or w (inputs or input prices)
  Firm characteristics z: not production or cost variables, but they characterize the production process.
  Enter the production or cost function?
  Enter the inefficiency distribution? How?
Shifting the Outcome Function Firm-specific heterogeneity can also be incorporated into the inefficiency model as follows. This modifies the mean of the truncated normal distribution:
  y_i = β’x_i + v_i - u_i
  v_i ~ N[0, σ_v^2]
  u_i = |U_i| where U_i ~ N[μ_i, σ_u^2], μ_i = μ_0 + μ_1 z_i
Heterogeneous Mean
Estimated Efficiency
One Step or Two Step
  Two step: Fit the half normal or truncated normal model, compute the JLMS estimate of u_i, then regress it on z_i. Airline example: Fit the model without POINTS, LOADFACTOR, STAGE.
  One step: Include z_i in the model and compute u_i including z_i. Airline example: Include the 3 variables.
  Methodological issue: Left-out variables in the two-step approach.
WHO Health Care Study
Application: WHO Data
One vs. Two Step
Unobservable Heterogeneity
  Parameters vary across firms
  Random variation (heterogeneity, not Bayesian)
  Variation partially explained by observable indicators
  Continuous variation – random parameter models: Considered with panel data models
  Latent class – discrete parameter variation
A Latent Class Model
Latent Class Application Banking Costs
Heteroscedasticity in v and/or u
  Var[v_i | h_i] = σ_v^2 g_v(h_i, δ) = σ_vi^2, g_v(h_i, 0) = 1, g_v(h_i, δ) = [exp(δ^T h_i)]^2
  Var[U_i | h_i] = σ_u^2 g_u(h_i, τ) = σ_ui^2, g_u(h_i, 0) = 1, g_u(h_i, τ) = [exp(τ^T h_i)]^2
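The exponential scaling can be made concrete with a short sketch. Because the variance is the base variance times [exp(coefficients · h_i)]^2, the firm-specific standard deviation is the base standard deviation times exp(coefficients · h_i). The function name `scaled_sd` and the example values are hypothetical.

```python
import math


def scaled_sd(base_sd, coeffs, h):
    """Firm-specific standard deviation: base_sd * exp(coeffs . h).

    The implied variance is base_sd^2 * [exp(coeffs . h)]^2, and it
    reduces to base_sd^2 when every coefficient is zero.
    """
    index = sum(c * hk for c, hk in zip(coeffs, h))
    return base_sd * math.exp(index)


# Illustrative values only: one firm with two characteristics.
print(scaled_sd(0.2, [0.5, -0.1], [1.0, 3.0]))
print(scaled_sd(0.2, [0.0, 0.0], [1.0, 3.0]))  # homoscedastic case
```

The same function serves for either error component, with its own coefficient vector.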
Application: WHO Data
A “Scaling” Model
Model Extensions Simulation Based Estimators Normal-Gamma Frontier Model Bayesian Estimation of Stochastic Frontiers Similar Model Structures Similar Estimation Methodologies Similar Results
Normal-Gamma Very flexible model. VERY difficult log likelihood function. Bayesians love it. Conjugate functional forms for other model parts
Normal-Gamma Model z ~ N[-ε_i - σ_v^2/σ_u, σ_v^2]. q(r, ε_i) is extremely difficult to compute.
Normal-Gamma
Simulating the Likelihood ε_i = y_i - β^T x_i, μ_i = -ε_i - σ_v^2/σ_u, σ = σ_v, and P_L = Φ(-μ_i/σ) and F_q is a draw from the continuous uniform(0,1) distribution.
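A minimal sketch of how these pieces can be combined is given below. It assumes that q(r, ε_i) is the expectation of z^r for a normal variable with mean μ_i and standard deviation σ, truncated to z > 0. Draws are generated by the inverse-CDF method, z = μ_i + σ Φ^{-1}(P_L + F_q (1 - P_L)). The function name `simulate_q`, the number of draws, and the clamping constants are illustrative assumptions, and r is taken to be nonnegative.

```python
import random
from statistics import NormalDist

_STD = NormalDist()  # standard normal


def simulate_q(r, eps, sigma_u, sigma_v, n_draws=20000, seed=0):
    """Simulate E[z^r | z > 0] for z ~ N(mu, sigma^2), assuming r >= 0."""
    rng = random.Random(seed)
    mu = -eps - sigma_v ** 2 / sigma_u
    sigma = sigma_v
    p_low = _STD.cdf(-mu / sigma)  # probability mass below zero
    total = 0.0
    for _ in range(n_draws):
        f = rng.random()  # uniform(0,1) draw
        p = p_low + f * (1.0 - p_low)
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # keep inv_cdf in its domain
        z = max(mu + sigma * _STD.inv_cdf(p), 0.0)  # guard rounding error
        total += z ** r
    return total / n_draws


print(simulate_q(1, -1.0, 1.0, 1.0))
```

With ε = -1 and σ_u = σ_v = 1 the mean μ is zero, so the r = 1 case should approximate the half-normal mean (2/π)^{1/2}.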
Application to C&G Data This is the standard data set for developing and testing Exponential, Gamma, and Bayesian estimators.
Application to C&G Data
Bayesian Estimation
  Short history – first developed post 1995
  Range of applications
  Largely replicated existing classical methods
  Recent applications have extended received approaches
  Common features of the application
Bayesian Formulation of SF Model Normal – Exponential Model: v_i – u_i = y_i - α - β^T x_i. Estimation proceeds (in principle) by specifying priors over Θ = (α, β, σ_v, σ_u), then deriving inferences from the joint posterior p(Θ|data). In general, the joint posterior for this model cannot be derived in closed form, so direct analysis is not feasible. Using Gibbs sampling and known conditional posteriors, it is possible to use Markov Chain Monte Carlo (MCMC) methods to sample from the marginal posteriors and use that device to learn about the parameters and inefficiencies. In particular, for the model parameters, we are interested in estimating E[Θ|data] and Var[Θ|data] and, perhaps, even more fully characterizing the density f(Θ|data).
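Once a sampler has produced draws for a parameter, the posterior mean and variance are estimated by simple averages over the retained draws. The sketch below assumes the draws for one scalar parameter are already available as a list. The function name `posterior_summary`, the burn-in convention, and the example numbers are hypothetical, and the sampler itself is not shown.

```python
import statistics


def posterior_summary(draws, burn_in):
    """Estimate the posterior mean and variance from MCMC draws.

    The first `burn_in` draws are discarded before averaging.
    """
    kept = draws[burn_in:]
    return {
        "mean": statistics.fmean(kept),
        "variance": statistics.variance(kept),
        "n_kept": len(kept),
    }


# Illustrative chain for one parameter; the first two draws are burn-in.
summary = posterior_summary([10.0, 10.0, 1.0, 2.0, 3.0], burn_in=2)
print(summary)
```

A histogram or kernel density of the retained draws would serve the fuller characterization of the posterior density.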
Estimating Inefficiency One might, ex post, estimate E[u_i|data]; however, it is more natural in this setting to include (u_1,...,u_N) with Θ, and estimate the conditional means with those of the other parameters. The method is known as data augmentation.
Priors Over Parameters
Priors for Inefficiencies
Posterior
Gibbs Sampling: Conditional Posteriors
Bayesian Normal-Gamma Model
  Tsionas (2002): Erlang form – integer P; “random parameters”; applied to C&G
  River Huang (2004): fully general; applied (as usual) to C&G
Bayesian and Classical Results
Methodological Comparison
  Bayesian vs. classical interpretation
  Practical results: Bernstein – von Mises theorem in the presence of diffuse priors
  Kim and Schmidt comparison (JPA, 2000)
  Important difference – tight priors over u_i in this context.
  Conclusions?