Published by Noah Nathan Watts. Modified over 9 years ago.
2
Econometrics in Health Economics: Discrete Choice Modeling and Frontier Modeling and Efficiency Estimation. Professor William Greene, Stern School of Business, New York University. September 2-4, 2007.
3
Frontier and Efficiency Estimation. Session 5: Efficiency Analysis; Stochastic Frontier Model; Efficiency Estimation. Session 6: Panel Data Models and Heterogeneity; Fixed and Random Effects; Bayesian and Classical Estimation. Session 7: Efficiency Models; Stochastic Frontier and Data Envelopment Analysis. Student Presentation: Silvio Daidone and Francesco D’Amico. Session 8: Computer Exercises and Applications.
4
The Production Function: “A single output technology is commonly described by means of a production function f(z) that gives the maximum amount q of output that can be produced using input amounts (z_1,…,z_{L-1}) > 0.” Microeconomic Theory, Mas-Colell, Whinston, Green: Oxford, 1995, p. 129. See also Samuelson (1938) and Shephard (1953).
5
Thoughts on Inefficiency: Failure to achieve the theoretical maximum. Hicks (ca. 1935) on the benefits of monopoly. Leibenstein (ca. 1966): X-inefficiency. Debreu, Farrell (1950s) on management inefficiency. All related to firm behavior in the absence of market restraint, that is, the exercise of market power.
6
A History of Empirical Investigation: Cobb-Douglas (1927); Arrow, Chenery, Minhas, Solow (1963); Joel Dean (1940s, 1950s); Johnston (1950s); Nerlove (1960); Christensen et al. (1972).
7
Inefficiency in the “Real” World: Measurement of inefficiency in “markets” with heterogeneous production outcomes: Aigner and Chu (1968); Timmer (1971); Aigner, Lovell, Schmidt (1977); Meeusen, van den Broeck (1977).
8
Production Functions: Production is a process of transformation of a set of inputs, denoted x, into a set of outputs, y. Transformation of inputs to outputs is via the transformation function: T(y,x) = 0.
9
Defining the Production Set. Level set: The production function is defined by the isoquant. The efficient subset is defined in terms of the level sets.
10
Isoquants and Level Sets
11
The Distance Function
12
Inefficiency
13
Production Function Model with Inefficiency
14
Cost Inefficiency: y* = f(x), C* = g(y*,w) (Samuelson-Shephard duality results). Cost inefficiency: If y < f(x), then C must be greater than g(y,w). This implies the idea of a cost frontier: ln C = ln g(y,w) + u, u > 0.
15
Specification
16
Corrected Ordinary Least Squares
17
Modified OLS: An alternative approach that requires a parametric model of the distribution of u_i is modified OLS (MOLS). The OLS residuals, save for the constant displacement, are pointwise consistent estimates of their population counterparts, -u_i. Suppose that u_i has an exponential distribution with mean λ. Then the variance of u_i is λ^2, so the standard deviation of the OLS residuals is a consistent estimator of E[u_i] = λ. Since this is a one parameter distribution, the entire model for u_i can be characterized by this parameter and functions of it. The estimated frontier function can now be displaced upward by this estimate of E[u_i].
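The two displacement rules can be compared on simulated data. The following is a minimal sketch, assuming a deterministic frontier y = α + βx - u with a single regressor and exponential u; the function names, the simulated values, and the seed are illustrative choices, not part of the original slides.

```python
import random
import statistics

def ols_simple(x, y):
    """Least squares intercept and slope for a single regressor."""
    xbar, ybar = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope

def cols_mols(x, y):
    """Return (COLS intercept, MOLS intercept, slope)."""
    a, b = ols_simple(x, y)
    resid = [yi - a - b * xi for xi, yi in zip(x, y)]
    # COLS: shift the intercept up until the largest residual is zero.
    a_cols = a + max(resid)
    # MOLS (exponential u): shift by the residual standard deviation,
    # which estimates E[u] = lambda when there is no noise term v.
    a_mols = a + statistics.pstdev(resid)
    return a_cols, a_mols, b

# Simulated deterministic frontier (all values are assumptions for illustration).
rng = random.Random(12345)
alpha, beta, lam = 1.0, 0.5, 0.3
x = [rng.uniform(0.0, 4.0) for _ in range(5000)]
u = [rng.expovariate(1.0 / lam) for _ in x]      # exponential with mean lam
y = [alpha + beta * xi - ui for xi, ui in zip(x, u)]

a_cols, a_mols, b = cols_mols(x, y)
print("COLS intercept:", a_cols)
print("MOLS intercept:", a_mols)
print("slope:", b)
```

COLS needs no distributional assumption but depends on a single extreme residual; MOLS uses all residuals but is only as good as the assumed distribution of u_i.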
18
COLS and MOLS
19
Deterministic Frontier: Programming Estimators
20
Estimating Inefficiency
21
Statistical Problems with Programming Estimators: They do correspond to MLEs. The likelihood functions are “irregular.” There are no known statistical properties and no estimable covariance matrix for the estimates. They might be “robust,” like LAD. No one knows for sure; this has never been demonstrated.
22
A Model with a Statistical Basis
23
Extensions: Cost frontiers, based on duality results: ln y = f(x) - u, ln C = g(y,w) + u’, with u > 0 and u’ > 0. Economies of scale and allocative inefficiency blur the relationship. Corrected and modified least squares estimators based on the deterministic frontiers are easily constructed.
24
Data Envelopment Analysis
25
Methodological Problems: Measurement error. Outliers. Specification errors. The overall problem with the deterministic frontier approach.
26
Stochastic Frontier Models. Motivation: Factors not under control of the firm; measurement error; differential rates of adoption of technology. The frontier is randomly placed by the whole collection of stochastic elements which might enter the model outside the control of the firm. Aigner, Lovell, Schmidt (1977), Meeusen, van den Broeck (1977).
27
Stochastic Frontier Model: u_i > 0, but v_i may take any value. A symmetric distribution, such as the normal distribution, is usually assumed for v_i. Thus, the stochastic frontier is α + β’x_i + v_i and, as before, u_i represents the inefficiency.
28
Least Squares Estimation: Average inefficiency is embodied in the third moment of the disturbance ε_i = v_i - u_i. So long as E[v_i - u_i] is constant, the OLS estimates of the slope parameters of the frontier function are unbiased and consistent. (The constant term estimates α - E[u_i].) The average inefficiency present in the distribution is reflected in the asymmetry of the distribution, which can be estimated using the third moment of the OLS residuals e_i: m_3 = (1/N) Σ_i e_i^3.
29
Application to Spanish Dairy Farms
Input   Units                                                 Mean      Std. Dev.   Minimum    Maximum
Milk    Milk production (liters)                              131,108   92,539      14,110     727,281
Cows    # of milking cows                                     2.12      11.27       4.5        82.3
Labor   # man-equivalent units                                1.67      0.55        1.0        4.0
Land    Hectares of land devoted to pasture and crops         12.99     6.17        2.0        45.1
Feed    Total amount of feedstuffs fed to dairy cows (tons)   57,941    47,981      3,924.1    376,732
N = 247 farms, T = 6 years (1993-1998)
30
Example: Dairy Farms
31
The Normal-Half Normal Model
32
Normal-Half Normal Variable
33
Decomposition
34
Standard Form
35
Estimation: Least Squares/MoM. The OLS estimator of β is consistent. E[u_i] = (2/π)^(1/2) σ_u, so the OLS constant estimates α - (2/π)^(1/2) σ_u. The second and third moments of the OLS residuals estimate m_2 = σ_v^2 + (1 - 2/π) σ_u^2 and m_3 = (2/π)^(1/2) (1 - 4/π) σ_u^3.
36
A Problem with Method of Moments: The estimator of σ_u is [m_3 / (-0.21801)]^(1/3). The theoretical m_3 is < 0, but the sample m_3 may be > 0. If so, there is no solution for σ_u (a negative number raised to the 1/3 power).
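The moment estimator and its failure case can be sketched in a few lines. This is an illustration under the normal-half normal assumption, using the standard moment relations m_2 = σ_v^2 + (1 - 2/π) σ_u^2 and m_3 = (2/π)^(1/2) (1 - 4/π) σ_u^3; the function name, the simulated parameter values, and the choice to return None for a wrongly signed third moment are assumptions made here.

```python
import math
import random
import statistics

def mom_half_normal(resid):
    """Method-of-moments (sigma_u, sigma_v) from OLS residuals.

    Returns None when the sample third moment is positive, the case in
    which the estimator has no solution.
    """
    n = len(resid)
    ebar = statistics.fmean(resid)
    m2 = sum((e - ebar) ** 2 for e in resid) / n
    m3 = sum((e - ebar) ** 3 for e in resid) / n
    if m3 > 0:
        return None
    c = math.sqrt(2 / math.pi) * (1 - 4 / math.pi)   # about -0.21801
    sigma_u = (m3 / c) ** (1 / 3)
    # Guard against a slightly negative variance estimate in small samples.
    sv2 = m2 - (1 - 2 / math.pi) * sigma_u ** 2
    sigma_v = math.sqrt(max(sv2, 0.0))
    return sigma_u, sigma_v

# Simulated composed errors eps = v - u (values are illustrative only).
rng = random.Random(2007)
sigma_u_true, sigma_v_true = 0.5, 0.2
eps = [rng.gauss(0.0, sigma_v_true) - abs(rng.gauss(0.0, sigma_u_true))
       for _ in range(20000)]

est = mom_half_normal(eps)
print("estimates (sigma_u, sigma_v):", est)
print("wrong-skew sample:", mom_half_normal([-1.0, 0.0, 0.0, 4.0]))
```

Because the residuals are centered inside the function, the same code applies whether or not the regression included a constant.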
37
Likelihood Function: Waldman (1982) result on skewness of OLS residuals: If the OLS residuals are positively skewed, rather than negatively skewed, then OLS maximizes the log likelihood, and there is no evidence of inefficiency in the data.
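For reference, the log likelihood that the Waldman result refers to can be written out directly. The sketch below assumes the standard Aigner, Lovell, Schmidt parameterization, σ^2 = σ_u^2 + σ_v^2 and λ = σ_u/σ_v, with contribution ln(2/σ) + ln φ(ε_i/σ) + ln Φ(-ε_i λ/σ) per observation; the function name and the sample values are hypothetical.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal, used for pdf and cdf

def loglik_normal_half_normal(eps, sigma, lam):
    """Log likelihood of eps_i = v_i - u_i for the normal-half normal model.

    sigma is sqrt(sigma_u^2 + sigma_v^2) and lam is sigma_u / sigma_v.
    """
    total = 0.0
    for e in eps:
        total += (math.log(2.0 / sigma)
                  + math.log(_STD.pdf(e / sigma))
                  + math.log(_STD.cdf(-e * lam / sigma)))
    return total

# Hypothetical residuals, only to show the call.
eps_demo = [-0.4, -0.1, 0.05, 0.2, -0.7]
print(loglik_normal_half_normal(eps_demo, sigma=0.3, lam=0.0))
print(loglik_normal_half_normal(eps_demo, sigma=0.3, lam=1.5))
```

At λ = 0 the last term equals ln(1/2) for every observation and the function reduces to the normal (OLS) log likelihood, which is the point at which Waldman's result is evaluated.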
38
Alternative Model: Exponential
39
Normal-Exponential Likelihood
40
Truncated Normal Model
41
Normal-Truncated Normal
42
Other Models: Other parametric models (we will examine gamma later in the course). Semiparametric and nonparametric: the recent outer reaches of the theoretical literature. Other variations, including heterogeneity in the frontier function and in the distribution of inefficiency.
43
Estimating u_i: There is no direct estimate of u_i. The data permit estimation of y_i - β’x_i. Can this be used? ε_i = y_i - β’x_i = v_i - u_i. Indirect estimate of u_i, using E[u_i | v_i - u_i]. v_i - u_i is estimable with e_i = y_i - b’x_i.
44
Fundamental Tool - JLMS: We can insert our maximum likelihood estimates of all parameters. Note: This estimates E[u_i | v_i - u_i], not u_i.
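A minimal sketch of the JLMS conditional mean for the normal-half normal case follows. It assumes the standard result that u_i given ε_i is N[μ*, σ*^2] truncated at zero, with μ* = -ε_i σ_u^2/σ^2 and σ* = σ_u σ_v/σ; the function name and the parameter values in the example are illustrative, and in practice the ML estimates would be inserted.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal

def jlms(eps, sigma_u, sigma_v):
    """E[u | eps] for the normal-half normal model, where eps = v - u."""
    s2 = sigma_u ** 2 + sigma_v ** 2
    mu_star = -eps * sigma_u ** 2 / s2               # mean before truncation
    sig_star = sigma_u * sigma_v / math.sqrt(s2)     # std. dev. before truncation
    z = mu_star / sig_star
    # Mean of a normal truncated from below at zero.
    return mu_star + sig_star * _STD.pdf(z) / _STD.cdf(z)

# Illustrative values only.
for e in (-0.5, 0.0, 0.5):
    print(e, jlms(e, sigma_u=0.5, sigma_v=0.2))
```

The estimate falls as ε_i rises but never reaches zero, which reflects the note above: the quantity is a conditional mean of u_i, not u_i itself.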
45
Other Distributions
46
Efficiency
47
Application: Electricity Generation
48
Estimated Translog Production Frontiers
49
Inefficiency Estimates
50
Estimated Inefficiency Distribution
51
Confidence Region
52
Application (Based on Costs)
53
Multiple Output Frontier: The formal theory of production departs from the transformation function that links the vector of outputs, y, to the vector of inputs, x: T(y,x) = 0. As it stands, some further assumptions are needed to produce the framework for an empirical model. By assuming homothetic separability, the function may be written in the form A(y) = f(x).
54
Multiple Output Production Function: Inefficiency in this setting reflects the failure of the firm to achieve the maximum aggregate output attainable. Note that the model does not address the economic question of whether the chosen output mix is optimal with respect to the output prices and input costs. That would require a profit function approach. Berger (1993) and Adams et al. (1999) apply the method to a panel of U.S. banks: 798 banks, ten years.
55
Duality Between Production and Cost
56
Implied Cost Frontier Function
57
Stochastic Cost Frontier
58
Cobb-Douglas Cost Frontier
59
Translog Cost Frontier
60
Restricted Translog Cost Function
61
Cost Application to C&G Data
62
Estimates of Economic Efficiency
63
Duality – Production vs. Cost
64
Multiple Output Cost Frontier
65
Allocative Inefficiency and Economic Inefficiency. Technical inefficiency: Off the isoquant. Allocative inefficiency: Wrong input mix.
66
Cost Structure – Demand System
67
Cost Frontier Model
68
The Greene Problem: Factor shares are derived from the cost function by differentiation. Where does e_k come from? Any nonzero value of e_k, which can be positive or negative, must translate into higher costs. Thus, u must be a function of e_1,…,e_K such that ∂u/∂e_k > 0. No one had derived a complete, internally consistent equation system: the Greene problem. Solution: Kumbhakar in several recent papers. Very complicated, near to impractical. Apparently not of interest to practitioners.
69
Observable Heterogeneity: As opposed to unobservable heterogeneity. Observe: Y or C (outcome) and X or w (inputs or input prices). Firm characteristics z: not production or cost, but they characterize the production process. Enter the production or cost function? Enter the inefficiency distribution? How?
70
Shifting the Outcome Function: Firm specific heterogeneity can also be incorporated into the inefficiency model as follows. This modifies the mean of the truncated normal distribution: y_i = β’x_i + v_i - u_i, v_i ~ N[0, σ_v^2], u_i = |U_i| where U_i ~ N[μ_i, σ_u^2], μ_i = μ_0 + μ_1’z_i.
72
Heterogeneous Mean
73
Estimated Efficiency
74
One Step or Two Step. Two step: Fit half or truncated normal model, compute JLMS u_i, regress u_i on z_i. Airline example: Fit model without POINTS, LOADFACTOR, STAGE. One step: Include z_i in the model, compute u_i including z_i. Airline example: Include 3 variables. Methodological issue: Left out variables in the two step approach.
75
WHO Health Care Study
76
Application: WHO Data
77
One vs. Two Step
78
Unobservable Heterogeneity: Parameters vary across firms. Random variation (heterogeneity, not Bayesian). Variation partially explained by observable indicators. Continuous variation: random parameter models, considered with panel data models. Latent class: discrete parameter variation.
79
A Latent Class Model
80
Latent Class Application Banking Costs
81
Heteroscedasticity in v and/or u: Var[v_i | h_i] = σ_v^2 g_v(h_i, δ) = σ_vi^2, g_v(h_i, 0) = 1, g_v(h_i, δ) = [exp(δ^T h_i)]^2. Var[U_i | h_i] = σ_u^2 g_u(h_i, τ) = σ_ui^2, g_u(h_i, 0) = 1, g_u(h_i, τ) = [exp(τ^T h_i)]^2.
82
Application: WHO Data
83
A “Scaling” Model
84
Model Extensions: Simulation based estimators. Normal-gamma frontier model. Bayesian estimation of stochastic frontiers. Similar model structures. Similar estimation methodologies. Similar results.
85
Normal-Gamma: Very flexible model. VERY difficult log likelihood function. Bayesians love it. Conjugate functional forms for other model parts.
86
Normal-Gamma Model: z ~ N[-ε_i - σ_v^2/σ_u, σ_v^2]. q(r,ε_i) is extremely difficult to compute.
87
Normal-Gamma
88
Simulating the Likelihood: ε_i = y_i - β^T x_i, μ_i = -ε_i - σ_v^2/σ_u, σ = σ_v, and P_L = Φ(-μ_i/σ), and F_q is a draw from the continuous uniform(0,1) distribution.
89
Application to C&G Data: This is the standard data set for developing and testing exponential, gamma, and Bayesian estimators.
90
Application to C&G Data
91
Bayesian Estimation: Short history, first developed post 1995. Range of applications. Largely replicated existing classical methods. Recent applications have extended received approaches. Common features of the application.
92
Bayesian Formulation of SF Model. Normal-Exponential Model: v_i - u_i = y_i - α - β^T x_i. Estimation proceeds (in principle) by specifying priors over Θ = (α, β, σ_v, σ_u), then deriving inferences from the joint posterior p(Θ|data). In general, the joint posterior for this model cannot be derived in closed form, so direct analysis is not feasible. Using Gibbs sampling and known conditional posteriors, it is possible to use Markov Chain Monte Carlo (MCMC) methods to sample from the marginal posteriors and use that device to learn about the parameters and inefficiencies. In particular, for the model parameters, we are interested in estimating E[Θ|data], Var[Θ|data] and, perhaps, even more fully characterizing the density f(Θ|data).
93
Estimating Inefficiency: One might, ex post, estimate E[u_i|data]; however, it is more natural in this setting to include (u_1,...,u_N) with Θ, and estimate the conditional means with those of the other parameters. The method is known as data augmentation.
94
Priors Over Parameters
95
Priors for Inefficiencies
96
Posterior
98
Gibbs Sampling: Conditional Posteriors
99
Bayesian Normal-Gamma Model. Tsionas (2002): Erlang form, integer P; “random parameters”; applied to C&G. River Huang (2004): fully general; applied (as usual) to C&G.
100
Bayesian and Classical Results
101
Methodological Comparison: Bayesian vs. classical interpretation. Practical results: Bernstein-von Mises theorem in the presence of diffuse priors. Kim and Schmidt comparison (JPA, 2000). Important difference: tight priors over u_i in this context. Conclusions?