Econometrics in Health Economics Discrete Choice Modeling and Frontier Modeling and Efficiency Estimation Professor William Greene Stern School of Business.

2 Econometrics in Health Economics Discrete Choice Modeling and Frontier Modeling and Efficiency Estimation Professor William Greene Stern School of Business New York University September 2-4, 2007

3 Frontier and Efficiency Estimation. Session 5: Efficiency Analysis; Stochastic Frontier Model; Efficiency Estimation. Session 6: Panel Data Models and Heterogeneity; Fixed and Random Effects; Bayesian and Classical Estimation. Session 7: Efficiency Models; Stochastic Frontier and Data Envelopment Analysis; Student Presentation: Silvio Daidone and Francesco D’Amico. Session 8: Computer Exercises and Applications.

4 The Production Function “A single output technology is commonly described by means of a production function f(z) that gives the maximum amount q of output that can be produced using input amounts (z_1,…,z_{L-1}) > 0.” Microeconomic Theory, Mas-Colell, Whinston, Green: Oxford, 1995, p. 129. See also Samuelson (1938) and Shephard (1953).

5 Thoughts on Inefficiency: Failure to achieve the theoretical maximum. Hicks (ca. 1935) on the benefits of monopoly; Leibenstein (ca. 1966): X inefficiency; Debreu, Farrell (1950s) on management inefficiency. All related to firm behavior in the absence of market restraint, that is, the exercise of market power.

6 A History of Empirical Investigation: Cobb-Douglas (1927); Arrow, Chenery, Minhas, Solow (1963); Joel Dean (1940s, 1950s); Johnston (1950s); Nerlove (1960); Christensen et al. (1972)

7 Inefficiency in the “Real” World: Measurement of inefficiency in “markets” with heterogeneous production outcomes: Aigner and Chu (1968); Timmer (1971); Aigner, Lovell, Schmidt (1977); Meeusen, van den Broeck (1977)

8 Production Functions: Production is a process of transformation of a set of inputs, denoted x, into a set of outputs, y. Transformation of inputs to outputs is via the transformation function: T(y,x) = 0.

9 Defining the Production Set Level set: The Production function is defined by the isoquant The efficient subset is defined in terms of the level sets:

10 Isoquants and Level Sets

11 The Distance Function

12 Inefficiency

13 Production Function Model with Inefficiency

14 Cost Inefficiency y* = f(x) ⇒ C* = g(y*,w) (Samuelson-Shephard duality results). Cost inefficiency: If y < f(x), then C must be greater than g(y,w). This implies the idea of a cost frontier: ln C = ln g(y,w) + u, u > 0.

15 Specification

16 Corrected Ordinary Least Squares

17 Modified OLS An alternative approach that requires a parametric model of the distribution of u_i is modified OLS (MOLS). The OLS residuals, save for the constant displacement, are pointwise consistent estimates of their population counterparts, −u_i. Suppose that u_i has an exponential distribution with mean λ. Then the variance of u_i is λ², so the standard deviation of the OLS residuals is a consistent estimator of E[u_i] = λ. Since this is a one-parameter distribution, the entire model for u_i can be characterized by this parameter and functions of it. The estimated frontier function can now be displaced upward by this estimate of E[u_i].
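The displacement step can be illustrated with a small simulation. This is only a sketch under stated assumptions: a single regressor, a deterministic frontier y = 1 + 0.5x − u with exponential u of mean 0.3, and hypothetical helper names (ols_simple, cols_intercept, mols_intercept_exponential) that are not from the slides. COLS is taken here in its usual sense of shifting the OLS intercept up by the largest residual.

```python
import random
import statistics

def ols_simple(x, y):
    """Least squares fit of y = a + b*x; returns (a, b, residuals)."""
    xbar, ybar = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = ybar - b * xbar
    resid = [yi - a - b * xi for xi, yi in zip(x, y)]
    return a, b, resid

def cols_intercept(a, resid):
    """COLS: shift the OLS intercept up until no residual is positive."""
    return a + max(resid)

def mols_intercept_exponential(a, resid):
    """MOLS with exponential u: the residual std. dev. estimates E[u] = lambda."""
    lam_hat = statistics.pstdev(resid)
    return a + lam_hat, lam_hat

# Simulated data (illustrative values only, not from the presentation).
rng = random.Random(12345)
x = [rng.uniform(0.0, 10.0) for _ in range(5000)]
u = [rng.expovariate(1.0 / 0.3) for _ in x]      # exponential with mean 0.3
y = [1.0 + 0.5 * xi - ui for xi, ui in zip(x, u)]

a, b, resid = ols_simple(x, y)                   # a estimates alpha - E[u]
alpha_cols = cols_intercept(a, resid)
alpha_mols, lam_hat = mols_intercept_exponential(a, resid)
print(a, b, alpha_cols, alpha_mols, lam_hat)
```

In this setup the OLS slope is unaffected by the one-sided disturbance, while both corrections move the intercept from roughly α − λ back toward α.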

18 COLS and MOLS

19 Deterministic Frontier: Programming Estimators

20 Estimating Inefficiency

21 Statistical Problems with Programming Estimators: They do correspond to MLEs. The likelihood functions are “irregular.” There are no known statistical properties and no estimable covariance matrix for the estimates. They might be “robust,” like LAD, but no one knows for sure; this has never been demonstrated.

22 A Model with a Statistical Basis

23 Extensions: Cost frontiers, based on duality results: ln y = f(x) − u ⇒ ln C = g(y,w) + u′, with u > 0 and u′ > 0. Economies of scale and allocative inefficiency blur the relationship. Corrected and modified least squares estimators based on the deterministic frontiers are easily constructed.

24 Data Envelopment Analysis

25 Methodological Problems  Measurement error  Outliers  Specification errors  The overall problem with the deterministic frontier approach

26 Stochastic Frontier Models: Motivation: factors not under control of the firm; measurement error; differential rates of adoption of technology. The frontier is randomly placed by the whole collection of stochastic elements which might enter the model outside the control of the firm. Aigner, Lovell, Schmidt (1977); Meeusen, van den Broeck (1977)

27 Stochastic Frontier Model u_i > 0, but v_i may take any value. A symmetric distribution, such as the normal distribution, is usually assumed for v_i. Thus, the stochastic frontier is α + β’x_i + v_i and, as before, u_i represents the inefficiency.

28 Least Squares Estimation Average inefficiency is embodied in the third moment of the disturbance ε_i = v_i − u_i. So long as E[v_i − u_i] is constant, the OLS estimates of the slope parameters of the frontier function are unbiased and consistent. (The constant term estimates α − E[u_i].) The average inefficiency present in the distribution is reflected in the asymmetry of the distribution, which can be estimated using the third moment of the OLS residuals.
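A quick way to see the asymmetry is to compute the third central moment of a simulated composed error. The sketch below assumes a normal v and a half-normal u with illustrative standard deviations (0.2 and 0.5); deviations from the sample mean stand in for OLS residuals from a constant-only regression, and the function name is hypothetical.

```python
import random
import statistics

def third_central_moment(values):
    """Average cubed deviation from the sample mean."""
    mean = statistics.fmean(values)
    return sum((v - mean) ** 3 for v in values) / len(values)

# Simulated composed error eps = v - u (assumed distributions, for illustration).
rng = random.Random(2007)
n = 20000
v = [rng.gauss(0.0, 0.2) for _ in range(n)]          # symmetric noise
u = [abs(rng.gauss(0.0, 0.5)) for _ in range(n)]     # one-sided inefficiency
eps = [vi - ui for vi, ui in zip(v, u)]

m3 = third_central_moment(eps)
print(m3)   # negative when inefficiency is present in a production frontier
```

The symmetric component v contributes nothing to the third moment, so any negative skewness comes from u.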

29 Application to Spanish Dairy Farms
Variable  Units                                                Mean       Std. Dev.  Minimum    Maximum
Milk      Milk production (liters)                             131,108    92,539     14,110     727,281
Cows      Number of milking cows                               22.12      11.27      4.5        82.3
Labor     Man-equivalent units                                 1.67       0.55       1.0        4.0
Land      Hectares of land devoted to pasture and crops        12.99      6.17       2.0        45.1
Feed      Total amount of feedstuffs fed to dairy cows (tons)  57,941     47,981     3,924.1    376,732
N = 247 farms, T = 6 years (1993-1998)

30 Example: Dairy Farms

31 The Normal-Half Normal Model

32 Normal-Half Normal Variable

33 Decomposition

34 Standard Form

35 Estimation: Least Squares/MoM: The OLS estimator of β is consistent. E[u_i] = (2/π)^(1/2) σ_u, so the OLS constant estimates α − (2/π)^(1/2) σ_u. The second and third moments of the OLS residuals estimate m_2 = σ_v² + [(π − 2)/π] σ_u² and m_3 = (2/π)^(1/2) (1 − 4/π) σ_u³, respectively.

36 A Problem with Method of Moments: The estimator of σ_u is [m_3/(−0.21801)]^(1/3). The theoretical m_3 is < 0, but the sample m_3 may be > 0. If so, there is no solution for σ_u (a negative number raised to the 1/3 power).
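The failure case is easy to make explicit in code. The sketch below assumes the standard normal-half normal moment results, m_2 = σ_v² + (1 − 2/π)σ_u² and m_3 = (2/π)^(1/2)(1 − 4/π)σ_u³, where the constant is the −0.21801 quoted above. The function name and the choice to return None on failure are illustrative, not part of the presentation.

```python
import math

def mom_half_normal(m2, m3):
    """Method-of-moments estimates (sigma_u, sigma_v) from the second and
    third central moments of OLS residuals, or None when no solution exists."""
    c3 = math.sqrt(2.0 / math.pi) * (1.0 - 4.0 / math.pi)   # about -0.21801
    if m3 >= 0:
        # Wrong-signed skewness: m3 / c3 is not positive, so no valid sigma_u.
        return None
    sigma_u = (m3 / c3) ** (1.0 / 3.0)
    var_v = m2 - (1.0 - 2.0 / math.pi) * sigma_u ** 2
    if var_v <= 0:
        # A second failure mode: the implied variance of v is not positive.
        return None
    return sigma_u, math.sqrt(var_v)

# Moments implied by sigma_u = 0.5, sigma_v = 0.2 recover those values.
m2_true = 0.2 ** 2 + (1.0 - 2.0 / math.pi) * 0.5 ** 2
m3_true = math.sqrt(2.0 / math.pi) * (1.0 - 4.0 / math.pi) * 0.5 ** 3
print(mom_half_normal(m2_true, m3_true))
print(mom_half_normal(0.1, 0.01))   # positive skewness: no solution
```

Returning None rather than raising keeps the wrong-skew case visible to the caller, which is the situation the Waldman result addresses.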

37 Likelihood Function Waldman (1982) result on skewness of OLS residuals: If the OLS residuals are positively skewed, rather than negatively, then OLS maximizes the log likelihood, and there is no evidence of inefficiency in the data.

38 Alternative Model: Exponential

39 Normal-Exponential Likelihood

40 Truncated Normal Model

41 Normal-Truncated Normal

42 Other Models  Other Parametric Models (we will examine gamma later in the course)  Semiparametric and nonparametric – the recent outer reaches of the theoretical literature  Other variations including heterogeneity in the frontier function and in the distribution of inefficiency

43 Estimating u_i: There is no direct estimate of u_i. The data permit estimation of y_i − β’x_i. Can this be used? ε_i = y_i − β’x_i = v_i − u_i. An indirect estimate of u_i uses E[u_i | v_i − u_i]; v_i − u_i is estimable with e_i = y_i − b’x_i.

44 Fundamental Tool - JLMS We can insert our maximum likelihood estimates of all parameters. Note: This estimates E[u_i | v_i − u_i], not u_i.
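For the normal-half normal case, the conditional mean has a closed form that can be evaluated once parameter estimates are plugged in. The sketch below assumes the usual Jondrow, Lovell, Materov, Schmidt result that u given ε is normal with mean μ* = −ε σ_u²/σ² and standard deviation σ* = σ_u σ_v/σ, truncated at zero, with σ² = σ_u² + σ_v². The function name and the input values are illustrative.

```python
import math
from statistics import NormalDist

def jlms_half_normal(eps, sigma_u, sigma_v):
    """E[u | eps] for a production frontier with eps = v - u,
    v ~ N(0, sigma_v^2) and u half normal with scale sigma_u."""
    std_normal = NormalDist()
    sigma2 = sigma_u ** 2 + sigma_v ** 2
    mu_star = -eps * sigma_u ** 2 / sigma2
    sigma_star = sigma_u * sigma_v / math.sqrt(sigma2)
    ratio = mu_star / sigma_star
    # Mean of a normal(mu_star, sigma_star^2) truncated to the positive axis.
    return mu_star + sigma_star * std_normal.pdf(ratio) / std_normal.cdf(ratio)

# A large negative residual implies more estimated inefficiency than a positive one.
print(jlms_half_normal(-1.0, 0.5, 0.2))
print(jlms_half_normal(1.0, 0.5, 0.2))
```

The estimate is always positive and decreases as the residual increases, but it never collapses to the realized u_i because v_i remains in the conditioning information.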

45 Other Distributions

46 Efficiency

47 Application: Electricity Generation

48 Estimated Translog Production Frontiers

49 Inefficiency Estimates

50 Estimated Inefficiency Distribution

51 Confidence Region

52 Application (Based on Costs)

53 Multiple Output Frontier  The formal theory of production departs from the transformation function that links the vector of outputs, y to the vector of inputs, x; T(y,x) = 0.  As it stands, some further assumptions are obviously needed to produce the framework for an empirical model. By assuming homothetic separability, the function may be written in the form A(y) = f(x).

54 Multiple Output Production Function Inefficiency in this setting reflects the failure of the firm to achieve the maximum aggregate output attainable. Note that the model does not address the economic question of whether the chosen output mix is optimal with respect to the output prices and input costs. That would require a profit function approach. Berger (1993) and Adams et al. (1999) apply the method to a panel of U.S. banks – 798 banks, ten years.

55 Duality Between Production and Cost

56 Implied Cost Frontier Function

57 Stochastic Cost Frontier

58 Cobb-Douglas Cost Frontier

59 Translog Cost Frontier

60 Restricted Translog Cost Function

61 Cost Application to C&G Data

62 Estimates of Economic Efficiency

63 Duality – Production vs. Cost

64 Multiple Output Cost Frontier

65 Allocative Inefficiency and Economic Inefficiency Technical inefficiency: Off the isoquant. Allocative inefficiency: Wrong input mix.

66 Cost Structure – Demand System

67 Cost Frontier Model

68 The Greene Problem: Factor shares are derived from the cost function by differentiation. Where does e_k come from? Any nonzero value of e_k, which can be positive or negative, must translate into higher costs. Thus, u must be a function of e_1,…,e_K such that ∂u/∂e_k > 0. No one had derived a complete, internally consistent equation system: the Greene problem. Solution: Kumbhakar in several recent papers. Very complicated, near to impractical. Apparently not of interest to practitioners.

69 Observable Heterogeneity  As opposed to unobservable heterogeneity  Observe: Y or C (outcome) and X or w (inputs or input prices)  Firm characteristics z. Not production or cost, characterize the production process. Enter the production or cost function? Enter the inefficiency distribution? How?

70 Shifting the Outcome Function Firm-specific heterogeneity can also be incorporated into the inefficiency model as follows. This modifies the mean of the truncated normal distribution: y_i = β’x_i + v_i − u_i, v_i ~ N[0, σ_v²], u_i = |U_i| where U_i ~ N[μ_i, σ_u²], μ_i = μ_0 + μ_1 z_i.

72 Heterogeneous Mean

73 Estimated Efficiency

74 One Step or Two Step. Two step: Fit the half normal or truncated normal model, compute the JLMS estimate of u_i, then regress it on z_i. Airline example: Fit the model without POINTS, LOADFACTOR, STAGE. One step: Include z_i in the model and compute u_i including z_i. Airline example: Include the 3 variables. Methodological issue: Left-out variables in the two-step approach.

75 WHO Health Care Study

76 Application: WHO Data

77 One vs. Two Step

78 Unobservable Heterogeneity  Parameters vary across firms Random variation (heterogeneity, not Bayesian) Variation partially explained by observable indicators  Continuous variation – random parameter models: Considered with panel data models  Latent class – discrete parameter variation

79 A Latent Class Model

80 Latent Class Application Banking Costs

81 Heteroscedasticity in v and/or u Var[v_i | h_i] = σ_v² g_v(h_i, δ) = σ_vi², with g_v(h_i, 0) = 1 and g_v(h_i, δ) = [exp(δ’h_i)]². Var[U_i | h_i] = σ_u² g_u(h_i, τ) = σ_ui², with g_u(h_i, 0) = 1 and g_u(h_i, τ) = [exp(τ’h_i)]².
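The exponential scaling keeps each firm-specific standard deviation positive and reduces to the homoscedastic model when the coefficient vector is zero. A minimal sketch, assuming the form Var = σ² [exp(c’h_i)]² with a coefficient vector c (one such vector for v and another for U); the function name and example values are hypothetical.

```python
import math

def scaled_sigma(sigma, coeffs, h):
    """Firm-specific standard deviation sigma_i = sigma * exp(coeffs'h_i),
    so that the variance is sigma^2 * [exp(coeffs'h_i)]^2."""
    index = sum(c * hk for c, hk in zip(coeffs, h))
    return sigma * math.exp(index)

# Zero coefficients give back the homoscedastic value, since g(h_i, 0) = 1.
print(scaled_sigma(0.5, [0.0, 0.0], [1.0, 2.0]))
# A nonzero coefficient rescales sigma for this firm.
print(scaled_sigma(0.5, [0.1], [2.0]))
```

The resulting sigma_i values can replace the constant σ_v or σ_u wherever they appear in the likelihood or in the inefficiency estimator.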

82 Application: WHO Data

83 A “Scaling” Model

84 Model Extensions  Simulation Based Estimators Normal-Gamma Frontier Model Bayesian Estimation of Stochastic Frontiers  Similar Model Structures  Similar Estimation Methodologies  Similar Results

85 Normal-Gamma Very flexible model. VERY difficult log likelihood function. Bayesians love it. Conjugate functional forms for other model parts

86 Normal-Gamma Model z ~ N[−ε_i − σ_v²/σ_u, σ_v²]. q(r, ε_i) is extremely difficult to compute.

87 Normal-Gamma

88 Simulating the Likelihood ε_i = y_i − β’x_i, μ_i = −ε_i − σ_v²/σ_u, σ = σ_v, P_L = Φ(−μ_i/σ), and F_q is a draw from the continuous uniform(0,1) distribution.
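These ingredients are what an inverse-CDF sampler for a truncated normal needs. The sketch below assumes that q(r, ε_i) is the expectation of z^r for z ~ N[μ_i, σ²] truncated to z > 0, with μ_i = −ε_i − σ_v²/σ_u, σ = σ_v and P_L = Φ(−μ_i/σ), and that each draw is formed as μ_i + σ Φ⁻¹(P_L + F_q(1 − P_L)). The function name, the number of draws, and the clamping of probabilities are illustrative choices.

```python
import random
from statistics import NormalDist

def simulate_q(r, eps, sigma_u, sigma_v, n_draws=2000, seed=0):
    """Simulated E[z^r | z > 0], z ~ N(mu, sigma_v^2),
    with mu = -eps - sigma_v^2 / sigma_u."""
    std_normal = NormalDist()
    rng = random.Random(seed)
    mu = -eps - sigma_v ** 2 / sigma_u
    sigma = sigma_v
    p_l = std_normal.cdf(-mu / sigma)      # probability mass below zero
    total = 0.0
    for _ in range(n_draws):
        f_q = rng.random()                 # uniform(0,1) draw
        p = p_l + f_q * (1.0 - p_l)        # map into the upper tail (p_l, 1)
        p = min(max(p, 1e-12), 1.0 - 1e-12)   # keep inv_cdf inside its domain
        z = max(mu + sigma * std_normal.inv_cdf(p), 0.0)
        total += z ** r
    return total / n_draws

# With r = 1 the simulation should approximate the truncated normal mean.
print(simulate_q(1, -0.5, 0.5, 0.2))
```

Reusing the same uniform draws across likelihood evaluations (here via a fixed seed) keeps the simulated function smooth in the parameters, which matters for a gradient-based optimizer.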

89 Application to C&G Data This is the standard data set for developing and testing Exponential, Gamma, and Bayesian estimators.

90 Application to C&G Data

91 Bayesian Estimation  Short history – first developed post 1995  Range of applications Largely replicated existing classical methods Recent applications have extended received approaches  Common features of the application

92 Bayesian Formulation of SF Model Normal-Exponential Model: v_i − u_i = y_i − α − β’x_i. Estimation proceeds (in principle) by specifying priors over Θ = (α, β, σ_v, σ_u), then deriving inferences from the joint posterior p(Θ|data). In general, the joint posterior for this model cannot be derived in closed form, so direct analysis is not feasible. Using Gibbs sampling and known conditional posteriors, it is possible to use Markov Chain Monte Carlo (MCMC) methods to sample from the marginal posteriors and use that device to learn about the parameters and inefficiencies. In particular, for the model parameters, we are interested in estimating E[Θ|data] and Var[Θ|data] and, perhaps, even more fully characterizing the density f(Θ|data).

93 Estimating Inefficiency One might, ex post, estimate E[u_i | data]; however, it is more natural in this setting to include (u_1,...,u_N) with Θ, and estimate the conditional means with those of the other parameters. The method is known as data augmentation.

94 Priors Over Parameters

95 Priors for Inefficiencies

96 Posterior

98 Gibbs Sampling: Conditional Posteriors

99 Bayesian Normal-Gamma Model. Tsionas (2002): Erlang form (integer P); “random parameters”; applied to C&G. River Huang (2004): fully general; applied (as usual) to C&G.

100 Bayesian and Classical Results

101 Methodological Comparison: Bayesian vs. classical interpretation. Practical results: Bernstein-von Mises theorem in the presence of diffuse priors. Kim and Schmidt comparison (JPA, 2000). Important difference: tight priors over u_i in this context. Conclusions?

