NONPARAMETRIC LEAST SQUARES ESTIMATION IN DERIVATIVE FAMILIES Data On Derivatives, The Curse Of Dimensionality and Cost Function Estimation Adonis Yatchew.


1 NONPARAMETRIC LEAST SQUARES ESTIMATION IN DERIVATIVE FAMILIES Data On Derivatives, The Curse Of Dimensionality and Cost Function Estimation Adonis Yatchew Economics Department, University of Toronto Joint work with Peter Hall, Department of Mathematics and Statistics, University of Melbourne. Econometric Society Meetings, Boston, June 6, 2009.

2 Introduction Nonparametric modeling in economics suffers from the curse of dimensionality. The presence of multiple, sometimes many, explanatory variables and the inability to conduct controlled experiments mean that nonparametric modeling in economics is particularly affected. The curse can be mitigated through semiparametric specifications (partial linear model, index model), additive separability, and symmetry restrictions (e.g., radial symmetry).

3 Introduction Main idea: data on derivatives can substantially mitigate the curse of dimensionality. Settings where data may be available on a function and (some of) its derivatives: Cost function and factor demands (Shephard's Lemma). Production function and factor prices. Systems modelled through optimization of an objective function: data may be available on the function and on first-order conditions. Other examples. Cost function estimation encompasses a family of examples where costs are minimized subject to a set of constraints; the envelope theorem then yields relationships between the quantities chosen, the constraint parameters, and the Lagrange multipliers associated with the constraints. Production function estimation: data on factor prices contain information on the slopes of isoquants. Data on option prices combined with direct data on 'market expectations': suppose one had data of the form "the probability that the value of the Dow Jones will not exceed 10,600 by the end of next month, when the options expire" (or that the FTSE will not exceed 5,000). Experimental economics.

4 Local Averaging Estimators
Given data on g(x1, x2) and on one of its partial derivatives, g can be estimated as though it were a function of a single nonparametric variable rather than two.

5 Local Averaging Estimators
[Figure: local averaging over a window of width h in the (x1, x2) plane.]

6 Local Averaging Estimators
In conventional nonparametric estimation, the rate of convergence is given by n^{-d/(2d+k)}, where d is the number of bounded derivatives and k the dimension of x. If data on partial derivatives are available, then a rate of n^{-d/(2d+k-p)} can be achieved, where p is the dimension of the largest sub-vector of x for which all own and cross first-order partials are observed (Hall and Yatchew 2007). However, partial differentiation destroys certain additively separable portions of the function, which can be recovered through data on lower-order partials or data on the function itself. Data on partial derivatives eliminate local averaging in certain directions: local averaging is replaced with global averaging in those directions. OBJECTIVES: Rate of convergence. Incorporation of data on various derivatives, data on functions of derivatives, and additional constraints on the function or its derivatives. Selection of smoothing parameter(s).
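
The rate improvement can be made concrete with a short numeric sketch (the helper `rate` is illustrative; it simply evaluates the n^{-d/(2d+k-p)} exponent stated above):

```python
def rate(d, k, p=0):
    """Exponent in the convergence rate n^{-d/(2d + k - p)}: d bounded
    derivatives, k nonparametric regressors, and all own and cross
    first-order partials observed for a p-dimensional sub-vector of x."""
    return d / (2 * d + k - p)

# Cost function of output plus three input prices (k = 4), smoothness d = 2:
print(rate(2, 4))       # no derivative data: exponent 1/4
print(rate(2, 4, 3))    # factor demands observed (p = 3): exponent 2/5
print(rate(2, 1))       # a single nonparametric variable: also 2/5
```

With three of the four first-order partials observed, the exponent matches that of a one-dimensional problem, which is the source of the cost-function example later in the talk.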

7 Local Averaging -- Disadvantages
Local averaging will not, in general, optimally combine data from various derivatives. Data on functions of derivatives cannot be easily incorporated. It is a multi-step estimation procedure that may require smoothing-parameter selection at each step. It is non-trivial to introduce additional constraints (e.g., monotonicity, convexity). Other disadvantages.

8 Optimization Estimators
Let J index a family of observed derivatives. For each j in J we have data on the corresponding derivative of the function. Consider minimizing the combined sum of squared residuals over a smooth class of functions: for example, given data on the function and its first derivative, minimize the sum of squared level residuals plus the sum of squared derivative residuals. Advantages: Conveniently allow incorporation of data on various derivatives within a single optimization problem. With sufficient derivative data, optimal smoothing parameters may be selected without recourse to cross-validation. Data on functions of derivatives can be incorporated. Additional constraints (e.g., monotonicity, convexity) can be added. Additional slides: Optimization problem -- series estimator. Cosine orthonormal basis. Expansion of IMSE: in general, under sufficient smoothness assumptions, var is O(1/n) and bias^2 is o(1/n). Generalizations: p dimensions, non-uniform design, target-function estimation. Asymptotic distribution of ISE. Selection of the smoothing parameter can be achieved without cross-validation.

9 Cosine Series Estimation – Scalar x’s
Write g(x) = sum_{j=0}^{r} a_j phi_j(x), where phi_0(x) = 1 and phi_j(x) = sqrt(2) cos(pi j x), and consider GLS regression -- or mixed regression -- of the level and derivative data on the basis functions. Moreover, the "X" matrix for the level regression is close to the identity matrix, while the "X" matrix for the derivative regression is close to a diagonal matrix. In particular, the basis functions are orthonormal and the derivatives of the basis functions are orthogonal (but not orthonormal). Except that one needs to select r, the smoothing parameter.
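
A minimal sketch of the stacked level-and-derivative regression with the cosine basis (equal weighting of the two regressions rather than full GLS; the target function, noise level, and cutoff r are all illustrative choices, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(0)
n, r, sigma = 200, 12, 0.1                  # sample size, series cutoff, noise sd

g  = lambda x: np.exp(-x) * np.sin(3 * x)   # illustrative target function
gp = lambda x: np.exp(-x) * (3 * np.cos(3 * x) - np.sin(3 * x))  # its derivative

x = np.linspace(0, 1, n)
y_lvl = g(x)  + sigma * rng.standard_normal(n)   # noisy data on g
y_der = gp(x) + sigma * rng.standard_normal(n)   # noisy data on g'

j = np.arange(r)
# Cosine basis on [0,1]: phi_0 = 1, phi_j = sqrt(2) cos(pi j x), and its derivative.
Phi  = np.where(j == 0, 1.0, np.sqrt(2) * np.cos(np.pi * j * x[:, None]))
dPhi = -np.sqrt(2) * np.pi * j * np.sin(np.pi * j * x[:, None])

# Stack the level and derivative regressions and solve by least squares.
X = np.vstack([Phi, dPhi])
y = np.concatenate([y_lvl, y_der])
a_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

g_hat = Phi @ a_hat
rmse = float(np.sqrt(np.mean((g_hat - g(x)) ** 2)))
print(rmse)                                  # small: derivative data sharpen the fit
```

The derivative design matrix dPhi grows with j, so the derivative data pin down the high-frequency coefficients especially well, which is the informal reason the variance term shrinks faster.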

10 Cosine Series Estimation
Assume regularly spaced x's and g sufficiently smooth. For scalar x, with data on g and g', the variance term is O(1/n). Contrast this with the canonical case where only level data are observed and var = sigma^2 r/n. Under slightly stronger conditions on the series coefficients, the bias^2 term is o(1/n). This result implies root-n consistency of the estimator. Suppose data are available for g and g': if g' is square-integrable and O(n^{1/2}) <= r <= O(n), then the estimator converges at O(n^{-1}); if g'' is square-integrable and O(n^{1/3}) <= r <= O(n), then the estimator converges at O(n^{-1}).

11 Cosine Series Estimation – Bivariate x’s
Write g(x1, x2) = sum_{j,k} a_{jk} phi_j(x1) phi_k(x2) and consider the heuristic: "var is sigma^2/n times the number of coefficients that cannot be estimated from derivative data". Write g10 and g01 for the first-order partials in x1 and x2, and g11 for the cross partial. If no derivative data are observed, then all r^2 coefficients must be estimated from level data. If g10 is observed, only the coefficients with j = 0 (r of them) cannot be estimated from derivative data. If g11 is observed, only the coefficients with j = 0 or k = 0 (2r - 1 of them) cannot. If g10 and g01 are both observed, only a_{00} cannot: the parametric rate!? Check the heuristic in three or more dimensions. Check the heuristic if higher-order "own" derivatives are available. In low-dimensional models, much of the beneficial impact on rates of convergence is realized even if only ordinary first-order partials are observed. This is especially useful in the estimation of economic cost functions. For example, if costs are a function of output and three input prices, and factor demand data are observed, then the cost function may be estimated at rates only slightly slower than if it were a function of a single nonparametric variable (output).
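
The coefficient counting behind the heuristic ("var is sigma^2/n times the number of coefficients that cannot be estimated from derivative data") can be sketched directly; the identification rule coded here (a partial in x1 annihilates terms constant in x1, i.e. j = 0, and similarly for x2) is a reconstruction of the slide's logic:

```python
import numpy as np

def n_unidentified(r, observed):
    """Count tensor-product coefficients a_{jk}, 0 <= j, k < r, that cannot
    be estimated from the observed derivative data alone: d/dx1 leaves the
    j = 0 terms unidentified, d/dx2 the k = 0 terms, the cross partial both."""
    j, k = np.meshgrid(np.arange(r), np.arange(r), indexing="ij")
    dead = np.ones((r, r), dtype=bool)        # with no data, nothing is identified
    if "g10" in observed:
        dead &= (j == 0)
    if "g01" in observed:
        dead &= (k == 0)
    if "g11" in observed:
        dead &= (j == 0) | (k == 0)
    return int(dead.sum())

r = 10
print(n_unidentified(r, set()))               # r^2: var ~ sigma^2 r^2 / n
print(n_unidentified(r, {"g10"}))             # r:   var ~ sigma^2 r / n
print(n_unidentified(r, {"g11"}))             # 2r - 1
print(n_unidentified(r, {"g10", "g01"}))      # 1: only a_00, the parametric rate
```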

12 Simulations – Cost Function Estimation
Data are available on C(y, p), where y is output and p is a vector of input prices. By Shephard's Lemma, partial derivatives with respect to input prices equal quantities of inputs, which may also be observed.
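
A minimal data-generating sketch for such a simulation, assuming (purely for illustration) a Cobb-Douglas cost function; the paper's actual simulation design may differ:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
alpha = np.array([0.2, 0.3, 0.5])        # Cobb-Douglas cost shares (sum to 1)

y = rng.uniform(1, 10, n)                # output
p = rng.uniform(1, 5, (n, 3))            # three input prices

# Illustrative cost function: C(y, p) = y * prod_i p_i^alpha_i
C = y * np.prod(p ** alpha, axis=1)

# Shephard's Lemma: factor demand x_i = dC/dp_i = alpha_i * C / p_i
X = alpha * C[:, None] / p

# Sanity check: the implied cost shares p_i x_i / C recover alpha
shares = p * X / C[:, None]
print(shares.mean(axis=0))               # [0.2, 0.3, 0.5]
```

The simulated factor demands X are exactly the price derivatives of C, so they play the role of the observed derivative data in the estimation.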

13 Simulations – Cost Function Estimation

14 Cosine Series Estimation
Since the noise is smoothed out more effectively, the number of terms in the approximation can grow more rapidly, thereby speeding the elimination of bias. Which dominates – bias or variance? In parametric models, variance dominates bias. In standard nonparametric models, bias and variance are of the same order. If data on derivatives are available, bias can dominate variance. A brief comment on the respective roles of bias and variance may add perspective. The mean integrated squared error (MISE) of common parametric regression estimators is dominated by the variance term, with the bias term converging to zero much more quickly. Standard nonparametric regression, on the other hand, involves a trade-off between bias and variance: in order to achieve the optimal convergence rate in a given smoothness class, the two terms need to converge to zero at the same rate. For series estimators, this entails selecting r, the number of terms in the approximation, to balance the variance term, which increases with r, against the bias term, which diminishes with r. Data on derivatives, on the other hand, permit one to smooth out the noise more effectively. The variance term converges to zero more quickly, in certain cases achieving the parametric rate O(1/n). For sufficiently smooth, but still infinite-dimensional, classes of functions the bias term can converge at the same rate as the variance or even faster. For less smooth classes, the bias term may dominate the variance term (in contrast to parametric modeling, where the variance usually dominates the bias term). For series estimators incorporating derivative data, the most salient consequence is that r can grow more rapidly, which in turn permits faster elimination of bias.

15 Production Function Estimation
Data are available on a production function f(L, K). Additional data are available on the factor prices w and r, which determine the slope of the isocost line and hence, at a cost-minimizing point, the slope of the isoquant. [Figure: an isoquant in the (L, K) plane, tangent to the isocost line wL + rK = C0.]
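
A small sketch of why factor-price data carry slope information, assuming an illustrative Cobb-Douglas technology f(L, K) = L^a K^b (not necessarily the specification used in the talk):

```python
# Cost minimization equates the isoquant slope (the ratio of marginal
# products) to the factor-price ratio: f_L / f_K = w / r.
a, b = 0.3, 0.7                          # illustrative output elasticities
w, r = 2.0, 1.0                          # wage and rental rate

# The first-order condition gives K/L = (b/a) * (w/r); fix L and solve for K.
L = 1.0
K = (b / a) * (w / r) * L

f_L = a * L ** (a - 1) * K ** b          # marginal product of labour
f_K = b * L ** a * K ** (b - 1)          # marginal product of capital
print(f_L / f_K, w / r)                  # equal at the cost-minimizing point
```

Observed factor prices thus amount to data on a function of the partial derivatives f_L and f_K, which is the sense in which they reduce the effective nonparametric dimension.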

16 Summary Rates of convergence of nonparametric estimators can be substantially, even dramatically, improved if data on derivatives are available. Economic theory rarely predicts functional form, hence nonparametric tools are appealing. Economists are generally not able to run controlled experiments, so we often have to control for many confounding variables. The availability of many derivatives further improves the rate of convergence. The nonparametric dimension is reduced even if just one partial derivative is observed. Derivative data are also useful in hypothesis testing and semiparametric modeling.

