NONPARAMETRIC LEAST SQUARES ESTIMATION IN DERIVATIVE FAMILIES
Data on Derivatives, the Curse of Dimensionality and Cost Function Estimation
Adonis Yatchew


NONPARAMETRIC LEAST SQUARES ESTIMATION IN DERIVATIVE FAMILIES Data On Derivatives, The Curse Of Dimensionality and Cost Function Estimation Adonis Yatchew Economics Department, University of Toronto yatchew@chass.utoronto.ca Joint work with Peter Hall, Department of Mathematics and Statistics, University of Melbourne. Econometric Society Meetings, Boston, June 6, 2009.

Introduction
Nonparametric modelling in economics suffers acutely from the curse of dimensionality: explanatory variables are often multiple, sometimes many, and economists are rarely able to conduct controlled experiments. The curse can be mitigated through semiparametric specifications (e.g., the partial linear model or index models), additive separability, or symmetry restrictions (e.g., radial symmetry).

Introduction
Main idea: data on derivatives can substantially mitigate the curse of dimensionality. Settings where data may be available on a function and (some of) its derivatives:
- Cost functions and factor demands (Shephard's Lemma). Cost function estimation encompasses a family of examples in which costs are minimized subject to a set of constraints; the envelope theorem then yields relationships between the quantities chosen, the constraint parameters, and the Lagrange multipliers associated with the constraints.
- Production functions and factor prices: data on factor prices contain information on the slopes of isoquants.
- Systems modelled through optimization of an objective function, where data may be available on the function and on its first-order conditions.
- Option prices combined with direct data on 'market expectations', e.g., data of the form "the probability that the Dow Jones will not exceed 10,600 at the end of next month, when options expire" (or that the FTSE will not exceed 5,000).
- Experimental economics.

Local Averaging Estimators
Given data on g(x1, x2) and on the partial derivative dg/dx1, g can be estimated as though it were a function of a single nonparametric variable rather than two.

Local Averaging Estimators
[Figure: a local averaging neighbourhood of bandwidth h in the (x1, x2) design space.]

Local Averaging Estimators
In conventional nonparametric estimation, the optimal rate of convergence is n^(-d/(2d+k)), where d is the number of bounded derivatives and k is the dimension of x. If data on partial derivatives are available, a rate of n^(-d/(2d+k-p)) can be achieved, where p is the dimension of the largest sub-vector of x for which all own and cross first-order partials are observed (Hall and Yatchew 2007). However, partial differentiation destroys certain additively separable portions of the function; these can be recovered through data on lower-order partials or data on the function itself. Data on partial derivatives eliminate local averaging in certain directions: local averaging is replaced with global averaging in those directions.
Objectives: rates of convergence; incorporation of data on various derivatives, on functions of derivatives, and of additional constraints on the function or its derivatives; selection of smoothing parameter(s).
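As a point of reference for what "local averaging" means here, a minimal Nadaraya-Watson kernel estimator in one dimension (an illustrative sketch, not the authors' estimator; the test function, sample size, and bandwidth are hypothetical choices):

```python
import numpy as np

def nadaraya_watson(x_train, y_train, x_eval, h):
    """Local averaging: estimate g(x) by a kernel-weighted mean of the y_i
    whose x_i lie near x (Gaussian kernel, bandwidth h)."""
    # Pairwise kernel weights, shape (len(x_eval), len(x_train))
    w = np.exp(-0.5 * ((x_eval[:, None] - x_train[None, :]) / h) ** 2)
    return (w * y_train).sum(axis=1) / w.sum(axis=1)

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, 400)
y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.1, 400)
ghat = nadaraya_watson(x, y, np.array([0.25]), h=0.05)
print(ghat)  # close to g(0.25) = sin(pi/2) = 1
```

Derivative data would let the estimator average globally rather than locally in the corresponding directions, which is what drives the rate improvements above.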

Local Averaging – Disadvantages
Local averaging will not, in general, optimally combine data from various derivatives. Data on functions of derivatives cannot easily be incorporated. It requires a multi-step estimation procedure, possibly with smoothing-parameter selection at each step. It is non-trivial to introduce additional constraints (e.g., monotonicity, convexity).

Optimization Estimators
Let lambda in Lambda index a family of observed derivatives g^(lambda). For each lambda we have data (x_lambda_i, y_lambda_i), and we consider the combined least squares problem
min over ghat of: sum over lambda, sum over i of (y_lambda_i - ghat^(lambda)(x_lambda_i))^2.
For example, given data on the function and its first derivative, minimize sum_i (y_i - ghat(x_i))^2 + sum_i (y'_i - ghat'(x_i))^2.
Advantages: data on various derivatives are conveniently incorporated within a single optimization problem; with sufficient derivative data, optimal smoothing parameters may be selected without recourse to cross-validation; data on functions of derivatives can be incorporated; additional constraints (e.g., monotonicity, convexity) can be added.
Additional slides: the optimization problem is solved with a series estimator based on a cosine orthonormal basis. An expansion of the IMSE shows that, under sufficient smoothness assumptions, the variance and squared-bias terms are O(1/n); under slightly stronger assumptions the variance is O(1/n) and the squared bias is o(1/n). Generalizations: p dimensions, non-uniform designs, target-function estimation, and the asymptotic distribution of the ISE. Selection of the smoothing parameter can be achieved without cross-validation.
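The combined least squares criterion can be sketched with a simple polynomial family in place of the paper's cosine series (a hypothetical helper, illustrating only the idea of stacking level and derivative regressions into one problem):

```python
import numpy as np

def fit_poly_with_derivatives(x0, y0, x1, y1, degree):
    """Single least squares problem combining level data (x0, y0) and
    first-derivative data (x1, y1): stack the level design matrix x**j
    on top of the derivative design matrix j * x**(j-1) and solve once."""
    powers = np.arange(degree + 1)
    X_lev = x0[:, None] ** powers
    X_der = powers * x1[:, None] ** np.maximum(powers - 1, 0)
    X = np.vstack([X_lev, X_der])
    y = np.concatenate([y0, y1])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

# Noise-free check: g(x) = x**2, so g'(x) = 2x
xs = np.linspace(0.0, 1.0, 9)
beta = fit_poly_with_derivatives(xs, xs ** 2, xs, 2.0 * xs, degree=2)
print(beta)  # approximately [0, 0, 1]
```

Weighting the two stacked blocks by their respective error variances would turn this into the GLS/mixed regression mentioned below.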

Cosine Series Estimation – Scalar x's
Write g(x) = sum from j = 0 to r of alpha_j phi_j(x), where phi_0(x) = 1 and phi_j(x) = sqrt(2) cos(pi j x) on [0, 1], and consider GLS regression (or mixed regression) of the level and derivative data on this basis. Moreover, the "X" matrix for the level regression is close to the identity matrix, while the "X" matrix for the derivative regression is close to a diagonal matrix. In particular, the basis functions are orthonormal and the derivatives of the basis functions are orthogonal (but not orthonormal). What remains is to select r, the smoothing parameter.
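The near-orthogonality claims can be checked numerically. Assuming the cosine basis above and the midpoint design x_i = (i - 1/2)/n, discrete orthogonality makes the level Gram matrix the identity and the derivative Gram matrix diagonal, up to floating-point error (a verification sketch; n and r are arbitrary choices with r < n):

```python
import numpy as np

n, r = 200, 10
x = (np.arange(1, n + 1) - 0.5) / n          # regularly spaced design on [0, 1]
j = np.arange(r + 1)

# Basis: phi_0 = 1, phi_j(x) = sqrt(2) * cos(pi*j*x) -- orthonormal on [0, 1]
Phi = np.where(j == 0, 1.0, np.sqrt(2) * np.cos(np.pi * j * x[:, None]))
# Derivatives: phi_j'(x) = -sqrt(2)*pi*j*sin(pi*j*x) -- orthogonal, not orthonormal
dPhi = -np.sqrt(2) * np.pi * j * np.sin(np.pi * j * x[:, None])

G = Phi.T @ Phi / n    # level "X'X"/n: the identity matrix
H = dPhi.T @ dPhi / n  # derivative "X'X"/n: diag(0, (pi*1)**2, ..., (pi*r)**2)
print(np.allclose(G, np.eye(r + 1)))              # True
print(np.allclose(H, np.diag((np.pi * j) ** 2)))  # True
```

The diagonal entries (pi*j)^2 of H grow with j, which is why derivative data pin down the high-frequency coefficients so effectively.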

Cosine Series Estimation
Assume regularly spaced x's and g sufficiently smooth. For scalar x, with data on g and g', the variance term of the IMSE is O(1/n); contrast this with the canonical case where only level data are observed and the variance is sigma^2 * r/n. Under slightly stronger conditions on the alphas, the squared-bias term is o(1/n). This result implies root-n consistency of the estimator. More precisely, suppose data are available on g and g': if g' is square-integrable and O(n^(1/2)) <= r <= O(n), then the IMSE converges at rate O(n^(-1)); if g'' is square-integrable and O(n^(1/3)) <= r <= O(n), then the IMSE converges at rate O(n^(-1)).

Cosine Series Estimation – Bivariate x's
Write g(x1, x2) = sum over j1, j2 from 0 to r of alpha_{j1 j2} phi_{j1}(x1) phi_{j2}(x2), and consider the heuristic: "the variance is sigma^2/n times the number of coefficients that cannot be estimated from derivative data". If no derivative data are observed, all (r+1)^2 coefficients must be estimated from level data, so the variance is of order sigma^2 * r^2/n. If g10 = dg/dx1 is observed, only the r+1 coefficients with j1 = 0 remain, so the variance is of order sigma^2 * r/n. If g11 is observed, the 2r+1 coefficients with j1 = 0 or j2 = 0 remain, again of order sigma^2 * r/n. If g10 and g01 are both observed, only the constant remains, so the variance is of order sigma^2/n: the parametric rate!
Check the heuristic in three or more dimensions; check the heuristic when higher-order "own" derivatives are available. In low-dimensional models, much of the beneficial impact on rates of convergence is realized even if only ordinary first-order partials are observed. This is especially useful in the estimation of economic cost functions: for example, if costs are a function of output and three input prices, and factor demand data are observed, then the cost function may be estimated at rates only slightly slower than if it were a function of a single nonparametric variable (output).
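The coefficient counts behind the heuristic can be tallied with a small hypothetical script, using the convention that d/dx1 annihilates exactly the tensor-basis terms with j1 = 0 (those constant in x1), and similarly for x2:

```python
# Hypothetical tally behind the variance heuristic for the bivariate
# cosine tensor basis with indices (j1, j2), 0 <= j1, j2 <= r.
r = 20
idx = [(j1, j2) for j1 in range(r + 1) for j2 in range(r + 1)]

def not_estimable(observed):
    """Number of coefficients recoverable from none of the observed
    derivative data sets; observed lists pairs (a, b) meaning data on
    the mixed partial of order a in x1 and b in x2."""
    def killed(a, b, j1, j2):
        # d/dx1 kills j1 = 0 terms; d/dx2 kills j2 = 0 terms
        return (a >= 1 and j1 == 0) or (b >= 1 and j2 == 0)
    return sum(all(killed(a, b, j1, j2) for (a, b) in observed)
               for (j1, j2) in idx)

print(not_estimable([]))                # no derivative data: (r+1)**2 = 441
print(not_estimable([(1, 0)]))          # g10 observed: r+1 = 21 remain
print(not_estimable([(1, 1)]))          # g11 observed: 2r+1 = 41 remain
print(not_estimable([(1, 0), (0, 1)]))  # g10 and g01: only the constant, 1
```

Multiplying each count by sigma^2/n reproduces the variance orders quoted above.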

Simulations – Cost Function Estimation
Data are available on the cost function C(y, p), where y is output and p is a vector of input prices. By Shephard's Lemma, the partial derivatives of C with respect to input prices equal the quantities of inputs, which may also be observed.
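Shephard's Lemma can be illustrated with an assumed Cobb-Douglas cost function (a parametric toy example, not the simulation design of the talk; the parameter values are made up): the price derivative of cost equals the conditional factor demand.

```python
# Assumed example: C(y, p1, p2) = y * (p1/a)**a * (p2/b)**b, the cost
# function of a constant-returns Cobb-Douglas technology with cost
# shares a + b = 1. Shephard's Lemma: dC/dp1 = demand for input 1.
a, b = 0.3, 0.7

def cost(y, p1, p2):
    return y * (p1 / a) ** a * (p2 / b) ** b

def demand1(y, p1, p2):
    # Analytic derivative dC/dp1
    return y * (p1 / a) ** (a - 1.0) * (p2 / b) ** b

y, p1, p2, eps = 2.0, 1.5, 0.8, 1e-6
fd = (cost(y, p1 + eps, p2) - cost(y, p1 - eps, p2)) / (2 * eps)
print(abs(fd - demand1(y, p1, p2)) < 1e-6)  # True: the lemma holds numerically
```

In the estimation problem the roles are reversed: observed factor demands supply direct data on the derivatives of the unknown cost function.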

Simulations – Cost Function Estimation
[Figure: simulation results.]

Cosine Series Estimation
Which dominates, bias or variance? In parametric models, variance dominates bias. In standard nonparametric models, bias and variance are of the same order. If data on derivatives are available, bias can dominate variance: since noise is smoothed out more effectively, the number of terms in the approximation can grow more rapidly, thereby speeding the elimination of bias.
A brief comment on the respective roles of bias and variance may add perspective. The mean integrated squared error (MISE) of common parametric regression estimators is dominated by the variance term, with the bias term converging to zero much more quickly. Standard nonparametric regression, on the other hand, involves a trade-off between bias and variance: in order to achieve the optimal convergence rate in a given smoothness class, the two terms need to converge to zero at the same rate. For series estimators, this entails selecting r, the number of terms in the approximation, to balance the variance term, which increases with r, against the bias term, which diminishes with r. Data on derivatives permit one to smooth out the noise more effectively. The variance term converges to zero more quickly, in certain cases achieving the parametric rate O(1/n). For sufficiently smooth, but still infinite-dimensional, classes of functions, the bias term can converge at the same rate as the variance or even faster; for less smooth classes, the bias term may dominate the variance term (in contrast to parametric modelling, where the variance usually dominates the bias). For series estimators incorporating derivative data, the most salient consequence is that r can grow more rapidly, which in turn permits faster elimination of bias.

Production Function Estimation
Data are available on the production function f(L, K). Additional data are available on factor prices w and r, which reveal the slopes of isoquants.
[Figure: an isoquant in the (L, K) plane, tangent to the isocost line wL + rK = C0.]

Summary
Rates of convergence of nonparametric estimators can be substantially, even dramatically, improved if data on derivatives are available. Economic theory rarely predicts functional form, hence nonparametric tools are appealing; moreover, economists are generally unable to run controlled experiments, so we often have to control for many confounding variables. Data on additional derivatives further improve the rate of convergence, and the nonparametric dimension is reduced even if just one partial derivative is observed. Derivative data are also useful in hypothesis testing and in semiparametric modelling.