Quantile Regression. The Problem The Estimator Computation Properties of the Regression Properties of the Estimator Hypothesis Testing Bibliography Software.

Slides:



Advertisements
Similar presentations
3.3 Hypothesis Testing in Multiple Linear Regression
Advertisements

A. The Basic Principle We consider the multivariate extension of multiple linear regression – modeling the relationship between m responses Y 1,…,Y m and.
Regression and correlation methods
Computational Statistics. Basic ideas  Predict values that are hard to measure irl, by using co-variables (other properties from the same measurement.
The Multiple Regression Model.
Biomedical Statistics Testing for Normality and Symmetry Teacher:Jang-Zern Tsai ( 蔡章仁 ) Student: 邱瑋國.
Hypothesis Testing Steps in Hypothesis Testing:
6-1 Introduction To Empirical Models 6-1 Introduction To Empirical Models.
Prediction, Correlation, and Lack of Fit in Regression (§11. 4, 11
Linear regression models
A Short Introduction to Curve Fitting and Regression by Brad Morantz
CJT 765: Structural Equation Modeling Class 3: Data Screening: Fixing Distributional Problems, Missing Data, Measurement.
1 Chapter 2 Simple Linear Regression Ray-Bing Chen Institute of Statistics National University of Kaohsiung.
Université d’Ottawa / University of Ottawa 2001 Bio 4118 Applied Biostatistics L10.1 CorrelationCorrelation The underlying principle of correlation analysis.
The Multiple Regression Model Prepared by Vera Tabakova, East Carolina University.
OUTLIER, HETEROSKEDASTICITY,AND NORMALITY
The Simple Linear Regression Model: Specification and Estimation
1-1 Regression Models  Population Deterministic Regression Model Y i =  0 +  1 X i u Y i only depends on the value of X i and no other factor can affect.
1 Chapter 3 Multiple Linear Regression Ray-Bing Chen Institute of Statistics National University of Kaohsiung.
QA-2 FRM-GARP Sep-2001 Zvi Wiener Quantitative Analysis 2.
FRM Zvi Wiener Following P. Jorion, Financial Risk Manager Handbook Financial Risk Management.
Chapter 6 Continuous Random Variables and Probability Distributions
Probability theory 2011 The multivariate normal distribution  Characterizing properties of the univariate normal distribution  Different definitions.
Lecture 23 Multiple Regression (Sections )
REGRESSION AND CORRELATION
Chapter 5 Continuous Random Variables and Probability Distributions
Probability theory 2008 Outline of lecture 5 The multivariate normal distribution  Characterizing properties of the univariate normal distribution  Different.
Lecture II-2: Probability Review
Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc. Chap 6-1 Chapter 6 The Normal Distribution Business Statistics: A First Course 5 th.
Meta-Analysis and Meta- Regression Airport Noise and Home Values J.P. Nelson (2004). “Meta-Analysis of Airport Noise and Hedonic Property Values: Problems.
Correlation & Regression
Correlation and Linear Regression
Active Learning Lecture Slides
Correlation and Linear Regression
Simple Linear Regression
Chap 6-1 Copyright ©2013 Pearson Education, Inc. publishing as Prentice Hall Chapter 6 The Normal Distribution Business Statistics: A First Course 6 th.
(a.k.a: The statistical bare minimum I should take along from STAT 101)
The Examination of Residuals. The residuals are defined as the n differences : where is an observation and is the corresponding fitted value obtained.
1 Advanced Topics in Regression Quantile Regression Analysis of Causality Mediation Analysis Hierarchical Linear Modeling Compiled by Nick Evangelopoulos,
MTH 161: Introduction To Statistics
Lecture 15: Statistics and Their Distributions, Central Limit Theorem
The Examination of Residuals. Examination of Residuals The fitting of models to data is done using an iterative approach. The first step is to fit a simple.
1 Chapter 12 Simple Linear Regression. 2 Chapter Outline  Simple Linear Regression Model  Least Squares Method  Coefficient of Determination  Model.
1 G Lect 7M Statistical power for regression Statistical interaction G Multiple Regression Week 7 (Monday)
CJT 765: Structural Equation Modeling Class 8: Confirmatory Factory Analysis.
Univariate Linear Regression Problem Model: Y=  0 +  1 X+  Test: H 0 : β 1 =0. Alternative: H 1 : β 1 >0. The distribution of Y is normal under both.
Chapter 13: Limited Dependent Vars. Zongyi ZHANG College of Economics and Business Administration.
Chapter 7 Point Estimation of Parameters. Learning Objectives Explain the general concepts of estimating Explain important properties of point estimators.
LECTURE 9 Tuesday, 24 FEBRUARY STA291 Fall Administrative 4.2 Measures of Variation (Empirical Rule) 4.4 Measures of Linear Relationship Suggested.
Econometrics Course: Cost as the Dependent Variable (I) Paul G. Barnett, PhD November 20, 2013.
Advanced Statistical Methods: Continuous Variables REVIEW Dr. Irina Tomescu-Dubrow.
KNN Ch. 3 Diagnostics and Remedial Measures Applied Regression Analysis BUSI 6220.
Lecture 1: Basic Statistical Tools. A random variable (RV) = outcome (realization) not a set value, but rather drawn from some probability distribution.
Statistics 350 Lecture 2. Today Last Day: Section Today: Section 1.6 Homework #1: Chapter 1 Problems (page 33-38): 2, 5, 6, 7, 22, 26, 33, 34,
Quantitative Methods Residual Analysis Multiple Linear Regression C.W. Jackson/B. K. Gordor.
Estimating standard error using bootstrap
Ch9 Random Function Models (II)
...Relax... 9/21/2018 ST3131, Lecture 3 ST5213 Semester II, 2000/2001
Diagnostics and Transformation for SLR
Review of Probability Concepts
What is Regression Analysis?
Shape of Distributions
Simple Linear Regression
Diagnostics and Transformation for SLR
Topic 11: Matrix Approach to Linear Regression
Microeconometric Modeling
The Normal Distribution
Introductory Statistics
Presentation transcript:

Quantile Regression

The Problem The Estimator Computation Properties of the Regression Properties of the Estimator Hypothesis Testing Bibliography Software

Quantile Regression The Problem

Quantile Regression Problem –The distribution of Y, the “dependent” variable, conditional on the covariate X, may have thick tails. –The conditional distribution of Y may be asymmetric. –The conditional distribution of Y may not be unimodal. Neither regression nor ANOVA will give us robust results. Outliers are problematic, the mean is pulled toward the skewed tail, multiple modes will not be revealed.

Quantile Regression Problem –ANOVA and regression provide information only about the conditional mean. –More knowledge about the distribution of the statistic may be important. –The covariates may shift not only the location or scale of the distribution, they may affect the shape as well.

Quantile Regression

The Estimator

Quantile Regression Ordinarily we specify a quadratic loss function. That is, L(u) = u 2 Under quadratic loss we use the conditional mean, via regression or ANOVA, as our predictor of Y for a given X=x.

Quantile Regression Definition: Given p ∈ [0, 1]. A pth quantile of a random variable Z is any number ζ p such that Pr(Z< ζ p ) ≤ p ≤ Pr(Z ≤ ζ p ). The solution always exists, but need not be unique. Ex: Suppose Z={3, 4, 7, 9, 9, 11, 17, 21} and p=0.5 then Pr(Z<9) = 3/8 ≤ 1/2 ≤ Pr(Z ≤ 9) = 5/8

Quantile Regression Quantiles can be used to characterize a distribution: –Median –Interquartile Range –Interdecile Range –Symmetry = (ζ.75 - ζ.5 )/(ζ.5 - ζ.25 ) –Tail Weight = (ζ.90 - ζ.10 )/(ζ.75 - ζ.25 )

Quantile Regression Suppose Z is a continuous r.v. with cdf F(.), then Pr(Z<z) = Pr(Z≤z)=F(z) for every z in the support and a pth quantile is any number ζ p such that F(ζ p ) = p If F is continuous and strictly increasing then the inverse exists and ζ p =F -1 (p)

Quantile Regression Definition: The asymmetric absolute loss function is Where u is the prediction error we have made and I(u) is an indicator function of the sort

Quantile Regression Absolute Loss vs. Quadratic Loss

Quantile Regression Proposition: Under the asymmetric absolute loss function L p a best predictor of Y given X=x is a p th conditional quantile. For example, if p=.5 then the best predictor is the median.

Quantile Regression Definition: A parametric quantile regression model is correctly specified if, for example, That is, is a particular linear combination of the independent variable(s) such that where F( ) is some univariate distribution.

Quantile Regression.25

Quantile Regression Definition: A quantile regression model is identifiable if has a unique solution.

Quantile Regression A family of conditional quantiles of Y given X=x. –Let Y=α + βx + u with α = β = 1

Quantile Regression

Quantiles at.9,.75,.5,.25, and.10. Today’s temperature fitted to a quartic on yesterday’s temperature.

Quantile Regression Computation of the Estimate

Quantile Regression Estimation The quantile regression coefficients are the solution to The k first order conditions are

Quantile Regression Estimation The fitted line will go through k data points. The # of negative residuals ≤ np ≤ # of neg residuals + # of zero residuals The computational algorithm is to set up the objective function as a linear programming problem The solution to (1) – (2) [previous slide] need not be unique.

Quantile Regression Properties of the Regression

Quantile Regression Properties of the regression Transformation equivariance For any monotone function, h(.), since P(T<t|x) = P(h(T)<h(t)|x). This is especially important where the response variable has been censored, I.e. top coded.

Properties The mean does not have transformation equivariance since Eh(Y) ≠ h(E(Y))

Quantile Regression Equivariance Practical implications

Equivariance (i) and (ii) imply scale equivariance (iii) is a shift or regression equivariance (iv) is equivariance to reparameterization of design

Quantile Regression Properties Robust to outliers. As long as the sign of the residual does not change, any Y i may be changed without shifting the conditional quantile line. The regression quantiles are correlated.

Quantile Regression Properties of the Estimator

Quantile Regression Properties of the Estimator Asymptotic Distribution The covariance depends on the unknown f(.) and the value of the vector x at which the covariance is being evaluated.

Quantile Regression Properties of the Estimator When the error is independent of x then the coefficient covariance reduces to where

Quantile Regression Properties of the Estimator In general the quantile regression estimator is more efficient than OLS The efficient estimator requires knowledge of the true error distribution.

Quantile Regression Coefficient Interpretation The marginal change in the Θth conditional quantile due to a marginal change in the jth element of x. There is no guarantee that the ith person will remain in the same quantile after her x is changed.

Quantile Regression Hypothesis Testing

Quantile Regression Hypothesis Testing Given asymptotic normality, one can construct asymptotic t-statistics for the coefficients The error term may be heteroscedastic. The test statistic is, in construction, similar to the Wald Test. A test for symmetry, also resembling a Wald Test, can be built relying on the invariance properties referred to above.

Heteroscedasticity Model: y i = β o +β 1 x i +u i, with iid errors. –The quantiles are a vertical shift of one another. Model: y i = β o +β 1 x i +σ(x i )u i, errors are now heteroscedastic. –The quantiles now exhibit a location shift as well as a scale shift. Khmaladze-Koenker Test Statistic

Quantile Regression Bibliography Koenker and Hullock (2001), “Quantile Regression,” Journal of Economic Perspectives, Vol. 15, Pps Buchinsky (1998), “Recent Advances in Quantile Regression Models”, Journal of Human Resources, Vo. 33, Pps

Quantile Regression S+ Programs - Lib.stat.cmu.edu/s TSP Limdep