1
Inverse Regression Methods
Prasad Naik
7th Triennial Choice Symposium, Wharton, June 16, 2007
2
Outline
- Motivation
- Principal Components (PCR)
- Sliced Inverse Regression (SIR)
  - Application
- Constrained Inverse Regression (CIR)
- Partial Inverse Regression (PIR)
  - p > N problem
  - Simulation results
3
Motivation
- Estimate the high-dimensional model: y = g(x_1, x_2, ..., x_p)
- Link function g(.) is unknown
- Small p (≤ 6 variables): apply multivariate local (linear) polynomial regression
- Large p (> 10 variables): curse of dimensionality => empty-space phenomenon
4
Principal Components (PCR; Massy 1965, JASA)
- High-dimensional data X (n × p)
- Eigenvalue decomposition Σ_x e = λ e, giving pairs (λ_1, e_1), (λ_2, e_2), ..., (λ_p, e_p)
- Retain K components (e_1, e_2, ..., e_K), where K < p
- Low-dimensional data Z = (z_1, z_2, ..., z_K), where z_i = X e_i are the "new" variables (or factors); see the sketch below
- How to choose the low-dimensional subspace, i.e., K = ??
- Not the most predictive variables, because the information in y is ignored
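A minimal numpy sketch of the PCR construction on this slide: eigen-decompose the sample covariance of X, keep the top K eigenvectors, and form the factors Z = X E_K. The function name pcr_factors, the centering step, and the simulated data are illustrative assumptions, not part of the original slides.

```python
import numpy as np

def pcr_factors(X, K):
    """Principal-components factors: eigen-decompose Cov(X) and keep the top K."""
    Xc = X - X.mean(axis=0)                # center the predictors
    Sigma_x = np.cov(Xc, rowvar=False)     # p x p sample covariance
    eigvals, eigvecs = np.linalg.eigh(Sigma_x)
    order = np.argsort(eigvals)[::-1]      # sort eigenvalues in descending order
    E_K = eigvecs[:, order[:K]]            # (e_1, ..., e_K)
    Z = Xc @ E_K                           # new variables z_i = X e_i
    return Z, E_K, eigvals[order]

# Example: reduce p = 10 correlated predictors to K = 3 factors
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10)) @ rng.normal(size=(10, 10))
Z, E_K, lams = pcr_factors(X, K=3)
print(Z.shape)  # (200, 3)
```

Note that y plays no role in this construction, which is exactly the drawback the slide points out.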
5
Sliced Inverse Regression (SIR; Li 1991, JASA)
- Similar idea: X (n × p) reduced to Z (n × K)
- Generalized eigen-decomposition Γ e = λ Σ_x e, where Γ = Cov(E[X|y]); see the sketch below
- Retain K* components (e_1, ..., e_K*)
- Create new variables Z = (z_1, ..., z_K*), where z_i = X e_i
- K* is the smallest integer q (= 0, 1, 2, ...) such that the eigenvalues beyond the first q are not significantly different from zero
- Most predictive variables across any set of unit-norm vectors e and any transformation T(y)
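A hedged sketch of the SIR step described above, assuming equal-count slices of y to estimate Γ = Cov(E[X|y]) and scipy's generalized symmetric eigensolver for Γ e = λ Σ_x e; the slicing scheme, function name, and defaults are illustrative choices rather than the authors' code.

```python
import numpy as np
from scipy.linalg import eigh

def sir_directions(X, y, n_slices=10, K=1):
    """Sliced Inverse Regression: directions solving Gamma e = lambda Sigma_x e."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    Sigma_x = np.cov(Xc, rowvar=False)

    # Slice y into roughly equal-count slices and average X within each slice
    order = np.argsort(y)
    slices = np.array_split(order, n_slices)
    slice_means = np.array([Xc[idx].mean(axis=0) for idx in slices])
    weights = np.array([len(idx) / n for idx in slices])

    # Gamma = Cov(E[X | y]), estimated from the weighted slice means
    Gamma = (slice_means * weights[:, None]).T @ slice_means

    # Generalized eigenproblem Gamma e = lambda Sigma_x e
    eigvals, eigvecs = eigh(Gamma, Sigma_x)
    keep = np.argsort(eigvals)[::-1][:K]
    E = eigvecs[:, keep]                   # (e_1, ..., e_K*)
    return Xc @ E, E, np.sort(eigvals)[::-1]
```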
6
SIR Applications (Naik, Hagerty, and Tsai 2000, JMR)
- Model: p variables reduced to K factors
- New product development context: 28 variables reduced to 1 factor
- Direct marketing context: 73 variables reduced to 2 factors
7
Constrained Inverse Regression (CIR; Naik and Tsai 2005, JASA)
- Can we extract meaningful factors? Yes
- First, capture the substantive information (e.g., which variables belong to which factor) in a set of constraints
- Then apply our proposed method, CIR
8
Example 4.1 from Naik and Tsai (2005, JASA)
- Consider a 2-factor model with p = 5 variables
- Factor 1 includes variables (4, 5)
- Factor 2 includes variables (1, 2, 3)
- Constraint sets: factor 1's loadings are zero on variables (1, 2, 3), and factor 2's loadings are zero on variables (4, 5); these are encoded as matrices in the sketch below
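One way to encode the Example 4.1 constraint sets as matrices, so that requiring C'e = 0 forces each factor to load only on its own variables; the 0-based indexing and the names C1 and C2 are illustrative assumptions.

```python
import numpy as np

p = 5

# Factor 1 should involve only variables 4 and 5, so its loadings on
# variables 1, 2, 3 are constrained to zero: columns of C1 pick out x1, x2, x3.
C1 = np.zeros((p, 3))
C1[[0, 1, 2], [0, 1, 2]] = 1.0

# Factor 2 should involve only variables 1, 2, 3, so its loadings on
# variables 4 and 5 are constrained to zero: columns of C2 pick out x4, x5.
C2 = np.zeros((p, 2))
C2[[3, 4], [0, 1]] = 1.0

# Requiring C1' e = 0 (resp. C2' e = 0) restricts a direction e to the
# desired subset of variables.
```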
9
CIR (contd.)
- CIR approach: solve the eigenvalue decomposition (I − P_c) Γ e = λ Σ_x e, where P_c = C(C'C)^(-1) C' is the projection matrix onto the space spanned by the constraint matrix C (a sketch follows below)
- When P_c = 0, we get SIR (i.e., SIR is nested within CIR)
- Shrinkage (e.g., Lasso-like): setting insignificant effects to zero via an appropriate constraint improves the t-values of the remaining effects (i.e., efficiency)
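A minimal sketch of the CIR eigenproblem on this slide, (I − P_c) Γ e = λ Σ_x e, with P_c built as the projection onto the columns of a constraint matrix C (for example, C1 or C2 from the previous sketch, with Γ and Σ_x estimated as in the SIR sketch); the use of a general non-symmetric eigensolver and the function name are assumptions.

```python
import numpy as np
from scipy.linalg import eig

def cir_directions(Gamma, Sigma_x, C, K=1):
    """Constrained Inverse Regression: solve (I - P_c) Gamma e = lambda Sigma_x e."""
    p = Sigma_x.shape[0]
    # Projection onto the space spanned by the columns of the constraint matrix C
    P_c = C @ np.linalg.solve(C.T @ C, C.T)
    M = (np.eye(p) - P_c) @ Gamma          # constrained inverse-regression matrix

    # M is not symmetric in general, so use the general (non-symmetric)
    # generalized eigensolver; with no constraints (P_c = 0), this reduces to SIR.
    eigvals, eigvecs = eig(M, Sigma_x)
    keep = np.argsort(eigvals.real)[::-1][:K]
    return eigvecs[:, keep].real           # constrained directions (e_1, ..., e_K)
```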
10
p > N Problem
- OLS, MLE, SIR, and CIR break down when p > N
- Partial Inverse Regression (PIR; Li, Cook, and Tsai, Biometrika, forthcoming)
- Combines ideas from PLS and SIR
- Works well even when p > 3N and the variables are highly correlated
- Single-index model, with link g(.) unknown
11
p > N Solution
- To estimate the direction β, first construct the matrix R, where e_1 is the principal eigenvector of Γ = Cov(E[X|y])
- Then compute the estimate of β from R (a hedged sketch of one such construction follows below)
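A hedged sketch of one PLS-style construction consistent with the slide's description: build R = (e_1, Σ_x e_1, ..., Σ_x^(q−1) e_1) from the principal eigenvector e_1 of Γ = Cov(E[X|y]), then recover a direction by projecting e_1 onto the column space of R under the Σ_x inner product. The number of columns q and the exact form of the final projection are assumptions made for illustration; the precise PIR estimator and its properties are given in Li, Cook, and Tsai (Biometrika).

```python
import numpy as np

def pir_direction(X, e1, q=3):
    """Partial inverse regression sketch: Krylov-type matrix R built from e1.

    e1 is the principal eigenvector of Gamma = Cov(E[X | y]) (e.g., from the
    SIR sketch above); q is the number of Krylov columns, chosen by the user.
    """
    Xc = X - X.mean(axis=0)
    Sigma_x = np.cov(Xc, rowvar=False)

    # R = (e1, Sigma_x e1, ..., Sigma_x^(q-1) e1)
    cols, v = [], e1
    for _ in range(q):
        cols.append(v)
        v = Sigma_x @ v
    R = np.column_stack(cols)

    # Project e1 onto the column space of R in the Sigma_x inner product;
    # the q x q system stays solvable even when p > N, since q is small.
    beta = R @ np.linalg.solve(R.T @ Sigma_x @ R, R.T @ e1)
    return beta / np.linalg.norm(beta)
```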
12
Conclusions
- Inverse regression methods offer estimators that apply to a remarkably broad class of models and to high-dimensional data, including p > N (which is, conceptually, the limiting case)
- Estimators are closed-form, so they are:
  - Easy to code (just a few lines)
  - Computationally inexpensive
  - Free of iterations, re-sampling, or draws (hence no do or for loops)
  - Guaranteed to converge
- Standard errors for inference are derived in the cited papers