Inverse Regression Methods
Prasad Naik
7th Triennial Choice Symposium, Wharton, June 16, 2007
Outline
- Motivation
- Principal Components (PCR)
- Sliced Inverse Regression (SIR)
- Application
- Constrained Inverse Regression (CIR)
- Partial Inverse Regression (PIR)
- p > N problem
- Simulation results
Motivation
- Estimate the high-dimensional model $y = g(x_1, x_2, \ldots, x_p)$
- The link function $g(\cdot)$ is unknown
- Small p (up to about 6 variables): apply multivariate local (linear) polynomial regression
- Large p (> 10 variables): the curse of dimensionality sets in, which leads to the empty space phenomenon
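A small worked calculation can make the empty space phenomenon concrete. The sketch below assumes data spread uniformly over a unit hypercube (an illustrative assumption, not something stated in the talk) and computes the edge length of a sub-cube needed to capture a given fraction of the points. The function name `edge_length` is hypothetical.

```python
def edge_length(fraction, p):
    """Edge length of a sub-cube holding `fraction` of uniform data in [0, 1]^p."""
    # A sub-cube with edge s has volume s**p, so s = fraction**(1/p).
    return fraction ** (1.0 / p)

# Edge length needed to capture 10% of the data as the dimension grows.
for p in (1, 2, 6, 10, 50):
    print(p, round(edge_length(0.10, p), 3))
```

As p grows, the "local" neighborhood has to span most of each axis to contain any meaningful share of the data, which is why local smoothers become impractical for large p.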
Principal Components (PCR, Massy 1965, JASA)
- Start from the high-dimensional data $X$ with covariance matrix $\Sigma_x$
- Eigenvalue decomposition: $\Sigma_x e = \lambda e$, giving the eigenpairs $(\lambda_1, e_1), (\lambda_2, e_2), \ldots, (\lambda_p, e_p)$
- Retain K components $(e_1, e_2, \ldots, e_K)$, where K < p
- Low-dimensional data: $Z = (z_1, z_2, \ldots, z_K)$, where $z_i = X e_i$ are the "new" variables (or factors)
- Open question: what is the dimension K of the low-dimensional subspace?
- The factors are not the most predictive variables, because the information in y is ignored
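The PCR steps above can be sketched in a few lines of NumPy. This is a minimal illustration on simulated data, not the code used in the talk: the function name `pcr_scores`, the simulated `X` and `y`, and the choice K = 2 are all assumptions, and the columns of X are centered before forming the scores.

```python
import numpy as np

def pcr_scores(X, K):
    """Return PC scores Z (n x K), loadings E (p x K), and their eigenvalues."""
    Xc = X - X.mean(axis=0)                       # center each column
    sigma_x = np.cov(Xc, rowvar=False)            # p x p sample covariance
    eigvals, eigvecs = np.linalg.eigh(sigma_x)    # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:K]           # indices of the K largest
    E = eigvecs[:, top]
    return Xc @ E, E, eigvals[top]

# Simulated data (hypothetical): 200 observations, 8 correlated predictors.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8)) @ rng.normal(size=(8, 8))
y = X[:, 0] - X[:, 1] + rng.normal(scale=0.5, size=200)

Z, E, lam = pcr_scores(X, K=2)                    # y plays no role in choosing E
design = np.column_stack([np.ones(len(y)), Z])    # intercept plus the K factors
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
```

Note that `y` only enters in the final least-squares step, which is the weakness the slide points out: the retained components explain variance in X, not necessarily variation in y.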
Sliced Inverse Regression (SIR, Li 1991, JASA)
- Similar idea: reduce $X_{n \times p}$ to $Z_{n \times K}$
- Generalized eigen-decomposition: $\Sigma_{\eta} e = \lambda \Sigma_x e$, where $\Sigma_{\eta} = \mathrm{Cov}(E[X \mid y])$
- Retain K* components $(e_1, \ldots, e_{K^*})$
- Create new variables $Z = (z_1, \ldots, z_{K^*})$, where $z_i = X e_i$
- K* is the smallest integer q (= 0, 1, 2, ...) such that [criterion not recovered from the slide]
- These are the most predictive variables across any set of unit-norm vectors e and any transformation T(y)
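A minimal sketch of the SIR computation may help here. It estimates Cov(E[X|y]) by slicing the sorted y values and averaging X within each slice, then solves the generalized eigenproblem by whitening with the inverse square root of the covariance of X. The function name `sir_directions`, the variable name `sigma_eta`, the number of slices, and the simulated single-index data are assumptions made for illustration; the rule for choosing K* is not covered.

```python
import numpy as np

def sir_directions(X, y, n_slices=10, K=1):
    """Return K unit-norm SIR directions (p x K) and their eigenvalues."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    sigma_x = np.cov(Xc, rowvar=False)

    # Estimate Cov(E[X|y]) from slice means of X, slicing on sorted y.
    sigma_eta = np.zeros((p, p))
    for idx in np.array_split(np.argsort(y), n_slices):
        m = Xc[idx].mean(axis=0)
        sigma_eta += (len(idx) / n) * np.outer(m, m)

    # Solve sigma_eta e = lam * sigma_x e by whitening with sigma_x^(-1/2).
    w, V = np.linalg.eigh(sigma_x)
    inv_sqrt = V @ np.diag(w ** -0.5) @ V.T
    lam, U = np.linalg.eigh(inv_sqrt @ sigma_eta @ inv_sqrt)
    top = np.argsort(lam)[::-1][:K]
    E = inv_sqrt @ U[:, top]                 # map back to the original scale
    E /= np.linalg.norm(E, axis=0)           # unit-norm directions
    return E, lam[top]

# Simulated single-index data (hypothetical): y depends on X only through X @ beta.
rng = np.random.default_rng(1)
n, p = 2000, 5
X = rng.normal(size=(n, p))
beta = np.array([1.0, 1.0, 0.0, 0.0, 0.0]) / np.sqrt(2)
y = (X @ beta) ** 3 + 0.1 * rng.normal(size=n)

E, lam = sir_directions(X, y, n_slices=10, K=1)
alignment = abs(E[:, 0] @ beta)              # near 1 when the direction is recovered
```

The link function (a cube here) is never specified to the estimator, which is the point of the method: the direction is recovered without knowing g.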
SIR Applications (Naik, Hagerty, Tsai 2000, JMR)
- Model: p variables reduced to K factors
- New Product Development context: 28 variables reduced to 1 factor
- Direct Marketing context: 73 variables reduced to 2 factors
Constrained Inverse Regression (CIR, Naik and Tsai 2005, JASA)
- Can we extract meaningful factors? Yes, by incorporating prior information about the factors
- First capture this information in a set of constraints
- Then apply our proposed method, CIR
Example 4.1 from Naik and Tsai (2005, JASA)
- Consider a 2-factor model with p = 5 variables
- Factor 1 includes variables (4, 5)
- Factor 2 includes variables (1, 2, 3)
- Constraint sets: [constraint matrices not recovered from the slide]
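CIR (contd.)
- CIR approach: solve the eigenvalue decomposition $(I - P_c)\,\Sigma_{\eta} e = \lambda \Sigma_x e$, where $\Sigma_{\eta} = \mathrm{Cov}(E[X \mid y])$ and $P_c$ is the projection matrix built from the constraints [formula not recovered from the slide]
- When $P_c = 0$, we get SIR (i.e., SIR is nested within CIR)
- Shrinkage (e.g., Lasso): set insignificant effects to zero by formulating an appropriate constraint, which improves t-values for the other effects (i.e., efficiency)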
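The constrained eigenproblem can be sketched as follows. The slide's formula for the projection matrix is missing from the source, so this sketch assumes linear constraints of the form A'e = 0 and the projection $P_c = A (A' \Sigma_x^{-1} A)^{-1} A' \Sigma_x^{-1}$, which is what maximizing $e' \mathrm{Cov}(E[X \mid y])\, e / e' \Sigma_x e$ subject to A'e = 0 leads to; treat it as an assumption rather than the paper's exact statement. The code solves the equivalent symmetric problem in whitened coordinates. The names `slice_mean_cov`, `cir_direction`, and `sigma_eta`, the constraint matrix `A`, and the simulated data are all hypothetical.

```python
import numpy as np

def slice_mean_cov(Xc, y, n_slices=10):
    """Estimate Cov(E[X|y]) from slice means of centered X, slicing on sorted y."""
    n, p = Xc.shape
    out = np.zeros((p, p))
    for idx in np.array_split(np.argsort(y), n_slices):
        m = Xc[idx].mean(axis=0)
        out += (len(idx) / n) * np.outer(m, m)
    return out

def cir_direction(sigma_x, sigma_eta, A):
    """Leading direction e maximizing e' sigma_eta e / e' sigma_x e with A'e = 0."""
    w, V = np.linalg.eigh(sigma_x)
    inv_sqrt = V @ np.diag(w ** -0.5) @ V.T
    B = inv_sqrt @ A                              # constraints in whitened coordinates
    Q = np.eye(len(w)) - B @ np.linalg.solve(B.T @ B, B.T)   # projector onto {u: B'u = 0}
    lam, U = np.linalg.eigh(Q @ inv_sqrt @ sigma_eta @ inv_sqrt @ Q)
    e = inv_sqrt @ U[:, -1]                       # eigh sorts ascending; last is largest
    return e / np.linalg.norm(e), lam[-1]

# Simulated data (hypothetical): y depends only on variables 4 and 5 of 5.
rng = np.random.default_rng(3)
n, p = 2000, 5
X = rng.normal(size=(n, p))
y = np.tanh(X[:, 3] + X[:, 4]) + 0.1 * rng.normal(size=n)

A = np.eye(p)[:, :3]          # assumed constraint: zero loadings on variables 1 to 3
Xc = X - X.mean(axis=0)
sigma_x = np.cov(Xc, rowvar=False)
sigma_eta = slice_mean_cov(Xc, y)
e, lam = cir_direction(sigma_x, sigma_eta, A)
target = np.array([0.0, 0.0, 0.0, 1.0, 1.0]) / np.sqrt(2)
```

With an empty constraint set the projector reduces to the identity and the computation collapses to plain SIR, which matches the nesting noted on the slide.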
p > N Problem
- OLS, MLE, SIR, CIR break down when p > N
- Partial Inverse Regression (Li, Cook, Tsai, Biometrika, forthcoming)
  - Combines ideas from PLS and SIR
  - Works well even when p > 3N
  - Variables are highly correlated
  - Single-index model with g(.) unknown
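p > N Solution
- To estimate the index vector of the single-index model, first construct the matrix R [formula not recovered from the slide], where $e_1$ is the principal eigenvector of $\mathrm{Cov}(E[X \mid y])$
- Then [estimator formula not recovered from the slide]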
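Since the formulas for R and for the final estimator did not survive in the source, the sketch below fills them with an assumption: a Krylov-type construction borrowed from PLS, $R_q = (e_1, \Sigma_x e_1, \ldots, \Sigma_x^{q-1} e_1)$ and $\hat\beta = R_q (R_q' \Sigma_x R_q)^{-1} R_q' e_1$. This should be read as one plausible way to combine PLS and SIR ideas, not as the paper's exact estimator. The names `pir_direction` and `sigma_eta`, the value q = 2, the number of slices, and the simulated data are hypothetical.

```python
import numpy as np

def pir_direction(X, y, q=2, n_slices=5):
    """Assumed Krylov-type estimate of the index direction; usable when p > n."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    sigma_x = np.cov(Xc, rowvar=False)            # singular when p > n; never inverted

    # Estimate Cov(E[X|y]) from slice means and take its principal eigenvector.
    sigma_eta = np.zeros((p, p))
    for idx in np.array_split(np.argsort(y), n_slices):
        m = Xc[idx].mean(axis=0)
        sigma_eta += (len(idx) / n) * np.outer(m, m)
    e1 = np.linalg.eigh(sigma_eta)[1][:, -1]      # eigenvector of the largest eigenvalue

    # Krylov matrix R = (e1, sigma_x e1, ..., sigma_x^(q-1) e1).
    cols = [e1]
    for _ in range(q - 1):
        cols.append(sigma_x @ cols[-1])
    R = np.column_stack(cols)

    # Only the small q x q matrix R' sigma_x R is inverted.
    beta = R @ np.linalg.solve(R.T @ sigma_x @ R, R.T @ e1)
    return beta / np.linalg.norm(beta), R

# Simulated data (hypothetical): p > n with highly correlated predictors.
rng = np.random.default_rng(2)
n, p = 60, 100
common = rng.normal(size=(n, 1))
X = 0.95 * common + 0.3 * rng.normal(size=(n, p))
y = np.exp(X.mean(axis=1)) + 0.1 * rng.normal(size=n)

beta_hat, R = pir_direction(X, y, q=2)
```

The design choice worth noting is that the p x p covariance of X is only ever multiplied, never inverted, so the computation stays well defined even though that matrix is singular when p > N.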
Conclusions
- Inverse regression methods offer estimators that are applicable to
  - a remarkably broad class of models
  - high-dimensional data, including p > N (which is conceptually the limiting case)
- Estimators are closed-form, so
  - Easy to code (just a few lines)
  - Computationally inexpensive
  - No iterations, re-sampling, or draws (hence no do or for loops)
  - Guaranteed convergence
- Standard errors for inference are derived in the cited papers