
Inverse Regression Methods

Prasad Naik

7th Triennial Choice Symposium

Wharton, June 16, 2007

Outline
  • Motivation
  • Principal Components (PCR)
  • Sliced Inverse Regression (SIR)
    • Application
  • Constrained Inverse Regression (CIR)
  • Partial Inverse Regression (PIR)
    • p > N problem
    • simulation results
Motivation
  • Estimate the high-dimensional model:
    • y = g(x1, x2, ..., xp)
    • Link function g(.) is unknown
  • Small p (≤ 6 variables)
    • apply multivariate local (linear) polynomial regression
  • Large p (> 10 variables)
    • Curse of dimensionality => Empty space phenomenon
Principal Components (PCR, Massy 1965, JASA)
  • PCR
    • High-dimensional data X → Σx
    • Eigenvalue decomposition
      • Σx e = λ e
      • (λ1, e1), (λ2, e2), ..., (λp, ep)
    • Retain K components, (e1, e2, ..., eK)
      • where K < p
    • Low-dimensional data, Z = (z1, z2, ..., zK)
      • where zi = Xei are the “new” variables (or factors)
  • Low-dimensional subspace, K = ??
  • Not the most predictive variables
    • Because y information is ignored
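To make the recipe concrete, here is a minimal PCR sketch in Python/NumPy. It is not from the slides; the synthetic data, the choice K = 2, and all names are illustrative:

```python
import numpy as np

def pcr(X, y, K):
    """Principal-components regression: project X onto its top-K
    eigenvectors, then regress y on the resulting factors."""
    Xc = X - X.mean(axis=0)              # center the predictors
    Sx = np.cov(Xc, rowvar=False)        # covariance matrix Sigma_x
    lam, E = np.linalg.eigh(Sx)          # eigh returns ascending eigenvalues
    E = E[:, ::-1][:, :K]                # keep top-K eigenvectors e_1..e_K
    Z = Xc @ E                           # new variables z_i = X e_i
    design = np.column_stack([np.ones(len(y)), Z])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef, E

# Toy usage with synthetic data
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8))
y = X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=100)
coef, E = pcr(X, y, K=2)
```

Note that y never enters the eigendecomposition, which is exactly the weakness the slide points out.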
Sliced Inverse Regression (SIR, Li 1991, JASA)
  • Similar idea: X(n×p) → Z(n×K)
  • Generalized Eigen-decomposition
    • Ω e = λ Σx e
      • where Ω = Cov(E[X|y])
    • Retain K* components, (e1, ..., eK*)
    • Create new variables Z = (z1,..., zK*), where zi = Xei
  • K* is the smallest integer q (= 0, 1, 2, ...) such that the smallest p − q eigenvalues are statistically indistinguishable from zero (Li's 1991 chi-squared test)
  • Most predictive variables across
    • any set of unit-norm vectors e’s and
    • any transformation T(y)
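A minimal SIR sketch under the slide's definitions (Ω = Cov(E[X|y]) estimated by slicing on y; generalized eigenproblem Ω e = λ Σx e). The slicing scheme and all names are my assumptions:

```python
import numpy as np
from scipy.linalg import eigh  # solves the generalized problem A v = lam B v

def sir_directions(X, y, n_slices=10, K=1):
    """Sliced inverse regression: estimate Omega = Cov(E[X|y]) by
    slicing on y, then solve Omega e = lambda Sigma_x e."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    Sx = np.cov(Xc, rowvar=False)
    order = np.argsort(y)                     # group observations by y
    Omega = np.zeros((p, p))
    for idx in np.array_split(order, n_slices):
        m = Xc[idx].mean(axis=0)              # slice mean of X
        Omega += (len(idx) / n) * np.outer(m, m)
    lam, E = eigh(Omega, Sx)                  # generalized eigendecomposition
    return E[:, ::-1][:, :K], lam[::-1]       # leading directions first
```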
SIR Applications (Naik, Hagerty, Tsai 2000, JMR)
  • Model: y = g(x'β1, ..., x'βK, ε), with g(·) unknown
  • p variables reduced to K factors
  • New Product Development context
    • 28 variables → 1 factor
  • Direct Marketing context
    • 73 variables → 2 factors
Constrained Inverse Regression (CIR, Naik and Tsai 2005, JASA)
  • Can we extract meaningful factors?
  • Yes
    • First capture the desired factor structure in a set of constraints
    • Then apply our proposed method, CIR
Example 4.1 from Naik and Tsai (2005, JASA)
  • Consider 2-Factor Model
    • p = 5 variables
    • Factor 1 includes variables (4,5)
    • Factor 2 includes variables (1,2,3)
  • Constraint sets:
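The constraint matrices shown on this slide did not survive the transcript; the encoding below is one natural guess that matches the stated structure, forcing each factor's loadings to zero on the variables it excludes:

```python
import numpy as np

p = 5
I = np.eye(p)
# Assumed encoding of Example 4.1: factor 1 (vars 4,5) must have zero
# loadings on vars 1-3, i.e. C1' e1 = 0, and factor 2 (vars 1,2,3) must
# have zero loadings on vars 4-5, i.e. C2' e2 = 0.
C1 = I[:, [0, 1, 2]]
C2 = I[:, [3, 4]]
```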
CIR (contd.)
  • CIR approach
    • Solve the eigenvalue decomposition:
    • (I − Pc) Ω e = λ Σx e
    • where Pc = C(C'C)⁻¹C' is the projection matrix onto the constraint space
  • When Pc = 0, we get SIR (i.e., nested)
  • Shrinkage (e.g., Lasso)
    • set insignificant effects to zero by formulating an appropriate constraint
    • improves t-values for the other effects (i.e., efficiency)
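A minimal CIR sketch of the eigenproblem above, reusing Omega and Sx from the SIR sketch; taking Pc = C(C'C)⁻¹C' is my reading of the slide:

```python
import numpy as np
from scipy.linalg import eig

def cir_direction(Omega, Sx, C):
    """Constrained inverse regression: solve (I - Pc) Omega e = lam Sigma_x e,
    where Pc projects onto the span of the constraint matrix C."""
    p = Omega.shape[0]
    Pc = C @ np.linalg.solve(C.T @ C, C.T)   # Pc = C (C'C)^{-1} C'
    A = (np.eye(p) - Pc) @ Omega             # left-hand side; not symmetric
    lam, E = eig(A, Sx)                      # general eigenproblem
    order = np.argsort(lam.real)[::-1]       # sort eigenvalues, descending
    return E[:, order[0]].real, lam.real[order]
```

With Pc = 0 the problem reduces to the SIR eigenproblem, matching the nesting noted above.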
p > N Problem
  • OLS, MLE, SIR, CIR break down when p > N
  • Partial Inverse Regression (Li, Cook, Tsai, Biometrika, forthcoming)
    • Combines ideas from PLS and SIR
    • Works well even when
      • p > 3N
      • Variables are highly correlated
  • Single-index Model: y = g(x'β) + ε
    • g(·) unknown
p > N Solution
  • To estimate β, first construct the matrix R (one construction is sketched below)
    • where e1 is the principal eigenvector of Ω = Cov(E[X|y])
  • Then the estimate of β follows from R in closed form
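Because the slide's formulas for R and the estimator were images, the sketch below is only a plausible PLS-style reconstruction: a Krylov matrix built from e1 and Σx, with β estimated by least squares within its span. This is an assumption, not necessarily the exact estimator of Li, Cook, and Tsai; see the paper for the precise form:

```python
import numpy as np

def pir_sketch(X, y, e1, m=3):
    """A plausible PLS-style reading of partial inverse regression:
    build R = [e1, Sx e1, ..., Sx^{m-1} e1] and fit beta in its span.
    This construction is an assumption, not the paper's exact formula."""
    Xc = X - X.mean(axis=0)
    Sx = np.cov(Xc, rowvar=False)
    cols = [e1]
    for _ in range(m - 1):
        cols.append(Sx @ cols[-1])           # Krylov sequence in Sigma_x
    R = np.column_stack(cols)
    # Restrict beta to span(R): regress y on the m variables Xc @ R,
    # which stays well-posed even when p > N
    a, *_ = np.linalg.lstsq(Xc @ R, y - y.mean(), rcond=None)
    return R @ a                             # beta_hat = R a
```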
Conclusions
  • Inverse Regression Methods offer estimators that are applicable to
    • a remarkably broad class of models
    • high-dimensional data
      • including p > N (which is conceptually the limiting case)
  • Estimators are closed-form, so
    • Easy to code (just a few lines)
    • Computationally inexpensive
      • No iterations, re-sampling, or simulation draws (hence no do or for loops)
    • Guaranteed convergence
    • Standard errors for inference are derived in the cited papers