
Robust Generalized Principal Component Analysis and Its Applications

This paper introduces Generalized Principal Component Analysis (GPCA) and a robust recursive GPCA algorithm. It discusses the minimum effective dimension (MED) criterion and explores various applications of GPCA, concluding with directions for future research.


Presentation Transcript


  1. Robust Generalized Principal Component Analysis and Its Applications Kun Huang Perception & Decision Laboratory Decision & Control Group, CSL Electrical & Computer Engineering Dept., UIUC http://black.csl.uiuc.edu/~kunh

  2. NEUROPHYSIOLOGY AND NETWORKED CONTROL SYSTEM • Neuromodulation in the sea slug Pleurobranchaea (R. Gillette Lab). • Testbed for networked control systems (IT Convergence Lab). [Figure: signaling-pathway labels IcAMP, Na, nitric oxide, Ca2+.]

  3. GEOMETRIC VISION • Symmetry-based photo editing; Berkeley Aerial Robot (BEAR) Project. • 3-D reconstruction from multiple views. • A unified characterization of constraints among images (ACCV ’02). • 3-D reconstruction algorithms (IJCV ’04, ECCV ’02, ECCV-VAMODS ’02). • Symmetry-based 3-D reconstruction (IJCV ’04). • Automatic recognition and reconstruction of symmetric objects from images (ICCV ’03). • Large-baseline matching of symmetric objects across multiple images (ICRA ’04). • Photo editing and image alignment (ICCV-HLK ’03).

  4. INTRODUCTION GENERALIZED PRINCIPAL COMPONENT ANALYSIS (GPCA) MINIMUM EFFECTIVE DIMENSION (MED) AND ROBUST GPCA APPLICATIONS CONCLUSION AND FUTURE DIRECTIONS

  5. INTRODUCTION GENERALIZED PRINCIPAL COMPONENT ANALYSIS (GPCA) MINIMUM EFFECTIVE DIMENSION (MED) AND ROBUST GPCA APPLICATIONS CONCLUSION AND FUTURE DIRECTIONS

  6. INTRODUCTION – High-Dimensional Data (Gardner et al. ’03)

  7. INTRODUCTION – Principal Component Analysis (PCA) • Dimensionality reduction: find a low-dimensional representation (model) of high-dimensional data (Gauss, Jain, …). • Principal component analysis (PCA, SVD, KLT): the SVD yields a basis for the data matrix (Zhang ’04). • Variations of PCA • Nonlinear PCA (Schölkopf-Smola-Müller ’98) • Probabilistic PCA (Tipping-Bishop ’99, Collins et al. ’01) • Higher-Order SVD (HOSVD) (Tucker ’66, Davis ’02). (A PCA-via-SVD sketch follows.)
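
The following is a minimal PCA-via-SVD sketch in Python (illustrative only; the synthetic data, function name, and dimensions are mine, not from the talk): center the data, take the top right singular vectors as the basis, and project.

```python
import numpy as np

def pca_basis(X, d):
    """PCA via SVD: rows of X are N points in R^K; returns a K x d basis
    and the N x d low-dimensional coordinates."""
    Xc = X - X.mean(axis=0)                      # center the data
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    B = Vt[:d].T                                 # top-d right singular vectors
    return B, Xc @ B

# Example: 200 points near a 2-D plane in R^5.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2)) @ rng.normal(size=(2, 5)) \
    + 0.01 * rng.normal(size=(200, 5))
B, Y = pca_basis(X, 2)
```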

  8. INTRODUCTION – Multiple Model Fitting Multiple linear models: • the number of subspaces is unknown • the dimensions of the subspaces are unknown • the bases of the subspaces are unknown • the segmentation of the data points is unknown. Chicken-and-egg problem: • Given the segmentation, estimate the subspaces • Given the subspaces, segment the data

  9. INTRODUCTION – Multiple Model Fitting • Geometric approaches (Boult et al., Costeira et al., Kanatani) • Segment data using similarity matrices + clustering • Apply standard PCA to each group • Spectral algorithm (Vempala-Wang ’02) • Iterative approaches • Generative model: data membership + mixture model • Identify subspaces using Expectation Maximization (EM) • E-step: estimate membership given model parameters • M-step: estimate parameters given membership • Probabilistic PCA (Tipping-Bishop ’99), K-subspaces (Ho et al. ’03), subspace growing and selection (Leonardis et al. ’02). How can we initialize iterative algorithms? (A sketch of the alternation follows.)
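
The alternation on this slide is easy to sketch. Below is a hypothetical K-subspaces-style loop in the spirit of Ho et al. ’03 (not the authors' code), assuming s linear subspaces of a common dimension d through the origin:

```python
import numpy as np

def k_subspaces(X, s, d, n_iter=30, seed=0):
    """Alternate the two steps of the chicken-and-egg problem:
    M-step: fit a subspace to each group by SVD (no centering, since
            the subspaces pass through the origin);
    E-step: reassign each point to the subspace with smallest residual."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, s, size=len(X))     # random initialization
    for _ in range(n_iter):
        bases = []
        for j in range(s):
            Xj = X[labels == j]
            if len(Xj) <= d:                     # degenerate group: reseed it
                Xj = X[rng.choice(len(X), size=d + 1, replace=False)]
            _, _, Vt = np.linalg.svd(Xj, full_matrices=False)
            bases.append(Vt[:d].T)               # d-dimensional basis
        resid = np.stack([np.linalg.norm(X - (X @ B) @ B.T, axis=1)
                          for B in bases])       # distance to each subspace
        labels = resid.argmin(axis=0)
    return labels, bases
```

Like EM, such alternation only converges locally, which is exactly why initialization matters.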

  10. INTRODUCTION – Generalized PCA (GPCA) • Generalized PCA (Vidal-Ma-Sastry ’03, ’04) • Solves the “chicken-and-egg” dilemma • Goes directly from data to a mixture of models • Closed-form analytical solution • Proposes an algebro-geometric approach to data segmentation • Number of groups = degree of a polynomial • Groups = polynomial factors • In the absence of noise • A unique solution exists, and it is closed form iff n < 5 • The exact solution can be computed using linear algebra

  11. INTRODUCTION – Recursive GPCA • Polynomial factorization does not apply to a mixture of an unknown number of linear models with possibly different dimensions. • Recursive GPCA (Huang-Vidal-Ma CVPR ’04, Huang-Yang-Ma ICIP ’04, Huang-Wagner-Ma CDC ’04) • Recursively segment the data points.

  12. INTRODUCTION – Model Selection Criteria What is the stopping criterion for the recursive process? In the presence of noise, how can we decide the number of groups? How do we handle outliers? Note that for some applications, outliers are present even in the absence of noise. • Model selection criteria (balance model complexity and data fidelity) • Minimum message length (MML) (Wallace-Boulton ’68) • Minimum description length (MDL) (Rissanen ’78) • Akaike information criterion (AIC) (Akaike ’77) • Geometric AIC (Kanatani ’03) • Robust AIC (Torr ’98) • Minimum effective dimension (MED) (Huang-Vidal-Ma ’04) • Specifically developed for mixtures of linear models

  13. INTRODUCTION – Robust GPCA Algorithm • Robust recursive GPCA algorithm (Huang-Vidal-Ma CVPR ’04, Huang-Yang-Ma ICIP ’04, Huang-Wagner-Ma CDC ’04) • In the absence of noise and outliers • Recursively identifies the correct number of subspaces and their dimensions and bases. • In the presence of noise and outliers • Robustly segments the data points based on a specified maximum error tolerance; • Automatically determines the number of subspaces based on the specified error tolerance and percentage of outliers. • Applications • Motion segmentation • Image compression • Hybrid system identification

  14. INTRODUCTION GENERALIZED PRINCIPAL COMPONENT ANALYSIS (GPCA) MINIMUM EFFECTIVE DIMENSION (MED) AND ROBUST GPCA APPLICATIONS CONCLUSION AND FUTURE DIRECTIONS

  15. GENERALIZED PCA – Problem Formulation Given a set of N data points sampled from s (unknown) different subspaces in a K-dimensional ambient space: • Estimate the number of subspaces s and the dimension ki (i = 1, 2, …, s) of each subspace, and identify a basis for each subspace; • Segment the given data points into the subspaces. Can the problem be solved non-iteratively?

  16. GENERALIZED PCA – An Example Identification of subspaces ↔ identification of polynomials: by De Morgan’s rule, a point lies in the union of the subspaces iff the product of the corresponding linear forms vanishes, so the union is the zero set of a polynomial.

  17. GENERALIZED PCA – Polynomial Fitting (Veronese map) The null space of Ls contains the coefficients of all the vanishing polynomials. (A sketch of the embedding follows.)
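
A sketch of the fitting step (helper names and the numerical-rank threshold are my assumptions): the Veronese map νs sends x to all degree-s monomials of its entries, and the coefficient vectors of the vanishing polynomials span the null space of the stacked matrix Ls.

```python
import numpy as np
from itertools import combinations_with_replacement

def veronese(X, s):
    """Degree-s Veronese embedding: each row x of X is mapped to the
    vector of all degree-s monomials of its entries."""
    monomials = list(combinations_with_replacement(range(X.shape[1]), s))
    return np.array([[np.prod(x[list(m)]) for m in monomials] for x in X])

def fit_vanishing_polynomials(X, s, tol=1e-8):
    """Coefficient vectors c_i of the polynomials p_i(x) = c_i^T nu_s(x)
    that vanish on the data: a basis of the null space of Ls."""
    Ls = veronese(X, s)
    _, sv, Vt = np.linalg.svd(Ls)
    rank = int((sv > tol * sv[0]).sum())          # numerical rank of Ls
    return Vt[rank:]                              # rows span null(Ls)
```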

  18. GENERALIZED PCA – Polynomial Differentiation The mixture of subspaces can be recovered via polynomial differentiation: the gradient of a vanishing polynomial at a data point is normal to the subspace containing that point. (A sketch follows.)
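
A numerical sketch of the differentiation step, reusing the veronese() helper above (the papers differentiate the embedding in closed form; central differences are enough for illustration):

```python
import numpy as np

def poly_eval(c, x, s):
    """Evaluate p(x) = c^T nu_s(x) at a single point x."""
    return float(c @ veronese(x[None, :], s)[0])

def poly_grad(c, x, s, eps=1e-6):
    """Central-difference gradient Dp(x). For x on subspace S_i, Dp(x)
    is orthogonal to S_i; stacking the gradients of all polynomials
    gives DP(x), and k_i = K - rank(DP(x))."""
    g = np.zeros(len(x))
    for k in range(len(x)):
        e = np.zeros(len(x)); e[k] = eps
        g[k] = (poly_eval(c, x + e, s) - poly_eval(c, x - e, s)) / (2 * eps)
    return g
```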

  19. GENERALIZED PCA – Recursive GPCA Algorithm Given N points from n subspaces in a K-dimensional ambient space: • Find n: set s = 1, construct the Veronese data matrix Ls, and increase s until Ls is no longer full rank (a rank-test sketch follows). • Find the polynomials: solve for c1, c2, …, cl, a basis of the null space of Ls, and define the polynomials pi(x) = ciT νs(x) (i = 1, 2, …, l). • Find points: find one point xi on each subspace using the polynomials. • Differentiate: for each point xi, compute the derivatives of all the polynomials at xi to obtain DP(xi) = [Dp1(xi), …, Dpl(xi)]. • Identify the subspaces: perform PCA on DP(xi); the dimension of the i-th subspace is ki = K − rank(DP(xi)), and the principal directions are the normal vectors of the subspace. • Segment the points: group every data point into the closest subspace. • Recurse on every group. When should the recursion stop?
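
The “Find n” step can be sketched as a rank test on Ls, again reusing veronese() (the tolerance is an assumed numerical threshold, and this simple test matches the equal-degree case on the slide; the general case needs more care):

```python
import numpy as np

def estimate_num_subspaces(X, s_max=5, tol=1e-8):
    """Increase s until the Veronese data matrix Ls drops rank:
    the smallest such s is the estimated number of subspaces."""
    for s in range(1, s_max + 1):
        Ls = veronese(X, s)
        sv = np.linalg.svd(Ls, compute_uv=False)
        if (sv > tol * sv[0]).sum() < Ls.shape[1]:   # Ls not full rank
            return s
    return s_max
```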

  20. INTRODUCTION GENERALIZED PRINCIPAL COMPONENT ANALYSIS (GPCA) MINIMUM EFFECTIVE DIMENSION (MED) AND ROBUST GPCA APPLICATIONS CONCLUSION AND FUTURE DIRECTIONS

  21. MINIMUM EFFECTIVE DIMENSION – Introduction and Definition For N points segmented into s subspaces, where subspace i has dimension ki and contains Ni points in a K-dimensional ambient space, the effective dimension is ED = (1/N) Σi=1..s [ ki(K − ki) + Ni ki ]. Problems with model selection: • Model complexity; • Data fidelity; • Robustness and efficiency of the algorithms.
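
A direct transcription of this definition (a sketch; the decomposition into basis cost plus coordinate cost follows the CVPR ’04 paper as I read it):

```python
def effective_dimension(K, dims, counts):
    """ED = (1/N) * sum_i [ k_i*(K - k_i) + N_i*k_i ]:
    k_i*(K - k_i) numbers code the i-th basis (a point on a Grassmannian),
    plus N_i*k_i coordinates for the N_i points in subspace i."""
    N = sum(counts)
    return sum(k * (K - k) + n * k for k, n in zip(dims, counts)) / N

# 200 points on a plane plus 100 points on a line in R^3:
print(effective_dimension(K=3, dims=[2, 1], counts=[200, 100]))  # 1.68
```

With 200 points on a plane and 100 on a line in R^3 the ED is 1.68, close to the final value in the simulation example on slide 25.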

  22. MINIMUM EFFECTIVE DIMENSION – Examples Model selection criterion: Minimum Effective Dimension (MED). [Figure: the same data set fit as two planes versus as one plane and two lines.]

  23. MINIMUM EFFECTIVE DIMENSION – Error Tolerance t The effective dimension of a data set is closely related to the error tolerance t. Extreme cases: • if t is infinite, then ED = 0; • if t is 0, then ED = K. Robust approach for mixture linear model fitting: for a specified maximum error tolerance t, minimize the effective dimension (ED) of the data set.

  24. MED AND ROBUST GPCA ALGORITHM • Robust recursive GPCA algorithm • Assign points to subspaces based on the specified error tolerance t; • Accommodate outliers; • Automatically search for the null space (rank) of Ls and the number of subspaces; • Recursively segment each group. The error tolerance t is the guideline!

  25. A ROBUST RECURSIVE ALGORITHM – A Simulation Example [Figure: the effective dimension decreases as the recursion proceeds: ED = 3 → 2.0067 → 1.6717.]

  26. INTRODUCTION GENERALIZED PRINCIPAL COMPONENT ANALYSIS (GPCA) MINIMUM EFFECTIVE DIMENSION (MED) AND ROBUST GPCA APPLICATIONS CONCLUSION AND FUTURE DIRECTIONS

  27. APPLICATIONS – Motion Segmentation via Point Grouping Data points with different motions belong to different subspaces. All types of motion segmentation problems can be solved by GPCA (Vidal-Ma ’04). Robust GPCA can detect different types of motions (Huang et al. ’04). (A sketch of the subspace structure of trajectories follows.)
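
A sketch of the subspace structure (a standard affine-camera factorization argument with synthetic numbers, not data from the talk): the stacked 2F-dimensional tracks of one rigid motion factor into a 2F x 4 motion matrix times a 4 x P structure matrix, so they span a subspace of dimension at most 4.

```python
import numpy as np

F, P = 10, 30                                   # frames, points per motion

def tracks(seed):
    """Synthetic affine-camera trajectories for one rigid motion:
    the columns of M @ S are the stacked (x, y) tracks of P points."""
    rng = np.random.default_rng(seed)
    M = rng.normal(size=(2 * F, 4))             # stacked affine cameras
    S = rng.normal(size=(4, P))                 # homogeneous 3-D structure
    return M @ S

W = np.hstack([tracks(1), tracks(2)])           # two motions mixed together
print(np.linalg.matrix_rank(tracks(1)),         # 4: one 4-D subspace
      np.linalg.matrix_rank(W))                 # 8: union of two subspaces
```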

  28. APPLICATIONS – Hybrid Linear System Identification

  29. APPLICATIONS – Hybrid Linear System Identification [Figure: block diagram with an input, a switching signal l(t), and an output routed through Systems 1, 2, and 3.] Hybrid linear system identification can be converted into the problem of identifying multiple linear subspaces (Huang-Wagner-Ma ’04). (A sketch of the conversion follows.)
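
A sketch of the conversion with toy coefficients of my choosing: for a switched ARX model, stacking the output and regressor into zt = [yt, yt−1, ut−1] makes each mode's samples satisfy one linear equation ciT zt = 0, so the data lie on a union of hyperplanes that subspace segmentation can separate.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 200
u = rng.normal(size=T)                           # input
y = np.zeros(T)                                  # output
for t in range(1, T):
    if t < T // 2:                               # mode 1: y_t = 0.9 y_{t-1} + 0.5 u_{t-1}
        y[t] = 0.9 * y[t - 1] + 0.5 * u[t - 1]
    else:                                        # mode 2: y_t = -0.7 y_{t-1} + u_{t-1}
        y[t] = -0.7 * y[t - 1] + u[t - 1]

# Each sample z_t = [y_t, y_{t-1}, u_{t-1}] satisfies c_i^T z_t = 0 for the
# active mode, e.g. c_1 = [1, -0.9, -0.5]: points on a union of hyperplanes.
Z = np.stack([y[1:], y[:-1], u[:-1]], axis=1)
print(Z[:T // 2 - 1] @ np.array([1.0, -0.9, -0.5]))   # ~0 for mode-1 samples
```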

  30. APPLICATIONS – Hybrid Linear System Identification • Using hybrid systems to model cell signaling networks (Amonlirdviman et al. ’02). • Using system identification techniques to infer genetic networks (Gardner et al. ’03).

  31. APPLICATIONS – Representation & Compression of Images Pixel-based representation: are there more efficient representations? • Fixed basis • Discrete Fourier transform (DFT) or discrete cosine transform (DCT) (Ahmed ’74); JPEG uses the DCT. • Wavelet transforms (multi-resolution) (Daubechies ’88, Mallat ’92); JPEG-2000 uses the multi-resolution bior4.4 wavelet. • Adaptive basis • Karhunen-Loève transform (KLT), also known as PCA (Jain ’89). • KLT (PCA) is known to be the optimal transform if signals are drawn from a single stationary 2nd-order stochastic model (Effros ’95). • Sparse component analysis (Field ’96, Donoho et al. ’00).

  32. APPLICATIONS – Image Representation with Multiple Linear Models [Figure: image blocks stacked into vectors, fit by PCA (a single adaptive linear model) versus GPCA (multiple adaptive linear models).] (A block-stacking sketch follows.)
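
A block-stacking sketch (the image here is a random stand-in and the function name is mine): cut the image into non-overlapping b x b blocks and stack each into a b*b-dimensional vector; a single KLT fits one subspace to these vectors, while the GPCA scheme fits several, one per block type (smooth, edge, texture, …).

```python
import numpy as np

def stack_blocks(img, b=8):
    """Stack the non-overlapping b x b blocks of a grayscale image into
    row vectors of length b*b."""
    H, W = img.shape
    return (img[:H - H % b, :W - W % b]
            .reshape(H // b, b, W // b, b)
            .swapaxes(1, 2)
            .reshape(-1, b * b))

img = np.random.default_rng(0).random((64, 64))  # stand-in for a real image
X = stack_blocks(img)                            # 64 block vectors in R^64
```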

  33. APPLICATIONS – An Example

  34. APPLICATIONS – Lossy Image Compression [Figure: the original image compared with KLT, DCT (JPEG), Haar wavelet, and GPCA reconstructions.]

  35. APPLICATIONS – Sparse Coding for Image Ensembles

  36. APPLICATIONS – GPCA-based Image Segmentation [Figure: image blocks stacked into vectors and segmented by GPCA.]

  37. INTRODUCTION GENERALIZED PRINCIPAL COMPONENT ANALYSIS (GPCA) MINIMUM EFFECTIVE DIMENSION (MED) AND ROBUST GPCA APPLICATIONS CONCLUSION AND FUTURE DIRECTIONS

  38. CONCLUSIONS (My Contributions) • Multi-model fitting • Statistics versus (polynomial) algebra. • GPCA solves the hybrid linear model problem non-iteratively. • Algorithm • Minimum effective dimension criterion. • Robust recursive GPCA algorithm. • Applications • Many practical problems can be solved by GPCA: motion segmentation, hybrid LTI system identification, vanishing point detection. • GPCA can be used to fit piecewise linear models to complex data sets: sparse representation of images, image/video segmentation.

  39. FUTURE DIRECTIONS • Computation • Noise sensitivity, software toolbox, user interface. • Mathematics • Extension of GPCA to more complex algebraic varieties (e.g., quadratic, cubic, manifolds). • Applications to other data • Medical imaging, image segmentation and compression • Bioinformatics, proteomics data analysis • Systems biology, genetic networks • Hyper-spectral imaging • Video segmentation and event detection • Speech, music.

  40. ACKNOWLEDGEMENT UIUC: Prof. Yi Ma (ECE) Prof. P. R. Kumar (ECE) Prof. Robert Fossum (MATH) Prof. Rhanor Gillette (PHYSL) Dr. Nathan Hatcher (PHYSL) Dr. Jie Zhang (BIOCHEM) Wei Hong (ECE) Yang Yang (ECE) Shankar Rao (ECE) Andrew Wagner (GE) Scott Graham (ECE) Girish Baliga (CS) Johns Hopkins University: Prof. Rene Vidal (BME) UC Berkeley: Prof. Shankar Sastry (EECS) Dr. Omid Shakernia (EECS) George Mason University: Prof. Jana Kosecka (CS) University of Florida: Prof. Leonard Moroz (PHYSL) Funding agencies: DARPA, ONR, NSF.

  41. 11/2003 Generalized Principal Component Analysis and Its Applications Kun Huang Thank You! Perception & Decision Laboratory Decision & Control Group, CSL Electrical & Computer Engineering Dept., UIUC http://black.csl.uiuc.edu/~kunh
