USPACOR: Universal Sparsity-Controlling Outlier Rejection

1. USPACOR: Universal Sparsity-Controlling Outlier Rejection
G. B. Giannakis, G. Mateos, S. Farahmand, V. Kekatos, and H. Zhu
ECE Department, University of Minnesota
Acknowledgments: NSF grants no. CCF-0830480, 1016605; EECS-0824007, 1002180
May 24, 2011

2. Robust learning
[Figure panels: DNA microarray, preference modeling, traffic surveillance]
Major innovative claim: sparsity control ⇒ robustness control
• Motivation: (statistical) learning from high-dimensional data
• Outliers: data not adhering to postulated models
• Resilience is key to model selection, prediction, classification, tracking, ...
• Our goal: `universally' robustify learning algorithms

3. Robustifying linear regression
Least-trimmed squares (LTS) regression [Rousseeuw'87]:
  β̂_LTS = arg min_β Σ_{i=1}^{s} r²_[i](β)
• r²_[i](β) is the i-th order statistic among the squared residuals r²_1(β), ..., r²_n(β)
• The n − s largest residuals are discarded
• Q: How should we go about minimizing the nonconvex (LTS)?
  A: Try all subsets of size s, solve LS on each, and pick the best (brute-force sketch below)
• Simple but intractable beyond small problems
• Near-optimal solvers [Rousseeuw'06]; RANSAC [Fischler-Bolles'81]
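The exhaustive search mentioned above can be written down directly. The sketch below illustrates that brute-force LTS strategy only (not the near-optimal solvers cited on the slide); the trimming parameter s is assumed given and at least as large as the number of regressors.

```python
# Brute-force least-trimmed squares (LTS): fit LS on every size-s subset,
# evaluate the trimmed cost (sum of the s smallest squared residuals),
# and keep the best fit. Intractable beyond tiny n; for illustration only.
from itertools import combinations
import numpy as np

def lts_bruteforce(X, y, s):
    n = X.shape[0]
    best_cost, best_beta = np.inf, None
    for subset in combinations(range(n), s):
        idx = list(subset)
        beta, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)  # LS on the subset
        trimmed = np.sort((y - X @ beta) ** 2)[:s]              # s smallest squared residuals
        cost = trimmed.sum()
        if cost < best_cost:
            best_cost, best_beta = cost, beta
    return best_beta, best_cost
```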

4. Modeling outliers
• Outlier variables: o_i ≠ 0 if datum i is an outlier, o_i = 0 otherwise
• Nominal data obey y_i = x_i'β + ε_i; outliers obey something else: y_i = x_i'β + o_i + ε_i
• Related: ε-contamination [Fuchs'99], Bayesian framework [Jin-Rao'10]
• Remarks:
  - Both β and the outlier vector o := [o_1, ..., o_n]' are unknown
  - If outliers are sporadic, then the vector o is sparse! (synthetic example below)
• Natural (but intractable) nonconvex estimator: LS fit of β and o subject to a limit on the number of nonzero entries of o (an ℓ0 constraint)
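As a concrete illustration of the model above (nominal linear data plus a sparse outlier vector o), the snippet below generates synthetic data of this form; the dimensions, noise level, and outlier magnitudes are arbitrary choices, not taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, n_out = 200, 10, 20                        # assumed sizes (sparse: n_out << n)
X = rng.standard_normal((n, p))
beta_true = rng.standard_normal(p)
eps = 0.1 * rng.standard_normal(n)               # nominal noise

o = np.zeros(n)                                   # outlier vector: mostly zeros
idx_out = rng.choice(n, n_out, replace=False)
o[idx_out] = rng.uniform(5, 10, n_out) * rng.choice([-1, 1], n_out)

y = X @ beta_true + o + eps                       # y_i = x_i' beta + o_i + eps_i
```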

5. LTS as sparse regression
(P0):  min_{β,o} ||y − Xβ − o||²_2 + λ_0 ||o||_0
• Tuning parameter λ_0 controls sparsity in o ⇔ number of outliers
• Proposition 1: If {β̂, ô} solves (P0) with λ_0 chosen such that ||ô||_0 = n − s, then β̂ = β̂_LTS in (LTS).
• (P0) is the Lagrangian form of the ℓ0-constrained estimator on the previous slide
• The result:
  - Formally justifies the outlier-aware regression model and its estimator (P0)
  - Ties sparse regression with robust estimation
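For reference, the two estimators related by Proposition 1 can be written out side by side. The LaTeX below is a reconstruction of the formulas that appeared as images on the slides, using standard notation (y stacks the observations, X the regressors, o the outlier vector):

```latex
% Least-trimmed squares: keep only the s smallest squared residuals
\hat{\beta}_{\mathrm{LTS}} = \arg\min_{\beta} \sum_{i=1}^{s} r^2_{[i]}(\beta),
\qquad r_i(\beta) := y_i - \mathbf{x}_i^{\top}\beta

% (P0): outlier-aware LS with an \ell_0 (pseudo)norm penalty on the outlier vector
\{\hat{\beta}, \hat{\mathbf{o}}\} = \arg\min_{\beta,\,\mathbf{o}}
  \;\|\mathbf{y} - \mathbf{X}\beta - \mathbf{o}\|_2^2 + \lambda_0 \|\mathbf{o}\|_0
```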

6. Just relax!
• (P0) is NP-hard ⇒ relax ||o||_0 to ||o||_1; e.g., [Tropp'06]
(P1):  min_{β,o} ||y − Xβ − o||²_2 + λ_1 ||o||_1
• Q: Does (P1) yield robust estimates β̂? A: Yes! Huber's estimator is a special case (minimizing (P1) over o for fixed β leaves a Huber-loss fit in β with threshold λ_1/2)
• (P1) is convex, and thus efficiently solved (alternating-minimization sketch below)
• Role of the sparsity-controlling parameter λ_1 is central
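A minimal way to see that (P1) is easy to solve is block-coordinate descent: the β-update is ordinary LS on the outlier-compensated data, and the o-update is elementwise soft-thresholding (which is also where the Huber connection comes from). This is a sketch under that reading, not necessarily the solver the authors use.

```python
import numpy as np

def soft(z, t):
    """Elementwise soft-thresholding operator."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def p1_alternating(X, y, lam, n_iter=100):
    """Block-coordinate descent on ||y - X b - o||_2^2 + lam * ||o||_1."""
    n, p = X.shape
    o = np.zeros(n)
    for _ in range(n_iter):
        b, *_ = np.linalg.lstsq(X, y - o, rcond=None)  # LS in beta, outliers compensated
        o = soft(y - X @ b, lam / 2.0)                  # closed-form update in o
    return b, o
```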

7. Lassoing outliers
• Proposition 2: Minimizers {β̂, ô} of (P1) are fully determined by ô as β̂ = (X'X)⁻¹X'(y − ô), where ô solves the Lasso problem
  min_o ||(I_n − H)(y − o)||²_2 + λ_1 ||o||_1,  with H := X(X'X)⁻¹X'
• It thus suffices to solve a Lasso problem [Tibshirani'94] (sketch below)
• Enables effective data-driven methods to select λ_1
• Lasso solvers return the entire robustification path (RP)
• Cross-validation (CV) fails with multiple outliers [Hampel'86]
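Assuming the Proposition 2 form above, here is a short sketch using scikit-learn's Lasso; the function name uspacor_linreg and the mapping of λ_1 to scikit-learn's alpha = λ_1/(2n) are my own choices, not from the slides.

```python
import numpy as np
from sklearn.linear_model import Lasso

def uspacor_linreg(X, y, lam):
    """Estimate (beta, o) in (P1) by solving the Lasso of Proposition 2."""
    n = X.shape[0]
    H = X @ np.linalg.pinv(X)                 # hat matrix of the regression
    A = np.eye(n) - H                          # annihilates the column space of X
    # scikit-learn's Lasso minimizes (1/2n)||A y - A o||^2 + alpha ||o||_1
    lasso = Lasso(alpha=lam / (2 * n), fit_intercept=False)
    lasso.fit(A, A @ y)
    o_hat = lasso.coef_
    beta_hat = np.linalg.pinv(X) @ (y - o_hat)  # LS on outlier-compensated data
    return beta_hat, o_hat
```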

8. Robustification paths
• The Lasso path of solutions ô(λ_1) is piecewise linear
• LARS returns the whole RP [Efron'03], at roughly the cost of a single LS fit
[Figure: coefficient paths (coeffs. vs. λ_1)]
• Leverage these solvers: consider a grid of λ_1 values (path sketch below)
• Lasso is simple in the scalar case (soft thresholding)
• Coordinate descent is fast! [Friedman'07]; exploits warm starts and sparsity
• Other solvers: SpaRSA [Wright et al.'09], SPAMS [Mairal et al.'10]
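A sketch of computing the whole robustification path on a grid via coordinate descent, using scikit-learn's lasso_path (which warm-starts internally); it builds on the Proposition 2 reformulation and the alpha = λ_1/(2n) mapping assumed in the previous sketch.

```python
import numpy as np
from sklearn.linear_model import lasso_path

def robustification_path(X, y, n_grid=100):
    """Return outlier estimates o_hat and fits beta_hat over a grid of lambda_1."""
    n = X.shape[0]
    A = np.eye(n) - X @ np.linalg.pinv(X)            # project out the regression part
    alphas, o_paths, _ = lasso_path(A, A @ y, n_alphas=n_grid)  # coordinate descent path
    lambdas = 2 * n * alphas                          # map back to lambda_1 in (P1)
    betas = np.linalg.pinv(X) @ (y[:, None] - o_paths)  # beta_hat for every lambda_1
    return lambdas, o_paths, betas
```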

9. Selecting λ_1
• Number of outliers known: from the RP, obtain the range of λ_1 such that ||ô(λ_1)||_0 equals that number; discard the (now identified) outliers, and use CV to finalize the fit
• Variance σ² of the nominal noise known: from the RP, for each λ_1 on the grid, find the sample variance of the nominal residuals; the best λ_1 is the one whose sample variance is closest to σ² (sketch below)
• Variance of the nominal noise unknown: replace σ² above with a robust estimate σ̂², e.g., based on the median absolute deviation (MAD)
• Relies on the RP and on knowledge of the data model
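One plausible implementation of the variance-matching rules above is sketched here: for each grid point, compare the sample variance of the residuals of samples not flagged as outliers against σ², or against a MAD-based estimate when σ² is unknown. The function name and the choice of where the MAD is computed are mine, not from the slides.

```python
import numpy as np

def select_lambda(lambdas, o_paths, betas, X, y, sigma2=None):
    """Pick the grid lambda whose nominal-residual variance best matches sigma2."""
    r_all = y[:, None] - X @ betas - o_paths          # residuals for every lambda
    if sigma2 is None:
        # MAD-based robust variance estimate, taken at the sparsest solution
        r0 = r_all[:, np.argmax(lambdas)]
        sigma2 = (1.4826 * np.median(np.abs(r0 - np.median(r0)))) ** 2
    gaps = []
    for k in range(len(lambdas)):
        nominal = o_paths[:, k] == 0                  # samples not flagged as outliers
        s2 = np.var(r_all[nominal, k], ddof=1) if nominal.sum() > 1 else np.inf
        gaps.append(abs(s2 - sigma2))
    return lambdas[int(np.argmin(gaps))]
```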

10. USPACOR vs. RANSAC
[Simulation: regressors and nominal noise drawn i.i.d., with i.i.d. gross errors injected for a subset of samples; the specific distributions and the comparison plots are shown on the slide.]
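Since the exact setup is not in the transcript, the snippet below is only an illustrative synthetic comparison under assumed sizes, noise levels, and λ_1; it uses the Proposition 2 Lasso reformulation for the USPACOR-style estimate and scikit-learn's RANSACRegressor as the RANSAC baseline.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression, RANSACRegressor

rng = np.random.default_rng(0)
n, p, n_out, lam = 100, 5, 15, 2.0                 # assumed sizes and lambda, not from the slide
X = rng.standard_normal((n, p))
beta = rng.standard_normal(p)
y = X @ beta + 0.1 * rng.standard_normal(n)        # nominal: small Gaussian noise
y[:n_out] += rng.uniform(5, 10, n_out) * rng.choice([-1, 1], n_out)   # gross outliers

# USPACOR-style estimate: Lasso on the projected data (Proposition 2), then LS
A = np.eye(n) - X @ np.linalg.pinv(X)
o_hat = Lasso(alpha=lam / (2 * n), fit_intercept=False).fit(A, A @ y).coef_
beta_usp = np.linalg.pinv(X) @ (y - o_hat)

# RANSAC baseline
ransac = RANSACRegressor(LinearRegression(fit_intercept=False)).fit(X, y)
beta_ran = ransac.estimator_.coef_

print("USPACOR error:", np.linalg.norm(beta_usp - beta))
print("RANSAC  error:", np.linalg.norm(beta_ran - beta))
```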

11. Beyond linear regression
• Nonparametric (kernel) regression (robust variant sketched below)
• Dynamical models: state and measurement equations (shown on the slide)
• General criteria:
  - Loss functions: quadratic, ℓ_1, Huber, ε-insensitive
  - Regularization for the regression function and for o: ridge, (group-)Lasso, adaptive Lasso, ...
• Doubly-robust Kalman smoother [Farahmand et al.'10]
[Figure: fixed-lag DRKS vs. fixed-lag KS]
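The sketch below shows one natural kernelized variant of the outlier-aware criterion (RBF kernel, ridge penalty on the function, ℓ_1 penalty on o), solved by alternating a kernel-ridge update with soft-thresholding; it is not necessarily the exact criterion of the companion paper, and the kernel and penalty choices are assumptions.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

def robust_kernel_regression(X, y, lam, mu, gamma=1.0, n_iter=50):
    """Alternating minimization for
       min_{a,o} ||y - K a - o||^2 + mu * a'K a + lam * ||o||_1
    (kernel-ridge update in a, elementwise soft-thresholding in o)."""
    K = rbf_kernel(X, X, gamma=gamma)
    n = len(y)
    o = np.zeros(n)
    for _ in range(n_iter):
        a = np.linalg.solve(K + mu * np.eye(n), y - o)   # kernel-ridge update
        r = y - K @ a
        o = np.sign(r) * np.maximum(np.abs(r) - lam / 2.0, 0.0)   # outlier update
    return a, o
```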

12. Unsupervised learning
• Sparsity control for robust PCA [Mateos-Giannakis'10]
  - Low-rank factor analysis model augmented with outlier variables
[Figure panels: Original, Robust PCA, `Outliers']
• Outlier-aware robust clustering [Forero et al.'11]
  - Generative model for K-means augmented with outlier variables (toy sketch below)
[Figure panels: Data, Clustering result]
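A toy sketch of the outlier-aware K-means idea: each datum carries its own outlier vector, penalized by a group-lasso term so that only a few are nonzero, and the updates alternate between assignment, centroid, and outlier steps. The exact formulation and algorithm in [Forero et al.'11] may differ; all parameter choices here are illustrative.

```python
import numpy as np

def robust_kmeans(X, K, lam, n_iter=50, seed=0):
    """Outlier-aware K-means sketch: per-point outlier vectors o_i with a
    group-lasso penalty lam * sum_i ||o_i||_2, so only a few are nonzero."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    m = X[rng.choice(n, K, replace=False)].copy()     # initial centroids
    O = np.zeros((n, d))                               # per-point outlier vectors
    for _ in range(n_iter):
        Xc = X - O                                     # outlier-compensated data
        labels = np.argmin(((Xc[:, None, :] - m[None]) ** 2).sum(-1), axis=1)
        for k in range(K):
            if np.any(labels == k):
                m[k] = Xc[labels == k].mean(axis=0)    # centroid update
        R = X - m[labels]                              # residuals w.r.t. assigned centroids
        norms = np.linalg.norm(R, axis=1, keepdims=True)
        O = np.maximum(1 - lam / (2 * np.clip(norms, 1e-12, None)), 0) * R  # group soft-threshold
    return labels, m, O
```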

13. Concluding remarks
• Universal sparsity-controlling framework for robust learning
• Tuning λ along the RP controls:
  - Degree of sparsity in the model residuals
  - Number of outliers rejected
• Universality:
  - Information used for selecting λ
  - Nominal data model
  - Criterion adopted to fit the chosen model
[Diagram: CONVEX OPTIMIZATION -- LASSO -- OUTLIER-RESILIENT ESTIMATION]
• More on USPACOR:
  - Now! `Robust nonparametric regression by controlling sparsity'
  - Friday: `Outlier-aware robust clustering'
