
Smooth ε -Insensitive Regression by Loss Symmetrization


Presentation Transcript


  1. Smooth ε-Insensitive Regression by Loss Symmetrization Ofer Dekel, Shai Shalev-Shwartz, Yoram Singer School of Computer Science and Engineering The Hebrew University {oferd,shais,singer}@cs.huji.ac.il COLT 2003: The Sixteenth Annual Conference on Learning Theory

  2. Before We Begin … Linear Regression: given a training set {(x_i, y_i)}, i = 1,…,m, with x_i ∈ R^n and y_i ∈ R, find w ∈ R^n such that w·x_i ≈ y_i. Least Squares: minimize Σ_i (w·x_i − y_i)². Support Vector Regression: minimize ||w||² s.t. |w·x_i − y_i| ≤ ε for all i.
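Written out for reference (the slide showed these as images; the formulations below are the standard ones, using the notation of the slide):

```latex
% Least Squares: minimize the sum of squared residuals
\min_{w \in \mathbb{R}^n} \; \sum_{i=1}^{m} (w \cdot x_i - y_i)^2

% Support Vector Regression (hard epsilon-tube form; the soft-margin
% variant adds slack variables to each constraint)
\min_{w \in \mathbb{R}^n} \; \tfrac{1}{2}\|w\|^2
\quad \text{s.t.} \quad |w \cdot x_i - y_i| \le \epsilon, \qquad i = 1,\dots,m
```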

  3. Loss Symmetrization Loss functions used in classification boosting: the exp-loss and the log-loss. Symmetric versions of these losses can be used for regression, by applying the loss to both the positive and the negative discrepancy between the prediction and the target (see the reconstruction below).
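A reconstruction of the losses the slide refers to. The classification losses are standard; the placement of ε in the symmetric log-loss is my reading of the paper's ε-insensitive construction, with δ = w·x − y denoting the discrepancy:

```latex
% Margin-based classification losses on an example (x, y), y \in \{-1,+1\}:
\mathrm{ExpLoss}(w; x, y) = e^{-y (w \cdot x)}
\qquad
\mathrm{LogLoss}(w; x, y) = \log\bigl(1 + e^{-y (w \cdot x)}\bigr)

% Symmetrized versions for regression, on the discrepancy \delta = w \cdot x - y:
\mathrm{SymExpLoss}(\delta) = e^{\delta} + e^{-\delta}
\qquad
\mathrm{SymLogLoss}_{\epsilon}(\delta) =
  \log\bigl(1 + e^{\delta - \epsilon}\bigr) + \log\bigl(1 + e^{-\delta - \epsilon}\bigr)
```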

  4. A General Reduction • Begin with a regression training set {(x_i, y_i)} where x_i ∈ R^n, y_i ∈ R • Generate 2m classification training examples of dimension n+1 • Learn an (n+1)-dimensional weight vector, while keeping its last coordinate fixed, by minimizing a margin-based classification loss (a sketch of one such construction follows)
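A minimal sketch of one way to realize this reduction, assuming the symmetric ε-insensitive log-loss above. The exact signs and the way ε is folded into the augmented examples may differ from the paper's construction, and the helper name make_classification_set is mine:

```python
import numpy as np

def make_classification_set(X, y, eps):
    """Turn m regression examples (x_i, y_i), x_i in R^n, into 2m
    classification examples of dimension n+1 (a sketch; the signs and the
    placement of eps are one plausible convention, not the paper's verbatim).

    With the augmented weight vector omega = (w, 1), whose last coordinate
    is held fixed at 1, the classification log-loss on the generated
    examples equals the symmetric eps-insensitive log-loss on the
    original regression data:
        label -1 on (x_i, -(y_i + eps))  ->  log(1 + exp(w.x_i - y_i - eps))
        label +1 on (x_i, -(y_i - eps))  ->  log(1 + exp(-(w.x_i - y_i) - eps))
    """
    m, n = X.shape
    Z_neg = np.hstack([X, -(y + eps).reshape(-1, 1)])   # labeled -1
    Z_pos = np.hstack([X, -(y - eps).reshape(-1, 1)])   # labeled +1
    Z = np.vstack([Z_neg, Z_pos])                        # shape (2m, n+1)
    labels = np.concatenate([-np.ones(m), np.ones(m)])
    return Z, labels
```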

  5. A Batch Algorithm An illustration of a single batch iteration. Simplifying assumptions (just for the demo): • instances lie in a simple, low-dimensional space • ε is fixed to a small constant • the symmetric log-loss is used

  6. A Batch Algorithm Calculate discrepancies and weights for every example. (Figure: illustration of the per-example discrepancies and the corresponding weights.)

  7. A Batch Algorithm Cumulative weights: (Figure: illustration of the cumulative weights.)

  8. Two Batch Algorithms Update the regressor, using either the Log-Additive update or the Additive update; a sketch of a full iteration follows. (Figure: the regressor after the update.)
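A sketch of one batch iteration in the spirit of slides 5-8, written as a plain gradient-style additive update on the symmetric ε-insensitive log-loss. This is an assumption-laden simplification: the paper's Additive and Log-Additive updates aggregate per-feature cumulative weights and use boosting-style step sizes that this sketch does not reproduce.

```python
import numpy as np

def batch_iteration(w, X, y, eps, eta=0.1):
    """One additive (gradient-style) update on the symmetric
    eps-insensitive log-loss  sum_i log(1+e^{d_i-eps}) + log(1+e^{-d_i-eps}),
    where d_i = w.x_i - y_i.  The step size eta and the plain gradient step
    are illustrative only."""
    d = X @ w - y                               # discrepancies, one per example
    q_plus = 1.0 / (1.0 + np.exp(eps - d))      # sigma(d - eps): pulls w.x_i down
    q_minus = 1.0 / (1.0 + np.exp(eps + d))     # sigma(-d - eps): pulls w.x_i up
    grad = X.T @ (q_plus - q_minus)             # gradient of the loss w.r.t. w
    return w - eta * grad                       # additive update
```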

  9. Progress Bounds Theorem (Log-Additive update): a lower bound on the decrease in loss achieved by one update. Theorem (Additive update): an analogous lower bound for the additive update. Lemma: both bounds are non-negative and equal zero only at the optimum.

  10. Boosting Regularization A new form of regularization for regression and classification boosting.* It can be implemented by adding pseudo-examples to the training set. (* Communicated by Rob Schapire)

  11. Regularization Contd. • Regularization ⇒ compactness of the feasible set for the weight vector • Regularization ⇒ a unique attainable optimizer of the loss function. Proof of Convergence: progress + compactness + uniqueness = asymptotic convergence to the optimum

  12. Exp-loss vs. Log-loss • Experiments on two synthetic datasets. (Figures: results with the log-loss and with the exp-loss on each dataset.)

  13. Extensions • Parallel vs. sequential updates (a schematic contrast is sketched below) • Parallel – update all elements of the weight vector in parallel • Sequential – update the weight of a single weak regressor on each round (like classic boosting) • Another loss function – the “Combined Loss”. (Figure: the log-loss, exp-loss and combined-loss curves.)
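To make the parallel/sequential distinction concrete, here is a schematic sketch (mine, not the paper's pseudo-code), reusing the gradient-style update from the batch sketch above; the greedy choice of which single coordinate to update in the sequential variant is an assumption:

```python
import numpy as np

def parallel_round(w, X, y, eps, eta=0.1):
    """Parallel variant: update every coordinate of w on each round."""
    d = X @ w - y
    grad = X.T @ (1/(1 + np.exp(eps - d)) - 1/(1 + np.exp(eps + d)))
    return w - eta * grad

def sequential_round(w, X, y, eps, eta=0.1):
    """Sequential, boosting-like variant: update a single coordinate
    (one 'weak regressor') per round, chosen greedily by gradient magnitude."""
    d = X @ w - y
    grad = X.T @ (1/(1 + np.exp(eps - d)) - 1/(1 + np.exp(eps + d)))
    j = int(np.argmax(np.abs(grad)))      # the weak regressor to update
    w = w.copy()
    w[j] -= eta * grad[j]
    return w
```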

  14. On-line Algorithms • GD and EG online algorithms for the log-loss (schematic updates are sketched below) • Relative loss bounds. Future Directions • Regression tree learning • Solving one-class and various ranking problems using similar constructions • Regression generalization bounds based on natural regularization
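A schematic sketch of what GD and EG online updates for the symmetric log-loss could look like. The learning rate, the EG normalization, and the requirement that the EG weight vector stay positive are assumptions of this sketch, and the relative loss bounds mentioned on the slide are not reproduced here:

```python
import numpy as np

def grad_single(w, x, y, eps):
    """Gradient of the symmetric eps-insensitive log-loss on one example."""
    d = w @ x - y
    return (1/(1 + np.exp(eps - d)) - 1/(1 + np.exp(eps + d))) * x

def gd_update(w, x, y, eps, eta=0.1):
    """Online gradient descent (GD): additive step on one example."""
    return w - eta * grad_single(w, x, y, eps)

def eg_update(w, x, y, eps, eta=0.1, U=1.0):
    """Online exponentiated gradient (EG): multiplicative step on a
    positive weight vector, renormalized to total mass U (the positivity
    and the normalization are assumptions of this sketch)."""
    g = grad_single(w, x, y, eps)
    w_new = w * np.exp(-eta * g)
    return U * w_new / w_new.sum()
```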
