1
Smooth ε-Insensitive Regression by Loss Symmetrization
Ofer Dekel, Shai Shalev-Shwartz, Yoram Singer
School of Computer Science and Engineering, The Hebrew University
{oferd,shais,singer}@cs.huji.ac.il
COLT 2003: The Sixteenth Annual Conference on Learning Theory
2
Before We Begin …
Linear Regression: given a training set {(x_i, y_i)}_{i=1}^m with x_i ∈ R^n and y_i ∈ R, find w ∈ R^n such that ⟨w, x_i⟩ ≈ y_i
Least Squares: minimize Σ_i (⟨w, x_i⟩ - y_i)^2
Support Vector Regression: minimize ||w||^2 s.t. |⟨w, x_i⟩ - y_i| ≤ ε for all i
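As a point of reference for what follows, here is a minimal Python sketch (not from the slides; the toy data, function names, and the hard-constraint reading of SVR are my own simplifications) of the two classical objectives just mentioned, written out for a linear regressor w.

```python
# Sketch: the two classical regression objectives, evaluated for a linear regressor w.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))                       # m = 50 instances in R^3
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=50)

def least_squares_loss(w, X, y):
    """Sum of squared discrepancies (w.x_i - y_i)^2."""
    return np.sum((X @ w - y) ** 2)

def eps_insensitive_loss(w, X, y, eps=0.1):
    """SVR-style loss: discrepancies smaller than eps are not penalized."""
    return np.sum(np.maximum(np.abs(X @ w - y) - eps, 0.0))

w0 = np.zeros(3)
print(least_squares_loss(w0, X, y), eps_insensitive_loss(w0, X, y))
```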
3
Loss Symmetrization
Loss functions used in classification boosting, with margin z = y⟨w, x⟩:
–Exp-loss: e^{-z}
–Log-loss: log(1 + e^{-z})
Symmetric versions of these losses can be used for regression, with discrepancy δ = ⟨w, x⟩ - y:
–Symmetric exp-loss: e^{δ-ε} + e^{-δ-ε}
–Symmetric log-loss: log(1 + e^{δ-ε}) + log(1 + e^{-δ-ε})
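A small sketch of the symmetrized losses as read from this slide: each classification loss is applied to both signs of the regression discrepancy δ = ⟨w, x⟩ - y, shifted by the insensitivity parameter ε. The exact shifted form is an assumption on my part.

```python
# Sketch: symmetrized losses evaluated on the regression discrepancy delta = w.x - y.
import numpy as np

def symmetric_exp_loss(delta, eps=0.0):
    # penalizes positive and negative discrepancies symmetrically
    return np.exp(delta - eps) + np.exp(-delta - eps)

def symmetric_log_loss(delta, eps=0.0):
    # np.logaddexp(0, z) = log(1 + e^z), computed in a numerically stable way
    return np.logaddexp(0.0, delta - eps) + np.logaddexp(0.0, -delta - eps)

deltas = np.linspace(-3, 3, 7)
print(symmetric_exp_loss(deltas, eps=0.5))
print(symmetric_log_loss(deltas, eps=0.5))
```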
4
A General Reduction
–Begin with a regression training set {(x_i, y_i)}_{i=1}^m, where x_i ∈ R^n and y_i ∈ R
–Generate 2m classification training examples of dimension n+1, one positive and one negative per regression example, by appending a coordinate built from y_i and ε to each x_i
–Learn an augmented weight vector, while maintaining its last coordinate fixed, by minimizing a margin-based classification loss
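One way this reduction can be realized (a sketch; the particular signs, labels, and the choice of a fixed last coordinate are mine and may differ from the slide's exact convention) is to append -(y_i ∓ ε) to each instance and give the two copies opposite labels. The classification log-loss of the augmented weight vector then coincides with the symmetric ε-insensitive log-loss, which the code checks numerically.

```python
# Sketch of the regression-to-classification reduction (signs/labels are my choice).
import numpy as np

def reduce_to_classification(X, y, eps):
    """Return 2m augmented examples and labels from m regression examples."""
    m, n = X.shape
    X_pos = np.hstack([X, -(y - eps).reshape(-1, 1)])   # label +1
    X_neg = np.hstack([X, -(y + eps).reshape(-1, 1)])   # label -1
    return np.vstack([X_pos, X_neg]), np.hstack([np.ones(m), -np.ones(m)])

def classification_log_loss(lam, Xc, yc):
    return np.sum(np.logaddexp(0.0, -yc * (Xc @ lam)))

def symmetric_log_loss(w, X, y, eps):
    delta = X @ w - y
    return np.sum(np.logaddexp(0.0, delta - eps) + np.logaddexp(0.0, -delta - eps))

rng = np.random.default_rng(1)
X, y, eps = rng.normal(size=(5, 3)), rng.normal(size=5), 0.2
w = rng.normal(size=3)
Xc, yc = reduce_to_classification(X, y, eps)
lam = np.append(w, 1.0)                # last coordinate maintained at 1
print(np.isclose(classification_log_loss(lam, Xc, yc),
                 symmetric_log_loss(w, X, y, eps)))   # True
```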
5
A Batch Algorithm
An illustration of a single batch iteration. Simplifying assumptions (just for the demo):
–Instances are in R (one-dimensional)
–Set ε = 0
–Use the symmetric log-loss
6
Calculate discrepancies and weights: (figure omitted)
7
A Batch Algorithm
Cumulative weights: (figure omitted)
8
Two Batch Algorithms
Update the regressor using either the Log-Additive update or the Additive update: (figure omitted)
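Putting the three steps above together, here is a minimal sketch of one batch iteration on the symmetric log-loss. It follows the same structure (discrepancies → per-example weights → cumulative per-feature weights → update), but the update is a plain gradient step with a hand-picked learning rate, not the Log-Additive or Additive step sizes, whose exact formulas are not reproduced here.

```python
# Sketch of a single batch iteration on the symmetric log-loss (eps = 0 for simplicity).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def batch_iteration(w, X, y, eps=0.0, lr=0.1):
    delta = X @ w - y                 # discrepancies delta_i = w.x_i - y_i
    q_plus = sigmoid(delta - eps)     # weight of the "too high" branch of the loss
    q_minus = sigmoid(-delta - eps)   # weight of the "too low" branch of the loss
    W = X.T @ (q_plus - q_minus)      # cumulative weights, one per feature
    return w - lr * W                 # gradient-style additive update

rng = np.random.default_rng(2)
X = rng.uniform(size=(20, 4))
y = X @ np.array([0.5, -1.0, 2.0, 0.0]) + 0.05 * rng.normal(size=20)
w = np.zeros(4)
for _ in range(200):
    w = batch_iteration(w, X, y)
print(w)
```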
9
Progress Bounds
Theorem (Log-Additive update): a lower bound on the decrease in loss at each iteration (bound omitted)
Theorem (Additive update): a lower bound on the decrease in loss at each iteration (bound omitted)
Lemma: Both bounds are non-negative and equal zero only at the optimum
10
Boosting Regularization
–A new form of regularization for regression and classification boosting (formula omitted)
–Can be implemented by adding pseudo-examples*
* Communicated by Rob Schapire
11
Regularization Contd. – Proof of Convergence
–Regularization ⇒ compactness of the feasible set for w
–Regularization ⇒ a unique attainable optimizer of the loss function
–Progress + compactness + uniqueness = asymptotic convergence to the optimum
12
Exp-loss vs. Log-loss
Two synthetic datasets (figures: regressors learned with the Log-loss and with the Exp-loss)
13
Extensions
–Parallel vs. Sequential updates
 –Parallel: update all elements of w in parallel
 –Sequential: update the weight of a single weak regressor on each round, like classic boosting (see the sketch below)
–Another loss function – the “Combined Loss” (figure: Log-loss, Exp-loss, and Comb-loss)
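The sequential flavor can be sketched as a coordinate-wise update on the symmetric log-loss: pick one coordinate (one "weak regressor") per round and update only its weight. The greedy choice of coordinate and the fixed learning rate are my own simplifications, not the slide's exact rule.

```python
# Sketch of a sequential (one-coordinate-per-round) update on the symmetric log-loss.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sequential_round(w, X, y, eps=0.0, lr=0.1):
    delta = X @ w - y
    grad = X.T @ (sigmoid(delta - eps) - sigmoid(-delta - eps))
    j = int(np.argmax(np.abs(grad)))   # coordinate (weak regressor) chosen this round
    w = w.copy()
    w[j] -= lr * grad[j]
    return w

rng = np.random.default_rng(3)
X = rng.uniform(size=(30, 5))
y = X @ np.array([1.0, 0.0, -0.5, 2.0, 0.0])
w = np.zeros(5)
for _ in range(200):
    w = sequential_round(w, X, y)
print(w)
```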
14
On-line Algorithms
–GD and EG online algorithms for the Log-loss (a GD-style sketch follows below)
–Relative loss bounds

Future Directions
–Regression tree learning
–Solving one-class and various ranking problems using similar constructions
–Regression generalization bounds based on natural regularization
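Referring back to the on-line algorithms bullet, here is a minimal sketch of a GD-style online step on the per-example symmetric ε-insensitive log-loss. The EG variant, which uses multiplicative updates, is not shown, and the learning rate, ε, and data stream are illustrative choices of mine.

```python
# Sketch of an online GD step on the per-example symmetric eps-insensitive log-loss.
import numpy as np

def online_gd_step(w, x, y, eps=0.1, lr=0.2):
    delta = float(np.dot(w, x)) - y
    # derivative of log(1+e^{delta-eps}) + log(1+e^{-delta-eps}) w.r.t. delta
    g = 1.0 / (1.0 + np.exp(-(delta - eps))) - 1.0 / (1.0 + np.exp(delta + eps))
    return w - lr * g * x              # additive (GD) update

rng = np.random.default_rng(4)
w = np.zeros(3)
for _ in range(500):
    x = rng.normal(size=3)
    y = float(np.dot(np.array([1.0, -1.0, 0.5]), x))
    w = online_gd_step(w, x, y)
print(w)
```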