
1 Smooth ε-Insensitive Regression by Loss Symmetrization
Ofer Dekel, Shai Shalev-Shwartz, Yoram Singer
School of Computer Science and Engineering, The Hebrew University
{oferd,shais,singer}@cs.huji.ac.il
COLT 2003: The Sixteenth Annual Conference on Learning Theory

2 Before We Begin …
Linear Regression: given a training set of instances and real-valued targets, find a weight vector whose predictions are close to the targets.
Least Squares: minimize the sum of squared discrepancies between predictions and targets.
Support Vector Regression: minimize the norm of the weight vector s.t. every discrepancy is at most ε in absolute value.
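A LaTeX sketch of the standard formulations this slide refers to (the slide's own equations are not in the transcript, so the notation below — λ for the weight vector, x_i for instances, y_i for targets, ε for the insensitivity parameter, and the hard-constraint form of SVR — is an assumption):

    \text{Linear regression: given } \{(\mathbf{x}_i, y_i)\}_{i=1}^{m},\ \mathbf{x}_i \in \mathbb{R}^n,\ y_i \in \mathbb{R},
    \text{ find } \boldsymbol{\lambda} \in \mathbb{R}^n \text{ with } \langle \boldsymbol{\lambda}, \mathbf{x}_i \rangle \approx y_i .

    \text{Least squares:} \quad \min_{\boldsymbol{\lambda}} \ \sum_{i=1}^{m} \bigl( \langle \boldsymbol{\lambda}, \mathbf{x}_i \rangle - y_i \bigr)^2 .

    \text{Support vector regression:} \quad \min_{\boldsymbol{\lambda}} \ \tfrac{1}{2} \| \boldsymbol{\lambda} \|^2
    \quad \text{s.t.} \quad \bigl| \langle \boldsymbol{\lambda}, \mathbf{x}_i \rangle - y_i \bigr| \le \varepsilon \ \ \forall i .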

3 Loss Symmetrization
Loss functions used in classification boosting: the exp-loss and the log-loss.
Symmetric versions of these losses can be used for regression: the discrepancy between prediction and target is penalized on both sides.
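A small Python sketch of the losses in play: the classification exp-loss and log-loss as functions of a margin z, and their symmetrized, ε-insensitive counterparts as functions of a discrepancy d = ⟨λ, x⟩ − y. The exact placement of the ε offsets is an assumption consistent with the ε-insensitive setting.

    import numpy as np

    def exp_loss(z):
        # classification exp-loss on a margin z = y * <lambda, x>
        return np.exp(-z)

    def log_loss(z):
        # classification log-loss on a margin z
        return np.log1p(np.exp(-z))

    def symmetric_exp_loss(d, eps=0.0):
        # symmetrized, eps-insensitive exp-loss on a discrepancy d (assumed form)
        return np.exp(d - eps) + np.exp(-d - eps)

    def symmetric_log_loss(d, eps=0.0):
        # symmetrized, eps-insensitive log-loss on a discrepancy d (assumed form)
        return np.log1p(np.exp(d - eps)) + np.log1p(np.exp(-d - eps))

Outside the ε-insensitive zone the symmetric log-loss grows roughly linearly in |d|, while the symmetric exp-loss grows exponentially.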

4 A General Reduction
Begin with a regression training set of m instance–target pairs, where instances are n-dimensional and targets are real-valued.
Generate 2m classification training examples of dimension n+1 (a concrete sketch follows below).
Learn a weight vector while maintaining a fixed value in its last coordinate, by minimizing a margin-based classification loss.
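The constructed examples were equations on the slide; the Python sketch below shows one plausible instantiation of the reduction (the exact signs and ε offsets used in the talk are assumptions). Each regression pair (x_i, y_i) is mapped to two (n+1)-dimensional classification examples whose log-losses, when the last weight coordinate is held at −1, sum to the symmetric ε-insensitive log-loss of the discrepancy ⟨λ, x_i⟩ − y_i.

    import numpy as np

    def reduce_to_classification(X, y, eps=0.0):
        """Map m regression examples (X: m x n, y: length m) to 2m
        classification examples of dimension n+1 with labels +/-1.
        Assumed construction, not taken verbatim from the slide."""
        m, n = X.shape
        Z_pos = np.hstack([X, (y - eps).reshape(-1, 1)])   # labelled +1
        Z_neg = np.hstack([X, (y + eps).reshape(-1, 1)])   # labelled -1
        Z = np.vstack([Z_pos, Z_neg])
        labels = np.concatenate([np.ones(m), -np.ones(m)])
        return Z, labels

With weights (λ, −1), the margins of the two examples built from (x_i, y_i) are (⟨λ, x_i⟩ − y_i) + ε and −(⟨λ, x_i⟩ − y_i) + ε, so their classification log-losses add up to the symmetric ε-insensitive log-loss.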

5 A Batch Algorithm
An illustration of a single batch iteration. Simplifying assumptions (just for the demo):
– Instances are in ℝ
– Set ε = 0
– Use the symmetric log-loss

6 A Batch Algorithm: calculate discrepancies and weights for each example. [Figure: the one-dimensional training points, their discrepancies from the current regressor, and the resulting weights]
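The weight formulas themselves were equations on the slide. One consistent choice (an assumption) is to take the two one-sided derivatives of the symmetric log-loss with respect to the discrepancy δ_i = ⟨λ, x_i⟩ − y_i:

    q_i^{+} = \frac{1}{1 + e^{\varepsilon - \delta_i}} , \qquad
    q_i^{-} = \frac{1}{1 + e^{\varepsilon + \delta_i}} .

Here q_i^+ approaches 1 when the regressor overshoots the target by much more than ε, and q_i^- approaches 1 when it undershoots.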

7 A Batch Algorithm: accumulate the per-example weights into cumulative weights. [Figure: the cumulative weights on either side of the regressor]

8 Two Batch Algorithms: update the regressor using either the Log-Additive update or the Additive update. [Figure: the regressor before and after the update]
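The two update rules were shown as equations on the slide and are not reproduced here; as a stand-in, the sketch below runs one full batch iteration (discrepancies → weights → update) using a plain gradient step on the symmetric log-loss, with an assumed step size eta. The talk's Additive and Log-Additive updates are boosting-style steps built from the cumulative weights of the previous slide, with step forms tied to the progress bounds on the next slide.

    import numpy as np

    def batch_iteration(lam, X, y, eps=0.0, eta=0.1):
        """One batch iteration sketch: discrepancies -> weights -> update.
        A plain gradient step stands in for the Additive/Log-Additive updates."""
        delta = X @ lam - y                          # discrepancies
        q_plus = 1.0 / (1.0 + np.exp(eps - delta))   # 'overshoot' weights
        q_minus = 1.0 / (1.0 + np.exp(eps + delta))  # 'undershoot' weights
        grad = X.T @ (q_plus - q_minus)              # gradient of the symmetric log-loss
        return lam - eta * grad                      # assumed step size eta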

9 Progress Bounds
Theorem (Log-Additive update): [a lower bound on the decrease in loss per iteration]
Theorem (Additive update): [a lower bound on the decrease in loss per iteration]
Lemma: Both bounds are non-negative and equal zero only at the optimum.

10 Boosting Regularization
A new form of regularization (with a trade-off constant C) for regression and classification boosting.
Can be implemented by adding pseudo-examples.*
* Communicated by Rob Schapire
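The regularization term itself was an equation on the slide. A sketch of how "adding pseudo-examples" could realize such a term under the symmetric log-loss (whether this matches the talk's exact construction, and the precise role of C, are assumptions): append, for each coordinate, a pseudo-example that is a scaled unit vector with target zero, so that coordinate of the weight vector is penalized symmetrically.

    import numpy as np

    def add_pseudo_examples(X, y, C=1.0):
        """Append one pseudo-example per coordinate: a C-scaled unit vector
        with target 0 (assumed realization of regularization by pseudo-examples)."""
        m, n = X.shape
        X_reg = np.vstack([X, C * np.eye(n)])
        y_reg = np.concatenate([y, np.zeros(n)])
        return X_reg, y_reg

Under the symmetric log-loss, each appended example contributes a smooth, symmetric penalty on a single coordinate that grows roughly like C times its absolute value.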

11 Regularization Contd.
Regularization ⇒ compactness of the feasible set for the weight vector.
Regularization ⇒ a unique attainable optimizer of the loss function.
Proof of Convergence: progress + compactness + uniqueness = asymptotic convergence to the optimum.

12 Exp-loss vs. Log-loss
Two synthetic datasets. [Figure: the regressors obtained with the Exp-loss and with the Log-loss on each dataset]

13 Extensions
Parallel vs. Sequential updates:
– Parallel – update all elements of the weight vector in parallel
– Sequential – update the weight of a single weak regressor on each round (like classic boosting)
Another loss function – the "Combined Loss". [Figure: the Log-loss, Exp-loss and Comb-loss]

14 On-line Algorithms
– GD and EG online algorithms for the Log-loss
– Relative loss bounds
Future Directions
– Regression tree learning
– Solving one-class and various ranking problems using similar constructions
– Regression generalization bounds based on natural regularization
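To illustrate the GD variant mentioned above (no equations appear in the transcript), here is a minimal per-example gradient-descent step on the symmetric log-loss; the step size eta is an assumed hyperparameter, and the EG variant would instead update the weights multiplicatively.

    import numpy as np

    def online_gd_step(lam, x, y, eps=0.0, eta=0.1):
        """One online GD step on the symmetric log-loss for a single example (x, y).
        eta is an assumed learning rate; EG would use multiplicative updates."""
        delta = float(np.dot(lam, x)) - y
        q_plus = 1.0 / (1.0 + np.exp(eps - delta))
        q_minus = 1.0 / (1.0 + np.exp(eps + delta))
        return lam - eta * (q_plus - q_minus) * x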


Download ppt "Smooth ε -Insensitive Regression by Loss Symmetrization Ofer Dekel, Shai Shalev-Shwartz, Yoram Singer School of Computer Science and Engineering The Hebrew."

Similar presentations


Ads by Google