Trading Convexity for Scalability Marco A. Alvarez CS7680 Department of Computer Science Utah State University
Paper: Collobert, R., Sinz, F., Weston, J., and Bottou, L. Trading convexity for scalability. In Proceedings of the 23rd International Conference on Machine Learning (Pittsburgh, Pennsylvania, June 2006). ICML '06. ACM Press, New York, NY.
Introduction
Previously in machine learning: MLPs have a non-convex cost function, which is difficult to optimize, yet they work efficiently. SVMs are defined by a convex function, which brings easier optimization (algorithms) and a unique solution (we can write theorems).
Goal of the paper: show that non-convexity sometimes has benefits. These include faster training and testing (fewer support vectors), non-convex SVMs that are faster and sparser, and fast transductive SVMs.
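From SVM
Decision function and primal formulation: minimize ||w|| so that the margin is maximized. The solution w is a combination of a small number of data points (sparsity), and the decision boundary is determined by these support vectors.
Dual formulation: the same problem expressed with one coefficient per training example, subject to (s.t.) constraints on those coefficients.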
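For reference, the standard soft-margin SVM can be written as below. This is the textbook form, not necessarily the exact notation of the slide; the symbols Φ (feature map), ξ_i (slack variables), α_i (dual coefficients), C (regularization constant) and K (kernel) are assumptions introduced here.

```latex
% Decision function
f_\theta(x) = w \cdot \Phi(x) + b, \qquad \theta = (w, b)

% Primal formulation (soft margin)
\min_{w,\, b,\, \xi} \;\; \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{L} \xi_i
\quad \text{s.t.} \quad y_i f_\theta(x_i) \ge 1 - \xi_i, \quad \xi_i \ge 0

% Dual formulation
\max_{\alpha} \;\; \sum_{i=1}^{L} \alpha_i
  - \frac{1}{2} \sum_{i,j=1}^{L} \alpha_i \alpha_j y_i y_j K(x_i, x_j)
\quad \text{s.t.} \quad 0 \le \alpha_i \le C, \quad \sum_{i=1}^{L} \alpha_i y_i = 0

% Sparsity: only examples with \alpha_i > 0 (the support vectors) contribute
w = \sum_{i=1}^{L} \alpha_i y_i \Phi(x_i)
```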
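SVM problem
The number of support vectors increases linearly with the number of training examples L. Cost attributed to one example (x, y): C H_1(y f(x)), where H_1(z) = max(0, 1 - z) is the hinge loss. From: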
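Ramp Loss Function
Given: R_s(z) = H_1(z) - H_s(z), where H_s(z) = max(0, s - z) and s < 1. The loss is flat for outliers (z < s) and for non-SVs (z > 1), so only examples with s <= z <= 1 can become support vectors.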
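A minimal sketch of the two losses, assuming the definitions H_s(z) = max(0, s - z) and R_s(z) = H_1(z) - H_s(z) from the cited paper. The function names `hinge` and `ramp` and the sample margins are illustrative only.

```python
import numpy as np

def hinge(z, s=1.0):
    """Shifted hinge loss H_s(z) = max(0, s - z)."""
    return np.maximum(0.0, s - np.asarray(z, dtype=float))

def ramp(z, s=-1.0):
    """Ramp loss R_s(z) = H_1(z) - H_s(z), bounded above by 1 - s."""
    return hinge(z, 1.0) - hinge(z, s)

# Margins z = y * f(x) for five hypothetical examples.
margins = np.array([-3.0, -1.0, 0.0, 1.0, 2.0])
print(hinge(margins))  # [4. 2. 1. 0. 0.]  grows without bound for outliers
print(ramp(margins))   # [2. 2. 1. 0. 0.]  capped at 1 - s = 2
```

Because the ramp is flat for z < s, a badly misclassified example contributes a constant cost and no gradient, which is why it cannot become a support vector.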
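Concave-Convex Procedure (CCCP)
Given a cost function J(θ): decompose it into a convex part J_vex(θ) and a concave part J_cav(θ). Each iteration replaces the concave part with its tangent at the current point and minimizes the resulting convex function. The cost J(θ) is guaranteed to decrease at each iteration until convergence.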
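The procedure can be illustrated on a one-dimensional toy problem. The sketch below assumes the made-up cost J(x) = x^4 - 2x^2, split into the convex part x^4 and the concave part -2x^2; neither the function nor the name `cccp` comes from the slides. The convex subproblem happens to have a closed-form solution here, which keeps the example short.

```python
import numpy as np

def J(x):
    """Toy non-convex cost: convex part x**4 plus concave part -2*x**2."""
    return x**4 - 2.0 * x**2

def cccp(x0, n_iter=50, tol=1e-12):
    """Run CCCP on J starting from x0; return the final point and the cost history."""
    x = float(x0)
    history = [J(x)]
    for _ in range(n_iter):
        slope = -4.0 * x  # derivative of the concave part -2*x**2 at the current x
        # Minimize the convex surrogate x**4 + slope * x:
        # setting 4*x**3 + slope = 0 gives x = cbrt(-slope / 4).
        x_new = float(np.cbrt(-slope / 4.0))
        history.append(J(x_new))
        converged = abs(x_new - x) < tol
        x = x_new
        if converged:
            break
    return x, history

x_star, history = cccp(0.1)
print(x_star)       # close to 1.0, a local minimum of J
print(history[:4])  # the cost goes down at every iteration
```

Starting from a negative x0 leads to the other local minimum at -1, which shows that CCCP only finds a local solution.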
Using the Ramp Loss
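Replacing the hinge loss with the ramp loss gives a cost that splits into a convex and a concave part, which is what CCCP needs. A sketch of that decomposition follows, assuming the notation f_θ(x) = w · Φ(x) + b, H_s(z) = max(0, s - z) and R_s = H_1 - H_s; the symbols follow the cited paper and may differ from those on the slide.

```latex
J^s(\theta)
  = \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{L} R_s\bigl(y_i f_\theta(x_i)\bigr)
  = \underbrace{\frac{1}{2}\|w\|^2
      + C \sum_{i=1}^{L} H_1\bigl(y_i f_\theta(x_i)\bigr)}_{J^s_{\mathrm{vex}}(\theta)}
    \;\; \underbrace{-\; C \sum_{i=1}^{L} H_s\bigl(y_i f_\theta(x_i)\bigr)}_{J^s_{\mathrm{cav}}(\theta)}
```

The convex part is the ordinary SVM objective, so each CCCP iteration reduces to an SVM-like problem.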
CCCP for Ramp Loss
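A rough sketch of the outer loop follows. It is not the dual QP solver of the cited paper: as a simplifying assumption it uses a linear model in the primal and solves each convex subproblem approximately by plain subgradient descent. The function names, step size, iteration counts and toy data are all hypothetical. The coefficient β_i is set to C for examples whose margin falls below s and to 0 otherwise, which is the slope of the linearized concave part.

```python
import numpy as np

def solve_convex(X, y, beta, C, n_steps=3000, lr=0.002):
    """Approximately minimize 0.5*||w||^2 + C*sum(H_1(z_i)) + sum(beta_i*z_i),
    where z_i = y_i*(w.x_i + b), using subgradient descent (assumed solver)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(n_steps):
        z = y * (X @ w + b)
        # Subgradient w.r.t. z_i: -C where the hinge is active, plus beta_i
        # from the linearized concave part.
        g = np.where(z < 1.0, -C, 0.0) + beta
        w -= lr * (w + X.T @ (g * y))
        b -= lr * np.sum(g * y)
    return w, b

def cccp_ramp_svm(X, y, C=1.0, s=-1.0, max_outer=20):
    """CCCP outer loop: re-solve the convex problem until beta stops changing."""
    beta = np.zeros(len(y))  # beta = 0 makes the first step a standard SVM
    for _ in range(max_outer):
        w, b = solve_convex(X, y, beta, C)
        z = y * (X @ w + b)
        beta_new = np.where(z < s, C, 0.0)  # examples treated as outliers
        if np.array_equal(beta_new, beta):
            break
        beta = beta_new
    return w, b, beta

# Toy data (made up): two separable clusters plus one mislabeled point.
X = np.array([[2, 2], [3, 2], [2, 3], [3, 3],
              [-2, -2], [-3, -2], [-2, -3], [-3, -3],
              [-4, -4]], dtype=float)
y = np.array([1, 1, 1, 1, -1, -1, -1, -1, 1], dtype=float)

w, b, beta = cccp_ramp_svm(X, y)
print(w, b)   # roughly w = [0.25, 0.25], b = 0
print(beta)   # only the mislabeled last point has beta = C
```

Once the mislabeled point is flagged, its hinge term and its linear term cancel, so it no longer influences the solution. This is the mechanism behind the sparser models reported in the paper.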
Time and Number of SVs
Transductive SVMs
Loss Function Cost to be minimized:
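A sketch of the transductive cost, written in the notation of the cited paper rather than taken from the slide. It assumes L labeled examples, U unlabeled examples, a separate constant C* for the unlabeled terms, and the trick of including each unlabeled point twice, once with each label.

```latex
J^s(\theta)
  = \frac{1}{2}\|w\|^2
  + C \sum_{i=1}^{L} H_1\bigl(y_i f_\theta(x_i)\bigr)
  + C^{*} \sum_{i=L+1}^{L+2U} R_s\bigl(y_i f_\theta(x_i)\bigr)

% Each unlabeled point appears twice, once per label:
y_i = +1 \;\; (L+1 \le i \le L+U), \qquad
y_i = -1 \;\; (L+U+1 \le i \le L+2U), \qquad
x_i = x_{i-U} \;\; (L+U+1 \le i \le L+2U)
```

The sum R_s(t) + R_s(-t) over the two copies penalizes unlabeled points that fall inside the margin, whichever side they are on.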
Balancing Constraint Necessary for TSVMs
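Without a balancing constraint, a TSVM can assign every unlabeled example to the same class. The sketch below assumes the constraint takes the form used in the cited paper: the mean of f(x) over the unlabeled examples equals the mean label over the labeled examples. It also assumes a linear model f(x) = w · x + b; the function names and data are hypothetical.

```python
import numpy as np

def balancing_gap(w, b, X_unlabeled, y_labeled):
    """Mean of f(x) over unlabeled points minus mean labeled target.
    The constraint is satisfied when this gap is zero."""
    return np.mean(X_unlabeled @ w + b) - np.mean(y_labeled)

def bias_satisfying_constraint(w, X_unlabeled, y_labeled):
    """For a fixed w, the bias b that makes the balancing gap exactly zero."""
    return np.mean(y_labeled) - np.mean(X_unlabeled @ w)

# Made-up example: three unlabeled points and four labeled targets.
w = np.array([1.0, -1.0])
X_unlabeled = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 0.0]])
y_labeled = np.array([1.0, 1.0, -1.0, 1.0])

b = bias_satisfying_constraint(w, X_unlabeled, y_labeled)
print(balancing_gap(w, b, X_unlabeled, y_labeled))  # approximately 0
```

In the paper the constraint is enforced inside the optimization rather than by adjusting b afterwards; the closed form here only illustrates what the constraint requires.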
Training Time
Quadratic Fit