
1 Proximal Support Vector Machine Classifiers
KDD 2001, San Francisco, August 26-29, 2001
Glenn Fung & Olvi Mangasarian
Data Mining Institute, University of Wisconsin - Madison

2 Key Contributions
- Fast new support vector machine classifier
- An order of magnitude faster than standard classifiers
- Extremely simple to implement: 4 lines of MATLAB code
- NO optimization packages (LP, QP) needed

3 Outline of Talk
- (Standard) Support vector machines (SVM): classify by halfspaces
- Proximal support vector machines (PSVM): classify by proximity to planes
- Linear PSVM classifier
- Nonlinear PSVM classifier: full and reduced kernels
- Numerical results: correctness comparable to standard SVM, with much faster classification (2 million points in 10-space in 21 seconds, compared to over 10 minutes for standard SVM)

4 Support Vector Machines: Maximizing the Margin between Bounding Planes
[Figure: points of classes A+ and A- separated by the bounding planes $x'w = \gamma + 1$ and $x'w = \gamma - 1$; the margin between the planes is maximized]

5 Proximal Support Vector Machines: Fitting the Data Using Two Parallel Proximal Planes
[Figure: the same two parallel planes $x'w = \gamma \pm 1$, but here each plane is fitted so that the points of its class, A+ or A-, cluster around it, while the planes are pushed as far apart as possible]

6 Standard Support Vector Machine: Algebra of the 2-Category Linearly Separable Case
- Given m points in n-dimensional space, represented by an m-by-n matrix A
- Membership of each point in class +1 or -1 is specified by an m-by-m diagonal matrix D with +1 and -1 entries
- Separate by the two bounding planes $x'w = \gamma + 1$ and $x'w = \gamma - 1$:
  $A_i w \ge \gamma + 1$ for $D_{ii} = +1$, and $A_i w \le \gamma - 1$ for $D_{ii} = -1$
- More succinctly: $D(Aw - e\gamma) \ge e$, where $e$ is a vector of ones
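A worked miniature (my illustration, not from the slides): with three points and labels +1, +1, -1, the compact condition unpacks as

$D = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix}, \qquad D(Aw - e\gamma) \ge e \iff \begin{cases} A_1 w - \gamma \ge 1 \\ A_2 w - \gamma \ge 1 \\ A_3 w - \gamma \le -1 \end{cases}$

so multiplying by $D$ flips the inequality for the -1 class and folds both plane conditions into one.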

7 Standard Support Vector Machine Formulation
- Solve the quadratic program for some $\nu > 0$:
  (QP)  $\min_{w,\gamma,y} \ \nu e'y + \tfrac{1}{2} w'w$  s.t. $D(Aw - e\gamma) + y \ge e, \ y \ge 0$
  where $D_{ii} = +1$ or $-1$ denotes class membership
- The margin between the bounding planes is $2/\|w\|$; it is maximized by minimizing $\tfrac{1}{2} w'w$

8 PSVM Formulation
We have from the QP SVM formulation:
  (QP)  $\min_{w,\gamma,y} \ \nu e'y + \tfrac{1}{2} w'w$  s.t. $D(Aw - e\gamma) + y \ge e, \ y \ge 0$
PSVM replaces the inequality constraint by an equality, the error term $e'y$ by the squared norm $\tfrac{1}{2}\|y\|^2$, and appends $\gamma^2$ to the regularizer:
  $\min_{w,\gamma,y} \ \tfrac{\nu}{2}\|y\|^2 + \tfrac{1}{2}(w'w + \gamma^2)$  s.t. $D(Aw - e\gamma) + y = e$
This simple but critical modification changes the nature of the optimization problem tremendously!
Solving for $y$ in terms of $w$ and $\gamma$ gives the unconstrained problem:
  $\min_{w,\gamma} \ \tfrac{\nu}{2}\|e - D(Aw - e\gamma)\|^2 + \tfrac{1}{2}(w'w + \gamma^2)$
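A minimal MATLAB sketch (mine, not from the slides; the name psvm_obj is illustrative) that evaluates this unconstrained objective for a candidate (w, gamma), carrying D as the label vector d with D = diag(d):

function f = psvm_obj(A, d, nu, w, gamma)
% Unconstrained PSVM objective: (nu/2)||e - D(Aw - e*gamma)||^2 + (1/2)(w'w + gamma^2)
e = ones(size(A,1),1);
y = e - d.*(A*w - gamma*e);     % residual y = e - D(Aw - e*gamma)
f = (nu/2)*(y'*y) + 0.5*(w'*w + gamma^2);
end

The linear system on slide 11 is exactly the condition that the gradient of this function vanishes.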

9 Advantages of New Formulation
- The objective function remains strongly convex
- An explicit exact solution can be written in terms of the problem data
- The PSVM classifier is obtained by solving a single system of linear equations in the usually small-dimensional input space
- Exact leave-one-out correctness can be obtained in terms of the problem data

10 Linear PSVM
We want to solve:
  $\min_{w,\gamma} \ \tfrac{\nu}{2}\|e - D(Aw - e\gamma)\|^2 + \tfrac{1}{2}\|(w;\gamma)\|^2$
- Setting the gradient equal to zero gives a nonsingular system of linear equations
- The solution of this system gives the desired PSVM classifier
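Written out (a one-step derivation, using $H = [A \ \ -e]$ and $s = (w; \gamma)$ as defined on the next slide, and $D^2 = I$):

$\nabla_s \left[ \tfrac{\nu}{2}\|e - DHs\|^2 + \tfrac{1}{2}\|s\|^2 \right] = -\nu H'D(e - DHs) + s = 0 \;\Longrightarrow\; \left( \tfrac{I}{\nu} + H'H \right) s = H'De$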

11 Linear PSVM Solution
  $\begin{pmatrix} w \\ \gamma \end{pmatrix} = \left( \tfrac{I}{\nu} + H'H \right)^{-1} H'De$, where $H = [A \ \ -e]$
- The linear system to solve depends on $H'H$, which is of size $(n+1) \times (n+1)$
- $n$ is usually much smaller than $m$

12 Linear Proximal SVM Algorithm (implemented in the MATLAB code on slide 17)
Input: $A$, $D$, $\nu$
Define: $H = [A \ \ -e]$
Calculate: $v = H'De$
Solve: $\left( \tfrac{I}{\nu} + H'H \right) \begin{pmatrix} w \\ \gamma \end{pmatrix} = v$
Classifier: $\mathrm{sign}(x'w - \gamma)$

13 Nonlinear PSVM Formulation
- Linear PSVM: (QP)  $\min \ \tfrac{\nu}{2}\|y\|^2 + \tfrac{1}{2}(w'w + \gamma^2)$  s.t. $D(Aw - e\gamma) + y = e$  (linear separating surface: $x'w = \gamma$)
- By QP "duality", $w = A'Du$. Maximizing the margin in the "dual space" gives:
  $\min \ \tfrac{\nu}{2}\|y\|^2 + \tfrac{1}{2}(u'u + \gamma^2)$  s.t. $D(AA'Du - e\gamma) + y = e$
- Replace $AA'$ by a nonlinear kernel $K(A, A')$:
  $\min \ \tfrac{\nu}{2}\|y\|^2 + \tfrac{1}{2}(u'u + \gamma^2)$  s.t. $D(K(A,A')Du - e\gamma) + y = e$

14 The Nonlinear Classifier
- The nonlinear classifier: $\mathrm{sign}(K(x', A')Du - \gamma)$
- where $K$ is a nonlinear kernel, e.g. the Gaussian (radial basis) kernel:
  $K(A, A')_{ij} = e^{-\mu\|A_i - A_j\|^2}, \quad i, j = 1, \dots, m$
- The $ij$-entry of $K(A, A')$ represents the "similarity" of data points $A_i$ and $A_j$
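A short MATLAB sketch of the Gaussian kernel matrix (mine, not from the slides; the name gaussian_kernel and the width parameter mu are illustrative). With rows of A and B as data points, gaussian_kernel(A, A, mu) computes what the slides write as K(A, A'):

function K = gaussian_kernel(A, B, mu)
% K(i,j) = exp(-mu*||A(i,:) - B(j,:)||^2); A is m-by-n, B is k-by-n
sqA = sum(A.^2, 2);                 % m-by-1 squared row norms
sqB = sum(B.^2, 2)';                % 1-by-k squared row norms
K = exp(-mu*(sqA + sqB - 2*A*B'));  % expands ||a-b||^2 = ||a||^2 + ||b||^2 - 2a'b
end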

15 Nonlinear PSVM
Defining $H$ slightly differently, $H = [K(A, A') \ \ -e]$, and proceeding as in the linear case, setting the gradient equal to zero gives:
  $\begin{pmatrix} u \\ \gamma \end{pmatrix} = \left( \tfrac{I}{\nu} + H'H \right)^{-1} H'De$
- Here, the linear system to solve is of size $(m+1) \times (m+1)$
- However, reduced kernel techniques (RSVM) can be used to reduce the dimensionality

16 Nonlinear Proximal SVM Algorithm
Input: $A$, $D$, $\nu$
Define: $K = K(A, A')$, $H = [K \ \ -e]$
Calculate: $v = H'De$
Solve: $\left( \tfrac{I}{\nu} + H'H \right) \begin{pmatrix} u \\ \gamma \end{pmatrix} = v$
Classifier: $\mathrm{sign}(K(x', A')Du - \gamma)$
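A hedged MATLAB sketch of this nonlinear algorithm (mine, not from the slides; psvm_nonlinear is an illustrative name, and it reuses the hypothetical gaussian_kernel above, mirroring the linear psvm code on the next slide):

function [u, gamma] = psvm_nonlinear(A, d, nu, mu)
% Nonlinear PSVM with a Gaussian kernel; d = diag(D), i.e. the +/-1 labels
m = size(A,1); e = ones(m,1);
K = gaussian_kernel(A, A, mu);  % m-by-m kernel matrix K(A,A')
H = [K -e];
v = (d'*H)';                    % v = H'*D*e
r = (speye(m+1)/nu + H'*H)\v;   % solve (I/nu + H'*H) r = v
u = r(1:m); gamma = r(m+1);
end

A new point x (a 1-by-n row) would then be classified by sign(gaussian_kernel(x, A, mu)*(d.*u) - gamma).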

17 Linear & Nonlinear PSVM MATLAB Code

function [w, gamma] = psvm(A,d,nu)
% PSVM: linear and nonlinear classification
% INPUT: A, d = diag(D), nu. OUTPUT: w, gamma
% [w, gamma] = psvm(A,d,nu);
[m,n]=size(A); e=ones(m,1); H=[A -e];
v=(d'*H)';                    % v = H'*D*e
r=(speye(n+1)/nu+H'*H)\v;     % solve (I/nu + H'*H) r = v
w=r(1:n); gamma=r(n+1);       % extract w, gamma from r
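A quick usage sketch (mine, not from the slides): two shifted Gaussian clouds, train, and report training correctness.

m = 200; n = 2;
A = [randn(m/2,n)+1; randn(m/2,n)-1];   % two shifted clouds
d = [ones(m/2,1); -ones(m/2,1)];        % +/-1 labels
[w, gamma] = psvm(A, d, 1);             % nu = 1
pred = sign(A*w - gamma);               % linear PSVM classifier
fprintf('training correctness: %.1f%%\n', 100*mean(pred==d));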

18 Linear PSVM Comparisons with Other SVMs: Much Faster, Comparable Correctness
Ten-fold test set correctness % (and running time, sec.):

Data Set (m x n)            PSVM          SSVM          SVM
WPBC (60 mo.), 110 x 32     68.5 (0.02)   68.5 (0.17)   62.7 (3.85)
Ionosphere, 351 x 34        87.3 (0.17)   88.7 (1.23)   88.0 (2.19)
Cleveland Heart, 297 x 13   85.9 (0.01)   86.2 (0.70)   86.5 (1.44)
Pima Indians, 768 x 8       77.5 (0.02)   77.6 (0.78)   76.4 (37.00)
BUPA Liver, 345 x 6         69.4 (0.02)   70.0 (0.78)   69.5 (6.65)
Galaxy Dim, 4192 x 14       93.5 (0.34)   95.0 (5.21)   94.1 (28.33)

19 Linear PSVM vs LSVM: 2-Million-Point Dataset, Over 30 Times Faster

Dataset      Method   Training correctness %   Testing correctness %   Time (sec.)
NDC "Easy"   LSVM     90.86                    91.23                   658.5
NDC "Easy"   PSVM     90.80                    91.13                   20.8
NDC "Hard"   LSVM     69.80                    69.44                   655.6
NDC "Hard"   PSVM     69.84                    69.52                   20.6

20 Nonlinear PSVM: Spiral Dataset, 94 Red Dots & 94 White Dots
[Figure: the two intertwined spirals of points, separated by the nonlinear PSVM surface]

21 Nonlinear PSVM Comparisons
Ten-fold test set correctness % (and running time, sec.):

Data Set (m x n)          PSVM           SSVM            LSVM
Ionosphere, 351 x 34      95.2 (4.60)    95.8 (25.25)    95.8 (14.58)
BUPA Liver, 345 x 6       73.6 (4.34)    73.7 (20.65)    73.7 (30.75)
Tic-Tac-Toe, 958 x 9      98.4 (74.95)   98.4 (395.30)   94.7 (350.64)
Mushroom*, 8124 x 22      88.0 (35.50)   88.8 (307.66)   87.8 (503.74)

* A rectangular kernel of size 8124 x 215 was used

22 Conclusion
- PSVM is an extremely simple procedure for generating linear and nonlinear classifiers
- The PSVM classifier is obtained by solving a single system of linear equations; for a linear classifier, the system lives in the usually small-dimensional input space
- Test set correctness is comparable to that of standard SVM
- Much faster than standard SVMs: typically an order of magnitude faster

23 Future Work
- Extension of PSVM to multicategory classification
- Massive data classification using an incremental PSVM
- Parallel formulation and implementation of PSVM

