1
Proximal Plane Classification
KDD 2001, San Francisco, August 26-29, 2001
Glenn Fung & Olvi Mangasarian
Second Annual Review, June 1, 2001
Data Mining Institute, University of Wisconsin - Madison
2
Key Contributions
Fast new support vector machine classifier
  An order of magnitude faster than standard classifiers
Extremely simple to implement
  4 lines of MATLAB code
  No optimization packages (LP, QP) needed
3
Outline of Talk
(Standard) Support vector machine (SVM) classifiers
Proximal support vector machine (PSVM) classifiers
  Geometric motivation
  Linear PSVM classifier
  Nonlinear PSVM classifier
    Full and reduced kernels
Numerical results
  Correctness comparable to standard SVM
  Much faster classification!
    2 million points in 10-space in 21 seconds
    Compared to over 10 minutes for standard SVM
4
Support Vector Machines
Maximizing the Margin between Bounding Planes
[Figure: point sets A+ and A- with two bounding planes]
5
Proximal Vector Machines
Fitting the Data using two parallel Bounding Planes
[Figure: point sets A+ and A- with two parallel planes fitted to the data]
6
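SVM as an Unconstrained Minimization Problem
Changing to 2-norm and measuring margin in $(w, \gamma)$ space:
(QP) $\min_{w,\gamma,y} \; \frac{\nu}{2}\|y\|^2 + \frac{1}{2}(w'w + \gamma^2)$ s.t. $D(Aw - e\gamma) + y \ge e$
At the solution of (QP): $y = (e - D(Aw - e\gamma))_+$, where $(\cdot)_+ = \max\{\cdot, 0\}$.
Hence (QP) is equivalent to: $\min_{w,\gamma} \; \frac{\nu}{2}\|(e - D(Aw - e\gamma))_+\|^2 + \frac{1}{2}(w'w + \gamma^2)$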
7
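PSVM Formulation
We have from the QP SVM formulation, with the inequality constraint replaced by an equality:
(QP) $\min_{w,\gamma,y} \; \frac{\nu}{2}\|y\|^2 + \frac{1}{2}(w'w + \gamma^2)$ s.t. $D(Aw - e\gamma) + y = e$
This simple but critical modification changes the nature of the optimization problem tremendously!
Solving for $y$ in terms of $w$ and $\gamma$ gives: $\min_{w,\gamma} \; \frac{\nu}{2}\|e - D(Aw - e\gamma)\|^2 + \frac{1}{2}(w'w + \gamma^2)$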
8
Advantages of New Formulation
Objective function remains strongly convex
An explicit exact solution can be written in terms of the problem data
PSVM classifier is obtained by solving a single system of linear equations in the usually small dimensional input space
Exact leave-one-out correctness can be obtained in terms of problem data
9
Linear PSVM
We want to solve: $\min_{w,\gamma} \; \frac{\nu}{2}\|e - D(Aw - e\gamma)\|^2 + \frac{1}{2}(w'w + \gamma^2)$
Setting the gradient equal to zero gives a nonsingular system of linear equations.
Solution of the system gives the desired PSVM classifier.
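The step from the objective to the linear system can be written out as follows. This is a reconstruction, not text from the slide: it assumes the objective $\frac{\nu}{2}\|e - D(Aw - e\gamma)\|^2 + \frac{1}{2}(w'w + \gamma^2)$ and borrows the names $H = [A \;\; -e]$ and $r = [w; \gamma]$ from the MATLAB code on slide 16, whose solve line it reproduces.

```latex
\begin{align*}
f(r) &= \frac{\nu}{2}\,\|DHr - e\|^2 + \frac{1}{2}\, r'r,
      \qquad H = [A \;\; -e], \quad r = \begin{bmatrix} w \\ \gamma \end{bmatrix} \\
\nabla f(r) &= \nu H'D\,(DHr - e) + r
             = \nu H'H r - \nu H'De + r
             \qquad (\text{since } D^2 = I) \\
\nabla f(r) = 0 &\iff \left(\frac{I}{\nu} + H'H\right) r = H'De
\end{align*}
```

The matrix $I/\nu + H'H$ is positive definite for any $\nu > 0$, which is why the system is nonsingular.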
10
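Linear PSVM Solution
$\begin{bmatrix} w \\ \gamma \end{bmatrix} = (I/\nu + H'H)^{-1} H'De$
Here, $H = [A \;\; -e]$.
The linear system to solve depends on $H'H$, which is of size $(n+1) \times (n+1)$; $n+1$ is usually much smaller than $m$.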
11
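Linear Proximal SVM Algorithm
Input: $A$, $D$
Define: $H = [A \;\; -e]$
Calculate: $v = H'De$
Solve: $(I/\nu + H'H)\, r = v$, where $r = [w; \gamma]$
Classifier: $\mathrm{sign}(x'w - \gamma)$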
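The algorithm can be traced end to end on a tiny example. The following is a minimal pure-Python sketch, assuming the linear system `(I/nu + H'H) r = H'De` with `H = [A -e]` taken from the MATLAB code on slide 16. The helper names `solve`, `psvm_train` and `psvm_predict` and the toy data are hypothetical, introduced only for illustration; this is not the authors' implementation.

```python
def solve(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [M | b], copied so the inputs are left untouched.
    aug = [list(row) + [b[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(aug[i][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for i in range(col + 1, n):
            factor = aug[i][col] / aug[col][col]
            for j in range(col, n + 1):
                aug[i][j] -= factor * aug[col][j]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def psvm_train(A, d, nu):
    """Return (w, gamma) from (I/nu + H'H) r = H'De with H = [A -e]."""
    m, n = len(A), len(A[0])
    H = [list(row) + [-1.0] for row in A]
    # v = H'De: columns of H summed with the labels d_i = +1 or -1 as weights.
    v = [sum(d[i] * H[i][j] for i in range(m)) for j in range(n + 1)]
    # System matrix I/nu + H'H is (n+1) x (n+1), independent of m.
    M = [[sum(H[i][j] * H[i][k] for i in range(m)) + (1.0 / nu if j == k else 0.0)
          for k in range(n + 1)] for j in range(n + 1)]
    r = solve(M, v)
    return r[:n], r[n]


def psvm_predict(w, gamma, x):
    """Classify x by the sign of x'w - gamma."""
    score = sum(wi * xi for wi, xi in zip(w, x)) - gamma
    return 1 if score >= 0 else -1


# Toy data (made up for illustration): two well-separated clusters in the plane.
A = [[2.0, 2.0], [3.0, 3.0], [2.5, 3.5], [-2.0, -2.0], [-3.0, -3.0], [-2.5, -3.5]]
d = [1, 1, 1, -1, -1, -1]
w, gamma = psvm_train(A, d, nu=1.0)
predictions = [psvm_predict(w, gamma, x) for x in A]
```

Only the small `(n+1) x (n+1)` system is ever solved, so the cost of the solve does not grow with the number of points `m`; forming `H'H` is the only step that touches every row.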
12
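Nonlinear PSVM Formulation
Linear PSVM (linear separating surface: $x'w = \gamma$):
(QP) $\min_{w,\gamma,y} \; \frac{\nu}{2}\|y\|^2 + \frac{1}{2}(w'w + \gamma^2)$ s.t. $D(Aw - e\gamma) + y = e$
By QP "duality", $w = A'Du$. Maximizing the margin in the "dual space" gives:
$\min_{u,\gamma,y} \; \frac{\nu}{2}\|y\|^2 + \frac{1}{2}(u'u + \gamma^2)$ s.t. $D(AA'Du - e\gamma) + y = e$
Replace $AA'$ by a nonlinear kernel $K(A, A')$:
$\min_{u,\gamma,y} \; \frac{\nu}{2}\|y\|^2 + \frac{1}{2}(u'u + \gamma^2)$ s.t. $D(K(A, A')Du - e\gamma) + y = e$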
13
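The Nonlinear Classifier
The nonlinear classifier: $\mathrm{sign}(K(x', A')Du - \gamma)$
where K is a nonlinear kernel, e.g.:
Gaussian (radial basis) kernel: $K(A, A')_{ij} = \exp(-\mu\|A_i - A_j\|^2)$, $i, j = 1, \dots, m$
Polynomial kernel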
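The two kernel families named on this slide can be sketched in a few lines of pure Python. The Gaussian form `exp(-mu * ||A_i - B_j||^2)` is the standard radial basis kernel; the polynomial form `(A_i . B_j + offset)^degree` is one common textbook variant and is an assumption here, since the slide's own polynomial formula did not survive extraction. Function and parameter names are hypothetical.

```python
import math


def gaussian_kernel(A, B, mu):
    """Return K with K[i][j] = exp(-mu * ||A_i - B_j||^2) for rows A_i, B_j."""
    return [[math.exp(-mu * sum((a - b) ** 2 for a, b in zip(ai, bj)))
             for bj in B] for ai in A]


def polynomial_kernel(A, B, degree, offset=1.0):
    """Return K with K[i][j] = (A_i . B_j + offset)^degree (assumed form)."""
    return [[(sum(a * b for a, b in zip(ai, bj)) + offset) ** degree
             for bj in B] for ai in A]


# Three made-up points in the plane.
A = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
K = gaussian_kernel(A, A, mu=0.5)        # full 3 x 3 kernel
P = polynomial_kernel(A, A, degree=2)    # full 3 x 3 kernel
R = gaussian_kernel(A, A[:2], mu=0.5)    # rectangular 3 x 2 kernel
```

Passing only a subset of the rows as the second argument yields a rectangular kernel with fewer columns, which is the shape a reduced-kernel approach works with.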
14
Nonlinear PSVM
Defining $H$ slightly differently: $H = [K(A, A') \;\; -e]$
Similar to the linear case, setting the gradient equal to zero, we obtain: $\begin{bmatrix} Du \\ \gamma \end{bmatrix} = (I/\nu + H'H)^{-1} H'De$
Here, the linear system to solve is of size $(m+1) \times (m+1)$.
However, reduced kernel techniques (RSVM) can be used to reduce dimensionality.
15
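Nonlinear Proximal SVM Algorithm
Input: $A$, $D$, from which $K = K(A, A')$
Define: $H = [K \;\; -e]$
Calculate: $v = H'De$
Solve: $(I/\nu + H'H)\, r = v$, where $r = [Du; \gamma]$
Classifier: $\mathrm{sign}(K(x', A')Du - \gamma)$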
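A minimal pure-Python sketch of a kernelized PSVM on the XOR problem, which no linear classifier can separate. It assumes that the nonlinear algorithm reuses the linear solve with a Gaussian kernel matrix `K(A, A')` in place of `A`, as the comment in the MATLAB code on slide 16 ("linear and nonlinear classification") suggests. All function names, the data and the parameter values are hypothetical choices for illustration.

```python
import math


def solve(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [b[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(aug[i][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for i in range(col + 1, n):
            factor = aug[i][col] / aug[col][col]
            for j in range(col, n + 1):
                aug[i][j] -= factor * aug[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def gaussian_kernel(A, B, mu):
    """K[i][j] = exp(-mu * ||A_i - B_j||^2)."""
    return [[math.exp(-mu * sum((a - b) ** 2 for a, b in zip(ai, bj)))
             for bj in B] for ai in A]


def nonlinear_psvm_train(A, d, nu, mu):
    """Solve (I/nu + H'H) r = H'De with H = [K -e], K = K(A, A')."""
    m = len(A)
    K = gaussian_kernel(A, A, mu)
    H = [row + [-1.0] for row in K]  # m x (m+1)
    v = [sum(d[i] * H[i][j] for i in range(m)) for j in range(m + 1)]
    M = [[sum(H[i][j] * H[i][k] for i in range(m)) + (1.0 / nu if j == k else 0.0)
          for k in range(m + 1)] for j in range(m + 1)]
    r = solve(M, v)
    return r[:m], r[m]  # kernel weights (one per training point) and gamma


def nonlinear_psvm_predict(A, weights, gamma, mu, x):
    """Classify x by the sign of K(x', A') * weights - gamma."""
    k_row = gaussian_kernel([x], A, mu)[0]
    score = sum(k * u for k, u in zip(k_row, weights)) - gamma
    return 1 if score >= 0 else -1


# XOR-style data: opposite corners of the unit square share a label.
A = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]]
d = [1, 1, -1, -1]
weights, gamma = nonlinear_psvm_train(A, d, nu=100.0, mu=1.0)
predictions = [nonlinear_psvm_predict(A, weights, gamma, 1.0, x) for x in A]
```

With a full kernel the system has `m + 1` unknowns, so the cost now grows with the number of training points; this is the motivation for using a kernel with fewer columns.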
16
PSVM MATLAB Code
function [w, gamma] = psvm(A,d,nu)
% PSVM: linear and nonlinear classification
% INPUT: A, d=diag(D), nu. OUTPUT: w, gamma
% [w, gamma] = psvm(A,d,nu);
[m,n]=size(A); e=ones(m,1); H=[A -e];
v=(d'*H)';                   % v=H'*D*e
r=(speye(n+1)/nu+H'*H)\v;    % solve (I/nu+H'*H)r=v
w=r(1:n); gamma=r(n+1);      % getting w,gamma from r
17
Linear PSVM Comparisons with Other SVMs
Much Faster, Comparable Correctness
Each cell: ten-fold test correctness (%) / time (sec.)
Data Set          m x n       PSVM           SSVM           SVM
WPBC (60 mo.)     110 x 32    68.5 / 0.02    68.5 / 0.17    62.7 / 3.85
Ionosphere        351 x 34    87.3 / 0.17    88.7 / 1.23    88.0 / 2.19
Cleveland Heart   297 x 13    85.9 / 0.01    86.2 / 0.70    86.5 / 1.44
Pima Indians      768 x 8     77.5 / 0.02    77.6 / 0.78    76.4 / 37.00
BUPA Liver        345 x 6     69.4 / 0.02    70.0 / 0.78    69.5 / 6.65
Galaxy Dim        4192 x 14   93.5 / 0.34    95.0 / 5.21    94.1 / 28.33
18
Linear PSVM Comparisons on Larger Adult Dataset
Much Faster & Comparable Correctness
Attributes = 123. Each cell: testing correctness (%) / running time (sec.)
(Train, Test)     PSVM          LSVM            SSVM           SOR            SMO         SVM
(11221, 21341)    84.48 / 2.5   84.84 / 38.9    84.79 / 14.1   84.37 / 18.8   - / 17.0    84.68 / 306.6
(16101, 16461)    84.78 / 3.7   85.01 / 60.5    84.96 / 21.5   84.62 / 24.8   - / 35.3    84.83 / 667.2
(22697, 9865)     85.16 / 5.2   85.35 / 92.0    85.35 / 29.0   85.06 / 31.3   - / 85.7    85.17 / 1425.6
(32562, 16282)    84.56 / 7.4   85.05 / 140.9   85.02 / 44.5   84.96 / 83.9   - / 163.6   85.05 / 2184.0
19
Linear PSVM vs LSVM
2-Million Dataset
Over 30 Times Faster
Dataset      Method   Training Correctness %   Testing Correctness %   Time (sec.)
NDC "Easy"   LSVM     90.86                    91.23                   658.5
             PSVM     90.80                    91.13                   20.8
NDC "Hard"   LSVM     69.80                    69.44                   655.6
             PSVM     69.84                    69.52                   20.6
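The "over 30 times faster" claim can be checked directly from the timings on this slide. The short Python sketch below only divides the reported LSVM times by the reported PSVM times; the dictionary layout and variable names are illustrative.

```python
# Timings in seconds, copied from the NDC results on this slide.
timings = {
    "NDC Easy": {"LSVM": 658.5, "PSVM": 20.8},
    "NDC Hard": {"LSVM": 655.6, "PSVM": 20.6},
}

# Speedup = LSVM time divided by PSVM time for the same dataset.
speedups = {name: t["LSVM"] / t["PSVM"] for name, t in timings.items()}

for name, ratio in speedups.items():
    print(f"{name}: PSVM is {ratio:.1f}x faster than LSVM")
```

Both ratios come out just under 32 (about 31.7 and 31.8), consistent with the slide's headline.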
20
Nonlinear PSVM: Spiral Dataset
94 Red Dots & 94 White Dots
21
Nonlinear PSVM Comparisons
Each cell: ten-fold test correctness (%) / time (sec.)
Data Set      m x n       PSVM           SSVM            LSVM
Ionosphere    351 x 34    95.2 / 4.60    95.8 / 25.25    95.8 / 14.58
BUPA Liver    345 x 6     73.6 / 4.34    73.7 / 20.65    73.7 / 30.75
Tic-Tac-Toe   958 x 9     98.4 / 74.95   98.4 / 395.30   94.7 / 350.64
Mushroom *    8124 x 22   88.0 / 35.50   88.8 / 307.66   87.8 / 503.74
* A rectangular kernel of size 8124 x 215 was used
22
Conclusion
PSVM is an extremely simple procedure for generating linear and nonlinear classifiers
For a linear classifier, PSVM is obtained by solving a single system of linear equations in the usually small dimensional input space
Test set correctness comparable to standard SVM
Much faster than standard SVMs: typically an order of magnitude less time
23
Future Work
Extension of PSVM to multicategory classification
Massive data classification using an incremental PSVM
Parallel extension and implementation of PSVM