
1 Difficulties with Nonlinear SVM for Large Problems
- The nonlinear kernel K(A, A') is fully dense
- Computational complexity depends on the number of data points m
- Separating surface depends on almost the entire dataset
- Need to store the entire dataset after solving the problem
- Complexity of nonlinear SSVM is about O(m³) (a Newton system in m + 1 variables)
- Runs out of memory while storing the m × m kernel matrix
- Long CPU time to compute the m² kernel entries
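As a rough Python/NumPy illustration (not part of the slides) of the storage problem: the dense kernel matrix K(A, A') has m² entries, so even a moderate m exhausts memory. The sizes m, n and the Gaussian kernel width mu below are assumed values.

```python
import numpy as np
from scipy.spatial.distance import cdist

# Illustration only: m, n and mu are assumed values (m matches the largest
# UCI Adult training set quoted later in these slides).
m, n, mu = 32_562, 14, 1.0
print(f"storing K(A, A') in float64 needs ~{m * m * 8 / 1e9:.1f} GB")   # ~8.5 GB

# Computing the full dense Gaussian kernel is only feasible for small m.
A_small = np.random.default_rng(0).standard_normal((1_000, n))
K = np.exp(-mu * cdist(A_small, A_small, "sqeuclidean"))
print(K.shape)                                                           # (1000, 1000)
```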

2 Overcoming Computational & Storage Difficulties: Use a Rectangular Kernel
- Choose a small random sample Ā of A
- The small random sample Ā is a representative sample of the entire dataset
- Typically Ā is 1% to 10% of the rows of A
- Replace the full kernel K(A, A') by the rectangular kernel K(A, Ā'), with corresponding ū ∈ R^(m̄), in the nonlinear SSVM
- Only need to compute and store m × m̄ numbers for K(A, Ā')
- Computational complexity reduces to roughly O(m̄³) (a Newton system in only m̄ + 1 variables)
- The nonlinear separator only depends on Ā
- Using K(Ā, Ā') gives lousy results!
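A minimal Python sketch (not the authors' code) of the rectangular kernel idea: sample m̄ rows of A as Ā and compute only K(A, Ā'). The Gaussian kernel width mu and the matrix sizes are assumptions.

```python
import numpy as np
from scipy.spatial.distance import cdist

rng = np.random.default_rng(0)

def rectangular_kernel(A, mbar, mu=1.0):
    """Draw mbar random rows of A as Abar and return (K(A, Abar'), Abar)."""
    idx = rng.choice(A.shape[0], size=mbar, replace=False)      # 1% - 10% of the rows
    Abar = A[idx]
    K = np.exp(-mu * cdist(A, Abar, "sqeuclidean"))             # m x mbar, not m x m
    return K, Abar

A = rng.standard_normal((10_000, 14))                           # assumed toy data
K_rect, Abar = rectangular_kernel(A, mbar=100)                  # about 1% of the rows
print(K_rect.shape)                                             # (10000, 100)
```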

3 Reduced Support Vector Machine Algorithm
Nonlinear separating surface: K(x', Ā') D̄ ū = γ
(i) Choose a random subset matrix Ā ∈ R^(m̄ × n) of the entire data matrix A ∈ R^(m × n)
(ii) Solve the following problem by the Newton method, where D is the diagonal matrix of the ±1 labels of A, D̄ is the diagonal label matrix corresponding to Ā, and p(·, α) is the smooth plus-function approximation used in SSVM:
    min over (ū, γ) ∈ R^(m̄+1):  (ν/2) ||p(e − D(K(A, Ā') D̄ ū − e γ), α)||² + (1/2)(ū'ū + γ²)
(iii) The separating surface is defined by the optimal solution (ū, γ) from step (ii): K(x', Ā') D̄ ū = γ
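The following Python sketch (not the authors' implementation) mirrors steps (ii) and (iii): it minimizes the smoothed objective over (ū, γ) with SciPy's L-BFGS-B solver in place of the Newton method named on the slide. The Gaussian kernel width mu, ν and the smoothing parameter α are assumed defaults.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.spatial.distance import cdist

def rsvm_fit(A, d, Abar, dbar, nu=100.0, alpha=5.0, mu=1.0):
    """Sketch of step (ii): minimize (nu/2)||p(e - D(K(A,Abar')Dbar u - e g), alpha)||^2
    + (1/2)(u'u + g^2) over (u, g), using L-BFGS instead of the Newton method."""
    K = np.exp(-mu * cdist(A, Abar, "sqeuclidean"))     # rectangular kernel, m x mbar
    KD = K * dbar                                       # K(A, Abar') Dbar

    def objective(z):
        u, g = z[:-1], z[-1]
        r = 1.0 - d * (KD @ u - g)                      # e - D(K(A,Abar') Dbar u - e g)
        p = r + np.logaddexp(0.0, -alpha * r) / alpha   # smooth plus function p(r, alpha)
        return 0.5 * nu * (p @ p) + 0.5 * (u @ u + g * g)

    z = minimize(objective, np.zeros(len(dbar) + 1), method="L-BFGS-B").x
    return z[:-1], z[-1]                                # optimal (u, gamma)

def rsvm_predict(X, Abar, dbar, u, g, mu=1.0):
    """Step (iii): classify by the sign of K(x', Abar') Dbar u - gamma."""
    K = np.exp(-mu * cdist(X, Abar, "sqeuclidean"))
    return np.sign(K @ (dbar * u) - g)
```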

4 How to Choose m̄ in RSVM?
- Remove a random portion of the dataset as a tuning set; the remaining part of the dataset is our training set
- Start with a small m̄
- Repeat RSVM for 10 different random subsets Ā of the training set
- Compute correctness for each run on the fixed tuning set
- Compute the standard deviation of tuning set correctness over the 10 runs
- If the standard deviation is small (< 0.01), then use this m̄; otherwise increase m̄
(A sketch of this loop follows below.)
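A sketch of this tuning loop, reusing the hypothetical rsvm_fit / rsvm_predict helpers from the previous sketch; the tuning-set fraction and the candidate m̄ values are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def choose_mbar(A, d, tune_frac=0.1, candidates=(20, 40, 80, 160), tol=0.01):
    """Grow mbar until tuning-set correctness is stable across 10 random subsets."""
    m = A.shape[0]
    tune_idx = rng.choice(m, size=int(tune_frac * m), replace=False)   # fixed tuning set
    train_idx = np.setdiff1d(np.arange(m), tune_idx)                    # rest is training set
    At, dt = A[train_idx], d[train_idx]
    for mbar in candidates:                                             # start small, increase
        acc = []
        for _ in range(10):                                             # 10 random subsets Abar
            sub = rng.choice(At.shape[0], size=mbar, replace=False)
            u, g = rsvm_fit(At, dt, At[sub], dt[sub])
            pred = rsvm_predict(A[tune_idx], At[sub], dt[sub], u, g)
            acc.append(np.mean(pred == d[tune_idx]))
        if np.std(acc) < tol:                                           # stable correctness
            return mbar
    return candidates[-1]
```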

5 How to Choose Ā in RSVM?
- Ā is a representative sample of the entire dataset
- Ā need not be a subset of A
- A good selection of Ā may generate a classifier using a very small m̄
- Possible ways to choose Ā (see the sketch after this list):
  - Choose m̄ random rows from the entire dataset A
  - Choose Ā such that the distance between its rows exceeds a certain tolerance
  - Use k cluster centers of A+ and A− as Ā
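A sketch of the last option, using k cluster centers of the positive and negative examples as Ā, via SciPy's k-means; the per-class k is an assumed choice. Note the resulting rows of Ā are generally not rows of A, matching the second bullet above.

```python
import numpy as np
from scipy.cluster.vq import kmeans

def cluster_centers_Abar(A, d, k=25):
    """Build (Abar, dbar) from k cluster centers of each class; k is an assumed choice."""
    centers_pos, _ = kmeans(A[d > 0].astype(float), k)    # k centers of A+
    centers_neg, _ = kmeans(A[d < 0].astype(float), k)    # k centers of A-
    Abar = np.vstack([centers_pos, centers_neg])
    dbar = np.concatenate([np.ones(len(centers_pos)), -np.ones(len(centers_neg))])
    return Abar, dbar
```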

6 A Nonlinear Kernel Application
Checkerboard Training Set: 1000 Points in R². Separate 486 Asterisks from 514 Dots.
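For reference, a sketch of how such a checkerboard training set could be generated; the 4 x 4 grid over the unit square and the random seed are assumptions, since the slide itself gives only the point counts.

```python
import numpy as np

# Assumed setup: 1000 uniform points in the unit square, labeled by a 4x4 checkerboard.
rng = np.random.default_rng(7)
X = rng.uniform(0.0, 1.0, size=(1000, 2))                      # 1000 points in R^2
cell = np.floor(4 * X).astype(int)                             # which of the 4x4 cells
d = np.where((cell[:, 0] + cell[:, 1]) % 2 == 0, 1.0, -1.0)    # alternate cell labels
print((d > 0).sum(), (d < 0).sum())                            # two nearly equal classes
```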

7 Conventional SVM Result on Checkerboard Using 50 Randomly Selected Points Out of 1000

8 RSVM Result on Checkerboard Using SAME 50 Random Points Out of 1000

9 RSVM on Moderate Sized Problems (Best Test Set Correctness %, CPU seconds)

Dataset, size m x n, m̄        | RSVM K(A, Ā')      | Conventional SSVM K(A, A') | SVM on random subset K(Ā, Ā')
Cleveland Heart 297 x 13, 30  | 86.47 %, 3.04 s    | 85.92 %, 32.42 s           | 76.88 %, 1.58 s
BUPA Liver 345 x 6, 35        | 74.86 %, 2.68 s    | 73.62 %, 32.61 s           | 68.95 %, 2.04 s
Ionosphere 351 x 34, 35       | 95.19 %, 5.02 s    | 94.35 %, 59.88 s           | 88.70 %, 2.13 s
Pima Indians 768 x 8, 50      | 78.64 %, 5.72 s    | 76.59 %, 328.3 s           | 57.32 %, 4.64 s
Tic-Tac-Toe 958 x 9, 96       | 98.75 %, 14.56 s   | 98.43 %, 1033.5 s          | 88.24 %, 8.87 s
Mushroom 8124 x 22, 215       | 89.04 %, 466.20 s  | N/A (kernel too large)     | 83.90 %, 221.50 s

10 RSVM on Large UCI Adult Dataset: Average Test Set Correctness % & Standard Deviation over 50 Runs

(Training size, Testing size) | RSVM K(A, Ā') % ± std | K(Ā, Ā') % ± std | m̄   | m̄ / m
(6414, 26148)                 | 84.47 ± 0.001         | 77.03 ± 0.014    | 210 | 3.2%
(11221, 21341)                | 84.71 ± 0.001         | 75.96 ± 0.016    | 225 | 2.0%
(16101, 16461)                | 84.90 ± 0.001         | 75.45 ± 0.017    | 242 | 1.5%
(22697, 9865)                 | 85.31 ± 0.001         | 76.73 ± 0.018    | 284 | 1.2%
(32562, 16282)                | 85.07 ± 0.001         | 76.95 ± 0.013    | 326 | 1.0%

11 CPU Times on UCI Adult Dataset: RSVM, SMO and PCGC with a Gaussian Kernel
Training Set Size vs. CPU Time in Seconds

Size  | 3185  | 4781   | 6414   | 11221   | 16101  | 22697  | 32562
RSVM  | 44.2  | 83.6   | 123.4  | 227.8   | 342.5  | 587.4  | 980.2
SMO   | 66.2  | 146.6  | 258.8  | 781.4   | 1784.4 | 4126.4 | 7749.6
PCGC  | 380.5 | 1137.2 | 2530.6 | 11910.6 | ran out of memory for the remaining sizes

12 [Figure] CPU Time Comparison on UCI Dataset: RSVM, SMO and PCGC with a Gaussian Kernel (Training Set Size vs. Time in CPU sec.)

