1
Incremental Reduced Support Vector Machines
Yuh-Jye Lee, Hung-Yi Lo and Su-Yun Huang
National Taiwan University of Science and Technology and Institute of Statistical Science, Academia Sinica
2003 International Conference on Informatics, Cybernetics, and Systems, ISU, Kaohsiung, Dec. 14, 2003
2
Outline
Support Vector Machines for classification problems: linear and nonlinear SVMs
Difficulties with nonlinear SVMs for large problems: storage and computational complexity
Reduced Support Vector Machines
Incremental Reduced Support Vector Machines
Numerical Results
Conclusions
3
Support Vector Machines (SVMs)
Powerful tools for data mining
SVMs have a sound theoretical foundation, based on statistical learning theory
SVMs can be generated very efficiently and achieve high accuracy
SVMs have an optimally defined separating surface
SVMs have become the most promising learning algorithm for classification and regression
SVMs can be extended from the linear to the nonlinear case by using kernel functions
4
Support Vector Machines for Classification
Maximizing the margin between the bounding planes
(Figure: the two classes A+ and A- separated by parallel bounding planes)
5
Support Vector Machine Formulation
Solve the quadratic program for some $\nu > 0$:
$\min_{(w,\gamma,y)} \; \nu\, e^\top y + \tfrac{1}{2} w^\top w$
s.t. $D(Aw - e\gamma) + y \ge e, \; y \ge 0$  (QP)
where $A \in \mathbb{R}^{m \times n}$ contains the training points, $e$ is a vector of ones, and the diagonal matrix $D$ with $D_{ii} = \pm 1$ denotes the $A+$ or $A-$ membership of each point.
SSVM: the Smooth Support Vector Machine is an efficient SVM algorithm proposed by Yuh-Jye Lee.
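As a toy illustration of this formulation, the sketch below minimizes the SSVM-style unconstrained objective, in which the slack $y$ is replaced by the squared plus function, using a general-purpose optimizer rather than the Newton-Armijo method of SSVM; the function name, the default value of nu, and the choice of L-BFGS-B are assumptions made for this example.

```python
import numpy as np
from scipy.optimize import minimize

def train_linear_svm(A, d, nu=1.0):
    """Minimize nu/2 * ||(e - D(Aw - e*gamma))_+||^2 + 1/2 (w'w + gamma^2).

    A: m x n data matrix, d: +/-1 labels (the diagonal of D), nu: trade-off weight.
    """
    m, n = A.shape

    def objective(z):
        w, gamma = z[:n], z[n]
        slack = np.maximum(0.0, 1.0 - d * (A @ w - gamma))  # plus-function slack
        return 0.5 * nu * np.sum(slack ** 2) + 0.5 * (w @ w + gamma ** 2)

    res = minimize(objective, np.zeros(n + 1), method="L-BFGS-B")
    return res.x[:n], res.x[n]  # separating plane: w'x = gamma
```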
6
Nonlinear Support Vector Machine
Extend to nonlinear cases by using kernel functions.
Nonlinear Support Vector Machine formulation:
$\min_{(u,\gamma,y)} \; \nu\, e^\top y + \tfrac{1}{2} u^\top u$
s.t. $D(K(A, A^\top)Du - e\gamma) + y \ge e, \; y \ge 0$
The value of the kernel function $K(x, z)$ represents the inner product of $x$ and $z$ in the feature space.
The kernel maps the data from the input space to a higher-dimensional feature space where the data can be separated linearly.
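A commonly used kernel in this setting is the Gaussian (radial basis) kernel; the sketch below computes such a kernel matrix with NumPy. The helper name gaussian_kernel and the width parameter mu are assumptions for the example, and the same helper is reused in the later sketches.

```python
import numpy as np

def gaussian_kernel(A, B, mu=0.1):
    """K[i, j] = exp(-mu * ||A_i - B_j||^2), the feature-space inner product of A_i and B_j."""
    sq_dist = (np.sum(A ** 2, axis=1)[:, None]
               + np.sum(B ** 2, axis=1)[None, :]
               - 2.0 * A @ B.T)
    return np.exp(-mu * sq_dist)
```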
7
Difficulties with Nonlinear SVM for Large Problems
The separating surface depends on almost the entire dataset, so the entire dataset must be stored after solving the problem.
The nonlinear kernel $K(A, A^\top)$ is fully dense:
long CPU time to compute its $m^2$ entries,
runs out of memory while storing the $m \times m$ kernel matrix,
and the computational complexity of nonlinear SSVM grows with the number of data points $m$.
8
Reduced Support Vector Machines
Overcoming the computational and storage difficulties by using a rectangular kernel.
Choose a small random sample $\bar{A} \in \mathbb{R}^{\bar{m} \times n}$ of $A$; the small random sample is a representative sample of the entire dataset, typically 1% to 10% of the rows of $A$.
Replace the full kernel $K(A, A^\top)$ by the rectangular kernel $K(A, \bar{A}^\top)$, with the corresponding reduced variable $\bar{u}$, in the nonlinear SSVM.
Only $m \times \bar{m}$ numbers need to be computed and stored for the rectangular kernel, and the computational complexity is reduced accordingly, since it depends on $\bar{m}$ rather than $m$.
The nonlinear separator only depends on the reduced set $\bar{A}$.
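A minimal sketch of building the rectangular reduced kernel from a random row sample, reusing the gaussian_kernel helper sketched above; the 5% reduction rate, the fixed seed, and the function name are assumptions for the example.

```python
import numpy as np

def reduced_kernel(A, reduction=0.05, mu=0.1, seed=0):
    """Pick a small random row sample A_bar of A and form the m x m_bar kernel K(A, A_bar')."""
    rng = np.random.default_rng(seed)
    m = A.shape[0]
    m_bar = max(2, int(reduction * m))               # typically 1% to 10% of the rows
    A_bar = A[rng.choice(m, size=m_bar, replace=False)]
    return gaussian_kernel(A, A_bar, mu), A_bar      # rectangular kernel and reduced set
```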
9
Reduced Set plays the most important role in RSVM
It is natural to raise two questions:
Is there a way to choose the reduced set, other than random selection, so that RSVM will have better performance?
Is there a mechanism that determines the size of the reduced set automatically or dynamically?
The incremental reduced support vector machine (IRSVM) is proposed to answer these questions.
10
Our Observations ( Ⅰ )
The nonlinear separating surface $K(x, \bar{A}^\top)\bar{u} = \gamma$ is a linear combination of a set of kernel functions $\{K(x, \bar{A}_i^\top)\}_{i=1}^{\bar{m}}$.
If the kernel functions are very similar, the hypothesis space spanned by these kernel functions will be very limited.
11
Our Observations ( Ⅱ )
Start with a very small reduced set, then add a new data point only when its kernel function is dissimilar to the current set of kernel functions.
These points contribute the most extra information.
12
How do we measure dissimilarity? By solving least squares problems.
The information criterion: the distance from the candidate point's kernel vector $K(A, x^\top)$ to the column space of the current reduced kernel matrix $K(A, \bar{A}^\top)$ is greater than a threshold.
This distance can be determined by solving a least squares problem.
13
Dissimilarity Measurement: solving least squares problems
$\min_{\beta \in \mathbb{R}^{\bar{m}}} \; \| K(A, \bar{A}^\top)\beta - K(A, x^\top) \|_2^2$
When $K(A, \bar{A}^\top)$ has full column rank, this problem has a unique solution $\beta^* = \big(K(A, \bar{A}^\top)^\top K(A, \bar{A}^\top)\big)^{-1} K(A, \bar{A}^\top)^\top K(A, x^\top)$, and the distance is $\| K(A, \bar{A}^\top)\beta^* - K(A, x^\top) \|_2$.
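A short NumPy sketch of this distance computation via a least squares solve; the function and argument names are illustrative.

```python
import numpy as np

def distance_to_column_space(K_reduced, k):
    """Distance from the kernel vector k to the column space of the reduced kernel matrix."""
    beta, *_ = np.linalg.lstsq(K_reduced, k, rcond=None)   # solve min_beta ||K_reduced beta - k||
    return np.linalg.norm(K_reduced @ beta - k)            # residual norm = distance
```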
14
IRSVM Algorithm pseudo-code (sequential version)
1 Randomly choose two data points from the training data as the initial reduced set
2 Compute the reduced kernel matrix
3 For each data point not in the reduced set
4 Compute its kernel vector
5 Compute the distance from the kernel vector
6 to the column space of the current reduced kernel matrix
7 If its distance exceeds a certain threshold
8 Add this point into the reduced set and form the new reduced kernel matrix
9 Until several successive failures happen in line 7
10 Solve the QP problem of the nonlinear SVM with the obtained reduced kernel
11 A new data point is classified by the separating surface
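The following Python sketch mirrors the sequential selection loop above, reusing the gaussian_kernel and distance_to_column_space helpers from the earlier sketches; the threshold, the failure count used as the stopping rule, and the seed are assumptions, and the final QP training step (line 10) is left to an SVM solver.

```python
import numpy as np

def irsvm_select(A, mu=0.1, threshold=0.1, max_failures=10, seed=0):
    """Sequentially grow the reduced set, keeping only informative points."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(A.shape[0])
    reduced = list(order[:2])                              # line 1: two random initial points
    failures = 0
    for i in order[2:]:
        K_reduced = gaussian_kernel(A, A[reduced], mu)     # lines 2, 8: current reduced kernel
        k = gaussian_kernel(A, A[[i]], mu).ravel()         # line 4: kernel vector of candidate i
        if distance_to_column_space(K_reduced, k) > threshold:   # lines 5-7
            reduced.append(i)                              # line 8: add an informative point
            failures = 0
        else:
            failures += 1
            if failures >= max_failures:                   # line 9: several successive failures
                break
    return A[reduced]
```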
15
Speed up IRSVM
Note that we have to solve the least squares problem many times; for an $m \times \bar{m}$ reduced kernel matrix, each solve costs $O(m\bar{m}^2)$.
The main cost of each solve is determined by the current reduced kernel matrix, which is the same for every candidate point, not by the candidate's kernel vector.
Taking advantage of this fact, we propose a batch version of IRSVM that examines a batch of points at once.
16
IRSVM Algorithm pseudo-code (batch version)
1 Randomly choose two data points from the training data as the initial reduced set
2 Compute the reduced kernel matrix
3 For a batch of data points not in the reduced set
4 Compute their kernel vectors
5 Compute the corresponding distances from these kernel vectors
6 to the column space of the current reduced kernel matrix
7 For those points whose distance exceeds a certain threshold
8 Add those points into the reduced set and form the new reduced kernel matrix
9 Until no data point in a batch is added in lines 7-8
10 Solve the QP problem of the nonlinear SVM with the obtained reduced kernel
11 A new data point is classified by the separating surface
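A Python sketch of the batch variant, again reusing the gaussian_kernel helper; the batch size, threshold, and seed are illustrative. A single lstsq call computes the distances for every candidate in a batch, which is where the saving comes from.

```python
import numpy as np

def irsvm_select_batch(A, mu=0.1, threshold=0.1, batch_size=50, seed=0):
    """Batch variant: test a whole batch of candidates against one reduced kernel matrix."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(A.shape[0])
    reduced = list(order[:2])
    for start in range(2, len(order), batch_size):
        batch = order[start:start + batch_size]
        K_reduced = gaussian_kernel(A, A[reduced], mu)     # current reduced kernel matrix
        K_batch = gaussian_kernel(A, A[batch], mu)         # kernel vectors of the whole batch
        beta, *_ = np.linalg.lstsq(K_reduced, K_batch, rcond=None)  # all least squares at once
        dists = np.linalg.norm(K_reduced @ beta - K_batch, axis=0)
        added = batch[dists > threshold]
        if added.size == 0:                                # line 9: nothing added in this batch
            break
        reduced.extend(added.tolist())
    return A[reduced]
```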
17
IRSVM on four public data sets
18
Conclusions
IRSVM is an advanced algorithm built on RSVM.
It starts with an extremely small reduced set and sequentially expands it to include informative data points.
It determines the size of the reduced set automatically and dynamically rather than requiring it to be pre-specified.
The reduced set generated by IRSVM is more representative.
All the advantages of RSVM for dealing with large-scale nonlinear classification problems are retained.
Experimental tests show that IRSVM uses a smaller reduced set without sacrificing classification accuracy.