School of Computer Science, 김명재
◦ Introduction ◦ Data Preprocessing ◦ Model Selection ◦ Experiments
Support Vector Machine
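SVM (Support vector machine) ◦ Training set of instance-label pairs $(x_i, y_i)$, $i = 1, \dots, l$ ◦ where $x_i \in \mathbb{R}^n$ and $y_i \in \{1, -1\}$ ◦ Objective function $\min_{w, b, \xi} \; \frac{1}{2} w^T w + C \sum_{i=1}^{l} \xi_i$ subject to $y_i (w^T x_i + b) \ge 1 - \xi_i$, $\xi_i \ge 0$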
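To make the objective concrete, here is a minimal sketch that evaluates the standard soft-margin primal, (1/2) w^T w + C * sum(xi_i), at a given (w, b). It assumes the linear form of the constraints, y_i (w^T x_i + b) >= 1 - xi_i with xi_i >= 0, and takes each slack xi_i as the smallest value that satisfies them. The function name and the toy data are illustrative, not part of the slides.

```python
def primal_objective(w, b, C, X, y):
    """Return (objective, slacks) for the soft-margin primal at a given (w, b)."""
    slacks = []
    for x_i, y_i in zip(X, y):
        margin = y_i * (sum(w_k * x_k for w_k, x_k in zip(w, x_i)) + b)
        # Smallest slack satisfying y_i (w^T x_i + b) >= 1 - xi_i and xi_i >= 0.
        slacks.append(max(0.0, 1.0 - margin))
    objective = 0.5 * sum(w_k * w_k for w_k in w) + C * sum(slacks)
    return objective, slacks


# Toy data (assumed): the second point lies on the decision boundary,
# so it is the only one that needs a nonzero slack.
X = [[2.0, 0.0], [0.0, 0.0], [-2.0, 0.0]]
y = [1, 1, -1]
objective, slacks = primal_objective([1.0, 0.0], 0.0, 1.0, X, y)
```

A larger C makes each unit of slack more expensive, so the optimizer trades a wider margin for fewer violations.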
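Dual space form ◦ Objective function maximize $\sum_{i=1}^{l} \alpha_i - \frac{1}{2} \sum_{i=1}^{l} \sum_{j=1}^{l} \alpha_i \alpha_j y_i y_j x_i^T x_j$ subject to $0 \le \alpha_i \le C$, $\sum_{i=1}^{l} \alpha_i y_i = 0$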
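Nonlinear SVM ◦ Kernel method: training vectors $x_i$ are mapped into a higher-dimensional (maybe infinite-dimensional) space by a mapping function $\phi$ ◦ Objective function maximize $\sum_{i=1}^{l} \alpha_i - \frac{1}{2} \sum_{i=1}^{l} \sum_{j=1}^{l} \alpha_i \alpha_j y_i y_j \phi(x_i)^T \phi(x_j)$ subject to $0 \le \alpha_i \le C$, $\sum_{i=1}^{l} \alpha_i y_i = 0$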
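◦ Kernel function $K(x_i, x_j) = \phi(x_i)^T \phi(x_j)$ ◦ Linear: $K(x_i, x_j) = x_i^T x_j$ ◦ Polynomial: $K(x_i, x_j) = (\gamma x_i^T x_j + r)^d$, $\gamma > 0$ ◦ Radial basis function: $K(x_i, x_j) = \exp(-\gamma \lVert x_i - x_j \rVert^2)$, $\gamma > 0$ ◦ Sigmoid: $K(x_i, x_j) = \tanh(\gamma x_i^T x_j + r)$ ◦ $\gamma$, $r$, and $d$ are kernel parameters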
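The four kernels can be sketched in plain Python as below. The formulas are the commonly used textbook forms (linear u·v, polynomial (gamma u·v + r)^d, RBF exp(-gamma ||u - v||^2), sigmoid tanh(gamma u·v + r)); they are assumed here rather than copied from the slide, and the function names are illustrative.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(u, v):
    return dot(u, v)


def polynomial_kernel(u, v, gamma, r, d):
    return (gamma * dot(u, v) + r) ** d


def rbf_kernel(u, v, gamma):
    # Squared Euclidean distance between u and v.
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq_dist)


def sigmoid_kernel(u, v, gamma, r):
    return math.tanh(gamma * dot(u, v) + r)
```

Note that the RBF kernel of a point with itself is always 1, and its value decays toward 0 as the two points move apart; gamma controls how fast.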
Example ◦ Data URL
Application | #training data | #testing data | #features | #classes
Astroparticle | 3,089 | 4,000 | 4 | 2
Bioinformatics |
Vehicle | 1,
Proposed Procedure ◦ Transform the data to the format of an SVM package ◦ Conduct simple scaling on the data ◦ Consider the RBF kernel ◦ Use cross-validation to find the best parameters $C$ and $\gamma$ ◦ Use the best $C$ and $\gamma$ to train on the whole training set ◦ Test
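As an illustration of the first step, the sketch below writes one instance in the sparse text format used by LIBSVM, `<label> <index>:<value> ...`, with 1-based feature indices. The slide does not say which package format is meant, so the choice of LIBSVM here is an assumption, and the helper name is hypothetical.

```python
def to_libsvm_line(label, features):
    """Format one instance as '<label> <index>:<value> ...' with 1-based indices."""
    parts = [str(label)]
    for index, value in enumerate(features, start=1):
        if value != 0:  # zero-valued features may be omitted in the sparse format
            parts.append(f"{index}:{value:g}")
    return " ".join(parts)


# Example instance (assumed data): class label 1 with three features.
line = to_libsvm_line(1, [0.5, 0.0, -1.0])
```

Writing one such line per instance yields a file that can be passed to the package's training tool.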
Categorical Feature ◦ Example: a three-category attribute such as {red, green, blue} can be represented as (0, 0, 1), (0, 1, 0), and (1, 0, 0)
Scaling ◦ Scaling before applying SVM is very important ◦ Linearly scale each attribute to the range [-1, +1] or [0, 1]
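Both preprocessing steps are short enough to sketch directly. The category order is chosen so that red maps to (0, 0, 1) as in the example above; the function names, the default target range, and the handling of a constant attribute are assumptions made for illustration.

```python
def one_hot(value, categories):
    """Encode a categorical value as a 0/1 vector, one position per category."""
    return [1 if value == c else 0 for c in categories]


def fit_scaling(column):
    """Learn the scaling factors (min and max) from the training values."""
    return min(column), max(column)


def scale(value, lo, hi, lower=-1.0, upper=1.0):
    """Linearly map [lo, hi] onto [lower, upper]."""
    if hi == lo:
        # Constant attribute: map to the midpoint (an arbitrary choice).
        return (lower + upper) / 2.0
    return lower + (upper - lower) * (value - lo) / (hi - lo)


colors = ["blue", "green", "red"]
encoded = [one_hot(c, colors) for c in ("red", "green", "blue")]

# Fit the factors on training values only, then reuse them on test values.
lo, hi = fit_scaling([-10.0, 0.0, 10.0])
scaled_test = [scale(v, lo, hi) for v in (-11.0, 8.0)]
```

Reusing the training-set factors on the test set means test values can fall slightly outside the target range, which is expected; fitting separate factors on the test set would put the two sets on different scales.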
RBF kernel ◦ The RBF kernel is a reasonable first choice ◦ It nonlinearly maps samples into a higher-dimensional space ◦ It has fewer hyperparameters than the polynomial kernel, which keeps model selection simpler ◦ It has fewer numerical difficulties
Cross-validation
◦ Find good parameters $(C, \gamma)$ ◦ Avoid the overfitting problem ◦ v-fold cross-validation: divide the training set into v subsets of equal size; sequentially, one subset is tested using the classifier trained on the remaining v-1 subsets
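The v-fold procedure can be sketched as follows. This is a minimal version under stated assumptions: folds are contiguous and unshuffled (in practice the data would usually be shuffled first), and `train_fn` and `predict_fn` are hypothetical callables standing in for whatever classifier is being evaluated.

```python
def v_fold_splits(n_samples, v):
    """Yield (train_indices, test_indices) for each of the v folds."""
    indices = list(range(n_samples))
    # Fold sizes differ by at most one when n_samples is not divisible by v.
    fold_sizes = [n_samples // v + (1 if i < n_samples % v else 0) for i in range(v)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def cross_validation_accuracy(X, y, v, train_fn, predict_fn):
    """Fraction of instances predicted correctly when each is held out once."""
    correct = 0
    for train_idx, test_idx in v_fold_splits(len(X), v):
        model = train_fn([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct += sum(predict_fn(model, X[i]) == y[i] for i in test_idx)
    return correct / len(X)
```

Because every instance is predicted exactly once by a model that never saw it, the resulting accuracy is a less optimistic estimate than the accuracy on the training set itself.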
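Grid-search ◦ Try various pairs of $(C, \gamma)$ values ◦ Find good parameters using exponentially growing sequences, for example $C = 2^{-5}, 2^{-3}, \dots, 2^{15}$ and $\gamma = 2^{-15}, 2^{-13}, \dots, 2^{3}$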
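A grid search over (C, gamma) is a pair of nested loops around a scoring function. In the sketch below, `score_fn` is a hypothetical callable that would return the cross-validation accuracy for a given pair, and the exponent ranges are illustrative assumptions, not values taken from the slide.

```python
def exponential_grid(start, stop, step):
    """Return [2**start, 2**(start + step), ..., 2**stop]."""
    return [2.0 ** e for e in range(start, stop + 1, step)]


def grid_search(score_fn, c_values, gamma_values):
    """Try every (C, gamma) pair and return (best_score, best_C, best_gamma)."""
    best = None
    for C in c_values:
        for gamma in gamma_values:
            score = score_fn(C, gamma)
            if best is None or score > best[0]:
                best = (score, C, gamma)
    return best


# Assumed exponent ranges for illustration.
c_values = exponential_grid(-5, 15, 2)
gamma_values = exponential_grid(-15, 3, 2)
```

Each pair is evaluated independently, so the loop parallelizes trivially; a coarse grid can also be followed by a finer one around the best pair.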
Astroparticle Physics ◦ original accuracy % ◦ after scaling % ◦ after grid-search 96.875% (3875/4000)
Bioinformatics ◦ original cross-validation accuracy % ◦ after scaling cross-validation accuracy % ◦ after grid-search %
Vehicle ◦ original accuracy % ◦ after scaling % ◦ after grid-search 87.80% (36/41)
LIBSVM ◦ A Training Algorithm for Optimal Margin Classifiers, Bernhard E. Boser, Isabelle M. Guyon, Vladimir N. Vapnik ◦ Course textbook