Introduction to Using Libsvm-2.6 quietsea@bbs.hit.edu.cn.


Features of Libsvm-2.6
Supports multi-class classification
Different SVM formulations
Cross-validation for model selection
Probability estimates
Weighted SVM for unbalanced data
Both C++ and Java sources
Version 2.8 was released on April Fools' Day, 2005

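Program structure of Libsvm-2.6
Kernel class
Solver class: solves the quadratic programming problem with a generalized SMO and SVMLight-style algorithm
Multi-class classification is handled with the one-against-one approach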

Format of training and testing data file
<label> <index1>:<value1> <index2>:<value2> ...
+1 1: :1 3:1 4: : :-1 7:1
-1 1: :-1 3: : :1 6:-1 7:1
+1 1: :1 3:-1 4: : :-1 7:-1
-1 1: :1 3:1 4: : :-1 7:-1
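As a rough illustration of the `<label> <index>:<value>` layout, the following minimal sketch parses one line into a label and a sparse feature dictionary. The function name `parse_libsvm_line` and the sample values are assumptions made for this example; they are not taken from the slides.

```python
def parse_libsvm_line(line):
    """Parse one '<label> <index>:<value> ...' line into (label, {index: value})."""
    tokens = line.split()
    label = float(tokens[0])          # "+1" and "-1" both parse as floats
    features = {}
    for token in tokens[1:]:
        index, value = token.split(":")
        features[int(index)] = float(value)
    return label, features


# Hypothetical sample line; the numeric values are illustrative only.
sample = "+1 1:0.5 3:1 7:-1"
label, features = parse_libsvm_line(sample)
print(label, features)
```

Attributes that do not appear on a line (here 2, 4, 5 and 6) are simply absent from the dictionary, which is how the sparse format saves space.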

Data scaling
Scaling prevents attributes with larger numeric ranges from dominating those with smaller ranges. Each attribute is usually scaled to [0,1] or [-1,+1]. The scaling parameters computed on the training set are saved to a file (here "range") and reused for the test set:
svmscale -l -1 -u 1 -s range train.1 > train.1.scale
svmscale -r range test.1 > test.1.scale
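The idea behind saving the ranges from the training set and restoring them for the test set can be sketched as follows. This is a simplified illustration, not the svmscale implementation: the helper names `fit_ranges` and `scale_row` are hypothetical, and the sketch assumes every attribute is listed explicitly in every row.

```python
def fit_ranges(rows):
    """Collect per-attribute (min, max) from training rows (dicts of index -> value)."""
    ranges = {}
    for row in rows:
        for index, value in row.items():
            lo, hi = ranges.get(index, (value, value))
            ranges[index] = (min(lo, value), max(hi, value))
    return ranges


def scale_row(row, ranges, lower=-1.0, upper=1.0):
    """Linearly map each attribute to [lower, upper] using the training ranges."""
    scaled = {}
    for index, value in row.items():
        if index not in ranges:
            continue  # attribute never seen in training; skipped in this sketch
        lo, hi = ranges[index]
        if hi == lo:
            continue  # constant attribute; skipped in this sketch
        scaled[index] = lower + (upper - lower) * (value - lo) / (hi - lo)
    return scaled


# Hypothetical data: the ranges come from the training rows only.
train = [{1: 0.0, 2: 10.0}, {1: 4.0, 2: 30.0}]
test = [{1: 2.0, 2: 40.0}]
ranges = fit_ranges(train)
train_scaled = [scale_row(r, ranges) for r in train]
test_scaled = [scale_row(r, ranges) for r in test]
print(train_scaled, test_scaled)
```

Because the test row is scaled with the training ranges, a test value outside the training range (attribute 2 here) lands outside [-1, +1]. That is expected: training and test data must go through the same transformation.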

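Svmtrain
One-class: a hyperplane is placed so that it separates the dataset from the origin with maximal margin. The regularization parameter nu, in (0,1), is user-defined and indicates the fraction of the data that should be accepted by the description.
nu-SVR: nu-regression. It introduces a parameter nu from which epsilon is computed automatically. If q denotes the number of error samples, then nu >= q/l, so nu is an upper bound on the fraction of error samples among all l samples; if p denotes the number of support vectors, then nu <= p/l, so nu is a lower bound on the fraction of support vectors. Choose nu and C first, then solve the optimization problem.
Shrinking: whether to use shrinking during optimization. For bounded support vectors (BSVs, i.e. SVs with a_i = C), a_i does not change during the iterations; if these points are identified and fixed at C, the size of the QP problem can be reduced.
Probability estimate: whether to train an SVC or SVR model for probability output.
-wi: weighting parameter for unbalanced samples.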
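The options above map onto command-line flags of the training program (-s for the SVM type, -n for nu, -h for shrinking, -b for probability estimates, -wi for class weights). The sketch below only assembles an argument list and does not run anything. The helper name `build_svmtrain_args` is hypothetical, and the executable name "svmtrain" is an assumption, since the binary is named differently on some platforms.

```python
def build_svmtrain_args(train_file, svm_type=0, nu=None, shrinking=True,
                        probability=False, class_weights=None):
    """Assemble a training command line as a list of strings.

    svm_type follows the LIBSVM numbering (0 C-SVC, 1 nu-SVC, 2 one-class,
    3 epsilon-SVR, 4 nu-SVR). The executable name is an assumption.
    """
    args = ["svmtrain", "-s", str(svm_type)]
    if nu is not None:
        args += ["-n", str(nu)]
    args += ["-h", "1" if shrinking else "0"]
    args += ["-b", "1" if probability else "0"]
    # One -w<label> <weight> pair per class, e.g. "-w-1 5" for class -1.
    for label, weight in sorted((class_weights or {}).items()):
        args += ["-w%d" % label, str(weight)]
    args.append(train_file)
    return args


# Hypothetical call: nu-SVR without shrinking, with probability output.
cmd = build_svmtrain_args("train.1.scale", svm_type=4, nu=0.5,
                          shrinking=False, probability=True,
                          class_weights={1: 1, -1: 5})
print(" ".join(cmd))
```

The resulting list could be passed to `subprocess.run` once the actual executable name and path are known.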

Output of training C-SVM
optimization finished, #iter = 219
nu = : nu-SVM is a somewhat equivalent form of C-SVM in which C is replaced by nu.
obj = : optimal objective value of the dual problem.
rho = : bias term of the decision function.
nSV = 132, nBSV = 107: number of support vectors and number of bounded support vectors.
Total nSV = 132
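When many training runs are compared, it can help to pull these quantities out of the printed output automatically. The following is a minimal sketch that assumes the `name = value` layout shown above. The function name `parse_training_output` is hypothetical, and the values for nu, obj and rho in the sample are invented for illustration; only the #iter, nSV and nBSV figures come from the slide.

```python
import re

# Matches "name = number" for the quantities printed after training.
# "Total nSV" is listed first so it is not mistaken for a plain "nSV".
PAIR = re.compile(
    r"(?<!\w)(Total nSV|#iter|nu|obj|rho|nSV|nBSV) = "
    r"(-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)"
)


def parse_training_output(text):
    """Return a dict mapping each reported quantity to its numeric value."""
    return {name: float(value) for name, value in PAIR.findall(text)}


# Sample output; the nu, obj and rho values are hypothetical.
sample_output = """optimization finished, #iter = 219
nu = 0.5
obj = -100.25, rho = 0.75
nSV = 132, nBSV = 107
Total nSV = 132
"""
stats = parse_training_output(sample_output)
print(stats)
```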

Model file
svm_type c_svc
kernel_type rbf
gamma 0.0769231
nr_class 2: number of classes. For regression and one-class models, this number is 2.
total_sv 132
rho
label 1 -1
nr_sv 64 68: number of support vectors for each class.
SV
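The header of the model file is a list of `key value ...` lines that ends at the line `SV`, after which the support vectors follow. A minimal sketch of reading that header is shown below. The function name `parse_model_header` is hypothetical, and the rho value in the sample is invented because it is missing in the slide.

```python
def parse_model_header(lines):
    """Read 'key value ...' lines until the 'SV' marker; return {key: [values]}."""
    header = {}
    for line in lines:
        line = line.strip()
        if line == "SV":
            break            # support vectors start after this marker
        if not line:
            continue
        key, *values = line.split()
        header[key] = values
    return header


# Sample header; the rho value is hypothetical.
model_text = """svm_type c_svc
kernel_type rbf
gamma 0.0769231
nr_class 2
total_sv 132
rho 0.5
label 1 -1
nr_sv 64 68
SV
"""
header = parse_model_header(model_text.splitlines())
per_class = [int(n) for n in header["nr_sv"]]
print(header["svm_type"], per_class, sum(per_class))
```

A quick consistency check is that the per-class counts in `nr_sv` add up to `total_sv`, which holds for the figures on the slide (64 + 68 = 132).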

Two tools for Model Selection
easy.py: does everything automatically, from data scaling to parameter selection.
grid.py: uses grid search to find the best model parameters.
Output files of grid.py:
-out: the search log, listing each parameter setting tried and the accuracy obtained with it.
-png: contour plot of the search.
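The grid search idea can be sketched in a few lines: try every (C, gamma) pair on an exponential grid, record the accuracy of each, and keep the best. This is an illustration of the technique, not the code of grid.py. The function names, the exponent ranges and the `toy_accuracy` stand-in (which replaces a real cross-validation run) are all assumptions.

```python
import math


def grid_search(cv_accuracy, log2c_values, log2g_values):
    """Evaluate cv_accuracy(C, gamma) on a log2 grid; return (best, full log)."""
    best = None
    log = []
    for log2c in log2c_values:
        for log2g in log2g_values:
            acc = cv_accuracy(2.0 ** log2c, 2.0 ** log2g)
            log.append((log2c, log2g, acc))   # comparable to the -out file
            if best is None or acc > best[2]:
                best = (log2c, log2g, acc)
    return best, log


def toy_accuracy(c, gamma):
    """Hypothetical stand-in for cross-validation, peaking at C=2^3, gamma=2^-5."""
    return 100.0 - (math.log2(c) - 3) ** 2 - (math.log2(gamma) + 5) ** 2


# Assumed exponent ranges: log2(C) from -5 to 15, log2(gamma) from 3 to -15.
best, log = grid_search(toy_accuracy, range(-5, 16, 2), range(3, -16, -2))
print(best, len(log))
```

In practice `cv_accuracy` would train and evaluate a model with cross-validation for each pair, which is why the search is expensive and why the log and contour plot are useful for narrowing the grid.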

Proposed procedure
Transform the data to the Libsvm format.
Conduct simple scaling on the data.
Consider the RBF kernel.
Use cross-validation to find the best model parameters.
Use the best parameters to train on the whole training set.
Test.
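The cross-validation step in the procedure above can be sketched as follows: split the training set into k folds, train on k-1 of them, validate on the remaining one, and average over all folds. The helper names and the majority-class "model" used in the usage example are hypothetical stand-ins for a real SVM trainer and predictor.

```python
def k_fold_indices(n_samples, k):
    """Yield (training indices, validation indices) for each of the k folds."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i in range(k):
        training = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield training, folds[i]


def cross_validation_accuracy(samples, labels, k, train_fn, predict_fn):
    """Return the percentage of samples predicted correctly when held out."""
    correct = 0
    for train_idx, val_idx in k_fold_indices(len(samples), k):
        model = train_fn([samples[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        correct += sum(predict_fn(model, samples[i]) == labels[i]
                       for i in val_idx)
    return 100.0 * correct / len(samples)


# Hypothetical stand-in model: always predict the majority training label.
def train_majority(samples, labels):
    return max(set(labels), key=labels.count)


def predict_majority(model, sample):
    return model


samples = list(range(10))
labels = [1, 1, 1, 1, 1, 1, 1, 1, -1, -1]
accuracy = cross_validation_accuracy(samples, labels, 5,
                                     train_majority, predict_majority)
print(accuracy)
```

Every sample is used for validation exactly once, so the resulting accuracy estimates how the chosen parameters generalize without touching the test set.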

Experiments
Original sets with default parameters: Accuracy = 9.7561%
Scaled sets with default parameters: Accuracy = %
Scaled sets with parameter selection: Accuracy = 95.123%
Using an automatic script: Accuracy = 95.122%

Remark
Python 2.3 is recommended.
Recommended Gnuplot version: (missing). Version (missing) has a bug.

References
A practical guide to support vector machines classification
LIBSVM: a Library for Support Vector Machines
FAQ and Readme in Libsvm-2.6
