1
Introduction to Using Libsvm-2.6
2
Libsvm-2.6 features
Supports multi-class classification
Different SVM formulations
Cross-validation for model selection
Probability estimates
Weighted SVM for unbalanced data
Both C++ and Java sources
Version 2.8 released on April Fools' Day, 2005
3
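Libsvm-2.6 program structure
Kernel class
Solver class: solves the quadratic programming problem using generalized SMO and the SVMLight algorithm
Uses one-against-one to handle multi-class classification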
4
Format of training and testing data file
<label> <index1>:<value1> <index2>:<value2> ...
+1 1: :1 3:1 4: : :-1 7:1
-1 1: :-1 3: : :1 6:-1 7:1
+1 1: :1 3:-1 4: : :-1 7:-1
-1 1: :1 3:1 4: : :-1 7:-1
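As a minimal sketch of how one record in this format can be read, the Python 3 function below splits a line into its label and a sparse index-to-value mapping. The function name `parse_libsvm_line` and the sample record are made up for illustration; they are not part of Libsvm.

```python
def parse_libsvm_line(line):
    """Parse one '<label> <index>:<value> ...' record into (label, {index: value})."""
    tokens = line.split()
    label = float(tokens[0])
    features = {}
    for token in tokens[1:]:
        index, value = token.split(":", 1)
        features[int(index)] = float(value)
    return label, features


# Hypothetical record, made up for illustration (not taken from the slides).
label, features = parse_libsvm_line("+1 1:0.5 3:1 7:-1")
print(label, features)
```

Indices that do not appear in a record (2, 4, 5 and 6 here) are simply absent from the mapping, which is what makes the format sparse.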
5
Data scaling
Scaling keeps attributes with larger numeric ranges from dominating those with smaller ranges. Usually each attribute is scaled to [0,1] or [-1,+1].
svmscale -l -1 -u 1 -s range train.1 > train.1.scale
svmscale -r range test.1 > test.1.scale
The second command reuses the ranges saved from the training set, so the test set is scaled in exactly the same way.
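The idea behind saving the ranges and reusing them on the test set can be sketched in Python 3 as follows. This is an illustration of linear min-max scaling, not the svmscale source; the helper names `fit_ranges` and `scale_rows` and the toy data are assumptions, and rows are assumed to list every attribute explicitly.

```python
def fit_ranges(rows):
    """Record the (min, max) of every attribute index seen in the training rows."""
    ranges = {}
    for row in rows:
        for index, value in row.items():
            lo, hi = ranges.get(index, (value, value))
            ranges[index] = (min(lo, value), max(hi, value))
    return ranges


def scale_rows(rows, ranges, lower=-1.0, upper=1.0):
    """Map each attribute linearly so its training range becomes [lower, upper]."""
    scaled = []
    for row in rows:
        out = {}
        for index, value in row.items():
            lo, hi = ranges.get(index, (value, value))
            if hi == lo:
                continue  # constant or unseen attribute: nothing to scale, skip it
            out[index] = lower + (upper - lower) * (value - lo) / (hi - lo)
        scaled.append(out)
    return scaled


# Toy data, made up for illustration.
train = [{1: 0.0, 2: 10.0}, {1: 4.0, 2: 30.0}]
test = [{1: 2.0, 2: 40.0}]

ranges = fit_ranges(train)               # plays the role of the saved "range" file
train_scaled = scale_rows(train, ranges)
test_scaled = scale_rows(test, ranges)   # same ranges, so same mapping
print(train_scaled, test_scaled)
```

Because the test set is mapped with the training ranges, a test value can land outside [-1,+1], as attribute 2 does here. That is expected and is preferable to scaling the two sets independently.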
6
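Svmtrain
One-class: a hyperplane is placed so that it separates the dataset from the origin with maximal margin. The regularization parameter nu, in (0,1), is user-defined. It is an upper bound on the fraction of the training data left outside the description (treated as outliers), so at least a fraction 1 - nu of the data is accepted by the description.
nu-SVR: nu regression. It introduces a parameter nu that lets epsilon be computed automatically. Let l be the total number of samples. If q is the number of error samples, then nu >= q/l, so nu is an upper bound on the fraction of error samples. If p is the number of support vectors, then nu <= p/l, so nu is a lower bound on the fraction of support vectors. First choose the parameters nu and C, then solve the optimization problem.
Shrinking: whether to use shrinking during optimization. For bounded support vectors (BSVs, the SVs with ai = C), ai does not change during the iterations. If these points are identified and fixed at C, the size of the QP can be reduced.
Probability estimate: whether to train the SVC or SVR model so that it produces probability outputs.
-wi: weighting parameter for unbalanced samples (class i is weighted when the penalty C is applied).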
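The two nu-SVR bounds stated above, q/l <= nu <= p/l, can be checked with a few lines of Python 3. This is only a sketch of the inequality; the function name `nu_bounds_hold` and the counts in the example are hypothetical and do not come from a real training run.

```python
def nu_bounds_hold(nu, n_errors, n_support_vectors, n_samples):
    """Return True if q/l <= nu <= p/l, with q errors, p support vectors, l samples."""
    return n_errors / n_samples <= nu <= n_support_vectors / n_samples


# Hypothetical counts: l = 100 samples, q = 40 errors, p = 60 support vectors.
print(nu_bounds_hold(0.5, 40, 60, 100))
# With q = 60 errors the lower bound q/l = 0.6 exceeds nu = 0.5.
print(nu_bounds_hold(0.5, 60, 60, 100))
```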
7
Output of training C-SVM
optimization finished, #iter = 219
nu = : nu-SVM is a somewhat equivalent form of C-SVM in which C is replaced by nu.
obj = : optimal objective value of the dual problem.
rho = : bias term of the decision function.
nSV = 132, nBSV = 107: number of support vectors and number of bounded support vectors.
Total nSV = 132
8
Model file
svm_type c_svc
kernel_type rbf
gamma 0.0769231
nr_class 2: number of classes. For regression and one-class models, this number is 2.
total_sv 132
rho
label 1 -1
nr_sv 64 68: number of support vectors for each class.
SV
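The header layout above (one keyword per line followed by its values, ending at the SV marker) is easy to read back. The Python 3 sketch below assumes that layout; `parse_model_header` is a hypothetical helper, not a Libsvm function, and the sample text leaves out the rho line because its value is not given on the slide.

```python
def parse_model_header(text):
    """Collect 'key value...' header lines up to the 'SV' marker into a dict."""
    header = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "SV":
            break  # support vectors follow; the header is complete
        header[parts[0]] = parts[1:]
    return header


# Header values copied from the slide; the rho line is omitted (value unknown).
sample = """svm_type c_svc
kernel_type rbf
gamma 0.0769231
nr_class 2
total_sv 132
label 1 -1
nr_sv 64 68
SV
"""
header = parse_model_header(sample)
nr_sv = [int(n) for n in header["nr_sv"]]
# The per-class counts should add up to total_sv.
print(sum(nr_sv) == int(header["total_sv"][0]))
```

The final check ties the fields together: the per-class support vector counts in nr_sv (64 and 68) sum to total_sv (132).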
9
Two tools for Model Selection
easy.py: does everything automatically, from data scaling to parameter selection
grid.py: uses grid search to find the best model parameters
Output files of grid.py
-out: the search log, listing each parameter setting tried and the accuracy it achieved
-png: contour plot of the search
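The grid search idea can be sketched independently of grid.py: try every (C, gamma) pair on a log2 grid and keep the pair with the best cross-validation accuracy. In the Python 3 sketch below, `grid_search` is a hypothetical helper, the grid ranges are an assumption, and `toy_cv_accuracy` is a made-up stand-in for a real cross-validation run, so the numbers it returns mean nothing.

```python
def grid_search(cv_accuracy, log2c_values, log2g_values):
    """Try every (C, gamma) pair on a log2 grid; return (accuracy, C, gamma) of the best."""
    best = None
    for log2c in log2c_values:
        for log2g in log2g_values:
            c, g = 2.0 ** log2c, 2.0 ** log2g
            acc = cv_accuracy(c, g)
            if best is None or acc > best[0]:
                best = (acc, c, g)
    return best


def toy_cv_accuracy(c, g):
    """Hypothetical stand-in for cross-validation; peaks at C = 8, gamma = 0.5."""
    return 100.0 - abs(c - 8.0) - abs(g - 0.5)


# Assumed grid: C = 2^-5, 2^-3, ..., 2^15 and gamma = 2^-15, 2^-13, ..., 2^3.
best = grid_search(toy_cv_accuracy, range(-5, 16, 2), range(-15, 4, 2))
print(best)
```

In practice `cv_accuracy` would train with cross-validation for the given C and gamma and return the reported accuracy. Each call is independent, so the loop is simple to parallelize.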
10
Proposed procedure
Transform the data to the Libsvm format.
Conduct simple scaling on the data.
Consider the RBF kernel.
Use cross-validation to find the best model parameters.
Use the best parameters to train on the whole training set.
Test.
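The scale, train and test steps of this procedure can be written out as the command lines one would run once C and gamma have been chosen. The Python 3 sketch below only builds the command strings and does not execute anything. The executable names follow the ones used on these slides, while the helper name `build_commands`, the output file names and the parameter values in the example are assumptions.

```python
def build_commands(train, test, c, g):
    """Return the shell commands for scale -> train -> predict as strings (not executed)."""
    return [
        # Scale the training set to [-1, 1] and save the ranges.
        f"svmscale -l -1 -u 1 -s range {train} > {train}.scale",
        # Scale the test set with the saved ranges.
        f"svmscale -r range {test} > {test}.scale",
        # Train on the whole scaled training set with the chosen C and gamma.
        f"svmtrain -c {c} -g {g} {train}.scale {train}.model",
        # Predict on the scaled test set.
        f"svmpredict {test}.scale {train}.model {test}.predict",
    ]


# Placeholder parameter values, chosen only for illustration.
commands = build_commands("train.1", "test.1", 2, 2)
for command in commands:
    print(command)
```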
11
Experiments
Original sets with default parameters: Accuracy = 9.7561%
Scaled sets with default parameters: Accuracy = %
Scaled sets with parameter selection: Accuracy = 95.123%
Using an automatic script: Accuracy = 95.122%
12
Remark
Recommend Python 2.3
Recommend Gnuplot version
Version has a bug.
13
References
A practical guide to support vector machines classification
LIBSVM: a Library for Support Vector Machines
FAQ and README in Libsvm-2.6