An Improved Algorithm for Decision-Tree-Based SVM. Sindhu Kuchipudi. Instructor: Dr. Dongchul Kim
OUTLINE: Introduction. Decision-tree-based SVM. The class separability measure in feature space. The improved algorithm for decision-tree-based SVM. Experiments and results. Conclusion.
INTRODUCTION: The decision-tree-based support vector machine, which combines support vector machines with a decision tree, is an effective way of solving multi-class problems. Support vector machines (SVMs) are classifiers that were originally designed for binary classification. Distance measures such as the Euclidean distance and the Mahalanobis distance are often used as separability measures.
Decision-tree-based SVM: For multi-class problems, a decision-tree-based SVM resolves the problem of unclassifiable regions and has higher generalization ability than the conventional methods. Different tree structures correspond to different divisions of the feature space, and the classification performance of the classifier is closely related to the tree structure.
Example (figure): (a) an example of the division of the feature space; (b) its expression as a decision tree.
THE CLASS SEPARABILITY MEASURE IN FEATURE SPACE: The Euclidean distance is commonly used as the separability measure. However, the Euclidean distance between the centres of two classes does not always reflect the separability between the classes correctly.
Example: comparison of separability among classes with equal centre distances. The Euclidean distances among the centres of the three classes are the same, but class k can obviously be classified more easily than the other classes. Therefore the distribution of the classes is also an important factor in the between-class separability measure.
For a problem with k classes, suppose $X_i$, $i = 1, \dots, k$, are the sets of training data in class i. Let $sm_{ij}$ be the separability measure between class i and class j: $sm_{ij} = \frac{d_{ij}}{\sigma_i + \sigma_j}$, where $d_{ij} = \|c_i - c_j\|$ is the Euclidean distance between the centres of class i and class j, $i, j = 1, \dots, k$, and $c_i$ is the centre of class i computed from the training samples.
$n_i$ is the number of samples in class i, and $\sigma_i$ is the class variance, an index of the class distribution. If $sm_{ij} \ge 1$, there is no overlap between class i and class j; if $sm_{ij} < 1$, the two classes overlap. From the formula for $sm_{ij}$, the larger $sm_{ij}$ is, the more easily class i and class j are separated.
Let $sm_i$ be the separability measure of class i, defined as the minimum of the separability measures between class i and the other classes: $sm_i = \min_{j \ne i} sm_{ij}$. It indicates how separable class i is from the others. The most easily separated class is the class with the maximum separability measure: $i_0 = \arg\max_i sm_i$.
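The selection rule above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes that sm_ij is the centre distance divided by the sum of the two class spreads, and that the spread of a class is the mean distance of its samples to the class centre (the slides do not preserve the exact normalization). The function names and the toy data are hypothetical.

```python
import math

def center(samples):
    """Mean vector of a list of equal-length samples."""
    n = len(samples)
    return [sum(col) / n for col in zip(*samples)]

def euclidean(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def spread(samples):
    """ASSUMPTION: class spread = mean distance of samples to their centre."""
    c = center(samples)
    return sum(euclidean(x, c) for x in samples) / len(samples)

def separability(class_a, class_b):
    """ASSUMED form: sm_ij = d_ij / (sigma_i + sigma_j)."""
    d = euclidean(center(class_a), center(class_b))
    return d / (spread(class_a) + spread(class_b))

def most_separable(classes):
    """Return (label, sm_i) for the class maximizing sm_i = min over j != i of sm_ij."""
    sm = {
        i: min(separability(classes[i], classes[j]) for j in classes if j != i)
        for i in classes
    }
    best = max(sm, key=sm.get)
    return best, sm[best]

# Toy data: "a" and "b" are close to each other, "c" is compact and far away.
classes = {
    "a": [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)],
    "b": [(3.0, 0.0), (4.0, 0.0), (3.0, 1.0), (4.0, 1.0)],
    "c": [(10.0, 10.0), (10.2, 10.0), (10.0, 10.2), (10.2, 10.2)],
}
label, score = most_separable(classes)
```

In this toy set the compact, distant class is selected, which matches the intuition that both centre distance and class distribution matter.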
The separability measure $sm_{ij}$ above is defined in input space. To get better separability, the input space is mapped into a high-dimensional feature space. Suppose $\Phi$ is the mapping, the feature space is H, and the kernel function is $k(\cdot,\cdot)$. For input samples $x_1$ and $x_2$, $\Phi$ maps them into feature space H, and the Euclidean distance between $x_1$ and $x_2$ in feature space H is: $d^H(x_1, x_2) = \|\Phi(x_1) - \Phi(x_2)\| = \sqrt{k(x_1, x_1) - 2k(x_1, x_2) + k(x_2, x_2)}$.
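As a small sketch of this kernel-induced distance, the following Python function expands the squared norm using kernel evaluations only. The Gaussian kernel is written here as exp(-||x - y||^2 / (2 sigma^2)); that parametrization of the kernel size is an assumption, and the function names are hypothetical.

```python
import math

def linear_kernel(x, y):
    """Plain dot product; with it, the feature space equals the input space."""
    return sum(p * q for p, q in zip(x, y))

def gaussian_kernel(x, y, sigma=1.0):
    """ASSUMED parametrization: exp(-||x - y||^2 / (2 * sigma^2))."""
    sq = sum((p - q) ** 2 for p, q in zip(x, y))
    return math.exp(-sq / (2.0 * sigma ** 2))

def feature_distance(x1, x2, kernel=gaussian_kernel):
    """Distance in feature space: sqrt(k(x1,x1) - 2 k(x1,x2) + k(x2,x2))."""
    sq = kernel(x1, x1) - 2.0 * kernel(x1, x2) + kernel(x2, x2)
    return math.sqrt(max(sq, 0.0))  # clamp tiny negative rounding errors

d_linear = feature_distance((0.0, 0.0), (3.0, 4.0), kernel=linear_kernel)
d_gauss = feature_distance((0.0, 0.0), (3.0, 4.0))
```

With the linear kernel the result reduces to the ordinary Euclidean distance, which is a convenient sanity check; with the Gaussian kernel the distance is bounded by sqrt(2) because k(x, x) = 1.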
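In the feature space H, suppose $m_\Phi$ is the class centre, $m_\Phi = \frac{1}{n}\sum_{i=1}^{n}\Phi(x_i)$, where n is the number of samples within the class. Suppose $\{x_1, x_2, \dots, x_{n_1}\}$ and $\{x'_1, x'_2, \dots, x'_{n_2}\}$ are the training samples of two classes, $\Phi$ maps them into feature space H, and $m_\Phi$ and $m'_\Phi$ are the class centres in H. Let $d^H(m_\Phi, m'_\Phi)$ be the distance between $m_\Phi$ and $m'_\Phi$ in feature space; then $d^H(m_\Phi, m'_\Phi) = \sqrt{\frac{1}{n_1^2}\sum_{i=1}^{n_1}\sum_{j=1}^{n_1} k(x_i, x_j) - \frac{2}{n_1 n_2}\sum_{i=1}^{n_1}\sum_{j=1}^{n_2} k(x_i, x'_j) + \frac{1}{n_2^2}\sum_{i=1}^{n_2}\sum_{j=1}^{n_2} k(x'_i, x'_j)}$.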
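The distance between two class centres in feature space can be computed without ever forming the centres explicitly, because expanding the squared norm leaves only averages of kernel values. A minimal sketch follows; the helper names are hypothetical, and the linear kernel is used only so the result can be checked against the input-space centre distance.

```python
import math

def linear_kernel(x, y):
    """Dot product, used here only as an easy-to-verify kernel."""
    return sum(p * q for p, q in zip(x, y))

def gram_mean(a, b, kernel):
    """Average of k(x, y) over all x in a and y in b."""
    return sum(kernel(x, y) for x in a for y in b) / (len(a) * len(b))

def center_distance(class_a, class_b, kernel):
    """Distance between the two class centres in feature space."""
    sq = (gram_mean(class_a, class_a, kernel)
          - 2.0 * gram_mean(class_a, class_b, kernel)
          + gram_mean(class_b, class_b, kernel))
    return math.sqrt(max(sq, 0.0))  # clamp tiny negative rounding errors

class_a = [(0.0, 0.0), (2.0, 0.0)]   # input-space centre (1, 0)
class_b = [(4.0, 4.0), (6.0, 4.0)]   # input-space centre (5, 4)
d = center_distance(class_a, class_b, linear_kernel)
```

The cost is quadratic in the class sizes, since every pair of samples contributes one kernel evaluation.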
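For the training samples $\{x_1, x_2, \dots, x_n\}$ of a given class, let $d^H(x, m_\Phi)$ be the distance between training sample x and the class centre $m_\Phi$ in feature space H; then $d^H(x, m_\Phi) = \sqrt{k(x, x) - \frac{2}{n}\sum_{i=1}^{n} k(x, x_i) + \frac{1}{n^2}\sum_{i=1}^{n}\sum_{j=1}^{n} k(x_i, x_j)}$. Therefore, the separability measure between class i and class j in feature space H can be defined as $sm^H_{ij} = \frac{d^H_{ij}}{\sigma^H_i + \sigma^H_j}$, where $d^H_{ij}$ is the distance between the centres of class i and class j in H and $\sigma^H_i$ is the class variance in feature space, computed from the distances $d^H(x, m_\Phi)$. The newly defined separability measure is used in the formation of the decision tree.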
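Putting the pieces together, a feature-space separability measure might be sketched as below. Two points are assumptions rather than facts from the slides: the class spread is taken as the plain average of the sample-to-centre distances, and the measure is the centre distance divided by the sum of the two spreads. All function names are hypothetical.

```python
import math

def linear_kernel(x, y):
    """Dot product; makes the feature-space result equal the input-space one."""
    return sum(p * q for p, q in zip(x, y))

def gram_mean(a, b, kernel):
    """Average of k(x, y) over all x in a and y in b."""
    return sum(kernel(x, y) for x in a for y in b) / (len(a) * len(b))

def spread_h(samples, kernel):
    """ASSUMPTION: spread = mean distance of samples to the class centre in H."""
    n = len(samples)
    g = gram_mean(samples, samples, kernel)  # <m, m> in feature space
    total = 0.0
    for x in samples:
        cross = sum(kernel(x, s) for s in samples) / n  # <Phi(x), m>
        total += math.sqrt(max(kernel(x, x) - 2.0 * cross + g, 0.0))
    return total / n

def separability_h(a, b, kernel):
    """ASSUMED form: centre distance in H over the sum of the two spreads.

    Raises ZeroDivisionError if both classes have zero spread.
    """
    d_sq = (gram_mean(a, a, kernel)
            - 2.0 * gram_mean(a, b, kernel)
            + gram_mean(b, b, kernel))
    return math.sqrt(max(d_sq, 0.0)) / (spread_h(a, kernel) + spread_h(b, kernel))

a = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
b = [(3.0, 0.0), (4.0, 0.0), (3.0, 1.0), (4.0, 1.0)]
sm_ab = separability_h(a, b, linear_kernel)
```

Any positive semi-definite kernel function with the same two-argument signature can be passed in place of `linear_kernel`, for example a Gaussian kernel with a fixed kernel size.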
The Improved Algorithm for Decision-Tree-Based SVM: Suppose one class is separated from the remaining classes at the hyperplane corresponding to each node of the decision tree. For a problem with k classes, the number of hyperplanes to be calculated is k - 1; that is, the decision tree has k - 1 nodes, not counting the leaf nodes.
[Algorithm: improved decision-tree-based SVM] Suppose $X_i$, $i = 1, \dots, k$, are the sets of training data in class i; together they constitute the set of active training data X. Let t, the number of active classes, start at k.
Step 1: Calculate the separability measures in feature space $sm^H_{ij}$, $i, j = 1, \dots, k$; they constitute a matrix of separability measures.
Step 2: Select the most easily separated class $i_0 = \arg\max_i sm^H_i$, where $sm^H_i$ is the separability measure of class i, computed over the active classes.
Step 3: Using $X_{i_0}$ and $X - X_{i_0}$ as the training data set, calculate a hyperplane $f_{i_0,j_0}$.
Step 4: Update the set of active training data: $X \leftarrow X - X_{i_0}$, $t \leftarrow t - 1$.
Step 5: If t > 1, go to Step 2; otherwise end.
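The class-selection loop of the algorithm can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the separability matrix is taken as precomputed (Step 1), SVM training at each node (Step 3) is left out, ties are broken by label order, and the function name and the numeric values are hypothetical.

```python
def separation_order(sm):
    """Return class labels in the order they are split off the tree.

    sm maps each label i to a dict {j: sm_ij} of pairwise separability
    values (Step 1). The last label in the result is the final leaf.
    """
    active = set(sm)
    order = []
    while len(active) > 1:                      # Step 5: stop at one class
        # Step 2: sm_i is the minimum over the other active classes;
        # sorted() makes tie-breaking deterministic (an arbitrary choice).
        best = max(
            sorted(active),
            key=lambda i: min(sm[i][j] for j in active if j != i),
        )
        # Step 3 would train one binary SVM here: class `best` against
        # all samples of the remaining active classes.
        order.append(best)
        active.remove(best)                     # Step 4: shrink active set
    order.extend(active)
    return order

# Hypothetical separability matrix for three classes.
sm = {
    "a": {"b": 2.12, "c": 16.0},
    "b": {"a": 2.12, "c": 13.73},
    "c": {"a": 16.0, "b": 13.73},
}
order = separation_order(sm)
```

For k classes the loop runs k - 1 times, one iteration per internal node, which matches the number of hyperplanes stated for the tree.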
EXPERIMENTS AND RESULTS: To evaluate the effectiveness and the performance improvement of the improved algorithm for decision-tree-based SVM, experiments are carried out on two data sets: the spiral data and the Wine data set.
Experiment for spiral data: Recognizing two- or three-spiral data is a difficult task for many pattern recognition approaches, since spiral data is highly nonlinear. A synthetic 2D three-spiral data set is used in our classification experiments; each spiral line belongs to a different class. The synthetic 2D spiral can be expressed as a parametric equation, where k and α are constants and θ is the variable, in radians.
There are 720 data points altogether, 240 for each spiral. (Figure: three-spiral data in three cycles.) The SVMs are trained under the same conditions: C = 1000, and Gaussian kernel functions with the same kernel size σ are used.
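A data set of this shape could be generated along the following lines. The slide's own parametric equation was not preserved, so the radial form rho = k * theta + alpha, the rotation of each arm by 2*pi/3, and the default constants are all assumptions chosen for illustration; only the counts (three arms, 240 points each, three cycles) come from the text.

```python
import math

def three_spirals(points_per_arm=240, cycles=3, k=1.0, alpha=0.5):
    """Return a list of (x, y, label) tuples for three interleaved spirals.

    ASSUMPTIONS: rho = k * theta + alpha, and arm `label` is the same
    curve rotated by label * 2*pi/3. Neither is taken from the slides.
    """
    data = []
    for label in range(3):
        offset = label * 2.0 * math.pi / 3.0
        for n in range(points_per_arm):
            # theta sweeps `cycles` full turns along each arm
            theta = cycles * 2.0 * math.pi * n / (points_per_arm - 1)
            rho = k * theta + alpha
            data.append((rho * math.cos(theta + offset),
                         rho * math.sin(theta + offset),
                         label))
    return data

data = three_spirals()
```

Each arm carries its own class label, so the result can be fed directly to a three-class classifier.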
Classification results for the synthetic three-spiral data set show the performance improvement of the improved decision-tree-based SVM.
Experiments for Wine data set: The Wine data set from the UCI repository consists of 178 samples in 3 classes: 59 in class 1, 71 in class 2, and 48 in class 3. Each sample has 13 attributes. The SVMs are trained under the same conditions: Gaussian kernel functions with the same kernel size σ are used, and the kernel size σ takes the values 5, 40, and 90.
Classification results for this data set also show the performance improvement of the improved algorithm for decision-tree-based SVM.
CONCLUSION: In this paper we discussed the decision-tree-based SVM and a separability measure between classes based on the distribution of the classes. To improve the generalization ability of the SVM decision tree, a novel separability measure is given, based on the distribution of the training samples in the feature space. Experiments on different data sets show the performance improvement of the improved algorithm for decision-tree-based SVM.
THANK YOU