Support Vector Machine
COMP9417 Machine Learning and Data Mining: Support Vector Machine (June 3, 2009)
Linear Classifier
Two-class classification can be viewed as segmenting the feature space.
A linear decision boundary is a hyperplane.
Any regression technique can be used for classification, as the sketch below illustrates.
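A minimal sketch of that idea (numpy assumed; the data and the 0.5 threshold are illustrative): fit ordinary least-squares regression to 0/1 labels and threshold the output.

import numpy as np

# Least-squares fit of a linear function to 0/1 labels, thresholded at 0.5
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([0, 0, 0, 1])                      # the AND function
A = np.hstack([X, np.ones((4, 1))])             # augment with a bias column
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
w, b = coef[:2], coef[2]
print((X @ w + b >= 0.5).astype(int))           # [0 0 0 1]: AND recovered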
Issues of Linear Classifiers
There are many possible linear decision boundaries; which one should be used?
Issues of Linear Classifiers (ctd)
Sometimes it is not possible to define a separating hyperplane. (Filled circle: output one; hollow circle: output zero.) AND: the 1's can be divided from the 0's with a single line. XOR: this is not possible, as the sketch below illustrates.
[Figure: the four points (0,0), (0,1), (1,0), (1,1), labelled once for AND and once for XOR]
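Continuing the same illustrative least-squares classifier, a minimal sketch of the XOR failure: the best linear fit is constant at 0.5, so no threshold can reproduce the XOR labels.

import numpy as np

X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y_xor = np.array([0, 1, 1, 0])                  # the XOR function
A = np.hstack([X, np.ones((4, 1))])
coef, *_ = np.linalg.lstsq(A, y_xor, rcond=None)
w, b = coef[:2], coef[2]
print(X @ w + b)                                # ~[0.5 0.5 0.5 0.5]: the fit is constant
pred = (X @ w + b >= 0.5).astype(int)
print(np.array_equal(pred, y_xor))              # False: no line realises XOR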
Support Vector Machine
Support vector machines ("machine" = algorithm) learn linear classifiers.
They can avoid overfitting: they learn a particular form of decision boundary called the maximum margin hyperplane.
They are fast even when mapping to nonlinear spaces: a mathematical trick avoids the actual creation of new "pseudo-attributes" in the transformed instance space, i.e. the nonlinear space is created implicitly.
Training A Support Vector Machine
Learning problem: fit the maximum margin hyperplane, i.e. a kind of linear model.
For a linearly separable two-class data set, the maximum margin hyperplane is the classification surface which correctly classifies all examples in the data set and has the greatest separation between the classes.
The maximum margin hyperplane is orthogonal to the shortest line connecting the convex hulls of the two classes, and intersects it halfway. The "convex hull" of the instances in each class is the tightest enclosing convex polygon; for a linearly separable two-class data set, the convex hulls do not overlap.
The more "separated" the classes, the larger the margin, and the better the generalisation.
Training A Support Vector Machine
[Figure: the maximum margin hyperplane, with margin ρ between the classes and distance r from an example to the decision boundary]
Margin and Support Vectors
The distance between an example x and the decision boundary (hyperplane) w^T x + b = 0 is r = |w^T x + b| / ||w||.
The instances closest to the maximum margin hyperplane are called support vectors.
The margin ρ of the separator is the width between support vectors on opposite sides: under the canonical scaling y_i (w^T x_i + b) ≥ 1, it is ρ = 2 / ||w||.
Only the support vectors matter; the other training examples are ignorable, i.e. they can be deleted without changing the position and orientation of the hyperplane.
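A minimal sketch of these quantities (scikit-learn and numpy assumed; the four data points are illustrative): fitting a linear SVM with a very large C approximates the hard margin, after which the support vectors and ρ = 2/||w|| can be read off.

import numpy as np
from sklearn.svm import SVC

X = np.array([[0., 0.], [0., 1.], [2., 2.], [2., 3.]])
y = np.array([0, 0, 1, 1])

clf = SVC(kernel='linear', C=1e6).fit(X, y)     # very large C ~ hard margin
w = clf.coef_[0]
print(clf.support_vectors_)                     # only these points fix the hyperplane
print(2.0 / np.linalg.norm(w))                  # the margin rho = 2 / ||w||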
Solving the Optimisation Problem
Find w and b such that the margin ρ = 2 / ||w|| is maximised, subject to y_i (w^T x_i + b) ≥ 1 for all training examples (x_i, y_i).
This is equivalent to minimising (1/2) ||w||^2 under the same constraints.
Determining the parameters is a constrained quadratic optimisation problem. This can be solved by standard algorithms, or by special-purpose algorithms such as Platt's sequential minimal optimisation (SMO in Weka).
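For reference, a sketch of the standard dual of this problem (derived with one Lagrange multiplier α_i ≥ 0 per constraint; this step is implicit between this slide and the next):

maximise over α:  Σ_i α_i − (1/2) Σ_i Σ_j α_i α_j y_i y_j x_i^T x_j
subject to:  α_i ≥ 0 for all i, and Σ_i α_i y_i = 0.

At the solution, w = Σ_i α_i y_i x_i, with α_i > 0 only for the support vectors, and the training data enter only through the inner products x_i^T x_j.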
Solving the Optimisation Problem
Then the classifying function is f(x) = sign( Σ_i α_i y_i x_i^T x + b ), where the sum runs over the support vectors x_i.
It relies on an inner product between the test point x and the support vectors x_i.
Solving the optimisation problem likewise involves computing the inner products x_i^T x_j between all pairs of training points.
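A minimal sketch (scikit-learn assumed, same illustrative data as above): the decision value for a test point is recomputed from inner products with the support vectors alone; dual_coef_ holds α_i y_i.

import numpy as np
from sklearn.svm import SVC

X = np.array([[0., 0.], [0., 1.], [2., 2.], [2., 3.]])
y = np.array([0, 0, 1, 1])
clf = SVC(kernel='linear', C=1e6).fit(X, y)

x_test = np.array([1.5, 1.0])
# dual_coef_ stores alpha_i * y_i for each support vector x_i
f = clf.dual_coef_[0] @ (clf.support_vectors_ @ x_test) + clf.intercept_[0]
print(f)
print(clf.decision_function([x_test])[0])       # same value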
Nonlinear SVMs
All of the above assumes linearly separable data. What about data sets that are not linearly separable?
Map the original feature space to some higher-dimensional feature space in which the training set is separable.
Kernel Trick
The linear classifier relies on inner products between vectors: K(x_i, x_j) = x_i^T x_j.
If every data point is mapped into a high-dimensional space via some transformation Φ, the inner product becomes K(x_i, x_j) = Φ(x_i)^T Φ(x_j).
A kernel function is a function that is equivalent to an inner product in some feature space: it implicitly maps data to that space without the need to compute Φ(x).
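A minimal sketch (numpy assumed): in two dimensions the degree-2 kernel K(u, v) = (u^T v)^2 equals the inner product under the explicit map Φ(u) = (u_1^2, √2·u_1·u_2, u_2^2), yet the kernel never constructs Φ.

import numpy as np

def phi(u):
    # Explicit feature map for the degree-2 kernel (u.v)^2 in 2-D
    return np.array([u[0]**2, np.sqrt(2) * u[0] * u[1], u[1]**2])

u, v = np.array([1., 2.]), np.array([3., 4.])
print((u @ v) ** 2)                             # kernel value: 121.0
print(phi(u) @ phi(v))                          # same inner product: 121.0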
Kernel Functions
Mercer's theorem: every positive semi-definite symmetric function is a kernel function.
Examples of kernel functions:
Linear: K(x_i, x_j) = x_i^T x_j
Polynomial of degree p: K(x_i, x_j) = (1 + x_i^T x_j)^p
Radial basis function: K(x_i, x_j) = exp(−||x_i − x_j||^2 / (2σ^2))
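A minimal sketch of the three kernels above (numpy assumed; the sample is random and illustrative), with a numerical Mercer check: the Gram matrix of a valid kernel is symmetric positive semi-definite, so its eigenvalues are non-negative up to rounding.

import numpy as np

def linear(u, v):
    return u @ v

def poly(u, v, p=3):
    return (1.0 + u @ v) ** p

def rbf(u, v, sigma=1.0):
    return np.exp(-np.sum((u - v) ** 2) / (2.0 * sigma ** 2))

# Mercer check on a random sample: smallest Gram-matrix eigenvalue
Xs = np.random.default_rng(0).normal(size=(10, 2))
G = np.array([[rbf(a, b) for b in Xs] for a in Xs])
print(np.linalg.eigvalsh(G).min())              # ~0 or positive, up to rounding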
Variations of SVMs
Soft margin SVMs handle noisy data: slack variables ξ_i allow misclassification of noisy instances, and we minimise (1/2) ||w||^2 + C Σ_i ξ_i subject to y_i (w^T x_i + b) ≥ 1 − ξ_i and ξ_i ≥ 0.
The parameter C controls overfitting by trading off maximising the margin against fitting the training data, as the sketch below illustrates.
Support vector regression handles numeric prediction problems.
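A minimal sketch of the trade-off (scikit-learn assumed; the data set and the three flipped labels are illustrative): as C grows the learned margin 2/||w|| typically shrinks while the training fit is pushed harder.

import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
y[:3] = 1 - y[:3]                               # flip a few labels as "noise"

for C in (0.1, 1000.0):
    clf = SVC(kernel='linear', C=C).fit(X, y)
    margin = 2.0 / np.linalg.norm(clf.coef_[0])
    print(C, margin, clf.score(X, y))           # margin vs. training accuracy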
Applications
SVMs are currently among the best classifiers for a number of tasks. Examples of SVM applications:
Computer vision
Handwritten digit recognition
Bioinformatics
Text classification
SVM for Computer Vision
Face detection and pedestrian detection.
Figure from C. Papageorgiou, M. Oren and T. Poggio, "A general framework for object detection," Proc. Int. Conf. Computer Vision, 1998. © 1998 IEEE.
SVM for Vision – Pedestrian Detection
[Figures: pedestrian detection examples, continued across several slides]
Figures from C. Papageorgiou, M. Oren and T. Poggio, "A general framework for object detection," Proc. Int. Conf. Computer Vision, 1998, © 1998 IEEE; and from C. Papageorgiou and T. Poggio, "A Pattern Classification Approach to Dynamical Object Detection," Proceedings of ICCV, 1999.
References
Tom M. Mitchell, Machine Learning, McGraw-Hill, 1997.
Richard O. Duda, Peter E. Hart, and David G. Stork, Pattern Classification, John Wiley & Sons, 2001.