Kernels in Pattern Recognition
A Langur – Baboon Binary Problem
Representation of Binary Data
Concept of Kernels: an idea proposed by Aizerman. Map the data into a higher-dimensional feature space via a transformation such that the dot product exists (i.e., is not infinite) in the higher dimension and the data becomes linearly separable.
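A minimal sketch (Python, not from the slides) of this idea: a degree-2 polynomial kernel returns exactly the dot product that an explicit feature map φ would give, without ever constructing the higher-dimensional vectors.

```python
import numpy as np

# Explicit degree-2 feature map for 2-D inputs: phi(x) = (x1^2, sqrt(2)*x1*x2, x2^2)
def phi(x):
    return np.array([x[0] ** 2, np.sqrt(2) * x[0] * x[1], x[1] ** 2])

# Polynomial kernel of degree 2: K(x, y) = (x . y)^2
def poly_kernel(x, y):
    return np.dot(x, y) ** 2

x = np.array([1.0, 2.0])
y = np.array([3.0, 4.0])

print(np.dot(phi(x), phi(y)))  # dot product computed in the 3-D feature space
print(poly_kernel(x, y))       # same scalar (up to floating-point error), computed in 2-D
```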
Dot Product: the scalar value $a \cdot b = \lVert a\rVert\,\lVert b\rVert \cos\theta$ signifies the amount of projection of $a$ in the direction of $b$. The scalar value also signifies the degree of similarity between $a$ and $b$. Adapted from ~jenolive/vect6.html.
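A quick numerical illustration of both readings of the dot product, as a projection and as a similarity; the vectors here are arbitrary examples:

```python
import numpy as np

a = np.array([3.0, 4.0])
b = np.array([1.0, 0.0])

dot = np.dot(a, b)
# Projection of a onto the direction of b: (a . b) / |b|
projection = dot / np.linalg.norm(b)
# Cosine similarity: (a . b) / (|a| |b|), the "degree of similarity"
similarity = dot / (np.linalg.norm(a) * np.linalg.norm(b))

print(dot, projection, similarity)  # 3.0 3.0 0.6
```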
A Geometrical Interpretation of the Mapping: map the data from a low dimension to a high dimension so that it becomes linearly separable in the higher dimension. The separating hyperplane is defined by a normal, or weight, vector, as the sketch below illustrates.
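To make the picture concrete, here is a hypothetical one-dimensional example (not from the slides): two interleaved classes on the line become separable by a hyperplane after the mapping $x \mapsto (x, x^2)$.

```python
import numpy as np

# 1-D points: the two classes interleave on the line, so no single
# threshold on x separates them.
x = np.array([-2.0, -1.5, -0.5, 0.0, 0.5, 1.5, 2.0])
y = np.array([+1, +1, -1, -1, -1, +1, +1])   # +1 if |x| > 1

# Map to 2-D: phi(x) = (x, x^2). In this space the hyperplane x2 = 1.5,
# with normal (weight) vector w = (0, 1) and bias b = -1.5, separates them.
phi = np.column_stack([x, x ** 2])
w, b = np.array([0.0, 1.0]), -1.5

predictions = np.sign(phi @ w + b)
print(np.all(predictions == y))  # True: linearly separable after the mapping
```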
Cross Product: the normal, or weight, vector is perpendicular to the hyperplane. The area covered while moving from $a$ to $b$ in the counterclockwise direction moves the vector upwards, like the tightening of a screw; the resulting vector $a \times b$ is perpendicular to the plane in which $a$ and $b$ lie. Adapted from ~jenolive/vect8.html.
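A quick numerical check of this screw-rule picture, using NumPy's cross product:

```python
import numpy as np

a = np.array([1.0, 0.0, 0.0])
b = np.array([0.0, 1.0, 0.0])

n = np.cross(a, b)                 # right-hand ("screw") rule: points upwards out of the a-b plane
print(n)                           # [0. 0. 1.]
print(np.dot(n, a), np.dot(n, b))  # 0.0 0.0: n is perpendicular to both a and b
```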
Importance of the dot product, and why kernel == dot product: classification requires computing the dot product between the normal of the hyperplane and the test point. Often, the normal is expressed as a linear combination of points in the higher dimension. The sign of the dot product signifies on which side of the hyperplane the test point lies; that is the act of classification. Computing dot products in the higher dimension is expensive, and the transformation is not easy to find, so we propose a kernel function whose scalar value is equivalent to the dot product in the higher-dimensional space. A sketch of the resulting decision rule follows.
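A sketch of that decision rule. The kernel here is Gaussian, and the support points, labels, multipliers, and bias are hypothetical placeholders standing in for the output of training:

```python
import numpy as np

def rbf_kernel(x, z, gamma=1.0):
    # Gaussian (RBF) kernel: equivalent to a dot product in a
    # very high-dimensional feature space, never formed explicitly.
    return np.exp(-gamma * np.sum((x - z) ** 2))

# Hypothetical support points, labels, and multipliers; placeholders only,
# in practice these come out of the SVM optimization.
points = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]])
labels = np.array([-1.0, +1.0, +1.0])
alphas = np.array([0.5, 0.3, 0.2])
b = 0.1

def classify(x_test):
    # f(x) = sign( sum_i alpha_i * y_i * K(x_i, x) + b )
    s = sum(a * y * rbf_kernel(xi, x_test) for a, y, xi in zip(alphas, labels, points))
    return np.sign(s + b)

print(classify(np.array([1.0, 0.5])))
```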
A Geometrical Interpretation of the importance of the dot product, and of kernel == dot product.
What does a kernel look like? A planar view from the top.
What does a kernel look like? An isometric view from different side angles.
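The slides do not name the kernel being plotted; assuming a Gaussian bump centered at the origin, its values on a grid can be computed as follows, which is what the top-down and isometric views render:

```python
import numpy as np

# A Gaussian (RBF) kernel centered at the origin, assumed for illustration.
gamma = 0.5
xs = np.linspace(-2, 2, 5)
grid = np.array([[np.exp(-gamma * (x ** 2 + y ** 2)) for x in xs] for y in xs])

# Printed as a table, this is the planar view from the top: a bump that
# peaks at the center and decays radially. Rendering grid as a 3-D surface
# (e.g. with matplotlib's plot_surface) gives the isometric views.
print(np.round(grid, 2))
```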
The End
Vapnik proposes Support Vector Machines
An Apple – Orange Binary Problem (/wiki/Image:Apples.jpg, /wiki/Image:Ambersweet_oranges.jpg)
Representation of Binary Data
Separable Case
The Lagrangian (separable case). Optimize $\min_{w,b}\ \tfrac{1}{2}\lVert w\rVert^2$ subject to $y_i(w \cdot x_i + b) \ge 1$ for all $i$. The Lagrangian is
$$L(w, b, \alpha) = \tfrac{1}{2}\lVert w\rVert^2 - \sum_i \alpha_i\,[\,y_i(w \cdot x_i + b) - 1\,],$$
where $w$ is the weight vector, $b$ the constant, and $\alpha_i \ge 0$ the Lagrangian parameters. Differentiate and set to zero: w.r.t. $w$ gives $w = \sum_i \alpha_i y_i x_i$; w.r.t. $b$ gives $\sum_i \alpha_i y_i = 0$.
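Substituting these two conditions back into $L$ eliminates $w$ and $b$ and yields the standard dual problem, a step the slide only gestures at:

$$\max_{\alpha}\ \sum_i \alpha_i - \frac{1}{2}\sum_i\sum_j \alpha_i \alpha_j\, y_i y_j\,(x_i \cdot x_j) \quad \text{subject to} \quad \alpha_i \ge 0,\quad \sum_i \alpha_i y_i = 0.$$

Note that the data enter only through the dot products $x_i \cdot x_j$, which is exactly where a kernel can be substituted.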
Non-Separable Case
The Lagrangian (non-separable case). Optimize $\min_{w,b,\xi}\ \tfrac{1}{2}\lVert w\rVert^2 + C\sum_i \xi_i$ subject to $y_i(w \cdot x_i + b) \ge 1 - \xi_i$ and $\xi_i \ge 0$, where the $\xi_i$ are slack variables that allow margin violations. The Lagrangian is
$$L(w, b, \xi, \alpha, \mu) = \tfrac{1}{2}\lVert w\rVert^2 + C\sum_i \xi_i - \sum_i \alpha_i\,[\,y_i(w \cdot x_i + b) - 1 + \xi_i\,] - \sum_i \mu_i \xi_i,$$
with $w$ the weight vector, $b$ the constant, and $\alpha_i, \mu_i \ge 0$ the Lagrangian parameters. Differentiate and set to zero: w.r.t. $w$ gives $w = \sum_i \alpha_i y_i x_i$; w.r.t. $b$ gives $\sum_i \alpha_i y_i = 0$; w.r.t. $\xi_i$ gives $C - \alpha_i - \mu_i = 0$.
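The corresponding dual, obtained the same way, differs from the separable case only in the box constraint on $\alpha_i$; replacing $x_i \cdot x_j$ with $K(x_i, x_j)$ gives the kernelized SVM:

$$\max_{\alpha}\ \sum_i \alpha_i - \frac{1}{2}\sum_i\sum_j \alpha_i \alpha_j\, y_i y_j\, K(x_i, x_j) \quad \text{subject to} \quad 0 \le \alpha_i \le C,\quad \sum_i \alpha_i y_i = 0.$$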
Finally … after some mental mathematical harassment we get the optimized values of the weight vector $w$ and of $b$, and then use them to classify new test examples via $f(x) = \operatorname{sign}\!\big(\sum_i \alpha_i y_i K(x_i, x) + b\big)$, as sketched below …
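A minimal end-to-end sketch, assuming scikit-learn (which the slides do not mention) and made-up fruit features:

```python
import numpy as np
from sklearn.svm import SVC

# Toy "apples vs. oranges" stand-in: two features per fruit
# (say, redness and roundness). Purely illustrative numbers.
X = np.array([[0.9, 0.8], [0.8, 0.9], [0.2, 0.4], [0.1, 0.3]])
y = np.array([+1, +1, -1, -1])   # +1 = apple, -1 = orange

# Soft-margin SVM with an RBF kernel; C is the slack penalty
# from the non-separable formulation above.
clf = SVC(kernel="rbf", C=1.0, gamma=1.0)
clf.fit(X, y)

print(clf.predict(np.array([[0.85, 0.85], [0.15, 0.35]])))  # [ 1 -1]
```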
In the End: if SVMs can't help classify … then DITCH them, and classify apples and oranges by eating them yourself …