1 Support Vector Machines (SVM). ESANN'99, Special Session 7, Thursday 22nd April 1999. Bernhard Schölkopf & Stéphane Canu, GMD-FIRST / I.N.S.A. - P.S.I. http://svm.first.gmd.de/ http://psichaud.insa-rouen.fr/~scanu/

2 Radial SVM.

3 Road map
– linear discrimination: the separable case
– linear discrimination: the non-separable case
– quadratic discrimination
– radial SVM: principle, the 3 regularization hyperparameters, some benchmark results (glass data)
– SVM for regression

4 What's new with SVM?
From Artificial Neural Networks to Support Vector Machines: from biology to machine learning (it works!) and from maths to machine learning (= minimization). Some reasons:
– formalization of learning: statistical learning theory, learning from data
– universality (learn everything): the kernel trick
– complexity control (but not anything): the margin, minimization under constraints

5 The kernel trick: working in a functional (feature) space.
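As a minimal numeric illustration of the kernel trick (my own example, not taken from the slide): a degree-2 polynomial kernel evaluated in input space equals a dot product in an explicit feature space, so the algorithm never needs to build the feature vectors.

```python
import numpy as np

def phi(x):
    """Explicit degree-2 feature map for 2-d inputs: (x1^2, x2^2, sqrt(2)*x1*x2)."""
    return np.array([x[0] ** 2, x[1] ** 2, np.sqrt(2) * x[0] * x[1]])

def poly_kernel(x, z):
    """Degree-2 polynomial kernel, computed directly in input space."""
    return (x @ z) ** 2

x, z = np.array([1.0, 2.0]), np.array([3.0, 0.5])
print(phi(x) @ phi(z))    # 16.0: dot product in the feature space
print(poly_kernel(x, z))  # 16.0: same value, without ever forming phi
```

The same substitution turns the linear SVM of the following slides into a quadratic or radial one: every dot product x_i · x_j in the dual is replaced by K(x_i, x_j).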

6 Minimization with constraints: L(x, λ), the Lagrangian (Lagrange, 1788).

7 Minimization with constraints: the dual formulation (Phase 1, Phase 2); a standard reconstruction follows.
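The formulas of slides 6 and 7 were lost in the transcript; a standard reconstruction of the constrained problem, its Lagrangian, and the two-phase dual (my notation, assuming inequality constraints g_i(x) ≤ 0) is:

```latex
\min_{x} f(x) \quad \text{s.t.}\quad g_i(x) \le 0,
\qquad
L(x,\lambda) = f(x) + \sum_i \lambda_i\, g_i(x), \quad \lambda_i \ge 0

\text{Phase 1 (inner):}\quad W(\lambda) = \min_{x} L(x,\lambda),
\qquad
\text{Phase 2 (outer):}\quad \max_{\lambda \ge 0} W(\lambda)
```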

8 Linear discrimination, the separable case (figure: two classes of points and a hyperplane wx + b = 0): correctly classify all examples.

9 Linear discrimination, the separable case (figure): correctly classify all examples with the hyperplane wx + b = 0 of largest margin.

10 Linear discrimination, the separable case (figure: 1-d example on axes x and y, labels +1 and -1).

11 Linear discrimination, the separable case (figure: the line y = wx separating labels +1 and -1, with the margin shown).

12 Linear discrimination, the separable case (figure): correctly classify all examples with the largest margin, wx + b = 0.

13 Linear classification, the separable case (primal and dual formulation; see the reconstruction below).
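The slide's formulas did not survive extraction; the standard hard-margin primal and its dual, which this slide presumably showed, are (a reconstruction, not the original slide):

```latex
\min_{w,b}\ \tfrac{1}{2}\|w\|^2
\quad \text{s.t.} \quad y_i\,(w \cdot x_i + b) \ge 1,\ i = 1,\dots,n

\max_{\alpha \ge 0}\ \sum_i \alpha_i \;-\; \tfrac{1}{2}\sum_{i,j} \alpha_i \alpha_j\, y_i y_j\, x_i \cdot x_j
\quad \text{s.t.} \quad \sum_i \alpha_i y_i = 0
```

The solution is w = Σ_i α_i y_i x_i, and only the support vectors have α_i > 0.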

14 Equality constraint integration:
```latex
\begin{pmatrix} H & y \\ y^\top & 0 \end{pmatrix}
\begin{pmatrix} \alpha \\ \lambda \end{pmatrix}
=
\begin{pmatrix} c \\ 0 \end{pmatrix}
```

15 Inequality constraint integration (active-set QP):
While (α, μ) do not satisfy the optimality conditions:
– solve the current equality-constrained system: α = M⁻¹ b and μ = -Hα + c + λ y
– if some α_i < 0, a constraint is blocked (α_i = 0): an active variable is eliminated
– else if some μ_i < 0, a constraint is relaxed
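A small numpy sketch of the equality-constraint step that slides 14 and 15 iterate (the toy data, the ridge term, and the variable names are mine, not the slides'): solve the KKT block system for α and the multiplier, then check the sign conditions that drive the active-set loop.

```python
import numpy as np

# Toy SVM dual data (my own): H_ij = y_i y_j x_i . x_j, c = vector of ones.
X = np.array([[1.0, 1.0], [2.0, 0.5], [-1.0, -1.5], [-2.0, -0.5]])
y = np.array([1.0, 1.0, -1.0, -1.0])
H = np.outer(y, y) * (X @ X.T)
c = np.ones(len(y))

# Equality-constraint step: solve  [H  y; y' 0] [alpha; lam] = [c; 0].
# (A tiny ridge keeps the toy H invertible; it is not part of the method.)
n = len(y)
M = np.block([[H + 1e-8 * np.eye(n), y[:, None]],
              [y[None, :],           np.zeros((1, 1))]])
sol = np.linalg.solve(M, np.concatenate([c, [0.0]]))
alpha, lam = sol[:n], sol[n]

# Slide 15's check: a negative alpha_i means that inequality constraint
# must be blocked (alpha_i set to 0) before re-solving on the reduced set.
print("alpha =", alpha, " multiplier =", lam)
print("blocked constraints:", np.where(alpha < -1e-12)[0])
```

In the full active-set method this solve is repeated on the current working set after each constraint is blocked or relaxed.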

16 Linear classification, the non-separable case: introduce error (slack) variables; see the standard formulation below.
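The standard soft-margin formulation with error (slack) variables, reconstructed rather than copied from the slide:

```latex
\min_{w,b,\xi}\ \tfrac{1}{2}\|w\|^2 + C \sum_i \xi_i
\quad \text{s.t.} \quad y_i\,(w \cdot x_i + b) \ge 1 - \xi_i, \quad \xi_i \ge 0
```

In the dual, C appears only as the box constraint 0 ≤ α_i ≤ C.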

17 Quadratic SVM.

18 Polynomial classification (figure): rank(H) = 5, so regularization is needed.
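A quick numeric illustration of why regularization is needed (my own toy setup, assuming for instance a degree-4 polynomial kernel on scalar inputs; the slide's exact setting was not recoverable): the kernel matrix has rank at most 5 however many points are used, so the linear system is singular without a ridge term.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=20)               # 20 scalar training points
K = (1.0 + np.outer(x, x)) ** 4       # degree-4 polynomial kernel matrix

print(np.linalg.matrix_rank(K))       # 5: only 5 monomial features (1, x, ..., x^4)
print(np.linalg.cond(K))              # enormous: K alone cannot be inverted reliably
print(np.linalg.cond(K + 1e-3 * np.eye(len(x))))  # adding lambda*I makes the system invertible
```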

19 Gaussian-kernel SVM.

20 1-d example (figures): Class 1 is a mixture of two Gaussians, Class 2 is a Gaussian; shown are the training set, the output of the SVM on the test set, the margin and the support vectors.

21 Three regularization parameters:
– C: the upper bound on the α_i
– σ: the kernel bandwidth in K_σ(x, y)
– λ: the linear-system regularization, Hα = b becomes (H + λI)α = b
A numeric sketch follows below.
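A sketch of how the three hyperparameters enter in practice (a simplification under my own assumptions, not the slides' algorithm: the Gaussian kernel is taken as K_σ(x, z) = exp(-||x - z||²/(2σ²)), the α come from the regularized linear system of this slide, and clipping to [0, C] merely stands in for the box constraint of the full QP):

```python
import numpy as np

def gaussian_kernel(X, Z, sigma):
    """K_sigma(x, z) = exp(-||x - z||^2 / (2 sigma^2))."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

# Toy 1-d data in the spirit of slide 20: two classes, labels +/-1.
rng = np.random.default_rng(1)
X = np.concatenate([rng.normal(-1, 0.5, (20, 1)), rng.normal(1, 0.5, (20, 1))])
y = np.concatenate([-np.ones(20), np.ones(20)])

sigma, lam, C = 0.5, 1e-3, 10.0          # the three hyperparameters of slide 21

H = gaussian_kernel(X, X, sigma) * np.outer(y, y)
alpha = np.linalg.solve(H + lam * np.eye(len(y)), np.ones(len(y)))  # (H + lambda I) alpha = b
alpha = np.clip(alpha, 0.0, C)           # C bounds the alpha_i (box constraint of the full QP)

def decision(Xtest):
    return gaussian_kernel(Xtest, X, sigma) @ (alpha * y)

print(np.sign(decision(np.array([[-1.0], [1.0]]))))   # typically prints [-1.  1.]
```

Shrinking σ while raising C typically yields a more flexible (possibly overfitted) boundary, whereas a large σ and a small C yield a smoother one, which is the trade-off the next three slides illustrate.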

22 Small bandwidth and large C.

23 Large bandwidth and large C.

24 Large bandwidth and small C.

25 SVM for regression.
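The standard ε-insensitive formulation of SVM regression (Vapnik's form; reconstructed, since the slide's formulas are not in the transcript):

```latex
\min_{w,b,\xi,\xi^*}\ \tfrac{1}{2}\|w\|^2 + C \sum_i (\xi_i + \xi_i^*)
\quad \text{s.t.} \quad
\begin{cases}
y_i - (w \cdot x_i + b) \le \varepsilon + \xi_i\\
(w \cdot x_i + b) - y_i \le \varepsilon + \xi_i^*\\
\xi_i,\ \xi_i^* \ge 0
\end{cases}
```

Errors smaller than ε cost nothing, which is what produces the sparse set of support vectors in the examples that follow.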

26 Example...

27 ε small and … also (figure).

28 Geostatistics.

29 Another way to see things (Girosi, 1997).

30 SVM history and trends
The pioneers:
– Vapnik, V.; Lerner, A. (1963): statistical learning theory
– Mangasarian, O. (1965, 1968): optimization
– Kimeldorf, G.; Wahba, G. (1971): non-parametric regression, splines
The 2nd start (ANN, learning & computers...):
– Boser, B.; Guyon, I.; Vapnik, V. (1992); Bennett, K.; Mangasarian, O. (1992)
– learning theory: Cortes, C. (1995), soft-margin classifier, effective VC dimensions, other formalisms, ...
Trends:
– applications: on-line handwritten character recognition, face recognition, text mining, ...
– optimization: Vapnik; Osuna, E. & Girosi; John C. Platt; Linda Kaufman; Thorsten Joachims

31 Optimization issues
– a QP with constraints: box constraints, and H is positive semidefinite (beware of commercial solvers)
– the size of H! But many of the α_i are 0 or C:
  – use an active constraint set, starting from α = 0
  – do not compute (or store) the whole H
  – chunking
– the multiclass issue!

32 Optimization issues
Solve the whole problem:
– commercial solvers: LOQO (primal-dual approach), MINOS, Matlab!!!
– Vapnik: Moré and Toraldo (1991)
Decompose the problem:
– chunking (Vapnik, 82, 92); Osuna & Girosi (implemented in SVMlight by Thorsten Joachims, 98)
– Sequential Minimal Optimization (SMO), John C. Platt, 98
No H: start from 0, active-set technique (Linda Kaufman, 98)
Minimize the cost function:
– 2nd order: Newton
– conjugate gradient, projected conjugate gradient (PCG), Burges, 98
Select the relevant constraints: interior-point methods, Moré, 91; Z. Dostal, 97; and others...

33 Some benchmark considerations (Platt 98)
– Osuna's decomposition technique permits the solution of SVMs via fixed-size QP subproblems
– using two-variable QP subproblems (SMO) does not require a QP library
– SMO trades off QP time for kernel evaluation time
– optimizations can dramatically reduce kernel time: linear SVMs (useful for text categorization), sparse dot products, kernel caching (good for smaller problems, Thorsten Joachims, 98)
– SMO can be much faster than other techniques for some problems
– what about the active-set and interior-point techniques? (A simplified SMO sketch follows below.)
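For concreteness, here is a compact sketch of the two-variable SMO update (the simplified variant with a random second index, under my own naming; Platt's full algorithm adds working-set selection heuristics and error caching):

```python
import numpy as np

def simplified_smo(K, y, C=1.0, tol=1e-3, max_passes=10, seed=0):
    """Simplified SMO: optimize the SVM dual two alphas at a time.

    K is a precomputed kernel matrix, y the +/-1 labels. This is the
    simplified variant (random choice of the second index), not Platt's
    full working-set heuristics.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    alpha, b, passes = np.zeros(n), 0.0, 0
    while passes < max_passes:
        changed = 0
        for i in range(n):
            Ei = (alpha * y) @ K[:, i] + b - y[i]
            # Only touch alphas that violate the KKT conditions.
            if not ((y[i] * Ei < -tol and alpha[i] < C) or
                    (y[i] * Ei > tol and alpha[i] > 0)):
                continue
            j = rng.integers(n - 1)
            j = j if j < i else j + 1          # pick j != i at random
            Ej = (alpha * y) @ K[:, j] + b - y[j]
            ai_old, aj_old = alpha[i], alpha[j]
            # Box bounds that keep sum(alpha * y) constant.
            if y[i] != y[j]:
                L, H = max(0.0, aj_old - ai_old), min(C, C + aj_old - ai_old)
            else:
                L, H = max(0.0, ai_old + aj_old - C), min(C, ai_old + aj_old)
            eta = 2 * K[i, j] - K[i, i] - K[j, j]
            if L == H or eta >= 0:
                continue
            alpha[j] = np.clip(aj_old - y[j] * (Ei - Ej) / eta, L, H)
            if abs(alpha[j] - aj_old) < 1e-5:
                continue
            alpha[i] = ai_old + y[i] * y[j] * (aj_old - alpha[j])
            # Update the bias from whichever alpha is strictly inside the box.
            b1 = b - Ei - y[i] * (alpha[i] - ai_old) * K[i, i] - y[j] * (alpha[j] - aj_old) * K[i, j]
            b2 = b - Ej - y[i] * (alpha[i] - ai_old) * K[i, j] - y[j] * (alpha[j] - aj_old) * K[j, j]
            b = b1 if 0 < alpha[i] < C else b2 if 0 < alpha[j] < C else 0.5 * (b1 + b2)
            changed += 1
        passes = passes + 1 if changed == 0 else 0
    return alpha, b
```

Called with a precomputed kernel matrix K and ±1 labels y, it returns α and b; the pairwise updates keep Σ_i α_i y_i = 0 and every α_i inside [0, C], so no general-purpose QP library is needed.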

34 Open issues
– VC entropy for margin classifiers: learning bounds
– other margin classifiers: boosting
– non-L2 (quadratic) cost functions: sparse coding (Drezet & Harrison)
– curse of dimensionality: local vs. global kernel influence (Tsuda)
– applications: classification (Weston & Watkins), regression (Pontil et al.), face detection (Fernandez & Viennet)
– algorithms (Cristianini & Campbell)
– making bridges to other formalisms: Bayesian (Kwok), statistical mechanics (Buhot & Gordon), logic (Sebag), ...

35 Books in Support Vector research
– V. Vapnik, The Nature of Statistical Learning Theory. Springer-Verlag, 1995; Statistical Learning Theory. Wiley, 1998.
– Introductory SVM chapter in: S. Haykin, Neural Networks, a Comprehensive Foundation, 2nd ed. Macmillan, New York, NY, 1998.
– V. Cherkassky and F. Mulier, Learning from Data: Concepts, Theory, and Methods. Wiley, 1998.
– C.J.C. Burges, A tutorial on support vector machines for pattern recognition. Data Mining and Knowledge Discovery, Vol. 2, No. 2, 1998.
– B. Schölkopf, Support Vector Learning. PhD thesis, published by R. Oldenbourg Verlag, Munich, 1997. ISBN 3-486-24632-1.
– A. J. Smola, Learning with Kernels. PhD thesis, published by GMD, Birlinghoven, 1999.
– NIPS'97 workshop book: B. Schölkopf, C. Burges, A. Smola, Advances in Kernel Methods: Support Vector Machines. MIT Press, Cambridge, MA, December 1998.
– The NIPS'98 workshop book on large margin classifiers is coming.

36 Events in Support Vector research
– ACAI'99 workshop: Support Vector Machine Theory and Applications
– Workshop on Support Vector Machines, IJCAI'99, August 2, 1999, Stockholm, Sweden
– EUROCOLT'99 workshop on Kernel Methods, March 27, 1999, Nordkirchen Castle, Germany

37 Conclusion
– SVMs select relevant patterns in a robust way (svm.cs.rhbnc.ac.uk)
– Matlab code available upon request (scanu@insa-rouen.fr)
– multi-class problems
– small error

