Page: 1 of 38 Support Vector Machine 李旭斌 (LI Xubin), Mining Lab. 6/19/2012.


Page: 2 of 38 No theory, just use it. The theory is so complicated: structural risk minimization, VC dimension, hyperplanes, the maximum-margin classifier, kernel functions, bla bla… Paper: What is a support vector machine?

Page: 3 of 38 What can it do? Main usage: classification (C-SVC, nu-SVC), regression (epsilon-SVR, nu-SVR), distribution estimation (one-class SVM). Other: clustering.

Page: 4 of 38 But we have many software packages with friendly interfaces.

Page: 5 of 38 Which software implements SVM? libSVM (Java, C, R, MATLAB, Python, Perl, C#… CUDA! Hadoop (Mahout)!), WEKA, Weka-Parallel, MATLAB SVM Toolbox, Spider, SVM in R, GPU-accelerated LIBSVM.

Page: 6 of 38 Examples for Machine Learning Algorithms

Page: 7 of 38 Classification SVM

Page: 8 of 38 Regression SVR

Page: 9 of 38 Clustering: K-means. Screenshots from MLDemos.

Page: 10 of 38 Let's get back to libSVM.

Page: 11 of 38 Format of input The format of a training or testing data file is:

<label> <index1>:<value1> <index2>:<value2> …

Each line contains one instance and ends with a '\n' character. For classification, <label> is an integer indicating the class label (multi-class is supported). For regression, <label> is the target value, which can be any real number. For one-class SVM, the label is not used, so it can be any number. Each <index>:<value> pair gives a feature (attribute) value: <index> is an integer starting from 1 and <value> is a real number. Example:

1 1:1 2:4 3:6 4:1
1 1:2 2:6 3:8 4:0
0 1:3 2:1 3:0 4:1
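This sparse format is easy to produce and consume programmatically. A plain-Python sketch (an illustration, not part of libSVM itself; indices start from 1 as the format requires):

```python
def to_libsvm_line(label, features):
    """Serialize one instance; indices start from 1 per the libSVM format."""
    pairs = " ".join(f"{i}:{v}" for i, v in enumerate(features, start=1))
    return f"{label} {pairs}"

def parse_libsvm_line(line):
    """Parse one line back into (label, {index: value})."""
    label, *pairs = line.split()
    feats = {}
    for p in pairs:
        idx, val = p.split(":")
        feats[int(idx)] = float(val)
    return float(label), feats

line = to_libsvm_line(1, [1, 4, 6, 1])
print(line)                       # 1 1:1 2:4 3:6 4:1
print(parse_libsvm_line(line))    # (1.0, {1: 1.0, 2: 4.0, 3: 6.0, 4: 1.0})
```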

Page: 12 of 38 Parameters Usage: svm-train [options] training_set_file [model_file]
options:
-s svm_type : set type of SVM (default 0)
  0 -- C-SVC
  1 -- nu-SVC
  2 -- one-class SVM
  3 -- epsilon-SVR
  4 -- nu-SVR
-t kernel_type : set type of kernel function (default 2)
  0 -- linear: u'*v
  1 -- polynomial: (gamma*u'*v + coef0)^degree
  2 -- radial basis function: exp(-gamma*|u-v|^2)
  3 -- sigmoid: tanh(gamma*u'*v + coef0)
  4 -- precomputed kernel (kernel values in training_set_file)
Attention: the parameters appear in the kernel formulas.
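The four built-in kernel formulas are easy to check numerically. A plain-Python sketch (illustrative only; the defaults degree = 3 and coef0 = 0 match the -d and -r defaults in the option list):

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def linear(u, v):
    return dot(u, v)                               # u'*v

def polynomial(u, v, gamma, coef0=0.0, degree=3):
    return (gamma * dot(u, v) + coef0) ** degree   # (gamma*u'*v + coef0)^degree

def rbf(u, v, gamma):
    sq = sum((a - b) ** 2 for a, b in zip(u, v))   # |u-v|^2
    return math.exp(-gamma * sq)                   # exp(-gamma*|u-v|^2)

def sigmoid(u, v, gamma, coef0=0.0):
    return math.tanh(gamma * dot(u, v) + coef0)    # tanh(gamma*u'*v + coef0)

u, v = [1.0, 2.0], [3.0, 4.0]
print(linear(u, v))           # 11.0
print(rbf(u, u, gamma=0.5))   # 1.0 (identical points always give 1)
```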

Page: 13 of 38
-d degree : set degree in kernel function (default 3)
-g gamma : set gamma in kernel function (default 1/num_features)
-r coef0 : set coef0 in kernel function (default 0)
-c cost : set the parameter C of C-SVC, epsilon-SVR, and nu-SVR (default 1)
-n nu : set the parameter nu of nu-SVC, one-class SVM, and nu-SVR (default 0.5)
-p epsilon : set the epsilon in loss function of epsilon-SVR (default 0.1)
-m cachesize : set cache memory size in MB (default 100)
-e epsilon : set tolerance of termination criterion (default 0.001)
-h shrinking : whether to use the shrinking heuristics, 0 or 1 (default 1)
-b probability_estimates : whether to train an SVC or SVR model for probability estimates, 0 or 1 (default 0)
-wi weight : set the parameter C of class i to weight*C, for C-SVC (default 1)
-v n : n-fold cross validation mode
-q : quiet mode (no outputs)

Page: 14 of 38 nu-SVC & C-SVC: "Basically they are the same thing but with different parameters. The range of C is from zero to infinity, but nu is always between [0,1]. A nice property of nu is that it is related to the ratio of support vectors and the ratio of training error."

Page: 15 of 38 one-class SVM: fault diagnosis, anomaly detection. The training set is always made up of normal instances, labeled 1 (no -1). The test set contains instances of unknown status. Output label: 1 (normal) or -1 (anomalous).

Page: 16 of 38 epsilon-SVR & nu-SVR Paper: LIBSVM: A Library for Support Vector Machines

Page: 17 of 38 Comparison of epsilon-SVR and nu-SVR [figures].

Page: 18 of 38 Related experience: usage and grid search; code analysis; Chinese version of the libSVM FAQ.

Page: 19 of 38 libSVM Guide uide.pdf

Page: 20 of 38 Flowchart of the task: train → svm-scale → train.scale → svm-train → model; test → svm-scale → test.scale → svm-predict → result. The train set and test set should both be scaled (with the same scaling ranges). But before that: do you really need to scale them?
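The scaling step can be sketched in plain Python (an illustration of what svm-scale does, not its actual code): learn per-feature min/max ranges on the training set only, then apply the same mapping to both sets, e.g. into [-1, 1]:

```python
def fit_ranges(rows):
    """Per-feature (min, max), learned from the training set only."""
    cols = list(zip(*rows))
    return [(min(c), max(c)) for c in cols]

def scale(rows, ranges, lower=-1.0, upper=1.0):
    """Map each feature into [lower, upper] using the training-set ranges."""
    out = []
    for row in rows:
        scaled = []
        for x, (lo, hi) in zip(row, ranges):
            if hi == lo:                 # constant feature: pin it to lower
                scaled.append(lower)
            else:
                scaled.append(lower + (upper - lower) * (x - lo) / (hi - lo))
        out.append(scaled)
    return out

train = [[0.0, 10.0], [4.0, 30.0]]
test = [[2.0, 20.0]]
ranges = fit_ranges(train)       # fit on train, reuse for test
print(scale(train, ranges))      # [[-1.0, -1.0], [1.0, 1.0]]
print(scale(test, ranges))       # [[0.0, 0.0]]
```

Reusing the training ranges for the test set is the whole point: scaling each file independently would distort the test data.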

Page: 21 of 38 Parameters are important! Good parameters build a good model. How do we get the 'good' parameters? (Features are important! The model is also important!)

Page: 22 of 38 Example (train set): with C=2, g=100: positive 83%, negative 85%; with C=50, g=100: positive 86%, negative 91%. ROC?

Page: 23 of 38 ROC? Confusion matrix:

              Predicted 1                  Predicted 0                  Total
Real 1        True Positive (TP)           False Negative (FN)          Actual Positive (TP+FN)
Real 0        False Positive (FP)          True Negative (TN)           Actual Negative (FP+TN)
Total         Predicted Positive (TP+FP)   Predicted Negative (FN+TN)   TP+FP+FN+TN

We need TPR and FPR. X-axis: FPR (1 - specificity). Y-axis: TPR (sensitivity).
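From the confusion matrix, each ROC axis is one division. A small plain-Python sketch (illustrative, not from any particular toolkit):

```python
def roc_point(actual, predicted):
    """Return (FPR, TPR) for binary labels, treating 1 as positive."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p != 1)
    fp = sum(1 for a, p in zip(actual, predicted) if a != 1 and p == 1)
    tn = sum(1 for a, p in zip(actual, predicted) if a != 1 and p != 1)
    tpr = tp / (tp + fn) if tp + fn else 0.0   # sensitivity, y-axis
    fpr = fp / (fp + tn) if fp + tn else 0.0   # 1 - specificity, x-axis
    return fpr, tpr

actual    = [1, 1, 1, 0, 0, 0]
predicted = [1, 1, 0, 1, 0, 0]
print(roc_point(actual, predicted))   # FPR = 1/3, TPR = 2/3
```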

Page: 24 of 38 Parameter selection: grid search, particle swarm optimization, other algorithms, manual trial… random? My God!! Our work now: type: classification; goal: find the best (C, G).

Page: 25 of 38 Grid Search
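Grid search is just two nested loops over exponentially spaced parameters. A plain-Python sketch (the exponential grid follows the convention of libSVM's grid.py; `mock_cv` is a hypothetical stand-in for real cross-validation accuracy):

```python
import math

def grid_search(cv_accuracy, c_exps=range(-5, 16, 2), g_exps=range(-15, 4, 2)):
    """Try every (C, gamma) on an exponential grid; keep the best score."""
    best = (None, None, -1.0)
    for ce in c_exps:
        for ge in g_exps:
            C, gamma = 2.0 ** ce, 2.0 ** ge
            acc = cv_accuracy(C, gamma)
            if acc > best[2]:
                best = (C, gamma, acc)
    return best

# Hypothetical stand-in objective, peaking at C = 2^3, gamma = 2^-3.
def mock_cv(C, gamma):
    return 1.0 / (1.0 + (math.log2(C) - 3) ** 2 + (math.log2(gamma) + 3) ** 2)

C, gamma, acc = grid_search(mock_cv)
print(C, gamma, acc)   # 8.0 0.125 1.0
```

In practice `cv_accuracy` would invoke svm-train with -v for cross validation; grid search is trivially parallel, which is what the next slide exploits.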

Page: 26 of 38 Parallel grid search: SSH commands, grid.py. Hadoop-based: training the SVM model with MapReduce.

Page: 27 of 38 Particle Swarm Optimization (PSO): demo.

Page: 28 of 38 Climbing a mountain: the peak is the destination; the higher you climb, the slower you go.

Page: 29 of 38 Similar algorithms: hill climbing, genetic algorithms, ant colony optimization, simulated annealing.

Page: 30 of 38 Let's get back to PSO. Paper: Development of Particle Swarm Optimization Algorithm.

Page: 31 of 38 Particle Swarm Optimization: like birds hunting for food. [Figure: particles in the (C, G) plane; each particle's distance to (Cbest, Gbest).]

Page: 32 of 38 PSO and parameter selection. PSO: find a point (C, G) that minimizes the distance to (Cbest, Gbest). Parameter selection: find a pair (C, G) that minimizes the error rate (the estimation function).

Page: 33 of 38 Position of particle i: x_i. Speed (velocity): v_i. Particle i's best: pbest_i. Global best: gbest. Update rule (standard PSO): update speed, v_i ← w·v_i + c1·r1·(pbest_i − x_i) + c2·r2·(gbest − x_i); update position, x_i ← x_i + v_i; update the inertia weight w each iteration (r1, r2 are random numbers in [0,1]).

Page: 34 of 38 Algorithm constants: dimension M = 2; number of particles N = 20-50; space scope 0 < X[i] < 1024, 0 < i < M; max speed; speedup (acceleration) factor c1 = c2 = 2. Stop criterion: max iterations (20), threshold (0.03), max dead-stop times (10).
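Putting the update rules and constants of the last two slides together, a plain-Python sketch of PSO minimizing a stand-in error function (an illustration, not the presenter's actual code; the 0.98 inertia-weight decay and the quadratic `err` objective are assumptions):

```python
import random

def pso(f, dim=2, n_particles=20, iters=20, xmax=1024.0,
        w=0.9, c1=2.0, c2=2.0, seed=0):
    """Minimize f over [0, xmax]^dim using the standard PSO updates."""
    rng = random.Random(seed)                     # fixed seed: reproducible run
    xs = [[rng.uniform(0, xmax) for _ in range(dim)] for _ in range(n_particles)]
    vs = [[0.0] * dim for _ in range(n_particles)]
    pbest = [x[:] for x in xs]                    # each particle's best position
    gbest = min(pbest, key=f)[:]                  # global best position
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vs[i][d] = (w * vs[i][d]
                            + c1 * r1 * (pbest[i][d] - xs[i][d])
                            + c2 * r2 * (gbest[d] - xs[i][d]))
                xs[i][d] = min(max(xs[i][d] + vs[i][d], 0.0), xmax)
            if f(xs[i]) < f(pbest[i]):
                pbest[i] = xs[i][:]
                if f(pbest[i]) < f(gbest):
                    gbest = pbest[i][:]
        w *= 0.98                                 # shrink the inertia weight
    return gbest

# Hypothetical "error rate" whose minimum sits at (C, G) = (512, 512).
err = lambda p: (p[0] - 512.0) ** 2 + (p[1] - 512.0) ** 2
best = pso(err)
print(best)    # a point near [512.0, 512.0]
```

For real parameter selection, `f` would run cross validation at (C, G) and return the error rate, exactly as on slide 32.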

Page: 35 of 38 [Figure]

Page: 36 of 38 Example: there is a problem.

Page: 37 of 38 Discussion

Page: 38 of 38 Thank you for your attention!