Minimax Probability Machine (MPM)

Presentation transcript:

Minimax Probability Machine (MPM) Jay Silver

Very High Level Diagram
Diagram labels: Training a Pattern Classifier; Augmented; Testing a New Data Point.

Finding a Function that Decides
Decision: depending on the value of the decision function, choose class w_y or class w_x (binary classification is assumed).
Non-parametric: Support Vector Machine (SVM), Minimax Probability Machine (MPM)
Parametric: Gaussian

Non-Parametric Linear Decision Boundaries
SVM: maximal margin classifier.
MPM: minimizes the worst-case future error.
An SVM toolbox and an MPM toolbox were used for implementation [1,4]. MPM figure borrowed from [2].

MPM Problem Statement
Problem statement: maximize alpha, the lower bound on test accuracy. Equivalently, 1 - alpha is the upper bound on the probability of misclassifying a future point.
Equivalent problem statement: a constrained (s.t.) optimization expressed with the Mahalanobis distance, again with alpha as the lower bound on test accuracy.
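The two problem statements on this slide are most likely the standard linear MPM formulation from the literature. A sketch of that formulation follows. The symbols are assumptions, since the transcript does not name them: a and b define the hyperplane a^T z = b, x-bar and y-bar are the class means, Sigma_x and Sigma_y are the class covariances, and kappa is the Mahalanobis-style margin.

```latex
% Original statement: maximize the worst-case accuracy alpha over all
% distributions with the given class means and covariances.
\max_{\alpha,\; a \neq 0,\; b} \; \alpha
\quad \text{s.t.} \quad
\inf_{x \sim (\bar{x}, \Sigma_x)} \Pr\{a^{T} x \ge b\} \ge \alpha,
\qquad
\inf_{y \sim (\bar{y}, \Sigma_y)} \Pr\{a^{T} y \le b\} \ge \alpha

% Equivalent statement: a convex problem in a alone.
\min_{a} \; \sqrt{a^{T} \Sigma_x a} + \sqrt{a^{T} \Sigma_y a}
\quad \text{s.t.} \quad
a^{T}(\bar{x} - \bar{y}) = 1

% Recovering the bound and the offset from the minimizer a_*.
\kappa_* = \frac{1}{\sqrt{a_*^{T} \Sigma_x a_*} + \sqrt{a_*^{T} \Sigma_y a_*}},
\qquad
\alpha_* = \frac{\kappa_*^{2}}{1 + \kappa_*^{2}},
\qquad
b_* = a_*^{T} \bar{x} - \kappa_* \sqrt{a_*^{T} \Sigma_x a_*}
```

Under this reading, 1 - alpha_* is the upper bound on the probability of misclassifying a future point.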

Expanding the Feature Space with Kernels
Original feature space, XOR: {x1, x2}. Not linearly separable.
Expanded feature space, XOR: {x1, x2, x1x2}. Linearly separable.
Kernel examples: Gaussian kernel, polynomial kernel.
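The XOR example can be checked numerically. The sketch below assumes a 0/1 encoding of the inputs and uses a hand-picked hyperplane in the expanded space; neither comes from the slides. The two kernel functions use common textbook parameterizations (sigma, degree, c), which may differ from the ones the slide showed.

```python
import numpy as np

# XOR truth table with inputs encoded as 0/1 (an assumed encoding).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
labels = np.array([0, 1, 1, 0])

def expand(points):
    """Map {x1, x2} to the expanded feature space {x1, x2, x1*x2}."""
    return np.column_stack([points, points[:, 0] * points[:, 1]])

# Hand-picked (hypothetical) hyperplane in the expanded space:
# x1 + x2 - 2*x1*x2 - 0.5 > 0 exactly when XOR is true.
w = np.array([1.0, 1.0, -2.0])
b = 0.5
predictions = (expand(X) @ w - b > 0).astype(int)

def gaussian_kernel(u, v, sigma=1.0):
    """Gaussian (RBF) kernel; the 2*sigma^2 scaling is one common convention."""
    return float(np.exp(-np.sum((u - v) ** 2) / (2.0 * sigma ** 2)))

def polynomial_kernel(u, v, degree=2, c=1.0):
    """Polynomial kernel; c=0 gives the homogeneous polynomial kernel."""
    return float((np.dot(u, v) + c) ** degree)

print(predictions, labels)
```

No linear function of x1 and x2 alone reproduces these labels, which is why the product feature x1x2 (or a kernel that implies it) is needed.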

Take a Look at Some Linear Decision Boundaries (figure with key)

Results for the Distribution We Just Saw
SVM performs best.
MPM performs well.
SVM with a homogeneous polynomial kernel fails to converge.

Alpha as a Lower Bound on Test Accuracy
Compare alpha to test accuracy.
Note the correlation between alpha and test accuracy. (figure with key)
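The comparison between alpha and test accuracy can be sketched as follows. The code assumes the standard MPM relation alpha = kappa^2 / (1 + kappa^2). The function name `worst_case_alpha`, the toy means and covariances, and the hand-picked direction `a` are all hypothetical. No MPM optimization is performed here; the bound is simply evaluated for a fixed direction.

```python
import numpy as np

def worst_case_alpha(a, mean_x, cov_x, mean_y, cov_y):
    """Return (alpha, b) for a fixed direction a.

    kappa is the largest margin, in standard deviations along a, that both
    classes can keep from the threshold b; alpha = kappa^2 / (1 + kappa^2).
    """
    a = np.asarray(a, dtype=float)
    gap = float(a @ (mean_x - mean_y))
    spread_x = float(np.sqrt(a @ cov_x @ a))
    spread_y = float(np.sqrt(a @ cov_y @ a))
    kappa = max(gap, 0.0) / (spread_x + spread_y)
    alpha = kappa ** 2 / (1.0 + kappa ** 2)
    b = float(a @ mean_x) - kappa * spread_x
    return alpha, b

# Toy problem (assumed values): two unit-covariance classes, two units apart.
mean_x, mean_y = np.array([2.0, 0.0]), np.array([0.0, 0.0])
cov_x = cov_y = np.eye(2)
a = np.array([1.0, 0.0])
alpha, b = worst_case_alpha(a, mean_x, cov_x, mean_y, cov_y)

# Empirical test accuracy on Gaussian samples drawn from the same moments.
rng = np.random.default_rng(0)
test_x = rng.multivariate_normal(mean_x, cov_x, size=2000)
test_y = rng.multivariate_normal(mean_y, cov_y, size=2000)
correct = np.sum(test_x @ a >= b) + np.sum(test_y @ a < b)
accuracy = correct / 4000

print(alpha, accuracy)
```

Because alpha is a worst case over every distribution with the given mean and covariance, the accuracy measured on any one distribution (Gaussian here) is expected to sit above it. When the moments are estimated from training data rather than known, the bound can be violated.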

Testing on a Real Speech Task
Deterding data: 11 vowel sounds with 10 features.
Multiple classes: use 1 vs. 1 voting to generalize binary classifiers.
Test accuracy for the Gaussian kernel: MPM peaks at 67.3%; SVM peaks at 68.4%.
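A minimal sketch of 1 vs. 1 voting follows. One binary classifier is trained per pair of classes (55 pairs for 11 vowel classes), and a new point receives the label with the most pairwise wins. The nearest-mean rule is only a stand-in for a binary SVM or MPM, and the function names and toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations

import numpy as np

def train_nearest_mean(points_a, points_b):
    """Stand-in binary classifier: True if a point is closer to class a's mean."""
    mean_a, mean_b = points_a.mean(axis=0), points_b.mean(axis=0)

    def prefers_a(z):
        return np.linalg.norm(z - mean_a) <= np.linalg.norm(z - mean_b)

    return prefers_a

def train_one_vs_one(X, y):
    """Train one binary classifier for every unordered pair of classes."""
    classifiers = {}
    for a, b in combinations(sorted(set(y.tolist())), 2):
        classifiers[(a, b)] = train_nearest_mean(X[y == a], X[y == b])
    return classifiers

def predict_one_vs_one(classifiers, z):
    """Each pairwise classifier casts one vote; the most-voted class wins."""
    votes = Counter()
    for (a, b), prefers_a in classifiers.items():
        votes[a if prefers_a(z) else b] += 1
    return votes.most_common(1)[0][0]

# Toy three-class data (assumed values, not the Deterding set).
X = np.array([[0, 0], [0, 1], [5, 5], [5, 6], [10, 0], [10, 1]], dtype=float)
y = np.array([0, 0, 1, 1, 2, 2])
classifiers = train_one_vs_one(X, y)

print(predict_one_vs_one(classifiers, np.array([0.2, 0.4])))
print(predict_one_vs_one(classifiers, np.array([9.0, 1.0])))
```

Ties are broken here by whichever class was counted first, which is an arbitrary choice; a real system might break ties using classifier confidence instead.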

Summary of Deterding Results
Distill results further:

Classifier   Linear accuracy   Nonlinear accuracy
Bayes        50.7%             47.2%
SVM          51.7%             68.4%
MPM          48.7%             67.3%

Conclusions
Alpha is an accurate lower bound for all cases but one.
Alpha was reasonably well correlated with test accuracy.
The SVM homogeneous polynomial kernel outperformed MPM, but the MPM homogeneous polynomial kernel was more consistent.
The MPM Gaussian kernel performed about 1 percentage point below SVM on Deterding (67.3% vs. 68.4%).
MPM: competitive, including on realistic speech tasks; mathematically pleasing; room to grow; not quite as accurate as SVMs.

References

Questions? The Rainbow Linear Discriminant Between CSTIT Students