Introduction to Machine Learning 236756 Prof. Nir Ailon Lecture 5: Support Vector Machines (SVM)

Presentation transcript:

Linear Separators

Which Is Better?

Margin: The margin of a linear separator is the distance from the closest instance point to the separating hyperplane. Large margins are intuitively more stable: if noise is added to the data, it is more likely to remain separated.
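The definition above can be sketched in a few lines of Python. This is a minimal illustration, assuming the separator is given as a pair (w, b) describing the hyperplane {x : <w, x> + b = 0}; the function name `margin` and the toy data are hypothetical and do not come from the slides.

```python
import numpy as np

def margin(w, b, X):
    """Distance from the closest row of X to the hyperplane {x : <w, x> + b = 0}."""
    w = np.asarray(w, dtype=float)
    X = np.asarray(X, dtype=float)
    # |<w, x> + b| / ||w|| is the Euclidean distance of x from the hyperplane.
    return float(np.min(np.abs(X @ w + b)) / np.linalg.norm(w))

# Hypothetical toy data: points on both sides of the vertical line x1 = 0.
X = np.array([[2.0, 0.0], [3.0, 1.0], [-1.0, 0.0], [-4.0, 2.0]])
w = np.array([1.0, 0.0])
b = 0.0

print(margin(w, b, X))        # closest point is (-1, 0)
print(margin(2 * w, b, X))    # rescaling w describes the same hyperplane
```

Dividing by the norm of w makes the value depend only on the hyperplane, not on how (w, b) happens to be scaled.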

The Margin

Hard-SVM

Hard-SVM Equivalent Formulation
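The formulas on the two Hard-SVM slides did not survive the transcript. As a hedged reconstruction, here is the standard textbook statement for a linearly separable sample (x_1, y_1), ..., (x_m, y_m) with labels in {-1, +1}; the notation is an assumption and may differ from the slides.

```latex
% Hard-SVM: maximize the margin over unit-norm separators.
\[
\operatorname*{argmax}_{(w,b):\ \|w\|=1}\ \min_{i\in[m]}\ \bigl|\langle w,x_i\rangle+b\bigr|
\quad\text{s.t.}\quad y_i\bigl(\langle w,x_i\rangle+b\bigr)>0\ \ \forall i
\]

% Equivalent quadratic program: minimize the norm subject to margin-1 constraints,
% then normalize the solution.
\[
(w_0,b_0)=\operatorname*{argmin}_{(w,b)}\ \|w\|^2
\quad\text{s.t.}\quad y_i\bigl(\langle w,x_i\rangle+b\bigr)\ge 1\ \ \forall i,
\qquad
\hat w=\frac{w_0}{\|w_0\|},\quad \hat b=\frac{b_0}{\|w_0\|}
\]
```

In this form the margin attained by the solution is 1/||w_0||, so minimizing the norm is the same as maximizing the margin.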

Sample Complexity With Margin

NO! The margin must be measured relative to the data scale. (One could take any data with a tiny margin and blow it up for free.)
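A small numerical check of this point, as a sketch: multiplying every instance by a constant multiplies the margin by the same constant, while the margin divided by the radius of the data stays fixed. The helper names (`margin`, `radius`, `scaled_stats`), the toy data, and the choice of the largest instance norm as the "data scale" are assumptions made for illustration.

```python
import numpy as np

def margin(w, b, X):
    """Distance from the closest row of X to the hyperplane {x : <w, x> + b = 0}."""
    return float(np.min(np.abs(X @ w + b)) / np.linalg.norm(w))

def radius(X):
    """Largest Euclidean norm among the instances (assumed measure of data scale)."""
    return float(np.max(np.linalg.norm(X, axis=1)))

# Hypothetical toy data separated by a hyperplane through the origin.
X = np.array([[2.0, 0.0], [3.0, 1.0], [-1.0, 0.0], [-4.0, 2.0]])
w, b = np.array([1.0, 0.0]), 0.0

def scaled_stats(c):
    """Margin and radius after multiplying every instance by c."""
    Xc = c * X
    return margin(w, b, Xc), radius(Xc)

for c in (1.0, 1000.0):
    gamma, R = scaled_stats(c)
    print(f"scale={c:g}  margin={gamma:g}  margin/radius={gamma / R:.4f}")
```

The raw margin grows a thousandfold, but the scale-free ratio does not change, which is why a bound can only depend on the margin through such a normalized quantity.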

Sample Complexity With Margin

Shattering With a Margin. [Figure panels: separated with large margin; separated, but not with large margin.]

Sample Complexity of Hard-SVM with Margin: What does this replace?

Soft-SVM 1

Soft-SVM: Equivalent Definition. SRM (structural risk minimization): the hypothesis is penalized by its norm.
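The Soft-SVM formulas are missing from the transcript. The block below gives the standard textbook form as a hedged reconstruction; the regularization parameter lambda, the slack variables xi_i, and the overall notation are assumptions and may not match the slides exactly.

```latex
% Soft-SVM: allow margin violations, paid for through slack variables.
\[
\min_{w,b,\xi}\ \ \lambda\|w\|^2+\frac{1}{m}\sum_{i=1}^{m}\xi_i
\quad\text{s.t.}\quad y_i\bigl(\langle w,x_i\rangle+b\bigr)\ge 1-\xi_i,\ \ \xi_i\ge 0\ \ \forall i
\]

% Equivalent definition: regularized minimization of the average hinge loss.
\[
\min_{w,b}\ \ \lambda\|w\|^2+\frac{1}{m}\sum_{i=1}^{m}\max\bigl\{0,\ 1-y_i\bigl(\langle w,x_i\rangle+b\bigr)\bigr\}
\]
```

The second form makes the SRM reading explicit: an empirical loss term plus a penalty on the norm of the hypothesis.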

Sample Complexity for Soft-SVM: No dimensionality dependence

What About Computational Complexity in High Dimension?

The Representer Theorem

Gram

The Kernel

Polynomial Kernels
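As a sketch of what a polynomial kernel computes, the snippet below checks that the degree-2 kernel K(x, z) = (1 + <x, z>)^2 equals an ordinary inner product after an explicit feature map. The specific kernel form, the feature map `psi_degree2` for 2-dimensional inputs, and the sample vectors are assumptions for illustration, since the slide's formulas are not in the transcript.

```python
import numpy as np

def poly_kernel(x, z, degree=2):
    """Polynomial kernel (1 + <x, z>)^degree."""
    return (1.0 + np.dot(x, z)) ** degree

def psi_degree2(x):
    """Explicit feature map for 2-D inputs whose inner product is the degree-2 kernel."""
    x1, x2 = x
    s = np.sqrt(2.0)
    return np.array([1.0, s * x1, s * x2, x1 ** 2, x2 ** 2, s * x1 * x2])

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])

k_implicit = poly_kernel(x, z)                  # never builds the features
k_explicit = psi_degree2(x) @ psi_degree2(z)    # builds all 6 features
print(k_implicit, k_explicit)
```

The kernel costs one inner product in the original space, whereas the explicit map grows quickly with the input dimension and the degree.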

Gaussian Kernels (RBF: Radial Basis Functions)
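A minimal sketch of a Gaussian kernel and its Gram matrix, assuming the common parameterization K(x, z) = exp(-||x - z||^2 / (2 sigma^2)); the function name `rbf_gram`, the bandwidth parameter `sigma`, and the three sample points are hypothetical.

```python
import numpy as np

def rbf_gram(X, sigma=1.0):
    """Gram matrix G[i, j] = exp(-||x_i - x_j||^2 / (2 * sigma^2))."""
    sq_norms = np.sum(X ** 2, axis=1)
    # ||x_i - x_j||^2 = ||x_i||^2 + ||x_j||^2 - 2 <x_i, x_j>
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative round-off
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
G = rbf_gram(X)
print(G)
# A valid kernel yields a symmetric positive semidefinite Gram matrix.
print(np.linalg.eigvalsh(G))
```

Nearby points get entries close to 1 and distant points get entries close to 0, so sigma controls how local the similarity measure is.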

Kernels As Prior Knowledge: If we think that positive examples can (almost) be separated by some ellipse, then we should use polynomials of degree 2. What should we do if we believe that we can classify a text message using words in a dictionary? A kernel encodes a measure of similarity between objects. It must be a valid inner product function.
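The ellipse example can be made concrete with a short sketch: points inside an axis-aligned ellipse are not linearly separable from points outside it in the plane, but they are separated by a linear function of degree-2 monomials. The particular ellipse x1^2/4 + x2^2 = 1, the feature ordering, the weight vector, and the sample points are all assumptions chosen for illustration.

```python
import numpy as np

def degree2_features(X):
    """Map each 2-D point to the monomials (1, x1, x2, x1^2, x2^2, x1*x2)."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([np.ones(len(X)), x1, x2, x1 ** 2, x2 ** 2, x1 * x2])

# Hypothetical points; label +1 inside the ellipse x1^2/4 + x2^2 = 1, else -1.
X = np.array([[0.0, 0.0], [1.0, 0.5], [3.0, 0.0], [0.0, 2.0], [1.9, 0.0]])
y = np.where(X[:, 0] ** 2 / 4.0 + X[:, 1] ** 2 < 1.0, 1, -1)

# In feature space the ellipse is a hyperplane: 1 - x1^2/4 - x2^2 = 0.
w = np.array([1.0, 0.0, 0.0, -0.25, -1.0, 0.0])
predictions = np.sign(degree2_features(X) @ w).astype(int)

print(y)
print(predictions)
```

Choosing the degree-2 polynomial kernel amounts to searching over exactly this family of conic-section boundaries without writing the features out.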

Solving SVMs Efficiently

SGD for SVM
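The slide's algorithm is not in the transcript, so the following is only a sketch of one common variant: stochastic subgradient descent on the homogeneous objective (lambda/2)||w||^2 + average hinge loss, with step size 1/(lambda t) and an averaged output. The function name `sgd_soft_svm`, the hyperparameters, the missing bias term, and the toy data are assumptions, and this may differ from the version presented in the lecture.

```python
import numpy as np

def sgd_soft_svm(X, y, lam=0.1, T=2000, seed=0):
    """SGD for homogeneous Soft-SVM; returns the average of the iterates."""
    rng = np.random.default_rng(seed)
    m, d = X.shape
    theta = np.zeros(d)   # running sum of y_i * x_i over margin violations
    w_sum = np.zeros(d)
    for t in range(1, T + 1):
        w = theta / (lam * t)          # current iterate, step size 1 / (lam * t)
        i = rng.integers(m)            # sample one training example uniformly
        if y[i] * (w @ X[i]) < 1:      # hinge loss is active on this example
            theta += y[i] * X[i]
        w_sum += w
    return w_sum / T

# Hypothetical linearly separable toy data.
X = np.array([[2.0, 1.0], [3.0, 2.0], [-2.0, -1.0], [-3.0, -2.0]])
y = np.array([1, 1, -1, -1])

w = sgd_soft_svm(X, y)
print(w, y * (X @ w))
```

Each step touches a single example, so the cost per iteration is linear in the dimension and independent of the sample size.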