ECE 5424: Introduction to Machine Learning

Slides:



Advertisements
Similar presentations
Image classification Given the bag-of-features representations of images from different classes, how do we learn a model for distinguishing them?
Advertisements

INTRODUCTION TO Machine Learning 2nd Edition
Classification / Regression Support Vector Machines
Linear Classifiers/SVMs
CHAPTER 10: Linear Discrimination
Crash Course on Machine Learning Part III
An Introduction of Support Vector Machine
Support Vector Machines
Support vector machine
Support Vector Machines (and Kernel Methods in general)
Image classification Given the bag-of-features representations of images from different classes, how do we learn a model for distinguishing them?
ECE 5984: Introduction to Machine Learning Dhruv Batra Virginia Tech Topics: –SVM –Multi-class SVMs –Neural Networks –Multi-layer Perceptron Readings:
Lecture outline Support vector machines. Support Vector Machines Find a linear hyperplane (decision boundary) that will separate the data.
Lecture 10: Support Vector Machines
Review Rong Jin. Comparison of Different Classification Models  The goal of all classifiers Predicating class label y for an input x Estimate p(y|x)
An Introduction to Support Vector Machines Martin Law.
Ch. Eick: Support Vector Machines: The Main Ideas Reading Material Support Vector Machines: 1.Textbook 2. First 3 columns of Smola/Schönkopf article on.
ECE 5984: Introduction to Machine Learning Dhruv Batra Virginia Tech Topics: –Classification: Naïve Bayes Readings: Barber
Classifiers Given a feature representation for images, how do we learn a model for distinguishing features from different classes? Zebra Non-zebra Decision.
Least Squares Support Vector Machine Classifiers J.A.K. Suykens and J. Vandewalle Presenter: Keira (Qi) Zhou.
An Introduction to Support Vector Machines (M. Law)
Recognition II Ali Farhadi. We have talked about Nearest Neighbor Naïve Bayes Logistic Regression Boosting.
Kernels Usman Roshan CS 675 Machine Learning. Feature space representation Consider two classes shown below Data cannot be separated by a hyperplane.
ECE 5984: Introduction to Machine Learning Dhruv Batra Virginia Tech Topics: –Unsupervised Learning: Kmeans, GMM, EM Readings: Barber
ECE 5984: Introduction to Machine Learning Dhruv Batra Virginia Tech Topics: –(Finish) Nearest Neighbour Readings: Barber 14 (kNN)
ECE 5984: Introduction to Machine Learning Dhruv Batra Virginia Tech Topics: –Classification: Logistic Regression –NB & LR connections Readings: Barber.
CSE4334/5334 DATA MINING CSE4334/5334 Data Mining, Fall 2014 Department of Computer Science and Engineering, University of Texas at Arlington Chengkai.
Support Vector Machine Debapriyo Majumdar Data Mining – Fall 2014 Indian Statistical Institute Kolkata November 3, 2014.
ECE 5984: Introduction to Machine Learning Dhruv Batra Virginia Tech Topics: –Regression Readings: Barber 17.1, 17.2.
CS 2750: Machine Learning Support Vector Machines Prof. Adriana Kovashka University of Pittsburgh February 17, 2016.
ECE 5984: Introduction to Machine Learning Dhruv Batra Virginia Tech Topics: –(Finish) Model selection –Error decomposition –Bias-Variance Tradeoff –Classification:
Non-separable SVM's, and non-linear classification using kernels Jakob Verbeek December 16, 2011 Course website:
Support Vector Machines (SVMs) Chapter 5 (Duda et al.) CS479/679 Pattern Recognition Dr. George Bebis.
Support Vector Machine Slides from Andrew Moore and Mingyue Tan.
Support Vector Machines
Neural networks and support vector machines
PREDICT 422: Practical Machine Learning
Support Vector Machine
Fun with Hyperplanes: Perceptrons, SVMs, and Friends
Large Margin classifiers
ECE 5424: Introduction to Machine Learning
ECE 5424: Introduction to Machine Learning
ECE 5424: Introduction to Machine Learning
ECE 5424: Introduction to Machine Learning
ECE 5424: Introduction to Machine Learning
Support Vector Machines
ECE 5424: Introduction to Machine Learning
ECE 5424: Introduction to Machine Learning
ECE 5424: Introduction to Machine Learning
ECE 5424: Introduction to Machine Learning
ECE 5424: Introduction to Machine Learning
An Introduction to Support Vector Machines
Kernels Usman Roshan.
An Introduction to Support Vector Machines
ECE 5424: Introduction to Machine Learning
Support Vector Machines Introduction to Data Mining, 2nd Edition by
Support Vector Machines
Pattern Recognition CS479/679 Pattern Recognition Dr. George Bebis
Statistical Learning Dong Liu Dept. EEIS, USTC.
CS 2750: Machine Learning Support Vector Machines
ECE 5424: Introduction to Machine Learning
Large Scale Support Vector Machines
Machine Learning Week 3.
Support Vector Machines and Kernels
Support Vector Machine I
Usman Roshan CS 675 Machine Learning
CSCE833 Machine Learning Lecture 9 Linear Discriminant Analysis
COSC 4368 Machine Learning Organization
Linear Discrimination
Support Vector Machines
Presentation transcript:

ECE 5424: Introduction to Machine Learning Topics: SVM soft & hard margin comparison to Logistic Regression Readings: Barber 17.5 Stefan Lee Virginia Tech

New Topic (C) Dhruv Batra

Generative vs. Discriminative Generative Approach (Naïve Bayes) Estimate p(x|y) and p(y) Use Bayes Rule to predict y Discriminative Approach Estimate p(y|x) directly (Logistic Regression) Learn “discriminant” function f(x) (Support Vector Machine) (C) Dhruv Batra

Linear classifiers – Which line is better? w.x = j w(j) x(j)

Margin x+ x- margin 2 (C) Dhruv Batra Slide Credit: Carlos Guestrin w.x + b = +1 w.x + b = 0 w.x + b = -1 x+ x- margin 2 (C) Dhruv Batra Slide Credit: Carlos Guestrin

Support vector machines (SVMs) w.x + b = +1 w.x + b = -1 w.x + b = 0 margin 2 Solve efficiently by quadratic programming (QP) Well-studied solution algorithms Hyperplane defined by support vectors (C) Dhruv Batra Slide Credit: Carlos Guestrin

What if the data is not linearly separable? Slide Credit: Carlos Guestrin

What if the data is not linearly separable? Minimize w.w and number of training mistakes 0/1 loss Slack penalty C Not QP anymore Also doesn’t distinguish near misses and really bad mistakes (C) Dhruv Batra Slide Credit: Carlos Guestrin

Slack variables – Hinge loss If margin >= 1, don’t care If margin < 1, pay linear penalty (C) Dhruv Batra Slide Credit: Carlos Guestrin

Soft Margin SVM Matlab Demo (C) Dhruv Batra

Side note: What’s the difference between SVMs and logistic regression? Log loss: SVM: Hinge Loss LR: Logistic Loss (C) Dhruv Batra Slide Credit: Carlos Guestrin

(C) Dhruv Batra Slide Credit: Andrew Moore

(C) Dhruv Batra Slide Credit: Andrew Moore

(C) Dhruv Batra Slide Credit: Andrew Moore

Does this always work? In a way, yes (C) Dhruv Batra Slide Credit: Blaschko & Lampert

Caveat (C) Dhruv Batra Slide Credit: Blaschko & Lampert

Kernel Trick One of the most interesting and exciting advancement in the last 2 decades of machine learning The “kernel trick” High dimensional feature spaces at no extra cost! But first, a detour Constrained optimization! (C) Dhruv Batra Slide Credit: Carlos Guestrin