ECE 5424: Introduction to Machine Learning

Slides:

Advertisements

Similar presentations

Image classification Given the bag-of-features representations of images from different classes, how do we learn a model for distinguishing them?

Advertisements

INTRODUCTION TO Machine Learning 2nd Edition

Classification / Regression Support Vector Machines

Linear Classifiers/SVMs

CHAPTER 10: Linear Discrimination

Crash Course on Machine Learning Part III

An Introduction of Support Vector Machine

Support Vector Machines

Support vector machine

Support Vector Machines (and Kernel Methods in general)

Image classification Given the bag-of-features representations of images from different classes, how do we learn a model for distinguishing them?

ECE 5984: Introduction to Machine Learning Dhruv Batra Virginia Tech Topics: –SVM –Multi-class SVMs –Neural Networks –Multi-layer Perceptron Readings:

Lecture outline Support vector machines. Support Vector Machines Find a linear hyperplane (decision boundary) that will separate the data.

Lecture 10: Support Vector Machines

Review Rong Jin. Comparison of Different Classification Models  The goal of all classifiers Predicating class label y for an input x Estimate p(y|x)

An Introduction to Support Vector Machines Martin Law.

Ch. Eick: Support Vector Machines: The Main Ideas Reading Material Support Vector Machines: 1.Textbook 2. First 3 columns of Smola/Schönkopf article on.

ECE 5984: Introduction to Machine Learning Dhruv Batra Virginia Tech Topics: –Classification: Naïve Bayes Readings: Barber

Classifiers Given a feature representation for images, how do we learn a model for distinguishing features from different classes? Zebra Non-zebra Decision.

Least Squares Support Vector Machine Classifiers J.A.K. Suykens and J. Vandewalle Presenter: Keira (Qi) Zhou.

An Introduction to Support Vector Machines (M. Law)

Recognition II Ali Farhadi. We have talked about Nearest Neighbor Naïve Bayes Logistic Regression Boosting.

Kernels Usman Roshan CS 675 Machine Learning. Feature space representation Consider two classes shown below Data cannot be separated by a hyperplane.

ECE 5984: Introduction to Machine Learning Dhruv Batra Virginia Tech Topics: –Unsupervised Learning: Kmeans, GMM, EM Readings: Barber

ECE 5984: Introduction to Machine Learning Dhruv Batra Virginia Tech Topics: –(Finish) Nearest Neighbour Readings: Barber 14 (kNN)

ECE 5984: Introduction to Machine Learning Dhruv Batra Virginia Tech Topics: –Classification: Logistic Regression –NB & LR connections Readings: Barber.

CSE4334/5334 DATA MINING CSE4334/5334 Data Mining, Fall 2014 Department of Computer Science and Engineering, University of Texas at Arlington Chengkai.

Support Vector Machine Debapriyo Majumdar Data Mining – Fall 2014 Indian Statistical Institute Kolkata November 3, 2014.

ECE 5984: Introduction to Machine Learning Dhruv Batra Virginia Tech Topics: –Regression Readings: Barber 17.1, 17.2.

CS 2750: Machine Learning Support Vector Machines Prof. Adriana Kovashka University of Pittsburgh February 17, 2016.

ECE 5984: Introduction to Machine Learning Dhruv Batra Virginia Tech Topics: –(Finish) Model selection –Error decomposition –Bias-Variance Tradeoff –Classification:

Non-separable SVM's, and non-linear classification using kernels Jakob Verbeek December 16, 2011 Course website:

Support Vector Machines (SVMs) Chapter 5 (Duda et al.) CS479/679 Pattern Recognition Dr. George Bebis.

Support Vector Machine Slides from Andrew Moore and Mingyue Tan.

Support Vector Machines

Neural networks and support vector machines

PREDICT 422: Practical Machine Learning

Support Vector Machine

Fun with Hyperplanes: Perceptrons, SVMs, and Friends

Large Margin classifiers

ECE 5424: Introduction to Machine Learning

ECE 5424: Introduction to Machine Learning

ECE 5424: Introduction to Machine Learning

ECE 5424: Introduction to Machine Learning

ECE 5424: Introduction to Machine Learning

Support Vector Machines

ECE 5424: Introduction to Machine Learning

ECE 5424: Introduction to Machine Learning

ECE 5424: Introduction to Machine Learning

ECE 5424: Introduction to Machine Learning

ECE 5424: Introduction to Machine Learning

An Introduction to Support Vector Machines

Kernels Usman Roshan.

An Introduction to Support Vector Machines

ECE 5424: Introduction to Machine Learning

Support Vector Machines Introduction to Data Mining, 2nd Edition by

Support Vector Machines

Pattern Recognition CS479/679 Pattern Recognition Dr. George Bebis

Statistical Learning Dong Liu Dept. EEIS, USTC.

CS 2750: Machine Learning Support Vector Machines

ECE 5424: Introduction to Machine Learning

Large Scale Support Vector Machines

Machine Learning Week 3.

Support Vector Machines and Kernels

Support Vector Machine I

Usman Roshan CS 675 Machine Learning

CSCE833 Machine Learning Lecture 9 Linear Discriminant Analysis

COSC 4368 Machine Learning Organization

Linear Discrimination

Support Vector Machines

Presentation transcript:

ECE 5424: Introduction to Machine Learning Topics: SVM soft & hard margin comparison to Logistic Regression Readings: Barber 17.5 Stefan Lee Virginia Tech

New Topic (C) Dhruv Batra

Generative vs. Discriminative Generative Approach (Naïve Bayes) Estimate p(x|y) and p(y) Use Bayes Rule to predict y Discriminative Approach Estimate p(y|x) directly (Logistic Regression) Learn “discriminant” function f(x) (Support Vector Machine) (C) Dhruv Batra

Linear classifiers – Which line is better? w.x = j w(j) x(j)

Margin x+ x- margin 2 (C) Dhruv Batra Slide Credit: Carlos Guestrin w.x + b = +1 w.x + b = 0 w.x + b = -1 x+ x- margin 2 (C) Dhruv Batra Slide Credit: Carlos Guestrin

Support vector machines (SVMs) w.x + b = +1 w.x + b = -1 w.x + b = 0 margin 2 Solve efficiently by quadratic programming (QP) Well-studied solution algorithms Hyperplane defined by support vectors (C) Dhruv Batra Slide Credit: Carlos Guestrin

What if the data is not linearly separable? Slide Credit: Carlos Guestrin

What if the data is not linearly separable? Minimize w.w and number of training mistakes 0/1 loss Slack penalty C Not QP anymore Also doesn’t distinguish near misses and really bad mistakes (C) Dhruv Batra Slide Credit: Carlos Guestrin

Slack variables – Hinge loss If margin >= 1, don’t care If margin < 1, pay linear penalty (C) Dhruv Batra Slide Credit: Carlos Guestrin

Soft Margin SVM Matlab Demo (C) Dhruv Batra

Side note: What’s the difference between SVMs and logistic regression? Log loss: SVM: Hinge Loss LR: Logistic Loss (C) Dhruv Batra Slide Credit: Carlos Guestrin

(C) Dhruv Batra Slide Credit: Andrew Moore

(C) Dhruv Batra Slide Credit: Andrew Moore

(C) Dhruv Batra Slide Credit: Andrew Moore

Does this always work? In a way, yes (C) Dhruv Batra Slide Credit: Blaschko & Lampert

Caveat (C) Dhruv Batra Slide Credit: Blaschko & Lampert

Kernel Trick One of the most interesting and exciting advancement in the last 2 decades of machine learning The “kernel trick” High dimensional feature spaces at no extra cost! But first, a detour Constrained optimization! (C) Dhruv Batra Slide Credit: Carlos Guestrin