ECE 5424: Introduction to Machine Learning

Slides:

Advertisements

Similar presentations

Introduction to Support Vector Machines (SVM)

Advertisements

ECG Signal processing (2)

Image classification Given the bag-of-features representations of images from different classes, how do we learn a model for distinguishing them?

Support Vector Machine & Its Applications Mingyue Tan The University of British Columbia Nov 26, 2004 A portion (1/3) of the slides are taken from Prof.

An Introduction of Support Vector Machine

Classification / Regression Support Vector Machines

Crash Course on Machine Learning Part III

An Introduction of Support Vector Machine

Support Vector Machines

Computer vision: models, learning and inference Chapter 8 Regression.

Support Vector Machine

Support Vector Machines Joseph Gonzalez TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.: A AA A AA.

Image classification Given the bag-of-features representations of images from different classes, how do we learn a model for distinguishing them?

ECE 5984: Introduction to Machine Learning Dhruv Batra Virginia Tech Topics: –SVM –Multi-class SVMs –Neural Networks –Multi-layer Perceptron Readings:

Proximal Support Vector Machine Classifiers KDD 2001 San Francisco August 26-29, 2001 Glenn Fung & Olvi Mangasarian Data Mining Institute University of.

CS 4700: Foundations of Artificial Intelligence

A Kernel-based Support Vector Machine by Peter Axelberg and Johan Löfhede.

SVMs Finalized. Where we are Last time Support vector machines in grungy detail The SVM objective function and QP Today Last details on SVMs Putting it.

Lecture 10: Support Vector Machines

An Introduction to Support Vector Machines Martin Law.

Support Vector Machines Reading: Ben-Hur and Weston, “A User’s Guide to Support Vector Machines” (linked from class web page)

Classifiers Given a feature representation for images, how do we learn a model for distinguishing features from different classes? Zebra Non-zebra Decision.

An Introduction to Support Vector Machines (M. Law)

Kernels Usman Roshan CS 675 Machine Learning. Feature space representation Consider two classes shown below Data cannot be separated by a hyperplane.

CS 478 – Tools for Machine Learning and Data Mining SVM.

Support Vector Machines. Notation Assume a binary classification problem. –Instances are represented by vector x   n. –Training examples: x = (x 1,

ECE 5984: Introduction to Machine Learning Dhruv Batra Virginia Tech Topics: –Ensemble Methods: Bagging, Boosting Readings: Murphy 16.4; Hastie 16.

Text Classification using Support Vector Machine Debapriyo Majumdar Information Retrieval – Spring 2015 Indian Statistical Institute Kolkata.

CS 2750: Machine Learning Support Vector Machines Prof. Adriana Kovashka University of Pittsburgh February 17, 2016.

Learning by Loss Minimization. Machine learning: Learn a Function from Examples Function: Examples: – Supervised: – Unsupervised: – Semisuprvised:

Support Vector Machines Reading: Textbook, Chapter 5 Ben-Hur and Weston, A User’s Guide to Support Vector Machines (linked from class web page)

ECE 5984: Introduction to Machine Learning Dhruv Batra Virginia Tech Topics: –(Finish) Model selection –Error decomposition –Bias-Variance Tradeoff –Classification:

Non-separable SVM's, and non-linear classification using kernels Jakob Verbeek December 16, 2011 Course website:

PREDICT 422: Practical Machine Learning

Support Vector Machine

ECE 5424: Introduction to Machine Learning

ECE 5424: Introduction to Machine Learning

ECE 5424: Introduction to Machine Learning

ECE 5424: Introduction to Machine Learning

Support Vector Machines and Kernels

Geometrical intuition behind the dual problem

Support Vector Machines

ECE 5424: Introduction to Machine Learning

ECE 5424: Introduction to Machine Learning

ECE 5424: Introduction to Machine Learning

An Introduction to Support Vector Machines

Kernels Usman Roshan.

An Introduction to Support Vector Machines

LINEAR AND NON-LINEAR CLASSIFICATION USING SVM and KERNELS

ECE 5424: Introduction to Machine Learning

Support Vector Machines

CS 2750: Machine Learning Support Vector Machines

ECE 5424: Introduction to Machine Learning

CSSE463: Image Recognition Day 14

CSSE463: Image Recognition Day 14

Recitation 6: Kernel SVM

CSSE463: Image Recognition Day 15

CSSE463: Image Recognition Day 14

CSSE463: Image Recognition Day 15

CSSE463: Image Recognition Day 14

Support Vector Machines and Kernels

Support Vector Machine I

Usman Roshan CS 675 Machine Learning

CSCE833 Machine Learning Lecture 9 Linear Discriminant Analysis

CSSE463: Image Recognition Day 14

CSSE463: Image Recognition Day 15

Linear Discrimination

SVMs for Document Ranking

Support Vector Machines

Introduction to Machine Learning

Presentation transcript:

ECE 5424: Introduction to Machine Learning Topics: SVM SVM dual & kernels Readings: Barber 17.5 Stefan Lee Virginia Tech

Lagrangian Duality On paper (C) Dhruv Batra

Dual SVM derivation (1) – the linearly separable case (C) Dhruv Batra Slide Credit: Carlos Guestrin

Dual SVM derivation (1) – the linearly separable case (C) Dhruv Batra Slide Credit: Carlos Guestrin

Dual SVM formulation – the linearly separable case (C) Dhruv Batra Slide Credit: Carlos Guestrin

Dual SVM formulation – the non-separable case (C) Dhruv Batra Slide Credit: Carlos Guestrin

Dual SVM formulation – the non-separable case (C) Dhruv Batra Slide Credit: Carlos Guestrin

Why did we learn about the dual SVM? Builds character! Exposes structure about the problem There are some quadratic programming algorithms that can solve the dual faster than the primal The “kernel trick”!!! (C) Dhruv Batra Slide Credit: Carlos Guestrin

Dual SVM interpretation: Sparsity w.x + b = +1 w.x + b = -1 w.x + b = 0 margin 2 (C) Dhruv Batra Slide Credit: Carlos Guestrin

Dual formulation only depends on dot-products, not on w! (C) Dhruv Batra

Dot-product of polynomials Vector of Monomials of degree m (C) Dhruv Batra Slide Credit: Carlos Guestrin

Higher order polynomials d – input features m – degree of polynomial m=4 number of monomial terms grows fast! m = 6, d = 100 D = about 1.6 billion terms m=3 m=2 number of input dimensions (d) (C) Dhruv Batra Slide Credit: Carlos Guestrin

Common kernels Polynomials of degree d Polynomials of degree up to d Gaussian kernel / Radial Basis Function Sigmoid 2 (C) Dhruv Batra Slide Credit: Carlos Guestrin

Kernel Demo Demo http://www.eee.metu.edu.tr/~alatan/Courses/Demo/AppletSVM.html (C) Dhruv Batra

What is a kernel? k: X x X  R Any measure of “similarity” between two inputs Mercer Kernel / Positive Semi-Definite Kernel Often just called “kernel” (C) Dhruv Batra

(C) Dhruv Batra Slide Credit: Blaschko & Lampert

(C) Dhruv Batra Slide Credit: Blaschko & Lampert

(C) Dhruv Batra Slide Credit: Blaschko & Lampert

(C) Dhruv Batra Slide Credit: Blaschko & Lampert

Finally: the “kernel trick”! Never represent features explicitly Compute dot products in closed form Constant-time high-dimensional dot-products for many classes of features Very interesting theory – Reproducing Kernel Hilbert Spaces (C) Dhruv Batra Slide Credit: Carlos Guestrin

Kernels in Computer Vision Features x = histogram (of color, texture, etc) Common Kernels Intersection Kernel Chi-square Kernel (C) Dhruv Batra

What about at classification time For a new input x, if we need to represent (x), we are in trouble! Recall classifier: sign(w.(x)+b) Using kernels we are fine! (C) Dhruv Batra Slide Credit: Carlos Guestrin

Kernels in logistic regression Define weights in terms of support vectors: Derive simple gradient descent rule on i

Kernels Kernel Logistic Regression Kernel Least Squares Kernel PCA … (C) Dhruv Batra