INTRODUCTION TO Machine Learning 2nd Edition


Lecture Slides for INTRODUCTION TO Machine Learning 2nd Edition
ETHEM ALPAYDIN © The MIT Press, 2010
alpaydin@boun.edu.tr
http://www.cmpe.boun.edu.tr/~ethem/i2ml2e

CHAPTER 13: Kernel Machines

Kernel Machines
- Discriminant-based: no need to estimate densities first
- Define the discriminant in terms of support vectors
- The use of kernel functions: application-specific measures of similarity
- No need to represent instances as vectors
- Convex optimization problems with a unique solution

Optimal Separating Hyperplane (Cortes and Vapnik, 1995; Vapnik, 1995)
Given a sample X = {x^t, r^t} with r^t = +1 if x^t ∈ C1 and r^t = -1 if x^t ∈ C2, find w and w0 such that r^t (w^T x^t + w0) ≥ +1 for all t

Margin
- Distance from the discriminant to the closest instances on either side
- Distance of x^t to the hyperplane is |w^T x^t + w0| / ||w||
- We require r^t (w^T x^t + w0) / ||w|| ≥ ρ for all t
- For a unique solution, fix ρ||w|| = 1; then maximizing the margin amounts to minimizing ||w||:
  min (1/2)||w||^2 subject to r^t (w^T x^t + w0) ≥ +1 for all t
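To make the geometry concrete, here is a minimal NumPy sketch; the hyperplane (w, w0) and the labeled instances are made up for illustration and are not from the slides.

```python
import numpy as np

# Hypothetical hyperplane w^T x + w0 = 0 and a few labeled instances
w = np.array([2.0, 1.0])
w0 = -1.0
X = np.array([[1.5, 1.0], [2.0, 0.5], [-0.5, 0.0], [0.0, -1.0]])
r = np.array([+1, +1, -1, -1])

# Distance of x^t to the hyperplane: |w^T x^t + w0| / ||w||
dist = np.abs(X @ w + w0) / np.linalg.norm(w)

# The margin is the distance from the hyperplane to the closest instance
print("distances:", dist)
print("margin:   ", dist.min())
```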

Margin (figure: the optimal separating hyperplane and the margin on either side)

The constrained problem is rewritten with Lagrange multipliers α^t ≥ 0:
L_p = (1/2)||w||^2 - Σ_t α^t [r^t (w^T x^t + w0) - 1]
Setting the derivatives with respect to w and w0 to zero gives w = Σ_t α^t r^t x^t and Σ_t α^t r^t = 0; substituting back yields the dual to be maximized:
L_d = Σ_t α^t - (1/2) Σ_t Σ_s α^t α^s r^t r^s (x^t)^T x^s subject to Σ_t α^t r^t = 0 and α^t ≥ 0

- Most α^t are 0; only a small number have α^t > 0, and the corresponding x^t are the support vectors
- The discriminant is written in terms of the support vectors only: g(x) = w^T x + w0 = Σ_t α^t r^t (x^t)^T x + w0
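As an illustration of how few instances end up with α^t > 0, a short scikit-learn sketch on synthetic data (the dataset and C are arbitrary choices). In scikit-learn, dual_coef_ stores the products α^t r^t of the support vectors.

```python
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two well-separated synthetic clusters
X, y = make_blobs(n_samples=200, centers=2, random_state=0)

svm = SVC(kernel="linear", C=1.0).fit(X, y)

# Only the support vectors have nonzero alpha; all other instances are dropped
print("training instances:", len(X))
print("support vectors   :", len(svm.support_vectors_))
print("alpha^t * r^t     :", svm.dual_coef_)  # shape (1, n_support_vectors)
```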

Soft Margin Hyperplane
- If the data are not linearly separable, introduce slack variables ξ^t ≥ 0 and relax the constraints to r^t (w^T x^t + w0) ≥ 1 - ξ^t
- Soft error: Σ_t ξ^t
- New primal: L_p = (1/2)||w||^2 + C Σ_t ξ^t subject to the relaxed constraints, where C penalizes margin violations
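The penalty C trades margin width against the soft error Σ_t ξ^t. A small comparison sketch on overlapping synthetic data (the values of C are arbitrary) that just counts support vectors:

```python
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Overlapping classes, so some slack xi^t is unavoidable
X, y = make_blobs(n_samples=300, centers=2, cluster_std=3.0, random_state=1)

for C in (0.01, 1.0, 100.0):
    svm = SVC(kernel="linear", C=C).fit(X, y)
    # Smaller C tolerates more margin violations and keeps more support vectors
    print(f"C={C:<6} support vectors: {len(svm.support_)}")
```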

Hinge Loss
L_hinge(y^t, r^t) = 0 if y^t r^t ≥ 1, and 1 - y^t r^t otherwise
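A short NumPy sketch of the hinge loss as defined above, evaluated on a few made-up discriminant outputs y^t and labels r^t:

```python
import numpy as np

def hinge_loss(y, r):
    # 0 when y^t r^t >= 1 (correct side, outside the margin), 1 - y^t r^t otherwise
    return np.maximum(0.0, 1.0 - y * r)

y = np.array([ 2.0,  0.3, -0.5, -1.5])   # discriminant outputs y^t
r = np.array([+1,   +1,   +1,   -1  ])   # true labels r^t
print(hinge_loss(y, r))                  # [0.  0.7 1.5 0. ]
```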

ν-SVM
ν ∈ (0, 1] replaces C and controls the fraction of support vectors: it is an upper bound on the fraction of margin errors and a lower bound on the fraction of support vectors
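scikit-learn exposes this formulation as NuSVC; a brief sketch (synthetic data and ν values chosen arbitrarily) showing how ν bounds the fraction of support vectors from below:

```python
from sklearn.datasets import make_blobs
from sklearn.svm import NuSVC

X, y = make_blobs(n_samples=300, centers=2, cluster_std=2.5, random_state=0)

for nu in (0.05, 0.2, 0.5):
    svm = NuSVC(nu=nu, kernel="linear").fit(X, y)
    frac = len(svm.support_) / len(X)
    print(f"nu={nu:<4} fraction of support vectors: {frac:.2f}")
```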

Kernel Trick
- Preprocess input x by the basis functions z = φ(x); then g(z) = w^T z, i.e. g(x) = w^T φ(x)
- The SVM solution is w = Σ_t α^t r^t φ(x^t), so g(x) = Σ_t α^t r^t φ(x^t)^T φ(x) = Σ_t α^t r^t K(x^t, x)
- The kernel K(x^t, x) = φ(x^t)^T φ(x) is evaluated directly, without ever computing φ(x)
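A small NumPy check of the trick for the degree-2 polynomial kernel: the explicit basis expansion φ(x), written out by hand for 2-D inputs, gives the same inner product as evaluating K(x, y) = (x^T y + 1)^2 directly. The test vectors are arbitrary.

```python
import numpy as np

def phi(x):
    # Explicit feature map for K(x, y) = (x^T y + 1)^2 with 2-D inputs
    x1, x2 = x
    return np.array([1.0,
                     np.sqrt(2) * x1, np.sqrt(2) * x2,
                     x1 ** 2, x2 ** 2,
                     np.sqrt(2) * x1 * x2])

def K(x, y):
    return (x @ y + 1.0) ** 2

x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])
print(phi(x) @ phi(y))   # 4.0, via the explicit 6-dimensional expansion
print(K(x, y))           # 4.0, same value without ever computing phi
```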

Vectorial Kernels
Polynomials of degree q: K(x^t, x) = (x^T x^t + 1)^q

Vectorial Kernels
Radial-basis functions: K(x^t, x) = exp(-||x^t - x||^2 / (2 s^2))
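A short scikit-learn sketch comparing the two kernels above on data a linear discriminant cannot separate (the two-moons toy set; degree, width and C are arbitrary choices). Note that scikit-learn parameterizes the Gaussian kernel with gamma = 1/(2 s^2).

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_moons(n_samples=400, noise=0.2, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for name, svm in [("linear  ", SVC(kernel="linear", C=1.0)),
                  ("poly q=3", SVC(kernel="poly", degree=3, coef0=1.0, C=1.0)),
                  ("rbf     ", SVC(kernel="rbf", gamma=0.5, C=1.0))]:
    svm.fit(X_tr, y_tr)
    print(f"{name} test accuracy: {svm.score(X_te, y_te):.2f}")
```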

Defining kernels
- Kernel "engineering": defining good measures of similarity
- String kernels, graph kernels, image kernels, ...
- Empirical kernel map: define a set of templates m_i and a score function s(x, m_i), let f(x^t) = [s(x^t, m_1), s(x^t, m_2), ..., s(x^t, m_M)], and take K(x, x^t) = f(x)^T f(x^t)
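A sketch of the empirical kernel map: the templates m_i are (arbitrarily) a few training instances, the score s(x, m_i) is a negative squared distance, and the resulting Gram matrix is handed to an SVM as a precomputed kernel. All of these choices are illustrative, not from the slides.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=200, centers=2, random_state=0)

# Templates m_i: here simply the first M training instances
M = 10
templates = X[:M]

def f(A):
    # f(x) = [s(x, m_1), ..., s(x, m_M)] with s a similarity score
    # (here: negative squared Euclidean distance, an arbitrary choice)
    return -((A[:, None, :] - templates[None, :, :]) ** 2).sum(axis=2)

# K(x, x^t) = f(x)^T f(x^t), given to the SVM as a precomputed Gram matrix
F = f(X)
K = F @ F.T
svm = SVC(kernel="precomputed").fit(K, y)
print("training accuracy:", svm.score(K, y))
```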

Multiple Kernel Learning
- Fixed kernel combination: e.g. K(x, x^t) = K1(x, x^t) + K2(x, x^t) or K(x, x^t) = K1(x, x^t) K2(x, x^t); sums and products of valid kernels are again valid kernels
- Adaptive kernel combination: K(x, x^t) = Σ_i η_i K_i(x, x^t), with the combination weights η_i learned together with the discriminant
- Localized kernel combination: the kernel weights η_i(x) depend on the input, so different kernels dominate in different regions
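A minimal sketch of a fixed combination: two Gram matrices (a linear and a Gaussian kernel, with weights chosen by hand) are summed and used as a precomputed kernel. Dataset, kernels, and weights are all arbitrary here.

```python
from sklearn.datasets import make_moons
from sklearn.metrics.pairwise import linear_kernel, rbf_kernel
from sklearn.svm import SVC

X, y = make_moons(n_samples=300, noise=0.2, random_state=0)

# Fixed combination K = eta1 * K1 + eta2 * K2 (weights fixed in advance)
eta1, eta2 = 0.3, 0.7
K = eta1 * linear_kernel(X, X) + eta2 * rbf_kernel(X, X, gamma=0.5)

svm = SVC(kernel="precomputed").fit(K, y)
print("training accuracy:", svm.score(K, y))
```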

Multiclass Kernel Machines
- 1-vs-all
- Pairwise separation
- Error-correcting output codes (section 17.5)
- Single multiclass optimization
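A brief sketch of the 1-vs-all strategy using an explicit wrapper (scikit-learn's SVC itself uses pairwise, one-vs-one separation internally; OneVsRestClassifier instead trains one binary SVM per class). The dataset and parameters are arbitrary.

```python
from sklearn.datasets import load_iris
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# One binary SVM per class; each separates its class from all the others
ovr = OneVsRestClassifier(SVC(kernel="rbf", gamma="scale", C=1.0)).fit(X, y)
print("number of binary SVMs:", len(ovr.estimators_))  # 3 classes -> 3 machines
print("training accuracy    :", ovr.score(X, y))
```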

SVM for Regression
- Use a linear model (possibly kernelized): f(x) = w^T x + w0
- Use the ε-sensitive error function: e_ε(r^t, f(x^t)) = 0 if |r^t - f(x^t)| < ε, and |r^t - f(x^t)| - ε otherwise
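A short scikit-learn sketch of ε-sensitive regression (SVR) on a synthetic 1-D problem; the kernel, C, and ε are arbitrary choices. Errors smaller than ε are ignored, so only instances on or outside the ε-tube become support vectors.

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 5, size=(80, 1)), axis=0)
r = np.sin(X).ravel() + rng.normal(scale=0.1, size=80)

svr = SVR(kernel="rbf", C=10.0, epsilon=0.1).fit(X, r)
print("support vectors:", len(svr.support_), "of", len(X))
print("prediction at x=2.5:", svr.predict([[2.5]])[0])
```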

Kernel Regression
- Polynomial kernel
- Gaussian kernel

One-Class Kernel Machines
- Consider a sphere with center a and radius R that should enclose most of the data: minimize R^2 + C Σ_t ξ^t subject to ||x^t - a||^2 ≤ R^2 + ξ^t and ξ^t ≥ 0
- Instances outside the sphere incur slack ξ^t; kernelizing the inner products gives a one-class kernel machine for outlier and novelty detection
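A kernelized one-class machine of this flavor is available as OneClassSVM in scikit-learn; a minimal sketch on made-up data (Gaussian inliers plus a few obvious outliers, with arbitrary ν and kernel width) that flags points falling outside the learned boundary:

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 2))                 # "normal" data around the origin
X_test = np.array([[0.1, -0.2], [4.0, 4.0], [-5.0, 0.0]])

occ = OneClassSVM(kernel="rbf", nu=0.05, gamma=0.5).fit(X_train)
print(occ.predict(X_test))   # +1 = inside the boundary, -1 = outlier
```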

Kernel Dimensionality Reduction
- Kernel PCA does PCA on the kernel matrix (equivalent to canonical PCA when a linear kernel is used)
- Kernel LDA works in the same way, kernelizing linear discriminant analysis
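A small scikit-learn sketch of kernel PCA on data with nonlinear structure (concentric circles; the kernel and gamma are arbitrary choices). With kernel="linear" it reduces to canonical PCA, as the slide notes.

```python
from sklearn.datasets import make_circles
from sklearn.decomposition import KernelPCA

X, y = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)

# PCA on the RBF kernel matrix; the two circles separate along the first component
kpca = KernelPCA(n_components=2, kernel="rbf", gamma=2.0)
Z = kpca.fit_transform(X)
print("projected shape:", Z.shape)
print("mean of first component per class:",
      Z[y == 0, 0].mean(), Z[y == 1, 0].mean())
```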