ADVANCED TOPIC: KERNELS

Presentation transcript:

1 ADVANCED TOPIC: KERNELS

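2 The kernel trick. Remember in our alternate perceptron: $v_k = \sum_{j=1}^{k} y_{i_j} x_{i_j}$, where $i_1,\dots,i_k$ are the mistakes… so: $x \cdot v_k = \sum_{j=1}^{k} y_{i_j}\,(x \cdot x_{i_j})$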
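The mistake-driven form on this slide can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard perceptron update in which the weight vector is the signed sum of the examples it got wrong; the names `train_dual_perceptron` and `predict` and the toy data are hypothetical, not from the slides.

```python
def dot(u, v):
    """Ordinary inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def train_dual_perceptron(examples, epochs=10):
    """Return the list of mistakes (x_i, y_i) instead of an explicit weight vector.

    The implicit weight vector is the sum of y_i * x_i over the mistakes,
    so a score is a sum of dot products with the stored mistakes.
    """
    mistakes = []
    for _ in range(epochs):
        errors = 0
        for x, y in examples:
            score = sum(y_i * dot(x_i, x) for x_i, y_i in mistakes)
            if y * score <= 0:          # wrong (or undecided): record a mistake
                mistakes.append((x, y))
                errors += 1
        if errors == 0:                 # a full clean pass: stop early
            break
    return mistakes


def predict(mistakes, x):
    """Classify x using only dot products with the stored mistakes."""
    score = sum(y_i * dot(x_i, x) for x_i, y_i in mistakes)
    return 1 if score > 0 else -1


# Toy, linearly separable data (assumed for illustration only).
data = [((2.0, 1.0), 1), ((1.0, 3.0), 1), ((-1.0, -2.0), -1), ((-3.0, -1.0), -1)]
mistakes = train_dual_perceptron(data)
```

Note that `predict` never touches a weight vector directly; it only calls `dot`, which is the hook the following slides exploit.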

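3 The kernel trick, cont'd. Since $v_k = \sum_{j} y_{i_j} x_{i_j}$, where $i_1,\dots,i_k$ are the mistakes, then $x \cdot v_k = \sum_{j} y_{i_j}\,(x \cdot x_{i_j})$. And this form has some advantages: everything is in terms of dot products. Consider a preprocessor that replaces every x with x' to include, directly in the example, all the pairwise variable interactions, so what is learned is a vector v': $v'_k = \sum_{j} y_{i_j} x'_{i_j}$. I can stick my preprocessor here, just before the dot product gets called: $x' \cdot v'_k = \sum_{j} y_{i_j}\,(x' \cdot x'_{i_j})$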

4 The kernel trick, cont'd. A voted perceptron over vectors like u, v is a linear function of the input features. Replacing u with u' would lead to non-linear functions of those features: f(x, y, xy, x², …)

5 The kernel trick, cont'd. But notice what happens if we replace u·v with (u·v+1)², and compare its expansion to the dot product u'·v' of the preprocessed vectors.
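The slide's own equations did not survive in this transcript, so the following is a worked version of the comparison under stated assumptions: two-dimensional inputs $u = (u_1, u_2)$ and $v = (v_1, v_2)$, and a preprocessor $u' = (1, u_1, u_2, u_1u_2, u_1^2, u_2^2)$ that appends the pairwise interactions.

```latex
\begin{align*}
(u \cdot v + 1)^2 &= (u_1 v_1 + u_2 v_2 + 1)^2 \\
                  &= 1 + 2u_1v_1 + 2u_2v_2 + 2u_1u_2v_1v_2 + u_1^2v_1^2 + u_2^2v_2^2 \\
u' \cdot v'       &= 1 + u_1v_1 + u_2v_2 + u_1u_2v_1v_2 + u_1^2v_1^2 + u_2^2v_2^2
\end{align*}
```

The two expressions contain the same monomials and differ only in the constant factors of 2.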

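6 The kernel trick, cont'd. So, up to constants on the cross-product terms, $(u \cdot v + 1)^2$ matches $u' \cdot v'$. Why not replace the computation of $\sum_{j} y_{i_j}\,(x' \cdot x'_{i_j})$ with the computation of $\sum_{j} y_{i_j}\,K(x, x_{i_j})$, where $K(u,v) = (u \cdot v + 1)^2$?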
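The "up to constants" claim can be checked numerically. The sketch below assumes a particular explicit feature map, `phi`, whose linear and cross terms are scaled by √2 so that the match with (u·v+1)² is exact; that scaling is an assumption made here for illustration, and the function names are hypothetical.

```python
import math
from itertools import combinations


def dot(u, v):
    """Ordinary inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def phi(x):
    """Explicit degree-2 feature map (assumed scaling): constant, linear,
    squared, and pairwise-interaction features."""
    feats = [1.0]
    feats += [math.sqrt(2) * xi for xi in x]
    feats += [xi * xi for xi in x]
    feats += [math.sqrt(2) * x[i] * x[j]
              for i, j in combinations(range(len(x)), 2)]
    return feats


def poly_kernel(u, v):
    """Degree-2 polynomial kernel computed in the original input space."""
    return (dot(u, v) + 1) ** 2


u = (1.0, 2.0, 3.0)
v = (0.5, -1.0, 2.0)
explicit = dot(phi(u), phi(v))   # builds both feature vectors first
implicit = poly_kernel(u, v)     # never builds them
```

For n input features `phi` produces on the order of n² values per example, while `poly_kernel` needs only one n-term dot product, which is the saving the slide is pointing at.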

7 The kernel trick, cont'd. Consider a preprocessor that replaces every x with x' to include, directly in the example, all the pairwise variable interactions, so what is learned is a vector v'. I can stick my preprocessor here, just before the dot product gets called. Better yet: use $K(x, x_{i_j})$ in place of the dot product $x' \cdot x'_{i_j}$. No preprocessor! I never build x'!
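A minimal sketch of the "no preprocessor" idea: the same mistake-driven perceptron, with the dot product swapped for a kernel function. The function names, the XOR-style toy data, and the choice K(u,v) = (u·v+1)² are assumptions for illustration; the data is chosen because no linear function of the two raw inputs separates it.

```python
def dot(u, v):
    """Ordinary inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def poly_kernel(u, v):
    """Degree-2 polynomial kernel; stands in for the dot product of x' vectors."""
    return (dot(u, v) + 1) ** 2


def kernel_score(mistakes, x, K):
    """Sum of y_i * K(x_i, x) over stored mistakes; x' is never constructed."""
    return sum(y_i * K(x_i, x) for x_i, y_i in mistakes)


def train_kernel_perceptron(examples, K, epochs=10):
    """Mistake-driven perceptron that touches the data only through K."""
    mistakes = []
    for _ in range(epochs):
        errors = 0
        for x, y in examples:
            if y * kernel_score(mistakes, x, K) <= 0:
                mistakes.append((x, y))
                errors += 1
        if errors == 0:
            break
    return mistakes


def predict(mistakes, x, K):
    return 1 if kernel_score(mistakes, x, K) > 0 else -1


# XOR-style labels: +1 when the two coordinates have opposite signs.
xor_data = [((1, 1), -1), ((-1, -1), -1), ((1, -1), 1), ((-1, 1), 1)]
mistakes = train_kernel_perceptron(xor_data, poly_kernel)
predictions = [predict(mistakes, x, poly_kernel) for x, _ in xor_data]
```

Passing `dot` as `K` recovers the ordinary perceptron, so the kernel is a drop-in replacement at the single point where the dot product is called.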

8 Example of separability

9 Some results with polynomial kernels

14 The kernel trick, cont'd. General idea: replace an expensive preprocessor x → x' and ordinary inner product with no preprocessor and a function $K(x, x_i)$, where $K(x, x_i) = x' \cdot x'_i$. This is really useful when you want to learn over objects x with some non-trivial structure.

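15 The kernel trick, cont'd. Even more general idea: use any function K that is continuous; symmetric, i.e., K(u,v) = K(v,u); and "positive semidefinite", i.e., for any finite set of points $x_1,\dots,x_n$ and any real coefficients $c_1,\dots,c_n$, $\sum_{i,j} c_i c_j K(x_i, x_j) \ge 0$ (this is a condition on the whole matrix of kernel values, not merely K(u,v) ≥ 0 for each pair). Then by an ancient theorem due to Mercer, K corresponds to some combination of a preprocessor and an inner product: i.e., $K(u,v) = u' \cdot v'$. Terminology: K is a Mercer kernel. The space in which the x' live can be taken to be a reproducing kernel Hilbert space (RKHS). The matrix M[i,j] = K(x_i, x_j) is a Gram matrix.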
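To make the terminology concrete, here is a small sketch that builds a Gram matrix and spot-checks symmetry and the sign of the quadratic form $\sum_{i,j} c_i c_j M[i,j]$ for random coefficient vectors. It is only a sanity check on a sample, not a proof that a kernel is positive semidefinite; the helper names and the sample points are hypothetical.

```python
import random


def dot(u, v):
    """Ordinary inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def poly_kernel(u, v):
    """Degree-2 polynomial kernel, used here as the example K."""
    return (dot(u, v) + 1) ** 2


def gram_matrix(K, xs):
    """M[i][j] = K(x_i, x_j) for every pair of points in xs."""
    return [[K(a, b) for b in xs] for a in xs]


def quadratic_form(M, c):
    """Sum over i, j of c_i * c_j * M[i][j]."""
    n = len(M)
    return sum(c[i] * c[j] * M[i][j] for i in range(n) for j in range(n))


points = [(1, 2), (0, -1), (3, 1), (-2, 2), (1, 1)]
M = gram_matrix(poly_kernel, points)

# Symmetry: M[i][j] == M[j][i] for all i, j.
is_symmetric = all(M[i][j] == M[j][i]
                   for i in range(len(M)) for j in range(len(M)))

# Spot-check the quadratic form on random coefficient vectors.
rng = random.Random(0)
forms = [quadratic_form(M, [rng.uniform(-1, 1) for _ in points])
         for _ in range(200)]
smallest = min(forms)
```

A negative value of `smallest` (beyond rounding error) would show that K is not a valid kernel on these points; a non-negative value is consistent with, but does not establish, positive semidefiniteness.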

