Irena Váňová
Perceptron algorithm (y_i … labels of the classes, y_i ∈ {−1, +1}):

repeat
  for i = 1...N
    if y_i (w · x_i + b) ≤ 0 then
      w ← w + y_i x_i ;  b ← b + y_i
    end
until no sample is misclassified

[Figure: two linearly separable classes of training samples, marked * and o.]
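A minimal sketch of the primal perceptron loop above, assuming the standard mistake-driven update rule; the function name and parameters (e.g. max_epochs) are illustrative, not taken from the slides.

import numpy as np

def perceptron(X, y, max_epochs=100):
    """X: (N, d) samples, y: (N,) labels in {-1, +1}."""
    N, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(max_epochs):              # repeat ...
        mistakes = 0
        for i in range(N):                   # for i = 1...N
            if y[i] * (X[i] @ w + b) <= 0:   # sample i is misclassified
                w += y[i] * X[i]             # update weights
                b += y[i]                    # update bias
                mistakes += 1
        if mistakes == 0:                    # ... until no sample is misclassified
            break
    return w, b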
Finding the coefficient vector w is equivalent to finding coefficients α_i with w = Σ_i α_i y_i x_i. In the dual representation the data points appear only inside dot products; many algorithms have such a dual form.

Rewritten algorithm – dual form (G is the Gram matrix, G_ij = x_i · x_j):

repeat
  for i = 1...N
    if y_i (Σ_j α_j y_j G_ij + b) ≤ 0 then
      α_i ← α_i + 1 ;  b ← b + y_i
    end
until no sample is misclassified
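A sketch of the dual-form perceptron, where the data enter only through the Gram matrix of pairwise dot products; names and the max_epochs cap are illustrative.

import numpy as np

def dual_perceptron(X, y, max_epochs=100):
    N = X.shape[0]
    G = X @ X.T                      # Gram matrix: G[i, j] = x_i . x_j
    alpha = np.zeros(N)              # alpha[i] counts the mistakes made on sample i
    b = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for i in range(N):
            # the decision value uses only dot products (row i of G)
            if y[i] * (np.sum(alpha * y * G[i]) + b) <= 0:
                alpha[i] += 1
                b += y[i]
                mistakes += 1
        if mistakes == 0:
            break
    return alpha, b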
The perceptron works for linearly separable problems. There is a computational problem (very large feature vectors). Kernel trick: compute the dot product in feature space directly, k(x, z) = φ(x) · φ(z), without ever constructing φ(x).
Polynomial kernels. Gaussian kernels ◦ infinitely many dimensions ◦ the classes become separable by a hyperplane in feature space. Good kernel? Bad kernel! ◦ a bad kernel gives an almost diagonal Gram matrix (every point nearly orthogonal to every other).
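Sketches of the two kernel families just mentioned; the degree, offset and gamma defaults are illustrative choices, not values from the slides.

import numpy as np

def polynomial_kernel(x, z, degree=3, c=1.0):
    # (x . z + c)^degree  ~ implicit map to all monomials up to the given degree
    return (x @ z + c) ** degree

def gaussian_kernel(x, z, gamma=0.5):
    # exp(-gamma * ||x - z||^2)  ~ implicit map into an infinite-dimensional space
    return np.exp(-gamma * np.sum((x - z) ** 2))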
We precompute the kernel matrix K_ij = k(x_i, x_j). We are implicitly in higher dimensions (too high?). Generalization problem – it is easy to overfit in high-dimensional spaces.

repeat
  for i = 1...N
    if y_i (Σ_j α_j y_j K_ij + b) ≤ 0 then
      α_i ← α_i + 1 ;  b ← b + y_i
    end
until no sample is misclassified
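A sketch of the kernel perceptron: the dual-form algorithm above with the dot products replaced by a precomputed kernel matrix. The kernel is passed in as a function (e.g. the gaussian_kernel sketch above); names are illustrative.

import numpy as np

def kernel_perceptron(X, y, kernel, max_epochs=100):
    N = X.shape[0]
    # precompute the kernel (Gram) matrix K[i, j] = k(x_i, x_j)
    K = np.array([[kernel(X[i], X[j]) for j in range(N)] for i in range(N)])
    alpha = np.zeros(N)
    b = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for i in range(N):
            if y[i] * (np.sum(alpha * y * K[i]) + b) <= 0:
                alpha[i] += 1
                b += y[i]
                mistakes += 1
        if mistakes == 0:
            break
    return alpha, b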
Kernel function. Use: replacing dot products with kernels – an implicit mapping to feature space ◦ solves the computational problem ◦ can make it possible to use infinitely many dimensions. Conditions: continuous, symmetric, positive definite. The kernel (Gram) matrix is an information 'bottleneck': it contains all the information the learning algorithm needs, and it fuses information about the data AND about the kernel.
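A quick numerical sanity check of these conditions on a finite sample: the Gram matrix built from a valid kernel should be symmetric and positive semi-definite. This is only an empirical check (function name and tolerance are illustrative), not a proof of positive definiteness.

import numpy as np

def check_gram_matrix(K, tol=1e-9):
    """Return True if K is symmetric and has no significantly negative eigenvalue."""
    symmetric = np.allclose(K, K.T)
    eigvals = np.linalg.eigvalsh((K + K.T) / 2)   # eigenvalues of the symmetrized matrix
    psd = np.all(eigvals >= -tol)
    return symmetric and psd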
PCA: an orthogonal linear transformation. The greatest variance goes to the first coordinate, the second greatest to the second, … It is a rotation around the mean value. Used for dimensionality reduction ◦ many dimensions usually means high correlation among them.
SVD: X = W Σ V^T, where W and V are unitary matrices (W^T W = I, V^T V = I). What are the columns of W and V? ◦ basis vectors: the columns of W are eigenvectors of X X^T, and the columns of V are eigenvectors of X^T X. (For X of size n × m: W is n × n, Σ is n × m, V is m × m.)
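A numerical illustration of these relations on random data, assuming the decomposition X = W Σ V^T as written above (NumPy returns the factors as W, the singular values s, and V^T).

import numpy as np

X = np.random.randn(6, 4)
W, s, Vt = np.linalg.svd(X, full_matrices=False)

# X X^T W = W diag(s^2)  and  X^T X V = V diag(s^2)
print(np.allclose(X @ X.T @ W, W * s**2))     # columns of W: eigenvectors of X X^T
print(np.allclose(X.T @ X @ Vt.T, Vt.T * s**2))  # columns of V: eigenvectors of X^T X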
PCA via SVD: take data with zero mean (n samples of dimension m as the rows of X) and compute its SVD. The covariance matrix (1/n) X^T X then has the columns of V as its eigenvectors, so the SVD directly yields the principal directions.
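A sketch of PCA computed through the SVD, assuming samples are stored as rows of X; the function name, return values and the (1/n) variance scaling follow the convention above and are otherwise illustrative.

import numpy as np

def pca(X, k):
    Xc = X - X.mean(axis=0)                      # data with zero mean
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:k]                          # eigenvectors of (1/n) Xc^T Xc
    explained_var = s[:k] ** 2 / X.shape[0]      # corresponding variances
    return Xc @ components.T, components, explained_var   # projections, directions, variances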
Projecting the data onto the few eigenvectors with the largest eigenvalues gives the equation for PCA; replacing the dot products with a kernel function gives the equation for PCA in the high-dimensional feature space (kernel PCA). There we do not know an eigenvector explicitly – we only have the vector of numbers (coefficients) which identifies it – yet the projection onto the k-th eigenvector can still be computed through the kernel.
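A sketch of the standard kernel-PCA recipe under these ideas: center the kernel matrix in feature space, take its eigenvectors, and keep only the coefficient vectors α that identify each feature-space eigenvector. The interface (a precomputed kernel matrix K, e.g. built with the kernel sketches above) and all names are assumptions.

import numpy as np

def kernel_pca(K, k):
    """K: (n, n) kernel matrix of the training data, k: number of components."""
    n = K.shape[0]
    one = np.ones((n, n)) / n
    Kc = K - one @ K - K @ one + one @ K @ one     # centering in feature space
    eigvals, eigvecs = np.linalg.eigh(Kc)          # ascending eigenvalues
    eigvals = eigvals[::-1][:k]                    # keep the k largest
    eigvecs = eigvecs[:, ::-1][:, :k]
    # scale each alpha so the corresponding feature-space eigenvector has unit norm
    alphas = eigvecs / np.sqrt(np.maximum(eigvals, 1e-12))
    return Kc @ alphas                             # projections onto the first k eigenvectors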
PCA is blind to the class labels.
Fundamental assumption: the classes are normally distributed. First case: all classes share the same covariance matrix, and it has full rank.
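A minimal two-class sketch of the linear (Fisher) discriminant direction under these assumptions (shared, full-rank covariance); the pooled-scatter estimate and the function name are illustrative, not taken from the slides.

import numpy as np

def lda_direction(X0, X1):
    """Discriminant direction for two classes given as (n_i, d) sample matrices."""
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # pooled within-class scatter (shared-covariance assumption)
    Sw = np.cov(X0, rowvar=False) * (len(X0) - 1) + np.cov(X1, rowvar=False) * (len(X1) - 1)
    w = np.linalg.solve(Sw, m1 - m0)      # requires Sw to have full rank
    return w / np.linalg.norm(w)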
Fundamental assumption again: normally distributed classes. First case relaxed: only full rank of the covariance matrices is required. A kernel variant exists as well.
Face recognition – eigenfaces
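A sketch of the eigenfaces idea: PCA on vectorized face images, with recognition done by nearest neighbour in the resulting low-dimensional face space. The image layout (flattened grayscale faces as rows), the number of components k, and all names are assumptions for illustration.

import numpy as np

def eigenfaces(faces, k=20):
    """faces: (n, height*width) matrix of flattened grayscale training faces."""
    mean_face = faces.mean(axis=0)
    U, s, Vt = np.linalg.svd(faces - mean_face, full_matrices=False)
    components = Vt[:k]                           # the eigenfaces (principal directions)
    coords = (faces - mean_face) @ components.T   # each face's coordinates in face space
    return mean_face, components, coords

def recognize(face, mean_face, components, coords, labels):
    """Project a new face into face space and return the label of its nearest training face."""
    c = (face - mean_face) @ components.T
    return labels[np.argmin(np.linalg.norm(coords - c, axis=1))]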