Slide 1: Learning a Kernel Matrix for Nonlinear Dimensionality Reduction
Kilian Q. Weinberger, Fei Sha, and Lawrence K. Saul (ICML'04)
GRASP Laboratory, Department of Computer and Information Science, University of Pennsylvania
Slide 2: The Big Picture
Given high-dimensional data sampled from a low-dimensional manifold, how do we compute a faithful embedding?
Slide 3: Outline
Part I: Kernel PCA
Part II: Manifold Learning
Part III: Algorithm
Part IV: Experimental Results
Slide 4: Part I. Kernel PCA
Slide 5: Problem Setup
Input: high-dimensional points x_1, ..., x_N in R^D.
Output: low-dimensional points y_1, ..., y_N in R^d, with d << D.
Problem: estimate the intrinsic dimensionality d.
Embedding: nearby points remain nearby, distant points remain distant.
Slide 6: Subspaces
Examples: D=3 reduced to d=2; D=2 reduced to d=1.
Slide 7: Principal Component Analysis
Project the data onto the subspace of maximum variance. This can be solved as an eigenvalue problem on the data's covariance matrix.
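A minimal NumPy sketch of that eigenvalue-problem view (an illustration, not the authors' code):

```python
import numpy as np

def pca(X, d):
    """X: (N, D) data matrix; returns the (N, d) projection of maximum variance."""
    Xc = X - X.mean(axis=0)                           # center the data
    C = Xc.T @ Xc / len(Xc)                           # D x D covariance matrix
    eigvals, eigvecs = np.linalg.eigh(C)              # eigenvalues in ascending order
    top = eigvecs[:, np.argsort(eigvals)[::-1][:d]]   # top-d principal axes
    return Xc @ top                                   # coordinates in the subspace
```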
Slide 8: Using the Kernel Trick
Perform PCA in a higher-dimensional feature space. The feature space can be defined implicitly through a kernel matrix K, with K_ij = Phi(x_i) . Phi(x_j).
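A minimal kernel-PCA sketch, assuming the kernel matrix K is given: double-center K, then read the embedding off its top eigenvectors scaled by the square roots of the eigenvalues:

```python
import numpy as np

def kernel_pca(K, d):
    """K: (N, N) kernel (Gram) matrix; returns an (N, d) embedding."""
    N = len(K)
    H = np.eye(N) - np.ones((N, N)) / N   # centering matrix
    Kc = H @ K @ H                        # center the data in feature space
    eigvals, eigvecs = np.linalg.eigh(Kc)
    idx = np.argsort(eigvals)[::-1][:d]   # indices of the top-d eigenvalues
    return eigvecs[:, idx] * np.sqrt(np.maximum(eigvals[idx], 0))
```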
Slide 9: Common Kernels
Linear, Gaussian, and polynomial kernels do very well for classification. How about manifold learning?
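The three kernels as a NumPy sketch (the bandwidth `sigma` and degree `p` are illustrative parameters, not values from the talk):

```python
import numpy as np

def linear_kernel(X):
    return X @ X.T

def gaussian_kernel(X, sigma=1.0):
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * X @ X.T   # pairwise squared distances
    return np.exp(-d2 / (2 * sigma**2))

def polynomial_kernel(X, p=4):
    return (1 + X @ X.T) ** p
```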
Slide 10: Linear Kernel
Slide 11: Gaussian Kernels
Slide 12: Gaussian Kernels
The feature vectors span as many dimensions as the number of spheres of radius sigma needed to enclose the input vectors.
Slide 13: Polynomial Kernels
Slide 14: Part II. Manifold Learning via Semidefinite Programming
Slides 15-16: Local Isometry
A smooth, invertible mapping that preserves distances and looks locally like a rotation plus a translation.
Slide 17: Neighborhood Graph
Connect each point to its k nearest neighbors. The resulting graph discretizes the manifold.
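A minimal NumPy sketch of the graph construction (brute-force distances; `k=4` matches most of the experiments later in the deck):

```python
import numpy as np

def knn_graph(X, k=4):
    """X: (N, D) inputs; returns (N, k) neighbor indices and squared distances."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * X @ X.T   # pairwise squared distances
    np.fill_diagonal(d2, np.inf)                   # exclude self-neighbors
    neighbors = np.argsort(d2, axis=1)[:, :k]      # k nearest points of each point
    return neighbors, d2
```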
Slide 18: Preserve Local Distances
Approximate local isometry by constraining distances between neighbors: for every pair of points flagged by the neighborhood indicator, the output distance must equal the input distance.
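In symbols (my reconstruction of the formula the transcript dropped, with eta_ij the neighborhood indicator and y_i the output points):

```latex
\|y_i - y_j\|^2 = \|x_i - x_j\|^2 \quad \text{whenever } \eta_{ij} = 1 .
```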
Slide 19: Objective Function?
Goal: find the kernel matrix of minimum rank.
Problem: minimizing rank is computationally hard.
Heuristic: maximize the pairwise distances instead.
Slide 20: Objective Function? (Cont'd)
What happens if we maximize the pairwise distances? The manifold unfolds: stretching the embedding pulls it flat, while the local distance constraints hold neighboring points together.
Slide 21: Semidefinite Programming Problem
Maximize the trace of K (this unfolds the manifold), subject to:
- preserve local neighborhoods
- center the output
- K is positive semidefinite
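Spelling out the slide's bullets with K_ij = y_i . y_j (my reconstruction of the formulas the transcript lost; for a centered Gram matrix, the sum of pairwise squared distances equals 2N tr(K), so maximizing the trace maximizes pairwise distances):

```latex
\max_{K} \;\; \operatorname{tr}(K)
\quad \text{s.t.} \quad
K_{ii} - 2K_{ij} + K_{jj} = \|x_i - x_j\|^2 \;\; \text{whenever } \eta_{ij} = 1,
\qquad \sum_{ij} K_{ij} = 0,
\qquad K \succeq 0 .
```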
Slide 22: Part III. Semidefinite Embedding in Three Easy Steps
Also known as "Maximum Variance Unfolding" [Sun, Boyd, Xiao, Diaconis].
Slide 23: Step 1: k-Nearest Neighbors
Compute the k nearest neighbors of each point and the Gram matrix of each neighborhood.
Slide 24: Step 2: Semidefinite Programming
Compute the centered, locally isometric dot-product matrix with maximal trace.
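A hedged CVXPY sketch of this step (my reconstruction of the program above, not the authors' solver; `neighbors` is the array from the earlier k-nearest-neighbor sketch, and this brute-force form only scales to small N):

```python
import cvxpy as cp
import numpy as np

def unfold(X, neighbors):
    """Solve the SDP: maximize trace(K) over centered, PSD, locally isometric K."""
    N = len(X)
    K = cp.Variable((N, N), PSD=True)         # positive semidefinite constraint
    constraints = [cp.sum(K) == 0]            # centering constraint
    for i in range(N):
        for j in neighbors[i]:
            dij = np.sum((X[i] - X[j])**2)    # input squared distance
            constraints.append(K[i, i] - 2 * K[i, j] + K[j, j] == dij)
    cp.Problem(cp.Maximize(cp.trace(K)), constraints).solve()
    return K.value
```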
Slide 25: Step 3: Kernel PCA
Estimate d from the eigenvalue spectrum of K; the top d eigenvectors give the embedding.
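A sketch of this last step (assumptions: `K` is the solved matrix from step 2, `kernel_pca` is the sketch defined after slide 8, and the 5% cutoff for "dominant" eigenvalues is my illustrative choice, not the paper's rule):

```python
import numpy as np

eigvals = np.linalg.eigvalsh(K)[::-1]          # eigenvalue spectrum, descending
d = int(np.sum(eigvals > 0.05 * eigvals[0]))   # count dominant eigenvalues
Y = kernel_pca(K, d)                           # top-d eigenvectors -> embedding
```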
Slide 26: Part IV. Experimental Results
Slides 27-28: Trefoil Knot
N=539, k=4, D=3, d=2
Slide 29: Teapot (Full Rotation)
N=400, k=4, D=23028, d=2
Slide 30: Teapot (Half Rotation)
N=200, k=4, D=23028, d=2
Slide 31: Faces
N=1000, k=4, D=540, d=2
Slide 32: Twos vs. Threes
N=953, k=3, D=256, d=2
Slide 33: Part V. Supervised Experimental Results
Slide 34: Large Margin Classification
The SDE kernel is used in an SVM.
Task: binary digit classification. Input: USPS data set. Training/testing split: 810/90. Neighborhood size: k=4.
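As a sketch of how a precomputed kernel like SDE's plugs into an SVM, using scikit-learn's precomputed-kernel interface (`K_train`, `K_test`, `y_train`, and `y_test` are assumed placeholders; the 810/90 split follows the slide):

```python
# Hedged sketch: train an SVM on a precomputed (e.g. SDE) kernel matrix.
# K_train: (810, 810) kernel among training points; K_test: (90, 810) kernel
# between test and training points; y_train, y_test: digit labels (all assumed).
from sklearn.svm import SVC

clf = SVC(kernel="precomputed")   # SVM that consumes a Gram matrix directly
clf.fit(K_train, y_train)
accuracy = clf.score(K_test, y_test)
```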
Slide 35: SVM Kernel
SDE is not well suited for SVMs.
Slide 36: SVM Kernel (Cont'd)
Non-linear decision boundary vs. linear decision boundary: unfolding does not necessarily help classification, and reducing the dimensionality first is counter-intuitive. The approach needs a linear decision boundary on the manifold.
Slide 37: Part VI. Conclusion
Slide 38: Previous Work
Isomap and LLE can both be seen from a kernel view [Jihun Ham et al., ICML'04].
Slides 39-40: Previous Work (Isomap)
Isomap and LLE can both be seen from a kernel view [Jihun Ham et al., ICML'04]. The Isomap matrix is not necessarily positive semidefinite (figure panels: SDE vs. Isomap).
Slide 41: Previous Work (LLE)
Isomap and LLE can both be seen from a kernel view [Jihun Ham et al., ICML'04]. The LLE eigenvalues do not reveal the true dimensionality (figure panels: SDE vs. LLE).
Slide 42: Conclusion
Semidefinite Embedding (SDE):
+ extends kernel PCA to do manifold learning
+ uses semidefinite programming
+ has a guaranteed unique solution
- not well suited for support vector machines
- exact solution (so far) limited to N=2000
Slide 45: Semidefinite Programming Problem (with Slack)
The same program as slide 21: maximize the trace of K subject to preserving local neighborhoods, centering the output, and K being positive semidefinite, but now with slack variables introduced to tolerate small violations of the distance constraints.
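The slide does not give the exact formulation; one standard penalty form (a hedged reconstruction, with the trade-off weight nu introduced here for illustration) is:

```latex
\max_{K,\;\xi \ge 0} \;\; \operatorname{tr}(K) \;-\; \nu \sum_{ij:\,\eta_{ij}=1} \xi_{ij}
\quad \text{s.t.} \quad
\bigl| (K_{ii} - 2K_{ij} + K_{jj}) - \|x_i - x_j\|^2 \bigr| \le \xi_{ij},
\qquad \sum_{ij} K_{ij} = 0,
\qquad K \succeq 0 .
```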
Slide 46: Swiss Roll
N=800, k=4, D=3, d=2
Slide 47: Applications
Visualization of data; natural language processing.
Slide 48: Trefoil Knot (Kernel Comparison)
N=539, k=4, D=3, d=2. Kernels compared: RBF, polynomial, SDE.
Slide 49: Motivation
Similar vectorized pictures lie on a non-linear manifold. Linear methods don't work here.