1
Using Manifold Structure for Partially Labeled Classification
by Belkin and Niyogi, NIPS 2002. Presented by Chunping Wang, Machine Learning Group, Duke University, November 16, 2007
2
Outline: Motivations, Algorithm Description, Theoretical Interpretation, Experimental Results, Comments
3
Motivations (1) Why is manifold structure useful?
Data often lies on a lower-dimensional manifold, so dimension reduction is preferable. An example: images of a handwritten digit 0. Usually the dimensionality is the number of pixels, which is typically very high (e.g., 256).
4
Motivations (1) Why is manifold structure useful? (continued)
Ideally, each image could be described by a 5-dimensional feature vector. [Figure: two digit-0 images, d1 and d2, with their ideal feature vectors f1 and f2.]
5
Motivations (1) Why is manifold structure useful? (continued)
In practice the intrinsic dimensionality is higher than 5, but perhaps no more than several dozen, still far lower than the pixel dimensionality (e.g., 256).
6
Motivations (2) Why is manifold structure useful?
The data representation in the original space can be unsatisfactory for classification. [Figure: labeled and unlabeled points plotted in the original space versus their 2-d representation obtained with Laplacian Eigenmaps.]
7
Algorithm Description (1)
Semi-supervised classification: $k$ points, the first $s$ of which are labeled ($s < k$); binary case.
Constructing the adjacency graph: $W_{ij} = 1$ if $i$ is among the $n$ nearest neighbors of $j$ or $j$ is among the $n$ nearest neighbors of $i$, and $W_{ij} = 0$ otherwise.
Eigenfunctions: compute the eigenvectors $e_1, \dots, e_p$ corresponding to the $p$ smallest eigenvalues of the graph Laplacian $L = D - W$, where $D$ is diagonal with $D_{ii} = \sum_j W_{ij}$.
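To make these two steps concrete, here is a minimal Python sketch (not from the slides or the paper; the names laplacian_eigenvectors, n_neighbors, and p are illustrative placeholders):

    import numpy as np
    from scipy.spatial.distance import cdist

    def laplacian_eigenvectors(X, n_neighbors=8, p=20):
        # X: (k, d) data matrix; returns the k x p matrix of the p smoothest eigenvectors
        k = X.shape[0]
        dist = cdist(X, X)                                    # pairwise Euclidean distances
        W = np.zeros((k, k))
        nn = np.argsort(dist, axis=1)[:, 1:n_neighbors + 1]   # skip self at column 0
        for i in range(k):
            W[i, nn[i]] = 1.0
        W = np.maximum(W, W.T)                                # "i near j OR j near i" rule
        D = np.diag(W.sum(axis=1))                            # degree matrix
        L = D - W                                             # unnormalized graph Laplacian
        _, eigvecs = np.linalg.eigh(L)                        # eigenvalues in ascending order
        return eigvecs[:, :p]                                 # eigenvectors of the p smallest eigenvalues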
8
Algorithm Description (2)
Semi-supervised classification: $k$ points, the first $s$ of which are labeled ($s < k$); binary case.
Building the classifier: minimize the error function $\mathrm{Err}(a) = \sum_{i=1}^{s} \bigl( c_i - \sum_{j=1}^{p} a_j e_j(i) \bigr)^2$ over the space of coefficients $a = (a_1, \dots, a_p)$; the least-squares solution is $a = (E_{\mathrm{lab}}^{T} E_{\mathrm{lab}})^{-1} E_{\mathrm{lab}}^{T} c$, where $E_{\mathrm{lab}}$ is the $s \times p$ matrix of eigenvector values at the labeled points and $c = (c_1, \dots, c_s)^{T}$ are the labels.
Classifying the unlabeled points ($i > s$): assign $c_i = 1$ if $\sum_{j=1}^{p} a_j e_j(i) \ge 0$ and $c_i = -1$ otherwise.
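A corresponding sketch of the classifier-building step (again an assumed helper, not the authors' code; E is the output of the previous sketch and c_labeled holds the +/-1 labels of the first s points, a label coding I am assuming):

    import numpy as np

    def fit_and_classify(E, c_labeled):
        # E: k x p eigenvector matrix; c_labeled: length-s array of +/-1 labels
        s = len(c_labeled)
        E_lab = E[:s]                                   # eigenvector values at labeled points
        # least-squares solution of min_a sum_i (c_i - sum_j a_j e_j(i))^2
        a, *_ = np.linalg.lstsq(E_lab, c_labeled, rcond=None)
        scores = E @ a                                  # fitted function on all k points
        return np.where(scores >= 0, 1, -1)             # sign rule for the binary case

Usage would be, for example, labels = fit_and_classify(laplacian_eigenvectors(X), c[:s]).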
9
Theoretical Interpretation (1)
For a compact manifold $\mathcal{M}$, the eigenfunctions of its Laplacian $\Delta$ form a basis for the Hilbert space $L^2(\mathcal{M})$, i.e., any function $f \in L^2(\mathcal{M})$ can be written as $f = \sum_i a_i e_i$, with eigenfunctions satisfying $\Delta e_i = \lambda_i e_i$. The simplest nontrivial example: the manifold is the unit circle $S^1$, where the eigenfunctions are sines and cosines and the expansion is the Fourier series.
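Concretely, on the unit circle with the convention $\Delta f = -f''$ (a standard fact, stated here for completeness rather than taken from the slides):
\[
  \Delta \sin(nt) = n^2 \sin(nt), \qquad \Delta \cos(nt) = n^2 \cos(nt),
\]
\[
  f(t) = a_0 + \sum_{n \ge 1} \bigl( a_n \sin(nt) + b_n \cos(nt) \bigr),
\]
which is exactly the Fourier series mentioned above.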
10
Theoretical Interpretation (2)
Smoothness measure $S(f) = \int_{\mathcal{M}} |\nabla f|^2 \, d\mu$: a small $S$ means $f$ is "smooth". For the unit circle $S^1$: $S(f) = \int_0^{2\pi} |f'(t)|^2 \, dt$. Generally $S(f) = \langle \Delta f, f \rangle_{L^2(\mathcal{M})}$, so $S(e_i) = \lambda_i$ for normalized eigenfunctions: smaller eigenvalues correspond to smoother eigenfunctions (lower frequency), and $e_0$ with $\lambda_0 = 0$ is a constant function. In terms of the $p$ smoothest eigenfunctions, the approximation of an arbitrary function is $f \approx \sum_{i=1}^{p} a_i e_i$.
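A quick worked check on $S^1$ (my own computation, not from the slides): for $n \ge 1$,
\[
  S(\sin nt) = \int_0^{2\pi} \bigl( n \cos(nt) \bigr)^2 \, dt = n^2 \pi,
\]
so the smoothness penalty equals the eigenvalue up to a normalization constant, and higher-frequency eigenfunctions are indeed less smooth.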
11
Theoretical Interpretation (3)
Back to our problem with a finite number of points: the graph Laplacian $L = D - W$ plays the role of the manifold Laplacian, its eigenvectors play the role of the eigenfunctions, and the classifier of slide 8 is the solution of a discrete version of this smooth-approximation problem. For binary classification, the function $f$ takes only two possible values; for M-ary cases, the only difference is that the number of possible values is more than two.
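For the M-ary case, one possible construction (an illustrative choice on my part, not necessarily the authors' exact scheme) is to fit one least-squares regression per class and take the argmax:

    import numpy as np

    def classify_multiclass(E, y_labeled, n_classes):
        # E: k x p eigenvector matrix; y_labeled: length-s array of class indices 0..n_classes-1
        s = len(y_labeled)
        E_lab = E[:s]
        scores = np.zeros((E.shape[0], n_classes))
        for c in range(n_classes):
            target = np.where(y_labeled == c, 1.0, -1.0)     # one-vs-rest label coding
            a, *_ = np.linalg.lstsq(E_lab, target, rcond=None)
            scores[:, c] = E @ a
        return scores.argmax(axis=1)                         # predicted class for every point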
12
Results (1) Handwritten Digit Recognition (MNIST data set)
60,000 28-by-28 gray images (the first 100 principal components are used as features); the number of eigenvectors p is 20% of the total number of points k.
13
Results (2) Text Classification (20 Newsgroups data set)
19,935 vectors with dimensionality 6,000; the number of eigenvectors p is 20% of the total number of points k.
14
Comments This semi-supervised algorithm essentially converts the original problem into a linear regression problem in a new, lower-dimensional space, solved by standard least-squares estimation. Only the n nearest neighbors are considered for each data point, so the adjacency matrix is sparse and the eigen-decomposition is relatively cheap. Little additional computation is required after the dimensionality reduction. More comments …