1
Proceedings of the 2007 SIAM International Conference on Data Mining
2
Abstract
The paper studies semi-supervised dimensionality reduction (SSDR). In addition to unlabeled samples, domain knowledge is incorporated as pairwise must-link and cannot-link constraints.
The SSDR algorithm preserves both the structure of the data and the constraints in the projected low-dimensional space.
3
Introduction
There exist supervised and unsupervised dimensionality reduction methods:
FLD (Fisher Linear Discriminant): extracts discriminant vectors when class labels are available
cFLD (Constrained FLD): performs dimensionality reduction from equivalence constraints
PCA (Principal Component Analysis): preserves the global covariance structure of the data when class labels are not available
4
Introduction (cont)
SSDR draws on:
Must-link constraints: pairs of instances belonging to the same class
Cannot-link constraints: pairs of instances belonging to different classes
Structure of the data
SSDR simultaneously preserves the structure of the data and the pairwise constraints specified by users.
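To make the two constraint types concrete, here is a minimal sketch of how must-link and cannot-link pairs could be derived from a labeled subset. The slides do not describe how constraints were produced, so the sampling procedure and the helper name `sample_constraints` are assumptions for illustration only.

```python
import random

def sample_constraints(labels, n_pairs, seed=0):
    """Hypothetical helper: draw distinct index pairs and sort them into
    must-link (same label) and cannot-link (different label) lists."""
    rng = random.Random(seed)
    n = len(labels)
    # Never ask for more distinct pairs than exist.
    n_pairs = min(n_pairs, n * (n - 1) // 2)
    must_link, cannot_link, seen = [], [], set()
    while len(seen) < n_pairs:
        i, j = rng.sample(range(n), 2)          # two different indices
        pair = (min(i, j), max(i, j))           # store each pair once
        if pair in seen:
            continue
        seen.add(pair)
        if labels[i] == labels[j]:
            must_link.append(pair)
        else:
            cannot_link.append(pair)
    return must_link, cannot_link
```

Calling `sample_constraints([0, 0, 1, 1, 2], n_pairs=6)` returns two lists of index pairs whose combined length is 6; changing `seed` gives a different set of constraints.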
5
SSDR Algorithm
Maximizing objective function:
Find projection vectors W:
Subject to: w^T w = 1
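The objective itself is not legible in the transcript. One form consistent with the description above (spread cannot-link pairs apart, pull must-link pairs together, with a unit-norm projection vector) is sketched below. The symbols C and M (the cannot-link and must-link sets), their sizes n_C and n_M, and the placement of β are assumptions and may differ from the slide.

```latex
% Assumed reconstruction, not taken verbatim from the slide.
% C, M: cannot-link and must-link pair sets; n_C, n_M: their sizes.
J(\mathbf{w}) =
  \frac{1}{2 n_C} \sum_{(\mathbf{x}_i, \mathbf{x}_j) \in C}
    \left( \mathbf{w}^T \mathbf{x}_i - \mathbf{w}^T \mathbf{x}_j \right)^2
  - \frac{\beta}{2 n_M} \sum_{(\mathbf{x}_i, \mathbf{x}_j) \in M}
    \left( \mathbf{w}^T \mathbf{x}_i - \mathbf{w}^T \mathbf{x}_j \right)^2,
\qquad \text{s.t. } \mathbf{w}^T \mathbf{w} = 1
```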
6
SSDR Algorithm (cont)
Extended objective function:
Final form of extended objective function:
(2.5) is a typical eigen-problem, which can be solved by computing the eigenvectors of X L X^T corresponding to the largest eigenvalues.
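The eigen-problem step can be sketched in NumPy as follows. Because the formulas are missing from the transcript, the pairwise weight matrix S is an assumption: every pair gets a baseline weight 1/n^2, cannot-link pairs are raised by α/n_C, and must-link pairs are lowered by β/n_M. With L = D - S (D holding the row sums of S), the objective (1/2) Σ S_ij (w^T x_i - w^T x_j)^2 equals w^T X L X^T w. The function names `ssdr_laplacian` and `ssdr_fit` are hypothetical.

```python
import numpy as np

def ssdr_laplacian(n, must_link, cannot_link, alpha=1.0, beta=20.0):
    """Build L = D - S under the assumed pairwise weights described above."""
    S = np.full((n, n), 1.0 / n**2)             # baseline: global data structure
    for pairs, delta in ((cannot_link, alpha), (must_link, -beta)):
        for i, j in pairs:
            S[i, j] += delta / len(pairs)       # keep S symmetric
            S[j, i] += delta / len(pairs)
    return np.diag(S.sum(axis=1)) - S

def ssdr_fit(X, must_link, cannot_link, d, alpha=1.0, beta=20.0):
    """X has shape (D, n), one sample per column. Returns W of shape (D, d)
    whose columns are the eigenvectors of X L X^T with the largest eigenvalues."""
    L = ssdr_laplacian(X.shape[1], must_link, cannot_link, alpha, beta)
    A = X @ L @ X.T
    A = (A + A.T) / 2                           # remove round-off asymmetry
    eigvals, eigvecs = np.linalg.eigh(A)        # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:d]         # indices of the d largest
    return eigvecs[:, top]
```

A low-dimensional embedding is then `Y = W.T @ X`. Since every row of L sums to zero, shifting all samples by a constant leaves X L X^T unchanged, so explicit centering is not needed in this sketch.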
7
Experiments
Data sets: 6 UCI data sets, the YaleB facial image data set, and 20 Newsgroups.
Results are averaged over 100 runs, each with a differently generated set of constraints.
Parameters: α = 1, β = 20.
8
Results on UCI Data Sets
9
Results on UCI Data Sets (cont)