Proceedings of the 2007 SIAM International Conference on Data Mining.


1 Proceedings of the 2007 SIAM International Conference on Data Mining

2 Abstract The paper studies semi-supervised dimensionality reduction. In addition to unlabeled samples, must-link and cannot-link constraints are incorporated as domain knowledge. The proposed SSDR algorithm preserves the structure of the data, as well as the constraints, in the projected low-dimensional space.

3 Introduction There exist supervised and unsupervised dimensionality reduction methods:
FLD (Fisher Linear Discriminant): extracts discriminant vectors when class labels are available
cFLD (Constrained FLD): performs dimensionality reduction from equivalence constraints
PCA (Principal Component Analysis): preserves the global covariance structure of data when class labels are not available

4 Introduction (cont) SSDR draws on:
Must-link constraints: pairs of instances belonging to the same class
Cannot-link constraints: pairs of instances belonging to different classes
Structure of data
SSDR simultaneously preserves the structure of the data and the pairwise constraints specified by users.

5 SSDR Algorithm Maximizing objective function:
Find projection vector W:
Subject to: w^T w = 1 ???
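The formulas on this slide are missing from the transcript. As a hedged sketch only, one objective consistent with the surrounding slides (pairwise must-link and cannot-link constraints, weights α and β, and an eigen-problem on XLX^T) can be written as below. The symbols n (number of samples), C and M (cannot-link and must-link sets), n_C and n_M (their sizes), S (pairwise weight matrix) and D (diagonal matrix of the row sums of S) are assumptions introduced for illustration and do not appear in the transcript. Only the last equality is a general identity, valid for any symmetric S.

```latex
\begin{aligned}
J(w) &= \frac{1}{2}\sum_{i,j} S_{ij}\,\bigl(w^{T}x_i - w^{T}x_j\bigr)^{2},
\qquad \text{subject to } w^{T}w = 1, \\[4pt]
S_{ij} &=
\begin{cases}
\dfrac{1}{n^{2}} + \dfrac{\alpha}{n_C} & \text{if } (x_i, x_j) \in C, \\[6pt]
\dfrac{1}{n^{2}} - \dfrac{\beta}{n_M} & \text{if } (x_i, x_j) \in M, \\[6pt]
\dfrac{1}{n^{2}} & \text{otherwise},
\end{cases} \\[4pt]
J(w) &= w^{T} X (D - S) X^{T} w = w^{T} X L X^{T} w,
\qquad D_{ii} = \sum_{j} S_{ij},\quad L = D - S .
\end{aligned}
```

Under these assumed weights, cannot-link pairs are pushed apart, must-link pairs are pulled together, and the uniform 1/n² term keeps the overall spread of the data, which matches the slide's description of preserving both structure and constraints.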

6 SSDR Algorithm (cont) Extended objective function:
Final form of extended objective function:
Eq. (2.5) is a typical eigen-problem, which can be solved by computing the eigenvectors of XLX^T corresponding to the largest eigenvalues.
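The eigen-problem step can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation: the function names `constraint_laplacian` and `ssdr_fit` are hypothetical, and the pairwise weights (1/n² for every pair, plus α/n_C for cannot-link pairs, minus β/n_M for must-link pairs) are an assumption, since the transcript does not preserve the slide's formulas. Only the final step, taking the eigenvectors of XLX^T with the largest eigenvalues, comes directly from the slide.

```python
import numpy as np


def constraint_laplacian(n, must_link, cannot_link, alpha=1.0, beta=20.0):
    """Build L = D - S from an ASSUMED pairwise weight matrix S.

    must_link / cannot_link are lists of (i, j) index pairs.
    """
    S = np.full((n, n), 1.0 / n**2)           # uniform term: global structure
    for i, j in cannot_link:                   # push cannot-link pairs apart
        S[i, j] = S[j, i] = 1.0 / n**2 + alpha / len(cannot_link)
    for i, j in must_link:                     # pull must-link pairs together
        S[i, j] = S[j, i] = 1.0 / n**2 - beta / len(must_link)
    D = np.diag(S.sum(axis=1))                 # diagonal matrix of row sums
    return D - S


def ssdr_fit(X, must_link, cannot_link, d, alpha=1.0, beta=20.0):
    """Return a (D, d) projection matrix for data X of shape (D, n).

    Samples are the columns of X. The columns of the result are the
    eigenvectors of X L X^T with the d largest eigenvalues.
    """
    n = X.shape[1]
    L = constraint_laplacian(n, must_link, cannot_link, alpha, beta)
    M = X @ L @ X.T
    M = (M + M.T) / 2.0                        # remove round-off asymmetry
    eigvals, eigvecs = np.linalg.eigh(M)       # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:d]        # indices of the d largest
    return eigvecs[:, top]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(5, 12))               # 12 samples in 5 dimensions
    W = ssdr_fit(X, must_link=[(0, 1), (2, 3)], cannot_link=[(0, 4)], d=2)
    Y = W.T @ X                                # projected data, shape (2, 12)
    print(Y.shape)
```

Because `np.linalg.eigh` returns orthonormal eigenvectors, each column of the result satisfies the unit-norm constraint w^T w = 1 from the previous slide.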

7 Experiments Data sets: 6 UCI data sets, the YaleB facial image data set, and 20-Newsgroup. Results are averaged over 100 runs, each with a different set of generated constraints. Parameters: α = 1, β = 20.
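The transcript does not say how the constraints were generated for each run. One common way to do it, shown here purely as an assumption, is to draw random pairs of labeled instances and mark each pair as must-link when the labels agree and cannot-link otherwise. The function name `sample_constraints` and its parameters are hypothetical.

```python
import random


def sample_constraints(labels, n_pairs, seed=None):
    """Draw distinct random index pairs and split them by label agreement.

    ASSUMED procedure: the source only says constraints are generated
    differently in each run, not how. Returns (must_link, cannot_link).
    """
    rng = random.Random(seed)
    n = len(labels)
    n_pairs = min(n_pairs, n * (n - 1) // 2)   # cannot exceed distinct pairs
    must_link, cannot_link, seen = [], [], set()
    while len(seen) < n_pairs:
        i, j = rng.sample(range(n), 2)          # two different indices
        pair = (min(i, j), max(i, j))
        if pair in seen:                        # skip pairs already drawn
            continue
        seen.add(pair)
        if labels[i] == labels[j]:
            must_link.append(pair)              # same class
        else:
            cannot_link.append(pair)            # different classes
    return must_link, cannot_link


if __name__ == "__main__":
    labels = [0, 0, 0, 1, 1, 1, 2, 2, 2]
    # One constraint set per run; varying the seed varies the constraints.
    for run in range(3):
        ml, cl = sample_constraints(labels, n_pairs=6, seed=run)
        print(run, len(ml), len(cl))
```

Averaging over many runs with different seeds, as the slide describes with 100 runs, reduces the dependence of the reported results on any single draw of constraints.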

8 Results on UCI Data Sets

9 Results on UCI Data Sets (cont)

