Unsupervised Feature Selection for Multi-Cluster Data
Deng Cai et al., KDD 2010
Presenter: Yunchao Gong, Dept. of Computer Science, UNC Chapel Hill
Introduction
The problem
Feature selection for high-dimensional data. Assume an n*d data matrix X. When d is very high (d > 10000), it causes problems:
– Storing the data is expensive
– Computation becomes challenging
A solution to this problem
Select the most informative features of the high-dimensional n*d data matrix X.
[Figure: the n*d data matrix X, with n samples as rows and d features as columns]
After feature selection
The data become an n*d' matrix with d' << d: more scalable and much more efficient to work with.
Previous approaches
Supervised feature selection:
– Fisher score
– Information-theoretic feature selection
Unsupervised feature selection:
– Laplacian Score
– Max Variance
– Random selection
Motivation of the paper
Improve clustering (unsupervised) via feature selection. Automatically select the most useful features by discovering the manifold structure of the data.
Motivation and Novelty
Previous approaches:
– Greedy selection
– Select each feature independently
This approach:
– Considers all features together
A toy example
Some background: manifold learning
Tries to discover the manifold structure of the data.
[Figure: appearance-variation manifolds in vision]
Some background: manifold learning
Discovers smooth manifold structure; maps different manifolds (classes) as far apart as possible.
The proposed method
Outline of the approach
1. Manifold learning for cluster analysis
2. Sparse coefficient regression
3. Sparse feature selection
Outline of the approach
[Figure: the three steps of the pipeline; step 1 generates the response Y]
1. Manifold learning for clustering
Use manifold learning to find the clusters of the data.
[Figure: a dataset with four clusters]
1. Manifold learning for clustering
Observation: points in the same class are clustered together.
1. Manifold learning for clustering
Idea: how do we discover the manifold structure?
1. Manifold learning for clustering
Map similar points close together and dissimilar points far apart:
min_f sum_{ij} (f_i - f_j)^2 K_{ij}
where the x_i are the data points and K_{ij} is their similarity.
1. Manifold learning for clustering
In matrix form the objective becomes
min_f f^T L f,  where L = D - K and D = diag(sum(K)).
1. Manifold learning for clustering
Constrain f to be normalized, f^T D f = 1, to eliminate the free scaling. So we have the minimization problem
min_f f^T L f  subject to  f^T D f = 1.
Summary of manifold discovery
Construct the similarity graph K.
Choose a weighting scheme for K (e.g., the heat kernel K_{ij} = exp(-||x_i - x_j||^2 / t)).
Solve the generalized eigenproblem L f = lambda D f for the smallest eigenvectors.
Use the eigenvectors f as the response vectors Y.
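A minimal Python sketch of this step, assuming a k-nearest-neighbor graph with heat-kernel weights and a dense eigensolver; the helper name spectral_embedding and all parameter defaults are our choices, not the authors' code:

import numpy as np
from scipy.linalg import eigh
from sklearn.neighbors import kneighbors_graph

def spectral_embedding(X, n_clusters=4, n_neighbors=5, t=1.0):
    """Step 1: build the similarity graph K and return Y, the n_clusters
    smallest nontrivial generalized eigenvectors of L f = lam * D f."""
    # k-NN graph with heat-kernel weights K_ij = exp(-||x_i - x_j||^2 / t)
    dist = kneighbors_graph(X, n_neighbors, mode='distance').toarray()
    K = np.where(dist > 0, np.exp(-dist ** 2 / t), 0.0)
    K = np.maximum(K, K.T)              # symmetrize the graph
    D = np.diag(K.sum(axis=1))          # degree matrix, D = diag(sum(K))
    L = D - K                           # graph Laplacian
    # Dense generalized eigensolver; returned columns satisfy f^T D f = 1,
    # matching the normalization constraint above.
    vals, vecs = eigh(L, D)
    return vecs[:, 1:n_clusters + 1]    # drop the trivial constant eigenvector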
Response vector Y
Y reveals the manifold structure!
[Figure: the four clusters and the corresponding columns of the response Y]
2. Sparse coefficient regression
Once we have obtained Y, how do we use it for feature selection? Since Y reveals the cluster structure, use Y as the response in a sparse regression.
Sparse regression
The "Lasso":
min_a ||Y_k - X a||^2 + beta * ||a||_1
The L1 penalty drives most entries of a to zero, i.e., a is sparse.
Steps to perform sparse regression
Generate Y from step 1 and take the data matrix X as input. Denote each column of Y as Y_k, and estimate a sparse coefficient vector a_k for each column by solving the Lasso above.
[Figure: the regression X a = Y_k; only a few entries of a are nonzero]
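A minimal sketch of this step, using scikit-learn's Lasso as the L1 solver (the paper's slides name the Lasso; the penalty beta and the helper name sparse_coefficients are our choices):

import numpy as np
from sklearn.linear_model import Lasso

def sparse_coefficients(X, Y, beta=0.01):
    """Step 2: fit one sparse coefficient vector a_k per response column
    Y_k by solving  min_a ||Y_k - X a||^2 + beta * ||a||_1."""
    A = np.zeros((X.shape[1], Y.shape[1]))       # one column a_k per Y_k
    for k in range(Y.shape[1]):
        model = Lasso(alpha=beta, fit_intercept=False, max_iter=10000)
        A[:, k] = model.fit(X, Y[:, k]).coef_    # most entries come out zero
    return A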
3. Sparse feature selection
For Y we obtain c different vectors a_k, one per column Y_k. How do we finally combine them? With a simple heuristic, sketched below: score each feature by its largest absolute coefficient across the c vectors and keep the top-scoring features.
[Figure: the c coefficient vectors a_1, ..., a_c combined into one feature score]
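A sketch of the combination heuristic, scoring feature j by max_k |a_{k,j}|; the helper name select_features and the argument n_select are ours:

import numpy as np

def select_features(A, n_select):
    """Step 3: score each feature by its largest absolute coefficient
    across the c sparse vectors, then keep the top n_select features."""
    scores = np.abs(A).max(axis=1)              # one score per feature
    return np.argsort(scores)[::-1][:n_select]  # feature indices, best first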
The final algorithm
1. Manifold learning to obtain Y (the columns of Y are the eigenvectors f)
2. Sparse regression of each Y_k on X
3. Final combination of the coefficient vectors to select features
[Figure, step 1: the four clusters and the response Y they induce]
[Figure, step 2: regressing each Y_k on X yields a sparse a_k]
[Figure, step 3: the vectors a_1, ..., a_c are combined into feature scores]
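Chaining the three sketches above on synthetic data (shapes only; the random X carries no real cluster structure):

import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 1000))        # n = 200 samples, d = 1000 features
Y = spectral_embedding(X, n_clusters=4)     # step 1: response vectors (200 x 4)
A = sparse_coefficients(X, Y)               # step 2: sparse coefficients (1000 x 4)
idx = select_features(A, n_select=50)       # step 3: keep d' = 50 features
X_small = X[:, idx]                         # the reduced 200 x 50 data matrix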
Discussion
Novelty of this approach
– Considers all features together
– Uses sparse ("Lasso") regression to perform feature selection
– Combines manifold learning with feature selection
Results
Face image clustering
Digit recognition
Summary
– The method is technically new and interesting
– But not ground-breaking
Summary
Test on more challenging vision datasets:
– Caltech
– ImageNet