Feature Selection in k-Median Clustering
Olvi Mangasarian and Edward Wild
University of Wisconsin - Madison
Principal Objective
Find a reduced set of input-space features such that clustering in the reduced space closely replicates the clustering in the full-dimensional space.
Basic Idea
- Based on rigorous optimization theory, make a simple but fundamental modification to one of the two steps of the k-median algorithm
- In each cluster, find a point closest in the 1-norm to all points in that cluster and to the median of ALL data points
- The proposed approach can reduce the number of features by as much as 64% while keeping the clustering within 4% of the clustering obtained with the original set of features
- As the weight given to the data median increases, more features are deleted from the problem
FSKM Example
- Start with the data median at the origin
- Apply the k-median algorithm
- As the weight of the data median increases, features are removed from the problem
Outline of Talk
- Ordinary k-median algorithm
  - Two steps of the algorithm
- Feature Selecting k-Median (FSKM) Algorithm
  - Overall optimization objective
  - Basic idea
  - Mathematical optimization formulation
  - Algorithm statement
- Numerical examples
- Conclusion & outlook
Ordinary k-Median Algorithm
- Given m data points in n-dimensional input feature space, find k cluster centers such that the sum of the 1-norm distances between each data point and its closest cluster center is minimized
- Minimizing this sum of pointwise minima of linear functions is a concave minimization problem and is NP-hard
- However, the two-step k-median algorithm terminates in a finite number of steps at a point satisfying the minimum principle necessary optimality condition
Two-Step k-Median Algorithm
(0) Start with k initial cluster centers
(1) Assign each data point to a 1-norm closest cluster center
(2) For each cluster, compute a new cluster center that is 1-norm closest to all points in the cluster (the median of the cluster)
(3) Stop if all cluster centers are unchanged; else go to (1)
The algorithm terminates in a finite number of steps at a point satisfying the minimum principle necessary optimality condition
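As a concrete sketch of steps (0)-(3) — a minimal implementation under my own naming and initialization assumptions, not code from the talk — step (2) exploits the fact that the coordinate-wise median of a cluster minimizes the sum of 1-norm distances to its points:

```python
import numpy as np

def kmedian(X, k, max_iter=100, seed=None):
    """Two-step k-median on an (m, n) data matrix X."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    # (0) Start with k initial cluster centers (here: k random data points)
    centers = X[rng.choice(m, k, replace=False)].astype(float)
    for _ in range(max_iter):
        # (1) Assign each point to a 1-norm closest center
        dists = np.abs(X[:, None, :] - centers[None, :, :]).sum(axis=2)
        labels = dists.argmin(axis=1)
        # (2) New center = coordinate-wise median of each cluster
        new_centers = centers.copy()
        for j in range(k):
            pts = X[labels == j]
            if len(pts) > 0:
                new_centers[j] = np.median(pts, axis=0)
        # (3) Stop if all centers are unchanged, else repeat
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels
```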
Key Change in Step (2) of the k-Median Algorithm
(2) For each cluster, compute a new cluster center that minimizes the sum of the 1-norm distances to all points in the cluster plus a weighted 1-norm distance to the median of ALL data points
The weight of the 1-norm distance to the dataset median determines the number of features deleted:
- For a zero weight, no features are suppressed
- For a sufficiently large weight, all features are suppressed
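Coordinate-wise, the modified objective for one center entry is sum_i |c - x_ij| + lambda * |c - b_j|, where b is the median of ALL data points; its minimizer is a weighted median. A minimal sketch, assuming the helper names weighted_median and fskm_center (mine, not the talk's):

```python
import numpy as np

def weighted_median(values, weights):
    """Minimizer of sum_i weights[i] * |c - values[i]|: sort the values and
    take the first one where cumulative weight reaches half the total."""
    order = np.argsort(values)
    v, w = values[order], weights[order]
    cum = np.cumsum(w)
    return v[np.searchsorted(cum, 0.5 * cum[-1])]

def fskm_center(cluster_pts, data_median, lam):
    """Modified step (2): each center coordinate is the weighted median of
    the cluster's values (weight 1 each) and the data median (weight lam)."""
    n = cluster_pts.shape[1]
    center = np.empty(n)
    for j in range(n):
        vals = np.append(cluster_pts[:, j], data_median[j])
        wts = np.append(np.ones(len(cluster_pts)), lam)
        center[j] = weighted_median(vals, wts)
    return center
```

With lam = 0 this reduces to the ordinary median update; as lam grows, more coordinates of the center snap to the data median, which is what deletes features.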
FSKM Theory
Subgradients
For a convex function f: f(y) − f(x) ≥ f′(x)ᵀ(y − x) for all x, y ∈ ℝⁿ, where f′(x) is any subgradient of f at x.
Consider ‖x‖₁ for x ∈ ℝ¹:
- If x < 0: ∂‖x‖₁ = −1
- If x > 0: ∂‖x‖₁ = +1
- If x = 0: ∂‖x‖₁ ∈ [−1, 1]
FSKM Theory (Continued)
Zeroing Cluster Features (Based on Necessary and Sufficient Optimality Conditions for Nondifferentiable Convex Optimization)
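The body of this slide is a figure that did not survive extraction. As a hedged reconstruction in my own notation, consistent with the subgradient facts above and the λ values in the revisited example below, the zeroing condition can be read off by asking when a center coordinate c_j may sit exactly at the data median b_j:

```latex
% c_j = b_j minimizes  sum_i |c_j - x_ij| + lambda |c_j - b_j|  iff
% 0 is a subgradient of that function at c_j = b_j:
\[
0 \in \sum_{i \in \mathrm{cluster}} \operatorname{sign}(b_j - x_{ij}) + \lambda\,[-1,1]
\quad\Longleftrightarrow\quad
\Bigl|\sum_{i \in \mathrm{cluster}} \operatorname{sign}(b_j - x_{ij})\Bigr| \le \lambda .
\]
% Taking lambda_j as the maximum of this absolute sum over all clusters,
% feature j is pinned to b_j in every cluster center, and hence deletable,
% once lambda >= lambda_j.
```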
FSKM Algorithm
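The algorithm statement itself is on a slide image that did not survive extraction; the sketch below (the names fskm and feature_thresholds are mine) shows one plausible reading of the bookkeeping: run ordinary k-median, compute each feature's threshold, and delete every feature whose threshold is at most the chosen weight λ. The paper's actual algorithm also re-runs the weighted center update of step (2) as λ grows; only the threshold computation is shown here.

```python
import numpy as np

def feature_thresholds(X, labels, k):
    """lambda_{j,cluster} = |sum over the cluster of sign(med_j - x_ij)|:
    the smallest weight pinning feature j of that cluster's center to the
    data median. Returns, per feature, the max threshold over clusters."""
    med = np.median(X, axis=0)
    thr = np.zeros((k, X.shape[1]))
    for l in range(k):
        thr[l] = np.abs(np.sign(med - X[labels == l]).sum(axis=0))
    return thr.max(axis=0)

def fskm(X, k, lam):
    """Cluster with ordinary k-median, then delete every feature whose
    threshold is <= lam (its center coordinates all collapse to the
    data median, so it no longer discriminates between clusters)."""
    centers, labels = kmedian(X, k)  # from the earlier sketch
    lam_j = feature_thresholds(X, labels, k)
    keep = lam_j > lam
    return X[:, keep], keep, lam_j
```

On the revisited example below, the per-feature maxima are max λx = max(1, 0) = 1 and max λy = max(5, 4) = 5, so λ = 1 removes feature x but keeps y.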
FSKM Example (Revisited)
- Start with the data median at the origin
- Apply the k-median algorithm
- Compute the per-cluster, per-feature weights λ: λx,1 = 1, λy,1 = 5, λx,2 = 0, λy,2 = 4
- Maxima over clusters: max λx = 1, max λy = 5
- For λ = 1, feature x is removed from the problem
Numerical Testing
- FSKM was tested on five publicly available labeled datasets; the labels were used only to test the effectiveness of FSKM
- The data is first clustered using k-median; then FSKM is applied to delete one feature at a time
- Without using the data labels, the "error" of the FSKM clustering with reduced features is obtained by comparison with the "gold standard" clustering with the full set of features
- The FSKM clustering error curve obtained without labels is compared with the classification error curve obtained using the data labels
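The talk does not spell out how the label-free "error" is computed; one plausible reading (an assumption, not the authors' stated metric) is the fraction of points clustered differently from the gold standard, minimized over relabelings of the k clusters, since cluster indices are arbitrary:

```python
import numpy as np
from itertools import permutations

def clustering_error(labels_reduced, labels_full, k):
    """Disagreement with the full-feature 'gold standard' clustering,
    minimized over all k! relabelings (feasible for small k)."""
    labels_full = np.asarray(labels_full)
    best = 1.0
    for perm in permutations(range(k)):
        mapped = np.array([perm[l] for l in labels_reduced])
        best = min(best, float(np.mean(mapped != labels_full)))
    return best
```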
3-Class Wine Dataset: 178 Points in 13-dimensional Space
Remarks
- The two curves are close together
- The largest increase in error occurs as the last few features are removed
- Reducing 13 features to 4: clustering error < 4%; classification error decreased by 0.56 percentage points
2-Class Votes Dataset 435 Points in 16-dimensional Space
Remarks
- The curves have similar shapes
- The largest increase in error occurs as the last few features are removed
- Reducing 16 features to 3: clustering error < 10%; classification error increased by 1.84 percentage points
2-Class WDBC Dataset (Wisconsin Diagnostic Breast Cancer) 569 Points in 30-dimensional Space
Remarks
- The curves have similar shapes for 14 and fewer features
- Removing the first 3 features causes no change to either error curve
- Reducing 30 features to 7: clustering error < 10%; classification error increased by 3.69 percentage points
2-Class Star/Galaxy-Bright Dataset 2462 Points in 14-dimensional Space
Remarks
- The clustering error increases gradually as the number of features is reduced
- Some features appear to obstruct classification: removing them lowers the classification error
- Reducing 14 features to 4: clustering error < 10%; classification error decreased by 1.42 percentage points
2-Class Cleveland Heart Dataset 297 Points in 13-dimensional Space
Remarks
- The largest increase in both curves occurs in going from 13 to 9 features, suggesting that most features are useful here
- Reducing 13 features to 8: clustering error < 17%; classification error increased by 7.74 percentage points
Conclusion
- FSKM is a fast method for selecting relevant features while maintaining clusters similar to those in the original full-dimensional space
- Features selected by FSKM without labels may be useful for labeled-data classification as well
- FSKM eliminates a costly combinatorial search for an appropriately reduced feature set for clustering in smaller-dimensional spaces (e.g. 14-choose-6 = 3003 k-median runs to find the best 6 of 14 features for the Star/Galaxy-Bright dataset, compared to the 9 k-median runs required by FSKM)
Outlook Feature & data selection for support vector machines Sparse kernel approximation methods Gene expression selection Incorporation of prior knowledge into learning Optimization-based clustering may be useful in other machine learning applications Minimalist supervised & unsupervised learning Select minimal knowledge for best model
Web Pages (Containing Paper & Talk)