Published by Betty Parks. Modified over 9 years ago.
1
An Impossibility Theorem for Clustering By Jon Kleinberg
2
Definitions. Clustering function: operates on a set S of n ≥ 2 points and the distances among them, f(S, d) = Γ, where Γ is a partition of S. Distance function: d(i,j) ≥ 0 and d(i,j) = d(j,i), and the distance is 0 only for d(i,i). The triangle inequality is not required.
3
Many different clustering criteria: k-center, k-median, k-means, Inter-Intra, etc.
4
k-Center: minimize the maximum distance from a point to its nearest center.
5
k-median: minimize the average distance from a point to its nearest center. k-means: minimize the average squared distance.
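The three objectives differ only in how they aggregate each point's distance to its nearest center. A minimal sketch, assuming points on a line and a fixed set of centers (the function names and the sample data are illustrative, not from the slides):

```python
def nearest_center_distances(points, centers):
    """Distance from each point to its closest center (1-D points)."""
    return [min(abs(p - c) for c in centers) for p in points]

def k_center_cost(points, centers):
    # k-center: the worst-case (maximum) distance.
    return max(nearest_center_distances(points, centers))

def k_median_cost(points, centers):
    # k-median: the average distance.
    dists = nearest_center_distances(points, centers)
    return sum(dists) / len(dists)

def k_means_cost(points, centers):
    # k-means: the average squared distance.
    dists = nearest_center_distances(points, centers)
    return sum(x * x for x in dists) / len(dists)

# Hypothetical sample data.
points = [0, 1, 4, 9, 10]
centers = [1, 9]
print(k_center_cost(points, centers))
print(k_median_cost(points, centers))
print(k_means_cost(points, centers))
```

Each criterion then searches for the set of k centers that minimizes its own cost, which is why the same data can yield different clusterings.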
6
Inter-Intra: maximize D(C) – T(C), the inter-cluster distance D(C) minus the intra-cluster distance T(C).
7
Motivation: each criterion optimizes different features. Is there one clustering criterion with phenomenal cosmic powers?
8
Method: give three intuitive axioms that any criterion should satisfy. Surprise: it is not possible to satisfy all three. Reminiscent of Arrow’s Impossibility Theorem, which shows that no ranking rule can satisfy a small set of reasonable axioms at once.
9
Axiom 1 – Scale-Invariance: for any distance function d and any β > 0, we have that f(S,d) = f(S,βd).
10
Axiom 2 - Richness: Range(f) is equal to the set of all partitions of S, i.e., all possible clusterings can be generated given the right distances.
11
Axiom 3 - Consistency: let d and d’ be two distance functions. If f(d) = Γ, and d’ is such that the distance between any two points in the same cluster is no larger than in d and the distance between any two points in different clusters is no smaller than in d, then f(d’) = Γ. In symbols: d’(i,j) ≤ d(i,j) within a cluster of Γ, and d’(i,j) ≥ d(i,j) across clusters.
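The precondition of Consistency can be checked mechanically. The sketch below, with a hypothetical helper `is_gamma_transformation` and made-up points on a line, tests whether a new distance function only shrinks within-cluster distances and only stretches between-cluster distances:

```python
from itertools import combinations

def is_gamma_transformation(d, d_new, partition):
    """True if d_new never increases a within-cluster distance and never
    decreases a between-cluster distance, relative to d."""
    cluster_of = {p: idx for idx, block in enumerate(partition) for p in block}
    for i, j in combinations(sorted(cluster_of), 2):
        same_cluster = cluster_of[i] == cluster_of[j]
        if same_cluster and d_new(i, j) > d(i, j):
            return False
        if not same_cluster and d_new(i, j) < d(i, j):
            return False
    return True

def line_metric(positions):
    """Distance function induced by placing point labels on a line."""
    return lambda i, j: abs(positions[i] - positions[j])

# Hypothetical data: points 0..3, clustered as {0, 1} and {2, 3}.
partition = [{0, 1}, {2, 3}]
d = line_metric({0: 0.0, 1: 1.0, 2: 5.0, 3: 6.0})
d_ok = line_metric({0: 0.0, 1: 0.5, 2: 6.0, 3: 6.5})   # tighter clusters, farther apart
d_bad = line_metric({0: 0.0, 1: 2.0, 2: 5.0, 3: 6.0})  # points 0 and 1 moved apart

print(is_gamma_transformation(d, d_ok, partition))
print(is_gamma_transformation(d, d_bad, partition))
```

Consistency says nothing about `d_bad`: the axiom only constrains the output when the check succeeds.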
12
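Definition. Anti-chain: a collection of partitions is an anti-chain if it does not contain two distinct partitions such that one is a refinement of the other. A function whose range is an anti-chain cannot satisfy Richness, because the set of all partitions of S contains partitions that refine one another.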
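To make refinement and anti-chains concrete, here is a small sketch on a three-point set. The helpers `partition`, `is_refinement` and `is_antichain` are illustrative names, and partitions are represented as frozensets of frozensets so they can be compared directly:

```python
from itertools import permutations

def partition(*blocks):
    """Build a hashable partition from its blocks."""
    return frozenset(frozenset(b) for b in blocks)

def is_refinement(fine, coarse):
    """Every block of `fine` is contained in some block of `coarse`."""
    return all(any(block <= big for big in coarse) for block in fine)

def is_antichain(partitions):
    """No two distinct partitions where one refines the other."""
    return not any(
        p != q and is_refinement(p, q)
        for p, q in permutations(partitions, 2)
    )

# All partitions of {1, 2, 3} into exactly two blocks: an anti-chain.
two_blocks = [
    partition({1, 2}, {3}),
    partition({1, 3}, {2}),
    partition({2, 3}, {1}),
]
singletons = partition({1}, {2}, {3})

print(is_antichain(two_blocks))
print(is_antichain(two_blocks + [singletons]))
```

Adding the all-singletons partition breaks the anti-chain property because it refines every other partition, which is exactly why a rich range cannot be an anti-chain.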
13
Main Result: for each n ≥ 2, there is no clustering function f that satisfies Scale-Invariance, Richness and Consistency. This is implied by the proof that if f satisfies Scale-Invariance and Consistency, then Range(f) is an anti-chain.
14
Reminder of Axioms
Scale-Invariance: for any distance function d and any β > 0, we have that f(d) = f(βd).
Richness: Range(f) is equal to the set of all partitions of S.
Consistency: let d and d’ be two distance functions. If f(d) = Γ, and d’ is such that the distance between any two points in the same cluster is no larger than in d and the distance between any two points in different clusters is no smaller than in d, then f(d’) = Γ.
15
Single Linkage: cluster by repeatedly combining the closest points. (Figure: points on a line at 0, 1, 4, 9, 10, 12, 15, 19, 20.)
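As a rough illustration, the sketch below runs single linkage on points on a line and stops once k clusters remain. It assumes the digits in the slide's figure read as the points 0, 1, 4, 9, 10, 12, 15, 19, 20; the function name `single_linkage_k` is hypothetical:

```python
def single_linkage_k(points, k):
    """Single linkage on 1-D points, stopping when k clusters remain."""
    clusters = [[p] for p in sorted(points)]
    while len(clusters) > k:
        # On a line, the two closest clusters are always neighbours in
        # sorted order, so it is enough to look at the gaps between them.
        gaps = [clusters[i + 1][0] - clusters[i][-1]
                for i in range(len(clusters) - 1)]
        i = gaps.index(min(gaps))
        clusters[i:i + 2] = [clusters[i] + clusters[i + 1]]
    return clusters

# Assumed reading of the figure's point labels.
points = [0, 1, 4, 9, 10, 12, 15, 19, 20]
print(single_linkage_k(points, 3))
# prints [[0, 1, 4], [9, 10, 12, 15], [19, 20]]
```

The shortcut of comparing only neighbouring clusters is specific to one-dimensional data; in general single linkage compares the closest pair of points across every pair of clusters.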
16
Any two axioms
For every pair of axioms, there is a stopping condition for single linkage that satisfies both:
Consistency + Richness: only link if the distance is less than r.
Consistency + SI (Scale-Invariance): stop when you have k connected components.
Richness + SI: if x is the diameter of the graph, only add edges with weight at most βx, for a fixed β < 1.
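The three stopping conditions can be plugged into one generic single-linkage routine. The following is a sketch under stated assumptions: the names `single_linkage`, `k_cluster`, `distance_r` and `scale_alpha` and the sample points are illustrative, and the distance-r rule is implemented as "stop at the first edge heavier than r". It shows the distance-r rule changing its answer when all distances are multiplied by 10, while the other two rules do not:

```python
from itertools import combinations

def single_linkage(points, dist, should_stop):
    """Add edges by increasing weight; before each edge, should_stop(weight,
    n_components) decides whether to halt. Returns the connected components."""
    parent = {p: p for p in points}

    def find(p):
        while parent[p] != p:
            p = parent[p]
        return p

    n_components = len(points)
    for i, j in sorted(combinations(points, 2), key=lambda e: dist(*e)):
        if should_stop(dist(i, j), n_components):
            break
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            n_components -= 1
    groups = {}
    for p in points:
        groups.setdefault(find(p), set()).add(p)
    return {frozenset(g) for g in groups.values()}

def k_cluster(k):
    return lambda weight, n_components: n_components <= k

def distance_r(r):
    return lambda weight, n_components: weight > r

def scale_alpha(alpha, points, dist):
    diameter = max(dist(i, j) for i, j in combinations(points, 2))
    return lambda weight, n_components: weight > alpha * diameter

# Hypothetical data: the same points under d and under 10 * d.
points = [0, 1, 4, 9, 10]
d = lambda i, j: abs(i - j)
d10 = lambda i, j: 10 * abs(i - j)

print(single_linkage(points, d, distance_r(3)))    # two clusters
print(single_linkage(points, d10, distance_r(3)))  # all singletons: not scale-invariant
print(single_linkage(points, d, k_cluster(2)) == single_linkage(points, d10, k_cluster(2)))
```

The k-cluster rule is unaffected by scaling but can only ever output partitions with k blocks, so it gives up Richness instead.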
17
Centroid-Based Clustering. (k,g)-centroid clustering function: choose T, a set of k centroid points, such that Σ_{i∈S} g(d(i,T)) is minimized, where d(i,T) is the distance from i to its nearest centroid in T. If g is the identity, we get k-median, etc. Result: for every k ≥ 2, every function g, and n significantly larger than k, the (k,g)-centroid clustering function does not satisfy Consistency.
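For small inputs the (k,g)-centroid function can be evaluated by brute force over all k-subsets of the points. This is only a sketch for illustration (exhaustive search is exponential in k); `centroid_clustering` and the sample data are assumptions, not part of the slides:

```python
from itertools import combinations

def centroid_clustering(points, dist, k, g=lambda x: x):
    """Pick the k centroids T minimizing the sum over i of g(d(i, T)),
    then assign every point to its nearest centroid."""
    def cost(T):
        return sum(g(min(dist(i, c) for c in T)) for i in points)

    best = min(combinations(points, k), key=cost)
    clusters = {c: [] for c in best}
    for i in points:
        nearest = min(best, key=lambda c: dist(i, c))
        clusters[nearest].append(i)
    return best, list(clusters.values())

# Hypothetical data on a line.
points = [0, 1, 4, 9, 10, 11]
d = lambda i, j: abs(i - j)

# g = identity (the k-median case named on the slide).
print(centroid_clustering(points, d, 2))
# g = squaring.
print(centroid_clustering(points, d, 2, g=lambda x: x * x))
```

Passing a different g changes only the cost being minimized; the assignment step is the same.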
18
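Proof: A contradiction. (Figure: a set X of m points at pairwise distance r, a set Y of λm points at pairwise distance ε, and distance r+δ between every point of X and every point of Y.)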
19
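A new distance function. (Figure: X is split into X0 and X1, each of size m/2. Distances within X0 and within X1 shrink to r’ < r; distances between X0 and X1 stay at r; distances within Y (size λm) stay at ε; distances between X and Y stay at r+δ.)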
20
Wrapping Up: if we pick λ, r, r’, ε and δ right, then the original distance function gives us the clusters X and Y, but under the new distance function our new centers are in X0 and X1. Our new distance function followed Consistency, so it should still give us X and Y: a contradiction. This covers the case where k is 2.
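The argument can be checked numerically for k = 2 with g as the identity. The parameter values below (m = 10, λm = 2, r = 1, r’ = 0.1, ε = 0.01, δ = 0.1) are one illustrative choice made for this sketch, not values from the slides, and the brute-force `two_median` helper is hypothetical:

```python
from itertools import combinations

# Illustrative parameter choices (assumptions, not taken from the slides).
M, LAMBDA_M = 10, 2       # |X| = m and |Y| = λm
R, R_PRIME = 1.0, 0.1     # distance inside X before; inside each half after
EPS, DELTA = 0.01, 0.1    # distance inside Y; X-to-Y distance is R + DELTA

points = ([("X0", i) for i in range(M // 2)]
          + [("X1", i) for i in range(M // 2)]
          + [("Y", i) for i in range(LAMBDA_M)])
X = frozenset(p for p in points if p[0] != "Y")
Y = frozenset(p for p in points if p[0] == "Y")

def make_dist(within_half):
    """Distance function; `within_half` applies inside X0 and inside X1."""
    def dist(p, q):
        if p == q:
            return 0.0
        if p[0] == "Y" and q[0] == "Y":
            return EPS
        if p[0] == "Y" or q[0] == "Y":
            return R + DELTA
        return within_half if p[0] == q[0] else R
    return dist

d = make_dist(R)            # original: every pair inside X is at distance r
d_new = make_dist(R_PRIME)  # new: pairs inside the same half shrink to r'

def two_median(points, dist):
    """Brute-force 2-median: best pair of centers and the induced partition."""
    centers = min(combinations(points, 2),
                  key=lambda T: sum(min(dist(p, c) for c in T) for p in points))
    blocks = {c: set() for c in centers}
    for p in points:
        blocks[min(centers, key=lambda c: dist(p, c))].add(p)
    return centers, {frozenset(b) for b in blocks.values()}

_, partition_d = two_median(points, d)
centers_new, partition_new = two_median(points, d_new)

print(partition_d == {X, Y})      # original distances give the clusters X and Y
print(centers_new)                # one center in X0, one in X1
print(partition_new == {X, Y})    # the clustering changed
```

The new distance function only shrinks distances inside the cluster X and leaves everything else unchanged, so Consistency would require the output to stay {X, Y}; the changed partition is the contradiction.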
21
Discussion: Relaxing Axioms. Refinement-Consistency: if d’ is an f(d)-transformation of d, then f(d’) is a refinement of f(d). Near-Richness: all partitions except the trivial one can be obtained. With these two relaxations in place of Consistency and Richness, there is a clustering function that satisfies all three axioms. What other relaxations could we have?
22
Discussion
Does this mean there is a law of continuous employment for creators of clustering criteria?
Is the clustering function properly defined? Allow overlaps; allow outliers.
Are these the right axioms? All partitions possible vs. power set.
Axioms for graph clustering?
23
Questions?