Discerning Linkage-Based Algorithms Among Hierarchical Clustering Methods Margareta Ackerman and Shai Ben-David IJCAI 2011.

1 Discerning Linkage-Based Algorithms Among Hierarchical Clustering Methods Margareta Ackerman and Shai Ben-David IJCAI 2011

2 Clustering is one of the most widely used tools for exploratory data analysis. Social sciences, biology, astronomy, computer science, … all apply clustering to gain a first understanding of the structure of large data sets.

3 The Theory-Practice Gap Both statements still apply today.

4 Bridging the Theory-Practice Gap: Previous work. Axioms of clustering [(Kleinberg, NIPS 02), (Ackerman & Ben-David, NIPS 08), (Meila, NIPS 08)]. Clusterability [(Balcan, Blum, and Vempala, STOC 08), (Ackerman & Ben-David, AISTATS 09)].

5 Bridging the Theory-Practice Gap: Clustering algorithm selection (M. Ackerman, S. Ben-David, and D. Loker). There is a wide variety of clustering algorithms, which often produce very different clusterings. How should a user decide which algorithm to use for a given application?

6 Our approach for clustering algorithm selection. We propose a framework that lets a user utilize prior knowledge to select an algorithm. Identify properties that distinguish between the input-output behaviour of different clustering algorithms. The properties should be: 1) intuitive and “user-friendly”; 2) useful for classifying clustering algorithms.

7 Previous Work in Property-Based Framework. A property-based classification of partitional clustering algorithms (Ackerman, Ben-David, and Loker, NIPS ‘10). A characterization of single-linkage with the k-stopping criterion (Zadeh and Ben-David, UAI 09). A characterization of linkage-based clustering with the k-stopping criterion (Ackerman, Ben-David, and Loker, COLT ‘10).

8 Our contributions. Extend the above property-based framework to the hierarchical clustering setting. Propose two intuitive properties that uniquely identify hierarchical linkage-based clustering algorithms. Show that common hierarchical algorithms, including bisecting k-means, cannot be simulated by any linkage-based algorithm.

9 Outline: define linkage-based clustering; introduce two new properties of hierarchical clustering algorithms; main result; hierarchical clustering paradigms that are not linkage-based; conclusions.

10 Formal Setup: Dendrograms and clusterings. A set C_i is a cluster in a dendrogram D if there exists a node in the dendrogram such that C_i is the set of its leaf descendants. (Figure: a dendrogram.)

11 Formal Setup: Dendrograms and clusterings. C = {C_1, …, C_k} is a clustering in a dendrogram D if C_i is a cluster in D for all 1 ≤ i ≤ k, and the clusters are disjoint: C_i ∩ C_j = Ø for all 1 ≤ i < j ≤ k.
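
The two definitions above can be made concrete with a small sketch. It assumes a representation that is not in the slides: a dendrogram is a nested Python tuple, and anything that is not a tuple is a leaf. The names `leaves`, `clusters`, and `is_clustering` are hypothetical helpers chosen for illustration.

```python
from itertools import combinations


def leaves(node):
    """Return the frozenset of leaf labels below a dendrogram node."""
    if not isinstance(node, tuple):  # assumption: non-tuples are leaves
        return frozenset([node])
    return frozenset().union(*(leaves(child) for child in node))


def clusters(dendrogram):
    """Return every cluster of the dendrogram: one leaf set per node."""
    found = {leaves(dendrogram)}
    if isinstance(dendrogram, tuple):
        for child in dendrogram:
            found |= clusters(child)
    return found


def is_clustering(candidate, dendrogram):
    """Check both conditions: each set is a cluster in the dendrogram,
    and the sets are pairwise disjoint."""
    sets = [frozenset(c) for c in candidate]
    all_clusters = clusters(dendrogram)
    in_tree = all(s in all_clusters for s in sets)
    disjoint = all(a.isdisjoint(b) for a, b in combinations(sets, 2))
    return in_tree and disjoint


# A toy dendrogram over five points (illustrative data only).
D = ((("a", "b"), "c"), ("d", "e"))
print(is_clustering([{"a", "b"}, {"d", "e"}], D))
print(is_clustering([{"a", "b"}, {"a", "b", "c"}], D))
```

In this toy dendrogram, `{a, b}` and `{d, e}` form a clustering, while `{a, b}` and `{a, b, c}` do not, because they are both clusters but overlap.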

12 Formal Setup: Hierarchical clustering algorithm. A hierarchical clustering algorithm A maps an input, a data set X with a distance function d, denoted (X,d), to an output, a dendrogram of X.

13 Linkage-Based Algorithm. An algorithm A is linkage-based if there exists a linkage function l : {(X_1, X_2, d) : d over X_1 ∪ X_2} → R+ such that for any (X,d), A(X,d) can be constructed as follows: create a single-node tree for every element of X.

14 Linkage-Based Algorithm. An algorithm A is linkage-based if there exists a linkage function l : {(X_1, X_2, d) : d over X_1 ∪ X_2} → R+ such that for any (X,d), A(X,d) can be constructed as follows: create a single-node tree for every element of X; then repeat the following until a single tree remains: merge the pair of trees whose element sets are closest according to l. Ex.: single-linkage, average-linkage, complete-linkage.
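
The construction on this slide can be sketched as a generic procedure that takes the linkage function as a parameter. This is a minimal illustration, not the authors' code: dendrograms are represented as nested tuples, ties are broken by taking the first minimal pair (an assumption, since the slide does not address ties), and the names `linkage_based` and `line_distance` are hypothetical.

```python
from itertools import combinations


def single_linkage(A, B, d):
    """Distance between the closest pair of points across A and B."""
    return min(d(a, b) for a in A for b in B)


def complete_linkage(A, B, d):
    """Distance between the farthest pair of points across A and B."""
    return max(d(a, b) for a in A for b in B)


def average_linkage(A, B, d):
    """Mean distance over all pairs of points across A and B."""
    return sum(d(a, b) for a in A for b in B) / (len(A) * len(B))


def linkage_based(X, d, linkage):
    """Build a dendrogram (nested tuples) by repeatedly merging the
    two trees whose element sets are closest according to `linkage`."""
    # Each tree is stored as (set of its elements, nested-tuple shape).
    trees = [(frozenset([x]), x) for x in X]
    while len(trees) > 1:
        i, j = min(
            combinations(range(len(trees)), 2),
            key=lambda p: linkage(trees[p[0]][0], trees[p[1]][0], d),
        )
        merged = (trees[i][0] | trees[j][0], (trees[i][1], trees[j][1]))
        trees = [t for k, t in enumerate(trees) if k not in (i, j)] + [merged]
    return trees[0][1]


def line_distance(a, b):
    """Illustrative distance function: points on the real line."""
    return abs(a - b)


points = [0, 2, 5, 9]
print(linkage_based(points, line_distance, single_linkage))
print(linkage_based(points, line_distance, complete_linkage))
```

On these four points the two linkage functions give different dendrograms: single-linkage chains 5 onto {0, 2}, while complete-linkage pairs 5 with 9.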

15 Outline: define linkage-based clustering; introduce two new properties of hierarchical clustering algorithms; main result; hierarchical clustering paradigms that are not linkage-based; conclusions.

16 Locality (informal definition). If we select a set of disjoint clusters from a dendrogram and run the algorithm on the union of these clusters, we obtain a result that is consistent with the original dendrogram. (Figure: D = A(X,d) and D’ = A(X’,d), with X’ = {x_1, …, x_6}.)

17 Outer Consistency. Take a clustering C in A(X,d), that is, C on dataset (X,d). Increase pairwise between-cluster distances to obtain d’, giving C on dataset (X,d’). The outer-consistent change makes the clustering C more prominent. If A is outer-consistent, then A(X,d’) will also include the clustering C.
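
One simple way to produce such a change is sketched below. It rests on two assumptions the slide does not state: within-cluster distances are left unchanged, and between-cluster distances are increased by a uniform multiplicative factor (any non-decreasing change to between-cluster distances would fit the slide's description equally well). The function name `outer_consistent_change` is hypothetical.

```python
def outer_consistent_change(d, clustering, factor=2.0):
    """Return a distance function d' that agrees with d inside each
    cluster and multiplies every between-cluster distance by `factor`."""
    if factor < 1:
        raise ValueError("factor must be >= 1 so distances never shrink")
    # Map each point to the index of the cluster that contains it.
    label = {x: i for i, cluster in enumerate(clustering) for x in cluster}

    def d_prime(x, y):
        if label[x] == label[y]:
            return d(x, y)           # within-cluster: unchanged (assumption)
        return factor * d(x, y)      # between-cluster: increased

    return d_prime


def line_distance(a, b):
    """Illustrative distance function: points on the real line."""
    return abs(a - b)


C = [{0, 1}, {5, 6}]
d_new = outer_consistent_change(line_distance, C, factor=3.0)
print(d_new(0, 1), d_new(1, 5))
```

Under the slide's definition, an outer-consistent algorithm whose dendrogram on `line_distance` contains C must still contain C when run with `d_new`.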

18 Outline: define linkage-based clustering; introduce two new properties of hierarchical clustering algorithms; main result; hierarchical clustering paradigms that are not linkage-based; conclusions.

19 Our Main Result. Theorem: A hierarchical clustering function is Linkage-Based if and only if it is Local and Outer-Consistent.

20 Brief Sketch of Proof. Recall the direction: if A satisfies Outer-Consistency and Locality, then A is Linkage-Based. Goal: define a linkage function l so that the linkage-based clustering based on l outputs A(X,d) (for every X and d).

21 Brief Sketch of Proof. Define an operator <_A: (X,Y,d_1) <_A (Z,W,d_2) if, when we run A on (X ∪ Y ∪ Z ∪ W, d), where d extends d_1 and d_2, X and Y are merged before Z and W. Prove that <_A can be extended to a partial ordering by proving that it is cycle-free. This implies that there exists an order-preserving function l that maps pairs of data sets to R+.

22 Outline: define linkage-based clustering; introduce two new properties of hierarchical clustering algorithms; main result; hierarchical clustering paradigms that are not linkage-based; conclusions.

23 Hierarchical but Not Linkage-Based. P-Divisive algorithms construct dendrograms top-down using a partitional 2-clustering algorithm P to determine how to split nodes. Many natural partitional 2-clustering algorithms satisfy the following property: a partitional 2-clustering algorithm P is Context Sensitive if there exist d ⊂ d’ so that P({x,y,z},d) = {{x}, {y,z}} and P({x,y,z,w},d’) = {{x,y}, {z,w}}. Ex.: k-means, min-sum, min-diameter, and further-centroids.
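
The top-down construction can be sketched as a short recursion over a pluggable 2-clustering routine P. The stand-in for P below is a brute-force min-diameter split, interpreted here as minimizing the larger of the two cluster diameters; that interpretation, the nested-tuple dendrogram, and the names `divisive`, `min_diameter_split`, and `line_distance` are assumptions made for illustration, and the brute-force search is only practical for very small inputs with distinct points.

```python
from itertools import combinations


def diameter(S, d):
    """Largest pairwise distance inside S (0 for a single point)."""
    return max((d(a, b) for a, b in combinations(S, 2)), default=0)


def min_diameter_split(points, d):
    """Hypothetical stand-in for P: try every split into two non-empty
    parts and keep the one whose larger diameter is smallest."""
    first, rest = points[0], points[1:]
    best = None
    # `first` always goes left; r extra points join it, so the right
    # part keeps at least one point.
    for r in range(len(rest)):
        for extra in combinations(rest, r):
            left = [first, *extra]
            right = [p for p in rest if p not in extra]
            cost = max(diameter(left, d), diameter(right, d))
            if best is None or cost < best[0]:
                best = (cost, left, right)
    return best[1], best[2]


def divisive(points, d, P):
    """P-divisive clustering: split with P, then recurse on each part."""
    points = list(points)
    if len(points) == 1:
        return points[0]
    left, right = P(points, d)
    return (divisive(left, d, P), divisive(right, d, P))


def line_distance(a, b):
    """Illustrative distance function: points on the real line."""
    return abs(a - b)


print(divisive([0, 1, 5, 6, 20], line_distance, min_diameter_split))
```

Because P sees the whole node it is splitting, its choice can depend on points far from the pair being separated, which is the kind of context sensitivity the slide describes.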

24 Hierarchical but Not Linkage-Based The input-output behaviour of some natural divisive algorithms is distinct from that of all linkage-based algorithms. The bisecting k-means algorithm, and other natural divisive algorithms, cannot be simulated by any linkage-based algorithm.

25 Outline: define linkage-based clustering; introduce two new properties of hierarchical clustering algorithms; main result; hierarchical clustering paradigms that are not linkage-based; conclusions.

26 We characterize hierarchical Linkage-Based clustering in terms of two intuitive properties. We show that some natural hierarchical algorithms have different input-output behavior than any linkage-based algorithm.

27 Locality. For any clustering C = {C_1, …, C_k} in D = A(X,d): C is also a clustering in D’ = A(X’ = ∪ C_i, d); each C_i roots the same sub-dendrogram in both D and D’; and for all x, y in X’, x occurs below y in D iff the same holds in D’. (Figure: D = A(X,d) and D’ = A(X’,d), with X’ = {x_1, …, x_6}.)
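
The first two conditions of this definition can be checked mechanically for one choice of clustering. The sketch below is a partial check under stated assumptions: a dendrogram is summarized by the set of its clusters (leaf sets of its nodes), the third condition on relative positions is not tested, and single-linkage with first-minimum tie-breaking is used as the example algorithm. The names `single_linkage_clusters` and `locality_holds_for` are hypothetical.

```python
from itertools import combinations


def single_linkage_clusters(X, d):
    """Run single-linkage and return the set of clusters (leaf sets of
    all nodes) of the resulting dendrogram."""
    trees = [frozenset([x]) for x in X]
    found = set(trees)
    while len(trees) > 1:
        a, b = min(
            combinations(trees, 2),
            key=lambda p: min(d(x, y) for x in p[0] for y in p[1]),
        )
        trees = [t for t in trees if t not in (a, b)] + [a | b]
        found.add(a | b)
    return found


def locality_holds_for(algorithm, X, d, clustering):
    """Check, for one clustering C in D = A(X,d), that C is a clustering
    in D' = A(X',d) and that each C_i has the same sub-dendrogram."""
    D = algorithm(X, d)
    parts = [frozenset(c) for c in clustering]
    if not all(c in D for c in parts):
        raise ValueError("the given sets are not all clusters in A(X,d)")
    X_prime = [x for x in X if any(x in c for c in parts)]
    D_prime = algorithm(X_prime, d)
    still_clustering = all(c in D_prime for c in parts)
    same_subtrees = all(
        {s for s in D if s <= c} == {s for s in D_prime if s <= c}
        for c in parts
    )
    return still_clustering and same_subtrees


def line_distance(a, b):
    """Illustrative distance function: points on the real line."""
    return abs(a - b)


X = [0, 1, 4, 10, 14, 40]
print(locality_holds_for(single_linkage_clusters, X, line_distance,
                         [{0, 1}, {10, 14}]))
```

A full test of locality would repeat this for every clustering in the dendrogram and also compare the relative positions of the clusters, which this sketch leaves out.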

