1
Hierarchical Affinity Propagation Inmar E. Givoni, Clement Chung, Brendan J. Frey
2
Outline A Binary Model for Affinity Propagation Hierarchical Affinity Propagation Experiments
3
A Binary Model for Affinity Propagation
5
The Max-Sum Update Rules The message a variable node sends to a function node is the sum of the messages that variable node has received from all of its other neighboring function nodes, where the notation ne(x)\f is used to indicate the set of variable node x's neighbors excluding function node f.
6
The message a function node sends to a variable node is the maximum, over the settings of the other variables, of the function's value plus the accumulated sum of the messages sent to it by the other variable nodes. ne(f)\x is used to indicate the set of function node f's neighbors excluding variable node x.
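The two max-sum updates above can be sketched generically for binary variables. This is a minimal illustration of the message formulas only; the function and argument names are hypothetical, not from the paper:

```python
from itertools import product

def variable_to_factor(incoming, exclude):
    """Max-sum message from a variable node to factor `exclude`:
    the sum of messages from all *other* neighboring factors.
    `incoming` maps factor name -> {0: value, 1: value}."""
    return {x: sum(msg[x] for f, msg in incoming.items() if f != exclude)
            for x in (0, 1)}

def factor_to_variable(factor, var_names, messages, target):
    """Max-sum message from a factor to variable `target`:
    maximize the factor value plus the incoming messages over the
    settings of the other variables, ne(f)\\target.
    `factor` is a function of a dict {var: 0/1};
    `messages` maps each other variable -> {0: value, 1: value}."""
    others = [v for v in var_names if v != target]
    out = {}
    for x in (0, 1):
        best = float("-inf")
        for assignment in product((0, 1), repeat=len(others)):
            setting = dict(zip(others, assignment))
            setting[target] = x
            val = factor(setting) + sum(messages[v][setting[v]] for v in others)
            best = max(best, val)
        out[x] = best
    return out
```

For example, a pairwise equality factor f(a, b) = [a = b] with an incoming message {0: 0.0, 1: 2.0} on b sends {0: 2.0, 1: 3.0} to a.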
7
A binary variable model for affinity propagation
9
We derive the scalar message updates in the binary variable AP model. Recall the max-sum message update rules. The scalar message difference β_ij(1) − β_ij(0) is denoted by β_ij; similar notation is used for α, ρ, and η. In what follows, for each message we calculate its value for each setting of the binary variable and then take the difference.
10
The α_ij messages are identical to the AP availability messages a(i, j), and the ρ_ij messages are identical to the AP responsibility messages r(i, j). Thus, we have recovered the original affinity propagation updates.
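The recovered updates can be sketched in plain Python. This is a minimal sketch of the standard damped responsibility/availability iteration (the function name, damping value, and iteration count are illustrative choices, not from the slides):

```python
def affinity_propagation(s, iters=100, damping=0.5):
    """Standard scalar AP updates. s[i][k] is the similarity of point i
    to candidate exemplar k; s[k][k] is the exemplar preference."""
    n = len(s)
    r = [[0.0] * n for _ in range(n)]  # responsibilities r(i,k)
    a = [[0.0] * n for _ in range(n)]  # availabilities  a(i,k)
    for _ in range(iters):
        # r(i,k) <- s(i,k) - max_{k' != k} [a(i,k') + s(i,k')]
        for i in range(n):
            vals = [a[i][k] + s[i][k] for k in range(n)]
            for k in range(n):
                competing = max(v for kk, v in enumerate(vals) if kk != k)
                r[i][k] = damping * r[i][k] + (1 - damping) * (s[i][k] - competing)
        # a(i,k) <- min(0, r(k,k) + sum_{i' not in {i,k}} max(0, r(i',k))), i != k
        # a(k,k) <- sum_{i' != k} max(0, r(i',k))
        for k in range(n):
            pos = [max(0.0, r[ip][k]) for ip in range(n)]
            for i in range(n):
                if i == k:
                    new = sum(pos[ip] for ip in range(n) if ip != k)
                else:
                    new = min(0.0, r[k][k] + sum(pos[ip] for ip in range(n)
                                                 if ip not in (i, k)))
                a[i][k] = damping * a[i][k] + (1 - damping) * new
    # each point's exemplar maximizes a(i,k) + r(i,k)
    return [max(range(n), key=lambda k: a[i][k] + r[i][k]) for i in range(n)]
```

On four 1-D points forming two well-separated pairs, with negative squared distance as similarity and a uniform preference, the iteration settles on one exemplar per pair.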
11
Hierarchical Affinity Propagation Goal: to solve the hierarchical clustering problem. What is hierarchical clustering? Unlike the clustering methods discussed so far, a hierarchical clustering algorithm does not produce a single clustering; it produces a hierarchy of clusterings — in plain terms, a tree of clusters. Hierarchical clustering algorithms come in two flavors: agglomerative (bottom-up) and divisive (top-down). In the bottom-up variant, each data point starts out as its own cluster; at each iteration the two closest clusters are merged, until only one cluster remains, at which point the tree is complete. The top-down variant is the reverse process.
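The bottom-up procedure described above can be sketched as follows (a minimal sketch; single-linkage is an illustrative choice of inter-cluster distance, and the function names are hypothetical):

```python
def agglomerative(points, dist):
    """Bottom-up (agglomerative) hierarchical clustering: start with
    singleton clusters, repeatedly merge the closest pair of clusters
    (single-linkage: distance between closest members) until one
    cluster remains. Returns the merge history, i.e. the tree."""
    clusters = [frozenset([i]) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        # find the pair of clusters with the smallest inter-cluster distance
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        merges.append((clusters[a], clusters[b]))
        merged = clusters[a] | clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return merges
```

Each entry of the merge history is one internal node of the resulting tree; n points always yield exactly n − 1 merges.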
12
Model Goal: We propose a hierarchical exemplar-based clustering objective function in terms of a high-order factor graph, and we derive an efficient approximate loopy max-sum algorithm. We wish to find a set of L consecutive layers of clustering, where the points to be clustered in layer l are constrained to be in the exemplar set of layer l−1.
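The layer constraint can be illustrated with a toy greedy stack of clustering layers. Note that the paper's actual Greedy baseline runs AP layer by layer; here a simple leader-style exemplar picker stands in for AP to keep the sketch short, and all names and radii are hypothetical:

```python
def leader_cluster(idx, points, dist, radius):
    """Assign each point to the nearest existing exemplar within
    `radius`, or promote it to a new exemplar (a simple stand-in
    for per-layer exemplar-based clustering)."""
    exemplars, assign = [], {}
    for i in idx:
        near = [e for e in exemplars if dist(points[i], points[e]) <= radius]
        if near:
            assign[i] = min(near, key=lambda e: dist(points[i], points[e]))
        else:
            exemplars.append(i)
            assign[i] = i
    return exemplars, assign

def greedy_layers(points, dist, radii):
    """Build L layers bottom-up: layer l clusters only the exemplars
    of layer l-1, enforcing the constraint that layer-l inputs come
    from the layer-(l-1) exemplar set."""
    idx = list(range(len(points)))
    layers = []
    for r in radii:
        exemplars, assign = leader_cluster(idx, points, dist, r)
        layers.append(assign)
        idx = exemplars  # next layer sees only this layer's exemplars
    return layers
```

With increasing radii per layer, clusters coarsen as one moves up, and every point clustered at layer l is, by construction, an exemplar of layer l−1.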
13
(a) HAP factor-graph, a single layer of the standard AP model is shown in the dotted square. (b) HAP messages.
14
Differences 1. The main difference compared to the flat representation is manifested in the constraint functions: if point i is not chosen as an exemplar at layer l−1 (i.e., if its exemplar indicator variable at layer l−1 is 0), then point i will not be clustered at layer l. Conversely, if point i is chosen as an exemplar at layer l−1, it must choose an exemplar at layer l.
15
2. We note that the messages passed in the first layer (l = 1) and the messages passed in the top-most layer (l = L) are identical to the standard AP messages for a single AP layer.
16
Experiments 2D synthetic data Analysis of Synthetic HIV Sequences
17
Figure: 2D synthetic data: comparison of the objective in Eq. (8) achieved by HAP and its greedy counterpart (Greedy). Top: median percent improvement of HAP over Greedy for a given number of layers used. Bottom: scatter plots of the net similarity achieved by HAP vs. Greedy. Experiments for which HAP obtains better results than Greedy are below the line. The total percent of settings where HAP outperforms Greedy is reported in the inset. Color in the scatter plot indicates the number of layers.
18
First, we plotted precision vs. recall for various clustering settings. Synthetic HIV data: precision-recall for HAP, Greedy, HKMC, and HKMeans applied to the problem of identifying ancestral sequences from a set of 867 synthetic HIV sequences. For HKMC and HKMeans, we plot only the best precision obtained for each unique recall value.
19
Synthetic HIV data: distribution of the Rand index for different experiments using HAP and Greedy. A higher Rand index indicates that the solution better resembles the ground truth. Experiments for which HAP obtains better results than Greedy are below the line. The percentage of solutions that identified the correct single ancestor sequence at the top layer (layer 4) is also reported.