Clustering Motion or Crunching Clusters, Hiding Logs


1 Clustering Motion or Crunching Clusters, Hiding Logs
Sariel Har-Peled, UIUC

2 Introduction
Input: P - set of n points in $\mathbb{R}^d$. Target: Cluster the points into k clusters. Useful for learning, data mining, databases, etc. Huge field. A lot of variants. k-center clustering: Cover with k balls of min-max radius.

3 K-Center clustering
[Gonzalez, 85] Always pick the point furthest from the centers chosen so far. This gives a 2-approximation, using only the triangle inequality.
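
A minimal Python sketch of the farthest-point rule, assuming points are coordinate tuples under the Euclidean metric. The function name `greedy_k_center` and the choice of the first input point as the starting center are illustrative assumptions, not taken from the slides.

```python
import math
from typing import List, Sequence, Tuple

Point = Sequence[float]


def greedy_k_center(points: List[Point], k: int) -> Tuple[List[Point], float]:
    """Farthest-point greedy. Returns (centers, covering radius)."""
    if not points or k < 1:
        raise ValueError("need at least one point and k >= 1")
    centers = [points[0]]  # arbitrary starting center (assumption)
    # dist[i] = distance from points[i] to its nearest chosen center
    dist = [math.dist(p, centers[0]) for p in points]
    while len(centers) < min(k, len(points)):
        # Pick the point that is furthest from all centers chosen so far.
        i = max(range(len(points)), key=dist.__getitem__)
        if dist[i] == 0.0:
            break  # every point already coincides with a center
        centers.append(points[i])
        dist = [min(d, math.dist(p, points[i])) for d, p in zip(dist, points)]
    return centers, max(dist)
```

Each of the at most k rounds scans all n points once, so this sketch runs in O(nk) time. The centers are always input points.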

4 Clustering Motion Demo
How to do k-center clustering for moving points? Demo

5 Clustering Motion - cont'
Problem: Staying competitive with the (changing) optimal k-clustering. Idea: At any point in time, stop and cluster when needed. Slow... Kinetic Data Structures: Large number of events. Payment even if we don't use the clusters. Cons: General, no assumptions.

6 Static Competitive Clustering
P(t) - set of moving points. Partition P(t) into a "small" number of sets $S_1(t), S_2(t), \ldots, S_m(t)$ s.t. $\mathrm{radius}(S_i(t)) \le c \cdot r_{opt}(P(t), k)$. This is a (c,m,k)-static clustering. Q: Is there a (const, small val, k)-static clustering?

7 Result
P(t) - points move with motion of degree $\mu$. There exists a $(2^{\mu+1}-1,\ k^{\mu+1},\ k)$-static clustering. Extension: $(\epsilon,\ O(k/\epsilon^d)^{\mu+1},\ k)$-static clustering. Boils down to a strange clustering problem that can be solved efficiently using a new technique.

8 Reduction
Points moving linearly on the real line -> lines in 2d. Clustering: Covering lines with short vertical segments.
[Figure: lines over the time axis, with $r_{opt}(P,t)$ marked at time $t$.]
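
To make the reduction concrete, here is a small sketch. It assumes each point moves as x(t) = a + b*t and is stored as the pair (a, b); that representation and the names `position` and `radius_at` are hypothetical. In the (time, position) plane each point traces a line, and the radius of a group at time t is half the length of the shortest vertical segment at t that crosses all of the group's lines.

```python
from typing import List, Tuple

# (a, b) stands for the motion x(t) = a + b * t (illustrative representation).
Motion = Tuple[float, float]


def position(m: Motion, t: float) -> float:
    """Location on the real line of a linearly moving point at time t."""
    a, b = m
    return a + b * t


def radius_at(group: List[Motion], t: float) -> float:
    """Half the length of the vertical segment covering the group's lines at time t."""
    xs = [position(m, t) for m in group]
    return (max(xs) - min(xs)) / 2.0
```

For example, the two motions (0, 1) and (2, -1) are at distance 2 at time 0, meet at time 1, and then drift apart again, so a single static cluster holding both has radius 1, 0, 2 at times 0, 1, 3.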

9 Static clustering
Q: Find partition of lines?
[Figure: lines over the time axis.]

10 Algorithm - cont'
[Figure: lines over the time axis, with the times $t'$ and $t$ marked.]

11 Result
P(t) - points moving linearly. One can partition P(t) into $k^2$ sets $P_1, \ldots, P_{k^2}$ s.t. $\mathrm{radius}(P_i, t) \le 3\, r_{opt}(P, t, k)$. Q: How to compute this partition? # of clusters is not crucial. A: Find the time t' that minimizes the radius of the optimal k-center clustering.

12 2-Approx K-Center Clustering
[Gonzalez, 85] Requires only the triangle inequality. Can be implemented in O(nk) time. [Feder, Greene, 88] Points in $\mathbb{R}^d$, cover by k balls of min-max radius. Running time $O(n \log k)$. Optimal in the comparison model. Better than approx NP-Complete.

13 Result
2-approx k-center clustering in linear time for geometric settings. (hiding logs...) Algorithm - floor, randomized, hashing. Extends to other clustering variants. Covering by $O(k \log k)$ min-max width strips in the plane in $O(n \log k)$ time. [Agarwal, Procopiuc, 00]: $O(n k^2 \log^4 k)$.

14 Lemma
S - sample of size m from P. If $m \ge \frac{\log\rho + \mu k \log n + \beta \log n}{\epsilon} \approx \Omega\!\left(\frac{k \log n}{\epsilon}\right)$ -> Any k-clustering of S is $\epsilon$-good with high probability. Known: [Alon, Dar, Parnas, Ron, 00] [Mishra, Oblinger, Pitt, 01]

15 Algorithm
S: Sample of size $O\!\left(\frac{k \log n}{\epsilon}\right)$. C: opt/approx k-clustering of S. Compute the set P' not covered by C. By lemma: $|P'| \le \epsilon n$. C': opt/approx k-clustering of P'. Return $C \cup C'$ as the resulting clustering.

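The sample-then-recluster algorithm of slide 15 can be sketched as follows. This is a sketch under stated assumptions, not the author's implementation: the constant in the sample size is taken to be 1, the logarithm is natural, the black-box clustering is the farthest-point greedy, and the uncovered set P' is found by a brute-force scan rather than a point-location structure. All function names are hypothetical.

```python
import math
import random
from typing import List, Optional, Sequence, Tuple

Point = Sequence[float]


def greedy_centers(points: List[Point], k: int) -> Tuple[List[Point], float]:
    """Black-box k-clustering (farthest-point greedy); returns (centers, radius)."""
    centers = [points[0]]
    dist = [math.dist(p, centers[0]) for p in points]
    while len(centers) < min(k, len(points)):
        i = max(range(len(points)), key=dist.__getitem__)
        centers.append(points[i])
        dist = [min(d, math.dist(p, points[i])) for d, p in zip(dist, points)]
    return centers, max(dist)


def sample_and_cluster(points: List[Point], k: int, eps: float,
                       seed: Optional[int] = None) -> Tuple[List[Point], float]:
    """Returns at most 2k centers and a radius that covers every input point."""
    rng = random.Random(seed)
    n = len(points)
    # Sample size ~ k log n / eps; the constant factor 1 is an assumption.
    m = min(n, max(k, math.ceil(k * math.log(max(n, 2)) / eps)))
    sample = rng.sample(points, m)
    c, r = greedy_centers(sample, k)  # C: clustering of the sample S
    # P': points not covered by C (brute-force scan, O(nk) in this sketch).
    leftover = [p for p in points if min(math.dist(p, q) for q in c) > r]
    if not leftover:
        return c, r
    c2, r2 = greedy_centers(leftover, k)  # C': clustering of P'
    return c + c2, max(r, r2)  # C union C'
```

The output has up to 2k centers, which is the price paid for only clustering the sample and the leftover points.
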
16 Point-Location
Idea: Use point-location data-structure. $O\!\left(\frac{n}{k^3} \log n + k^2 + n \log k\right) = O(n \log k)$

17 Result
Compute 2k clusters. Radius is $\le 2\, r_{opt}$. Running time: $O(n \log k)$. Same running time as [Feder and Greene, 88].

18 Clustering algorithm needs...
Two black boxes: (1) a way to do k-clustering; (2) a fast way to do point-location queries in clusters. Result: 2k clusters comparable to the optimal k-clustering, in linear / near-linear time. Generic. Bottom line: approx clustering is easy if the # of clusters is not crucial.

19 2-approx clustering with k clusters
Uses a grid for fast O(1) point-location. Long sequence of simple observations. Uses ideas from [Feder, Greene, 88]. Details are technical but straightforward... Overall result: 2-approx k-center clustering in $\mathbb{R}^d$ in expected linear running time.
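
One way a grid can give fast point-location is sketched below for the plane. It assumes a uniform grid of cell side r, built with the floor function and stored in a hash table; the class name `CenterGrid` and its methods are hypothetical, and the lookup is constant expected time only when each cell holds a bounded number of centers.

```python
import math
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

Point2 = Tuple[float, float]
Cell = Tuple[int, int]


class CenterGrid:
    """Hash grid over cluster centers for 'is p within r of a center?' queries."""

    def __init__(self, centers: List[Point2], r: float) -> None:
        if r <= 0:
            raise ValueError("r must be positive")
        self.r = r
        self.cells: Dict[Cell, List[Point2]] = defaultdict(list)
        for c in centers:
            self.cells[self._cell(c)].append(c)

    def _cell(self, p: Point2) -> Cell:
        # Floor maps a point to the integer index of its grid cell.
        return (math.floor(p[0] / self.r), math.floor(p[1] / self.r))

    def locate(self, p: Point2) -> Optional[Point2]:
        """Return some center within distance r of p, or None if there is none."""
        cx, cy = self._cell(p)
        # A center within distance r lies in the same cell or one of its 8 neighbours.
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for c in self.cells.get((cx + dx, cy + dy), ()):
                    if math.dist(p, c) <= self.r:
                        return c
        return None
```

Checking the 3x3 block of cells is enough because two points at distance at most r differ by at most r in each coordinate, so their cell indices differ by at most 1.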

20 Conclusions
Proof that a small static clustering of moving points exists. New trick for fast clustering using point-location. Linear time algorithm for 2-approx k-center clustering. Efficient algorithm for computing the static clustering of moving points.

21 Quote To calm my nerves I calculated till evening the components of its trajectory, as well as the orbital perturbation caused by the presence of the lost wrench. I figured out that for the next six million years the sirloin, rotating about the ship in circular path, would lead the wrench, then catch up with it from behind and pass it again. The Star Diaries, Stanislaw Lem.

