Stability Yields a PTAS for k-Median and k-Means Clustering


Stability Yields a PTAS for k-Median and k-Means Clustering. Pranjal Awasthi, Avrim Blum, Or Sheffet. Carnegie Mellon University, November 3rd, 2010.

Stability Yields a PTAS for k-Median and k-Means Clustering (outline): introduce the k-Median / k-Means problems; define stability (the previous notion [ORSS06], weak deletion stability, β-distributed instances); the algorithm for k-Median; conclusion + open problems.

Clustering In Real Life. Clustering: come up with the desired partition. Example: you’re Comcast, looking to build an infrastructure in MV.

Clustering in a Metric Space. Clustering: come up with the desired partition. Input: n points and a distance function d: n×n → R≥0 satisfying: Reflexive: ∀p, d(p,p) = 0; Symmetry: ∀p,q, d(p,q) = d(q,p); Triangle Inequality: ∀p,q,r, d(p,q) ≤ d(p,r) + d(r,q). Output: a k-partition. Here k is large, e.g. k = polylog(n).

k-Median. Input: 1. n points in a finite metric space; 2. k. Goal: partition into k disjoint subsets C*1, C*2, …, C*k and choose a center c*i per subset. Cost: cost(C*i) = Σ_{x∈C*i} d(x, c*i); cost of the partition: Σ_i cost(C*i). Given centers ⇒ easy to get the best partition. Given a partition ⇒ easy to get the best centers.

k-Means. Input: 1. n points in Euclidean space; 2. k. Goal: partition into k disjoint subsets C*1, C*2, …, C*k and choose a center c*i per subset. Cost: cost(C*i) = Σ_{x∈C*i} d²(x, c*i); cost of the partition: Σ_i cost(C*i). Given centers ⇒ easy to get the best partition. Given a partition ⇒ easy to get the best centers.
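Since both objectives share the "given centers / given partition" structure, here is a minimal Python/NumPy sketch of those two easy steps (not the authors' code; Euclidean distance is used for concreteness, though k-median is stated over a general metric, and all names are illustrative):

```python
import numpy as np

def best_partition(points, centers):
    # Given centers, the best partition assigns each point to its nearest center.
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)

def kmedian_cost(points, centers, labels):
    return sum(float(np.linalg.norm(p - centers[l])) for p, l in zip(points, labels))

def kmeans_cost(points, centers, labels):
    return sum(float(np.linalg.norm(p - centers[l])) ** 2 for p, l in zip(points, labels))

def best_kmeans_centers(points, labels, k):
    # Given a partition, the best k-means center of a cluster is its center of mass.
    return np.array([points[labels == i].mean(axis=0) for i in range(k)])

def best_kmedian_centers(points, labels, k):
    # Given a partition, the best k-median center is a medoid: the cluster point
    # minimizing the cluster's total distance to it.
    centers = []
    for i in range(k):
        cluster = points[labels == i]
        pairwise = np.linalg.norm(cluster[:, None, :] - cluster[None, :, :], axis=2)
        centers.append(cluster[pairwise.sum(axis=1).argmin()])
    return np.array(centers)
```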

Polynomial Time Approximation Scheme. We would like to solve the k-median / k-means problems, but it is NP-hard to get OPT (= the cost of the optimal partition). So: find a c-approximation algorithm, i.e., a poly-time algorithm guaranteed to output a clustering whose cost is ≤ c·OPT. Ideally, find a PTAS: a c-approximation algorithm with c = (1+ε) for any ε > 0, whose runtime may be exponential in 1/ε.

Related Work. We focus on large k (e.g. k = polylog(n)); runtime goal: poly(n, k).
Small k: k-Median is easy (try all centers) in time n^k; k-Means has a PTAS, exponential in (k/ε) [KSS04].
General k: k-Median has a (3+ε)-apx [GK98, CGTS99, AGKMMP01, JMS02, dlVKKR03] and (1.367…)-apx hardness [GK98, JMS02]; k-Means has a 9-apx [OR00, BHPI02, dlVKKR03, ES04, HPM04, KMNPSW02]. No PTAS!
Euclidean k-Median: PTAS if the dimension is small ((log log n)^c) [ARR98].
[ORSS06]: a special case, discussed next.

World: all possible instances.

ORSS Result (k-Means). You’re Comcast, looking to build an infrastructure in MV. Why use 5 sites?

ORSS Result (k-Means): the instance is stable if OPT(k-1) > (1/α)²·OPT(k) (they require 1/α > 10); they give a (1+O(α))-approximation. Our Result (k-Means): the instance is stable if OPT(k-1) > (1+α)·OPT(k) (require only α > 0); we give a PTAS (a (1+ε)-approximation). Runtime: poly(n, k)·exp(1/α, 1/ε).

Philosophical Note. Stable instances: ∃α>0 s.t. OPT(k-1) > (1+α)·OPT(k). Non-stable instances: ∀α>0, OPT(k-1) ≤ (1+α)·OPT(k), so a (1+α)-approximation may return a (k-1)-clustering; indeed any PTAS can return a (k-1)-clustering. It is not a k-clustering problem, it is a (k-1)-clustering problem! If we believe our instance inherently has k clusters, stability is a "necessary condition" to guarantee that a PTAS returns a "meaningful" clustering. Our result: it is also a sufficient condition to get a PTAS.

World: all possible instances. ORSS Stable: any (k-1)-clustering is significantly costlier than OPT(k).

A Weaker Guarantee. You’re Comcast, looking to build an infrastructure in MV. Why use 5 sites?

(1+α)-Weak Deletion Stability. Consider OPT(k). Take any cluster C*i and associate all its points with another center c*j: this must increase the cost to at least (1+α)·OPT(k). An obvious relaxation of ORSS-stability. Our result: it suffices to get a PTAS.
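As an illustration (a hypothetical checker, not part of the paper), the largest α for which a given optimal clustering is (1+α)-weakly deletion stable can be computed by trying every reassignment of one cluster to another center:

```python
def weak_deletion_stability_alpha(points, labels, centers, d):
    # For each pair (i, j), reassign all of cluster i to center j and compare
    # the resulting cost to OPT; the instance is (1+alpha)-weakly deletion
    # stable exactly for alpha below the returned value.
    opt = sum(d(p, centers[l]) for p, l in zip(points, labels))
    cheapest = float("inf")
    for i in range(len(centers)):
        cluster_i = [p for p, l in zip(points, labels) if l == i]
        for j in range(len(centers)):
            if j != i:
                increase = sum(d(p, centers[j]) - d(p, centers[i]) for p in cluster_i)
                cheapest = min(cheapest, (opt + increase) / opt)
    return cheapest - 1.0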

World: all possible instances. ORSS Stable ⊂ Weak-Deletion Stable: merging any two clusters of OPT(k) increases the cost significantly.

β-Distributed Instances. For every cluster C*i and every point p not in C*i, we have d(p, c*i) ≥ β·OPT/|C*i| (for k-means: d²(p, c*i) ≥ β·OPT/|C*i|). We show that: for k-median, (1+α)-weak deletion stability ⇒ (α/2)-distributed; for k-means, (1+α)-weak deletion stability ⇒ (α/4)-distributed.
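The definition translates directly into a check (again a hypothetical helper; `d` is the metric, and for k-means one would use squared distances):

```python
def distributedness_beta(points, labels, centers, d):
    # Largest beta for which the given optimal clustering is beta-distributed,
    # i.e. d(p, c_i) >= beta * OPT / |C_i| for every point p outside cluster i.
    opt = sum(d(p, centers[l]) for p, l in zip(points, labels))
    sizes = [sum(1 for l in labels if l == i) for i in range(len(centers))]
    beta = float("inf")
    for i, c in enumerate(centers):
        for p, l in zip(points, labels):
            if l != i:
                beta = min(beta, d(p, c) * sizes[i] / opt)
    return beta
```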

Claim: (1+α)-Weak Deletion Stability ⇒ (α/2)-Distributed. Let p ∉ C*i and let C*j be p's cluster. Reassigning all of C*i to c*j costs at least (1+α)·OPT, so:
α·OPT ≤ Σ_{x∈C*i} d(x, c*j) − Σ_{x∈C*i} d(x, c*i) ≤ Σ_{x∈C*i} [d(x, c*i) + d(c*i, c*j)] − Σ_{x∈C*i} d(x, c*i) = Σ_{x∈C*i} d(c*i, c*j) = |C*i|·d(c*i, c*j)
⇒ α·(OPT/|C*i|) ≤ d(c*i, c*j) ≤ d(c*i, p) + d(p, c*j) ≤ 2·d(c*i, p),
using that p is closer to its own center c*j than to c*i.

World: all possible instances ⊃ β-Distributed ⊃ Weak-Deletion Stable ⊃ ORSS Stable. β-distributed: in the optimal solution, there is a large distance between each center and any "outside" point.

Main Result. We give a PTAS for β-distributed k-median and k-means instances. Running time: poly(n, k)·exp(1/β, 1/ε). There are NP-hard β-distributed instances, so a superpolynomial dependence on 1/ε is unavoidable!

Stability Yields a PTAS for k-Median and k-Means Clustering (outline): introduce the k-Median / k-Means problems; define stability; PTAS for k-Median (high-level description, intuition ("had we only known more…"), description); conclusion + open problems.

k-Median Algorithm Overview. Input: metric, k, β, OPT. 1. Initialization stage: initialize a list L, the set of "suspected" cluster "cores". 2. Population stage: populate L with subsets of points. 3. Center-retrieving stage: for each component in L, choose a center (the point that minimizes cost); then, for every choice of k components in L, evaluate the k-median cost with the respective k centers.

k-Median Algorithm Overview. Input: metric, k, β, OPT. 0. Handle "extreme" clusters (brute-force guessing of some clusters' centers). 1. Populate L with components (L := the list of "suspected" cluster "cores"). 2. Pick the best center in each component. 3. Try all possible k-centers. To show: the right definition of "core"; that we get the core of each cluster; that L can't get too big.
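The stages compose as in the following hypothetical driver (a sketch of the outline only, not the paper's exact procedure); `populate`, `best_center`, and `evaluate` are assumed helpers, sketched after the population-stage slides below:

```python
from itertools import combinations

def kmedian_ptas_outline(points, k, beta, opt, d):
    # 0. (omitted here) brute-force guess centers of the few "expensive" clusters
    L = populate(points, beta, opt, d)               # 1. population stage
    centers = [best_center(comp, d) for comp in L]   # 2. best point per component
    best_cost, best_choice = float("inf"), None
    for choice in combinations(centers, k):          # 3. try every choice of k components
        cost = evaluate(points, choice, d)
        if cost < best_cost:
            best_cost, best_choice = cost, choice
    return best_cost, best_choice
```

Stage 3 is feasible because, as the slides argue later, L contains the k cores plus only O(1/β) extra components.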

Intuition: “Mind the Gap”. We know: every point outside C*i is at distance at least β·OPT/|C*i| from c*i. In contrast, an “average” cluster contributes OPT/k to the cost, so for an “average” point p in an “average” cluster C*i, d(p, c*i) ≈ OPT/(k·|C*i|), far below that threshold.

Intuition: “Mind the Gap”. We know: every point outside C*i is at distance at least r := β·OPT/|C*i| from c*i. Denote by the core of a cluster C*i the set of its points within distance r/8 of c*i. Formally, call a cluster C*i cheap if cost(C*i) ≤ (εβ/32)·OPT; assume for now that all clusters are cheap (in general, we brute-force guess the O(1/(βε)) centers of the expensive clusters in Stage 0). Markov: at most an (ε/4)-fraction of the points of a cheap cluster lie outside the core.

Markov Inequality. Claim: at most an (ε/4)-fraction of the points of a cheap cluster lie outside the core. Proof by contradiction: if more than an (ε/4)-fraction of C*i lay at distance greater than r/8 from c*i, those points alone would cost more than (ε/4)·|C*i|·(r/8) = (εβ/32)·OPT ⇒ the cluster isn’t cheap.
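As a worked inequality (using the reconstructed constants above: core radius r/8, and "cheap" meaning cost(C*i) ≤ (εβ/32)·OPT):

```latex
\Pr_{x \in C^*_i}\!\left[\, d(x, c^*_i) > r/8 \,\right]
  \;\le\; \frac{\mathrm{cost}(C^*_i)}{|C^*_i| \cdot r/8}
  \;=\; \frac{8\,\mathrm{cost}(C^*_i)}{\beta\,\mathrm{OPT}}
  \;\le\; \frac{\varepsilon}{4}
  \qquad \text{whenever } \mathrm{cost}(C^*_i) \le \frac{\varepsilon\beta}{32}\,\mathrm{OPT},
```

since r = β·OPT/|C*i| gives |C*i|·r/8 = β·OPT/8.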

Intuition: “Mind the Gap”. In particular, since ε/4 ≤ 1/2, at least half of the points of a cheap cluster lie inside the core.

Magic (r/4) Ball. Denote r = β·(OPT/|C*i|). Call a ball “heavy” if its mass is ≥ |C*i|/2 points. If p belongs to the core ⇒ B(p, r/4) contains the entire core (any two core points are within r/4 of each other), hence ≥ |C*i|/2 points.

Magic (r/4) Ball. Draw a ball of radius r/4 around every point; unite “heavy” balls whose centers overlap. Then all points in the core are merged into one set!

Magic (r/4) Ball. But could we also merge core points with points from other clusters?

Suppose some heavy ball B(p, r/4) with r/2 ≤ d(p, c*i) ≤ 3r/4 were united with the core. Then every x ∈ B(p, r/4) satisfies r/4 = r/2 - r/4 ≤ d(x, c*i) ≤ 3r/4 + r/4 = r.

Every such x falls outside the core (since r/4 ≤ d(x, c*i) ≤ r), yet x belongs to C*i (points outside C*i are at distance > r from c*i).

So more than |C*i|/2 points of C*i would fall outside the core, contradicting the Markov claim!

Finding the Right Radius. Problem: we don’t know |C*i|, hence neither r. Solution: try all sizes, in order! Set s = n, n-1, n-2, …, 1 and r_s = β·(OPT/s). Complication: when s gets small (s = 4, 3, 2, 1) we collect many “leftovers” of a single cluster. Solution: once we add a subset to L, we remove all close-by points.

Population Stage. For s = n, n-1, n-2, …, 1: set r_s = β·(OPT/s); draw a ball of radius r_s/4 around each point; unite balls containing ≥ s/2 points whose centers overlap; once a set of ≥ s/2 points is found, put this set in L and remove all points in an (r_s/2)-“buffer zone” around it.
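A simplified, cubic-time sketch of this stage (assuming hashable points and a metric function `d`; the union step here is cruder than the actual procedure):

```python
def populate(points, beta, opt, d):
    L = []
    active = list(points)
    for s in range(len(points), 0, -1):        # try sizes s = n, n-1, ..., 1
        r = beta * opt / s                     # r_s = beta * OPT / s
        ball = {p: [q for q in active if d(p, q) <= r / 4] for p in active}
        heavy = [p for p in active if len(ball[p]) >= s / 2]
        for p in heavy:
            if p not in active:                # already swallowed by an earlier set
                continue
            comp = set(ball[p])
            for q in heavy:                    # unite heavy balls whose centers overlap
                if q != p and q in ball[p]:
                    comp.update(ball[q])
            if len(comp) >= s / 2:
                L.append(comp)
                # remove the set plus its (r/2)-buffer zone from consideration
                active = [q for q in active if all(d(q, x) > r / 2 for x in comp)]
    return L
```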

Remainder of the proof: even with the “buffer zones” we still collect the cores; the number of components in L without core points is O(1/β); and cost(k centers chosen from the cores) ≤ (1+ε)·OPT.
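For completeness, the two helpers assumed in the earlier driver, sketched for a finite metric:

```python
def best_center(component, d):
    # The point of the component minimizing the component's total distance to it.
    return min(component, key=lambda c: sum(d(x, c) for x in component))

def evaluate(points, centers, d):
    # k-median cost of assigning every point to its nearest chosen center.
    return sum(min(d(p, c) for c in centers) for p in points)
```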

A Note About k-Means. Roughly the same algorithm, with the constants roughly squared. Problem: we can’t guess centers for expensive clusters (the center of mass need not be an input point)! Solution: a random sample of O(1/ε) points from each cluster approximates its center of mass, so brute-force guess O(1/ε) points from each of the O(1/(βε)) expensive clusters. Better solution: randomly sample O(1/ε) points only from expensive clusters whose size is at least a poly(1/k) fraction of the instance. Slight complication: introduce intervals. Expected runtime:
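A sketch of the sampling step (that the mean of a random size-m sample is, in expectation, a (1 + 1/m)-approximate substitute for the true center of mass is a classical fact due to Inaba et al.; the function below is illustrative):

```python
import random
import numpy as np

def sampled_center(cluster_points, eps, rng=random):
    # Mean of a random sample of ceil(1/eps) points: with m = 1/eps, the expected
    # k-means cost against this center is within a (1 + eps) factor of optimal.
    m = max(1, int(np.ceil(1.0 / eps)))
    sample = rng.sample(list(cluster_points), min(m, len(cluster_points)))
    return np.asarray(sample, dtype=float).mean(axis=0)
```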

Conclusion. ∀ε>0, a (1+ε)-approximation algorithm for β-distributed instances of k-median / k-means (recall ORSS Stable ⊂ Weak-Deletion Stable ⊂ β-Distributed). Open problems: improve the constants? Other clustering objectives (k-center)?

Take-Home Message. Life gives you a k-median instance. “Can you solve it?” “NO!!!” But that’s not new! Stability = a belief that a PTAS is meaningful, and this belief is what allows us to obtain a PTAS: stability gives us an “Archimedean point” from which to bypass NP-hardness. To what other NP-hard problems does similar logic apply?

Thank you!

World: all possible instances. BBG Stable+ and ORSS Stable are both contained in Weak-Deletion Stable, which is contained in β-Distributed.

BBG Result. We have a target clustering, and k-median is a proxy: the target is close to OPT(k). Problem: k-median is NP-hard. Solution: use an approximation algorithm. We would like: our (1+α)-approximation algorithm outputs a meaningful k-clustering. (Proxy: your goal is to retrieve the target clustering; that k-median and the target are close is something you assume.)

BBG Result. We have a target clustering, and k-median is a proxy: the target is close to OPT(k). Problem: k-median is NP-hard. Solution: use an approximation algorithm. Implicit assumption: any k-clustering with cost at most (1+α)·OPT is δ-close (pointwise) to the target.

BBG Result. An instance is (BBG) stable if any two k-partitions with cost ≤ (1+α)·OPT(k) differ over no more than a (2δ)-fraction of the input. They give an algorithm that gets O(δ/α)-close to the target; additionally (for k-median), if all clusters have size Ω(δn/α), it gets δ-close. Our result: if BBG-stability holds and all clusters have more than 2δn points, then there is a PTAS for k-median (which in particular gets δ-close to the target). (Note: for k-means they get (δ/α)-close, whereas we get δ-close.)

Claim: BBG-Stability & Large Clusters ⇒ (1+α)-Weak Deletion Stability. We know: any two k-partitions with cost ≤ (1+α)·OPT(k) differ over at most a (2δ)-fraction of the input, and all clusters contain more than 2δn points. Take the optimal k-clustering, take C*i, and move all its points but c*i to C*j. The new partition and OPT differ on more than 2δn points, so by BBG-stability the new partition cannot have cost ≤ (1+α)·OPT(k): cost(OPT_{i→j}) ≥ (1+α)·OPT(k). (* Because the clusters are large, this counting argument is possible; any (k-1)-clustering is far from the target clustering. * Might skip this slide.)
