Randomized Composable Core-sets for Submodular Maximization
Morteza Zadimoghaddam and Vahab Mirrokni
Google Research, New York
Outline
– Core-sets and their applications
– Why submodular functions?
– A simple algorithm for monotone and non-monotone submodular functions
– Improving the approximation factor for monotone submodular functions
Processing Big Data
Input is too large to fit in one machine. Extract and process a compact representation:
– Sampling: focus only on a small subset of the data
– Sketching: compute a small summary of the data, e.g. mean, variance, …
– Mergeable summaries: multiple summaries that can be merged while preserving accuracy; see [Ahn, Guha, McGregor PODS, SODA 2012], [Agarwal et al. PODS 2012]
– Composable core-sets [Indyk et al. 2014]
Composable Core-sets
– Partition the input into several parts T_1, T_2, …, T_m
– In each part, select a subset S_i ⊆ T_i
– Take the union of the selected sets: S = S_1 ∪ S_2 ∪ … ∪ S_m
– Solve the problem on S
Evaluation: we want the set S to represent the original big input well, and to preserve the optimum solution approximately.
Core-set Applications
– Diversity and coverage maximization; Indyk, Mahabadi, Mahdian, Mirrokni [PODS 2014]
– k-means and k-median; Balcan et al. [NIPS 2013]
– Distributed balanced clustering; Bateni et al. [NIPS 2014]
Submodular Functions
A non-negative set function f defined on subsets of a ground set N, i.e. f: 2^N → R^+ ∪ {0}.
f is submodular iff for any two subsets A and B:
– f(A) + f(B) ≥ f(A ∪ B) + f(A ∩ B)
Alternative definition: f is submodular iff for any two subsets A ⊆ B and any element x ∉ B:
– f(A ∪ {x}) − f(A) ≥ f(B ∪ {x}) − f(B)
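For example, coverage functions are monotone submodular: f(A) counts the ground elements covered by the sets indexed by A. A minimal Python sketch (illustrative, not from the talk) checks the diminishing-returns definition on a toy instance:

```python
# Toy coverage function: f(A) = |union of the sets indexed by A|.
# Coverage functions are monotone submodular, so diminishing returns must hold.
SETS = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c", "d", "e"}}

def f(A):
    """Coverage value of a collection A of set indices."""
    return len(set().union(*(SETS[i] for i in A)))

def marginal(x, A):
    """Delta(x, A) = f(A | {x}) - f(A)."""
    return f(A | {x}) - f(A)

A, B, x = {1}, {1, 2}, 3                  # A is a subset of B, x outside B
assert marginal(x, A) >= marginal(x, B)   # 3 >= 2: {c,d,e} helps A more than B
```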
Submodular Maximization with Cardinality Constraints
Given a cardinality constraint k, find a size-k subset of N with maximum f value.
Machine learning applications: exemplar-based clustering, active set selection, graph cuts, … [Mirzasoleiman, Karbasi, Sarkar, Krause NIPS 2013]
Applications for search engines: coverage utility functions.
Submodular Maximization via MapReduce
Fast greedy algorithms in MapReduce and streaming; Kumar et al. [SPAA 2013]. They achieve a 1/2 approximation factor with a super-constant number of rounds.
Searched Keywords with Multiple Meanings
Formal Definition of Composable Core-sets
Define f_k(T) := max_{S ⊆ T, |S| ≤ k} f(S); e.g. f_k(N) is the value of the optimum solution.
Define ALG(T) to be the output of algorithm ALG on input set T, and suppose |ALG(T)| ≤ k.
ALG is an α-approximate composable core-set iff for any collection of sets T_1, T_2, …, T_m we have:
f_k(ALG(T_1) ∪ ALG(T_2) ∪ … ∪ ALG(T_m)) ≥ α · f_k(T_1 ∪ T_2 ∪ … ∪ T_m)
Applications of Composable Core-sets
Distributed approximation:
– Distribute the input between m machines,
– ALG selects set S_i = ALG(T_i) on machine i, for 1 ≤ i ≤ m,
– Gather the union of selected items, S_1 ∪ S_2 ∪ … ∪ S_m, on a single machine, and select k elements.
Streaming models: partition the sequence of elements and simulate the above procedure.
A class of nearest neighbor search problems.
Bad News!
Theorem [Indyk et al. 2014]: There exists no better than Õ(1/√k)-approximate composable core-set for submodular maximization.
Randomization Comes to the Rescue
Instead of working with a worst-case collection of sets T_1, T_2, …, T_m, suppose we have a random partitioning of the input.
We say ALG is an α-approximate randomized composable core-set iff
E[f_k(ALG(T_1) ∪ ALG(T_2) ∪ … ∪ ALG(T_m))] ≥ α · f_k(N),
where the expectation is taken over the random choice of {T_1, T_2, …, T_m}.
General Framework
The input set N is randomly partitioned into T_1, T_2, …, T_m, one part per machine. Run ALG on each machine to select the elements S_i = ALG(T_i). Run ALG′ on the union S_1 ∪ S_2 ∪ … ∪ S_m to find the final size-k output set. (A code sketch of this framework follows.)
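A minimal Python sketch of the framework (the function names and signatures are my own; `alg` and `alg_final` stand for ALG and ALG′, e.g. the Greedy procedure defined two slides below):

```python
import random

def coreset_framework(items, m, k, alg, alg_final, f):
    """Randomly partition `items` into m parts, run `alg` on each part
    to select a core-set S_i, then run `alg_final` on the union."""
    parts = [[] for _ in range(m)]
    for x in items:
        parts[random.randrange(m)].append(x)   # each item to one machine u.a.r.
    union = set()
    for part in parts:                         # one call per machine ("map")
        union |= set(alg(part, k, f))
    return alg_final(union, k, f)              # final selection ("reduce")
```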
Good News!
Theorem [Mirrokni, Z]: There exists a class of O(1)-approximate randomized composable core-sets for monotone and non-monotone submodular maximization. In particular, algorithm Greedy is a 1/3-approximate randomized core-set for monotone f, and (1/3 − 1/(3m))-approximate for non-monotone f.
Family of β-nice Algorithms
ALG is β-nice if for any set T and element x ∈ T \ ALG(T) we have:
– ALG(T) = ALG(T \ {x})
– Δ(x, ALG(T)) is at most β·f(ALG(T))/k,
where Δ(x, A) is the marginal value of adding x to set A, i.e. Δ(x, A) = f(A ∪ {x}) − f(A).
Theorem: A β-nice algorithm is a (1/(2+β))-approximate randomized composable core-set for monotone f and ((1 − 1/m)/(2+β))-approximate for non-monotone f.
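A tiny Python checker for the second β-nice condition (a hypothetical helper, assuming set-valued inputs and a set function f):

```python
def beta_nice_marginal_bound(T, S, f, k, beta):
    """Check that every unselected x in T has marginal value
    Delta(x, S) = f(S | {x}) - f(S) at most beta * f(S) / k."""
    S = set(S)
    return all(f(S | {x}) - f(S) <= beta * f(S) / k for x in set(T) - S)
```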
Algorithm Greedy
Given input set T, Greedy returns a size-k output set S as follows:
– Start with an empty set S.
– For k iterations: find an item x ∈ T with maximum marginal value to S, Δ(x, S), and add x to S.
Remark: Greedy is a 1-nice algorithm. In the rest, we analyze algorithm Greedy for a monotone submodular function f.
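A direct Python rendering of Greedy (a sketch; it can be plugged in as both `alg` and `alg_final` in the framework sketch above):

```python
def greedy(T, k, f):
    """Repeatedly add the element of T with the largest marginal value
    Delta(x, S) = f(S | {x}) - f(S), until |S| = k or T is exhausted."""
    S, pool = set(), set(T)
    for _ in range(min(k, len(pool))):
        base = f(S)
        best = max(pool, key=lambda x: f(S | {x}) - base)
        S.add(best)
        pool.remove(best)
    return S
```

Each iteration scans every remaining candidate, so this uses O(|T|·k) evaluations of f.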
Analysis
Let OPT be the size-k subset with maximum value of f. Let OPT′ := OPT ∩ (S_1 ∪ S_2 ∪ … ∪ S_m), and OPT″ := OPT \ OPT′.
We prove that E[max{f(OPT′), f(S_1), f(S_2), …, f(S_m)}] ≥ f(OPT)/3.
Linearizing Marginal Contributions of Elements in OPT
Consider an arbitrary permutation π on the elements of OPT. For each x ∈ OPT, define OPT_x to be the elements of OPT that appear before x in π.
By the definition of the Δ values, we have: f(OPT) = Σ_{x∈OPT} Δ(x, OPT_x).
Lower Bounding f(OPT′)
f(OPT′) = Σ_{x∈OPT′} Δ(x, OPT_x ∩ OPT′). Using submodularity (OPT_x ∩ OPT′ ⊆ OPT_x), we have: Δ(x, OPT_x ∩ OPT′) ≥ Δ(x, OPT_x).
Therefore: f(OPT′) ≥ Σ_{x∈OPT′} Δ(x, OPT_x).
Since f(OPT) = Σ_{x∈OPT′} Δ(x, OPT_x) + Σ_{x∈OPT″} Δ(x, OPT_x), it suffices to upper bound Σ_{x∈OPT″} Δ(x, OPT_x).
Proof Scheme
Goal: lower bound max{f(OPT′), f(S_1), f(S_2), …, f(S_m)}. (Figure: OPT split into the selected part OPT′ and the unselected part OPT″.)
– f(OPT) = Σ_{x∈OPT} Δ(x, OPT_x) and f(OPT′) ≥ Σ_{x∈OPT′} Δ(x, OPT_x), so it suffices to upper bound Σ_{x∈OPT″} Δ(x, OPT_x).
– For each x ∈ T_i ∩ OPT″: Δ(x, S_i) ≤ f(S_i)/k, hence Σ_{1≤i≤m} Σ_{x∈OPT″∩T_i} Δ(x, S_i) ≤ max_i{f(S_i)}, since |OPT″| ≤ k.
– How large can Δ(x, OPT_x) − Δ(x, S_i) be?
Upper Bounding the Δ Reductions
By submodularity: Δ(x, OPT_x) − Δ(x, S_i) ≤ Δ(x, OPT_x) − Δ(x, OPT_x ∪ S_i).
Summing over OPT: Σ_{x∈OPT} [Δ(x, OPT_x) − Δ(x, OPT_x ∪ S_i)] = f(OPT) − (f(OPT ∪ S_i) − f(S_i)) ≤ f(S_i), using monotonicity.
In the worst case: Σ_{1≤i≤m} Σ_{x∈OPT″∩T_i} [Δ(x, OPT_x) − Δ(x, S_i)] ≤ Σ_{1≤i≤m} f(S_i).
In expectation (each x lands in T_i with probability 1/m): Σ_{1≤i≤m} Σ_{x∈OPT″∩T_i} [Δ(x, OPT_x) − Δ(x, S_i)] ≤ Σ_{1≤i≤m} f(S_i)/m.
Conclusion: E[f(OPT′)] ≥ f(OPT) − max_i{f(S_i)} − Average_i{f(S_i)}. Greedy is a 1/3-approximate randomized core-set.
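Spelling out the final step (my arithmetic, using the conclusion above and Average_i ≤ max_i):

```latex
\max\{f(\mathrm{OPT}'),\ \max_i f(S_i)\}
  \ \ge\ \tfrac{1}{3}\bigl(f(\mathrm{OPT}') + 2\max_i f(S_i)\bigr),
\]
so, taking expectations,
\[
\mathbb{E}\bigl[\max\{\cdot\}\bigr]
  \ \ge\ \tfrac{1}{3}\bigl(f(\mathrm{OPT}) + \mathbb{E}[\max_i f(S_i)]
        - \mathbb{E}[\mathrm{Avg}_i f(S_i)]\bigr)
  \ \ge\ \tfrac{1}{3}\, f(\mathrm{OPT}).
```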
Distributed Approximation Factor
Run Greedy on each machine to select S_1, S_2, …, S_m, then run Greedy on the union to find the final size-k output set with value ≥ (1 − 1/e)·f(OPT)/3: there exists a solution of value f(OPT)/3 in S_1 ∪ S_2 ∪ … ∪ S_m.
Take the maximum of max_i{f(S_i)} and Greedy(S_1 ∪ S_2 ∪ … ∪ S_m) to achieve a 0.27 approximation factor (a short balancing calculation follows).
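A short calculation, my reading of where 0.27 comes from: write max_i f(S_i) = x·f(OPT); by the analysis above the union contains a solution of value ≥ (1 − 2x)·f(OPT) in expectation, of which the second-stage Greedy recovers a (1 − 1/e) fraction. Balancing the two candidates:

```latex
\max\bigl\{\,x,\ (1 - 1/e)(1 - 2x)\,\bigr\}
\text{ is smallest when }
x = (1 - 1/e)(1 - 2x)
\;\Longrightarrow\;
x = \frac{1 - 1/e}{1 + 2(1 - 1/e)} \approx 0.279,
```

which yields the stated 0.27 factor.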
Improving Approximation Factors for Monotone Submodular Functions
Hardness result [Mirrokni, Z]: With output sizes |S_i| ≤ k, Greedy and locally optimal algorithms are no better than 1/2-approximate randomized core-sets.
Can we increase the output sizes and get better results?
0.585-approximate Randomized Core-set
Positive result [Mirrokni, Z]: If we increase the output sizes to 4k, Greedy will be a ((2 − √2) − o(1))-approximate (≈ 0.585) randomized core-set for a monotone submodular function.
Remark: In this result, we send each item to C random machines instead of one. As a result, the approximation factor is reduced by an O(ln(C)/C) term.
0.545 Distributed Approximation Factor
If we run Greedy again at the second stage, the distributed approximation factor will only be 0.585·(1 − 1/e) ≈ 0.37 < 0.545.
Positive distributed result [Mirrokni, Z]: We propose a new algorithm, PseudoGreedy, for the second stage, with distributed approximation factor 0.545. We still use Greedy in the first stage to return 4k elements from each machine.
Improved Distributed Approximation Factor
Run Greedy on each machine and return 4k items; gather the selected elements S_1, S_2, …, S_m; run PseudoGreedy on the union to find the final size-k output set with value ≥ 0.545·f(OPT).
There exists a size-k subset of the union with value ≥ 0.585·f(OPT).
Analysis: Greedy is a 0.585-approximate Core-set
Define OPT1 := {x ∈ OPT : x ∈ Greedy(T_1 ∪ {x})} with K_1 := |OPT1|, and OPT2 := {x ∈ OPT : x ∉ Greedy(T_1 ∪ {x})} with K_2 := |OPT2|.
Assume that OPT1 ⊆ OPT′, up to an O(ln(C)/C) error term.
Use OPT2 to lower bound the marginal gains of the items selected in the first machine, S_1.
Analysis (cont’d)
Greedy starts with an empty set and adds items y_1, y_2, …, y_{4k} to S_1. Define Z_i := {y_1, y_2, …, y_i}, so Z_1 = {y_1}, Z_2 = {y_1, y_2}, and S_1 = Z_{4k}.
Marginal gain of y_i: Δ(y_i, Z_{i−1}) ≥ Σ_{x∈OPT2} Δ(x, Z_{i−1})/K_2 ≥ Σ_{x∈OPT2} Δ(x, OPT_x ∪ Z_{i−1})/K_2, since every x ∈ OPT2 was available to Greedy but never beat its choices.
Adding y_i reduces the Δ values: the drops in Σ_{x∈OPT2} Δ(x, OPT_x ∪ Z_i) and Σ_{x∈OPT1} Δ(x, OPT_x ∪ Z_i) become the linear program variables a_i and b_i of the next slide.
Analysis (cont’d): Factor-Revealing LP
For each 1 ≤ i ≤ 4k, define:
– a_i := Σ_{x∈OPT2} Δ(x, OPT_x ∪ Z_{i−1}) − Σ_{x∈OPT2} Δ(x, OPT_x ∪ Z_i)
– b_i := Σ_{x∈OPT1} Δ(x, OPT_x ∪ Z_{i−1}) − Σ_{x∈OPT1} Δ(x, OPT_x ∪ Z_i)
Assume f(OPT) = 1, and let α := Σ_{x∈OPT2} Δ(x, OPT_x) and β := f_k(S_1 ∪ OPT1).
Linear program LP_{k,K_2}: minimize β subject to
– a_i + b_i ≥ (α − Σ_{1≤i′≤i} a_{i′})/K_2 for 1 ≤ i ≤ 4k
– 1 − α ≥ Σ_{1≤i≤4k} b_i
– β ≥ 1 − α + Σ_{j∈L} a_j for every set L ⊆ {1, 2, …, 4k} with |L| = K_2
Analysis: Wrapping Up
LP_{k,K_2} is lower bounded by β in the following mathematical program:
minimize β subject to 0 ≤ α, β, λ, r ≤ 1 and
– β ≥ 1 − α + (1 − e^{−r})·α + (1 − r)·λ
– 1 − α ≥ (e^{−r}·α − λ)²/(2λ)
The minimum occurs at: r = 0, λ = 1 − √2/2, α = √2/2, and β = 2 − √2 ≈ 0.586.
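A quick numerical sanity check (not from the talk; the values λ = 1 − √2/2 and α = √2/2 are inferred from β = 2 − √2) that this point is feasible and both constraints are tight:

```python
import math

r = 0.0
lam = 1 - math.sqrt(2) / 2        # lambda ~ 0.2929
alpha = math.sqrt(2) / 2          # alpha  ~ 0.7071

# first constraint, with equality, gives beta:
beta = 1 - alpha + (1 - math.exp(-r)) * alpha + (1 - r) * lam
print(beta, 2 - math.sqrt(2))     # both ~ 0.585786

# second constraint: 1 - alpha >= (e^{-r} * alpha - lambda)^2 / (2 * lambda)
lhs = 1 - alpha
rhs = (math.exp(-r) * alpha - lam) ** 2 / (2 * lam)
print(abs(lhs - rhs) < 1e-9)      # True: holds with equality
```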
Algorithm PseudoGreedy
For all 1 ≤ K_2 ≤ k:
– Set K′ := K_2/4 and K_1 := k − K_2
– Partition the first 8K′ items of S_1 into sets {A_1, …, A_8}
– For each L ⊆ {1, …, 8}:
  – Let S′ be the union of the A_i with i ∈ L
  – Among the selected items, greedily insert K_1 + (4 − |L|)·K′ items into S′
  – If f(S′) > f(S) then S := S′
Return S
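A Python sketch of PseudoGreedy as described above (a sketch under assumptions, not the paper's exact pseudocode: I assume `selected` is the union S_1 ∪ … ∪ S_m, restrict K_2 to multiples of 4 so that K′ is integral, and use a greedy top-up helper of my own):

```python
from itertools import combinations

def greedy_insert(seed, pool, t, f):
    """Greedily add t elements from `pool` to a copy of `seed`."""
    S, pool = set(seed), set(pool) - set(seed)
    for _ in range(min(t, len(pool))):
        base = f(S)
        best = max(pool, key=lambda x: f(S | {x}) - base)
        S.add(best)
        pool.remove(best)
    return S

def pseudo_greedy(S1, selected, k, f):
    """Try every split k = K1 + K2; partition the first 8K' greedy picks of
    machine 1 into blocks A_1..A_8; seed a candidate with each subset of
    blocks, top it up greedily to k items, and keep the best candidate."""
    S1 = list(S1)                        # machine 1's output, in greedy order
    best = set()
    for K2 in range(4, k + 1, 4):        # sketch: keep K' = K2/4 integral
        Kp, K1 = K2 // 4, k - K2
        blocks = [S1[i * Kp:(i + 1) * Kp] for i in range(8)]
        for size in range(9):
            for L in combinations(range(8), size):
                seed = set().union(*(blocks[i] for i in L)) if L else set()
                t = K1 + (4 - size) * Kp     # |seed| + t = k when blocks full
                if t < 0:                    # skip infeasible subsets
                    continue
                cand = greedy_insert(seed, selected, t, f)
                if f(cand) > f(best):
                    best = cand
    return best
```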
Small Size Core-sets
Each machine can only select k′ ≪ k items. For random partitions, both the distributed and core-set approximation factors are …; for worst-case partitions, they are …
Thank you! Questions?