Distributed Submodular Maximization in Massive Datasets

Distributed Submodular Maximization in Massive Datasets Alina Ene Joint work with Rafael Barbosa, Huy L. Nguyen, Justin Ward

Combinatorial Optimization
Given: a set of objects V; a function f on subsets of V; a collection I of feasible subsets.
Find: a feasible set in I that maximizes f.
Goal: keep f and I abstract/general, capture many interesting problems, and still allow for efficient algorithms.

Submodularity
We say that a function f : 2^V → ℝ is submodular if, for all A ⊆ B ⊆ V and e ∉ B:
  f(A ∪ {e}) − f(A) ≥ f(B ∪ {e}) − f(B)
We say that f is monotone if f(A) ≤ f(B) for all A ⊆ B.
Alternatively, f is submodular if, for all A, B ⊆ V:
  f(A) + f(B) ≥ f(A ∪ B) + f(A ∩ B)
Submodularity captures diminishing returns.
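The diminishing-returns property is easy to check numerically. A minimal sketch, using a small hypothetical coverage function (the sets and element names below are illustrative, not from the talk):

```python
# f(S) = number of elements covered by the sets indexed by S.
sets = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c", "d", "e"}}

def f(S):
    """Coverage function: size of the union of the chosen sets."""
    covered = set()
    for i in S:
        covered |= sets[i]
    return len(covered)

def marginal(e, S):
    """Marginal gain of adding element e to set S."""
    return f(S | {e}) - f(S)

A = {1}          # A is a subset of B
B = {1, 2}
e = 3
# Submodularity: the gain of e can only shrink as the context grows.
assert marginal(e, A) >= marginal(e, B)
```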

Submodularity
Examples of submodular functions:
- The number of elements covered by a collection of sets
- Entropy of a set of random variables
- The capacity of a cut in a directed or undirected graph
- Rank of a set of columns of a matrix
- Matroid rank functions
- Log-determinant of a submatrix

Example: Multimode Sensor Coverage
We have distinct locations where we can place sensors. Each sensor can operate in different modes, each with a distinct coverage profile. Find sensor locations, each with a single mode, to maximize coverage.

Example: Identifying Representatives In Massive Data

Example: Identifying Representative Images
We are given a huge set X of images; each image is stored as a multidimensional vector. We have a function d giving the difference between two images. We want to pick a set S of at most k images minimizing the loss function
  L(S) = (1/|X|) Σ_{x ∈ X} min_{v ∈ S} d(x, v).
Suppose we choose a distinguished vector e0 (e.g. the zero vector), and set:
  f(S) = L({e0}) − L(S ∪ {e0}).
The function f is submodular. Our problem is then equivalent to maximizing f under a single cardinality constraint.

Need for Parallelization
Datasets grow very large: TinyImages has 80M images; Kosarak has 990K sets. We need multiple machines to fit the dataset, using parallel frameworks such as MapReduce.

Problem Definition
Given: a set V, a submodular function f, and a hereditary constraint I (cardinality at most k, matroid constraint of rank k, ...).
Find: a subset that satisfies I and maximizes f.
Parameters: n = |V|; k = max size of a feasible solution; m = number of machines.

Greedy Algorithm
Initialize S = {}. While there is some element x that can be added to S, add to S the element x that maximizes the marginal gain. Return S.
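The loop above can be sketched directly; the toy coverage instance at the bottom is illustrative, not from the talk:

```python
def greedy(V, f, k):
    """Plain greedy for max f(S) subject to |S| <= k (a minimal sketch)."""
    S = []
    for _ in range(k):
        best, best_gain = None, 0.0
        for x in V:
            if x in S:
                continue
            gain = f(S + [x]) - f(S)     # marginal gain of x given S
            if gain > best_gain:
                best, best_gain = x, gain
        if best is None:                 # no element has positive marginal gain
            break
        S.append(best)
    return S

# Toy coverage instance: f(S) = number of points covered by the chosen sets.
sets = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c", "d", "e"}, 4: {"a"}}
cover = lambda S: len(set().union(*(sets[i] for i in S))) if S else 0
print(greedy(list(sets), cover, 2))      # → [3, 1]
```

Each iteration re-evaluates f once per remaining element, which is exactly the recomputation cost the next slide complains about.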

Greedy Algorithm
Approximation guarantee: 1 − 1/e for a cardinality constraint; 1/2 for a matroid constraint.
Runtime: O(nk) function evaluations. We need to recompute the marginals each time an element is added, which is not good for large datasets.

Mirzasoleiman, Karbasi, Sarkar, Krause '13 Distributed Greedy

Performance of Distributed Greedy [Mirzasoleiman, Karbasi, Sarkar, Krause '13]
Only requires 2 rounds of communication.
The worst-case approximation ratio degrades as Θ(1/min(√k, m)) (where m is the number of machines).
Even if we use the optimal algorithm on each machine in both phases, we can still only get O(1/min(√k, m)).

Performance of Distributed Greedy
If we use the optimal algorithm on each machine in both phases, we can still only get O(1/min(√k, m)) in the worst case. In fact, we can show that running greedy in round 1 does at least as well. Why? The problem doesn't have optimal substructure: an optimal solution on one machine's data need not combine well with the other machines' output. So it is better to run greedy in round 1 instead of the optimal algorithm.

Revisiting the Analysis
We can construct bad examples for greedy/optimal, and there is a lower bound for any poly(k)-size coresets (Indyk et al. '14). Yet the distributed greedy algorithm works very well on real instances. Why?

Power of Randomness
Randomized distributed greedy: distribute the elements of V randomly in round 1, and select the best solution found in rounds 1 & 2.
Theorem: If greedy achieves a C-approximation, randomized distributed greedy achieves a C/2-approximation in expectation.
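A sketch of the two-round randomized scheme. The function names are illustrative, and the plain greedy subroutine is inlined so the sketch is self-contained:

```python
import random

def greedy(V, f, k):
    """Plain greedy for max f(S) subject to |S| <= k."""
    S = []
    for _ in range(k):
        gains = [(f(S + [x]) - f(S), x) for x in V if x not in S]
        if not gains:
            break
        gain, best = max(gains)
        if gain <= 0:                    # no remaining element helps
            break
        S.append(best)
    return S

def randomized_distributed_greedy(V, f, k, m, seed=0):
    rng = random.Random(seed)
    parts = [[] for _ in range(m)]
    for x in V:
        parts[rng.randrange(m)].append(x)            # round 1: random partition
    round1 = [greedy(part, f, k) for part in parts]  # greedy on each machine
    pooled = [x for sol in round1 for x in sol]
    round2 = greedy(pooled, f, k)                    # round 2: greedy on the union
    return max(round1 + [round2], key=f)             # best of both rounds
```

The only change from the deterministic scheme is the random partition and the final "best of both rounds" selection, yet that is what the C/2 guarantee hinges on.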

Intuition If elements in OPT are selected in round 1 with high probability Most of OPT is present in round 2 so solution in round 2 is good If elements in OPT are selected in round 1 with low probability OPT is not very different from typical solution so solution in round 1 is good

Analysis (Preliminaries)
Greedy property: suppose x is not selected by greedy on S ∪ {x}, and y is not selected by greedy on S ∪ {y}. Then x and y are not selected by greedy on S ∪ {x, y}.
Lovász extension f̂: the convex function on [0,1]^V that agrees with f on integral vectors.
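The Lovász extension can be evaluated in closed form by sorting coordinates, since f̂(x) = E_{θ∼U[0,1]} f({i : x_i > θ}). A sketch, assuming f(∅) = 0 (the function and variable names are illustrative):

```python
def lovasz_extension(x, f, V):
    """Evaluate the Lovász extension of set function f at x in [0,1]^V."""
    order = sorted(V, key=lambda i: -x[i])      # coordinates in decreasing order
    levels = [x[i] for i in order] + [0.0]
    value, S = 0.0, set()
    for idx, i in enumerate(order):
        S.add(i)                                # threshold sets grow one element at a time
        value += (levels[idx] - levels[idx + 1]) * f(S)
    return value
```

On an integral vector x = 1_S only the last element of S contributes, so the extension agrees with f(S), as the slide states.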

Analysis (Sketch)
Let X be a random 1/m sample of V. For e ∈ OPT, let p_e be the probability (over the choice of X) that e is selected by greedy on X ∪ {e}. Then the expected value of the elements of OPT that reach the final machine is at least f̂(q), where q_e = p_e. On the other hand, the expected value of the rejected elements of OPT is at least f̂(1_OPT − q).

Analysis (Sketch)
The final greedy solution T satisfies E[f(T)] ≥ C · f̂(q). The best single-machine solution S satisfies E[f(S)] ≥ C · f̂(1_OPT − q). Altogether, taking the better of the two, we get a C/2-approximation in expectation.
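The final averaging step, as I reconstruct it from the surrounding slides (the slide's own formulas were lost): the max dominates the average, convexity of f̂ combines the two bounds, and f̂ is linear along the ray c·1_OPT.

```latex
\max\{\mathbb{E}[f(T)],\; \mathbb{E}[f(S)]\}
\;\ge\; \frac{C}{2}\Bigl(\hat f(q) + \hat f(\mathbf{1}_{\mathrm{OPT}} - q)\Bigr)
\;\ge\; C\,\hat f\!\Bigl(\tfrac{1}{2}\,\mathbf{1}_{\mathrm{OPT}}\Bigr)
\;=\; \tfrac{C}{2}\, f(\mathrm{OPT}).
```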

Generality
What do we need for the proof? Monotonicity and submodularity of f; heredity of the constraint; the greedy property. The result holds in general: any time greedy is an α-approximation for a hereditary, constrained submodular maximization problem, the randomized distributed version is an α/2-approximation.

Non-monotone Functions
In the first round, use greedy on each machine; in the second round, use any algorithm on the last machine. We still obtain a constant-factor approximation for most problems.

Tiny Image Experiments (n = 1M, m = 100)

Matroid Coverage Experiments (n = 900, r = 5 and n = 100, r = 100)
It's better to distribute ellipses from each location across several machines!

Future Directions
Can we relax the greedy property further? What about non-greedy algorithms? Can we speed up the final round, or reduce the number of machines required? Better approximation guarantees?