
Unconstrained Submodular Maximization
Moran Feldman, The Open University of Israel

Based on:
- Maximizing Non-monotone Submodular Functions. Uriel Feige, Vahab S. Mirrokni and Jan Vondrák, SIAM J. Comput.
- A Tight Linear Time (1/2)-Approximation for Unconstrained Submodular Maximization. Niv Buchbinder, Moran Feldman, Joseph (Seffi) Naor and Roy Schwartz, SIAM J. Comput.
- Deterministic Algorithms for Submodular Maximization Problems. Niv Buchbinder and Moran Feldman, SODA 2016 (to appear).

Motivation: Adding Dessert (Meal 1 vs. Meal 2)
Ground set N of elements (dishes).
Valuation function f : 2^N → ℝ (a value for each meal).
Submodularity: f(A + u) − f(A) ≥ f(B + u) − f(B) for all A ⊆ B ⊆ N and u ∉ B.
Alternative definition: f(A) + f(B) ≥ f(A ∪ B) + f(A ∩ B) for all A, B ⊆ N.
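As a quick illustration of the diminishing-returns inequality, a coverage function is submodular. This is a minimal sketch; the dish names and covered-item sets are invented for the example:

```python
def coverage(sets_by_dish):
    """Build a value oracle f(S) = |union of items covered by the dishes in S|."""
    def f(S):
        covered = set()
        for dish in S:
            covered |= sets_by_dish[dish]
        return len(covered)
    return f

# Hypothetical dishes and the "items" (e.g., nutrients) each one covers.
dishes = {"soup": {1, 2}, "steak": {2, 3, 4}, "dessert": {4, 5}}
f = coverage(dishes)

A, B, u = {"soup"}, {"soup", "steak"}, "dessert"
# Submodularity: the marginal of u w.r.t. the smaller meal A
# is at least its marginal w.r.t. the larger meal B.
assert f(A | {u}) - f(A) >= f(B | {u}) - f(B)   # 2 >= 1
```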

Another Example (figure over the ground set N; no further text survives in the transcript)

Subject of this Talk
Unconstrained Submodular Maximization: a basic submodular optimization problem. Given a non-negative submodular function f : 2^N → ℝ, find a set A ⊆ N maximizing f(A). We study the approximability of this problem.
Value oracle model: algorithms should be polynomial in |N|. The representation of f might be very large, so we assume access via a value oracle: given a subset A ⊆ N, the oracle returns f(A).

Motivation: Generalizes Max-DiCut
Max-DiCut instance: a directed graph G = (V, E) with capacities c_e ≥ 0 on the arcs.
Objective: find a set S ⊆ V of nodes maximizing the total capacity of the arcs crossing the cut, i.e., f(S) = Σ_{(u,v) ∈ E : u ∈ S, v ∉ S} c_{(u,v)}.
(The running example on the slide shows a cut of capacity 2 and marginal gains of 0 and −1, illustrating that the cut function is non-monotone.)
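A value oracle for this cut function is easy to write down. This is a sketch; the arcs-as-dictionary format is an assumption of the example, not from the talk:

```python
def dicut_oracle(arcs):
    """Value oracle for the directed cut function.

    arcs: dict mapping (u, v) -> capacity c_uv >= 0.
    f(S) = total capacity of arcs leaving S. This f is non-negative and
    submodular but non-monotone (f(V) = 0), so Max-DiCut is a special
    case of unconstrained submodular maximization.
    """
    def f(S):
        return sum(c for (u, v), c in arcs.items() if u in S and v not in S)
    return f
```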

History of the Problem

Randomized approximation algorithms:
0.4 – non-oblivious local search [Feige et al. 07]
0.41 – simulated annealing [Oveis Gharan and Vondrák 11]
0.42 – structural continuous greedy [Feldman et al. 11]
0.5 – double greedy [Buchbinder et al. 12]

Deterministic approximation algorithms:
0.33 – local search [Feige et al. 07]
0.4 – recursive local search [Dobzinski and Mor 15]
0.5 – derandomized double greedy [Buchbinder and Feldman 16]

Approximation hardness:
0.5 – information theoretic [Feige et al. 07]

Generic Double Greedy Algorithm
Initially: X = ∅, Y = N = {u_1, u_2, …, u_n}.
For i = 1 to n: either add u_i to X, or remove u_i from Y.
Return X (= Y).
(Running example on the slide: the elements u_1, …, u_n are processed left to right, with X growing and Y shrinking until they meet.)

Simple Decision Rule
Intuitively, we want to maximize f(X) + f(Y). In each iteration we have two options: add u_i to X, or remove it from Y. We choose the one that increases this objective by more:
a_i = f(X + u_i) − f(X) is the change from adding u_i to X.
b_i = f(Y − u_i) − f(Y) is the change from removing u_i from Y.
If a_i ≥ b_i, add u_i to X. Otherwise, remove u_i from Y.
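A minimal sketch of the algorithm with this rule, assuming f is a set-valued value oracle like the ones above (this is the deterministic variant, whose ratio is analyzed next):

```python
def double_greedy(f, ground_set):
    """Deterministic double greedy with the simple a_i >= b_i rule.

    f: value oracle taking a set and returning a float.
    Returns the common final set X (= Y).
    """
    X = set()
    Y = set(ground_set)
    for u in ground_set:
        a = f(X | {u}) - f(X)   # a_i: change from adding u to X
        b = f(Y - {u}) - f(Y)   # b_i: change from removing u from Y
        if a >= b:
            X.add(u)
        else:
            Y.remove(u)
    return X                    # X == Y at this point

# Example usage with the Max-DiCut oracle sketched earlier:
# f = dicut_oracle({("s", "t"): 2.0, ("t", "v"): 1.0})
# S = double_greedy(f, {"s", "t", "v"})
```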

Analysis Roadmap
HYB – a hybrid solution: starts as OPT, and ends as X (= Y). If X and Y agree on u_i, HYB also agrees with them; otherwise, HYB agrees with OPT.
Over the iterations, f(HYB) decreases from f(OPT) to the value of the output, while [f(X) + f(Y)]/2 increases from [f(∅) + f(N)]/2 ≥ 0 to the value of the output. Call each iteration's increase of the latter its Gain, and its decrease of the former its Damage.
Assume in every iteration: Gain ≥ c ∙ Damage, for some c > 0. Then the approximation ratio is c / (1 + c).
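For concreteness, here is the telescoping argument that the roadmap depicts, reconstructed from the quantities named above (using the non-negativity of f):

```latex
% Per-iteration guarantee: Gain_i >= c * Damage_i, where Gain_i is the
% increase of [f(X)+f(Y)]/2 and Damage_i is the decrease of f(HYB).
% Summing over all n iterations, with OUT denoting the output set:
\begin{align*}
f(\mathrm{OPT}) - f(\mathrm{OUT})
  &= \sum_{i=1}^{n} \mathrm{Damage}_i
   \le \frac{1}{c} \sum_{i=1}^{n} \mathrm{Gain}_i \\
  &= \frac{1}{c} \left( f(\mathrm{OUT}) - \frac{f(\varnothing) + f(N)}{2} \right)
   \le \frac{f(\mathrm{OUT})}{c},
\end{align*}
% which rearranges to f(OUT) >= [c / (1 + c)] * f(OPT).
```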

Simple Decision Rule – Gain
If a_i ≥ b_i, we add u_i to X, and f(X) increases by a_i.
If a_i < b_i, we remove u_i from Y, and f(Y) increases by b_i.
Either way, f(X) + f(Y) increases by max{a_i, b_i}.
Lemma: the gain is always non-negative.

Gain Non-negativity – Proof
(Figure: element u_5 is still undecided, so X ⊆ Y − u_5.)
Since X ⊆ Y − u_5, submodularity gives f(X + u_5) − f(X) ≥ f(Y) − f(Y − u_5), i.e., a_5 ≥ −b_5.
Hence a_5 + b_5 ≥ 0, and therefore max{a_5, b_5} ≥ 0.

Simple Decision Rule – Damage
When the algorithm makes the “right” decision (it adds to X an element u_i ∈ OPT, or removes from Y an element u_i ∉ OPT), HYB does not change, so there is no damage.
Summary for right decisions: Gain ≥ 0 and Damage = 0, so Gain ≥ c ∙ Damage for every c > 0.

Wrong Decision – Damage Control
(Figure: u_5 ∈ OPT is wrongly removed from Y, so HYB loses u_5.)
Lemma: when making a wrong decision, the damage is at most the a_i or b_i corresponding to the other decision. For example, when u_5 ∈ OPT is wrongly removed from Y, we have X ⊆ HYB − u_5, so by submodularity a_5 ≥ f(HYB) − f(HYB − u_5) = Damage.

Doing the Math
When the algorithm makes the “wrong” decision:
The damage is upper bounded by either a_i or b_i – the value of the other option, which is at most max{a_i, b_i}.
The gain, measured as the increase of [f(X) + f(Y)]/2, is max{a_i, b_i}/2, which is therefore at least half the damage (i.e., c = ½).
Approximation ratio: c / (1 + c) = 1/3.

Intuition
If a_i is much larger than b_i (or the other way around): even if our decision rule makes a wrong decision, the gain a_i/2 is much larger than the damage b_i. This allows a larger c.
If a_i and b_i are close: both decisions result in a similar gain, but making the wrong decision is problematic. We should give each decision some probability.

Randomized Decision Rule
If b_i ≤ 0, add u_i to X. If a_i ≤ 0, remove u_i from Y.
Otherwise (for simplicity, assume this case, i.e., a_i, b_i > 0):
 With probability a_i / (a_i + b_i), add u_i to X.
 Otherwise (with probability b_i / (a_i + b_i)), remove u_i from Y.
Gain analysis: E[Gain] = (a_i² + b_i²) / (2(a_i + b_i)).
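A sketch of the full randomized rule; the probabilities a_i/(a_i + b_i) and b_i/(a_i + b_i) are the ones from the cited double greedy paper, and the oracle convention follows the earlier sketches:

```python
import random

def randomized_double_greedy(f, ground_set):
    """Randomized double greedy (1/2-approximation in expectation)."""
    X, Y = set(), set(ground_set)
    for u in ground_set:
        a = f(X | {u}) - f(X)
        b = f(Y - {u}) - f(Y)
        if b <= 0:                       # adding u cannot hurt
            X.add(u)
        elif a <= 0:                     # removing u cannot hurt
            Y.remove(u)
        elif random.random() < a / (a + b):
            X.add(u)                     # with probability a / (a + b)
        else:
            Y.remove(u)                  # with probability b / (a + b)
    return X
```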

Randomized Decision Rule – Damage
The damage from making the “right” decision is 0; only the “wrong” decision causes damage:
If u_i ∈ OPT: E[Damage] ≤ [b_i / (a_i + b_i)] ∙ a_i.
If u_i ∉ OPT: E[Damage] ≤ [a_i / (a_i + b_i)] ∙ b_i.
In both cases E[Damage] ≤ a_i b_i / (a_i + b_i) ≤ (a_i² + b_i²) / (2(a_i + b_i)) = E[Gain], so c = 1.
Approximation ratio: c / (1 + c) = ½.

Derandomization – First Attempt
Idea: the state of the random algorithm is a pair (X, Y). Explicitly store the distribution over the current states of the algorithm.
(Figure: a tree of weighted states, starting at (∅, N, 1) and branching into states of the form (X, Y, p) and (X′, Y′, 1 − p), and so on.)
Problem: the number of states can double after every iteration, which can require exponential time.

Notation
From a state S = (X, Y), the algorithm can move to (X + u_i, Y) or to (X, Y − u_i).
a_i(S) and b_i(S) – the a_i and b_i corresponding to state S.
z(S) – the probability of adding u_i; w(S) – the probability of removing u_i.
We want to select these smartly; think of them as variables.

Gain and Damage
Gain at state S: [z(S) ∙ a_i(S) + w(S) ∙ b_i(S)] / 2 – a linear function of z(S) and w(S).
Damage at state S: if u_i ∈ OPT, Damage_in(S) = w(S) ∙ a_i(S); if u_i ∉ OPT, Damage_out(S) = z(S) ∙ b_i(S). Again, linear functions of z(S) and w(S).
In the randomized algorithm, for every state S we required:
Gain(S) ≥ c ∙ Damage_in(S) and Gain(S) ≥ c ∙ Damage_out(S).
We found z(S) and w(S) for which these inequalities hold with c = 1.

Expectation to the Rescue
It is enough for the inequalities to hold in expectation over S. The expectation over linear functions of z(S) and w(S) is again a linear function of this kind, so the requirements on z(S) and w(S) can be stated as an LP:
E_S[Gain(S)] ≥ c ∙ E_S[Damage_in(S)]
E_S[Gain(S)] ≥ c ∙ E_S[Damage_out(S)]
z(S) + w(S) = 1  ∀S
z(S), w(S) ≥ 0  ∀S
Every algorithm using probabilities z(S) and w(S) that obey this LP has the approximation ratio corresponding to c.

Strategy
From each state S = (X, Y), move to (X + u_i, Y) with probability z(S) and to (X, Y − u_i) with probability w(S). If z(S) or w(S) is 0, only one state results from S.
The number of states in the next iteration equals the number of non-zero variables in our LP solution, so we want an LP solution with few non-zero variables.

Finding a Good Solution
The LP has a solution for c = 1: the randomized rule z(S) = a_i(S) / (a_i(S) + b_i(S)), w(S) = b_i(S) / (a_i(S) + b_i(S)). The LP is also bounded.
A basic feasible solution contains at most one non-zero variable for every constraint: one non-zero variable for every current state (from z(S) + w(S) = 1), plus two additional non-zero variables (from the two expectation constraints).
Hence the size of the distribution can increase by at most 2 in every iteration.
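As a sanity check of this construction, here is a minimal sketch of the per-iteration LP, solved with SciPy's dual-simplex solver so that a basic feasible solution (with few non-zero variables) is returned. The state representation and all names are illustrative assumptions; only the Gain/Damage expressions come from the analysis above:

```python
import numpy as np
from scipy.optimize import linprog

def derandomized_step(states, c=1.0):
    """One iteration of the derandomized double greedy (a sketch).

    states: list of (q, a, b) triples -- the probability q of being in a
    state S = (X, Y), and the marginals a = f(X+u)-f(X), b = f(Y-u)-f(Y).
    Assumes a, b > 0 in every state (the interesting case from the talk).
    Returns arrays z, w of per-state probabilities of adding / removing u.
    """
    m = len(states)
    n_vars = 2 * m                      # variables: z_0..z_{m-1}, w_0..w_{m-1}
    obj = np.zeros(n_vars)              # pure feasibility problem

    # E[Gain] >= c * E[Damage_in] and E[Gain] >= c * E[Damage_out],
    # written in <= 0 form: c * E[Damage] - E[Gain] <= 0, where
    #   Gain(S)       = (z*a + w*b) / 2
    #   Damage_in(S)  = w * a   (u in OPT, wrongly removed)
    #   Damage_out(S) = z * b   (u not in OPT, wrongly added)
    row_in, row_out = np.zeros(n_vars), np.zeros(n_vars)
    for j, (q, a, b) in enumerate(states):
        row_in[j] = -q * a / 2                  # z part: -gain
        row_in[m + j] = q * (c * a - b / 2)     # w part: c*damage_in - gain
        row_out[j] = q * (c * b - a / 2)
        row_out[m + j] = -q * b / 2
    A_ub = np.vstack([row_in, row_out])
    b_ub = np.zeros(2)

    A_eq = np.zeros((m, n_vars))                # z_j + w_j = 1 for every state
    for j in range(m):
        A_eq[j, j] = A_eq[j, m + j] = 1.0
    b_eq = np.ones(m)

    res = linprog(obj, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=(0, None), method="highs-ds")
    assert res.status == 0, "LP should be feasible for c = 1"
    return res.x[:m], res.x[m:]
```

A vertex solution of this LP has at most m + 2 non-zero variables, matching the slide's counting argument.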

In Conclusion
Algorithm: explicitly stores a distribution over states. In every iteration, it uses an LP to calculate the probabilities of moving from one state to another, and then calculates the distribution for the next iteration based on these probabilities.
Performance: the approximation ratio is ½ (for c = 1). The size of the distribution grows linearly, so this is a polynomial time algorithm.
In fact, the LP can be solved in near-linear time, resulting in a near-quadratic time complexity.

Hardness – Starting Point
Consider the cut function of the complete graph on n vertices:
For every set S: f(S) = |S| ∙ (n − |S|).
The maximum value is n²/4, attained at |S| = n/2.

A Distribution of Hard Instances
Consider the cut function of the complete bipartite graph with edge weights 2, where (A, B) is a random partition of the vertices into two equal sets:
For every set S: f(S) = 2 ∙ [|S ∩ A| ∙ (n/2 − |S ∩ B|) + |S ∩ B| ∙ (n/2 − |S ∩ A|)].
The maximum value is n²/2, attained at S = A (or S = B).

Deterministic Algorithms
Given the complete graph input, a deterministic algorithm makes a series of queries Q_1, Q_2, …, Q_m. For every set Q_i:
Value under the complete graph: |Q_i| ∙ (n − |Q_i|).
Under the bipartite complete graph, w.h.p. |Q_i ∩ A| ≈ |Q_i ∩ B|; assuming |Q_i ∩ A| = |Q_i ∩ B| = |Q_i| / 2, the value of Q_i is again |Q_i| ∙ (n − |Q_i|).
Therefore the deterministic algorithm w.h.p. makes the same series of queries for both inputs, w.h.p. cannot distinguish the two inputs, and has an approximation ratio of at most ½ + o(1).
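A small numerical illustration of the indistinguishability argument, with n = 6 and a fixed partition standing in for the random one (all names are illustrative):

```python
import itertools

n = 6
V = range(n)
A, B = set(range(n // 2)), set(range(n // 2, n))   # the hidden partition

def f_complete(S):
    return len(S) * (n - len(S))                   # cut of the complete graph

def f_bipartite(S):                                # complete bipartite, weight 2
    sa, sb = len(S & A), len(S & B)
    return 2 * (sa * (n // 2 - sb) + sb * (n // 2 - sa))

# Balanced queries cannot separate the two inputs...
for T in itertools.chain.from_iterable(
        itertools.combinations(V, k) for k in range(n + 1)):
    S = set(T)
    if len(S & A) == len(S & B):
        assert f_complete(S) == f_bipartite(S)

# ...but the optima differ by a factor of 2: n^2/4 vs n^2/2.
print(max(f_complete(set(S)) for S in itertools.combinations(V, n // 2)))  # 9
print(f_bipartite(A))                                                      # 18
```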

Sealing the Deal
Hardness for randomized algorithms: our distribution is hard for every deterministic algorithm, so hardness for randomized algorithms follows from Yao’s principle.
Getting rid of the assumption: a query set Q cannot separate the inputs when |Q ∩ A| = |Q ∩ B|. This should remain true also when |Q ∩ A| ≈ |Q ∩ B|. The bipartite graph input should be modified to have f(Q) = |Q| ∙ (n − |Q|) whenever |Q ∩ A| ≈ |Q ∩ B|.

Getting Rid of the Assumption (cont.)
The modified function: when −εn ≤ |S ∩ A| − |S ∩ B| ≤ εn (for an arbitrary ε > 0), the function is redefined with extra correction terms (formulas shown only as images on the slide); otherwise, the original bipartite cut value is kept.
The extra terms keep the function submodular and decrease the maximum value by only O(εn²), resulting in a hardness of ½ + ε.