Published by Evangeline Long. Modified over 8 years ago.
1
Deterministic Algorithms for Submodular Maximization Problems
Moran Feldman, The Open University of Israel
Joint work with Niv Buchbinder.
2
Submodular Functions

Definition
Given a ground set N, a set function $f : 2^N \to \mathbb{R}$ assigns a number to every subset of the ground set. A set function is submodular if:
- $f(A + u) - f(A) \ge f(B + u) - f(B)$ for all $A \subseteq B \subseteq N$ and $u \notin B$, or
- $f(A) + f(B) \ge f(A \cup B) + f(A \cap B)$ for all $A, B \subseteq N$.

Submodular functions can be found in:
- Combinatorics (2 examples soon)
- Machine Learning
- Image Processing
- Algorithmic Game Theory
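The first condition in the definition can be checked by brute force on a small ground set. The sketch below is only an illustration: the helper names (`subsets`, `is_submodular`) and the two toy functions are assumptions made for this example, and the check takes exponential time, so it is useful for sanity tests only.

```python
from itertools import combinations

def subsets(ground):
    """Yield every subset of `ground` as a frozenset."""
    items = list(ground)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def is_submodular(f, ground):
    """Check f(A + u) - f(A) >= f(B + u) - f(B) for all A ⊆ B ⊆ N and u not in B."""
    ground = frozenset(ground)
    for B in subsets(ground):
        for A in subsets(B):
            for u in ground - B:
                if f(A | {u}) - f(A) < f(B | {u}) - f(B):
                    return False
    return True

# Toy instance (hypothetical data): a small coverage function.
SETS = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c"}}

def coverage(S):
    """Number of elements covered by the sets indexed by S."""
    return len(set().union(*(SETS[i] for i in S)))

def squared_size(S):
    """|S|^2 has increasing marginal values, so it is not submodular."""
    return len(S) ** 2

print(is_submodular(coverage, SETS))          # coverage functions are submodular
print(is_submodular(squared_size, {1, 2, 3}))
```

The check uses exact integer arithmetic here; with floating-point functions a small tolerance would be needed.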
3
Example 1: Cut Function

A directed graph G = (V, E) with capacities $c_e \ge 0$ on the arcs.
For every $S \subseteq V$: [formula for f(S) not preserved in the transcript]
Observation: f(S) is a non-negative submodular function.
[Figure: example graph with a cut of value f(S) = 3.]
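As a concrete illustration, the sketch below uses the standard definition of a directed cut function, in which f(S) is the total capacity of the arcs leaving S. That definition is an assumption here, because the slide's own formula is not preserved; the graph and the name `make_cut_function` are likewise made up for the example.

```python
def make_cut_function(arcs):
    """Build f(S) = total capacity of arcs (u, v) with u in S and v not in S.

    `arcs` maps (u, v) pairs to non-negative capacities (assumed input format).
    """
    def f(S):
        S = set(S)
        return sum(c for (u, v), c in arcs.items() if u in S and v not in S)
    return f

# Hypothetical three-node graph.
ARCS = {("a", "b"): 2, ("b", "c"): 1, ("c", "a"): 3, ("a", "c"): 1}
cut = make_cut_function(ARCS)

print(cut({"a"}))            # arcs a->b and a->c leave {a}
print(cut({"a", "b"}))       # arcs b->c and a->c leave {a, b}
print(cut({"a", "b", "c"}))  # nothing leaves the whole vertex set
```

The last two calls show that a cut function is not monotone: adding vertices to S can lower its value.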
4
Example 2: Coverage Function

Elements $E = \{e_1, e_2, \ldots, e_n\}$ and sets $s_1, s_2, \ldots, s_m \subseteq E$.
For every $S = \{s_{i_1}, s_{i_2}, \ldots, s_{i_k}\}$: [formula for f(S) not preserved in the transcript]
Observation: f(S) is a non-negative (monotone) submodular function.
[Figure: five sets $S_1, \ldots, S_5$.]
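A minimal sketch of a coverage function follows. It assumes the standard definition, in which f(S) is the number of elements covered by the union of the chosen sets, since the slide's formula is not preserved. The instance and the name `make_coverage_function` are hypothetical.

```python
def make_coverage_function(sets):
    """Build f(S) = size of the union of the sets named in S.

    `sets` maps a set name to the elements it covers (assumed input format).
    """
    def f(S):
        covered = set()
        for name in S:
            covered |= sets[name]
        return len(covered)
    return f

# Hypothetical instance with three sets over the elements 1..6.
SETS = {"s1": {1, 2, 3}, "s2": {3, 4}, "s3": {4, 5, 6}}
cover = make_coverage_function(SETS)

# Diminishing returns: s2 adds two new elements to the empty collection,
# but nothing to {s1, s3}, which already covers elements 3 and 4.
gain_small = cover({"s2"}) - cover(set())
gain_large = cover({"s1", "s3", "s2"}) - cover({"s1", "s3"})
print(gain_small, gain_large)
```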
5
Submodular Maximization Problems

Unconstrained Submodular Maximization
Given a non-negative submodular function $f : 2^N \to \mathbb{R}$, find a set $S \subseteq N$ maximizing f(S).
Generalizes: Max-(Directed)-Cut.

Submodular Maximization with a Cardinality Constraint
Given a non-negative submodular function $f : 2^N \to \mathbb{R}$ and an integer k, find a set S of size at most k maximizing f(S).
Generalizes: Max-k-Coverage and Max-Cut with specified cut size.

Other Constraints
Exactly k elements, matroid, knapsack…
6
Our Main Question: is randomization "necessary" or "not necessary" for submodular maximization?

Reasons for "not necessary"
- Most approximation algorithms can be derandomized.
- "Not necessary" is the default …

Reasons for "necessary"
- Currently most (best) known algorithms are randomized.
- Algorithms are assumed to access the function via a value oracle. This makes it difficult to apply standard techniques (e.g., conditional expectations).
- Algorithms based on the multilinear extension are inherently randomized.

The model
- Algorithms should be polynomial in |N|.
- The representation of f might be very large.
- Assume access via a value oracle: given a subset $A \subseteq N$, it returns f(A).
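The value-oracle model can be pictured as a thin wrapper that hides the representation of f and only answers queries of the form "what is f(A)?". The sketch below is an illustration only: the class `ValueOracle`, its query counter and the helper `best_singleton` are assumptions made for this example and do not come from the talk.

```python
class ValueOracle:
    """Wrap a set function so that callers can only ask for f(A)."""

    def __init__(self, f):
        self._f = f
        self.queries = 0  # number of oracle calls made so far

    def __call__(self, A):
        self.queries += 1
        return self._f(frozenset(A))

def best_singleton(oracle, ground):
    """Return the element u maximizing f({u}), using |N| oracle queries."""
    return max(sorted(ground), key=lambda u: oracle({u}))

# Hypothetical modular function: f(A) is the sum of the elements of A.
oracle = ValueOracle(lambda A: sum(A))
best = best_singleton(oracle, {1, 2, 3})
print(best, oracle.queries)
```

Counting queries this way makes the "polynomial in |N|" requirement measurable: an algorithm is charged per call, no matter how large the hidden description of f is.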
7
History and Results: Unconstrained Maximization

Randomized Approximation Algorithms
- 0.4 – non-oblivious local search [Feige et al. 07]
- 0.41 – simulated annealing [Oveis Gharan and Vondrak 11]
- 0.42 – structural continuous greedy [Feldman et al. 11]
- 0.5 – double greedy [Buchbinder et al. 12]

Deterministic Approximation Algorithms
- 0.33 – local search [Feige et al. 07]
- 0.4 – recursive local search [Dobzinski and Mor 15]
- 0.5 – derandomized double greedy [this work]

Approximation Hardness
- 0.5 – information theoretic based [Feige et al. 07]
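For orientation, here is a minimal sketch of the randomized double greedy listed above, written from the commonly stated form of the algorithm rather than from these slides. The function name, the fixed (sorted) processing order and the example graph are assumptions. The 0.5 guarantee holds in expectation, not for every run.

```python
import random

def double_greedy(f, ground, rng=None):
    """Randomized double greedy sketch for unconstrained maximization.

    X grows from the empty set and Y shrinks from the ground set; each
    element is either added to X or removed from Y.
    """
    rng = rng or random.Random()
    X, Y = set(), set(ground)
    for u in sorted(ground):  # any fixed order works; sorted for repeatability
        a = f(X | {u}) - f(X)   # gain from adding u to X
        b = f(Y - {u}) - f(Y)   # gain from removing u from Y
        a_plus, b_plus = max(a, 0), max(b, 0)
        p_add = 1.0 if a_plus + b_plus == 0 else a_plus / (a_plus + b_plus)
        if rng.random() < p_add:
            X.add(u)
        else:
            Y.remove(u)
    return X  # X and Y coincide once every element has been processed

# Hypothetical directed graph; f(S) is the capacity of arcs leaving S.
ARCS = {("a", "b"): 2, ("b", "c"): 1, ("c", "a"): 3, ("a", "c"): 1}

def cut(S):
    return sum(c for (u, v), c in ARCS.items() if u in S and v not in S)

solution = double_greedy(cut, {"a", "b", "c"}, random.Random(0))
print(solution, cut(solution))
```

The only randomness is the single coin flip per element, which is the transition probability that the derandomization later replaces.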
8
History and Results: Cardinality Constraint

Approximation Algorithms
- 0.25 – local search [Lee et al. 10]
- 0.309 – fractional local search [Vondrak 13]
- 0.325 – simulated annealing [Oveis Gharan and Vondrak 11]
- 0.367 – measured continuous greedy [Feldman et al. 11]
- 0.367 (faster) – random greedy [Buchbinder et al. 14]
- 0.371 – "wide" random greedy [Buchbinder et al. 14]

Our Result (deterministic)
- 0.367 ($e^{-1}$) – derandomized random greedy [this work]

Approximation Hardness
- 0.5 – for unconstrained maximization [Feige et al. 07]
- 0.491 – symmetry gap [Oveis Gharan and Vondrak 11]
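The random greedy entry above can be sketched as follows. This is a simplified reading of the algorithm, not code from the talk: in each of k rounds it picks uniformly among the k candidates with the largest marginal gains, and dummy elements (here `None`) stand in for elements that add no positive value. The name `random_greedy`, the tie-breaking rule and the instance are assumptions.

```python
import random

def random_greedy(f, ground, k, rng=None):
    """Simplified random greedy sketch for a cardinality constraint |S| <= k."""
    rng = rng or random.Random()
    S = set()
    for _ in range(k):
        gains = {u: f(S | {u}) - f(S) for u in ground if u not in S}
        # Keep up to k elements with the largest positive gains (ties by name).
        positive = [u for u in gains if gains[u] > 0]
        top = sorted(positive, key=lambda u: (-gains[u], str(u)))[:k]
        # Pad with dummy elements that contribute nothing.
        candidates = top + [None] * (k - len(top))
        choice = rng.choice(candidates)
        if choice is not None:
            S.add(choice)
    return S

# Hypothetical coverage instance.
SETS = {"s1": {1, 2, 3}, "s2": {3, 4}, "s3": {5}}

def cover(S):
    return len(set().union(*(SETS[name] for name in S)))

picked = random_greedy(cover, list(SETS), 2, random.Random(0))
print(picked, cover(picked))
```

With k = 1 the candidate list has a single entry, so the sketch reduces to picking the best single element.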
9
The Profile of the Algorithms

Works in iterations. In every iteration:
- Starts with some state S.
- Randomly switches to a new state from a set N(S).
- For every $S' \in N(S)$, let $p(S, S')$ be the probability that the algorithm switches from S to S'.

The analysis works whenever the probabilities $p(S, S')$ obey k linear constraints that might depend on S, where k is polynomial: [constraint formula not preserved in the transcript]
10
Derandomization – Naïve Attempt

Idea
Explicitly store the distribution over the current state of the algorithm.

[Figure: tree of (state, probability) pairs. The initial state $(S_0, 1)$ branches into $(S_1, p)$ and $(S_2, 1 - p)$, which branch further into $(S_3, q_1)$, $(S_4, q_2)$, $(S_5, q_3)$ and $(S_6, q_4)$.]

The number of states can increase exponentially with the iterations.
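The blow-up is easy to reproduce. In the sketch below a toy randomized process (an assumption for this example, not one of the algorithms in the talk) adds each element with probability 1/2, and the distribution over states is stored explicitly as a dictionary.

```python
from fractions import Fraction

def naive_step(dist, u):
    """Split every stored state: add u with probability 1/2, otherwise keep it."""
    nxt = {}
    for state, prob in dist.items():
        for new_state in (state | {u}, state):
            nxt[new_state] = nxt.get(new_state, Fraction(0)) + prob / 2
    return nxt

def naive_distribution(elements):
    """Track the full distribution and record its support size per iteration."""
    dist = {frozenset(): Fraction(1)}  # the initial state has probability 1
    sizes = [len(dist)]
    for u in elements:
        dist = naive_step(dist, u)
        sizes.append(len(dist))
    return dist, sizes

dist, sizes = naive_distribution(range(5))
print(sizes)  # the support doubles in every iteration
```

After n iterations the dictionary holds 2^n states, which is why storing the distribution is only viable if most transition probabilities are zero.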
11
Strategy

[Figure: a state S with arcs to $S_1$, $S_2$ and $S_3$, labeled $p(S, S_1)$, $p(S, S_2)$ and $p(S, S_3)$.]

The state $S_i$ gets to the distribution of the next iteration only if $p(S, S_i) > 0$.

We want probabilities that:
- obey the constraints.
- are mostly zeros.
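The observation that only transitions with positive probability reach the next distribution can be sketched as a push-forward step. The function name `next_distribution`, the dictionary layout and the state labels are assumptions made for this illustration.

```python
from fractions import Fraction

def next_distribution(dist, transitions):
    """Push a distribution over states through one iteration.

    dist:        maps each current state S to its probability.
    transitions: maps each current state S to {S': p(S, S')}.
    Next states reached only with probability zero are never stored.
    """
    nxt = {}
    for state, prob in dist.items():
        for new_state, p in transitions[state].items():
            if p > 0:
                nxt[new_state] = nxt.get(new_state, 0) + prob * p
    return nxt

half = Fraction(1, 2)
dist = {"S": Fraction(1)}
transitions = {"S": {"S1": half, "S2": Fraction(0), "S3": half}}
print(next_distribution(dist, transitions))  # S2 does not appear
```

The sparser the transition probabilities, the fewer states have to be carried into the next iteration.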
12
Expectation to the Rescue

The analysis of the algorithm works when: [constraint formula not preserved in the transcript] (D is the current distribution).

Often it is enough for the constraints to hold in expectation over D.
13
Expectation to the Rescue (cont.)

Some Justifications
- We now require the analysis to work only for the expected output set.
- Can often follow from the linearity of the expectation.
- The new constraints are defined using multiple states (and their probabilities): not natural/accessible for the randomized algorithm.
- True for the two algorithms we derandomize.
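The difference between the per-state constraints and their relaxation in expectation can be written out as follows. The slides' own formulas are not preserved, so this is only a sketch under assumed notation: the coefficients $a_j(S, S')$ and right-hand sides $c_j(S)$ are hypothetical names for "k linear constraints that might depend on S".

```latex
% Assumed notation: a_j(S, S') and c_j(S) are hypothetical coefficient names.
% Per-state form: required separately for every state S in the support of D.
\[
\sum_{S' \in N(S)} a_j(S, S')\, p(S, S') \;\ge\; c_j(S) \quad (1 \le j \le k),
\qquad
\sum_{S' \in N(S)} p(S, S') = 1,
\qquad
p(S, S') \ge 0 .
\]
% Relaxed form: the k constraints are only required in expectation over D,
% while the normalization and non-negativity still hold for every state.
\[
\sum_{S \in \operatorname{supp}(D)} \Pr_D[S] \sum_{S' \in N(S)} a_j(S, S')\, p(S, S')
\;\ge\;
\sum_{S \in \operatorname{supp}(D)} \Pr_D[S]\, c_j(S) \quad (1 \le j \le k).
\]
```

Under these assumptions the relaxed system has one normalization constraint per current state plus k averaged constraints, instead of k constraints for every state.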
14
Finding a good solution

[Linear program not preserved in the transcript.]

The LP:
- Has a solution (the probabilities used by the original algorithm).
- Is bounded.

A basic feasible solution contains at most one non-zero variable for every constraint:
- One non-zero variable for every current state.
- k additional non-zero variables.

The size of the distribution can increase by at most k at every iteration.
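The counting argument can be illustrated without an LP solver. The sketch below swaps in a different technique from the one on the slide: instead of solving the LP for a basic feasible solution, it starts from any non-negative solution (for example the original algorithm's probabilities) and repeatedly moves along a null-space direction until the columns in the support are linearly independent. All constraint values are preserved, and at most one non-zero variable per constraint remains. The names `null_vector` and `sparsify`, and the example system, are assumptions.

```python
from fractions import Fraction

def null_vector(cols):
    """Return a non-zero y with sum_j y[j] * cols[j] == 0, or None if the
    columns are linearly independent (Gauss-Jordan elimination)."""
    n = len(cols)
    if n == 0:
        return None
    m = len(cols[0])
    rows = [[Fraction(cols[j][i]) for j in range(n)] for i in range(m)]
    pivots, r = [], 0
    for c in range(n):
        if r == m:
            break
        p = next((i for i in range(r, m) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        lead = rows[r][c]
        rows[r] = [v / lead for v in rows[r]]
        for i in range(m):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    free = next((c for c in range(n) if c not in pivots), None)
    if free is None:
        return None
    y = [Fraction(0)] * n
    y[free] = Fraction(1)
    for i, c in enumerate(pivots):
        y[c] = -rows[i][free]
    return y

def sparsify(A, x):
    """Return x' >= 0 with A x' == A x and at most len(A) non-zero entries."""
    x = [Fraction(v) for v in x]
    while True:
        support = [j for j, v in enumerate(x) if v != 0]
        y = null_vector([[row[j] for row in A] for j in support])
        if y is None:
            return x
        # Largest step that keeps x non-negative; it zeroes at least one entry.
        t = min(x[j] / v for j, v in zip(support, y) if v > 0)
        for j, v in zip(support, y):
            x[j] -= t * v

# Hypothetical system: two current states with three next states each
# (rows 1 and 2 are the normalizations) and k = 1 extra linear constraint.
A = [[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1], [2, -1, 0, 1, 0, -3]]
x = [Fraction(1, 3)] * 6
sparse = sparsify(A, x)
print(sparse)
```

Exact rational arithmetic keeps the zero tests reliable; a floating-point version would need tolerances.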
15
In Conclusion

Deterministic Algorithm
- Explicitly stores a distribution over states.
- In every iteration:
  - Uses the previous LP to calculate the probabilities to move from one state to another.
  - Calculates the distribution for the next iteration based on these probabilities.

Performance
- The analysis of the original (randomized) algorithm still works.
- The size of the distribution grows by at most k in every iteration, so the algorithm runs in polynomial time.
- Sometimes the LP can be solved quickly, resulting in a quite fast algorithm.
16
Open Problems

- Derandomizing additional algorithms for submodular maximization problems. In particular, derandomizing algorithms involving the multilinear extension.
- Obtaining faster deterministic algorithms for the problems we considered.
- Using our technique to derandomize algorithms from other fields.