Partial Sublinear Time Approximation and Inapproximation for Maximum Coverage. Bin Fu, Department of Computer Science, University of Texas Rio Grande Valley, Texas, USA
Hard to find, Easy to check blind monkey
Hamiltonian Path: A Hamiltonian path goes through each node exactly once. HAMPATH = {G | G is a directed graph with a Hamiltonian path}.
P versus NP: Polynomial time: O(n^k) steps for some constant k, where n is the input length. P: polynomial-time decidable problems. NP: polynomial-time verifiable problems.
Polynomial Time Verifier: A verifier V(w, c) takes an input w and a certificate c of length poly(|w|), where |w| is the length of w. For example, |C++| = 3.
Polynomial Time Verifier: A verifier for a language L is an algorithm V such that (1) V(w, c) runs in time poly(|w|) for an input w and a certificate c of polynomial length, and (2) w is in L if and only if V(w, c) accepts for some c of polynomial length.
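As a concrete instance of a polynomial-time verifier, the sketch below checks a certificate for HAMPATH. The graph encoding (a dict from each node to the set of its successors) and the name `verify_hampath` are assumptions made for illustration; the certificate c is a proposed ordering of the nodes.

```python
def verify_hampath(graph, path):
    """Return True if `path` is a Hamiltonian path in the directed graph.

    graph: dict mapping each node to the set of its out-neighbours (assumed encoding).
    path:  the certificate, a list of nodes in visiting order.
    """
    nodes = set(graph)
    # Every node must appear exactly once in the certificate.
    if len(path) != len(nodes) or set(path) != nodes:
        return False
    # Consecutive nodes must be joined by a directed edge.
    return all(v in graph[u] for u, v in zip(path, path[1:]))


G = {1: {2}, 2: {3}, 3: {1, 4}, 4: set()}
print(verify_hampath(G, [1, 2, 3, 4]))  # True
print(verify_hampath(G, [1, 3, 2, 4]))  # False: there is no edge 1 -> 3
```

Checking the certificate takes time linear in its length, whereas no polynomial-time method is known for finding one.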
Algorithms toward NP-hardness: 1. Approximation algorithm 2. Fixed-parameter algorithm 3. Heuristic algorithm
NP-Complete: A problem H is NP-complete if H is in NP and every B in NP is polynomial-time reducible to H. That is, every problem in NP can be reduced to an NP-complete problem in polynomial time. (Figure: the classes P, NP, and NP-complete.)
Polynomial Time Reduction: Assume that A and B are two sets. A is polynomial-time mapping reducible to B if a polynomial-time computable function f exists such that, for every w, w ∈ A if and only if f(w) ∈ B.
Introduction: Approximation algorithms are used to get a solution close to the optimal solution of an optimization problem in polynomial time.
Definition: An algorithm is an α-approximation algorithm for an optimization problem Π if (1) the algorithm runs in polynomial time, and (2) the algorithm always produces a solution that is within a factor of α of the optimal solution.
Maximum Cover (MC): Given a collection T of m finite sets S1, S2, …, Sm and an integer k, find k of the sets with the largest union.
Decision Version of Maximum Cover: Given a collection T of m finite sets S1, S2, …, Sm and integers k and t, decide whether there are k sets among them with union size at least t.
c-Approximation for Max Opt.: An approximation algorithm with ratio c (0 < c ≤ 1) produces a solution T if value(T) ≥ c · OPT, where OPT is the optimal value.
Example of Maximum Coverage: Input: k = 2 with sets S1 = {1, 2, 3}, S2 = {2, 7, 8}, S3 = {1, 4, 5, 6, 7, 8}, S4 = {4, 5, 6, 8}. Output: the optimal solution is S1, S3 (union size 8).
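The example can be checked by exhaustive search. The sketch below is not part of the talk; the function name `max_coverage_exact` and the dict encoding of the input are assumptions. It tries every choice of k sets, so its running time grows exponentially in k and it is only useful for tiny inputs.

```python
from itertools import combinations


def max_coverage_exact(sets, k):
    """Try every choice of k sets; return (names of the best choice, union size)."""
    best_names, best_size = None, -1
    for names in combinations(sorted(sets), k):
        size = len(set().union(*(sets[name] for name in names)))
        if size > best_size:
            best_names, best_size = names, size
    return best_names, best_size


SETS = {
    "S1": {1, 2, 3},
    "S2": {2, 7, 8},
    "S3": {1, 4, 5, 6, 7, 8},
    "S4": {4, 5, 6, 8},
}
print(max_coverage_exact(SETS, 2))  # (('S1', 'S3'), 8)
```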
MC Hardness: MC is NP-hard. The decision version of MC is NP-complete: is it possible to find k sets with union size at least t?
Greedy Algorithm: Repeat k times: pick the set that covers the largest number of uncovered elements.
Greedy Algorithm Performance: The approximation ratio for MC via greedy is (1 − 1/e). For any fixed ε > 0, there is no polynomial-time (1 − 1/e + ε)-ratio approximation to MC unless P = NP (Feige, J. ACM 1998).
Example of Maximum Coverage: Step 1: Select the largest set, S3 = {1, 4, 5, 6, 7, 8}. Step 2: Select the set S1, since S1 − S3 is the largest.
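The two greedy steps can be reproduced with a short sketch. The function name, the dict encoding, and the tie-breaking rule (first set in dictionary order) are assumptions, and the code assumes k is at most the number of sets.

```python
def greedy_max_coverage(sets, k):
    """Pick k sets, each time taking the one that covers the most uncovered elements."""
    covered, chosen = set(), []
    for _ in range(k):
        # Ties go to the first candidate in dictionary order (an arbitrary choice).
        best = max(
            (name for name in sets if name not in chosen),
            key=lambda name: len(sets[name] - covered),
        )
        chosen.append(best)
        covered |= sets[best]
    return chosen, len(covered)


SETS = {
    "S1": {1, 2, 3},
    "S2": {2, 7, 8},
    "S3": {1, 4, 5, 6, 7, 8},
    "S4": {4, 5, 6, 8},
}
print(greedy_max_coverage(SETS, 2))  # (['S3', 'S1'], 8)
```

On this input greedy happens to find the optimal solution; in general it only guarantees the (1 − 1/e) ratio.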
Our Input Model: Each set Si supports: 1) a membership query (is x in Si?) 2) generating a random element in Si 3) the size |Si|
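One way to picture this input model is as a black-box object with three operations. The class name `SetOracle` and its method names are hypothetical, chosen only for this sketch; the talk does not prescribe an interface, and the sketch assumes the set is non-empty.

```python
import random


class SetOracle:
    """Black-box access to one input set (hypothetical interface)."""

    def __init__(self, elements, seed=None):
        self._lookup = set(elements)
        self._items = list(self._lookup)  # assumed non-empty
        self._rng = random.Random(seed)

    def contains(self, x):
        """Membership query: is x in the set?"""
        return x in self._lookup

    def random_element(self):
        """Return a uniformly random element of the set."""
        return self._rng.choice(self._items)

    def size(self):
        """Return the number of elements in the set."""
        return len(self._items)


s3 = SetOracle({1, 4, 5, 6, 7, 8}, seed=0)
print(s3.contains(4), s3.contains(2), s3.size())  # True False 6
```

An algorithm that touches its input only through these three calls never has to read a whole set, which is what makes a running time independent of the set sizes possible.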
Greedy Algorithm: Repeat k times: pick one set B with the largest number of uncovered elements, i.e. the largest |B − A|, where A is the union of the sets picked so far. (Figure: sets A and B.)
Approximate Union Union size Union size in MC B A
Input Size of Maximum Coverage: S3 = {1, 4, 5, 6, 7, 8}, S4 = {4, 5, 6, 8}; n = 6 (size of the largest set, S3), m = 4 (number of sets).
Randomized Greedy Algorithm: Approximate |B − A| using random samples from B: estimate the fraction of B that lies in B − A. (Figure: sets A and B.)
Approximate |B − A|: Let w be the number of random samples drawn from B, and let t be the number of those samples that lie in B − A. Then (t/w) · |B| approximates |B − A|. (Figure: sets A and B.)
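The sampling estimate can be sketched as follows. The function name `estimate_difference` and the concrete sets are assumptions for illustration; w and t play the roles described above, and B is assumed to be non-empty.

```python
import random


def estimate_difference(B, A, w, rng=random):
    """Estimate |B - A| from w uniform random samples of B."""
    items = list(B)
    # t counts the samples that land in B - A.
    t = sum(1 for _ in range(w) if rng.choice(items) not in A)
    return len(B) * t / w


B = set(range(1000))
A = set(range(600, 2000))  # the true value of |B - A| is 600
print(estimate_difference(B, A, w=2000, rng=random.Random(0)))
```

The cost depends on the number of samples w, not on |A| or |B| (apart from building the sample pool in this toy version).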
Randomized Greedy Algorithm Accuracy B A
Approximate Assume A B
Approximate Proof
Randomized Greedy Algorithm: Repeat k times: pick one set that covers approximately the largest number of uncovered elements.
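A minimal sketch of this loop is given below. It is an illustration under stated assumptions, not the algorithm from the paper: the number of samples w is a free parameter here, the function names are hypothetical, every set is assumed non-empty, and k is assumed to be at most the number of sets.

```python
import random


def estimate_gain(name, sets, pools, chosen, w, rng):
    """Estimate how many elements of sets[name] are not yet covered."""
    hits = 0
    for _ in range(w):
        x = rng.choice(pools[name])                 # random element of the candidate
        if not any(x in sets[c] for c in chosen):   # membership queries on chosen sets
            hits += 1
    return len(pools[name]) * hits / w


def randomized_greedy(sets, k, w, seed=0):
    """Repeat k times: pick the set with the largest estimated gain."""
    rng = random.Random(seed)
    pools = {name: list(elems) for name, elems in sets.items()}
    chosen = []
    for _ in range(k):
        gains = {
            name: estimate_gain(name, sets, pools, chosen, w, rng)
            for name in sets if name not in chosen
        }
        chosen.append(max(gains, key=gains.get))
    return chosen


SETS = {
    "S1": {1, 2, 3},
    "S2": {2, 7, 8},
    "S3": {1, 4, 5, 6, 7, 8},
    "S4": {4, 5, 6, 8},
}
print(randomized_greedy(SETS, k=2, w=2000))
```

The covered region is never built explicitly: whether a sample is uncovered is decided by membership queries against the sets already chosen, in the spirit of the input model.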
Classical Ratio Analysis Let OPT be the optimal solution size. Let be selected via greedy. The first size Assume the union of first t sets
Classical Ratio Analysis The t+1-th set Assume the union of first t+1 sets
Classical Ratio Analysis Function is increasing. Limit Bound Ratio
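The classical argument outlined on these slides is usually written as follows. The notation $U_t$ for the size of the union of the first $t$ greedy sets is an assumption introduced here. The key step is that the $k$ optimal sets together cover at least $\mathrm{OPT} - U_t$ uncovered elements, so one of them covers at least a $1/k$ fraction of that amount.

```latex
\begin{align*}
U_1 &\ge \frac{\mathrm{OPT}}{k}, \\
U_{t+1} - U_t &\ge \frac{\mathrm{OPT} - U_t}{k}
  \quad\Longrightarrow\quad
  \mathrm{OPT} - U_{t+1} \le \Bigl(1 - \frac{1}{k}\Bigr)\bigl(\mathrm{OPT} - U_t\bigr), \\
U_k &\ge \Bigl(1 - \Bigl(1 - \frac{1}{k}\Bigr)^{k}\Bigr)\mathrm{OPT}
  \ge \Bigl(1 - \frac{1}{e}\Bigr)\mathrm{OPT}.
\end{align*}
```

The last inequality holds because $(1 - 1/k)^k$ is increasing in $k$ with limit $1/e$, so it never exceeds $1/e$.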
Ratio Analysis Let OPT be the optimal solution size. Let be selected with Then
Ratio Analysis Let OPT be the optimal solution size. Let be selected. There is a
Ratio Analysis Let OPT be the optimal solution size. Let be selected with
Monte Carlo Algorithm: 1) Put the circle into a square A. 2) Generate n random points in A. 3) Count the number of points m in the circle. 4) (m/n) · |A| is the approximate area of the circle.
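These steps translate directly into code. The sketch below assumes a circle centred at the origin and its bounding square; the function name and the fixed seed are choices made for illustration.

```python
import random


def estimate_circle_area(radius, n, seed=0):
    """Estimate the area of a circle by sampling n points in its bounding square."""
    rng = random.Random(seed)
    square_area = (2 * radius) ** 2      # |A|
    m = 0                                # points that fall inside the circle
    for _ in range(n):
        x = rng.uniform(-radius, radius)
        y = rng.uniform(-radius, radius)
        if x * x + y * y <= radius * radius:
            m += 1
    return (m / n) * square_area


print(estimate_circle_area(1.0, 100_000))  # close to pi
```

The estimate of |B − A| in the randomized greedy algorithm follows the same pattern, with the set B in the role of the square.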
Randomized algorithm blind monkey
Input length for Sorting Problem Input: a list of numbers: 5, 3, 1, 7, 6 Input length n=5 Output: the sorted list 1, 3, 5, 6, 7
Time: number of steps. Superlinear: n log n (sorting). Sublinear: log n (binary search on a sorted list).
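To make the sublinear case concrete, the sketch below counts the steps of a binary search. The step counter and the function name are additions for illustration.

```python
def binary_search(sorted_list, target):
    """Return (index of target or -1, number of probes made)."""
    lo, hi, steps = 0, len(sorted_list) - 1, 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_list[mid] == target:
            return mid, steps
        if sorted_list[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, steps


print(binary_search([1, 3, 5, 6, 7], 6))  # (3, 2)
index, steps = binary_search(list(range(1_000_000)), 999_999)
print(index, steps)  # at most 20 probes for a million items
```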
Input length for MC: m is the number of sets; n is the number of elements in the biggest set. The total input size can be as large as mn.
Partial Sublinear Time for MC for some functions f(.) or g(.)
Classical Algorithm Old algorithm time Approximation ratio
New Partial Sublinear Time Our algorithm time Approximation ratio
Hoeffding Bound Theorem: Let be independent 0,1-random variables such that Then for , and
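A typical use of such a bound is to choose the number of samples. The sketch below assumes the commonly stated two-sided form of Hoeffding's inequality for w independent 0,1-variables, Pr(|mean − E[mean]| ≥ ε) ≤ 2·exp(−2wε²), which may differ in form from the statement used in the talk. The function name is hypothetical.

```python
import math


def hoeffding_samples(eps, delta):
    """Smallest w with 2 * exp(-2 * w * eps**2) <= delta."""
    return math.ceil(math.log(2 / delta) / (2 * eps ** 2))


# Samples needed to estimate a fraction within 0.1 with probability at least 0.95.
print(hoeffding_samples(0.1, 0.05))  # 185
```

The sample count depends only on the accuracy ε and the failure probability δ, not on the size of the set being sampled.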
Chernoff Bound Theorem: Let be independent 0,1-random variables such that Then for , and
Union Bound: Probability inequality: Pr(E1 ∪ E2 ∪ … ∪ En) ≤ Pr(E1) + Pr(E2) + … + Pr(En).
Paper address arXiv https://arxiv.org/abs/1604.01421
Future work More partial sublinear time algorithms.
Thanks. Questions?