CS 3343: Analysis of Algorithms

Introduction to Greedy Algorithms

Outline
Review of DP
Greedy algorithms
  Similar to DP: not an actual algorithm, but a meta-algorithm

Two steps to dynamic programming
1. Formulate the solution as a recurrence relation of solutions to subproblems.
2. Specify an order of evaluation for the recurrence so you always have what you need.

Restaurant location problem
You work in the fast food business.
Your company plans to open new restaurants in Texas along I-35.
There are many towns along the highway; call them t1, t2, …, tn.
A restaurant at ti has estimated annual profit pi.
No two restaurants can be located within 10 miles of each other due to regulation.
Your boss wants to maximize the total profit.
You want a big bonus.

A DP algorithm
Suppose you've already found the optimal solution. It will either include tn or not include tn.
Case 1: tn is not included in the optimal solution.
  The best solution is the same as the best solution for t1, …, tn-1.
Case 2: tn is included in the optimal solution.
  The best solution is pn + the best solution for t1, …, tj, where j < n is the largest index such that dist(tj, tn) ≥ 10.

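Recurrence formulation
Let S(i) be the total profit of the optimal solution when the first i towns are considered (not necessarily selected).
S(n) is the optimal solution to the complete problem:

$$S(n) = \max \{\, S(n-1),\; S(j) + p_n \,\}, \quad j < n \text{ and } \mathrm{dist}(t_j, t_n) \ge 10$$

Generalize:

$$S(i) = \max \{\, S(i-1),\; S(j) + p_i \,\}, \quad j < i \text{ and } \mathrm{dist}(t_j, t_i) \ge 10$$

As in the case analysis, j is the largest index that satisfies the condition; if no such town exists, take j = 0.
Number of sub-problems: n.
Boundary condition: S(0) = 0.
Dependency: S(i) depends on S(i-1) and S(j).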

Example

$$S(i) = \max \{\, S(i-1),\; S(j) + p_i \,\}, \quad j < i \text{ and } \mathrm{dist}(t_j, t_i) \ge 10$$

Town i             1    2    3    4    5    6    7    8    9   10
Distance (mi)    100    5    2    2    6    6    3    6   10    7
Profit (100k)      6    7    9    8    3    3    2    4   12    5
S(i)               6    7    9    9   10   12   12   14   26   26

Each distance is measured from the previous town; the first entry (100) is the distance from a dummy town t0.
Optimal: 26
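The recurrence can be checked against the example table with a short script. This is a minimal sketch, not code from the lecture: the function name `max_profit` and the `gaps` list layout (distance from the previous town, with the first entry belonging to the dummy town) are assumptions made for illustration.

```python
def max_profit(gaps, profits, min_dist=10):
    """Return the table S[0..n] for the restaurant location recurrence.

    gaps[i-1] is the distance from town i-1 to town i. gaps[0] is the gap
    to the dummy town t0 and does not affect the result (assumed layout).
    """
    n = len(profits)
    # Mile marker of each town, 1-indexed; pos[0] is unused.
    pos = [0] * (n + 1)
    for i in range(2, n + 1):
        pos[i] = pos[i - 1] + gaps[i - 1]

    S = [0] * (n + 1)  # boundary condition: S[0] = 0
    for i in range(1, n + 1):
        # Scan left for the largest j < i with dist(t_j, t_i) >= min_dist.
        j = i - 1
        while j >= 1 and pos[i] - pos[j] < min_dist:
            j -= 1
        # Either skip town i, or take it together with the best of t_1..t_j.
        S[i] = max(S[i - 1], S[j] + profits[i - 1])
    return S


gaps = [100, 5, 2, 2, 6, 6, 3, 6, 10, 7]
profits = [6, 7, 9, 8, 3, 3, 2, 4, 12, 5]
S = max_profit(gaps, profits)
print(S[1:])  # [6, 7, 9, 9, 10, 12, 12, 14, 26, 26]
```

The left scan costs up to k steps per town, which is where the O(nk) running time comes from.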

Complexity
Time: O(nk), where k is the maximum number of towns that are within 10 miles to the left of any town
  In the worst case, O(n^2)
  Can be reduced to O(n) by pre-processing
Memory: Θ(n)
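One possible form of the pre-processing is a two-pointer pass: because the towns are sorted along the highway, the index j for town i never moves left as i grows, so all predecessor indices can be found in O(n) total. The sketch below is an assumption about how that step could look; `predecessor_indices` and `max_profit_linear` are hypothetical names, and `pos` is an assumed 1-indexed list of mile markers.

```python
def predecessor_indices(pos, min_dist=10):
    """prev[i] = largest j < i with pos[i] - pos[j] >= min_dist, or 0 if none.

    pos is 1-indexed (pos[0] is a placeholder) and strictly increasing.
    """
    n = len(pos) - 1
    prev = [0] * (n + 1)
    j = 0
    for i in range(1, n + 1):
        # j only ever moves right, so the total work over all i is O(n).
        while j + 1 < i and pos[i] - pos[j + 1] >= min_dist:
            j += 1
        prev[i] = j
    return prev


def max_profit_linear(pos, profits, min_dist=10):
    """Evaluate the restaurant recurrence in O(n) using the precomputed prev[]."""
    n = len(pos) - 1
    prev = predecessor_indices(pos, min_dist)
    S = [0] * (n + 1)
    for i in range(1, n + 1):
        S[i] = max(S[i - 1], S[prev[i]] + profits[i - 1])
    return S[n]


# Mile markers derived from the example distances (t1 placed at mile 0).
pos = [0, 0, 5, 7, 9, 15, 21, 24, 30, 40, 47]
profits = [6, 7, 9, 8, 3, 3, 2, 4, 12, 5]
prev = predecessor_indices(pos)
best = max_profit_linear(pos, profits)
```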

Knapsack problem
Each item has a value and a weight.
Objective: maximize value.
Constraint: the knapsack has a weight limitation.
Three versions:
  0-1 knapsack problem: take each item or leave it
  Fractional knapsack problem: items are divisible
  Unbounded knapsack problem: unlimited supplies of each item
Which one is easiest to solve?
We studied the 0-1 problem.

Formal definition (0-1 problem)
Knapsack has weight limit W.
Items are labeled 1, 2, …, n (arbitrarily).
Items have weights w1, w2, …, wn.
  Assume all weights are integers.
  For practical reasons, only consider wi < W.
Items have values v1, v2, …, vn.
Objective: find a subset of items, S, such that $\sum_{i \in S} w_i \le W$ and $\sum_{i \in S} v_i$ is maximal among all such (feasible) subsets.

A DP algorithm
Suppose you've found the optimal solution S (total weight limit: W).
Case 1: item n is included.
  Find an optimal solution using items 1, 2, …, n-1 with weight limit W - wn.
Case 2: item n is not included.
  Find an optimal solution using items 1, 2, …, n-1 with weight limit W.

Recursive formulation
Let V[i, w] be the optimal total value when items 1, 2, …, i are considered for a knapsack with weight limit w => V[n, W] is the optimal solution.

$$V[n, W] = \max \{\, V[n-1, W-w_n] + v_n,\; V[n-1, W] \,\}$$

Generalize:

$$V[i, w] = \begin{cases} \max \{\, V[i-1, w-w_i] + v_i \ (\text{item } i \text{ is taken}),\; V[i-1, w] \ (\text{item } i \text{ not taken}) \,\} & \text{if } w_i \le w \\ V[i-1, w] \ (\text{item } i \text{ not taken}) & \text{if } w_i > w \end{cases}$$

Boundary condition: V[i, 0] = 0, V[0, w] = 0.
Number of sub-problems = ?

Example
n = 6 (# of items), W = 10 (weight limit)
Items (weight, value): (2, 2), (4, 3), (3, 3), (5, 6), (2, 4), (6, 9)

$$V[i, w] = \begin{cases} \max \{\, V[i-1, w-w_i] + v_i,\; V[i-1, w] \,\} & \text{if } w_i \le w \\ V[i-1, w] & \text{if } w_i > w \end{cases}$$

Filled table V[i, w]:

 i  wi  vi | w=1   2   3   4   5   6   7   8   9  10
 1   2   2 |   0   2   2   2   2   2   2   2   2   2
 2   4   3 |   0   2   2   3   3   5   5   5   5   5
 3   3   3 |   0   2   3   3   5   5   6   6   8   8
 4   5   6 |   0   2   3   3   6   6   8   9   9  11
 5   2   4 |   0   4   4   6   7   7  10  10  12  13
 6   6   9 |   0   4   4   6   7   9  10  13  13  15

Optimal value: 15
Items: 6, 5, 1
Weight: 6 + 2 + 2 = 10
Value: 9 + 4 + 2 = 15
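The table fill and the traceback that recovers the chosen items can be sketched as follows. This is an illustrative bottom-up implementation of the recurrence, assuming the six example items are (2, 2), (4, 3), (3, 3), (5, 6), (2, 4), (6, 9); the name `knapsack_01` is hypothetical.

```python
def knapsack_01(weights, values, W):
    """Fill V[i][w] bottom-up and trace back which items were taken."""
    n = len(weights)
    # Row 0 and column 0 hold the boundary condition V[0][w] = V[i][0] = 0.
    V = [[0] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        wi, vi = weights[i - 1], values[i - 1]
        for w in range(1, W + 1):
            V[i][w] = V[i - 1][w]                 # item i not taken
            if wi <= w:                           # item i fits: try taking it
                V[i][w] = max(V[i][w], V[i - 1][w - wi] + vi)

    # Traceback: item i was taken exactly when the value changed in row i.
    chosen, w = [], W
    for i in range(n, 0, -1):
        if V[i][w] != V[i - 1][w]:
            chosen.append(i)
            w -= weights[i - 1]
    return V, chosen


weights = [2, 4, 3, 5, 2, 6]
values = [2, 3, 3, 6, 4, 9]
V, chosen = knapsack_01(weights, values, 10)
print(V[6][10], chosen)  # 15 [6, 5, 1]
```

Both loops together touch every one of the n × W cells once, which gives the Θ(nW) running time.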

Time complexity
Θ(nW)
Polynomial? No, pseudo-polynomial: the running time is polynomial in the value of W, not in the number of bits needed to write W down.
Works well if W is small.
Consider the following items (weight, value): (10, 5), (15, 6), (20, 5), (18, 6)
Weight limit: 35
Optimal solution: items 2 and 4 (value = 12)
Iterate: 2^4 = 16 subsets
Dynamic programming: fill up 4 x 35 = 140 table entries
What's the problem?
  Many entries are unused: no such weight combination
  Top-down may be better
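A top-down (memoized) version only evaluates the (i, w) pairs that are actually reachable from V[n, W]. The sketch below uses `functools.lru_cache` as the memo table; the function name `knapsack_top_down` and the idea of returning the number of cached states are illustrative choices, not part of the lecture.

```python
from functools import lru_cache


def knapsack_top_down(items, W):
    """Return (optimal value, number of distinct sub-problems evaluated)."""

    @lru_cache(maxsize=None)
    def best(i, w):
        # Boundary condition: no items left or no capacity left.
        if i == 0 or w == 0:
            return 0
        wi, vi = items[i - 1]
        skip = best(i - 1, w)              # item i not taken
        if wi > w:                         # item i does not fit
            return skip
        return max(skip, best(i - 1, w - wi) + vi)

    value = best(len(items), W)
    return value, best.cache_info().currsize


items = [(10, 5), (15, 6), (20, 5), (18, 6)]
value, states = knapsack_top_down(items, 35)
print(value)   # 12
print(states)  # far fewer than the 140 entries of the full table
```

With only four items the recursion can reach at most a few dozen distinct (i, w) pairs, so most of the 140 bottom-up entries are never needed.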

Events scheduling problem
A list of events to schedule: event ei has start time si and finishing time fi.
  Events are indexed such that fi < fj if i < j.
Each event has a value vi.
Schedule events to obtain the largest total value.
  You can attend only one event at any time.
[Figure: events e1 through e9 drawn as intervals on a time axis, with their start times si and finishing times fi]

V(i) is the optimal value that can be achieved when the first i events are considered.

$$V(n) = \max \begin{cases} V(n-1) & e_n \text{ not selected} \\ \max\limits_{j < n,\ f_j < s_n} V(j) + v_n & e_n \text{ selected} \end{cases}$$

If no event finishes before sn, the second case is just vn (that is, V(0) = 0).
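The recurrence can be evaluated in O(n log n) by binary-searching the sorted finishing times for the last compatible event. Since V is non-decreasing, the largest j with fj < si always attains the inner maximum. The sketch below is an illustration under those assumptions; `max_value_schedule` and the sample events are hypothetical and do not come from the lecture figure.

```python
from bisect import bisect_left


def max_value_schedule(events):
    """events: list of (start, finish, value). Returns the best total value."""
    events = sorted(events, key=lambda e: e[1])   # index by finishing time
    finishes = [f for _, f, _ in events]
    n = len(events)
    V = [0] * (n + 1)                             # V[0] = 0
    for i in range(1, n + 1):
        s, _, v = events[i - 1]
        # j = number of events that finish strictly before s, i.e. the
        # largest index j with f_j < s_i (0 if there is none).
        j = bisect_left(finishes, s)
        V[i] = max(V[i - 1], V[j] + v)            # skip e_i, or take it
    return V[n]


# Hypothetical events (start, finish, value), for illustration only.
sample = [(0, 3, 5), (2, 5, 6), (4, 7, 5), (6, 9, 4), (8, 10, 2)]
best_value = max_value_schedule(sample)
print(best_value)  # 12 (events 1, 3 and 5)
```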

Restaurant location problem 2
Now the objective is to maximize the number of new restaurants (subject to the distance constraint).
In other words, we assume that each restaurant makes the same profit, no matter where it is opened.

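A DP Algorithm
Exactly as before, but pi = 1 for all i. The recurrence

$$S(i) = \max \{\, S(i-1),\; S(j) + p_i \,\}, \quad j < i \text{ and } \mathrm{dist}(t_j, t_i) \ge 10$$

becomes

$$S(i) = \max \{\, S(i-1),\; S(j) + 1 \,\}, \quad j < i \text{ and } \mathrm{dist}(t_j, t_i) \ge 10$$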

Example

$$S(i) = \max \{\, S(i-1),\; S(j) + 1 \,\}, \quad j < i \text{ and } \mathrm{dist}(t_j, t_i) \ge 10$$

Town i             1    2    3    4    5    6    7    8    9   10
Distance (mi)    100    5    2    2    6    6    3    6   10    7
Profit (100k)      1    1    1    1    1    1    1    1    1    1
S(i)               1    1    1    1    2    2    2    3    4    4

Each distance is measured from the previous town; the first entry (100) is the distance from a dummy town t0.
Optimal: 4
Natural greedy 1: total = 4
Maybe greedy is OK here? Does it work for all cases?

Comparison

Uniform profit:

Town i             1    2    3    4    5    6    7    8    9   10
Dist (mi)        100    5    2    2    6    6    3    6   10    7
Profit (100k)      1    1    1    1    1    1    1    1    1    1
S(i)               1    1    1    1    2    2    2    3    4    4

Benefit of taking t1 rather than t2? t1 gives you more choices for the future.
Benefit of waiting to see t2? None!

Non-uniform profit:

Town i             1    2    3    4    5    6    7    8    9   10
Dist (mi)        100    5    2    2    6    6    3    6   10    7
Profit (100k)      6    7    9    8    3    3    2    4   12    5
S(i)               6    7    9    9   10   12   12   14   26   26

Benefit of taking t1 rather than t2? t1 gives you more choices for the future.
Benefit of waiting to see t2? t2 may have a bigger profit.
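The contrast can be made concrete by running both strategies on the two instances. The sketch below is illustrative: `dp_best` and `greedy_total` are hypothetical names, `pos` is a list of mile markers derived from the example distances with t1 placed at mile 0, and the greedy rule assumed here is "take the first town that is far enough from the last one taken".

```python
def dp_best(pos, profits, min_dist=10):
    """Optimal total profit via S(i) = max(S(i-1), S(j) + p_i)."""
    n = len(pos)
    S = [0] * (n + 1)
    for i in range(1, n + 1):
        j = i - 1
        # Largest j < i whose town is at least min_dist to the left of town i.
        while j >= 1 and pos[i - 1] - pos[j - 1] < min_dist:
            j -= 1
        S[i] = max(S[i - 1], S[j] + profits[i - 1])
    return S[n]


def greedy_total(pos, profits, min_dist=10):
    """Total profit when always taking the first feasible town."""
    total, last = 0, None
    for x, p in zip(pos, profits):
        if last is None or x - last >= min_dist:
            total += p
            last = x
    return total


pos = [0, 5, 7, 9, 15, 21, 24, 30, 40, 47]
uniform = [1] * 10
weighted = [6, 7, 9, 8, 3, 3, 2, 4, 12, 5]

print(greedy_total(pos, uniform), dp_best(pos, uniform))    # 4 4
print(greedy_total(pos, weighted), dp_best(pos, weighted))  # 25 26
```

With uniform profits the two agree; with the original profits the greedy choice of t1 costs one unit compared with the DP optimum.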

Moral of the story
If a better opportunity may come up next, you may want to hold off on your decision.
Otherwise, grasp the current opportunity immediately, because there is no reason to wait …

Greedy algorithm
For certain problems, DP is overkill.
  A greedy algorithm may be guaranteed to give you the optimal solution.
  Much more efficient.

Formal argument
Claim 1: if A = [m1, m2, …, mk] is the optimal solution to the restaurant location problem for a set of towns [t1, …, tn], where m1 < m2 < … < mk are the indices of the selected towns, then B = [m2, m3, …, mk] is the optimal solution to the sub-problem [tj, …, tn], where tj is the first town that is at least 10 miles to the right of tm1.
Proof by contradiction: suppose B is not the optimal solution to the sub-problem, which means there is a better solution B' to the sub-problem. Then A' = m1 || B' is feasible (every town in the sub-problem is at least 10 miles from tm1) and gives a better solution than A = m1 || B => A is not optimal => contradiction => B is optimal.

Implication of Claim 1
If we know the first town that needs to be chosen, we can reduce the problem to a smaller sub-problem.
  This is similar to dynamic programming.
  Optimal substructure.

Formal argument (cont'd)
Claim 2: for the uniform-profit restaurant location problem, there is an optimal solution that chooses t1.
Proof by contradiction: suppose that no optimal solution chooses t1. Say the first town chosen by an optimal solution S is ti, i > 1. Replacing ti with t1 gives a solution S' that does not violate the distance constraint (t1 lies to the left of ti, so it is at least as far from the other chosen towns), and the total profit remains the same => S' is an optimal solution that chooses t1. Contradiction. Therefore Claim 2 is valid.

Implication of Claim 2
We can simply choose the first town as part of the optimal solution.
  This is different from DP.
  Decisions are made immediately.
By Claim 1, we then only need to repeat this strategy on the remaining sub-problem.

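Greedy algorithm for restaurant location problem

select t1
d = 0
for i = 2 to n
    d = d + dist(ti, ti-1)
    if d >= min_dist
        select ti
        d = 0
    end
end

Trace on the example (d is reset to 0 after each selection):

i                  2    3    4    5    6    7    8    9   10
dist(ti, ti-1)     5    2    2    6    6    3    6   10    7
d                  5    7    9   15    6    9   15   10    7

Selected: t1, t5, t8, t9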
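A runnable version of the greedy selection might look like the sketch below. It assumes the accumulated distance d is reset to 0 after each selection, which is what the trace values (5, 7, 9, 15, 6, 9, 15, 10, 7) indicate; `greedy_select` is a hypothetical name and `gaps` holds the n-1 distances between consecutive towns.

```python
def greedy_select(gaps, min_dist=10):
    """Return the 1-based indices of the towns chosen by the greedy rule.

    gaps[k] is the distance between town k+1 and town k+2 (assumed layout).
    """
    selected = [1]      # always take the first town
    d = 0               # distance travelled since the last selected town
    for i in range(2, len(gaps) + 2):
        d += gaps[i - 2]
        if d >= min_dist:
            selected.append(i)
            d = 0       # start measuring from the newly selected town
    return selected


towns = greedy_select([5, 2, 2, 6, 6, 3, 6, 10, 7])
print(towns)  # [1, 5, 8, 9]
```

The loop makes a single pass and keeps only d and the output list, which matches the Θ(n) time and Θ(1) working memory of the greedy approach.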

Complexity
Time: Θ(n)
Memory:
  Θ(n) to store the input
  Θ(1) for greedy selection

Events scheduling problem
Objective: to schedule the maximal number of events.
Let vi = 1 for all i and solve by DP, but that is overkill.
Greedy strategy: choose the first-finishing event that is compatible with the previous selection (1, 2, 4, 6, 8 for the above example).
Why is this a valid strategy?
Claim 1: optimal substructure.
Claim 2: there is an optimal solution that chooses e1.
Proof by contradiction: suppose that no optimal solution contains e1. Say the first event chosen is ei => the other chosen events start after ei finishes. Replacing ei by e1 results in another optimal solution, because e1 finishes earlier than ei and so is still compatible with the other chosen events. Contradiction.
Simple idea: attend the event that leaves you with the most time once it has finished.
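The greedy strategy can be sketched in a few lines: sort by finishing time, then take each event that starts after the last selected one finishes. The function name `select_events` and the sample events are hypothetical (the events from the lecture figure are not recoverable here), and "compatible" is assumed to mean a strictly later start, matching the condition fj < sn used in the DP.

```python
def select_events(events):
    """events: list of (start, finish). Returns 1-based indices of chosen events."""
    # Consider events in order of finishing time.
    order = sorted(range(len(events)), key=lambda k: events[k][1])
    chosen = []
    last_finish = float("-inf")
    for k in order:
        start, finish = events[k]
        if start > last_finish:        # compatible with everything chosen so far
            chosen.append(k + 1)
            last_finish = finish
    return chosen


# Hypothetical events, already indexed by finishing time.
picked = select_events([(0, 3), (2, 5), (4, 7), (6, 9), (8, 10)])
print(picked)  # [1, 3, 5]
```

After the sort, a single pass suffices, so the cost is dominated by the O(n log n) sort (or O(n) if the events are already indexed by finishing time, as in the problem statement).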

Knapsack problem
Each item has a value and a weight.
Objective: maximize value.
Constraint: the knapsack has a weight limitation.
Three versions:
  0-1 knapsack problem: take each item or leave it
  Fractional knapsack problem: items are divisible
  Unbounded knapsack problem: unlimited supplies of each item
Which one is easiest to solve?
We can solve the fractional knapsack problem using a greedy algorithm.

Greedy algorithm for fractional knapsack problem
Compute the value/weight ratio for each item.
Sort items by their value/weight ratio in decreasing order.
  Call the remaining item with the highest ratio the most valuable item (MVI).
Iteratively:
  If the weight limit cannot be reached by adding the MVI, select all of the MVI.
  Otherwise, select the MVI partially, until the weight limit is reached.

Example
Weight limit: 10

item   Weight (LB)   Value ($)   $ / LB
 5          2            4        2
 6          6            9        1.5
 4          5            6        1.2
 1          2            2        1
 3          3            3        1
 2          4            3        0.75

Take item 5
Take item 6
Take 2 LB of item 4
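The greedy procedure on this example can be sketched as below. The item data are an assumption read off the example table, labeled (item, weight, value), and `fractional_knapsack` is a hypothetical name used only for illustration.

```python
def fractional_knapsack(items, W):
    """items: list of (label, weight, value). Returns (total value, taken).

    taken is a list of (label, pounds taken) in the order items were chosen.
    """
    remaining = W
    total = 0.0
    taken = []
    # Highest value/weight ratio first: the current "most valuable item".
    for label, weight, value in sorted(
        items, key=lambda it: it[2] / it[1], reverse=True
    ):
        if remaining <= 0:
            break
        amount = min(weight, remaining)      # whole item, or whatever still fits
        total += value * amount / weight
        taken.append((label, amount))
        remaining -= amount
    return total, taken


items = [(1, 2, 2), (2, 4, 3), (3, 3, 3), (4, 5, 6), (5, 2, 4), (6, 6, 9)]
total, taken = fractional_knapsack(items, 10)
print(taken)            # [(5, 2), (6, 6), (4, 2)]
print(round(total, 2))  # 15.4
```

The result, 4 + 9 + 2 × 1.2 = 15.4, is slightly better than the 0-1 optimum of 15 for the same items, as expected when items may be split.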

Why is the greedy algorithm for the fractional knapsack problem valid?
Claim: the optimal solution must contain as much of the MVI as possible (either up to the weight limit or until the MVI is exhausted).
Proof by contradiction: suppose that the optimal solution does not use all of the available MVI (i.e., there are still w (w < W) units of MVI left while we choose other items). We can replace w pounds of less valuable items with MVI. The total weight is the same, but the value is higher than the "optimal". Contradiction.

Elements of greedy algorithm
Optimal substructure
Locally optimal decision leads to globally optimal solution
For most optimization problems, a greedy algorithm will not guarantee an optimal solution.
  But it may give you a good starting point for other optimization techniques.
Starting from the next lecture, we'll study several problems in graph theory that can actually be solved by greedy algorithms.
