Greedy Technique
Constructs a solution to an optimization problem piece by piece through a sequence of choices that are:
– Feasible: satisfying the problem’s constraints
– Locally optimal: the best choice among those available at that step
– Irrevocable: cannot be changed on subsequent steps once made
For some problems, the greedy technique yields an optimal solution for every instance. For most it does not, but it can still be useful for fast approximations.
DP vs. Greedy Algorithms
Both solve problems exhibiting optimal substructure.
– DP: the solution to the problem is assembled from the solutions to subproblems by considering various choices
– Greedy algorithms: by the greedy-choice property, the locally optimal choice is made without considering results from subproblems
However, DP can be overkill when a greedy choice is enough.
Applications of the Greedy Strategy Optimal solutions: – change making for “normal” coin denominations – minimum spanning tree (MST) – single-source shortest paths (Dijkstra’s algorithm) – simple scheduling problems – Huffman codes Approximations: – traveling salesman problem (TSP) – knapsack problem – other combinatorial optimization problems
An Activity-Selection Problem
Suppose a set of activities $S = \{a_1, a_2, \dots, a_n\}$
– They use a resource, such as a lecture hall, that serves one activity (one lecture) at a time
– Each $a_i$ has a start time $s_i$ and a finish time $f_i$, with $0 \le s_i < f_i < \infty$
– $a_i$ and $a_j$ are compatible if $[s_i, f_i)$ and $[s_j, f_j)$ do not overlap
Goal: select a maximum-size subset of mutually compatible activities.
Start from dynamic programming, then move to a greedy algorithm, and see the relation between the two.
Activity-Selection Problem Problem: get your money’s worth out of a carnival – Buy a wristband that lets you onto any ride – Lots of rides, each starting and ending at different times – Your goal: ride as many rides as possible Another, alternative goal that we don’t solve here: maximize time spent on rides Welcome to the activity selection problem
Activity-Selection
Formally:
– Given a set $S = \{a_1, a_2, \dots, a_n\}$ of $n$ activities, where $s_i$ = start time of activity $a_i$ and $f_i$ = finish time of activity $a_i$
– Find a max-size subset $A$ of compatible activities
– Assume (wlog) that $f_1 \le f_2 \le \dots \le f_n$
DP Solution Optimal substructure?
DP solution – step 1
Optimal substructure of the activity-selection problem.
– Assume that $f_1 \le \dots \le f_n$ (otherwise, sort the activities by $f_i$)
– Define $S_{ij} = \{a_k : f_i \le s_k < f_k \le s_j\}$, i.e., all activities that start after $a_i$ finishes and finish before $a_j$ starts.
– Define two fictitious activities $a_0$ with $f_0 = 0$ and $a_{n+1}$ with $s_{n+1} = \infty$, so $f_0 \le f_1 \le \dots \le f_{n+1}$.
– Then an optimal solution to $S_{ij}$ that includes $a_k$ contains within it optimal solutions to $S_{ik}$ and $S_{kj}$.
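DP solution – step 2
A recursive solution
Let $c[i,j]$ be the number of activities in a maximum-size subset of mutually compatible activities in $S_{ij}$. The answer for the whole problem is $c[0, n+1]$, the size of an optimal solution for $S_{0,n+1}$.
$$c[i,j] = \begin{cases} 0 & \text{if } S_{ij} = \emptyset \\ \max\limits_{i<k<j,\ a_k \in S_{ij}} \{c[i,k] + c[k,j] + 1\} & \text{if } S_{ij} \ne \emptyset \end{cases}$$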
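The recurrence for $c[i,j]$ can be evaluated bottom-up. The following is a minimal Python sketch, not code from the slides: the function name `dp_activity_count` and the sample data in the comment are illustrative assumptions, and start times are assumed to be non-negative so that the sentinel $f_0 = 0$ works.

```python
def dp_activity_count(start, finish):
    """Size of a maximum set of mutually compatible activities,
    computed with the c[i][j] recurrence (O(n^3) time)."""
    # Sort by finish time, then add the fictitious activities
    # a_0 (f_0 = 0) and a_{n+1} (s_{n+1} = infinity).
    acts = sorted(zip(start, finish), key=lambda a: a[1])
    n = len(acts)
    s = [0] + [a[0] for a in acts] + [float("inf")]
    f = [0] + [a[1] for a in acts] + [float("inf")]

    c = [[0] * (n + 2) for _ in range(n + 2)]
    # Fill by increasing gap j - i, so c[i][k] and c[k][j] are ready when used.
    for gap in range(2, n + 2):
        for i in range(0, n + 2 - gap):
            j = i + gap
            best = 0
            for k in range(i + 1, j):
                # a_k is in S_ij if it starts after a_i finishes
                # and finishes before a_j starts.
                if f[i] <= s[k] and f[k] <= s[j]:
                    best = max(best, c[i][k] + c[k][j] + 1)
            c[i][j] = best
    return c[0][n + 1]


# Illustrative data (an assumption, not taken from the slides):
# dp_activity_count([1, 3, 0, 5, 3, 5, 6, 8, 8, 2, 12],
#                   [4, 5, 6, 7, 9, 9, 10, 11, 12, 14, 16])  -> 4
```

The triple loop makes the cost of the DP formulation visible: every pair (i, j) tries every k in between, which is the work the greedy choice avoids.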
Greedy Algorithms
Greedy choice
– Intuition: we should choose an activity that leaves the resource available for as many other activities as possible
– So, consider the locally optimal choice: select the activity $a_k$ with the earliest finish time in $S_{ij}$
– Unlike the DP solution, after the local greedy choice only one subproblem remains!
One big question:
– Is our intuition correct?
– We have to prove it is safe to make the greedy choice
Justify Greedy Choice
Theorem 16.1: Consider any nonempty subproblem $S_{ij}$, and let $a_m$ be the activity in $S_{ij}$ with the earliest finish time, $f_m = \min\{f_k : a_k \in S_{ij}\}$. Then
1. Activity $a_m$ is used in some maximum-size subset of mutually compatible activities of $S_{ij}$.
2. The subproblem $S_{im}$ is empty, so choosing $a_m$ leaves $S_{mj}$ as the only subproblem that may be nonempty.
Proof of the theorem (p418)
Top-Down Rather Than Bottom-Up
To solve $S_{ij}$, choose the activity $a_m$ in $S_{ij}$ with the earliest finish time, then solve $S_{mj}$ ($S_{im}$ is empty).
An optimal solution to $S_{mj}$ is guaranteed to be part of an optimal solution to $S_{ij}$, so there is no need to solve $S_{mj}$ ahead of $S_{ij}$.
Subproblem pattern: $S_{i,n+1}$.
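Recursive Solution
recursive_select(s, f, k, n) {
    // activities a_1..a_n are sorted by finish time; a_0 is the fictitious activity with f[0] = 0
    m = k + 1
    // skip activities that start before a_k finishes
    while (m <= n && s[m] < f[k])
        m++
    if (m <= n)
        return {a_m} ∪ recursive_select(s, f, m, n)
    else
        return Ø
}
Initial call: recursive_select(s, f, 0, n)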
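A runnable Python rendering of the recursive procedure is sketched below. It rests on assumptions not spelled out on the slide: `s` and `f` are 1-indexed lists with a sentinel at index 0 (`f[0] = 0`), `n` is the number of real activities (so the loop bound is written `m <= n`), and the function returns activity indices rather than a set of activities.

```python
def recursive_select(s, f, k, n):
    """Indices of a maximum set of compatible activities that start
    after a_k finishes. Activities must be sorted by finish time."""
    m = k + 1
    # Skip activities that start before a_k has finished.
    while m <= n and s[m] < f[k]:
        m += 1
    if m <= n:
        # a_m has the earliest finish time among the compatible activities.
        return [m] + recursive_select(s, f, m, n)
    return []


# Illustrative data (an assumption); index 0 is the sentinel a_0.
s = [0, 1, 3, 0, 5, 3, 5, 6, 8, 8, 2, 12]
f = [0, 4, 5, 6, 7, 9, 9, 10, 11, 12, 14, 16]
selected = recursive_select(s, f, 0, 11)  # [1, 4, 8, 11]
```

Each activity is examined exactly once across all recursive calls, so the running time is linear once the activities are sorted.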
Optimal Solution Properties
In DP, the cost of finding an optimal solution depends on:
– How many subproblems the problem is divided into (2 subproblems)
– How many choices must be examined to decide which subproblems to use ($j - i - 1$ choices)
However, the above theorem (16.1) reduces both significantly:
– One subproblem (the other is sure to be empty)
– One choice, i.e., the activity with the earliest finish time in $S_{ij}$
– Moreover, the problem can be solved top-down, rather than bottom-up as in DP
– Pattern to the subproblems that we solve: $S_{m,n+1}$ from $S_{ij}$
– Pattern to the activities that we choose: the activity with the earliest finish time
– This locally optimal choice in fact leads to a globally optimal solution
Elements of greedy strategy
1. Determine the optimal substructure
2. Develop the recursive solution
3. Prove that at any stage of the recursion, one of the optimal choices is the greedy choice, so it is always safe to make it
4. Show that all but one of the subproblems are empty after the greedy choice
5. Develop a recursive algorithm that implements the greedy strategy
6. Convert the recursive algorithm to an iterative one
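For activity selection, the last step (converting the recursion to iteration) might look like the sketch below. The function name `greedy_select` and the (start, finish) pair representation are assumptions made for illustration.

```python
def greedy_select(activities):
    """activities: list of (start, finish) pairs.
    Returns a maximum-size list of mutually compatible activities."""
    chosen = []
    last_finish = float("-inf")
    # Consider activities in order of increasing finish time.
    for start, finish in sorted(activities, key=lambda a: a[1]):
        # Greedy choice: take the first activity compatible with the last one chosen.
        if start >= last_finish:
            chosen.append((start, finish))
            last_finish = finish
    return chosen
```

After the O(n log n) sort, a single pass suffices because each greedy choice leaves only one nonempty subproblem.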
Change-Making Problem
Given unlimited amounts of coins of denominations $d_1 > \dots > d_m$, give change for amount $n$ with the least number of coins
Example: $d_1$ = 25c, $d_2$ = 10c, $d_3$ = 5c, $d_4$ = 1c and $n$ = 48c
Greedy solution: 1 × 25c + 2 × 10c + 0 × 5c + 3 × 1c = 48c, i.e., 6 coins
The greedy solution is optimal for any amount and a “normal” set of denominations
It may not be optimal for arbitrary coin denominations
– (4, 3, 1) for 6: greedy gives 4 + 1 + 1 (3 coins), but 3 + 3 uses only 2
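The contrast between the two denomination sets can be checked with a short Python sketch. Both function names are assumptions; `optimal_change_count` is a simple DP included only as a reference to compare the greedy answer against.

```python
def greedy_change(amount, denominations):
    """Repeatedly take as many of the largest remaining denomination as fit."""
    coins = []
    for d in sorted(denominations, reverse=True):
        count, amount = divmod(amount, d)
        coins.extend([d] * count)
    if amount != 0:
        raise ValueError("amount cannot be paid with these denominations")
    return coins


def optimal_change_count(amount, denominations):
    """Minimum number of coins, by DP over all amounts 0..amount."""
    best = [0] + [float("inf")] * amount
    for a in range(1, amount + 1):
        for d in denominations:
            if d <= a and best[a - d] + 1 < best[a]:
                best[a] = best[a - d] + 1
    return best[amount]


# greedy_change(48, [25, 10, 5, 1]) -> [25, 10, 10, 1, 1, 1]  (6 coins, optimal)
# greedy_change(6, [4, 3, 1])       -> [4, 1, 1]              (3 coins)
# optimal_change_count(6, [4, 3, 1]) -> 2                     (3 + 3)
```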
Minimum Spanning Tree (MST)
Spanning tree of a connected graph G: a connected acyclic subgraph of G that includes all of G’s vertices
Minimum spanning tree of a weighted, connected graph G: a spanning tree of G of minimum total weight
Example: [figure: example graph on vertices a, b, c, d]
Prim’s MST algorithm
– Start with a tree $T_1$ consisting of one (any) vertex and “grow” the tree one vertex at a time to produce an MST through a series of expanding subtrees $T_1, T_2, \dots, T_n$
– On each iteration, construct $T_{i+1}$ from $T_i$ by adding the vertex not in $T_i$ that is closest to those already in $T_i$ (this is the “greedy” step!)
– Stop when all vertices are included
Prim’s algorithm
Step 0: Original graph
Step 1: D is chosen as an arbitrary starting node
Step 2: A is added into the MST
Step 3: F is added into the MST
Prim’s algorithm
Step 4: B is added into the MST
Step 5: E is added into the MST
Step 6: C is added into the MST
Step 7: G is added into the MST
Notes about Prim’s algorithm
– A proof by induction shows that this construction actually yields an MST
– Needs a priority queue for locating the closest fringe vertex
– Efficiency
  – $O(n^2)$ for weight matrix representation of the graph and array implementation of the priority queue
  – $O(m \log n)$ for adjacency list representation of a graph with $n$ vertices and $m$ edges and min-heap implementation of the priority queue (how do we get this bound?)
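$O(m \log n)$ Prim’s Alg.
Hints
– A min-heap of size $n$, with each vertex keyed by its mini_dist, initially infinity for every vertex except the initial vertex
– parent[n]: the tree parent recorded for each vertex
– $n$ iterations of the heap removal operation, each costing $O(\log n)$
– For each removal, update the mini_dist and parent of the remaining vertices in the heap that are adjacent to the removed vertex; m/n is the average number of edges per vertex, so there are $O(m)$ updates in total, each costing $O(\log n)$
– Total: $O((n + m) \log n) = O(m \log n)$, since $m \ge n - 1$ in a connected graph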
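A heap-based Prim's algorithm can be sketched in Python as follows. Note one deliberate substitution: Python's `heapq` has no decrease-key operation, so instead of updating mini_dist in place as the hints describe, this sketch pushes a new entry and skips stale ones on removal ("lazy deletion"). The heap may then hold up to m entries, which still gives $O(m \log m) = O(m \log n)$. The function name and the graph format (dict of adjacency lists) are assumptions.

```python
import heapq


def prim_mst(graph, start):
    """graph: dict mapping vertex -> list of (neighbor, weight) pairs,
    describing an undirected, connected, weighted graph.
    Returns (total_weight, list of (parent, vertex, weight) tree edges)."""
    in_tree = set()
    edges = []
    total = 0
    # Heap entries are (weight, vertex, parent).
    heap = [(0, start, None)]
    while heap and len(in_tree) < len(graph):
        weight, u, parent = heapq.heappop(heap)
        if u in in_tree:
            continue  # stale entry: u was already reached by a cheaper edge
        in_tree.add(u)
        if parent is not None:
            edges.append((parent, u, weight))
            total += weight
        # Greedy step candidates: edges from the new tree vertex to fringe vertices.
        for v, w in graph[u]:
            if v not in in_tree:
                heapq.heappush(heap, (w, v, u))
    return total, edges


# Hypothetical 4-vertex graph (not the one from the slides).
example_graph = {
    "a": [("b", 1), ("c", 4)],
    "b": [("a", 1), ("c", 2), ("d", 5)],
    "c": [("a", 4), ("b", 2), ("d", 3)],
    "d": [("b", 5), ("c", 3)],
}
# prim_mst(example_graph, "a") -> (6, [("a", "b", 1), ("b", "c", 2), ("c", "d", 3)])
```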
Shortest paths – Dijkstra’s algorithm
Single Source Shortest Paths Problem: Given a weighted connected graph G, find shortest paths from source vertex $s$ to each of the other vertices
Dijkstra’s algorithm: similar to Prim’s MST algorithm, with a different way of computing numerical labels. Among vertices not already in the tree, it finds the vertex $u$ with the smallest sum $d_v + w(v,u)$, where
– $v$ is a vertex for which the shortest path has already been found on preceding iterations (such vertices form a tree)
– $d_v$ is the length of the shortest path from the source to $v$
– $w(v,u)$ is the length (weight) of the edge from $v$ to $u$
Example
[figure: weighted graph on vertices a, b, c, d, e, redrawn after each step]
Tree vertices | Remaining vertices
a(-,0)        | b(a,3)  c(-,∞)  d(a,7)  e(-,∞)
b(a,3)        | c(b,3+4)  d(b,3+2)  e(-,∞)
d(b,5)        | c(b,7)  e(d,5+4)
c(b,7)        | e(d,9)
e(d,9)        |
Notes on Dijkstra’s algorithm
– Doesn’t work for graphs with negative weights
– Applicable to both undirected and directed graphs
– Efficiency
  – $O(|V|^2)$ for graphs represented by weight matrix and array implementation of priority queue
  – $O(|E| \log |V|)$ for graphs represented by adj. lists and min-heap implementation of priority queue
– Don’t mix up Dijkstra’s algorithm with Prim’s algorithm!
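The min-heap variant can be sketched in Python as below. This is an illustration under assumptions: the function name and graph format are invented here, stale heap entries are skipped rather than updated in place (`heapq` has no decrease-key), and the sample graph contains only the five edges that can be read off the labels in the example trace (a–b 3, a–d 7, b–c 4, b–d 2, d–e 4). The original figure may contain further edges.

```python
import heapq


def dijkstra(graph, source):
    """graph: dict mapping vertex -> list of (neighbor, weight) pairs,
    with non-negative weights. Returns (dist, parent) dictionaries."""
    dist = {source: 0}
    parent = {source: None}
    done = set()  # vertices whose shortest path is final (the "tree vertices")
    heap = [(0, source)]
    while heap:
        d, v = heapq.heappop(heap)
        if v in done:
            continue  # stale entry: a shorter path to v was already finalized
        done.add(v)
        for u, w in graph[v]:
            candidate = d + w  # the label d_v + w(v, u)
            if u not in dist or candidate < dist[u]:
                dist[u] = candidate
                parent[u] = v
                heapq.heappush(heap, (candidate, u))
    return dist, parent


# Undirected graph built from the edges visible in the example trace.
trace_graph = {
    "a": [("b", 3), ("d", 7)],
    "b": [("a", 3), ("c", 4), ("d", 2)],
    "c": [("b", 4)],
    "d": [("a", 7), ("b", 2), ("e", 4)],
    "e": [("d", 4)],
}
# dijkstra(trace_graph, "a")[0] -> {"a": 0, "b": 3, "d": 5, "c": 7, "e": 9}
```

The only difference from a heap-based Prim's algorithm is the key pushed onto the heap: the path length `d + w` here, versus the edge weight `w` alone in Prim's.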
Review: The Knapsack Problem The famous knapsack problem: – A thief breaks into a museum. Fabulous paintings, sculptures, and jewels are everywhere. The thief has a good eye for the value of these objects, and knows that each will fetch hundreds or thousands of dollars on the clandestine art collector’s market. But, the thief has only brought a single knapsack to the scene of the robbery, and can take away only what he can carry. What items should the thief take to maximize the haul?
Review: The Knapsack Problem
More formally, the 0-1 knapsack problem:
– The thief must choose among $n$ items, where the $i$th item is worth $v_i$ dollars and weighs $w_i$ pounds
– Carrying at most $W$ pounds, maximize value
– Note: assume $v_i$, $w_i$, and $W$ are all integers
– “0-1” because each item must be taken or left in its entirety
A variation, the fractional knapsack problem:
– The thief can take fractions of items
– Think of items in the 0-1 problem as gold ingots, in the fractional problem as buckets of gold dust
Review: The Knapsack Problem And Optimal Substructure
Both variations exhibit optimal substructure
To show this for the 0-1 problem, consider the most valuable load weighing at most $W$ pounds
– If we remove item $j$ from the load, what do we know about the remaining load?
– A: the remainder must be the most valuable load weighing at most $W - w_j$ that the thief could take from the museum, excluding item $j$
Solving The Knapsack Problem
The optimal solution to the fractional knapsack problem can be found with a greedy algorithm
– How?
The optimal solution to the 0-1 problem cannot be found with the same greedy strategy
– Greedy strategy: take items in order of dollars/pound
– Example: 3 items weighing 10, 20, and 30 pounds; the knapsack can hold 50 pounds
– Suppose item 2 is worth $100. Assign values to the other items so that the greedy strategy will fail
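One way to see both claims is the Python sketch below. The values 60 and 120 for items 1 and 3 are an assumption (one possible answer to the exercise, not given on the slide), and both function names are illustrative.

```python
def fractional_knapsack(items, capacity):
    """items: list of (value, weight). Fractions of items may be taken.
    Greedy by value per pound is optimal for this variant."""
    total = 0.0
    for value, weight in sorted(items, key=lambda it: it[0] / it[1], reverse=True):
        if capacity <= 0:
            break
        take = min(weight, capacity)  # whole item, or whatever still fits
        total += value * take / weight
        capacity -= take
    return total


def greedy_01_knapsack(items, capacity):
    """Same ordering, but items are taken whole or skipped.
    This can miss the optimum for the 0-1 variant."""
    total = 0
    for value, weight in sorted(items, key=lambda it: it[0] / it[1], reverse=True):
        if weight <= capacity:
            total += value
            capacity -= weight
    return total


# (value, weight): item 2 is worth $100 as on the slide; 60 and 120 are assumed.
items = [(60, 10), (100, 20), (120, 30)]
# fractional_knapsack(items, 50) -> 240.0  (items 1 and 2, plus 2/3 of item 3)
# greedy_01_knapsack(items, 50)  -> 160    (items 1 and 2; item 3 no longer fits)
# The best 0-1 load is items 2 and 3, worth 220.
```

With these values, item 1 has the best ratio ($6/pound), so the greedy 0-1 strategy commits to it and leaves 20 pounds of capacity unused.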
The Knapsack Problem: Greedy vs. DP The fractional problem can be solved greedily The 0-1 problem cannot be solved with a greedy approach – As you have seen, however, it can be solved with dynamic programming
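A compact DP for the 0-1 problem is sketched below, assuming integer weights and capacity as stated earlier. The function name and the sample values are assumptions used for illustration.

```python
def knapsack_01(items, capacity):
    """items: list of (value, weight) with integer weights.
    Returns the maximum total value of a subset weighing at most capacity."""
    # best[w] = best value achievable with total weight <= w
    # using the items processed so far.
    best = [0] * (capacity + 1)
    for value, weight in items:
        # Go downward so each item is counted at most once.
        for w in range(capacity, weight - 1, -1):
            best[w] = max(best[w], best[w - weight] + value)
    return best[capacity]


# knapsack_01([(60, 10), (100, 20), (120, 30)], 50) -> 220
```

The running time is O(nW), which is why the integer assumption on the weights matters.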
Coding Problem
Coding: assignment of bit strings to alphabet characters
Codewords: bit strings assigned to characters of the alphabet
Two types of codes:
– fixed-length encoding (e.g., ASCII)
– variable-length encoding (e.g., Morse code)
Prefix-free codes: no codeword is a prefix of another codeword
Problem: If the frequencies of the character occurrences are known, what is the best binary prefix-free code?
Huffman codes
Any binary tree with edges labeled with 0’s and 1’s yields a prefix-free code of characters assigned to its leaves
An optimal binary tree, minimizing the expected (weighted average) length of a codeword, can be constructed as follows
Huffman’s algorithm
– Initialize $n$ one-node trees with the alphabet characters, and set each tree’s weight to its character’s frequency.
– Repeat the following step $n-1$ times: join the two binary trees with the smallest weights into one (as left and right subtrees) and make its weight equal to the sum of the weights of the two trees.
– Mark edges leading to left and right subtrees with 0’s and 1’s, respectively.
Example
[table: character / frequency / codeword for A, B, C, D, _; the frequency and codeword values are missing]
average bits per character: 2.25
for fixed-length encoding: 3
compression ratio: (3 − 2.25)/3 × 100% = 25%
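Huffman’s algorithm can be sketched in Python with a min-heap of trees. The frequencies used below are an assumption: the table values are not available here, so percentages were picked for which the average works out to 2.25 bits per character. The function name and the tree representation (a symbol, or a `(left, right)` pair) are also illustrative.

```python
import heapq
from itertools import count


def huffman_codes(freq):
    """freq: dict mapping symbol (a string) -> weight.
    Returns a dict mapping symbol -> codeword (string of 0s and 1s)."""
    tiebreak = count()  # keeps heap entries comparable when weights are equal
    heap = [(w, next(tiebreak), sym) for sym, w in freq.items()]
    heapq.heapify(heap)
    # Join the two lightest trees until one tree remains (n - 1 joins).
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next(tiebreak), (left, right)))

    codes = {}

    def walk(tree, prefix):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")  # left edge labeled 0
            walk(tree[1], prefix + "1")  # right edge labeled 1
        else:
            codes[tree] = prefix or "0"  # single-symbol alphabet gets "0"

    if heap:
        walk(heap[0][2], "")
    return codes


# Assumed frequencies in percent (not taken from the slide's table).
freq = {"A": 35, "B": 10, "C": 20, "D": 20, "_": 15}
codes = huffman_codes(freq)
average_bits = sum(freq[c] * len(codes[c]) for c in freq) / 100  # 2.25
```

With these weights, A, C and D receive 2-bit codewords and B and _ receive 3-bit codewords, so the average is 2 × 0.75 + 3 × 0.25 = 2.25 bits.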