1
GREEDY ALGORITHMS UNIT IV
2
TOPICS TO BE COVERED
– Fractional Knapsack Problem
– Huffman Coding
– Single-Source Shortest Paths
– Minimum Spanning Trees
– Task Scheduling Problem
– Backtracking: Introduction and the N-Queens Problem
3
Overview Like dynamic programming, greedy algorithms are used to solve optimization problems. The problems exhibit optimal substructure and also the greedy-choice property. – A greedy algorithm always makes the choice that looks best at the moment. – It makes a locally optimal choice in the hope of reaching a globally optimal solution. – Greedy algorithms do not always yield optimal solutions, but for many problems they do. – The greedy method is quite powerful and works well for a wide range of problems.
4
Elements of the greedy strategy
1. Determine the optimal substructure of the problem.
2. Develop a recursive solution.
3. Prove that at any stage of the recursion, one of the optimal choices is the greedy choice; thus, it is always safe to make the greedy choice.
4. Show that all but one of the subproblems induced by having made the greedy choice are empty.
5. Develop a recursive algorithm that implements the greedy strategy.
6. Convert the recursive algorithm to an iterative algorithm.
5
The Fractional Knapsack Problem Given: a set S of n items, with each item i having
– b_i, a positive benefit
– w_i, a positive weight
Goal: choose items with maximum total benefit but with total weight at most W. If we are allowed to take fractional amounts of items, this is the fractional knapsack problem.
– In this case, we let x_i denote the amount we take of item i, with 0 ≤ x_i ≤ w_i
– Objective: maximize the total benefit, Σ over i in S of b_i (x_i / w_i)
– Constraint: keep the total weight within the capacity, Σ over i in S of x_i ≤ W
6
Example Given: a set S of n items, with each item i having
– b_i, a positive benefit
– w_i, a positive weight
Goal: choose items with maximum total benefit but with weight at most W.

Items:              1      2      3      4      5
Weight:          4 ml   8 ml   2 ml   6 ml   1 ml
Benefit:          $12    $32    $40    $30    $50
Value ($ per ml):   3      4     20      5     50

Knapsack capacity: 10 ml
Solution: 1 ml of item 5, 2 ml of item 3, 6 ml of item 4, 1 ml of item 2
7
The Fractional Knapsack Algorithm Greedy choice: keep taking the item with the highest value (benefit-to-weight ratio).
– If a heap-based priority queue is used to store the items, the time complexity is O(n log n).
Correctness: suppose there is a better solution. Then there is an item i with higher value than a chosen item j (i.e., v_j < v_i); replacing some amount of j with i would yield an even better solution, contradicting the assumption that the other solution was optimal. Thus, there is no better solution than the greedy one.

Algorithm fractionalKnapsack(S, W)
  Input: set S of items with benefit b_i and weight w_i; maximum total weight W
  Output: amount x_i of each item i that maximizes total benefit with total weight at most W
  for each item i in S
    x_i ← 0
    v_i ← b_i / w_i    {value}
  w ← 0    {current total weight}
  while w < W
    remove item i with highest v_i
    x_i ← min{w_i, W − w}
    w ← w + min{w_i, W − w}
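The pseudocode above maps almost directly onto Python. The following is a minimal sketch, assuming items are given as (benefit, weight) pairs; the function name, the tuple layout, and the use of heapq as the priority queue are illustrative choices, not part of the original slides.

import heapq

def fractional_knapsack(items, W):
    """items: list of (benefit, weight) pairs; W: knapsack capacity.
    Returns (total_benefit, amounts), where amounts[i] is how much of item i is taken."""
    # Max-heap keyed on value = benefit / weight (heapq is a min-heap, so negate the key).
    heap = [(-b / w, i, b, w) for i, (b, w) in enumerate(items)]
    heapq.heapify(heap)

    amounts = [0.0] * len(items)
    total_benefit = 0.0
    remaining = W
    while remaining > 0 and heap:
        neg_value, i, b, w = heapq.heappop(heap)   # item with highest benefit/weight
        take = min(w, remaining)                   # take all of it, or just what still fits
        amounts[i] = take
        total_benefit += (-neg_value) * take
        remaining -= take
    return total_benefit, amounts

# Items from the example slide: (benefit $, weight ml), capacity 10 ml.
benefit, amounts = fractional_knapsack([(12, 4), (32, 8), (40, 2), (30, 6), (50, 1)], 10)
print(benefit, amounts)   # 124.0; all of items 3, 4, 5 and 1 ml of item 2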
8
Greedy versus dynamic programming (0-1 Knapsack Problem)
9
Huffman codes Huffman codes are a widely used and very effective technique for compressing data; savings of 20% to 90% are typical, depending on the characteristics of the data being compressed. We consider the data to be a sequence of characters. Huffman's greedy algorithm uses a table of the frequencies of occurrence of the characters to build up an optimal way of representing each character as a binary string.
10
Example Suppose we have a 100,000-character data file that we wish to store compactly. We observe that the characters in the file occur with the frequencies given in the table below. That is, only six different characters appear, and the character 'a' occurs 45,000 times.

Character                   a     b     c     d     e     f
Frequency (in thousands)   45    13    12    16     9     5
Fixed-length codeword     000   001   010   011   100   101
Variable-length codeword    0   101   100   111  1101  1100
11
Example There are many ways to represent such a file of information. We consider the problem of designing a binary character code (or code for short) wherein each character is represented by a unique binary string. If we use a fixed-length code, we need 3 bits to represent six characters: a = 000, b = 001,..., f = 101. This method requires 300,000 bits to code the entire file. Can we do better?
12
Example A variable-length code can do considerably better than a fixed-length code by giving frequent characters short codewords and infrequent characters long codewords. The table above shows such a code: the 1-bit string 0 represents 'a', and the 4-bit string 1100 represents 'f'. This code requires (45·1 + 13·3 + 12·3 + 16·3 + 9·4 + 5·4) · 10³ = 224 · 10³ bits to represent the file, a savings of approximately 25%. In fact, this is an optimal character code for this file.
13
Huffman codes (Basic Concepts) We consider here only codes in which no codeword is also a prefix of some other codeword. Such codes are called prefix codes. Encoding is always simple for any binary character code; we just concatenate the codewords representing each character of the file. We code the 3-character file abc as 0·101·100 = 0101100, where we use “·” to denote concatenation.
14
Huffman codes (Basic Concepts) Prefix codes are desirable because they simplify decoding. Since no codeword is a prefix of any other, the codeword that begins an encoded file is unambiguous. We can simply identify the initial codeword, translate it back to the original character, and repeat the decoding process on the remainder of the encoded file.
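Because the code is prefix-free, a decoder can scan the bit string left to right and emit a character as soon as the bits read so far match a codeword. Below is a small Python sketch using the variable-length code from the example; the dictionary layout and function name are illustrative assumptions.

def decode_prefix_code(bits, code):
    """Decode a bit string using a prefix code given as {char: codeword}."""
    inverse = {codeword: ch for ch, codeword in code.items()}
    result, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:          # unambiguous because no codeword is a prefix of another
            result.append(inverse[current])
            current = ""
    return "".join(result)

code = {"a": "0", "b": "101", "c": "100", "d": "111", "e": "1101", "f": "1100"}
print(decode_prefix_code("0101100", code))   # "abc"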
15
Huffman codes (Construction) Huffman invented a greedy algorithm that constructs an optimal prefix code called a Huffman code. Keeping in line with our observations, its proof of correctness relies on the greedy-choice property and optimal substructure. In the pseudo-code that follows: we assume that C is a set of n characters and each character c ∈ C is an object with a defined frequency f[c]. The algorithm builds the tree T corresponding to the optimal code in a bottom-up manner. It begins with a set of |C| leaves and performs a sequence of |C| − 1 “merging” operations to create the final tree. A min-priority queue Q, keyed on f, is used to identify the two least frequent objects to merge together. The result of the merger of two objects is a new object whose frequency is the sum of the frequencies of the two objects that were merged.
16
Huffman codes (Algorithm)
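A minimal Python sketch of the bottom-up construction just described: a min-priority queue (heapq) repeatedly merges the two least frequent subtrees, and the codewords are then read off the finished tree. The node representation (nested tuples), the tie-breaking counter, and the function name are illustrative assumptions, not the slide's own notation.

import heapq
from itertools import count

def huffman(frequencies):
    """frequencies: dict {char: frequency}. Returns dict {char: codeword}."""
    tie = count()   # tie-breaker so heapq never has to compare tree payloads
    # Each heap entry: (frequency, tie, tree); a leaf tree is just the character itself.
    heap = [(f, next(tie), ch) for ch, f in frequencies.items()]
    heapq.heapify(heap)
    # |C| - 1 merges: repeatedly join the two least frequent subtrees.
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tie), (left, right)))
    _, _, tree = heap[0]

    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):          # internal node: 0 for left child, 1 for right
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                                # leaf: record the accumulated codeword
            codes[node] = prefix or "0"
        return codes
    return walk(tree, "")

freq = {"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5}
codes = huffman(freq)
print(sum(freq[ch] * len(cw) for ch, cw in codes.items()))   # 224 (thousand bits), as in the example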
17
Huffman codes (Example)

Character                   a     b     c     d     e     f
Frequency (in thousands)   45    13    12    16     9     5
Fixed-length codeword     000   001   010   011   100   101
Variable-length codeword    0   101   100   111  1101  1100
18
Huffman codes (Example)
21
Char.   Code
a       0
b       101
c       100
d       111
e       1101
f       1100
22
Single Source Shortest Path We are given a weighted, directed graph G = (V, E), with a weight function w : E → R mapping edges to real-valued weights. The weight of a path p = ⟨v0, v1, ..., vk⟩ is the sum of the weights of its constituent edges:

w(p) = w(v0, v1) + w(v1, v2) + ... + w(vk−1, vk)

We define the shortest-path weight from u to v by

δ(u, v) = min{ w(p) : p is a path from u to v }, if a path from u to v exists; otherwise δ(u, v) = ∞.
23
Single Source Shortest Path A shortest path from vertex u to vertex v is then defined as any path p with weight w(p) = δ(u, v). Relaxation The algorithms use the technique of relaxation. For each vertex v ∈ V, we maintain an attribute d[v], which is an upper bound on the weight of a shortest path from the source s to v. We call d[v] a shortest-path estimate. We initialize the shortest-path estimates and predecessors by the following Θ(V)-time procedure.
24
Single Source Shortest Path
INITIALIZE-SINGLE-SOURCE(G, s)
1  for each vertex v ∈ V[G]
2    do d[v] ← ∞
3       π[v] ← NIL
4  d[s] ← 0
After initialization, π[v] = NIL for all v ∈ V, d[s] = 0, and d[v] = ∞ for v ∈ V − {s}.
25
Single Source Shortest Path The process of relaxing an edge (u, v) consists of testing whether we can improve the shortest path to v found so far by going through u and, if so, updating d[v] and π[v]. A relaxation step may decrease the value of the shortest-path estimate d[v] and update v’s predecessor field π[v]. The following code performs a relaxation step on edge (u, v).
26
Single Source Shortest Path
RELAX(u, v, w)
1  if d[v] > d[u] + w(u, v)
2    then d[v] ← d[u] + w(u, v)
3         π[v] ← u
27
Single Source Shortest Path Dijkstra’s algorithm Dijkstra’s algorithm solves the single-source shortest-paths problem on a weighted, directed graph G = (V, E) for the case in which all edge weights are nonnegative. We assume that w(u, v) ≥ 0 for each edge (u, v) ∈ E. Dijkstra’s algorithm maintains a set S of vertices whose final shortest-path weights from the source s have already been determined.
28
Single Source Shortest Path Dijkstra’s algorithm The algorithm repeatedly selects the vertex u ∈ V − S with the minimum shortest-path estimate, adds u to S, and relaxes all edges leaving u. In the following implementation, we use a min-priority queue Q of vertices, keyed by their d values.
29
Single Source Shortest Path Dijkstra’s algorithm
DIJKSTRA(G, w, s)
1  INITIALIZE-SINGLE-SOURCE(G, s)
2  S ← ∅
3  Q ← V[G]
4  while Q ≠ ∅
5    do u ← EXTRACT-MIN(Q)
6       S ← S ∪ {u}
7       for each vertex v ∈ Adj[u]
8         do RELAX(u, v, w)
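A compact Python sketch of the same procedure, using heapq as the min-priority queue with lazy deletion in place of an explicit EXTRACT-MIN over all vertices. The adjacency-dict graph format, the sample graph, and all names are illustrative assumptions.

import heapq

def dijkstra(adj, s):
    """adj: {u: [(v, w), ...]} with nonnegative weights w; s: source vertex.
    Returns (d, pi): shortest-path estimates and predecessors."""
    d = {u: float("inf") for u in adj}      # INITIALIZE-SINGLE-SOURCE
    pi = {u: None for u in adj}
    d[s] = 0
    queue = [(0, s)]                        # min-priority queue keyed by d values
    done = set()                            # the set S of finished vertices
    while queue:
        du, u = heapq.heappop(queue)
        if u in done:                       # stale entry left by an earlier improvement
            continue
        done.add(u)
        for v, w in adj[u]:                 # relax every edge leaving u
            if d[v] > du + w:
                d[v] = du + w
                pi[v] = u
                heapq.heappush(queue, (d[v], v))
    return d, pi

# Small example graph with nonnegative edge weights, as Dijkstra requires.
adj = {"s": [("t", 10), ("y", 5)],
       "t": [("x", 1), ("y", 2)],
       "y": [("t", 3), ("x", 9), ("z", 2)],
       "x": [("z", 4)],
       "z": [("s", 7), ("x", 6)]}
d, pi = dijkstra(adj, "s")
print(d)   # {'s': 0, 't': 8, 'y': 5, 'x': 9, 'z': 7}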
30
Single Source Shortest Path Dijkstra’s algorithm
36
Task Scheduling Given: a set T of n tasks, each having:
– a start time s_i
– a finish time f_i (where s_i < f_i)
Goal: perform all the tasks using a minimum number of “machines.”
37
Task Scheduling Algorithm Greedy choice: consider tasks in order of their start time and use as few machines as possible with this order.

Algorithm taskSchedule(T)
  Input: set T of tasks with start time s_i and finish time f_i
  Output: non-conflicting schedule with the minimum number of machines
  m ← 0    {number of machines}
  while T is not empty
    remove task i with smallest s_i
    if there is a machine j with no conflicting task then
      schedule i on machine j
    else
      m ← m + 1
      schedule i on machine m
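A short Python sketch of this greedy schedule. It assumes tasks are (start, finish) pairs, that a machine whose previous task finishes exactly at s_i may take a task starting at s_i, and that machines can be tracked by the finish time of their last assigned task; the names are illustrative.

import heapq

def task_schedule(tasks):
    """tasks: list of (start, finish) with start < finish.
    Returns (assignment, machines): {task_index: machine_number} and the machine count."""
    order = sorted(range(len(tasks)), key=lambda i: tasks[i][0])   # consider tasks by start time
    busy = []             # min-heap of (finish_time, machine) for machines with a task scheduled
    assignment = {}
    machines = 0
    for i in order:
        s, f = tasks[i]
        if busy and busy[0][0] <= s:             # some machine is free again by time s
            _, j = heapq.heappop(busy)
        else:                                    # every machine is busy: open a new one
            machines += 1
            j = machines
        assignment[i] = j
        heapq.heappush(busy, (f, j))
    return assignment, machines

# Tasks from the example slide (already ordered by start time).
tasks = [(1, 4), (1, 3), (2, 5), (3, 7), (4, 7), (6, 9), (7, 8)]
print(task_schedule(tasks)[1])   # 3 machines suffice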
38
Task Scheduling Algorithm Running time: given a set of n tasks specified by their start and finish times, algorithm taskSchedule produces a schedule of the tasks with the minimum number of machines in O(n log n) time.
– Use a heap-based priority queue to store the tasks, with the start times as the priorities.
– Finding the task with the earliest start time then takes O(log n) time.
39
Example Given: a set T of n tasks, each having:
– a start time s_i
– a finish time f_i (where s_i < f_i)
– tasks [1,4], [1,3], [2,5], [3,7], [4,7], [6,9], [7,8] (ordered by start time)
Goal: perform all tasks on a minimum number of machines.
[Figure: the seven tasks laid out on a time axis from 1 to 9, scheduled on Machine 1, Machine 2, and Machine 3.]
40
Backtracking Suppose you have to make a series of decisions, among various choices, where:
– you don’t have enough information to know what to choose
– each decision leads to a new set of choices
– some sequence of choices (possibly more than one) may be a solution to your problem
Backtracking is a methodical way of trying out various sequences of decisions until you find one that “works.”
41
Backtracking (animation) [Animation: starting at the root, the search follows a branch to a dead end, backs up and tries other branches, hitting further dead ends, until one sequence of choices reaches a node marked “success.”]
42
Terminology I A tree is composed of nodes. There are three kinds of nodes: the (one) root node, internal nodes, and leaf nodes. Backtracking can be thought of as searching a tree for a particular “goal” leaf node.
43
Terminology II Each non-leaf node in a tree is a parent of one or more other nodes (its children). Each node in the tree, other than the root, has exactly one parent. Usually we draw our trees downward, with the root at the top.
44
The backtracking algorithm Backtracking is really quite simple: we “explore” each node, as follows.
To “explore” node N:
1. If N is a goal node, return “success”.
2. If N is a leaf node, return “failure”.
3. For each child C of N:
   3.1. Explore C.
        3.1.1. If C was successful, return “success”.
4. Return “failure”.
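The recursion above maps directly onto a small Python sketch. The callbacks children and is_goal, which define the tree being searched, are assumed names not taken from the slide; a non-goal leaf is handled implicitly, because its (empty) list of children makes the loop fall through to the final failure return.

def explore(node, children, is_goal):
    """Return a path (list of nodes) from node to a goal node, or None if this subtree fails."""
    if is_goal(node):                       # 1. goal node: success
        return [node]
    for child in children(node):            # 3. try each child in turn
        path = explore(child, children, is_goal)
        if path is not None:                # 3.1.1. a child succeeded
            return [node] + path
    return None                             # 2./4. leaf or all children failed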
45
FOUR QUEENS PROBLEM [Board diagrams: the four-queens search explored step by step. Queens are placed one at a time in the first safe square found; whenever the current queen cannot be placed safely, the search backtracks and moves the previously placed queen, until a full placement of four non-attacking queens is reached.]
46
8 QUEENS PROBLEM [Board diagram: an 8 × 8 chessboard with one queen (Q) in each row, illustrating a placement in which no two queens attack each other.]
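The queens problems are a direct application of the backtracking scheme above: each level of the search tree places one queen in the next column, and a child is generated only if the new queen attacks none of the queens already placed. A minimal Python sketch; the function and variable names are illustrative.

def n_queens(n):
    """Return one solution as a list rows, where rows[c] is the row of the queen in column c,
    or None if no solution exists."""
    def safe(rows, row):
        col = len(rows)
        # Unsafe if the new queen shares a row or a diagonal with any placed queen.
        return all(row != r and abs(row - r) != col - c
                   for c, r in enumerate(rows))

    def place(rows):
        if len(rows) == n:                  # goal: every column has a queen
            return rows
        for row in range(n):                # children: each safe row in the next column
            if safe(rows, row):
                solution = place(rows + [row])
                if solution is not None:
                    return solution
        return None                         # dead end: backtrack

    return place([])

print(n_queens(4))   # [1, 3, 0, 2] (rows counted from 0)
print(n_queens(8))   # one of the 92 eight-queens solutions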
47
The End