Lecture 2: Greedy Algorithms


The change-making problem: give change for a specific amount n using the fewest coins of the denominations d1 > d2 > … > dm.
For example, suppose the coin denominations in some country are 10 dollars, 5 dollars and 1 dollar. How do you give change of 28 dollars?
2 × 10 dollars + 1 × 5 dollars + 3 × 1 dollar = 28 dollars.
GREEDY ALGORITHM: (1) use as many 10-dollar coins as possible; (2) use as many 5-dollar coins as possible; (3) pay the remaining amount with 1-dollar coins.
You can prove that the number of coins used is minimized.
2019/5/22 CS3335 Design and Analysis of Algorithms/WANG Lusheng
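The three greedy steps above can be sketched in a few lines of Python. This is only an illustration; the function name greedy_change is an assumption, not something from the lecture.

```python
def greedy_change(amount, denominations):
    """Pick coins greedily, largest denomination first.

    Returns a dict {denomination: number of coins used}.
    greedy_change is a hypothetical helper name used for illustration.
    """
    counts = {}
    remaining = amount
    for d in sorted(denominations, reverse=True):
        # Use as many coins of value d as possible, then move on.
        counts[d], remaining = divmod(remaining, d)
    if remaining != 0:
        raise ValueError("amount cannot be paid exactly with these coins")
    return counts


print(greedy_change(28, [10, 5, 1]))  # {10: 2, 5: 1, 1: 3}
```

For 28 dollars this reproduces the answer on the slide: two 10-dollar coins, one 5-dollar coin and three 1-dollar coins.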

The change-making problem: strange denominations
Denominations: 7 dollars, 5 dollars and 1 dollar.
Greedy algorithm for 11 dollars: 1 × 7 dollars + 4 × 1 dollar = 11 dollars (5 coins are required).
A better way: 2 × 5 dollars + 1 × 1 dollar = 11 dollars (3 coins are required).
Sometimes the greedy algorithm works, sometimes it does not.
For denominations 10, 5 and 1 it works (the proof is easy):
If 5-dollar coins are used at least twice, then the remaining amount is ≥ 10 and we can use one 10-dollar coin to replace two 5-dollar coins.
If 1-dollar coins are used at least five times, then the remaining amount is ≥ 5 and we can use one 5-dollar coin to replace five 1-dollar coins.
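To see the failure concretely, the sketch below compares the greedy coin count with the true minimum. The minimum is computed with a small dynamic-programming table, which is not part of this lecture and is used here only as a reference; both function names are assumptions.

```python
def greedy_coin_count(amount, denominations):
    """Number of coins used by the greedy rule (largest coin first).

    Assumes a 1-unit coin exists, so the amount can always be paid.
    """
    count = 0
    for d in sorted(denominations, reverse=True):
        used, amount = divmod(amount, d)
        count += used
    return count


def optimal_coin_count(amount, denominations):
    """Minimum number of coins, via a reference dynamic-programming table."""
    INF = float("inf")
    best = [0] + [INF] * amount  # best[a] = fewest coins that pay a
    for a in range(1, amount + 1):
        for d in denominations:
            if d <= a and best[a - d] + 1 < best[a]:
                best[a] = best[a - d] + 1
    return best[amount]


print(greedy_coin_count(11, [7, 5, 1]))   # 5
print(optimal_coin_count(11, [7, 5, 1]))  # 3
```

With denominations 10, 5 and 1 the two functions agree, which is what the proof on the following slides establishes.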

Greedy works for denominations 10, 5 and 1
Strategy for proof (the strategy can be used for many problems; this is the hard part):
Compare an optimal solution with the solution given by the greedy algorithm, component by component.
Let an optimal solution, denoted (x, y, z), have x 10-dollar coins, y 5-dollar coins and z 1-dollar coins.
Let the solution obtained from our greedy algorithm, denoted (x', y', z'), have x' 10-dollar coins, y' 5-dollar coins and z' 1-dollar coins.
Show that the two solutions are the same.

Greedy works for denominations 10, 5 and 1 (fun part, not tested)
Theorem: The greedy algorithm works for denominations 10, 5 and 1.
Proof: We compare x and x' (the number of 10-dollar coins), y and y' (the number of 5-dollar coins) and z and z' (the number of 1-dollar coins), one by one.
Comparison of x and x': since x' comes from the greedy algorithm, x' ≥ x. If x' > x, then 5y + z ≥ 10, because 10x + 5y + z = 10x' + 5y' + z' and therefore 5y + z = 10(x' − x) + 5y' + z' ≥ 10.
Thus we can modify the optimal solution (x, y, z) by using one more 10-dollar coin to replace 10 dollars that are expressed by (1) two 5-dollar coins, (2) one 5-dollar coin plus five 1-dollar coins, or (3) ten 1-dollar coins. The new solution (x+1, y'', z'') contains fewer coins than (x, y, z). Contradiction, because by assumption (x, y, z) is optimal.
Thus x' > x cannot be true. Therefore x' = x. Similarly, we can show y' = y and z' = z.

The 0-1 knapsack problem: there are N items, where the i-th item is worth vi dollars and weighs wi pounds; vi and wi are integers. A thief can carry at most W pounds (W is an integer). How can the thief take as valuable a load as possible? An item cannot be divided into pieces.
The fractional knapsack problem: the same setting, but the thief can take fractions of items.

Solving the fractional knapsack problem
Greedy on the value per pound vi/wi: each time, take the item with the maximum vi/wi. If taking the whole item would exceed W, take only the fraction of the item that fits.
Proof of correctness (the hard part):
Let X = i1, i2, …, ik be the items taken in an optimal solution. Consider the item j with the highest value per pound vj/wj. If j is not fully used in X, get rid of some items (possibly fractions of items) of the same total weight and add item j in their place (since fractional items are allowed, we can do this). The total value does not decrease, because j has the highest value per pound. One more item selected by greedy is now in X.
Repeating the process, X is changed to contain all items selected by greedy WITHOUT decreasing the total value taken by the thief.
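A minimal sketch of the greedy rule for the fractional knapsack follows. The function name fractional_knapsack and the sample items are hypothetical; exact fractions are used so the result is not affected by floating-point rounding.

```python
from fractions import Fraction


def fractional_knapsack(items, capacity):
    """Greedy by value per pound; items is a list of (weight, value).

    Returns (total value, list of (weight, value, fraction taken)).
    """
    total = Fraction(0)
    remaining = Fraction(capacity)
    taken = []
    # Highest value per pound first.
    by_density = sorted(items, key=lambda it: Fraction(it[1], it[0]), reverse=True)
    for weight, value in by_density:
        if remaining <= 0:
            break
        # Take the whole item if it fits, otherwise the fraction that fits.
        fraction = min(Fraction(1), remaining / weight)
        total += value * fraction
        remaining -= weight * fraction
        taken.append((weight, value, fraction))
    return total, taken


# Hypothetical sample data: (weight in pounds, value in dollars), W = 50.
value, taken = fractional_knapsack([(10, 60), (20, 100), (30, 120)], 50)
print(value)  # 240
```

In this sample the first two items are taken whole and two thirds of the third item fills the remaining 20 pounds.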

The 0-1 knapsack problem cannot be solved by greedy
Counterexample (moderate part): W = 10.
Items found: (6 pounds, 12 dollars), (5 pounds, 9 dollars), (5 pounds, 9 dollars), (3 pounds, 3 dollars), (3 pounds, 3 dollars).
If we first take (6, 12) according to the greedy algorithm, then the solution is (6, 12), (3, 3), with total value 12 + 3 = 15.
However, a better solution is (5, 9), (5, 9), with total value 18.
To show that a statement does not hold, we only have to give a counterexample. To show that a theorem is true, we have to give a proof.
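The counterexample can be checked mechanically. The sketch below applies the value-per-pound greedy rule to whole items and compares it with an exhaustive search over all subsets; the exhaustive search is only a reference for this tiny instance, and both function names are assumptions.

```python
from itertools import combinations


def greedy_01(items, capacity):
    """Value-per-pound greedy on whole items; items is a list of (weight, value)."""
    total_value = 0
    remaining = capacity
    for weight, value in sorted(items, key=lambda it: it[1] / it[0], reverse=True):
        if weight <= remaining:  # an item is taken whole or not at all
            remaining -= weight
            total_value += value
    return total_value


def best_01(items, capacity):
    """Best achievable value, found by trying every subset (exponential time)."""
    best = 0
    for r in range(len(items) + 1):
        for subset in combinations(items, r):
            if sum(w for w, _ in subset) <= capacity:
                best = max(best, sum(v for _, v in subset))
    return best


items = [(6, 12), (5, 9), (5, 9), (3, 3), (3, 3)]
print(greedy_01(items, 10), best_01(items, 10))  # 15 18
```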

Huffman codes
Binary character code: each character is represented by a unique binary string. A data file can be coded in two ways:

                      a    b    c    d    e     f
frequency (%)         45   13   12   16   9     5
fixed-length code     000  001  010  011  100   101
variable-length code  0    101  100  111  1101  1100

For a file of 100 characters, the first way needs 100 × 3 = 300 bits.
The second way needs 45×1 + 13×3 + 12×3 + 16×3 + 9×4 + 5×4 = 224 bits.
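The two bit counts can be recomputed with a short script. The variable-length codewords for a, b and c are taken from the encoding example abc = 0.101.100 used in this lecture; the function name total_bits is an assumption.

```python
# Frequencies (per 100 characters) and the two codes from the example.
freq = {"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5}
fixed = {"a": "000", "b": "001", "c": "010", "d": "011", "e": "100", "f": "101"}
variable = {"a": "0", "b": "101", "c": "100", "d": "111", "e": "1101", "f": "1100"}


def total_bits(freq, code):
    """Sum over characters of frequency times codeword length."""
    return sum(freq[ch] * len(code[ch]) for ch in freq)


print(total_bits(freq, fixed))     # 300
print(total_bits(freq, variable))  # 224
```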

Variable-length code
Some care is needed to read the code. Consider 001011101 with codewords a=0, b=00, c=01, d=11. Where to cut? 00 can be read as either aa or b.
Prefixes of 0011: 0, 00, 001 and 0011.
Prefix codes (prefix-free codes): no codeword is a prefix of any other codeword.
Prefix codes are simple to encode and decode.

Using the codewords in the table to encode and decode
Variable-length codewords: a=0, b=101, c=100, d=111, e=1101, f=1100.
Encode: abc = 0.101.100 = 0101100 (just concatenate the codewords).
Decode: 001011101 = 0.0.101.1101 = aabe.
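Encoding and decoding with a prefix code can be sketched as follows. The helper names (is_prefix_free, encode, decode) are assumptions. Decoding reads bits until the bits collected so far match a codeword, which is unambiguous precisely because no codeword is a prefix of another.

```python
CODE = {"a": "0", "b": "101", "c": "100", "d": "111", "e": "1101", "f": "1100"}


def is_prefix_free(code):
    """True if no codeword is a prefix of a different codeword."""
    words = list(code.values())
    return not any(
        i != j and words[j].startswith(words[i])
        for i in range(len(words))
        for j in range(len(words))
    )


def encode(text, code):
    """Concatenate the codewords of the characters."""
    return "".join(code[ch] for ch in text)


def decode(bits, code):
    """Cut the bit string greedily at each complete codeword."""
    inverse = {word: ch for ch, word in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a codeword")
    return "".join(out)


print(encode("abc", CODE))        # 0101100
print(decode("001011101", CODE))  # aabe
```

The ambiguous code a=0, b=00, c=01, d=11 fails the is_prefix_free check, since 0 is a prefix of 00 and of 01.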

Figure: the binary tree for the fixed-length codeword (left) and the binary tree for the variable-length codeword (right), with leaves a:45, b:13, c:12, d:16, e:9, f:5 and root 100. Decoding 001011101 = 0.0.101.1101 = aabe follows the right tree.

Binary tree
In the tree of an optimal code, every nonleaf node has two children. The fixed-length code in our example is not optimal.
The total number of bits required to encode a file is
$B(T) = \sum_{c \in C} f(c)\, d_T(c)$
where f(c) is the frequency (number of occurrences) of c in the file and dT(c) denotes the depth of c's leaf in the tree T.

Constructing an optimal code
Formal definition of the problem:
Input: a set of characters C = {c1, c2, …, cn}, where each c ∈ C has frequency f[c].
Output: a binary tree representing codewords so that the total number of bits required for the file is minimized.
Huffman proposed a greedy algorithm to solve the problem.

Figure: steps (a) to (f) of Huffman's algorithm on the example frequencies.
(a) The initial queue: f:5, e:9, c:12, b:13, d:16, a:45.
(b) f:5 and e:9 are merged into a node of frequency 14.
(c) c:12 and b:13 are merged into a node of frequency 25.
(d) The node 14 and d:16 are merged into a node of frequency 30.
(e) The nodes 25 and 30 are merged into a node of frequency 55.
(f) a:45 and the node 55 are merged into the root of frequency 100.

HUFFMAN(C)
1  n := |C|
2  Q := C
3  for i := 1 to n-1 do
4      z := ALLOCATE_NODE()
5      x := left[z] := EXTRACT_MIN(Q)
6      y := right[z] := EXTRACT_MIN(Q)
7      f[z] := f[x] + f[y]
8      INSERT(Q, z)
9  return EXTRACT_MIN(Q)

The Huffman Algorithm
This algorithm builds the tree T corresponding to the optimal code in a bottom-up manner.
C is a set of n characters, and each character c in C has a defined frequency f[c].
Q is a priority queue, keyed on f, used to identify the two least-frequent objects to merge together.
The result of the merger is a new object (internal node) whose frequency is the sum of the frequencies of the two merged objects.
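The description above can be turned into a short runnable sketch using Python's heapq module as the priority queue. The function name huffman_codes, the tuple representation of internal nodes and the insertion-order tie-break are assumptions made for illustration; the lecture does not prescribe them. Characters are assumed to be strings and the input is assumed to be nonempty.

```python
import heapq
from itertools import count


def huffman_codes(freq):
    """Build a Huffman tree bottom-up and return {character: codeword}.

    freq maps each character (a string) to its frequency.
    An internal node is represented as a pair (left, right).
    """
    tiebreak = count()  # keeps heap entries comparable when frequencies tie
    heap = [(f, next(tiebreak), ch) for ch, f in freq.items()]
    heapq.heapify(heap)
    for _ in range(len(freq) - 1):
        fx, _, x = heapq.heappop(heap)  # least frequent object
        fy, _, y = heapq.heappop(heap)  # second least frequent object
        heapq.heappush(heap, (fx + fy, next(tiebreak), (x, y)))
    _, _, root = heap[0]

    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):      # internal node: 0 = left, 1 = right
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                            # leaf: a character
            codes[node] = prefix or "0"  # single-character input gets "0"

    walk(root, "")
    return codes


freq = {"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5}
codes = huffman_codes(freq)
print(codes)
print(sum(freq[ch] * len(codes[ch]) for ch in freq))  # 224
```

On the example frequencies this yields a=0, b=101, c=100, d=111, e=1101, f=1100. With tied frequencies a different but equally cheap tree may result.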

Time complexity
Lines 4-8 are executed n-1 times.
Each heap operation in Lines 4-8 takes O(lg n) time.
The total time required is O(n lg n).
Note: The details of the heap operations will not be tested. The time complexity O(n lg n) should be remembered.

Another example: characters e:4, a:6, c:6, b:9, d:11.
Figure: steps of Huffman's algorithm on this example.
e:4 and a:6 are merged into a node of frequency 10.
c:6 and b:9 are merged into a node of frequency 15.
The node 10 and d:11 are merged into a node of frequency 21.
The nodes 15 and 21 are merged into the root of frequency 36.
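The sequence of merged frequencies in this example can be reproduced with a few lines of Python. This sketch tracks only the frequencies, not the tree shape, and the name merge_trace is an assumption.

```python
import heapq


def merge_trace(frequencies):
    """Return the frequencies of the internal nodes in the order they are created."""
    heap = list(frequencies)
    heapq.heapify(heap)
    trace = []
    while len(heap) > 1:
        # Merge the two smallest frequencies, as Huffman's algorithm does.
        merged = heapq.heappop(heap) + heapq.heappop(heap)
        trace.append(merged)
        heapq.heappush(heap, merged)
    return trace


print(merge_trace([4, 6, 6, 9, 11]))  # [10, 15, 21, 36]
```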

Correctness of Huffman's Greedy Algorithm (fun part, not required)
Again, we use our general strategy.
Let x and y be the two characters in C having the lowest frequencies (the first two characters selected by the greedy algorithm).
We will show two properties:
1. There exists an optimal solution Topt (a binary tree representing codewords) such that x and y are siblings in Topt.
2. Let z be a new character with frequency f[z] = f[x] + f[y] and let C' = (C − {x, y}) ∪ {z}. Let T' be an optimal tree for C'. Then we can get Topt from T' by replacing the leaf z with an internal node whose two children are x and y.

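Proof of Property 1
Figure: the tree Topt, in which b and c are the lowest siblings, and the tree Tnew, obtained from Topt by exchanging x with b and y with c.
Look at the lowest (deepest) siblings in Topt, say b and c. Exchange x with b and y with c to obtain Tnew.
B(Topt) − B(Tnew) ≥ 0, since f[x] and f[y] are the smallest frequencies and b and c are at the greatest depth.
Because Topt is optimal, Tnew is also optimal, and x and y are siblings in Tnew. Property 1 is proved.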

Proof of Property 2
Let z be a new character with frequency f[z] = f[x] + f[y] and let C' = (C − {x, y}) ∪ {z}. Let T' be an optimal tree for C'. Then we can get Topt from T' by replacing the leaf z with an internal node whose two children are x and y.
Proof: Let T be the tree obtained from T' by replacing z with the three nodes (an internal node with children x and y). Then
B(T) = B(T') + f[x] + f[y] … (1)
(the codewords for x and y are each 1 bit longer than that of z).
Now prove that T is optimal by contradiction. If T is not optimal, then
B(T) > B(Topt) … (2)
By Property 1, we may assume that x and y are siblings in Topt. Thus we can delete x and y from Topt (their parent becomes the leaf z) and get another tree T'' for C'.
B(T'') = B(Topt) − f[x] − f[y] < B(T) − f[x] − f[y] = B(T'), using (2) for the inequality and (1) for the last equality.
Thus B(T'') < B(T'), contradicting the assumption that T' is optimal for C'.