
CS420 Lecture 8: Greedy Algorithms

Going from A to G Starting with a full tank, we can drive 350 miles before we need to gas up; minimize the number of times we need to stop at a gas station. (Figure: gas stations A, B, C, D, E, F, G along the route.)

Going from A to G We can drive 350 miles before we need to gas up; minimize the number of times we need to stop at a gas station. (Figure: a possible, non-greedy, optimal solution marked on the stations A through G.)

Going from A to G We can drive 350 miles before we need to gas up; minimize the number of times we need to stop at a gas station. We can make it greedy: a greedy solution goes as far as possible before gassing up, and we can turn an optimal solution into a greedy one by making the first step greedy and then applying induction. (Figure: stations A through G.)
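The greedy refueling strategy can be sketched in code. The station positions below are made up, since the figure's actual distances were not preserved in this transcript.

```python
def min_stops(stations, destination, tank_range=350):
    """Greedily drive as far as possible before each fill-up.

    stations: sorted list of gas-station positions (miles from start).
    Returns the list of positions where we stop, or None if the trip
    is impossible.  We start with a full tank at position 0.
    """
    stops = []
    position = 0  # where we last filled up
    reachable = [s for s in stations if s <= destination]
    while position + tank_range < destination:
        # farthest station we can still reach on the current tank
        candidates = [s for s in reachable if position < s <= position + tank_range]
        if not candidates:
            return None  # stranded: no station within range
        position = max(candidates)
        stops.append(position)
    return stops

# Example with hypothetical stations on a 1000-mile trip:
print(min_stops([150, 300, 500, 650, 900], 1000))  # [300, 650]
```

Going as far as possible at each step is exactly the greedy first choice the slide's exchange argument justifies.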

Greedy algorithms Greedy algorithms determine a global optimum via a sequence of locally optimal choices.

Activity Selection Given a set of activities S = {1, 2, 3, ..., N} that use a resource, where activity i has a start time Si and finish time Fi with Si <= Fi. Activities i and j are compatible if the intervals [Si, Fi) and [Sj, Fj) do not overlap: Si >= Fj or Sj >= Fi. (The half-open interval [Si, Fi) includes the left endpoint but goes only up to, not including, the right.) The activity-selection problem is to select a maximum-size set of mutually compatible activities.

Greedy algorithm for Activity Selection How would you do it?

Greedy algorithm for Activity Selection Sort activities by finish time so that F1 <= F2 <= ... <= Fn, then:

A = {1}
j = 1
for i = 2 to n
    if Si >= Fj
        include i in A
        j = i
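As a sketch, the slide's pseudocode in Python (0-based indices, and the sort by finish time done explicitly):

```python
def select_activities(activities):
    """activities: list of (start, finish) pairs.
    Returns the indices of a maximum set of mutually compatible activities."""
    if not activities:
        return []
    # process activities in order of increasing finish time
    order = sorted(range(len(activities)), key=lambda i: activities[i][1])
    chosen = [order[0]]                    # the first-finishing activity
    last_finish = activities[order[0]][1]
    for i in order[1:]:
        start, finish = activities[i]
        if start >= last_finish:           # compatible with all chosen so far
            chosen.append(i)
            last_finish = finish
    return chosen

# The 11 activities from the Cormen et al. example:
acts = [(1, 4), (3, 5), (0, 6), (5, 7), (3, 9), (5, 9),
        (6, 10), (8, 11), (8, 12), (2, 14), (12, 16)]
print(select_activities(acts))  # [0, 3, 7, 10], i.e. activities 1, 4, 8, 11
```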

Eg from Cormen et al.:

i :  1  2  3  4  5  6  7  8  9 10 11
Si:  1  3  0  5  3  5  6  8  8  2 12
Fi:  4  5  6  7  9  9 10 11 12 14 16

Eg from Cormen et al. Running the greedy algorithm on the activity table selects A = 1, 4, 8, 11.

Activity selection Are there other ways to do it? sure...

Greedy works for Activity Selection BASE: an optimal solution contains activity 1 as its first activity. Let A be an optimal solution with activity k != 1 as its first activity. Then we can replace activity k (which has Fk >= F1) by activity 1. So picking the first element in a greedy fashion works.

Greedy works for Activity Selection STEP: after the first choice is made, remove all activities that are incompatible with the first chosen activity and recursively define a new problem consisting of the remaining activities. The first activity for this reduced problem can be chosen in a greedy fashion by the base-case argument. By induction, greedy is optimal.

What did we do? We assumed there was a non-greedy optimal solution, then we stepwise morphed this solution into a greedy optimal solution, thereby showing that the greedy solution works in the first place.

MST: Minimal Spanning Tree Given a connected, undirected graph with labeled edges (label = distance), find a tree that
– is a subgraph of the given graph (has nodes and edges from the given graph)
– and is a minimal spanning tree: it reaches each node, and the sum of its edge labels is minimal (MST).

Greedy solution for MST? (Figure: an example weighted graph.)

Pick a node and a minimal edge emanating from it; now we have an MST in the making. Keep adding minimal edges to the MST until it is connected. (Figure: the example graph with the growing tree highlighted.)

Greedy works for MST Lemma 1: let G = (V, E) be a connected, undirected graph and S = (V, T) a spanning tree of G. Then for all V1, V2 in S, the path from V1 to V2 is unique. Why?

Greedy works for MST Lemma 1: let G = (V, E) be a connected, undirected graph and S = (V, T) a spanning tree of G. Then for all V1, V2 in S, the path from V1 to V2 is unique. Otherwise it wouldn't be a tree: two distinct paths between V1 and V2 would form a cycle.

Greedy works for MST Lemma 2: let G = (V, E) be a connected, undirected graph and S = (V, T) a spanning tree of G. Then if any edge in E - T is added to S, a unique cycle results. Why?

Greedy works for MST Lemma 2: let G = (V, E) be a connected, undirected graph and S = (V, T) a spanning tree of G. Then if any edge in E - T is added to S, a unique cycle results, because there already is a unique path between the endpoints of the added edge; the new edge closes exactly one cycle.

Greedy works for MST Lemma 2 (continued): also, any edge on the cycle can be taken away, making the graph a spanning tree again.

Greedy works for MST Proof by contradiction: suppose we can create an MST by at some stage not taking the minimal-cost edge min, but a non-minimal edge other. We build the rest of the spanning tree, so now all vertices are connected. We can now make a lower-cost spanning tree by removing other and adding min. Hence the spanning tree with other in it was not minimal.

Bounds for MST MST = Ω(|V|) (we need to touch all nodes)
– Greedy with a priority heap for the nodes is O(|E| lg |V|) – see the lecture on Shortest Paths
– There is no known O(n) algorithm for MST – MST has an algorithmic gap
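One concrete greedy MST method matching this bound is Prim's algorithm with a binary heap: grow the tree from one node, always adding the cheapest edge leaving it, just as the slide describes. A sketch, where the adjacency-list representation is an assumption:

```python
import heapq

def prim_mst(graph, start):
    """graph: dict node -> list of (weight, neighbor) pairs, with each
    undirected edge listed from both endpoints.
    Returns (total_cost, tree_edges)."""
    visited = {start}
    # heap of (weight, neighbor, parent) edges leaving the current tree
    frontier = [(w, v, start) for w, v in graph[start]]
    heapq.heapify(frontier)
    total, edges = 0, []
    while frontier and len(visited) < len(graph):
        w, v, u = heapq.heappop(frontier)
        if v in visited:
            continue  # stale edge: both endpoints already in the tree
        visited.add(v)
        total += w
        edges.append((u, v, w))
        for w2, x in graph[v]:
            if x not in visited:
                heapq.heappush(frontier, (w2, x, v))
    return total, edges

# Small example graph (undirected, so each edge appears twice):
g = {
    'a': [(1, 'b'), (4, 'c')],
    'b': [(1, 'a'), (2, 'c'), (6, 'd')],
    'c': [(4, 'a'), (2, 'b'), (3, 'd')],
    'd': [(6, 'b'), (3, 'c')],
}
cost, edges = prim_mst(g, 'a')
print(cost)  # 6  (edges a-b:1, b-c:2, c-d:3)
```

Each edge is pushed and popped at most once, and each heap operation costs O(lg |V|)-ish, giving the O(|E| lg |V|) bound from the slide.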

Huffman codes Say I have an alphabet consisting of the letters a, b, c, d, e, f with frequencies (x1000) 45, 13, 12, 16, 9, 5. What would a fixed encoding look like?

Huffman codes Say I have an alphabet consisting of the letters a, b, c, d, e, f with frequencies (x1000) 45, 13, 12, 16, 9, 5. What would a fixed bit encoding look like?

                a    b    c    d    e    f
fixed encoding: 000  001  010  011  100  101

Variable encoding

                    a    b    c    d    e     f
frequency (x1000):  45   13   12   16   9     5
fixed encoding:     000  001  010  011  100   101
variable encoding:  0    101  100  111  1101  1100

Fixed vs variable 100,000 characters Fixed:

Fixed vs variable 100,000 characters Fixed: 300,000 bits Variable:

Fixed vs variable 100,000 characters Fixed: 300,000 bits Variable: (1*45 + 3*13 + 3*12 + 3*16 + 4*9 + 4*5) * 1000 = 224,000 bits, a saving of about 25%.
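The arithmetic can be checked directly:

```python
# 100,000 characters; frequencies are per 1,000 characters.
freq = {'a': 45, 'b': 13, 'c': 12, 'd': 16, 'e': 9, 'f': 5}      # x1000
var_len = {'a': 1, 'b': 3, 'c': 3, 'd': 3, 'e': 4, 'f': 4}       # bits/char

fixed_bits = 3 * sum(freq.values()) * 1000                        # 3 bits each
variable_bits = sum(freq[c] * var_len[c] for c in freq) * 1000
print(fixed_bits, variable_bits)       # 300000 224000
print(round(1 - variable_bits / fixed_bits, 3))  # 0.253, about 25% saved
```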

Variable prefix encoding

                    a   b    c    d    e     f
frequency (x1000):  45  13   12   16   9     5
variable encoding:  0   101  100  111  1101  1100

What is special about our encoding?

Variable prefix encoding (same table): no code is a prefix of another. Why does it matter?

Variable prefix encoding (same table): no code is a prefix of another, so we can concatenate the codes without ambiguity.

Variable prefix encoding (same table). For example, 0 101 100 = 0101100 encodes abc, and 0101100 decodes only that way.
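A small sketch of why the prefix property makes concatenation decodable; the code table is the variable encoding from the slides.

```python
CODES = {'a': '0', 'b': '101', 'c': '100', 'd': '111', 'e': '1101', 'f': '1100'}

def encode(text):
    return ''.join(CODES[ch] for ch in text)

def decode(bits):
    """Scan left to right; because no code is a prefix of another, the
    first complete code we see is the only possible parse."""
    inverse = {code: ch for ch, code in CODES.items()}
    out, buf = [], ''
    for bit in bits:
        buf += bit
        if buf in inverse:      # a complete code; nothing else extends it
            out.append(inverse[buf])
            buf = ''
    return ''.join(out)

print(encode('abc'))      # 0101100
print(decode('0101100'))  # abc
```

With a non-prefix code the decoder could not commit after matching a code, since a longer code might share that prefix.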

Representing an encoding A binary tree, where the internal nodes contain frequencies, the leaves are the characters (plus their frequencies), and the paths to the leaves are the codes, is a convenient representation.

           100
         0/    \1
      a:45      55
              0/  \1
            25      30
           0/ \1   0/ \1
        c:12 b:13 14   d:16
                 0/ \1
               f:5   e:9

The frequencies of the internal nodes are the sums of the frequencies of their children.

(Same tree.) The frequencies of the internal nodes are the sums of the frequencies of their children. If the tree is not full, the encoding is non-optimal. Why?

(Same tree.) An optimal code is represented by a full binary tree, where each internal node has two children. If a tree is not full, it has an internal node with one child, labeled with a redundant bit (check the fixed encoding).

The fixed encoding as a tree:

              100
            0/    \1
          86        14
         0/ \1      |0
       58     28    14
      0/ \1  0/ \1  0/ \1
   a:45 b:13 c:12 d:16 e:9 f:5

(Same fixed-encoding tree.) The single-child edge below the right-hand node 14 carries a redundant bit: the tree is not full, so the fixed encoding is not optimal.

Cost of encoding a file For each character c in C, f(c) is its frequency and d(c) is its depth in the tree, which equals the number of bits it takes to encode c. Then the cost of the encoding, the number of bits to encode the file, is cost(T) = sum over all c in C of f(c) * d(c).

Huffman code An optimal encoding of a file has minimal cost. Huffman invented a greedy algorithm that constructs an optimal prefix code, called the Huffman code.

Huffman algorithm Create |C| leaves, one for each character. Perform |C| - 1 merge operations, each creating a new node whose children are the two nodes with the least frequencies and whose frequency is the sum of those two frequencies. By using a heap for the collection of intermediate trees, this algorithm takes O(n lg n) time.
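A sketch of the algorithm with Python's heapq; the tie-breaking counter is an implementation detail not in the slides (it keeps heap entries comparable when frequencies tie).

```python
import heapq
from itertools import count

def huffman(freqs):
    """freqs: dict char -> frequency. Returns dict char -> code string."""
    tiebreak = count()
    heap = [(f, next(tiebreak), ch) for ch, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # merge the two lowest-frequency trees into one
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (left, right)))
    _, _, tree = heap[0]

    codes = {}
    def walk(node, path):
        if isinstance(node, tuple):        # internal node: recurse
            walk(node[0], path + '0')
            walk(node[1], path + '1')
        else:
            codes[node] = path or '0'      # single-character edge case
    walk(tree, '')
    return codes

codes = huffman({'a': 45, 'b': 13, 'c': 12, 'd': 16, 'e': 9, 'f': 5})
print(codes['a'])                              # '0': a gets the 1-bit code
print(sorted(len(c) for c in codes.values()))  # [1, 3, 3, 3, 4, 4]
```

The |C| - 1 merges each cost O(lg n) heap work, giving the O(n lg n) bound from the slide.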

The construction, step by step (parenthesized trees; each new node's frequency is the sum of its children's):

1) f:5  e:9  c:12  b:13  d:16  a:45
2) c:12  b:13  14(f:5, e:9)  d:16  a:45
3) 14(f:5, e:9)  d:16  25(c:12, b:13)  a:45
4) 25(c:12, b:13)  30(14(f:5, e:9), d:16)  a:45
5) a:45  55(25(c:12, b:13), 30(14(f:5, e:9), d:16))
6) 100(a:45, 55(25(c:12, b:13), 30(14(f:5, e:9), d:16)))

Huffman is optimal Base step of the inductive approach. First we show: let x and y be the two characters with the minimal frequencies; then there is a minimal-cost encoding tree with x and y as siblings at the greatest depth (see e and f in our example above). How?


Greedy proof technique The proof technique is the same as for the previous two problems (activities and MST): if the greedy choice is not taken, we show that by taking the greedy choice we get a solution that is as good or better.

Lowest leaves x, y lowest frequencies Assume that two other characters a and b with higher frequencies are siblings at the lowest level of the tree T:

      T
     / \
    O   x
   / \
  y   O
     / \
    a   b

Since the frequencies of x and y are lowest, the cost of the tree can only improve if we swap y and a, and x and b:

      T
     / \
    O   b
   / \
  a   O
     / \
    y   x

Why?

Since the frequencies of x and y are lowest, the cost of the tree can only improve if we swap y and a, and x and b. (Same trees as above.) What is the cost of an encoding tree?

Greedy start We have shown that putting the two lowest-frequency characters lowest in the tree is a good greedy starting point for our algorithm: the base of the induction.

Step If we have an alphabet C' = C with x and y replaced by a new character z with frequency f(z) = f(x) + f(y), and an optimal encoding tree T' for C' (eg, the tree created from steps 2 to 6 in the example), then we need to show that the tree T obtained by replacing leaf z with an internal node of frequency f(z) with children x:f(x) and y:f(y) is an optimal encoding tree for C (the tree created from steps 1 to 6 in the example).

Proof of step: by contradiction. d(x) = d(y) = d(z) + 1, so

f(x)d(x) + f(y)d(y) = (f(x) + f(y))(d(z) + 1) = f(z)d(z) + f(x) + f(y)

because f(z) = f(x) + f(y). So cost(T) = cost(T') + f(x) + f(y).

Now suppose T is not an optimal encoding; then there is another optimal tree T'' with cost(T'') < cost(T). We have shown that we can put x and y as siblings at the lowest level of T''. Let T''' be T'' with x and y replaced by z; then cost(T''') = cost(T'') - f(x) - f(y) < cost(T) - f(x) - f(y) = cost(T'). But that contradicts the assumption that T' was optimal for C'. Hence Huffman (ie greedy) produces an optimal prefix encoding tree.

Conclusion All proofs that a greedy method works are based on the same principle: we show that taking the greedy choice first and then recursively solving the reduced problem is optimal, because either we can change an optimal solution without the greedy first choice into one with it, or we can show that an optimal solution must have the greedy first choice.