Algorithms: Dynamic Programming
A Gentle Introduction to Dynamic Programming – An Interesting History
Dynamic programming was invented by a mathematician, Richard Bellman, in the 1950s as a method for optimizing multistage decision processes. So the word "programming" refers to planning, not computer programming. Later, computer scientists realized it was a technique that could be applied to problems that were not special types of optimization problems.
The Basic Idea

The technique solves problems with overlapping subproblems. Typically the subproblems arise through a recursive solution to a problem. Rather than solve the subproblems repeatedly, we solve each smaller subproblem once, save the result in a table, and from that table form a solution to the original problem.
Dynamic Programming

An algorithm design technique (like divide and conquer).

Divide and conquer:
- Partition the problem into independent subproblems
- Solve the subproblems recursively
- Combine the solutions to solve the original problem
A Simple Example

Consider the calculation of the Fibonacci numbers using the simple recurrence F(n) = F(n-1) + F(n-2) for n ≥ 2, with the two initial conditions F(0) = 0 and F(1) = 1. If we blindly use recursion to solve this, we will re-compute the same values many times. In fact, the recursion tree suggests a simpler solution:
[Recursion tree for F(5): F(4) and F(3) as children of F(5), with F(3), F(2), F(1), and F(0) each appearing repeatedly in the subtrees.]

So one solution, a dynamic programming one, is to keep an array and record each F(k) as it is computed. Not all problems that fall to dynamic programming are this simple, but this is a good example for remembering how the technique works.
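Both flavors of this idea fit in a minimal Python sketch (the function names are illustrative, not from the slides): a top-down memoized version that caches each F(k), and a bottom-up version that records the values in an array directly.

from functools import lru_cache

@lru_cache(maxsize=None)
def fib_memo(n: int) -> int:
    # Top-down: the same recursion as the tree above, but each F(k)
    # is computed only once and then served from the cache.
    if n < 2:
        return n
    return fib_memo(n - 1) + fib_memo(n - 2)

def fib_table(n: int) -> int:
    # Bottom-up: record each F(k) in an array as it is computed.
    if n < 2:
        return n
    f = [0] * (n + 1)
    f[1] = 1
    for k in range(2, n + 1):
        f[k] = f[k - 1] + f[k - 2]
    return f[n]

assert fib_memo(10) == fib_table(10) == 55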
LCS recursive solution
First case: x[i] == y[j]. The matched symbol extends the common subsequence, so the length of LCS(Xi, Yj) is LCS(Xi-1, Yj-1) + 1.

Second case: x[i] != y[j]. As the symbols don't match, our solution is not improved, and the length of LCS(Xi, Yj) is the same as before (i.e. the maximum of LCS(Xi, Yj-1) and LCS(Xi-1, Yj)). Why not also take the length of LCS(Xi-1, Yj-1)? Because that value can never exceed either of the other two (its subproblem is contained in both), so the maximum already accounts for it.
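This recurrence translates directly into a short top-down Python sketch with memoization (an illustrative helper, not code from the slides); the bottom-up table version follows on the next slides.

from functools import lru_cache

def lcs_length(x: str, y: str) -> int:
    @lru_cache(maxsize=None)
    def rec(i: int, j: int) -> int:
        # rec(i, j) = length of LCS(Xi, Yj), i.e. of x[:i] and y[:j]
        if i == 0 or j == 0:
            return 0
        if x[i - 1] == y[j - 1]:                  # first case: symbols match
            return rec(i - 1, j - 1) + 1
        return max(rec(i, j - 1), rec(i - 1, j))  # second case: no match
    return rec(len(x), len(y))

assert lcs_length("ABCB", "BDCAB") == 3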
LCS Example

We'll see how the LCS algorithm works on the following example:
X = ABCB
Y = BDCAB

What is the longest common subsequence of X and Y?
LCS(X, Y) = BCB
LCS Example: Filling the Table

X = ABCB; m = |X| = 4
Y = BDCAB; n = |Y| = 5
Allocate array c[0..m, 0..n].

Initialize the borders to zero:
for i = 1 to m: c[i,0] = 0
for j = 1 to n: c[0,j] = 0

Then fill the table cell by cell with:
if ( Xi == Yj ) c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )

Filling row by row yields:

 j       0   1   2   3   4   5
         Yj  B   D   C   A   B
 i  Xi   0   0   0   0   0   0
 1  A    0   0   0   0   1   1
 2  B    0   1   1   1   1   2
 3  C    0   1   1   2   2   2
 4  B    0   1   1   2   2   3

The length of the LCS is c[4,5] = 3.
How to Find the Actual LCS

So far we have only found the length of the LCS, not the LCS itself. We want to modify this algorithm so that it outputs the longest common subsequence of X and Y. Each c[i,j] depends on c[i-1,j] and c[i,j-1], or on c[i-1,j-1], so for each c[i,j] we can record how it was acquired. For example, the corner entry was acquired as c[i,j] = c[i-1,j-1] + 1 = 2 + 1 = 3.
Finding LCS

To recover the subsequence itself, start at c[4,5] and walk back through the table, at each cell moving to whichever neighbor supplied its value; the cells on the path where Xi == Yj (bracketed below) each contribute one symbol:

 j       0   1   2   3   4   5
         Yj  B   D   C   A   B
 1  A    0   0   0   0   1   1
 2  B    0  [1]  1   1   1   2
 3  C    0   1   1  [2]  2   2
 4  B    0   1   1   2   2  [3]

Finding LCS (2)

The bracketed matches give B, C, B, read bottom to top.
LCS (reversed order): BCB
LCS (straight order): BCB (this string turned out to be a palindrome)
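The whole computation, table fill plus traceback, fits in a short bottom-up Python sketch (an illustrative implementation, not taken verbatim from the slides):

def lcs(x: str, y: str) -> str:
    m, n = len(x), len(y)
    # c[i][j] = length of an LCS of x[:i] and y[:j]; row and column 0 stay 0.
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])
    # Trace back from c[m][n], collecting matched symbols in reverse order.
    out, i, j = [], m, n
    while i > 0 and j > 0:
        if x[i - 1] == y[j - 1]:
            out.append(x[i - 1])
            i, j = i - 1, j - 1
        elif c[i - 1][j] >= c[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return ''.join(reversed(out))

assert lcs("ABCB", "BDCAB") == "BCB"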
Other Examples of Dynamic Programming: Edit Distance

When a spell checker encounters a possible misspelling, or Google is given words it doesn't recognize, they look in their dictionaries for other words that are close by. What is an appropriate notion of closeness in this case? The edit distance is the minimum number of edits (insertions, deletions, and substitutions of characters) needed to transform one string into a second one.
Edit Distance

Define the cost of an alignment to be the number of columns where the strings differ. We can place a gap, _, in either string, which acts like a wildcard.

Example 1: cost is 3 (insert U, substitute O with N, delete W):
S _ N O W Y
S U N N _ Y

Example 2: cost is 5:
_ S N O W _ Y
S U N _ _ N Y
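The edit distance has the same table structure as LCS; here is a minimal Python sketch (illustrative names, unit cost per edit as defined above):

def edit_distance(a: str, b: str) -> int:
    m, n = len(a), len(b)
    # d[i][j] = minimum number of edits turning a[:i] into b[:j]
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i                                # i deletions
    for j in range(n + 1):
        d[0][j] = j                                # j insertions
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # delete a[i-1]
                          d[i][j - 1] + 1,         # insert b[j-1]
                          d[i - 1][j - 1] + sub)   # substitute, or keep a match
    return d[m][n]

assert edit_distance("SNOWY", "SUNNY") == 3        # matches Example 1 above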
The General Dynamic Programming Technique
Applies to a problem that at first seems to require a lot of time (possibly exponential), provided we have:
- Simple subproblems: the subproblems can be defined in terms of a few variables, such as j, k, l, m, and so on
- Subproblem optimality: the global optimum value can be defined in terms of optimal subproblems
- Subproblem overlap: the subproblems are not independent, but instead they overlap (hence, should be constructed bottom-up)
The Knapsack Problem

The 0-1 knapsack problem:
- A thief robbing a store finds n items: the i-th item is worth vi dollars and weighs wi pounds (vi, wi integers)
- The thief can only carry W pounds in his knapsack
- Items must be taken entirely or left behind
- Which items should the thief take to maximize the value of his load?

The fractional knapsack problem:
- Similar to the above, but the thief can take fractions of items
The 0/1 Knapsack Problem

Given: a set S of n items, with each item i having
- bi, a positive benefit
- wi, a positive weight

Goal: choose items with maximum total benefit but with weight at most W. If we are not allowed to take fractional amounts, then this is the 0/1 knapsack problem. In this case, we let T denote the set of items we take.

Objective: maximize Σi∈T bi
Constraint: Σi∈T wi ≤ W
Fractional Knapsack Problem
Knapsack capacity: W. There are n items: the i-th item has value vi and weight wi.

Goal: find xi, with 0 ≤ xi ≤ 1 for i = 1, 2, ..., n, such that Σ wixi ≤ W and Σ xivi is maximum.
Fractional Knapsack Problem
Greedy strategy:
- Pick the item with the maximum value per pound vi/wi
- If the supply of that item is exhausted and the thief can carry more, take as much as possible of the item with the next greatest value per pound
- It helps to sort the items by their value per pound first
Fractional Knapsack - Example
E.g.: knapsack capacity 50 pounds.
- Item 1: 10 pounds, $60 ($6/pound)
- Item 2: 20 pounds, $100 ($5/pound)
- Item 3: 30 pounds, $120 ($4/pound)

Greedy by value per pound: take all of item 1 ($60), all of item 2 ($100), and 20 of item 3's 30 pounds ($80), filling the knapsack for a total of $240.
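The greedy strategy above is a few lines of Python (a sketch; items are given as (value, weight) pairs and the names are illustrative):

def fractional_knapsack(items, capacity):
    # items: list of (value, weight); take items by value per pound, best first
    total = 0.0
    for value, weight in sorted(items, key=lambda it: it[0] / it[1], reverse=True):
        take = min(weight, capacity)   # the whole item, or the fraction that fits
        total += value * take / weight
        capacity -= take
        if capacity == 0:
            break
    return total

# The three-item example above, with a 50-pound knapsack:
assert fractional_knapsack([(60, 10), (100, 20), (120, 30)], 50) == 240.0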
Example

Given: a set S of n items, with each item i having bi, a positive benefit, and wi, a positive weight.
Goal: choose items with maximum total benefit but with weight at most W; the knapsack holds 9 in.

Item:     1     2     3     4     5
Weight:   4 in  2 in  2 in  6 in  2 in
Benefit:  $20   $3    $6    $25   $80

Solution: item 5 ($80, 2 in), item 3 ($6, 2 in), item 1 ($20, 4 in), for 8 in and $106 in total.
0-1 Knapsack - Dynamic Programming
P(i, w) is the maximum profit that can be obtained from items 1 to i, if the knapsack has size w.

Case 1: the thief takes item i: P(i, w) = vi + P(i - 1, w - wi)
Case 2: the thief does not take item i: P(i, w) = P(i - 1, w)
0-1 Knapsack - Dynamic Programming
P(i, w) = max { vi + P(i - 1, w - wi),   (item i was taken)
                P(i - 1, w) }            (item i was not taken)

[Table layout: one row per item i = 0..n, one column per capacity w = 1..W; entry (i, w) is computed from the two entries (i - 1, w - wi) and (i - 1, w) in the row above.]
0-1 Knapsack problem: a picture
Items (weight wi, benefit bi): (2, 3), (3, 4), (4, 5), (5, 8), (9, 10)

This is a knapsack with max weight W = 20.
Example: W = 5

Item:    1   2   3   4
Weight:  2   1   3   2
Value:   12  10  20  15

P(i, w) = max { vi + P(i - 1, w - wi), P(i - 1, w) }

      w=1  w=2  w=3  w=4  w=5
i=0    0    0    0    0    0
i=1    0   12   12   12   12
i=2   10   12   22   22   22
i=3   10   12   22   30   32
i=4   10   15   25   30   37

Sample entries: P(1, 2) = max{12 + 0, 0} = 12; P(2, 1) = max{10 + 0, 0} = 10; P(2, 2) = max{10 + 0, 12} = 12; P(2, 3) = max{10 + 12, 12} = 22; P(3, 4) = max{20 + 10, 22} = 30; P(3, 5) = max{20 + 12, 22} = 32; P(4, 3) = max{15 + 10, 22} = 25; P(4, 4) = max{15 + 12, 30} = 30; P(4, 5) = max{15 + 22, 32} = 37.
Reconstructing the Optimal Solution
      w=1  w=2  w=3  w=4  w=5
i=1    0   12   12   12   12
i=2   10   12   22   22   22
i=3   10   12   22   30   32
i=4   10   15   25   30   37

Start at P(n, W). When you go left-up, item i has been taken; when you go straight up, item i has not been taken. Here P(4, 5) = 37 differs from P(3, 5) = 32, so item 4 was taken and we jump to P(3, 5 - 2) = P(3, 3); P(3, 3) equals P(2, 3), so item 3 was not taken; P(2, 3) differs from P(1, 3), so item 2 was taken, jumping to P(1, 3 - 1) = P(1, 2); P(1, 2) differs from P(0, 2) = 0, so item 1 was taken. The optimal load is items 1, 2, and 4.
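Both the table fill and the reconstruction walk are captured by this Python sketch (illustrative code following the recurrence P(i, w) above):

def knapsack_01(values, weights, W):
    n = len(values)
    # P[i][w] = maximum profit from items 1..i with capacity w
    P = [[0] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        v, wt = values[i - 1], weights[i - 1]
        for w in range(W + 1):
            if wt <= w:
                P[i][w] = max(v + P[i - 1][w - wt], P[i - 1][w])  # take or skip
            else:
                P[i][w] = P[i - 1][w]                             # cannot fit
    # Reconstruct: going straight up leaves the value unchanged, so a change
    # of value between rows means item i was taken.
    taken, w = [], W
    for i in range(n, 0, -1):
        if P[i][w] != P[i - 1][w]:
            taken.append(i)
            w -= weights[i - 1]
    return P[n][W], sorted(taken)

# The worked example above: W = 5, items (weight, value) = (2,12), (1,10), (3,20), (2,15)
assert knapsack_01([12, 10, 20, 15], [2, 1, 3, 2], 5) == (37, [1, 2, 4])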
Huffman Codes

A widely used technique for data compression. We assume the data to be a sequence of characters and look for an effective way of storing it.
Huffman Codes

Idea: use the frequencies of occurrence of characters to build an optimal way of representing each character.

Binary character code: uniquely represents a character by a binary string.

Character:             a   b   c   d   e   f
Frequency (thousands): 45  13  12  16  9   5
Fixed-Length Codes

E.g.: a data file containing 100,000 characters over the alphabet above. With six characters, 3 bits are needed per character:
a = 000, b = 001, c = 010, d = 011, e = 100, f = 101
Requires: 100,000 × 3 = 300,000 bits.
Variable-Length Codes
E.g.: the same data file of 100,000 characters. Assign short codewords to frequent characters and long codewords to infrequent characters:
a = 0, b = 101, c = 100, d = 111, e = 1101, f = 1100
Requires: (45×1 + 13×3 + 12×3 + 16×3 + 9×4 + 5×4) × 1,000 = 224,000 bits.
Encoding with Binary Character Codes
Concatenate the codewords representing each character in the file.
E.g.: with a = 0, b = 101, c = 100, d = 111, e = 1101, f = 1100:
abc = 0 · 101 · 100 = 0101100
Decoding with Binary Character Codes
Prefix codes simplify decoding: no codeword is a prefix of another, so the codeword that begins an encoded file is unambiguous.

Approach:
- Identify the initial codeword
- Translate it back to the original character
- Repeat the process on the remainder of the file

E.g.: 001011101 = 0 · 0 · 101 · 1101 = aabe
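Because the code is prefix-free, a decoder only needs to extend the current bit string until it matches a codeword; a minimal Python sketch of this loop:

code = {'a': '0', 'b': '101', 'c': '100', 'd': '111', 'e': '1101', 'f': '1100'}
decode_map = {v: k for k, v in code.items()}

def decode(bits: str) -> str:
    out, cur = [], ''
    for b in bits:
        cur += b
        if cur in decode_map:   # a complete codeword; unambiguous for prefix codes
            out.append(decode_map[cur])
            cur = ''
    return ''.join(out)

assert decode('001011101') == 'aabe'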
Optimal Codes

An optimal code is always represented by a full binary tree: every non-leaf node has two children. The fixed-length code above is not optimal; the variable-length one is.

How many bits are required to encode a file? Let C be the alphabet of characters, f(c) the frequency of character c, and dT(c) the depth of c's leaf in the tree T corresponding to a prefix code. Then the cost of tree T is

B(T) = Σc∈C f(c) · dT(c)
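As a quick check, a few lines evaluate this cost for the variable-length code above; the depth of a leaf equals the length of its codeword, so the sum can be taken over codeword lengths:

freq = {'a': 45, 'b': 13, 'c': 12, 'd': 16, 'e': 9, 'f': 5}   # in thousands
code = {'a': '0', 'b': '101', 'c': '100', 'd': '111', 'e': '1101', 'f': '1100'}
cost = sum(freq[c] * len(code[c]) for c in freq)              # B(T) = sum of f(c) * dT(c)
assert cost == 224    # i.e. 224,000 bits for the 100,000-character file, as computed earlier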
Prefix Code Representation
A binary tree whose leaves are the given characters. The binary codeword for a character is the path from the root to that character, where 0 means "go to the left child" and 1 means "go to the right child". The length of the codeword is the length of the path from the root to the character's leaf (the depth of the node).

[Two trees over a: 45, b: 13, c: 12, d: 16, e: 9, f: 5 are shown: the fixed-length code's tree, with root 100 and internal nodes 86, 58, 28, and 14, and the variable-length code's tree, with a: 45 directly under the root and internal nodes 55, 25, 30, and 14.]
Constructing a Huffman Code
A greedy algorithm that constructs an optimal prefix code, called a Huffman code.

Assume that:
- C is a set of n characters
- each character c has a frequency f(c)
- the tree T is built in a bottom-up manner

Idea:
- Start with a set of |C| leaves
- At each step, merge the two least frequent objects; the frequency of the new node is the sum of the two frequencies
- Use a min-priority queue Q, keyed on f, to identify the two least frequent objects

Initial leaves: a: 45, c: 12, b: 13, f: 5, e: 9, d: 16
Example

Starting from the leaves a: 45, b: 13, c: 12, d: 16, e: 9, f: 5, the merges proceed:
1. Merge f: 5 and e: 9 into a node of frequency 14
2. Merge c: 12 and b: 13 into a node of frequency 25
3. Merge the 14-node and d: 16 into a node of frequency 30
4. Merge the 25-node and the 30-node into a node of frequency 55
5. Merge a: 45 and the 55-node into the root, of frequency 100
Building a Huffman Code
Alg.: HUFFMAN(C)
  n ← |C|
  Q ← C
  for i ← 1 to n – 1
      do allocate a new node z
         left[z] ← x ← EXTRACT-MIN(Q)
         right[z] ← y ← EXTRACT-MIN(Q)
         f[z] ← f[x] + f[y]
         INSERT(Q, z)
  return EXTRACT-MIN(Q)

Running time: O(n lg n). Building the initial queue takes O(n), and each of the n – 1 iterations costs O(lg n) in heap operations.
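A runnable Python counterpart of HUFFMAN, using heapq as the min-priority queue (a sketch with illustrative names; the tiebreak counter only keeps tuple comparison well defined):

import heapq

def huffman(freq):
    # freq: dict character -> frequency; returns dict character -> codeword.
    # A tree is either a leaf character or a (left, right) pair.
    heap = [(f, i, ch) for i, (ch, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        fx, _, x = heapq.heappop(heap)            # the two least frequent trees
        fy, _, y = heapq.heappop(heap)
        heapq.heappush(heap, (fx + fy, count, (x, y)))
        count += 1
    codes = {}
    def walk(node, path):
        if isinstance(node, tuple):
            walk(node[0], path + '0')             # 0 = go to the left child
            walk(node[1], path + '1')             # 1 = go to the right child
        else:
            codes[node] = path
    walk(heap[0][2], '')
    return codes

codes = huffman({'a': 45, 'b': 13, 'c': 12, 'd': 16, 'e': 9, 'f': 5})
assert codes == {'a': '0', 'c': '100', 'b': '101', 'd': '111', 'f': '1100', 'e': '1101'}

With these frequencies there are no ties, so the merges happen in exactly the order listed above and the codewords match the variable-length code from the earlier slides.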
Interval scheduling

A problem we saw in the topic on greedy algorithms was interval scheduling.

Process  Interval
A        5 – 8
B        10 – 13
C        6 – 9
D        12 – 15
E        3 – 7
F        8 – 11
G        1 – 6
H        8 – 12
J        3 – 5
K        2 – 4
L        11 – 16
M        10 – 15
Interval scheduling

For this table of processes, the earliest-deadline-first greedy algorithm will maximize the number of processes that can be scheduled.
Interval scheduling

Suppose, however, that there are weights associated with the processes:

Process  Interval  Weight
A        5 – 8     1.7
B        10 – 13   1.3
C        6 – 9     3.9
D        12 – 15   3.2
E        3 – 7     5.9
F        8 – 11    3.3
G        1 – 6     5.4
H        8 – 12    1.2
J        3 – 5     5.8
K        2 – 4     4.8
L        11 – 16
M        10 – 15   2.6

Can we find the schedulable processes that have maximal weight? Currently, no greedy algorithm is known that can solve this problem.
Interval scheduling

Note: if we wanted to optimize based on processor usage, the weight would be equal to the computation time, i.e. the length of each interval:

Process  Interval  Weight
A        5 – 8     3
B        10 – 13   3
C        6 – 9     3
D        12 – 15   3
E        3 – 7     4
F        8 – 11    3
G        1 – 6     5
H        8 – 12    4
J        3 – 5     2
K        2 – 4     2
L        11 – 16   5
M        10 – 15   5
Interval scheduling

First, as before, we sort the processes by their deadlines.

Place  Process  Interval  Weight
1      K        2 – 4     4.8
2      J        3 – 5     5.8
3      G        1 – 6     5.4
4      E        3 – 7     5.9
5      A        5 – 8     1.7
6      C        6 – 9     3.9
7      F        8 – 11    3.3
8      H        8 – 12    1.2
9      B        10 – 13   1.3
10     D        12 – 15   3.2
11     M        10 – 15   2.6
12     L        11 – 16
Interval scheduling

Next we record, for each process, a Previous column: the last place in the sorted order whose interval finishes no later than this process starts. Process L (starting at 11) could only be run if nothing after the 7th process (F, which finishes at 11) is chosen, so its previous process is 7.
Interval scheduling

Likewise, process M (starting at 10) could only be run if nothing after the 6th process (C, which finishes at 9) is chosen, so its previous process is 6.
Interval scheduling

If a process must be run first, we mark its previous process as 0. Completing the column this way gives:

Place  Process  Interval  Weight  Previous
1      K        2 – 4     4.8     0
2      J        3 – 5     5.8     0
3      G        1 – 6     5.4     0
4      E        3 – 7     5.9     0
5      A        5 – 8     1.7     2
6      C        6 – 9     3.9     3
7      F        8 – 11    3.3     5
8      H        8 – 12    1.2     5
9      B        10 – 13   1.3     6
10     D        12 – 15   3.2     8
11     M        10 – 15   2.6     6
12     L        11 – 16           7
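With the Previous column in hand, dynamic programming takes over: opt(i) = max(opt(i - 1), weight(i) + opt(previous(i))). A Python sketch of the whole pipeline (illustrative; process L is omitted because its weight is not shown in the table above):

from bisect import bisect_right

def weighted_interval_schedule(jobs):
    # jobs: list of (name, start, finish, weight); returns (best weight, chosen names)
    jobs = sorted(jobs, key=lambda j: j[2])               # sort by deadline
    finishes = [f for _, _, f, _ in jobs]
    # prev[i]: how many jobs finish no later than job i starts (0 = must run first)
    prev = [bisect_right(finishes, s) for _, s, _, _ in jobs]
    opt = [0.0] * (len(jobs) + 1)
    for i in range(1, len(jobs) + 1):
        w = jobs[i - 1][3]
        opt[i] = max(opt[i - 1], w + opt[prev[i - 1]])    # skip it, or take it
    chosen, i = [], len(jobs)
    while i > 0:                                          # trace back the choices
        if opt[i] == opt[i - 1]:
            i -= 1
        else:
            chosen.append(jobs[i - 1][0])
            i = prev[i - 1]
    return opt[-1], chosen[::-1]

jobs = [("A", 5, 8, 1.7), ("B", 10, 13, 1.3), ("C", 6, 9, 3.9),
        ("D", 12, 15, 3.2), ("E", 3, 7, 5.9), ("F", 8, 11, 3.3),
        ("G", 1, 6, 5.4), ("H", 8, 12, 1.2), ("J", 3, 5, 5.8),
        ("K", 2, 4, 4.8), ("M", 10, 15, 2.6)]
total, chosen = weighted_interval_schedule(jobs)
assert chosen == ["J", "A", "F", "D"] and round(total, 1) == 14.0

On these eleven processes the optimal schedule is J, A, F, D with total weight 14.0, which no greedy rule on this table is known to find directly.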
Greedy Choice Property
Lemma: Let C be an alphabet in which each character c ∈ C has frequency f[c]. Let x and y be two characters in C having the lowest frequencies. Then there exists an optimal prefix code for C in which the codewords for x and y have the same length and differ only in the last bit.
Discussion

Greedy choice property: building an optimal tree by mergers can begin with the greedy choice of merging the two characters with the lowest frequencies. The cost of each merger is the sum of the frequencies of the two items being merged, and of all possible mergers, HUFFMAN chooses the one that incurs the least cost.