Analysis of Algorithms CS 477/677 Instructor: Monica Nicolescu Lecture 14
CS 477/677 - Lecture 142 Longest Common Subsequence Given two sequences X = x 1, x 2, …, x m Y = y 1, y 2, …, y n find a maximum length common subsequence (LCS) of X and Y E.g.: X = A, B, C, B, D, A, B Subsequence of X: –A subset of elements in the sequence taken in order (but not necessarily consecutive) A, B, D , B, C, D, B , etc. 2
CS 477/677 - Lecture 143 Example X = A, B, C, B, D, A, B Y = B, D, C, A, B, A B, C, B, A and B, D, A, B are longest common subsequences of X and Y (length = 4) B, C, A , however is not a LCS of X and Y 3
CS 477/677 - Lecture Making the choice X = A, B, D, E Y = Z, B, E Choice: include one element into the common sequence (E) and solve the resulting subproblem X = A, B, D, G Y = Z, B, D Choice: exclude an element from a string and solve the resulting subproblem 4
CS 477/677 - Lecture 145 Notations Given a sequence X = x 1, x 2, …, x m we define the i-th prefix of X, for i = 0, 1, 2, …, m X i = x 1, x 2, …, x i c[i, j] = the length of a LCS of the sequences X i = x 1, x 2, …, x i and Y j = y 1, y 2, …, y j 5
CS 477/677 - Lecture A Recursive Solution Case 1: x i = y j e.g.: X i = A, B, D, E Y j = Z, B, E –Append x i = y j to the LCS of X i-1 and Y j-1 –Must find a LCS of X i-1 and Y j-1 optimal solution to a problem includes optimal solutions to subproblems c[i, j] = c[i - 1, j - 1] + 1 6
CS 477/677 - Lecture A Recursive Solution Case 2: x i y j e.g.: X i = A, B, D, G Y j = Z, B, D –Must solve two problems find a LCS of X i-1 and Y j : X i-1 = A, B, D and Y j = Z, B, D find a LCS of X i and Y j-1 : X i = A, B, D, G and Y j = Z, B Optimal solution to a problem includes optimal solutions to subproblems c[i, j] = max { c[i - 1, j], c[i, j-1] } 7
CS 477/677 - Lecture Computing the Length of the LCS 0if i = 0 or j = 0 c[i, j] = c[i-1, j-1] + 1if x i = y j max(c[i, j-1], c[i-1, j])if x i y j yj:yj: xmxm y1y1 y2y2 ynyn x1x1 x2x2 Xi:Xi: j i 012n m first second 8
CS 477/677 - Lecture Additional Information 0if i,j = 0 c[i, j] = c[i-1, j-1] + 1if x i = y j max(c[i, j-1], c[i-1, j])if x i y j y j: D ACF A B xixi j i 012n m A matrix b[i, j] : For a subproblem [i, j] it tells us what choice was made to obtain the optimal value If x i = y j b[i, j] = “ ” Else, if c[i - 1, j] ≥ c[i, j-1] b[i, j] = “ ” else b[i, j] = “ ” 3 3C D b & c: c[i,j-1] c[i-1,j] 9
CS 477/677 - Lecture Constructing a LCS Start at b[m, n] and follow the arrows When we encounter a “ “ in b[i, j] x i = y j is an element of the LCS yjyj BDACAB D A B xixi C B A B 00 00 00 1 11 1 1 11 11 11 2 22 11 11 2 22 22 22 1 11 22 22 3 33 11 2 22 22 33 33 11 22 33 2 22 22 33 4 44 10
CS 477/677 - Lecture 1411 PRINT-LCS(b, X, i, j) 1. if i = 0 or j = 0 2. then return 3. if b[i, j] = “ ” 4. then PRINT-LCS( b, X, i - 1, j - 1 ) 5. print x i 6. else if b[i, j] = “ ↑ ” 7. then PRINT-LCS( b, X, i - 1, j ) 8. else PRINT-LCS( b, X, i, j - 1 ) Initial call: PRINT-LCS( b, X, length[X], length[Y] ) Running time: (m + n) 11
CS 477/677 - Lecture 1412 Improving the Code What can we say about how each entry c[i, j] is computed? –It depends only on c[i -1, j - 1], c[i - 1, j], and c[i, j - 1] –Eliminate table b and compute in O(1) which of the three values was used to compute c[i, j] –We save (mn) space from table b –However, we do not asymptotically decrease the auxiliary space requirements: still need table c 12
CS 477/677 - Lecture 1413 Improving the Code If we only need the length of the LCS –LCS-LENGTH works only on two rows of c at a time The row being computed and the previous row –We can reduce the asymptotic space requirements by storing only these two rows 13
Weighted Interval Scheduling Job j starts at s j, finishes at f j, and has weight or value v j Two jobs are compatible if they don't overlap Goal: find maximum weight subset of mutually compatible jobs 14 Time f g h e a b c d
Weighted Interval Scheduling Label jobs by finishing time: f 1 f 2 ... f n Def. p(j) = largest index i < j such that job i is compatible with j Ex: p(8) = 5, p(7) = 3, p(2) = 0 CS 477/677 - Lecture 1415 Time
1. Making the Choice OPT(j) = value of optimal solution to the problem consisting of job requests 1, 2,..., j –Case 1: OPT selects job j Can't use incompatible jobs { p(j) + 1, p(j) + 2,..., j - 1 } Must include optimal solution to problem consisting of remaining compatible jobs 1, 2,..., p(j) –Case 2: OPT does not select job j Must include optimal solution to problem consisting of remaining compatible jobs 1, 2,..., j-1 optimal substructure CS 477/677 - Lecture 1416
2. A Recursive Solution OPT(j) = value of optimal solution to the problem consisting of job requests 1, 2,..., j –Case 1: OPT selects job j Can't use incompatible jobs { p(j) + 1, p(j) + 2,..., j - 1 } Must include optimal solution to problem consisting of remaining compatible jobs 1, 2,..., p(j) OPT(j) = v j + OPT(p(j)) –Case 2: OPT does not select job j Must include optimal solution to problem consisting of remaining compatible jobs 1, 2,..., j-1 OPT(i) = OPT(j-1) CS 477/677 - Lecture 1417
Input: n, s 1,…,s n, f 1,…,f n, v 1,…,v n Sort jobs by finish times so that f 1 f 2 ... f n Compute p(1), p(2), …, p(n) Compute-Opt(j) { if (j = 0) return 0 else return max( v j + Compute-Opt( p(j) ), Compute-Opt( j-1 )) } Top-Down Recursive Algorithm CS 477/677 - Lecture 1418 WRONG!
Top-Down Recursive Algorithm Recursive algorithm fails because of redundant sub-problems exponential algorithms Number of recursive calls for family of "layered" instances grows like Fibonacci sequence –Needs solutions for j-1 and j-2 at each call p(1) = 0, p(j) = j CS 477/677 - Lecture 1419
3. Compute the Optimal Value Compute values in increasing order of j Input: n, s 1,…,s n, f 1,…,f n, v 1,…,v n Sort jobs by finish times so that f 1 f 2 ... f n Compute p(1), p(2), …, p(n) Iterative-Compute-Opt { M[0] = 0 for j = 1 to n M[j] = max( v j + M[p(j)], M[j-1] ) } CS 477/677 - Lecture 1420
Input: n, s 1,…,s n, f 1,…,f n, v 1,…,v n Sort jobs by finish times so that f 1 f 2 ... f n. Compute p(1), p(2), …, p(n) for j = 1 to n M[j] = empty M[j] = 0 M-Compute-Opt( j ) { if ( M[j] is empty) M[j] = max( w j + M-Compute-Opt( p(j) ), M-Compute-Opt( j-1 )) return M[j] } global array Memoized Version Store results of each sub-problem; lookup as needed CS 477/677 - Lecture 1421
Analysis of Running Time Memoized version of algorithm takes O(n log n) –Sort by finish time: O(n log n) –Computing p( ) : O(n) after sorting by start time –M-Compute-Opt(j) : each invocation takes O(1) time and either (i) returns an existing value M[j] (ii) fills in one new entry M[j] and makes two recursive calls –Progress measure = # nonempty entries of M[]. initially = 0, throughout n. (ii) increases by 1 at most 2n recursive calls. –Overall running time of M-Compute-Opt(n) is O(n). Running time: O(n) if jobs are pre-sorted by start and finish times CS 477/677 - Lecture 1422
4. Finding the Optimal Solution Two options 1.Store additional information: at each time step store either j or p(j) – value that gave the optimal solution 2.Recursively find the solution by iterating through array M Find-Solution(j) { if ( j = 0 ) output nothing else if ( v j + M[p(j )] > M[j-1] ) print j Find-Solution( p(j) ) else Find-Solution( j-1 ) } CS 477/677 - Lecture 1423
An Example CS 477/677 - Lecture 1424
Segmented Least Squares Least squares –Foundational problem in statistic and numerical analysis –Given n points in the plane: (x 1, y 1 ), (x 2, y 2 ),..., (x n, y n ) –Find a line y = ax + b that minimizes the sum of the squared error: Solution – closed form –Minimum error is achieved when x y CS 477/677 - Lecture 1425
Segmented Least Squares Segmented least squares –Points lie roughly on a sequence of several line segments –Given n points in the plane (x 1, y 1 ), (x 2, y 2 ),..., (x n, y n ) with x 1 < x 2 <... < x n, find a sequence of lines that minimizes f(x) What is a reasonable choice for f(x) to balance accuracy and parsimony? CS 477/677 - Lecture 1426 x y goodness of fit number of lines
Segmented Least Squares Segmented least squares –Points lie roughly on a sequence of several line segments –Given n points in the plane (x 1, y 1 ), (x 2, y 2 ),..., (x n, y n ) with x 1 < x 2 <... < x n, find a sequence of lines that minimizes: the sum of the sums of the squared errors E in each segment the number of lines L Tradeoff function: E + c L, for some constant c > 0 x y CS 477/677 - Lecture 1427
(1,2) Making the Choice and Recursive Solution Notation –OPT(j) = minimum cost for points p 1, p i+1,..., p j –e(i, j) = minimum sum of squares for points p i, p i+1,.., p j To compute OPT(j) –Last segment uses points p i, p i+1,..., p j for some i –Cost = e(i, j) + c + OPT(i-1) CS 477/677 - Lecture 1428
3. Compute the Optimal Value Running time: O(n 3 ) –Bottleneck = computing e(i, j) for O(n 2 ) pairs, O(n) per pair using previous formula INPUT: n, p 1,…,p N, c Segmented-Least-Squares() { M[0] = 0 for j = 1 to n for i = 1 to j compute the least square error e ij for the segment p i,…, p j for j = 1 to n M[j] = min 1 i j (e ij + c + M[i-1]) return M[n] } 29CS 477/677 - Lecture 14
30 Greedy Algorithms Similar to dynamic programming, but simpler approach –Also used for optimization problems Idea: When we have a choice to make, make the one that looks best right now –Make a locally optimal choice in the hope of getting a globally optimal solution Greedy algorithms don’t always yield an optimal solution When the problem has certain general characteristics, greedy algorithms give optimal solutions 30
CS 477/677 - Lecture 1431 Activity Selection StartEndActivity 18:00am9:15amNumerical methods class 28:30am10:30amMovie presentation (refreshments served) 39:20am11:00amData structures class 410:00amnoonProgramming club mtg. (Pizza provided) 511:30am1:00pmComputer graphics class 61:05pm2:15pmAnalysis of algorithms class 72:30pm3:00pmComputer security class 8noon4:00pmComputer games contest (refreshments served) 94:00pm5:30pmOperating systems class Problem –Schedule the largest possible set of non-overlapping activities for a given room 31
32 Readings Chapter 15 CS 477/677 - Lecture 1432