CS 3343: Analysis of Algorithms Intro to Dynamic Programming
In the next few lectures Two important algorithm design techniques: dynamic programming and greedy algorithms. These are meta techniques, not actual algorithms (like divide-and-conquer). Very useful in practice. Once you understand them, it is often easier to reinvent certain algorithms than to look them up!
Optimization Problems An important and practical class of computational problems. For each problem instance, there are many feasible solutions (often exponential) Each solution has a value (or cost) The goal is to find a solution with the optimal value For most of these, the best known algorithm runs in exponential time. Industry would pay dearly to have faster algorithms. For some of them, there are quick dynamic programming or greedy algorithms. For the rest, you may have to consider acceptable solutions rather than optimal solutions
Dynamic programming The word “programming” is historical and predates computer programming A very powerful, general tool for solving certain optimization problems Once understood it is relatively easy to apply. It looks like magic until you have seen enough examples
A motivating example: Fibonacci numbers 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, … F(0) = 1; F(1) = 1; F(n) = F(n-1) + F(n-2). How to compute F(n)?
A recursive algorithm
function fib(n)
  if (n == 0 or n == 1) return 1;
  else return fib(n-1) + fib(n-2);
What's the time complexity?
Time complexity [Recursion tree for F(9): each call branches into two smaller calls, e.g. F(9) into F(8) and F(7); the height of the tree is between n/2 and n.] The number of calls is between 2^(n/2) and 2^n, so T(n) = Θ(F(n)) = O(1.618^n).
An iterative algorithm
function fib(n)
  F[0] = 1; F[1] = 1;
  for i = 2 to n
    F[i] = F[i-1] + F[i-2];
  return F[n];
Time complexity?
Problem with the recursive Fib algorithm: each subproblem is solved many times! Solution: avoid solving the same subproblem more than once. (1) Pre-compute all subproblems that may be needed later. (2) Compute on demand, but memoize the solution to avoid recomputing. Can you always speed up a recursive algorithm by making it iterative? E.g., merge sort. No, since there is no overlap between the two sub-problems.
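The three strategies can be sketched in Python (a minimal illustration; the function names are ours):

```python
from functools import lru_cache

def fib_naive(n):
    # Plain recursion: recomputes the same subproblems many times
    # -> exponential time.
    if n == 0 or n == 1:
        return 1
    return fib_naive(n - 1) + fib_naive(n - 2)

@lru_cache(maxsize=None)
def fib_memo(n):
    # Top-down: compute on demand, memoize results -> O(n) distinct calls.
    if n == 0 or n == 1:
        return 1
    return fib_memo(n - 1) + fib_memo(n - 2)

def fib_iter(n):
    # Bottom-up: tabulate subproblems in order -> O(n) time, O(1) space.
    a, b = 1, 1
    for _ in range(n - 1):
        a, b = b, a + b
    return b
```

All three agree on the values; only the first one blows up for large n.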
Another motivating example: Shortest path problem start goal Find the shortest path from start to goal An optimization problem
Possible approaches Brute-force algorithm: enumerate all possible paths and compare. How many are there? Greedy algorithm: always take the currently shortest edge. Does not necessarily lead to the optimal solution. Can you think of a recursive strategy?
Optimal subpaths Claim: if a path start→goal is optimal, any sub-path x→y, where x and y are on the optimal path, is also optimal. Proof by contradiction: write the optimal path as a + b + c, where b is the subpath between x and y. If b is not the shortest path from x to y, we can replace it with a shorter one b' < b, which gives a + b' + c < a + b + c, i.e. the "optimal" path from start to goal is not the shortest => contradiction! Hence the subpath b must be the shortest among all paths from x to y.
Shortest path problem SP(start, goal) = min { dist(start, A) + SP(A, goal), dist(start, B) + SP(B, goal), dist(start, C) + SP(C, goal) }
A special shortest path problem An m × n grid. Each edge has a length (cost). We need to get to G from S, and can only move to the right or down. Aim: find a path with the minimum total length.
Recursive solution Suppose we've found the shortest path to (m, n). It must use one of two final edges: Case 1: (m, n-1) to (m, n); Case 2: (m-1, n) to (m, n). If case 1, find the shortest path from (0, 0) to (m, n-1); then SP(0, 0, m, n-1) + dist(m, n-1, m, n) is the overall shortest path. If case 2, find the shortest path from (0, 0) to (m-1, n). We don't know which case is true, but once we've found both paths we can compare: the actual shortest path is the one with the shorter overall length.
Recursive solution Let F(i, j) = SP(0, 0, i, j). => F(m, n) is the length of the SP from (0, 0) to (m, n). F(m, n) = min { F(m-1, n) + dist(m-1, n, m, n), F(m, n-1) + dist(m, n-1, m, n) }. Generalize: F(i, j) = min { F(i-1, j) + dist(i-1, j, i, j), F(i, j-1) + dist(i, j-1, i, j) }
To find the shortest path from (0, 0) to (m, n), we need the shortest paths to both (m-1, n) and (m, n-1). With naive recursive calls, some subpaths will be recomputed many times. Strategy: solve the subproblems in an order that avoids this redundant computation.
Dynamic Programming Let F(i, j) = SP(0, 0, i, j). => F(m, n) is the length of the SP from (0, 0) to (m, n). F(i, j) = min { F(i-1, j) + dist(i-1, j, i, j), F(i, j-1) + dist(i, j-1, i, j) } for i = 1..m, j = 1..n. Boundary condition: i = 0 or j = 0; easy to figure out. Number of unique subproblems = m * n. Data dependency: F(i, j) needs F(i-1, j) and F(i, j-1), which determines the order of computation: left to right, top to bottom.
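A Python sketch of this tabulation. The edge-cost layout (`right` and `down` arrays) is our own illustrative encoding, not the slides' example grid:

```python
def grid_shortest_path(right, down):
    # right[i][j]: length of edge (i, j) -> (i, j+1)
    # down[i][j]:  length of edge (i, j) -> (i+1, j)
    # Grid nodes are (0..m, 0..n); find the shortest path (0,0) -> (m,n).
    m = len(down)        # rows of down-edges
    n = len(right[0])    # right-edges per row
    F = [[0] * (n + 1) for _ in range(m + 1)]
    for j in range(1, n + 1):      # boundary: top row, only moves right
        F[0][j] = F[0][j - 1] + right[0][j - 1]
    for i in range(1, m + 1):      # boundary: left column, only moves down
        F[i][0] = F[i - 1][0] + down[i - 1][0]
    for i in range(1, m + 1):      # fill top to bottom, left to right
        for j in range(1, n + 1):
            F[i][j] = min(F[i - 1][j] + down[i - 1][j],
                          F[i][j - 1] + right[i][j - 1])
    return F[m][n]
```

On a 1×1 grid with right-edges 3 (top) and 1 (bottom) and down-edges 2 (left) and 4 (right), the two paths cost 3+4 = 7 and 2+1 = 3, so the answer is 3.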
Dynamic programming illustration [Grid figure: edge lengths on the edges and the computed F(i, j) value in each node, filled left to right and top to bottom using F(i, j) = min { F(i-1, j) + dist(i-1, j, i, j), F(i, j-1) + dist(i, j-1, i, j) }; the value at G is 20.]
Trace back [Same grid, with the chosen edges marked backwards from G to S.] The shortest path has length 20. What is the actual shortest path?
Elements of dynamic programming Optimal sub-structures: optimal solutions to the original problem contain optimal solutions to sub-problems. Overlapping sub-problems: some sub-problems appear in many solutions. Memoization and reuse: carefully choose the order in which sub-problems are solved, so that each sub-problem is solved at most once and its solution can be reused.
Two steps to dynamic programming 1. Formulate the solution as a recurrence relation of solutions to subproblems. 2. Specify an order to solve the subproblems so you always have what you need. Bottom-up: tabulate the solutions to all subproblems before they are used. What if you cannot determine the order easily, or if not all subproblems are needed? Top-down: compute when needed, and remember the ones you've computed.
Question If instead of the shortest path we want the (simple) longest path, can we still use dynamic programming? (Simple means no cycle is allowed.) Not in general, since there is no optimal substructure: even if s…m…t is the optimal path from s to t, s…m may not be optimal from s to m. Yes for acyclic graphs.
Example problems that can be solved by dynamic programming Sequence alignment (several different versions), shortest path in general graphs, knapsack (several different versions), scheduling, etc. We'll discuss the alignment, knapsack, and scheduling problems here; shortest path in general graphs will be covered with graph algorithms.
Sequence alignment Compare two strings to see if they are similar We need to define similarity Very useful in many applications Comparing DNA sequences, articles, source code, etc. Example: Longest Common Subsequence problem (LCS)
Common subsequence A subsequence of a string is the string with zero or more chars left out A common subsequence of two strings: A subsequence of both strings Ex: x = {A B C B D A B }, y = {B D C A B A} {B C} and {A A} are both common subsequences of x and y
Longest Common Subsequence Given two sequences x[1 . . m] and y[1 . . n], find a longest subsequence common to them both. "a", not "the": an LCS need not be unique. Ex: x = ABCBDAB, y = BDCABA; BCBA = LCS(x, y) (functional notation, but LCS is not a function).
Brute-force LCS algorithm Check every subsequence of x[1 . . m] to see if it is also a subsequence of y[1 . . n]. Analysis: there are 2^m subsequences of x (each bit-vector of length m determines a distinct subsequence of x), so the runtime would be exponential! Towards a better algorithm: a DP strategy. Key: optimal substructure and overlapping sub-problems. First we'll find the length of an LCS; later we'll modify the algorithm to find the LCS itself.
Optimal substructure Notice that the LCS problem has optimal substructure: parts of the final solution are solutions of subproblems. If z = LCS(x, y), then any prefix of z is an LCS of a prefix of x and a prefix of y. Subproblems: "find LCS of pairs of prefixes of x and y".
Recursive thinking Case 1: x[m] = y[n]. There is an optimal LCS that matches x[m] with y[n]: find LCS(x[1..m-1], y[1..n-1]). Case 2: x[m] ≠ y[n]. At most one of them is in the LCS. Case 2.1: x[m] not in LCS: find LCS(x[1..m-1], y[1..n]). Case 2.2: y[n] not in LCS: find LCS(x[1..m], y[1..n-1]).
Recursive thinking Case 1: x[m] = y[n]: LCS(x, y) = LCS(x[1..m-1], y[1..n-1]) || x[m] (reduce both sequences by 1 char, then concatenate). Case 2: x[m] ≠ y[n]: LCS(x, y) = LCS(x[1..m-1], y[1..n]) or LCS(x[1..m], y[1..n-1]), whichever is longer (reduce either sequence by 1 char).
Finding length of LCS Let c[i, j] be the length of LCS(x[1..i], y[1..j]) => c[m, n] is the length of LCS(x, y). If x[m] = y[n]: c[m, n] = c[m-1, n-1] + 1. If x[m] ≠ y[n]: c[m, n] = max { c[m-1, n], c[m, n-1] }.
Generalize: recursive formulation c[i, j] = c[i-1, j-1] + 1 if x[i] = y[j]; c[i, j] = max { c[i-1, j], c[i, j-1] } otherwise.
Recursive algorithm for LCS
LCS(x, y, i, j)
  if x[i] = y[j]
    then c[i, j] ← LCS(x, y, i-1, j-1) + 1
    else c[i, j] ← max{ LCS(x, y, i-1, j), LCS(x, y, i, j-1) }
Worst-case: x[i] ≠ y[j], in which case the algorithm evaluates two subproblems, each with only one parameter decremented.
Recursion tree m = 3, n = 4: [Tree: (3,4) branches into (2,4) and (3,3); (2,4) into (1,4) and (2,3); (3,3) into (2,3) and (3,2); both copies of (2,3) branch into (1,3) and (2,2) — the same subproblem appears twice, so we're solving subproblems already solved!] Height = m + n => work potentially exponential.
DP Algorithm Key: find the correct order to solve the sub-problems. Total number of sub-problems: m * n. c[i, j] = c[i-1, j-1] + 1 if x[i] = y[j]; max{ c[i-1, j], c[i, j-1] } otherwise. [Figure: the m × n table of c(i, j) values, filled row by row.]
DP Algorithm
LCS-Length(X, Y)
1. m = length(X)                  // get the # of symbols in X
2. n = length(Y)                  // get the # of symbols in Y
3. for i = 1 to m: c[i,0] = 0     // boundary: empty prefix of Y
4. for j = 1 to n: c[0,j] = 0     // boundary: empty prefix of X
5. for i = 1 to m                 // for all X[i]
6.   for j = 1 to n               // for all Y[j]
7.     if ( X[i] == Y[j] )
8.       c[i,j] = c[i-1,j-1] + 1
9.     else c[i,j] = max( c[i-1,j], c[i,j-1] )
10. return c
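The pseudocode above translates directly to Python (a sketch; it returns the full table so it can later be used for trace-back):

```python
def lcs_length(x, y):
    # Bottom-up tabulation: c[i][j] = length of LCS(x[1..i], y[1..j]).
    m, n = len(x), len(y)
    # Row 0 and column 0 stay 0: an empty prefix has an empty LCS.
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:              # x[i] == y[j] (1-based)
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])
    return c
```

For the running example, `lcs_length("ABCB", "BDCAB")[4][5]` gives 3, the length of the LCS.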
LCS Example We'll see how the LCS algorithm works on the following example: X = ABCB, Y = BDCAB. What is the LCS of X and Y? LCS(X, Y) = BCB.
LCS Example X = ABCB; m = |X| = 4. Y = BDCAB; n = |Y| = 5. Allocate array c[0..4, 0..5]. First set c[i,0] = 0 for i = 1 to m and c[0,j] = 0 for j = 1 to n; then fill the table left to right, top to bottom: if X[i] == Y[j] then c[i,j] = c[i-1,j-1] + 1, else c[i,j] = max( c[i-1,j], c[i,j-1] ). The completed table:

    j     0   1   2   3   4   5
  i      Y[j] B   D   C   A   B
  0       0   0   0   0   0   0
  1  A    0   0   0   0   1   1
  2  B    0   1   1   1   1   2
  3  C    0   1   1   2   2   2
  4  B    0   1   1   2   2   3

c[4,5] = 3 is the length of the LCS.
LCS Algorithm Running Time The LCS algorithm calculates the value of each entry of the array c[m,n]. So what is the running time? O(m*n), since each c[i,j] is calculated in constant time and there are m*n entries in the array.
How to find the actual LCS The algorithm just found the length of the LCS, not the LCS itself. How to find the actual LCS? For each c[i,j] we know how it was obtained: a match happens only when the first equation is taken, i.e. c[i,j] = c[i-1,j-1] + 1 (for example, 3 = 2 + 1). So we can start from c[m,n] and go backwards, remembering x[i] whenever c[i,j] = c[i-1,j-1] + 1.
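A Python sketch of the full procedure, table fill plus trace-back (the helper name `lcs` is ours):

```python
def lcs(x, y):
    # Build the DP table c[i][j] = length of LCS(x[1..i], y[1..j]).
    m, n = len(x), len(y)
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])
    # Trace back from c[m][n]: O(m + n) steps.
    out, i, j = [], m, n
    while i > 0 and j > 0:
        if x[i - 1] == y[j - 1]:            # match: this char is in the LCS
            out.append(x[i - 1])
            i, j = i - 1, j - 1
        elif c[i - 1][j] >= c[i][j - 1]:    # value came from above
            i -= 1
        else:                               # value came from the left
            j -= 1
    return "".join(reversed(out))           # chars were collected backwards
```

On the running example this recovers BCB, matching the table's final value of 3.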
Finding LCS [Table with the trace-back path marked from c[4,5] = 3 back toward the top-left corner.] Time for trace back: O(m+n).
Finding LCS (2) [Table with the matched characters highlighted along the trace-back path.] LCS (reversed order): B C B. LCS (straight order): B C B (this string turned out to be a palindrome).
LCS as a longest path problem [Grid figure over X = ABCB (rows) and Y = BDCAB (columns): diagonal edges of weight 1 where the characters match, horizontal and vertical edges of weight 0. The LCS length is the length of the longest path from the top-left corner to the bottom-right corner; the corner values reproduce the c table, ending at 3.]
A more general problem Aligning two strings, such that: match = 1, mismatch = 0 (or other scores), insertion/deletion = -1. Example: aligning ABBC with CABC. LCS = 3 (ABC), but the best alignment of ABBC with CABC has score 2; an alternative alignment scores only 1.
Let F(i, j) be the best alignment score between X[1..i] and Y[1..j]. F(m, n) is the best alignment score between X and Y. Recurrence: F(i, j) = max { F(i-1, j-1) + s(i, j) (match/mismatch), F(i-1, j) - 1 (insertion on Y), F(i, j-1) - 1 (insertion on X) }, where s(i, j) = 1 if X[i] = Y[j] and 0 otherwise.
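This recurrence can be sketched in Python (a minimal version; the default scores follow the scheme above, and the function name is ours):

```python
def align_score(x, y, match=1, mismatch=0, indel=-1):
    # F[i][j] = best alignment score of x[1..i] vs y[1..j].
    m, n = len(x), len(y)
    F = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        F[i][0] = F[i - 1][0] + indel          # x[1..i] aligned to gaps
    for j in range(1, n + 1):
        F[0][j] = F[0][j - 1] + indel          # y[1..j] aligned to gaps
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            s = match if x[i - 1] == y[j - 1] else mismatch
            F[i][j] = max(F[i - 1][j - 1] + s,     # match/mismatch
                          F[i - 1][j] + indel,     # gap in y
                          F[i][j - 1] + indel)     # gap in x
    return F[m][n]
```

Unlike LCS, the boundary rows are not zero: aligning a prefix against an empty string costs one indel per character.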
Alignment Example X = ABBC; m = |X| = 4. Y = CABC; n = |Y| = 4. Allocate array F[0..4, 0..4].
Alignment Example Boundary: F(i, 0) = -i, F(0, j) = -j. Fill left to right, top to bottom using F(i, j) = max { F(i-1, j-1) + s(i, j) (match/mismatch), F(i-1, j) - 1 (insertion on Y), F(i, j-1) - 1 (insertion on X) }, where s(i, j) = 1 if X[i] = Y[j] and 0 otherwise. The completed table:

    j     0   1   2   3   4
  i      Y[j] C   A   B   C
  0       0  -1  -2  -3  -4
  1  A   -1   0   0  -1  -2
  2  B   -2  -1   0   1   0
  3  B   -3  -2  -1   1   1
  4  C   -4  -2  -2   0   2

F(4, 4) = 2 is the best alignment score: ABBC aligned with CABC, score = 2.
Alignment as a longest path problem [Grid figure over X = ABBC (rows) and Y = CABC (columns): diagonal edges of weight s(i, j) (1 on a match, 0 on a mismatch), horizontal and vertical edges of weight -1. The best alignment score is the length of the longest path from the top-left corner to the bottom-right corner.]