5/1/03 CSE 202 - Algorithms: Dynamic Programming



Slide 2: Chess Endgames
Ken Thompson (Bell Labs) solved all chess endgame positions with ≤ 5 pieces (1986). Some surprises:
– K+Q beats K+2B (since 1634, believed to be a draw).
– 50 moves without a pawn move or capture is not sufficient to ensure it's a draw.
– By now, most (all?) 6-piece endgames are solved too.
Searching all possible games with 5 pieces, going forward 50 or so moves, is infeasible (today):
– If a typical position has 4 moves, the search tree has about 4^50 = 2^100 nodes.
– 2^100 = 2^45 ops/second × 2^25 seconds/year × 2^30 years (about a billion years).
There is a better way!

Slide 3: "Retrograde Analysis"
Insight: there aren't so many different positions. With 5 specific pieces, there are ≤ 64^5 positions, and 64^5 is about 10^9 (large, but quite feasible).
Make a memory location for each position. ("Position" includes whose move it is.) Store a count of possible moves (for Black-to-move positions).
Initialize a work list to all won positions for White.
For each "win" P in the list, process all positions Q that can move to P (unless Q was already known to be a win for White):
– If Q is a White move, mark Q as a win and add it to the list.
– If Q is a Black move, decrease its possible-move count by 1; if the count reaches 0, mark Q as a win for White and put it on the list.
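The counting scheme above can be sketched in Python over an abstract game graph. The data structures here (a dict of moves, a toy position set) are invented for illustration; Thompson's actual encoding was far more compact:

```python
from collections import deque

def retrograde_wins(positions, moves, white_to_move, immediate_wins):
    # moves[p]: positions reachable in one move from p
    # white_to_move[p]: True if it is White's turn at p
    # immediate_wins: positions already won for White (the work-list seed)
    preds = {p: [] for p in positions}
    for p in positions:
        for q in moves[p]:
            preds[q].append(p)
    # count of possible moves, used for Black-to-move positions
    remaining = {p: len(moves[p]) for p in positions}
    win = set(immediate_wins)
    work = deque(win)
    while work:
        p = work.popleft()
        for q in preds[p]:             # every position that can move to p
            if q in win:
                continue
            if white_to_move[q]:       # White simply moves to the won position
                win.add(q)
                work.append(q)
            else:                      # one more Black reply is refuted
                remaining[q] -= 1
                if remaining[q] == 0:  # every Black move leads to a White win
                    win.add(q)
                    work.append(q)
    return win
```

On a toy game graph this marks exactly the positions from which White can force a win, and leaves a Black "stalling loop" unmarked, since its move count never reaches 0.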

Slide 4: Why did this reduce work?
We could evaluate the game tree by divide & conquer:

    Win-for-white(P) {
        let Q1, ..., Qk be all possible moves from P;        // divide
        Wi = Win-for-white(Qi) for each i;
        if P is a white move and any Wi = T then return T;   // combine
        if P is a black move and all Wi = T then return T;
        return F;
    }

(Don't allow repeated positions.) This is inefficient due to overlapping subproblems: many subproblems share the same sub-subproblems.

Slide 5: Dynamic Programming
Motto: "It's not dynamic and it's not programming."
Dynamic programming applies to a problem with the "optimal substructure" property...
– meaning that, as in divide & conquer, you can build the solution to a bigger problem from solutions to subproblems...
... and "overlapping subproblems", which make divide & conquer inefficient.
Dynamic programming builds a big table and fills it in from small subproblems to large.
Another approach is memoization:
– Start with the big subproblems.
– Store answers in a table as you compute them.
– Before starting on a subproblem, check whether you've already done it.
More overhead, but (maybe) more intuitive.
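To make the table-filling vs. memoization contrast concrete, here is the usual minimal illustration, Fibonacci numbers (not an example from the slides), in both styles:

```python
from functools import lru_cache

def fib_table(n):
    # bottom-up dynamic programming: fill the table from small subproblems to large
    table = [0, 1] + [0] * max(0, n - 1)
    for i in range(2, n + 1):
        table[i] = table[i - 1] + table[i - 2]
    return table[n]

@lru_cache(maxsize=None)
def fib_memo(n):
    # memoization: start with the big subproblem and recurse downward;
    # lru_cache stores answers and checks the cache before recomputing
    return n if n < 2 else fib_memo(n - 1) + fib_memo(n - 2)
```

Both run in linear time on this problem; the naive recursion without the cache would be exponential, precisely because of overlapping subproblems.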

Slide 6: Example: Longest Common Subsequence
Z = z1 z2 ... zk is a subsequence of X = x1 x2 ... xn if you can get Z by dropping letters of X.
– Example: "Hello world" is a subsequence of "Help! I'm lost without our landrover".
LCS problem: given strings X and Y, find the length of the longest Z that is a subsequence of both (or perhaps find Z itself).

Slide 7: LCS has "optimal substructure"
Suppose length(X) = x and length(Y) = y. Let X[a:b] mean the a-th through b-th characters of X.
Observation: if Z is a longest common subsequence of X and Y, then the last character of Z is the last character...
– of both X and Y, so LCS(X,Y) = LCS(X[1:x-1], Y[1:y-1]) || X[x];
– or of just X (Y's last character is unused), so LCS(X,Y) = LCS(X, Y[1:y-1]);
– or of just Y, so LCS(X,Y) = LCS(X[1:x-1], Y);
– or neither, so LCS(X,Y) = LCS(X[1:x-1], Y[1:y-1]).

Slide 8: Dynamic Programming for LCS
Table: L(i,j) = length of LCS(X[1:i], Y[1:j]).
The case analysis (if the last characters of X and Y are the same, LCS(X,Y) = LCS(X[1:x-1], Y[1:y-1]) || X[x]; otherwise LCS(X,Y) is LCS(X, Y[1:y-1]), LCS(X[1:x-1], Y), or LCS(X[1:x-1], Y[1:y-1])) tells us:
    L(i,j) = max { L(i-1,j-1) + (X[i]==Y[j]), L(i,j-1), L(i-1,j) }
Each table entry is computed from 3 earlier entries.
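The recurrence translates directly into a table-filling routine. A minimal Python version (0-based strings, so X[i-1] plays the role of the slide's X[i]):

```python
def lcs_length(X, Y):
    # L[i][j] = length of an LCS of X[:i] and Y[:j]
    m, n = len(X), len(Y)
    L = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            # (X[i-1] == Y[j-1]) is 1 on a match, 0 otherwise,
            # exactly as in the slide's recurrence
            L[i][j] = max(L[i - 1][j - 1] + (X[i - 1] == Y[j - 1]),
                          L[i][j - 1],
                          L[i - 1][j])
    return L[m][n]
```

On the slide's example, lcs_length("Hello world", "Help! I'm lost without our landrover") returns 11, since "Hello world" is itself a subsequence of the longer string.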

Slide 9: Dynamic Programming for LCS
[Slide shows the filled-in table L(i,j) for X = "HELLOWORLD" (columns) and Y = "LANDROVER" (rows); cells where the characters match are highlighted, since a match increments the diagonal entry.]

Slide 10: But there are other substructures
(Work out the following at the board:)
– Can we have T(i) = LCS(X[1:i], Y[1:i])?
– How about: "the first half of X matches up with some part of Y, and the rest with the rest." This suggests
    LCS(X,Y) = max over i = 1,...,y of LCS(X[1:x/2], Y[1:i]) + LCS(X[x/2+1:x], Y[i+1:y])
– We can make the table 1-D instead of 2-D!
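The half-and-half substructure above is the basis of a linear-space LCS method (Hirschberg's algorithm, which the slide does not name); a sketch, offered as an illustration of the 1-D idea rather than as the lecture's own code:

```python
def lcs_last_row(X, Y):
    # last row of the LCS length table for X vs Y, kept in O(len(Y)) space
    prev = [0] * (len(Y) + 1)
    for x in X:
        cur = [0] * (len(Y) + 1)
        for j, y in enumerate(Y, 1):
            cur[j] = prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1])
        prev = cur
    return prev

def lcs_linear_space(X, Y):
    # recover an actual LCS using the split substructure from the slide:
    # fix the midpoint of X and try every split point i of Y
    if not X or not Y:
        return ""
    if len(X) == 1:
        return X if X in Y else ""
    mid = len(X) // 2
    left = lcs_last_row(X[:mid], Y)
    right = lcs_last_row(X[mid:][::-1], Y[::-1])  # suffix scores via reversal
    k = max(range(len(Y) + 1), key=lambda i: left[i] + right[len(Y) - i])
    return lcs_linear_space(X[:mid], Y[:k]) + lcs_linear_space(X[mid:], Y[k:])
```

Only two 1-D rows are ever alive at once, yet the recursion still reconstructs a full LCS, not just its length.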

Slide 11: Summary
For a good dynamic programming algorithm:
– The table should be low-dimensional (keeps memory requirements low).
– Each table entry should be fast to calculate (keeps complexity low).
– It may decompose the problem differently than a good divide & conquer algorithm would.

Slide 12: How to get your name on an algorithm: use dynamic programming
– Earley's algorithm: parse a string using a context-free grammar.
– Viterbi's algorithm: break a string into codewords.
– Smith-Waterman algorithm: match up two strings of DNA or protein.
– Floyd-Warshall: all-pairs shortest paths.

Slide 13: How to find a dynamic programming algorithm
For a moderately small problem, draw a tree representing the possible choices you have in finding an optimal solution. Look for instances of common subproblems.
Figure out how you could store the subproblem solutions in a table.
– Make a careful statement of what the table entries mean.
Figure out where to start filling in the table.
– Typically the smallest subproblems, at the tree's bottom.
Figure out how to fill in the other table entries.
– The recursive step: how do you solve a larger subproblem given the solutions to smaller subproblems?

Slide 14: Example: Viterbi's Algorithm
Given a Hidden Markov Model (HMM) and a string s = s1 s2 ... sn, find the most probable path taken by the Markov model that produces s.
– An HMM is a set of states, a set of transition probabilities between the states, and, for each state and each output character, a probability of producing that character at that state.
– The probability of a path is the product of the transition probabilities along the path and the probabilities of producing the observed characters at the states along the path.

Slide 15: Discovering Viterbi's algorithm
Example HMM (emission probabilities):
– State 1: Pr{A} = .2, Pr{B} = .8
– State 2: Pr{A} = .5, Pr{C} = .5
– State 3: Pr{A} = .3, Pr{B} = .4, Pr{C} = .3
[Slide shows the tree of state choices for producing "BCA": from the start, B can be produced in state 1 (Pr .8) or state 3 (Pr .4); then C in state 2 (Pr .5) or state 3 (Pr .3); then A in any state. Transition probabilities label the edges; only paths to leaves that produce BCA are shown. Assume the first states are equiprobable.]
The most probable path producing "BCA" has probability .33 × .8 × .7 × .5 × .5 × .5 ≈ .023.
The two red subtrees in the figure are identical: they both represent the choices for producing "CA" starting in state 2.

Slide 16: How to find a dynamic programming algorithm (continued)
Figure out how you could store the subproblem solutions in a table.
– Index the table by a state and a level in the tree.
Make a careful statement of what the table entries mean.
– T(i,k) = probability of the most-probable path that starts in state i and produces the string sk sk+1 ... sn.
Figure out where to start filling in the table.
– You can fill in T(i,n) = Pr{state i produces sn} for each i.
Figure out how to fill in the other table entries.
– T(i,k) = Pr{i produces sk} × max over states j of [ Pr{i moves to j} × T(j,k+1) ]
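The backward table T(i,k) above fills in directly. In the sketch below the emission probabilities come from slide 15's example HMM, but the transition probabilities are an assumption (uniform), since the slide's actual values did not survive transcription:

```python
def viterbi_table(states, trans, emit, s):
    # T[(i, k)] = probability of the most probable path that starts in
    # state i and produces s[k:], filled from the end of the string backward
    n = len(s)
    T = {}
    for i in states:
        T[(i, n - 1)] = emit[i].get(s[n - 1], 0.0)
    for k in range(n - 2, -1, -1):
        for i in states:
            T[(i, k)] = emit[i].get(s[k], 0.0) * max(
                trans[i][j] * T[(j, k + 1)] for j in states)
    return T

# Emissions from the slide's example HMM; transitions ASSUMED uniform
emit = {1: {"A": .2, "B": .8},
        2: {"A": .5, "C": .5},
        3: {"A": .3, "B": .4, "C": .3}}
trans = {i: {j: 1 / 3 for j in (1, 2, 3)} for i in (1, 2, 3)}
T = viterbi_table([1, 2, 3], trans, emit, "BCA")
# with equiprobable first states, the best start simply maximizes T(i, 0)
best_start = max([1, 2, 3], key=lambda i: T[(i, 0)])
```

Under these assumed transitions the most probable path for "BCA" starts in state 1 (producing B with probability .8), echoing the start of the path on the slide.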

Slide 17: Yet another example
Disclaimer: dynamic programming is NOT the best way to solve this problem!
Given matrices M1, M2, ..., Mn, where Mi has ri rows and ci columns, find the best order of performing the computation.
– It will be some parenthesization, e.g. (M1 × M2) × (M3 × (... × Mn) ...).
– Note that for i = 1, ..., n-1, ci = r(i+1).
– Assume the time to multiply an r × s matrix by an s × t matrix is rst.
We worked out an example and found subproblems like "what is the time to compute Mi × Mi+1 × ... × Mk?" This led us to a Θ(n^3) algorithm.
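A sketch of that Θ(n^3) table-filling algorithm, where cost[i][j] holds the minimum number of scalar multiplications to compute Mi × ... × Mj (the dims encoding is the standard one, since ci = r(i+1) lets a chain of n matrices be described by n+1 numbers):

```python
def matrix_chain_cost(dims):
    # M_i has dims[i-1] rows and dims[i] columns, for i = 1..n
    n = len(dims) - 1
    cost = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(2, n + 1):           # length of the chain M_i..M_j
        for i in range(1, n - length + 2):
            j = i + length - 1
            cost[i][j] = min(cost[i][k] + cost[k + 1][j]
                             + dims[i - 1] * dims[k] * dims[j]
                             for k in range(i, j))  # split after M_k
    return cost[1][n]
```

For example, with shapes 10×30, 30×5, 5×60, the order (M1 × M2) × M3 costs 1500 + 3000 = 4500 multiplications, while M1 × (M2 × M3) costs 27000; the table finds the 4500.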

Slide 18: Protein (or DNA) String Matching
Biological sequences:
– A protein is a string of amino acids (20 "characters").
– DNA is a string of base pairs (4 "characters").
Databases of known sequences:
– SDSC handles the Protein Data Bank (PDB).
– The human genome (3×10^9 base pairs) was computed in '01.
String matching problem: given a query string, find similar strings in the database.
– The Smith-Waterman algorithm is the most accurate method.
– FastA and BLAST are faster approximations.

Slide 19: Smith-Waterman Algorithm
Like LCS, but more sophisticated:
– Each pair (x,y) of characters has an "affinity" A(x,y). Easy to handle: add the affinity instead of "1".
– Skipping characters incurs a penalty: c0 + k·c1 for skipping over k characters.
[Slide shows an example alignment of two strings with gaps, scoring Score = 50 - 29 = 21 using c0 = 2, c1 = 1.]

Slide 20: Smith-Waterman Algorithm
We could handle skips by:
    SW(i,j) = max { 0,
                    SW(i-1,j-1) + A(Xi,Yj),
                    max over k=1..j of { SW(i,j-k) - c0 - c1·k },
                    max over k=1..i of { SW(i-k,j) - c0 - c1·k },
                    max over k=1..i, m=1..j of { SW(i-k,j-m) - 2·c0 - c1·(k+m) } }
This would take O(n^4) time. Fortunately, the last term (the double max) is redundant: you can go from SW(i-k,j-m) to SW(i,j-m) and then to SW(i,j) using the two single-skip terms, for the same total penalty. Eliminating it reduces the complexity to O(n^3).

Slide 21: Smith-Waterman in O(n^2) time
At cell (i,j) of the table keep three numbers:
– s(i,j) = score of the best matching of X[1:i] and Y[1:j], whether or not it ends in a skip.
– t(i,j) = best score of a matching of X[1:i] and Y[1:j] that skips Xi.
– u(i,j) = best score of a matching of X[1:i] and Y[1:j] that skips Yj.
At each cell, compute:
    t(i,j) = max { s(i-1,j) - c0 - c1,    (start a new gap)
                   t(i-1,j) - c1 }        (continue an old gap)
    u(i,j) = max { s(i,j-1) - c0 - c1, u(i,j-1) - c1 }
    s(i,j) = max { s(i-1,j-1) + A(xi,yj), t(i,j), u(i,j), 0 }
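The three-table recurrence above (often attributed to Gotoh) can be written out as follows. A is any affinity function; this sketch returns only the best local score, not the alignment itself:

```python
def smith_waterman_affine(X, Y, A, c0, c1):
    # best local alignment score with gap penalty c0 + k*c1
    NEG = float("-inf")
    m, n = len(X), len(Y)
    s = [[0.0] * (n + 1) for _ in range(m + 1)]  # best score, any ending
    t = [[NEG] * (n + 1) for _ in range(m + 1)]  # best score skipping X_i
    u = [[NEG] * (n + 1) for _ in range(m + 1)]  # best score skipping Y_j
    best = 0.0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            t[i][j] = max(s[i - 1][j] - c0 - c1,  # start a new gap
                          t[i - 1][j] - c1)       # continue an old gap
            u[i][j] = max(s[i][j - 1] - c0 - c1,
                          u[i][j - 1] - c1)
            s[i][j] = max(s[i - 1][j - 1] + A(X[i - 1], Y[j - 1]),
                          t[i][j], u[i][j], 0.0)
            best = max(best, s[i][j])
    return best
```

With a toy affinity of +2 for a match and -1 for a mismatch (an assumption for illustration; real protein scoring uses matrices like BLOSUM), identical length-4 strings score 8, and totally dissimilar strings score 0, since the max-with-0 restarts the local alignment.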

Slide 22: Glossary (in case symbols are weird)
⊆ subset   ∈ element of   ∀ for all   ∃ there exists   Θ big theta   Ω big omega   Σ summation   ≥ greater or equal   ≤ less or equal   ≈ about equal   ≠ not equal   ℕ natural numbers (N)   ℝ reals (R)   ℚ rationals (Q)   ℤ integers (Z)