Lecture 8: Dynamic Programming

Lecture 8:Dynamic Programming Shang-Hua Teng

Longest Common Subsequence • Biologists need to measure how similar strands of DNA are to determine how closely related an organism is to another. • They do this by considering DNA as strings of letters A,C,G,T and then comparing similarities in the strings. • Formally they look at common subsequences in the strings. • Example X = AGTCAACGTT, Y=GTTCGACTGTG • Both S = AGTG and S’=GTCACGT are subsequences • How to do find these efficiently?

Brute Force • if |X| = m, |Y| = n, then there are 2m subsequences of x; we must compare each with Y (n comparisons) • So the running time of the brute-force algorithm is O(n 2m) • Notice that the LCS problem has optimal substructure: solutions of subproblems are parts of the final solution. • Subproblems: “find LCS of pairs of prefixes of X and Y”

Some observations

Setup • First we’ll find the length of LCS, along the way we will leave “clues” on finding subsequence. • Define Xi, Yj to be the prefixes of X and Y of length i and j respectively. • Define c[i,j] to be the length of LCS of Xi and Yj • Then the length of LCS of X and Y will be c[m,n]

The recurrence • The subproblems overlap, to find LCS we need to find LCS of c[i, j-1] and of c[i-1, j]

LCS Algorithm • First we’ll find the length of LCS. Later we’ll modify the algorithm to find LCS itself. • Recall we want to let Xi, Yj to be the prefixes of X and Y of length i and j respectively • And that Define c[i,j] to be the length of LCS of Xi and Yj • Then the length of LCS of X and Y will be c[m,n]

LCS recursive solution • We start with i = j = 0 (empty substrings of x and y) • Since X0 and Y0 are empty strings, their LCS is always empty (i.e. c[0,0] = 0) • LCS of empty string and any other string is empty, so for every i and j: c[0, j] = c[i,0] = 0

LCS recursive solution • When we calculate c[i,j], we consider two cases: • First case:x[i]=y[j]: one more symbol in strings X and Y matches, so the length of LCS Xi and Yjequals to the length of LCS of smaller strings Xi-1 and Yi-1 , plus 1

LCS recursive solution • Second case:x[i] != y[j] • As symbols don’t match, our solution is not improved, and the length of LCS(Xi , Yj) is the same as before (i.e. maximum of LCS(Xi, Yj-1) and LCS(Xi-1,Yj)

LCS Example We’ll see how LCS algorithm works on the following example: • X = ABCB • Y = BDCAB What is the Longest Common Subsequence of X and Y? LCS(X, Y) = BCB X = A BCB Y = B D C A B

LCS Example (0) j 0 1 2 3 4 5 i Yj B D C A B Xi 0 A 1 B 2 3 C 4 B X = ABCB; m = |X| = 4 Y = BDCAB; n = |Y| = 5 Allocate array c[6,5]

LCS Example (1) j 0 1 2 3 4 5 i Yj B D C A B Xi 0 A 1 0 B 2 0 3 C 0 4 B 0 for i = 1 to m c[i,0] = 0

LCS Example (2) j 0 1 2 3 4 5 i Yj B D C A B Xi 0 0 0 0 0 0 0 A 1 0 B 2 0 3 C 0 4 B 0 for j = 0 to n c[0,j] = 0

LCS Example (3) j 0 1 2 3 4 5 i Yj B D C A B Xi 0 0 0 0 0 0 0 A 1 0 0 B 2 0 3 C 0 4 B 0 case i=1 and j=1 A != B but, c[0,1]>=c[1,0] so c[1,1] = c[0,1], and b[1,1] =

LCS Example (4) j 0 12 3 4 5 i Yj B D C A B Xi 0 0 0 0 0 0 0 A 1 0 0 0 B 2 0 3 C 0 4 B 0 case i=1 and j=2 A != D but, c[0,2]>=c[1,1] so c[1,2] = c[0,2], and b[1,2] =

LCS Example (5) j 0 1 2 3 4 5 i Yj B D C A B Xi 0 0 0 0 0 0 0 A 1 0 0 0 0 B 2 0 3 C 0 4 B 0 case i=1 and j=3 A != C but, c[0,3]>=c[1,2] so c[1,3] = c[0,3], and b[1,3] =

LCS Example (6) j 0 1 2 3 4 5 i Yj B D C A B Xi 0 0 0 0 0 0 0 A 1 0 0 0 0 1 B 2 0 3 C 0 4 B 0 case i=1 and j=4 A = A so c[1,4] = c[0,2]+1, and b[1,4] =

LCS Example (7) j 0 1 2 3 4 5 i Yj B D C A B Xi 0 0 0 0 0 0 0 A 1 0 0 0 0 1 1 B 2 0 3 C 0 4 B 0 case i=1 and j=5 A != B this time c[0,5]<c[1,4] so c[1,5] = c[1, 4], and b[1,5] =

LCS Example (8) j 0 1 2 3 4 5 i Yj B D C A B Xi 0 0 0 0 0 0 0 A 1 0 0 0 0 1 1 B 2 0 1 3 C 0 4 B 0 case i=2 and j=1 B = B so c[2, 1] = c[1, 0]+1, and b[2, 1] =

LCS Example (9) j 0 12 3 4 5 i Yj B D C A B Xi 0 0 0 0 0 0 0 A 1 0 0 0 0 1 1 B 2 0 1 1 3 C 0 4 B 0 case i=2 and j=2 B != D and c[1, 2] < c[2, 1] so c[2, 2] = c[2, 1] and b[2, 2] =

LCS Example (10) j 0 1 2 3 4 5 i Yj B D C A B Xi 0 0 0 0 0 0 0 A 1 0 0 0 0 1 1 B 2 0 1 1 1 3 C 0 4 B 0 case i=2 and j=3 B != D and c[1, 3] < c[2, 2] so c[2, 3] = c[2, 2] and b[2, 3] =

LCS Example (11) j 0 1 2 3 4 5 i Yj B D C A B Xi 0 0 0 0 0 0 0 A 1 0 0 0 0 1 1 B 2 0 1 1 1 1 3 C 0 4 B 0 case i=2 and j=4 B != A and c[1, 4] = c[2, 3] so c[2, 4] = c[1, 4] and b[2, 2] =

LCS Example (12) j 0 1 2 3 4 5 i Yj B D C A B Xi 0 0 0 0 0 0 0 A 1 0 0 0 0 1 1 B 2 0 1 1 1 1 2 3 C 0 4 B 0 case i=2 and j=5 B = B so c[2, 5] = c[1, 4]+1 and b[2, 5] =

LCS Example (13) j 0 1 2 3 4 5 i Yj B D C A B Xi 0 0 0 0 0 0 0 A 1 0 0 0 0 1 1 B 2 0 1 1 1 1 2 3 C 0 1 4 B 0 case i=3 and j=1 C != B and c[2, 1] > c[3,0] so c[3, 1] = c[2, 1] and b[3, 1] =

LCS Example (14) j 0 1 2 3 4 5 i Yj B D C A B Xi 0 0 0 0 0 0 0 A 1 0 0 0 0 1 1 B 2 0 1 1 1 1 2 3 C 0 1 1 4 B 0 case i=3 and j= 2 C != D and c[2, 2] = c[3, 1] so c[3, 2] = c[2, 2] and b[3, 2] =

LCS Example (15) j 0 1 2 3 4 5 i Yj B D C A B Xi 0 0 0 0 0 0 0 A 1 0 0 0 0 1 1 B 2 0 1 1 1 1 2 3 C 0 1 1 2 4 B 0 case i=3 and j= 3 C = C so c[3, 3] = c[2, 2]+1 and b[3, 3] =

LCS Example (16) j 0 1 2 3 4 5 i Yj B D C A B Xi 0 0 0 0 0 0 0 A 1 0 0 0 0 1 1 B 2 0 1 1 1 1 2 3 C 0 1 1 2 2 4 B 0 case i=3 and j= 4 C != A c[2, 4] < c[3, 3] so c[3, 4] = c[3, 3] and b[3, 3] =

LCS Example (17) j 0 1 2 3 4 5 i Yj B D C A B Xi 0 0 0 0 0 0 0 A 1 0 0 0 0 1 1 B 2 0 1 1 1 1 2 3 C 0 1 1 2 2 2 4 B 0 case i=3 and j= 5 C != B c[2, 5] = c[3, 4] so c[3, 5] = c[2, 5] and b[3, 5] =

LCS Example (18) j 0 1 2 3 4 5 i Yj B D C A B Xi 0 0 0 0 0 0 0 A 1 0 0 0 0 1 1 B 2 0 1 1 1 1 2 3 C 0 1 1 2 2 2 4 B 0 1 case i=4 and j=1 B = B so c[4, 1] = c[3, 0]+1 and b[4, 1] =

LCS Example (19) j 0 1 2 3 4 5 i Yj B D C A B Xi 0 0 0 0 0 0 0 A 1 0 0 0 0 1 1 B 2 0 1 1 1 1 2 3 C 0 1 1 2 2 2 4 B 0 1 1 case i=4 and j=2 B != D c[3, 2] = c[4, 1] so c[4, 2] = c[3, 2] and b[4, 2] =

LCS Example (20) j 0 1 2 3 4 5 i Yj B D C A B Xi 0 0 0 0 0 0 0 A 1 0 0 0 0 1 1 B 2 0 1 1 1 1 2 3 C 0 1 1 2 2 2 4 B 0 1 1 2 case i=4 and j= 3 B != C c[3, 3] > c[4, 2] so c[4, 3] = c[3, 3] and b[4, 3] =

LCS Example (21) j 0 1 2 3 4 5 i Yj B D C A B Xi 0 0 0 0 0 0 0 A 1 0 0 0 0 1 1 B 2 0 1 1 1 1 2 3 C 0 1 1 2 2 2 4 B 0 1 1 2 2 case i=4 and j=4 B != A c[3, 4] = c[4, 3] so c[4, 4] = c[3, 4] and b[3, 5] =

LCS Example (22) j 0 1 2 3 4 5 i Yj B D C A B Xi 0 0 0 0 0 0 0 A 1 0 0 0 0 1 1 B 2 0 1 1 1 1 2 3 C 0 1 1 2 2 2 3 4 B 0 1 1 2 2 case i=4 and j=5 B= B so c[4, 5] = c[3, 4]+1 and b[4, 5] =

LCS Algorithm Running Time • LCS algorithm calculates the values of each entry of the array c[m,n] • So the running time is clearly O(mn) as each entry is done in 3 steps. • Now how to get at the solution? • We use the arrows we created to guide us. • We simply follow arrows back to base case 0

Finding LCS j 0 1 2 3 4 5 i Yj B D C A B Xi 0 0 0 0 0 0 0 A 1 0 0 0 0 1 1 B 2 0 1 1 1 1 2 3 C 0 1 1 2 2 2 3 4 B 0 1 1 2 2

Finding LCS (2) j 0 1 2 3 4 5 i Yj B D C A B Xi 0 0 0 0 0 0 0 A 1 0 0 0 0 1 1 B 2 0 1 1 1 1 2 3 C 0 1 1 2 2 2 3 4 B 0 1 1 2 2 B C B LCS (reversed order): B C B (this string turned out to be a palindrome) LCS (straight order):

LCS-Length(X, Y) m = length(X), n = length(Y) for i = 1 to m do c[i, 0] = 0 for j = 0 to n do c[0, j] = 0 for i = 1 to m do for j = 1 to n do if ( xi = = yj ) then c[i, j] = c[i - 1, j - 1] + 1 else if c[i - 1, j]>=c[i, j - 1] then c[i, j] = c[i - 1, j] else c[i, j] = c[i, j - 1] return c and b

Two Common Issues • Optimal substructure • Overlapping subproblems

Optimal Substructure • A problem exhibits optimal substructure if an optimal solution contains optimal solutions to its sub-problems. • Build an optimal solution from optimal solutions to sub-problems • Example - Matrix-chain multiplication: An optimal parenthesization of AiAi+1…Aj that splits the product between Akand Ak+1 contains within it optimal solutions to the problem of parenthesizing AiAi+1…Ak and Ak+1Ak+2…Aj.

Then (A1A2) (A3((A4A5)A6)) must be optimal for A1A2A3A4A5A6 Otherwise, if (A1(A2 A3)) ((A4A5)A6) is optimal for A1A2A3A4A5A6 Then ((A1(A2 A3)) ((A4A5)A6)) ((A7A8)A9) will be better than ((A1A2)(A3((A4A5)A6))) ((A7A8)A9) Illustration of Optimal Substructure Minimal Cost_A1..6 + Cost_A7..9+p0p6p9 A1A2A3A4A5A6A7A8A9 Suppose ((A1A2)(A3((A4A5)A6))) ((A7A8)A9) is optimal

Recognizing subproblems • Show a solution to the problem consists of making a choice. Making the choice leaves one or more sub-problems to be solved. • Suppose that for a given problem, the choice that leads to an optimal solution is available. • Notice something in common with a greedy solution, more on this later.

Dynamic vs. Greedy • Dynamic programming uses optimal substructure in a bottom-up fashion • First find optimal solutions to subproblems and, having solved the subproblems, we find an optimal solution to the problem • Greedy algorithms use optimal substructure in a top-down fashion • First make a choice – the choice that looks best at the time – and then solving a resulting subproblem

Overlapping Subproblems • Divide-and-Conquer is suitable when generating brand-new problems at each step of the recursion. • Dynamic-programming algorithms take advantage of overlapping subproblems by solving each subproblem once and then storing the solution in a table where it can be looked up when needed, using constant time per lookup

Assembly Line

2n possible solutions Time between adjacent stationsare 0 Problem Definition ti,j: time to transfer from assembly line 12 or 21 ai,j: processing time in each station e1, e2: time to enter assembly lines 1 and 2 x1, x2: time to exit assembly lines 1 and 2

Optimal Substructure • We want the fastest way through the factory (from the starting point) • What is the fastest possible way through S1,1 (similar for S2,1) • Only one way  take time e1 • The fastest possible way through S1,j for j=2, 3, ..., n (similar for S2,j) • S1,j-1 S1,j : T1,j-1 + a1,j • If the fastest way through S1,j is through S1,j-1 must have taken a fastest way through S1,j-1 • Similar for S2, i-1 • An optimal solution contains an optimal solution to sub-problems  optimal substructure.

S1,1  2 + 7 = 9S2,1  4 + 8 = 12

S1,1 + 9 = 9 + 9 = 18 S1,1  2 + 7 = 9S2,1  4 + 8 = 12 S1,2 = S2,1 + 2 + 9 = 12 + 2 + 9 = 23 S1,1 + 2 + 5 = 9 + 2 + 5 = 16 S2,2 = S2,1 + 5 = 12 + 5 = 17

Lecture 8: Dynamic Programming

Lecture 8: Dynamic Programming

Presentation Transcript

Revealing the Man Behind the Curtain

Dynamic Programming: Edit Distance

Dynamic Programming

Integer Programming

Advanced Programming

INTRODUCTION TO FUNCTIONAL PROGRAMMING

Mobile Programming Lecture 14

Mobile Programming Lecture 8

Machine Programming - Introduction CENG331: Introduction to Computer Systems 5 th Lecture

Dynamic-Programming Strategies for Analyzing Biomolecular Sequences

Introduction to Programming

Usability of Programming Languages

Object Oriented Programming in Java

Machine Programming - Introduction CENG331: Introduction to Computer Systems 4 th Lecture

Dynamic Programming: from the Basics to Completely Dyamic Applications

CSE-321 Programming Languages Introduction to Functional Programming

Mobile Programming Lecture 6

Programming Language Concepts, COSC-3308-01 Lecture 4

Dynamic Programming

Neuro-Dynamic Programming

Lecture 1 : Introduction to Programming in Java Lecturer : Susan Eisenbach