Dynamic Programming Characteristics and Examples
Overview • What is dynamic programming? • Examples • Applications
What is Dynamic Programming? • Design technique for 'optimization' problems (a sequence of related decisions) • 'Programming' does not mean 'coding' in this context; it means 'solving by making a chart', i.e., using an array to save intermediate results. Some books call this 'memoization' (see below) • Similar to Divide and Conquer, BUT subproblem solutions are SAVED and NEVER recomputed • Principle of optimality: the optimal solution to the problem contains optimal solutions to the subproblems (Is this true for EVERYTHING?)
Characteristics • Optimal substructure • Unweighted shortest path? • Unweighted longest simple path? • Overlapping Subproblems • What happens in recursion (D&C) when this happens? • Memoization (not a typo!) • Saving solutions of subproblems (like we did in Fibonacci) to avoid recomputation
Examples • Matrix chain multiplication • Longest common subsequence (called the LCS problem)
Review of Technique • You have already applied dynamic programming and understand why it may result in a good algorithm • Fibonacci • Ackermann • Combinations
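• As a refresher, a minimal Java sketch of memoized Fibonacci (class and variable names such as FibMemo and memo are illustrative, not from the course):

import java.util.HashMap;
import java.util.Map;

public class FibMemo {
    // Table of saved subproblem answers: the essence of memoization.
    private static final Map<Integer, Long> memo = new HashMap<>();

    static long fib(int n) {
        if (n <= 1) return n;                  // base cases: F(0)=0, F(1)=1
        Long saved = memo.get(n);
        if (saved != null) return saved;       // already solved: never recompute
        long result = fib(n - 1) + fib(n - 2);
        memo.put(n, result);                   // save before returning
        return result;
    }

    public static void main(String[] args) {
        System.out.println(fib(50));           // 12586269025; naive recursion would be infeasible
    }
}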
Principle of Optimality • Often called the 'optimality condition' • What it means in plain English: you apply the divide-and-conquer technique, so the subproblems are SMALLER VERSIONS OF THE ORIGINAL PROBLEM. If you optimize the answers to the small problems, does that automatically mean the solution to the big problem is also optimized? If the answer is yes, then DP applies to this problem
Example 1 • Assume your problem is to draw a straight line between two points A and B. You solve this by divide and conquer, drawing a line from A to the midpoint and from the midpoint to B. QUESTION: if you paste the two smaller lines together, will the resulting line from A to B be the shortest distance from A to B?
Example 2 • Say you want to buy Halloween candy: 100 candy bars. You plan to do this by divide and conquer, buying 10 sets of 10 bars. Is this necessarily less expensive per bar than just buying 2 packages of 50? Or perhaps 1 package of 100?
How To Apply DP • There are TWO ways to apply dynamic programming • METHOD 1: solve the problem at hand recursively, notice where the same subproblem is being 're-solved', and implement the algorithm as a TABLE (example: Fibonacci, as sketched below) • METHOD 2: generate all feasible solutions to the problem but prune (eliminate) the solutions that cannot be optimal (example: shortest path)
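• To make METHOD 1 concrete, a minimal Java sketch of Fibonacci 'solved by making a chart' – the recursion replaced by a table filled bottom-up (the name fibTable is illustrative):

static long fibTable(int n) {
    if (n <= 1) return n;
    long[] f = new long[n + 1];       // the 'chart' of subproblem answers
    f[0] = 0;
    f[1] = 1;
    for (int i = 2; i <= n; i++)
        f[i] = f[i - 1] + f[i - 2];   // each entry computed exactly once
    return f[n];
}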
Practice Is the Only Way to Learn This Technique • See the class webpage for homework. • Do textbook problems that were not assigned, for practice – even after the term ends. It took me two years after I took this course before I could apply DP in the real world.
Dynamic programming • Design technique, like divide-and-conquer. • Example: Longest Common Subsequence (LCS) • Given two sequences x[1 . . m] and y[1 . . n], find a longest subsequence common to them both ("a" not "the": LCS(x, y) is functional notation, but not a function). • e.g., x: A B C B D A B and y: B D C A B A give BCBA = LCS(x, y).
Brute-force LCS algorithm • Check every subsequence of x[1 . . m] to see if it is also a subsequence of y[1 . . n]. • Analysis • Checking = O(n) time per subsequence. • 2^m subsequences of x (each bit-vector of length m determines a distinct subsequence of x). • Worst-case running time = O(n·2^m) = exponential time.
Towards a better algorithm • Simplification: • Look at the length of a longest common subsequence. • Extend the algorithm to find the LCS itself. • Notation: denote the length of a sequence s by |s|. • Strategy: consider prefixes of x and y. • Define c[i, j] = |LCS(x[1 . . i], y[1 . . j])|. • Then, c[m, n] = |LCS(x, y)|.
Recursive formulation • Theorem. c[i, j] = c[i–1, j–1] + 1 if x[i] = y[j]; c[i, j] = max{c[i–1, j], c[i, j–1]} otherwise. • Proof. Case x[i] = y[j]: [figure: the prefixes x[1 . . i] and y[1 . . j], with the equal characters x[i] and y[j] aligned at their ends] • Let z[1 . . k] = LCS(x[1 . . i], y[1 . . j]), where c[i, j] = k. Then z[k] = x[i], or else z could be extended. Thus, z[1 . . k–1] is a CS of x[1 . . i–1] and y[1 . . j–1].
Proof (continued) • Claim: z[1 . . k–1] = LCS(x[1 . . i–1], y[1 . . j–1]). • Suppose w is a longer CS of x[1 . . i–1] and y[1 . . j–1], that is, |w| > k–1. Then, cut and paste: w || z[k] (w concatenated with z[k]) is a common subsequence of x and y with |w || z[k]| > k. Contradiction, proving the claim. • Thus, c[i–1, j–1] = k–1, which implies that c[i, j] = c[i–1, j–1] + 1. • Other cases are similar.
Dynamic-programming hallmark #1 • Optimal substructure: an optimal solution to a problem (instance) contains optimal solutions to subproblems. • If z = LCS(x, y), then any prefix of z is an LCS of a prefix of x and a prefix of y.
Recursive algorithm for LCS

LCS(x, y, i, j)
  if x[i] = y[j]
    then c[i, j] ← LCS(x, y, i–1, j–1) + 1
    else c[i, j] ← max{LCS(x, y, i–1, j), LCS(x, y, i, j–1)}

Worst case: x[i] ≠ y[j], in which case the algorithm evaluates two subproblems, each with only one parameter decremented.
Recursion tree • m = 3, n = 4: the root (3,4) branches to (2,4) and (3,3); (2,4) branches to (1,4) and (2,3); (3,3) branches to (2,3) and (3,2); then (2,3) branches to (1,3) and (2,2) on both sides. The subproblems (2,3), (1,3), and (2,2) each appear more than once: we're solving subproblems already solved! • Height = m + n ⇒ work potentially exponential.
Dynamic-programming hallmark #2 • Overlapping subproblems: a recursive solution contains a "small" number of distinct subproblems repeated many times. • The number of distinct LCS subproblems for two strings of lengths m and n is only mn.
Memoization algorithm • Memoization: after computing a solution to a subproblem, store it in a table. Subsequent calls check the table to avoid redoing work.

LCS(x, y, i, j)
  if c[i, j] = NIL
    then if x[i] = y[j]
      then c[i, j] ← LCS(x, y, i–1, j–1) + 1
      else c[i, j] ← max{LCS(x, y, i–1, j), LCS(x, y, i, j–1)}   ⇐ same as before

• Time = Θ(mn) = constant work per table entry. Space = Θ(mn).
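• A hedged Java rendering of the memoization algorithm above, with -1 standing in for NIL and strings treated as 1-indexed via charAt(i – 1) (class and method names are illustrative):

import java.util.Arrays;

public class LcsMemo {
    static String x = "ABCBDAB", y = "BDCABA";           // the running example
    static int[][] c = new int[x.length() + 1][y.length() + 1];

    static int lcs(int i, int j) {
        if (i == 0 || j == 0) return 0;                   // empty prefix
        if (c[i][j] != -1) return c[i][j];                // table hit: no recomputation
        if (x.charAt(i - 1) == y.charAt(j - 1))
            c[i][j] = lcs(i - 1, j - 1) + 1;
        else
            c[i][j] = Math.max(lcs(i - 1, j), lcs(i, j - 1));
        return c[i][j];
    }

    public static void main(String[] args) {
        for (int[] row : c) Arrays.fill(row, -1);         // NIL everywhere
        System.out.println(lcs(x.length(), y.length()));  // prints 4 = |BCBA|
    }
}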
Dynamic-programming algorithm • IDEA: Compute the table bottom-up. • Time = Θ(mn). Space = Θ(mn). Exercise: reduce space to O(min{m, n}).

        A  B  C  B  D  A  B
     0  0  0  0  0  0  0  0
  B  0  0  1  1  1  1  1  1
  D  0  0  1  1  1  2  2  2
  C  0  0  1  2  2  2  2  2
  A  0  1  1  2  2  2  3  3
  B  0  1  2  2  3  3  3  4
  A  0  1  2  2  3  3  4  4

• Reconstruct the LCS by tracing backwards from c[m, n]: the matched entries spell BCBA.
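• A hedged Java sketch of the bottom-up table plus the backward trace, again on the running example (names are illustrative):

public class LcsTable {
    public static void main(String[] args) {
        String x = "ABCBDAB", y = "BDCABA";
        int m = x.length(), n = y.length();
        int[][] c = new int[m + 1][n + 1];               // row 0 and column 0 stay 0
        for (int i = 1; i <= m; i++)
            for (int j = 1; j <= n; j++)
                c[i][j] = (x.charAt(i - 1) == y.charAt(j - 1))
                        ? c[i - 1][j - 1] + 1
                        : Math.max(c[i - 1][j], c[i][j - 1]);
        // Trace backwards from c[m][n] to reconstruct one LCS.
        StringBuilder z = new StringBuilder();
        int i = m, j = n;
        while (i > 0 && j > 0) {
            if (x.charAt(i - 1) == y.charAt(j - 1)) { z.append(x.charAt(i - 1)); i--; j--; }
            else if (c[i - 1][j] >= c[i][j - 1]) i--;
            else j--;
        }
        System.out.println(z.reverse());                 // prints BCBA
    }
}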
String Matching • Detecting the occurrence of a particular substring (pattern) in another string (text) • A Straightforward Solution • The Knuth-Morris-Pratt Algorithm • The Boyer-Moore Algorithm
Straightforward solution • Algorithm: Simple string matching • Input: P and T, the pattern and text strings; m, the length of P. The pattern is assumed to be nonempty. • Output: The return value is the index in T where a copy of P begins, or -1 if no match for P is found.
int simpleScan(char[] P, char[] T, int m)
  int match;                    // value to return
  int i, j, k;
  match = -1;
  j = 1; k = 1; i = j;
  while (endText(T, j) == false)
    if (k > m)
      match = i;                // match found
      break;
    if (T[j] == P[k])
      j++; k++;
    else
      // Back up over matched characters.
      int backup = k - 1;
      j = j - backup;
      k = k - backup;
      // Slide pattern forward, start over.
      j++; i = j;
  return match;
Analysis • Worst-case complexity is in Θ(mn) • Need to back up. • Works quite well on average for natural language.
The Knuth-Morris-Pratt Algorithm • Pattern Matching with Finite Automata • e.g. P = “AABC”
The Knuth-Morris-Pratt Flowchart • Character labels are inside the nodes • Each node has two arrows out to other nodes: a success link and a fail link • The next character is read only after a success link • A special node, node 0, called "get next char", reads in the next text character. • e.g. P = "ABABCB"
Construction of the KMP Flowchart • Definition: Fail links • We define fail[k] as the largest r (with r < k) such that P[1 . . r–1] matches P[k–r+1 . . k–1]. That is, the (r–1)-character prefix of P is identical to the (r–1)-character substring ending at index k–1. Thus the fail links are determined by repetition within P itself.
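• Worked example (computed from the definition above for the earlier flowchart pattern P = "ABABCB"): fail[1..6] = 0, 1, 1, 2, 3, 1. For instance, fail[5] = 3 because the 2-character prefix "AB" matches the 2 characters "AB" ending at index 4, and no larger r works.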
Algorithm: KMP flowchart construction • Input: P, a string of characters; m, the length of P. • Output: fail, the array of failure links, defined for indexes 1, . . . , m. The array is passed in and the algorithm fills it.

void kmpSetup(char[] P, int m, int[] fail)
  int k, s;
  fail[1] = 0;
  for (k = 2; k <= m; k++)
    s = fail[k - 1];
    while (s >= 1)
      if (P[s] == P[k - 1])
        break;
      s = fail[s];
    fail[k] = s + 1;
The Knuth-Morris-Pratt Scan Algorithm

int kmpScan(char[] P, char[] T, int m, int[] fail)
  int match, j, k;
  match = -1;
  j = 1; k = 1;
  while (endText(T, j) == false)
    if (k > m)
      match = j - m;            // match found
      break;
    if (k == 0)
      j++; k = 1;
    else if (T[j] == P[k])
      j++; k++;
    else
      // Follow fail arrow.
      k = fail[k];
    // continue loop
  return match;
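• A hedged, self-contained Java version of the two KMP routines, converted to 0-based indexing; note that this standard 0-based failure function differs slightly from the 1-based fail[] definition on the earlier slide (names are illustrative):

public class Kmp {
    // fail[k] = length of the longest proper prefix of P[0..k] that is also a suffix of it.
    static int[] buildFail(String p) {
        int[] fail = new int[p.length()];
        int s = 0;
        for (int k = 1; k < p.length(); k++) {
            while (s > 0 && p.charAt(s) != p.charAt(k)) s = fail[s - 1];  // follow fail links
            if (p.charAt(s) == p.charAt(k)) s++;
            fail[k] = s;
        }
        return fail;
    }

    // Return the index in t where p first occurs, or -1 if there is no match.
    static int kmpScan(String p, String t) {
        int[] fail = buildFail(p);
        int k = 0;                                        // characters of p matched so far
        for (int j = 0; j < t.length(); j++) {
            while (k > 0 && p.charAt(k) != t.charAt(j)) k = fail[k - 1];
            if (p.charAt(k) == t.charAt(j)) k++;
            if (k == p.length()) return j - p.length() + 1;
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(kmpScan("ABABCB", "XXABABABCBYY"));  // prints 4
    }
}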
Analysis • KMP flowchart construction requires 2m – 3 character comparisons in the worst case • The scan algorithm requires 2n character comparisons in the worst case • Overall: worst-case complexity is Θ(n + m)
The Boyer-Moore Algorithm • The new idea • First heuristic: scan the pattern from right to left against the text, and on a mismatch jump the pattern forward • e.g., find "must" in the text "If you wish to understand you must…" [figure: successive alignments of "must" under the text, annotated with the number of character comparisons made at each position, showing that most text characters are skipped entirely]
Algorithm: Computing Jumps for the Boyer-Moore Algorithm • Input: Pattern string P; m, the length of P; alphabet size alpha = |Σ| • Output: Array charJump, defined on indexes 0, . . . , alpha–1. The array is passed in and the algorithm fills it.

void computeJumps(char[] P, int m, int alpha, int[] charJump)
  char ch; int k;
  for (ch = 0; ch < alpha; ch++)
    charJump[ch] = m;
  for (k = 1; k <= m; k++)
    charJump[P[k]] = m - k;
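• Worked example (computed from the loops above, assuming P = "must", m = 4): the first loop sets every charJump entry to 4; the second loop then sets charJump['m'] = 3, charJump['u'] = 2, charJump['s'] = 1, charJump['t'] = 0. Any text character not occurring in "must" lets the pattern jump forward the full 4 positions.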
Hashing Theory and Practice
Overview • What is hashing? • Why is hashing important? • Hashing functions • Collision handling • Hashing in the real world
What Is Hashing? • Hashing is a search technique that – when it works – allows you to search in Θ(1) • The term 'hash' refers to the old way of doing this type of search – hash means to 'chop up', and in the old days a key such as my name would be 'hashed' by taking the first letter 'G', or the next-to-last letter 'n', etc.
Example Array • Example: hash my name 'greene' – take the 'g' and map it to its numeric position in the alphabet ('g' is letter 7 of the 26-slot array) • When someone wants to retrieve my 'record', the key is just 'rehashed' and we go to spot #7 without having to search
Why Hashing Is Important • Realtime applications • Applications with enormous numbers of possible keys, of which only a tiny number will actually occur in practice • Example: Social Security numbers of Rivier students
Hashing Terminology • Hash function: f(key) -> index. • The method by which you map a given key to an index of an array. • A hashing function should be fast and uniform (it should spread keys evenly, rarely mapping different keys to the same index). • Loading density (or loading factor). • The ratio of used spots to total spots in the array (hash table) -> basically, how full is the table? • Collisions (or synonyms) – when two different keys map to the same spot in the hash table.
Hashing Functions • Any method that works is OK – no need for complicated techniques. Just convert the key to a numerical index SOMEHOW. • Division remainder method (modulus method). • Divide the key by the table size and use the remainder (0 . . . N–1) as the index. • Example: key = 36, table size = 30: F(36) = 36 mod 30 = 6. • Use a divisor that is the closest PRIME NUMBER to the table size (but isn't bigger). Why? (In our case it would be 29.)
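• A minimal Java sketch of the division-remainder method with a prime divisor (the table size, divisor, and key are illustrative):

public class ModHash {
    static final int TABLE_SIZE = 30;
    static final int DIVISOR = 29;            // closest prime not bigger than the table size

    static int hash(int key) {
        return key % DIVISOR;                 // index in 0 .. DIVISOR-1
    }

    public static void main(String[] args) {
        System.out.println(hash(36));         // 36 mod 29 = 7
    }
}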
Huge Problem: Collision Handling • Collisions MAY occur no matter what hashing function you use. You will need a technique for collision handling that does not destroy system performance • Techniques • Linear probing (and primary clustering) • Quadratic probing • Chaining (see the sketch after the next slide)
Analysis of Collision Handling Techniques • Linear probing: ≈ 1/(1 – a), where 'a' is the loading factor • Chaining • Successful search: Θ(1 + a/2) – 1 to access the slot, and a/2 time to search the chain • Unsuccessful search: Θ(1 + a), where 'a' is the loading factor
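• A hedged Java sketch of chaining, resolving collisions with a linked list per slot (names are illustrative, and a production table would also resize as the loading factor grows):

import java.util.LinkedList;

public class ChainedTable {
    private final LinkedList<int[]>[] slots;             // each chain entry is a {key, value} pair

    @SuppressWarnings("unchecked")
    ChainedTable(int size) {
        slots = new LinkedList[size];
        for (int i = 0; i < size; i++) slots[i] = new LinkedList<>();
    }

    private int hash(int key) { return Math.floorMod(key, slots.length); }

    void put(int key, int value) {
        for (int[] e : slots[hash(key)])
            if (e[0] == key) { e[1] = value; return; }   // key already present: update
        slots[hash(key)].add(new int[]{key, value});     // a collision just grows the chain
    }

    Integer get(int key) {
        for (int[] e : slots[hash(key)])                 // walk the chain: ~a/2 steps on average
            if (e[0] == key) return e[1];
        return null;                                     // unsuccessful search
    }

    public static void main(String[] args) {
        ChainedTable t = new ChainedTable(29);
        t.put(12, 100);
        t.put(41, 200);                                  // 12 and 41 collide (both are 12 mod 29)
        System.out.println(t.get(41));                   // prints 200
    }
}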
Hashing in the Real World • Often done for realtime applications • When designing a hashing strategy YOU MUST PAY ATTENTION TO ACTIONS ON THE TABLE SUCH AS DELETIONS! • A successful place to start: • Division remainder with a prime divisor • Chaining for 'overflow' (collisions)
Binary Trees: Searching
Overview • We are studying searching techniques from the fastest (hashing) onwards. Notice that searching techniques come logically after sorting techniques • What is a binary search tree? • Tree traversals • Randomly (dynamically) built binary search trees
What is a Binary Search Tree? • [figure: a BST with root K, left child H (children B and J), and right child M (children L and T)] • Worst-case search time T(n)? • Is a single node a BST? • Can a BST be empty?
Traversals • Traversals visit the tree's nodes in some order and guarantee that no node will be missed • Traversal types • Preorder: PLR (parent, left, right) • Inorder: LPR (left, parent, right) • Postorder: LRP (left, right, parent) • Breadth first • Depth first
[figure: the same BST – root K, left subtree H (B, J), right subtree M (L, T)] • Preorder? K H B J M L T
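• A hedged Java sketch of the slide's example tree and its preorder traversal (the tree is read off the figure; class and method names are illustrative):

public class Bst {
    static class Node {
        char key;
        Node left, right;
        Node(char key, Node left, Node right) { this.key = key; this.left = left; this.right = right; }
    }

    // Preorder = parent, then left subtree, then right subtree (PLR).
    static void preorder(Node n) {
        if (n == null) return;
        System.out.print(n.key + " ");
        preorder(n.left);
        preorder(n.right);
    }

    public static void main(String[] args) {
        // Root K; left subtree H with children B and J; right subtree M with children L and T.
        Node root = new Node('K',
                new Node('H', new Node('B', null, null), new Node('J', null, null)),
                new Node('M', new Node('L', null, null), new Node('T', null, null)));
        preorder(root);                                  // prints: K H B J M L T
    }
}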