The Longest Common Subsequence Problem

Presentation Transcript


  1. The Longest Common Subsequence Problem CSE 373 Data Structures

  2. Reading: Goodrich and Tamassia, 3rd ed., Chapter 12, Section 11.5, pp. 570-574. CSE 373 AU 04 -- Longest Common Subsequences

  3. Motivation • Two Problems and Methods for String Comparison: • The substring problem • The longest common subsequence problem. • In both cases, good algorithms do substantially better than the brute force methods.

  4. String Matching Problem • Given two strings TEXT and PATTERN, find the first occurrence of PATTERN in TEXT. • Useful in text editing, document analysis, genome analysis, etc.

  5. String Matching Problem: Brute-Force Algorithm (n = length of TEXT, m = length of PATTERN)
     For i = 0 to n – m {
       For j = 0 to m – 1 {
         If TEXT[i + j] ≠ PATTERN[j] then break
         If j = m – 1 then return i
       }
     }
     return -1
     Suppose TEXT = 0000000000001 and PATTERN = 0000001. On inputs of this type, nearly every alignment matches almost all of PATTERN before failing, so the algorithm makes Θ(nm) comparisons, which is Θ(n²) behavior when m is proportional to n. A more efficient algorithm is the Boyer-Moore algorithm. (We will not be covering it in this course.)
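The brute-force pseudocode on slide 5 can be sketched in Python roughly as follows. The function name `find_first` is an assumption made for illustration and does not come from the slides; the loop structure mirrors the pseudocode, with `i` as the candidate starting position in the text and `j` as the offset into the pattern.

```python
def find_first(text: str, pattern: str) -> int:
    """Return the index of the first occurrence of pattern in text, or -1."""
    n, m = len(text), len(pattern)
    # Try every alignment i of the pattern against the text.
    for i in range(n - m + 1):
        # Compare character by character; stop at the first mismatch.
        for j in range(m):
            if text[i + j] != pattern[j]:
                break
        else:
            # The inner loop finished without a mismatch: full match at i.
            return i
    return -1


# The example from the slide: the only match ends at the final '1'.
print(find_first("0000000000001", "0000001"))
```

On the slide's example, every alignment before the match compares six zeros successfully and only then fails on the seventh character, which is what drives the quadratic worst case.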

  6. Longest Common Subsequence Problem • A Longest Common Subsequence (LCS) of two strings S1 and S2 is a longest string that can be obtained from S1 and from S2 by deleting elements. • For example, S1 = “thoughtful” and S2 = “shuffle” have an LCS: “hufl”. • Useful in spelling correction, document comparison, etc.
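To make the definition on slide 6 concrete, here is a small sketch that checks whether one string can be obtained from another by deleting elements. The helper name `is_subsequence` is hypothetical and not part of the slides. It only verifies that a candidate is a common subsequence; it does not show that the candidate is a longest one.

```python
def is_subsequence(candidate: str, s: str) -> bool:
    """True if candidate can be obtained from s by deleting zero or more characters."""
    pos = 0  # index of the next character of candidate still to be matched
    for ch in s:
        if pos < len(candidate) and ch == candidate[pos]:
            pos += 1
    return pos == len(candidate)


# "hufl" can be obtained from both example strings on the slide.
print(is_subsequence("hufl", "thoughtful"), is_subsequence("hufl", "shuffle"))
```

A brute-force LCS would run a check like this on every subsequence of one string, and a string of length n has 2^n subsequences, which is why a better method is needed.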

  7. Dynamic Programming • Analyze the problem in terms of a number of smaller subproblems. • Solve the subproblems and keep their answers in a table. • Each subproblem’s answer is easily computed from the answers to its own subproblems.
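The three points on slide 7 can be illustrated with a top-down sketch for the LCS length. This is only one possible way to organize the computation, not necessarily the one used in the course: the subproblem is "length of an LCS of the first i characters of s1 and the first j characters of s2", and `functools.lru_cache` plays the role of the table of saved answers. The names `lcs_length_memo` and `solve` are assumptions for illustration.

```python
from functools import lru_cache


def lcs_length_memo(s1: str, s2: str) -> int:
    """Length of an LCS of s1 and s2, computed top-down with memoization."""

    @lru_cache(maxsize=None)  # the cache acts as the table of subproblem answers
    def solve(i: int, j: int) -> int:
        # Subproblem: LCS length of the prefixes s1[:i] and s2[:j].
        if i == 0 or j == 0:
            return 0
        if s1[i - 1] == s2[j - 1]:
            # The last characters match, so they can extend a smaller LCS.
            return solve(i - 1, j - 1) + 1
        # Otherwise drop the last character of one prefix or the other.
        return max(solve(i - 1, j), solve(i, j - 1))

    return solve(len(s1), len(s2))


print(lcs_length_memo("thoughtful", "shuffle"))
```

Because each pair (i, j) is solved once, the work is proportional to the product of the two string lengths. For long inputs the recursion depth can exceed Python's default limit, which is one reason a bottom-up table is often preferred.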

  8. Longest Common Subsequence: Algorithm using Dynamic Programming • For every prefix of S1 and prefix of S2 we’ll compute the length L of an LCS. • In the end, we’ll get the length of an LCS for S1 and S2 themselves. • The subsequence can be recovered from the matrix of L values. • (see demonstration)
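The slide refers to a demonstration that is not included in this transcript. As a stand-in, here is a minimal bottom-up sketch of the approach it describes, under the assumption that `L[i][j]` holds the LCS length for the first `i` characters of S1 and the first `j` characters of S2. The function names `lcs_table` and `lcs` are illustrative, and the tie-breaking rule used when walking back through the table is one arbitrary choice among several valid ones.

```python
def lcs_table(s1: str, s2: str) -> list[list[int]]:
    """Build the matrix L where L[i][j] is the LCS length of s1[:i] and s2[:j]."""
    m, n = len(s1), len(s2)
    L = [[0] * (n + 1) for _ in range(m + 1)]  # row 0 and column 0: empty prefixes
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s1[i - 1] == s2[j - 1]:
                L[i][j] = L[i - 1][j - 1] + 1
            else:
                L[i][j] = max(L[i - 1][j], L[i][j - 1])
    return L


def lcs(s1: str, s2: str) -> str:
    """Recover one LCS by walking back through the matrix of L values."""
    L = lcs_table(s1, s2)
    i, j = len(s1), len(s2)
    chars = []
    while i > 0 and j > 0:
        if s1[i - 1] == s2[j - 1]:
            chars.append(s1[i - 1])  # this character belongs to the LCS
            i -= 1
            j -= 1
        elif L[i - 1][j] >= L[i][j - 1]:
            i -= 1  # an LCS of the same length exists without s1[i - 1]
        else:
            j -= 1
    return "".join(reversed(chars))


print(lcs("thoughtful", "shuffle"))
```

The bottom-right entry of the table is the LCS length for S1 and S2 themselves. When several longest common subsequences exist, a different tie-breaking rule in the walk back may return a different one of them.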
