Simple and fast linear space computation of longest common subsequences
Simple and fast linear space computation of Longest common subsequences

Claus Rick, 1999



What is the LCS problem?

A A B A C

A B C

Finding a sequence of greatest possible length that can be obtained from both A and B by deleting zero or more (not necessarily adjacent) symbols.
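To make the definition concrete, here is a minimal sketch of the textbook quadratic dynamic program for the LCS length. It is not the linear-space method of the talk; the function name `lcs_length` is an illustrative assumption.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of a longest common subsequence of a and b (classic DP)."""
    # prev[j] is the LCS length of the already processed prefix of a and b[:j].
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                # Extend a common subsequence of the two shorter prefixes.
                cur.append(prev[j - 1] + 1)
            else:
                # Drop the last symbol of one of the two prefixes.
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


print(lcs_length("AABAC", "ABC"))  # 3
```

For the two strings on the slide the result is 3: the whole of A B C can be obtained from A A B A C by deleting two symbols.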



Some boring history…



Pre-Info

  • Divide and conquer

  • Midpoint
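The two bullets name the classical linear-space strategy: split one string in the middle, find a position in the other string where some LCS can be cut (a midpoint), and recurse on both parts. The following is a minimal sketch of that textbook scheme, following Hirschberg's 1975 idea rather than the refined method of this talk; the names `lcs_row` and `hirschberg_lcs` are illustrative assumptions.

```python
def lcs_row(a: str, b: str) -> list[int]:
    """Last row of the LCS table: entry j is the LCS length of a and b[:j]."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev


def hirschberg_lcs(a: str, b: str) -> str:
    """Return one LCS of a and b, keeping only single table rows in memory."""
    if not a or not b:
        return ""
    if len(a) == 1:
        return a if a in b else ""
    mid = len(a) // 2
    # LCS lengths of the first half of a against every prefix of b ...
    left = lcs_row(a[:mid], b)
    # ... and of the second half of a against every suffix of b.
    right = lcs_row(a[mid:][::-1], b[::-1])
    n = len(b)
    # The midpoint: a split of b that maximizes the combined length.
    split = max(range(n + 1), key=lambda j: left[j] + right[n - j])
    return hirschberg_lcs(a[:mid], b[:split]) + hirschberg_lcs(a[mid:], b[split:])


print(hirschberg_lcs("AABAC", "ABC"))  # ABC
```

Each recursion level recomputes rows instead of storing the full table, which is the trade that makes the space linear.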



Some basic terms

Ordered Pair (i,j)

A A B A C

A B C

(2,3) = (A, C)



Some basic terms

Match

A A B A C

A B C
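The drawing for this slide is lost, so the sketch below assumes the usual definition: an ordered pair (i,j) is a match when the ith symbol of the first string equals the jth symbol of the second. Under that reading the pair (2,3) = (A, C) is an ordered pair but not a match. The function name `matches` is an illustrative assumption.

```python
def matches(a: str, b: str) -> list[tuple[int, int]]:
    """All matches (i, j), 1-based, with a[i-1] == b[j-1]."""
    return [
        (i, j)
        for i, x in enumerate(a, start=1)
        for j, y in enumerate(b, start=1)
        if x == y
    ]


print(matches("AABAC", "ABC"))
# [(1, 1), (2, 1), (3, 2), (4, 1), (5, 3)]
```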



Some basic terms

Chain

A A B A C

A B C



Rank k

A A B A C

A B C
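The illustrations for chain and rank are lost. The sketch below assumes the standard reading: a chain is a sequence of matches that strictly increases in both components, and a match has rank k when the longest chain ending at it has length k. The function name `match_ranks` is an illustrative assumption.

```python
def match_ranks(a: str, b: str) -> dict[tuple[int, int], int]:
    """Map each match (i, j), 1-based, to the length of the longest chain ending there."""
    # table[i][j] is the LCS length of a[:i] and b[:j];
    # at a match this value is exactly the rank of the match.
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    ranks = {}
    for i, x in enumerate(a, start=1):
        for j, y in enumerate(b, start=1):
            if x == y:
                table[i][j] = table[i - 1][j - 1] + 1
                ranks[(i, j)] = table[i][j]
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return ranks


print(match_ranks("AABAC", "ABC"))
# {(1, 1): 1, (2, 1): 1, (3, 2): 2, (4, 1): 1, (5, 3): 3}
```

The largest rank equals the LCS length, here 3, reached by the chain (1,1), (3,2), (5,3).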



Some basic terms

Matching Matrix

[Figure: matching matrix with columns labelled c b a b b a c a c and rows labelled a b a c b c b a]



Some basic terms

Dominant matches

All Upper-left matches in each rank
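A minimal sketch of this definition, assuming that "upper-left" means: a match of rank k is dominant when no other match of rank k has both a row index and a column index that are less than or equal to its own. Rows are indexed by the first string and columns by the second; the function name `dominant_matches` is an illustrative assumption.

```python
from collections import defaultdict


def dominant_matches(a: str, b: str) -> dict[int, list[tuple[int, int]]]:
    """Map each rank k to the sorted list of its dominant matches (1-based)."""
    # Rank of a match = LCS length of the two prefixes ending at that match.
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    by_rank = defaultdict(list)
    for i, x in enumerate(a, start=1):
        for j, y in enumerate(b, start=1):
            if x == y:
                table[i][j] = table[i - 1][j - 1] + 1
                by_rank[table[i][j]].append((i, j))
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    dominant = {}
    for k, same_rank in by_rank.items():
        # Keep a match only if no other match of the same rank lies
        # weakly above and weakly to the left of it.
        dominant[k] = sorted(
            m for m in same_rank
            if not any(o != m and o[0] <= m[0] and o[1] <= m[1] for o in same_rank)
        )
    return dominant


print(dominant_matches("AABAC", "ABC"))
# {1: [(1, 1)], 2: [(3, 2)], 3: [(5, 3)]}
```

This brute-force filter is only meant to illustrate the definition; it makes no attempt at the efficiency discussed later in the talk.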


Dominant matches

[Figure: matching matrix (columns c b a b b a c a c, rows a b a c b c b a) with the dominant matches marked and the contours numbered 1 to 5]



Backward contours (BC)

[Figure: matching matrix (columns c b a b b a c a c, rows a b a c b c b a) with the backward contours labelled 5, 4, 3, 2, 1]



Some last basic terms

FC_k: the kth forward contour

BC_k: the kth backward contour


Forward contours (FC)

[Figure: matching matrix (columns c b a b b a c a c, rows a b a c b c b a) with the forward contours labelled 1, 2, 3, 4, 5]



Lemma 1

Let p be the length of an LCS between strings A and B. Then for every match (i,j) the following holds:

  • There is an LCS containing (i,j) if and only if, for some k, (i,j) is on the kth forward contour and on the (p-k+1)st backward contour.
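The lemma can be checked on small inputs. The sketch below assumes that a match is on the kth forward contour when the longest chain ending at it has length k, and on the kth backward contour when the longest chain starting at it has length k. Under that assumption the lemma says a match lies on some LCS exactly when its forward and backward ranks add up to p + 1. All function names are illustrative assumptions.

```python
def forward_ranks(a: str, b: str) -> dict[tuple[int, int], int]:
    """Length of the longest chain ending at each match (1-based indices)."""
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    ranks = {}
    for i, x in enumerate(a, start=1):
        for j, y in enumerate(b, start=1):
            if x == y:
                table[i][j] = table[i - 1][j - 1] + 1
                ranks[(i, j)] = table[i][j]
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return ranks


def backward_ranks(a: str, b: str) -> dict[tuple[int, int], int]:
    """Length of the longest chain starting at each match."""
    m, n = len(a), len(b)
    # Reverse both strings, rank forwards, then map the indices back.
    reversed_ranks = forward_ranks(a[::-1], b[::-1])
    return {(m + 1 - i, n + 1 - j): k for (i, j), k in reversed_ranks.items()}


def matches_on_some_lcs(a: str, b: str) -> list[tuple[int, int]]:
    """Matches with forward rank k and backward rank p - k + 1 for some k."""
    fwd, bwd = forward_ranks(a, b), backward_ranks(a, b)
    p = max(fwd.values(), default=0)  # LCS length
    return sorted(m for m in fwd if fwd[m] + bwd[m] - 1 == p)


print(matches_on_some_lcs("AABAC", "ABC"))
# [(1, 1), (2, 1), (3, 2), (5, 3)]
```

In the example the match (4,1) is excluded: it has forward rank 1 but backward rank 2, so the longest chain through it has length 2, not 3.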


Lemma 1- proof

[Figure: proof sketch with the labels P, |FC| = k, |BC| = p-k+1 and < (p-k+1)]


Start calculating

FC_1, BC_1, FC_2, BC_2

Sooner or later…


Really really last terms

Define the sets M_i as:

M_0 = M

M_1 = M_0 \ FC_1

M_2 = M_1 \ BC_1

M_{2i-1} = M_{2(i-1)} \ FC_i

M_{2i} = M_{2i-1} \ BC_i


[Figure: matching matrix (columns c b a b b a c a c, rows a b a c b c b a) showing the set M]

[Figure: the same matrix showing the sets M_1, M_2, M_3, M_4 and M_5]

Let us call the first empty M_i …

M_{p'}



Lemma 2

  • The length of an LCS is p', and each match in M_{p'-1} is a possible midpoint.
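The peeling process behind this lemma can be sketched as follows. The sketch assumes that M is the set of all matches, that FC_i holds the matches whose longest ending chain has length i, and that BC_i holds the matches whose longest starting chain has length i. It stores every match explicitly, so it only illustrates the statement and is not a linear-space procedure. The names `ranks` and `peel_contours` are illustrative assumptions.

```python
def ranks(a: str, b: str) -> dict[tuple[int, int], int]:
    """Length of the longest chain ending at each match (1-based indices)."""
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    result = {}
    for i, x in enumerate(a, start=1):
        for j, y in enumerate(b, start=1):
            if x == y:
                table[i][j] = table[i - 1][j - 1] + 1
                result[(i, j)] = table[i][j]
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return result


def peel_contours(a: str, b: str) -> tuple[int, set[tuple[int, int]]]:
    """Return (p', M_{p'-1}): index of the first empty set and the last non-empty one."""
    m, n = len(a), len(b)
    fwd = ranks(a, b)
    # Backward ranks: rank the reversed strings and map the indices back.
    bwd = {(m + 1 - i, n + 1 - j): k for (i, j), k in ranks(a[::-1], b[::-1]).items()}
    remaining = set(fwd)  # M_0 = M
    step, last_nonempty = 0, set()
    while remaining:
        last_nonempty = remaining
        step += 1
        k = (step + 1) // 2
        # Odd steps remove FC_k, even steps remove BC_k.
        rank_of = fwd if step % 2 == 1 else bwd
        remaining = {match for match in remaining if rank_of[match] != k}
    return step, last_nonempty


print(peel_contours("AABAC", "ABC"))  # (3, {(3, 2)})
```

For the example strings the first empty set is M_3, matching the LCS length 3, and the only surviving match (3,2) is the middle element of the chain (1,1), (3,2), (5,3).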


Lemma 2- proof

[Figure: proof sketch with the levels K, K-1, K-2, …, 1, 0 (where K = p) and the sets M_k, M_{k-1}, …, M_2, M_1, M_0]



Little problem…

  • We can't keep track of every set: that is very expensive.



What do we do?

Keep only dominant matches…

When we see a dominant match below, we are done.



Let's define:

  • FC_{f'} and BC_{b'}: the contours with the minimal indices f' and b' as stated above



Lemma 3

  • The length of an LCS is b' + f' - 1.



Complexity

Finding the dominant matches of each contour:

O(min(m, n-p))

Number of contours:

p

Total: O(min(pm, p(n-p)))



The End


Simple and fast linear space computation of longest common subsequence

Written by:

Claus Rick, 1999

Based on algorithm by:

D. Hirschberg, 1975

Cast:

Matrices

Lines

Arrows

Squares

Blue

Red

Brown

Grey

Black

String A

String B

Presentation: Uri Scheiner

No Dominant Matches were harmed during the making of this presentation


Appendix

  • What is the LCS
  • Divide and Conquer
  • Match
  • Chain
  • Dominant Matches
  • FC
  • BC
  • Lemma 1
  • Define M…
  • Lemma 2
  • Keep just Dominant…
  • Lemma 3
  • Complexity

