
Reverse Factor Algorithm

This presentation describes the Reverse Factor algorithm for string matching, from the paper "Speeding up two string matching algorithms". It covers the suffix-to-prefix rule, preprocessing of the pattern with a suffix automaton, and a worked example.

neiljohnson

Presentation Transcript


  1. Reverse Factor Algorithm. Source: "Speeding up two string matching algorithms", Algorithmica, Vol. 12, 1994, pp. 247-267, by Crochemore, M., Czumaj, A., Gasieniec, L., Jarominek, S., Lecroq, T., Plandowski, W. and Rytter, W. Advisor: Prof. R. C. T. Lee. Speaker: L. C. Chen.

  2. Rule 1: The Suffix to Prefix Rule • For a window to have any chance of matching the pattern, some suffix of the window must be equal to a prefix of the pattern.

  3. Basic Ideas • Open a window W of size |P| in the text T. • Find the longest suffix of W that is also a prefix of the pattern P. • Case 1: that suffix is the whole window, so W = P. Match! [Figure: window W of length |P| in T, aligned with P]

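  4. Case 2: The longest such suffix is shorter than |P|. We move W so that the prefix of P lines up with that suffix. [Figure: window W before and after the shift] Case 3: If there is no such suffix, we move W by length |P|. [Figure: window W shifted by |P|]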
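The three cases can be sketched in Python as follows. This is a minimal illustration, not the presenters' code: the names `longest_suffix_prefix` and `window_shift` are hypothetical, the overlap is found by brute force rather than with a suffix automaton, and the Case 2 shift of |P| - k is an assumption taken from the shifts shown in the worked example (for instance 8 - 3).

```python
from typing import Optional, Tuple


def longest_suffix_prefix(window: str, pattern: str) -> int:
    """Length of the longest suffix of `window` that is also a prefix of `pattern`."""
    # Brute force: try the longest candidate first and stop at the first hit.
    for k in range(min(len(window), len(pattern)), 0, -1):
        if window.endswith(pattern[:k]):
            return k
    return 0


def window_shift(window: str, pattern: str) -> Tuple[bool, Optional[int]]:
    """Return (is_match, shift) for one window of length |P|.

    Case 1: the overlap is the whole window, so the window matches.
            The shift after a match is handled separately, so None is returned.
    Case 2: the overlap has length 0 < k < |P|, assumed shift |P| - k.
    Case 3: there is no overlap, shift by |P|.
    """
    m = len(pattern)
    k = longest_suffix_prefix(window, pattern)
    if k == m:
        return True, None
    return False, m - k  # covers Case 3 as well, since k == 0 gives a shift of m
```

For the first window of the worked example, `window_shift("GCATCGCA", "GCAGAGAG")` finds the overlap `GCA` of length 3 and returns `(False, 5)`.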

  5. Preprocessing phase • T = GCATCGCAGAGAGTATACAGTACG • P = GCAGAGAG • L(S): the set of all prefixes of the pattern. • We construct the suffix automaton of P. [Figure: suffix automaton with states 0 to 8 and transitions labeled A, C and G]

  6. Preprocessing: Construct a Suffix Tree • P^R: the reversal of the string P. [Figure: suffix tree with nodes numbered 1 to 8]
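As an illustration of the preprocessing step, the sketch below builds a suffix automaton with the standard online construction and applies it to the reversed pattern P^R. The slides do not show their construction, so the function names and the state layout here are assumptions and need not match the numbering in the figures. Reading a window from right to left then follows transitions in this automaton: staying inside the automaton means the scanned suffix is a factor of P, and landing on a terminal state means it is a prefix of P.

```python
from typing import Dict, List, Optional, Set, Tuple


def build_suffix_automaton(s: str) -> Tuple[List[Dict[str, int]], Set[int]]:
    """Return (transitions, terminal_states) of the suffix automaton of `s`."""
    trans: List[Dict[str, int]] = [{}]  # trans[state][char] -> next state
    link = [-1]                         # suffix links
    length = [0]                        # length of the longest string per state
    last = 0
    for ch in s:
        cur = len(trans)
        trans.append({})
        length.append(length[last] + 1)
        link.append(-1)
        p = last
        while p != -1 and ch not in trans[p]:
            trans[p][ch] = cur
            p = link[p]
        if p == -1:
            link[cur] = 0
        else:
            q = trans[p][ch]
            if length[p] + 1 == length[q]:
                link[cur] = q
            else:
                # Split state q by cloning it with a shorter length.
                clone = len(trans)
                trans.append(dict(trans[q]))
                length.append(length[p] + 1)
                link.append(link[q])
                while p != -1 and trans[p].get(ch) == q:
                    trans[p][ch] = clone
                    p = link[p]
                link[q] = clone
                link[cur] = clone
        last = cur
    # Terminal states are those on the suffix-link path from the last state.
    terminal: Set[int] = set()
    p = last
    while p != -1:
        terminal.add(p)
        p = link[p]
    return trans, terminal


def walk(trans: List[Dict[str, int]], word: str) -> Optional[int]:
    """Follow `word` from the initial state; None if it is not a factor."""
    state: Optional[int] = 0
    for ch in word:
        state = trans[state].get(ch)
        if state is None:
            return None
    return state


def is_factor(trans: List[Dict[str, int]], word: str) -> bool:
    return walk(trans, word) is not None


def is_suffix(trans: List[Dict[str, int]], terminal: Set[int], word: str) -> bool:
    state = walk(trans, word)
    return state is not None and state in terminal
```

With `trans, terminal = build_suffix_automaton("GCAGAGAG"[::-1])`, the call `is_suffix(trans, terminal, "ACG")` is true because `GCA`, read backwards, is a prefix of P, while `is_factor(trans, "CGA")` is false because `AGC` does not occur in P.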

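  7. When there is a match, how do we move the window? [Figure: text T and pattern P]

  8. [Figure: text T and pattern P]

  9. Find the longest suffix of W that is also a prefix of the pattern. [Figure: text T and pattern P]

  10. [Figure: text T and pattern P]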
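One way to read the shift after a match: the window equals P, so the relevant suffix is the longest proper suffix of P that is also a prefix of P (often called a border). The sketch below assumes that reading, which agrees with the shift of 7 (8 - 1) shown after the match in the worked example; `longest_border` and `shift_after_match` are hypothetical names, and the brute-force search is for illustration only.

```python
def longest_border(pattern: str) -> int:
    """Length of the longest proper suffix of `pattern` that is also a prefix."""
    m = len(pattern)
    for k in range(m - 1, 0, -1):
        if pattern[:k] == pattern[m - k:]:
            return k
    return 0


def shift_after_match(pattern: str) -> int:
    """Assumed shift once the window has matched the pattern: |P| - border."""
    return len(pattern) - longest_border(pattern)
```

For P = GCAGAGAG the only border is `G`, so `shift_after_match("GCAGAGAG")` returns 7.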

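  11. A Whole Example • T = GCATCGCAGAGAGTATACAGTACG • P = GCAGAGAG • First attempt: [Figure: text T and pattern P] Shift by: 5 (8 - 3)

  12. Second attempt: [Figure: text T and pattern P] Shift by: 7 (8 - 1)

  13. Third attempt: [Figure: text T and pattern P] Shift by: 7 (8 - 1)

  14. Third attempt: [Figure: text T and pattern P]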
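The example can be replayed with a short Python sketch. It follows the right-to-left scan of the Reverse Factor idea but, to stay short, it tests "is this suffix a factor of P" with Python's `in` operator instead of a suffix automaton, so it does not have the running time of the real algorithm. The function name `reverse_factor_search` and its return format are assumptions made for this illustration.

```python
from typing import List, Tuple


def reverse_factor_search(text: str, pattern: str) -> Tuple[List[int], List[int]]:
    """Return (match_positions, shifts) for a simplified Reverse Factor scan."""
    m, n = len(pattern), len(text)
    if m == 0:
        raise ValueError("pattern must be non-empty")
    matches: List[int] = []
    shifts: List[int] = []
    pos = 0
    while pos <= n - m:
        i = m     # the scanned suffix of the window is text[pos + i : pos + m]
        best = 0  # longest proper suffix of the window that is a prefix of P
        # Extend the suffix one character to the left while it stays a factor.
        # (Stand-in for following a transition in the suffix automaton.)
        while i > 0 and text[pos + i - 1:pos + m] in pattern:
            i -= 1
            if pattern.startswith(text[pos + i:pos + m]):
                if i == 0:
                    matches.append(pos)  # the whole window equals the pattern
                else:
                    best = m - i
        shift = m - best
        shifts.append(shift)
        pos += shift
    return matches, shifts


matches, shifts = reverse_factor_search("GCATCGCAGAGAGTATACAGTACG", "GCAGAGAG")
print(matches, shifts)  # [5] [5, 7, 7]
```

The three shifts 5, 7 and 7 agree with the attempts above, and the match is reported at position 5 (counting from 0), which is the window of the second attempt.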

  15. Conclusion • The preprocessing phase takes O(m) time, where m = |P|. • The searching phase takes O(mn) time in the worst case, where n = |T|.

