
Smith Algorithm

Smith Algorithm. Experiments with a very fast substring search algorithm, SMITH P.D., Software - Practice & Experience 21(10), 1991, pp. 1065-1074. Adviser: R. C. T. Lee. Speaker: C. W. Cheng, National Chi Nan University.


  1. Smith Algorithm • Experiments with a very fast substring search algorithm, SMITH P.D., Software - Practice & Experience 21(10), 1991, pp. 1065-1074. • Adviser: R. C. T. Lee • Speaker: C. W. Cheng, National Chi Nan University

  2. Problem Definition • Input: a text string T of length n and a pattern string P of length m. • Output: all occurrences of P in T.

  3. Definition • Ts : the first character of the text T that is aligned with the pattern P. • Pl : the first character of the pattern P that is aligned with the text T. • Tj : the character at the jth position of the text T. • Pi : the character at the ith position of the pattern P. • Pf : the last character of the pattern P. • n : the length of T. • m : the length of P.

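  4. Rule 2-2: 1-Suffix Rule (A Special Version of Rule 2) • Consider the 1-suffix x, the last character of the window of T. • Rule 2-2 shifts P so that the rightmost x in P, to the left of the last position of P, is aligned with x. If P has no such x, P is shifted past x.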

  5. Introduction • The Smith algorithm takes the maximum of the Horspool shift function and the Quick Search shift function. • It uses Rule 2-2, the 1-Suffix Rule.
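
The two shift functions named above can be sketched in Python as follows. This is a minimal sketch, not code from the presentation: the function names `horspool_table`, `quick_search_table` and `smith_shift` are hypothetical, and the tables are stored as dictionaries with a default for characters that do not occur in P (m for Horspool, m+1 for Quick Search).

```python
def horspool_table(pattern):
    """Horspool shifts: distance from the rightmost occurrence of each
    character in pattern[0..m-2] to the last position of the pattern."""
    m = len(pattern)
    # Later (rightmost) occurrences overwrite earlier ones.
    return {c: m - 1 - i for i, c in enumerate(pattern[:-1])}


def quick_search_table(pattern):
    """Quick Search shifts: distance from the rightmost occurrence of each
    character in the whole pattern to the position just past its end."""
    m = len(pattern)
    return {c: m - i for i, c in enumerate(pattern)}


def smith_shift(pattern, last_char, next_char):
    """Shift for one window: last_char is the last character of the window,
    next_char is the text character immediately after the window."""
    m = len(pattern)
    hp = horspool_table(pattern).get(last_char, m)          # default: m
    qs = quick_search_table(pattern).get(next_char, m + 1)  # default: m + 1
    return max(hp, qs)
```

For P=CAGAGAG this gives hpBC[A]=1, hpBC[G]=2, qsBC[A]=2 and qsBC[G]=1, with 7 and 8 as the respective defaults for a character such as T that is absent from P.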

  6. Smith Algorithm • This algorithm is almost the same as the Quick Search algorithm, except that the last character of the window is also considered. • If the last character of the window gives a larger shift than the Quick Search rule does, that shift is used; otherwise the Quick Search shift is used.
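
A possible Python rendering of the search loop described above is given here. It is a sketch under stated assumptions rather than the implementation from the paper: the name `smith_search` is hypothetical, match positions are 0-based, the window is compared with a plain slice comparison, and the loop stops when no text character follows the window, since the Quick Search rule needs that character.

```python
def smith_search(text, pattern):
    """Return the 0-based start positions of all occurrences of pattern."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []

    # Horspool table (default m) and Quick Search table (default m + 1).
    hp = {c: m - 1 - i for i, c in enumerate(pattern[:-1])}
    qs = {c: m - i for i, c in enumerate(pattern)}

    matches = []
    j = 0  # start of the current window in text
    while j <= n - m:
        if text[j:j + m] == pattern:
            matches.append(j)
        if j + m >= n:
            break  # no character after the window, so this is the last window
        hp_shift = hp.get(text[j + m - 1], m)   # last character of the window
        qs_shift = qs.get(text[j + m], m + 1)   # character after the window
        j += max(hp_shift, qs_shift)
    return matches
```

Both shifts are safe on their own, so taking the larger one never skips an occurrence.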

  7. Example • Text string T=GCGCAGAGAGTAGAGAGTACG (n=21) • Pattern string P=CAGAGAG (m=7) • Positions are counted from 1.

  8. Example • The window is T1..T7=GCGCAGA. Comparing it with P=CAGAGAG gives a mismatch.

  9. Example • The last character of the window is T7=A and the character after the window is T8=G. • hpBC[A]=1, qsBC[G]=1, shift=max(1, 1)=1

  10. Example • P moves right by 1. The window is now T2..T8=CGCAGAG.

  11. Example • Comparing CGCAGAG with P=CAGAGAG gives a mismatch.

  12. Example • The last character of the window is T8=G and the character after the window is T9=A. • hpBC[G]=2, qsBC[A]=2, shift=max(2, 2)=2

  13. Example • P moves right by 2. The window is now T4..T10=CAGAGAG.

  14. Example • Comparing CAGAGAG with P=CAGAGAG gives an exact match: P occurs at position 4 of T.

  15. Example • The last character of the window is T10=G and the character after the window is T11, the letter T, which does not occur in P. • hpBC[G]=2, qsBC[T]=8, shift=max(2, 8)=8

  16. Example • P moves right by 8. The window is now T12..T18=AGAGAGT.

  17. Example • Comparing AGAGAGT with P=CAGAGAG gives a mismatch.

  18. Example • The last character of the window is T18, the letter T, and the character after the window is T19=A. • hpBC[T]=7, qsBC[A]=2, shift=max(7, 2)=7

  19. Example • The next window would start at T19, but only 3 characters of T remain, fewer than m=7, so the search ends. • P occurs once in T, at position 4.
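
The walk-through above can be checked mechanically. The following Python sketch records, for every window, its start position, whether it matched, and the two candidate shifts. The name `smith_trace` is hypothetical, and window starts are 0-based, so start 3 corresponds to the fourth character of T.

```python
def smith_trace(text, pattern):
    """Return one tuple (start, matched, hp_shift, qs_shift) per window."""
    n, m = len(text), len(pattern)
    hp = {c: m - 1 - i for i, c in enumerate(pattern[:-1])}
    qs = {c: m - i for i, c in enumerate(pattern)}

    steps = []
    j = 0
    while 0 < m <= n - j:
        matched = text[j:j + m] == pattern
        if j + m >= n:
            # No character follows the window, so no shift is computed.
            steps.append((j, matched, None, None))
            break
        hp_shift = hp.get(text[j + m - 1], m)
        qs_shift = qs.get(text[j + m], m + 1)
        steps.append((j, matched, hp_shift, qs_shift))
        j += max(hp_shift, qs_shift)
    return steps


if __name__ == "__main__":
    for step in smith_trace("GCGCAGAGAGTAGAGAGTACG", "CAGAGAG"):
        print(step)
```

For the example strings the trace contains four windows, starting at 0, 1, 3 and 11, with shifts 1, 2, 8 and 7; only the window at 3 matches.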

  20. Time complexity • The preprocessing phase takes O(m+σ) time and O(σ) space, where σ is the size of the alphabet. • The searching phase takes O(mn) time in the worst case.
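
To illustrate the O(mn) bound, the sketch below counts character comparisons. It assumes a simple left-to-right comparison of each window, which the slides do not specify, and the name `count_comparisons` is hypothetical. When T and P consist of one repeated character, every window matches in full and every shift is 1, so the count is (n-m+1)·m.

```python
def count_comparisons(text, pattern):
    """Count character comparisons made by a Smith-style search that
    compares each window left to right and stops at the first mismatch."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return 0
    hp = {c: m - 1 - i for i, c in enumerate(pattern[:-1])}
    qs = {c: m - i for i, c in enumerate(pattern)}

    comparisons = 0
    j = 0
    while j <= n - m:
        for i in range(m):
            comparisons += 1
            if text[j + i] != pattern[i]:
                break
        if j + m >= n:
            break  # last possible window
        j += max(hp.get(text[j + m - 1], m), qs.get(text[j + m], m + 1))
    return comparisons


# Worst case: 16 windows of 5 comparisons each.
worst = count_comparisons("a" * 20, "a" * 5)
# Favourable case: one comparison per window, shift of m + 1 = 6 each time.
best = count_comparisons("b" * 20, "a" * 5)
```

With n=20 and m=5, `worst` is 80 and `best` is 3.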

  21. Reference • [KMP77] Fast pattern matching in strings, D. E. Knuth, J. H. Morris, Jr and V. B. Pratt, SIAM J. Computing, 6, 1977, pp. 323–350. • [BM77] A fast string search algorithm, R. S. Boyer and J. S. Moore, Comm. ACM, 20, 1977, pp. 762–772. • [S90] A very fast substring search algorithm, D. M. Sunday, Comm. ACM, 33, 1990, pp. 132–142. • [RR89] The Rand MH Message Handling system: User’s Manual (UCI Version), M. T. Rose and J. L. Romine, University of California, Irvine, 1989. • [S82] A comparison of three string matching algorithms, G. De V. Smith, Software - Practice & Experience, 12, 1982, pp. 57–66. • [HS91] Fast string searching, HUME A. and SUNDAY D.M., Software - Practice & Experience 21(11), 1991, pp. 1221-1248. • [S94] String Searching Algorithms, Stephen, G.A., World Scientific, 1994. • [ZT87] On improving the average case of the Boyer-Moore string matching algorithm, ZHU, R.F. and TAKAOKA, T., Journal of Information Processing 10(3), 1987, pp. 173-177. • [R92] Tuning the Boyer-Moore-Horspool string searching algorithm, RAITA T., Software - Practice & Experience, 22(10), 1992, pp. 879-884. • [S94] On tuning the Boyer-Moore-Horspool string searching algorithms, SMITH, P.D., Software - Practice & Experience, 24(4), 1994, pp. 435-436. • [BR92] Average running time of the Boyer-Moore-Horspool algorithm, BAEZA-YATES, R.A., RÉGNIER, M., Theoretical Computer Science 92(1), 1992, pp. 19-31. • [H80] Practical fast searching in strings, HORSPOOL R.N., Software - Practice & Experience, 10(6), 1980, pp. 501-506. • [L95] Experimental results on string matching algorithms, LECROQ, T., Software - Practice & Experience 25(7), 1995, pp. 727-765.

  22. Thank you for listening
