
The Galil-Giancarlo algorithm

The Galil-Giancarlo algorithm. Galil, Z. and Giancarlo, R., On the exact complexity of string matching: upper bounds, SIAM Journal on Computing, Vol. 21, No. 3, 1992, pp. 407-437. Advisor: Prof. R. C. T. Lee. Speaker: S. Y. Tang.


Presentation Transcript


  1. The Galil-Giancarlo algorithm. Galil, Z. and Giancarlo, R., On the exact complexity of string matching: upper bounds, SIAM Journal on Computing, Vol. 21, No. 3, 1992, pp. 407-437. Advisor: Prof. R. C. T. Lee. Speaker: S. Y. Tang.

  2. The Galil-Giancarlo algorithm solves the string matching problem. • String matching problem: Input: a text string T of length n and a pattern string P of length m. Output: all occurrences of P in T.
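To make the input/output specification concrete, here is a minimal brute-force sketch in Python. It is not the Galil-Giancarlo algorithm; it only serves as a reference answer that any string matcher should reproduce. The function name `find_occurrences` and the 0-based positions are assumptions made for illustration.

```python
def find_occurrences(text, pattern):
    """Return every 0-based position j such that text[j:j+m] == pattern.

    Brute-force reference (O(n*m) comparisons in the worst case), used
    only to illustrate the problem statement, not the GG algorithm.
    """
    n, m = len(text), len(pattern)
    return [j for j in range(n - m + 1) if text[j:j + m] == pattern]


if __name__ == "__main__":
    # The pattern GGAG occurs once in this text, at position 3.
    print(find_occurrences("GCAGGAGCAGCA", "GGAG"))
    # Overlapping occurrences are all reported.
    print(find_occurrences("aaaa", "aa"))
```

A linear-time matcher such as the GG algorithm must return the same list of positions while performing far fewer character comparisons.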

  3. The Galil-Giancarlo algorithm (GG algorithm for short) improves the worst case of the Colussi algorithm. • The GG algorithm has two phases: preprocessing and searching. • The preprocessing phase is the same as in the Colussi algorithm. • In the searching phase, the GG algorithm adds 5 cases that determine how to jump; this is the difference between the GG algorithm and the Colussi algorithm.

  4. Cases in which the GG algorithm is not used. • Case 1: The pattern has only one period. The entire window is skipped. There is no way to know whether a prefix of the window is equal to a prefix of the pattern. • Example: T = GCAGCGGGAC, P = GGAGC; a mismatch occurs and P is shifted.

  5. Case 2: A prefix of the pattern is already known to be equal to a prefix of the window. • Example 1: T = GGACGGAACGCA, P = GGAGGGA; a mismatch occurs and P is shifted. • Example 2: T = GCAGGAGCAGCA, P = GGAGGAG; a mismatch occurs and P is shifted.

  6. Cases of the GG algorithm (the k and l values are those shown in the figures).
     • Case 1: l > k (figure: k = 2, l = 5). Shift.
     • Case 2: l = k and p[l+1] ≠ t[j+k] (figure: k = 3, l = 3). Shift.
     • Case 3: l < k and p[l+1] ≠ t[j+k] (figure: k = 5, l = 2). Shift.

  7. Cases of the GG algorithm, continued.
     • Case 4: l = k and p[l+1] = t[j+k] (figure: k = 3, l = 3). No shift is needed.
     • Case 5: l < k and p[l+1] = t[j+k] (figure: k = 5, l = 3). Shift.
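The five conditions above can be read as a small decision table. The following Python sketch encodes only that table: given l, k, and whether p[l+1] equals t[j+k], it returns the case number from the slides. The function name `gg_case` and the boolean parameter `next_chars_equal` are hypothetical, and the sketch does not compute the shift lengths, which the slides give only in figures.

```python
def gg_case(l, k, next_chars_equal):
    """Return which of the five GG cases applies (1..5).

    l, k             -- the quantities named l and k on the slides
    next_chars_equal -- True if p[l+1] == t[j+k], False otherwise
                        (hypothetical parameter; the caller performs
                        the character comparison)
    """
    if l > k:
        return 1                      # Case 1: l > k, shift
    if l == k:
        # Case 4 needs no shift; Case 2 shifts.
        return 4 if next_chars_equal else 2
    # Remaining possibility: l < k. Both cases shift.
    return 5 if next_chars_equal else 3


if __name__ == "__main__":
    # The (k, l) values shown in the slide figures for Cases 1 to 5.
    print(gg_case(5, 2, False))  # figure for Case 1
    print(gg_case(3, 3, False))  # figure for Case 2
    print(gg_case(2, 5, False))  # figure for Case 3
    print(gg_case(3, 3, True))   # figure for Case 4
    print(gg_case(3, 5, True))   # figure for Case 5
```

Testing l > k first mirrors the slides, where Case 1 does not depend on the comparison of p[l+1] with t[j+k].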

  8. Example (1/7): We first compare the noholes using phase 1 of the Colussi algorithm. On a mismatch, we shift according to Shift[i]; here Shift[4] = 4.

  9. Example (2/7): The comparison between T and P is a match.

  10. Example (3/7): After all noholes are matched, we compare the holes using phase 2 of the Colussi algorithm. On a mismatch, we shift according to Shift[i]; here Shift[0] = 5.

  11. Example (4/7): k = 2, l = 3. Here we shift using Case 1 of the GG algorithm, because the condition for using the GG algorithm is satisfied and l > k.

  12. Example (5/7): All noholes match, then a mismatch occurs and we shift with Shift[2] = 5. After checking the cases of the GG algorithm, we return to the Colussi algorithm.

  13. Example (6/7): k = 2, l = 3. Here we shift using Case 5 of the GG algorithm, because the condition for using the GG algorithm is satisfied and l < k.

  14. Example (7/7): An exact match is found. After checking the cases of the GG algorithm, we return to the Colussi algorithm.

  15. Time complexity • The preprocessing phase takes O(m) time and space. • The searching phase takes O(n) time. • The algorithm performs (4/3)n text character comparisons in the worst case.

  16. Conclusion • The Galil-Giancarlo algorithm is very similar to the Colussi algorithm. The Colussi algorithm performs very badly if the pattern starts and ends with a sequence of repetitions of the same symbol. For such patterns, the Colussi algorithm shifts by a single position and (3/2)n comparisons are actually performed. Galil and Giancarlo devised a way to avoid these single-position shifts.
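As a quick worked comparison of the two worst-case figures quoted in these slides, (3/2)n for the Colussi algorithm and (4/3)n for the GG algorithm, the Python sketch below evaluates both for a sample text length. The constant names, the helper `worst_case_comparisons`, and the choice n = 1200 are assumptions made for illustration; the arithmetic uses exact fractions to avoid rounding.

```python
from fractions import Fraction

# Worst-case factors quoted on the slides (comparisons per text character).
COLUSSI_FACTOR = Fraction(3, 2)
GG_FACTOR = Fraction(4, 3)


def worst_case_comparisons(n, factor):
    """Worst-case number of text character comparisons for a text of length n."""
    return factor * n


if __name__ == "__main__":
    n = 1200  # hypothetical text length, chosen so both figures are integers
    colussi = worst_case_comparisons(n, COLUSSI_FACTOR)
    gg = worst_case_comparisons(n, GG_FACTOR)
    # Relative saving of the GG figure over the Colussi figure.
    saving = 1 - GG_FACTOR / COLUSSI_FACTOR
    print(colussi, gg, saving)  # prints "1800 1600 1/9"
```

Under these figures, the GG algorithm saves one ninth of the comparisons that the Colussi algorithm performs in its worst case.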

  17. References • [B92] Breslauer, D., Efficient String Algorithmics, Ph.D. Thesis, Report CU-024-92, Computer Science Department, Columbia University, New York, NY, 1992. • [GG92] Galil, Z. and Giancarlo, R., On the exact complexity of string matching: upper bounds, SIAM Journal on Computing, Vol. 21, No. 3, 1992, pp. 407-437.
