Modern Information Retrieval

clovis

Presentation Transcript


  1. Modern Information Retrieval • Chapter 8: Indexing and Searching

  2. Sequential searching • brute force approach
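As a rough illustration of the brute-force approach, the sketch below tries every alignment of the pattern against the text and compares left to right. The function name `brute_force_search` and the choice to return 0-based start positions are assumptions made for this example, not something the slides specify.

```python
def brute_force_search(text, pattern):
    """Return the 0-based start index of every occurrence of pattern in text."""
    hits = []
    # Try each alignment where the whole pattern still fits inside the text.
    for start in range(len(text) - len(pattern) + 1):
        j = 0
        # Compare left to right until a mismatch or the end of the pattern.
        while j < len(pattern) and text[start + j] == pattern[j]:
            j += 1
        if j == len(pattern):
            hits.append(start)
    return hits


print(brute_force_search("abababac", "abac"))  # [4]
```

Every alignment is checked independently, so nothing learned from one attempt is reused in the next.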

  3. Knuth-Morris-Pratt approach • Left-to-right scan • Shifting rule • Figure: pattern a b a c slid along the text a b a b a b a c, shown at successive shifts
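One common way to realize the Knuth-Morris-Pratt shifting rule is a failure table that records, for each prefix of the pattern, the length of its longest proper prefix that is also a suffix. The sketch below follows that textbook formulation; the names `failure_table` and `kmp_search` are chosen for this example, and the slides may present the rule differently.

```python
def failure_table(pattern):
    """fail[i] = length of the longest proper prefix of pattern[:i+1]
    that is also a suffix of it."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def kmp_search(text, pattern):
    """Left-to-right scan; the text pointer never moves backwards."""
    if not pattern:
        return []
    fail = failure_table(pattern)
    hits = []
    k = 0  # number of pattern characters currently matched
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = fail[k - 1]  # shift the pattern, keep the matched prefix
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            hits.append(i - len(pattern) + 1)
            k = fail[k - 1]
    return hits


print(failure_table("abac"))               # [0, 0, 1, 0]
print(kmp_search("abababac", "abac"))      # [4]
```

On the slide's example, the mismatch of `c` against `b` reuses the already matched `a`, so the pattern shifts by two positions instead of one.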

  4. Boyer-Moore approach • Right-to-left scan • Bad character shift rule • Good suffix shift rule • Sub-linear time method • Typically examines fewer than m+n characters

  5. Right-to-left scan • Shift one place when a mismatch occurs • O(nm) • Example: T = xpbctbxabpqx, P = tpabxab
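The right-to-left scan on its own, with a shift of one place after each mismatch, can be sketched as follows. This is only the naive baseline that the shift rules later improve on; the function name `right_to_left_scan` and the 0-based result positions are assumptions of this example.

```python
def right_to_left_scan(text, pattern):
    """Compare the pattern right to left at each alignment and
    shift by exactly one place after a mismatch (or a match)."""
    hits = []
    start = 0  # 0-based index in text of the pattern's left end
    while start <= len(text) - len(pattern):
        j = len(pattern) - 1
        # Scan from the right end of the pattern towards the left.
        while j >= 0 and pattern[j] == text[start + j]:
            j -= 1
        if j < 0:
            hits.append(start)
        start += 1  # always shift one place: this is what gives O(nm)
    return hits


print(right_to_left_scan("xpbctbxabpqx", "tpabxab"))  # []
```

The slide's example text contains no occurrence of the pattern, so the result is empty; the scan direction matters only once larger shifts are introduced.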

  6. Bad character rule • R(x): right-most position in P of each character x • Mismatch at position i of P against text character T(k) = y, so R(T(k)) = R(y) • If R(y) < i, shift P right by i-R(y) positions

  7. Bad character rule (continued) • If R(y) > i, shift 1 position • If R(y) = 0 (y does not occur in P), shift i positions, so that P moves entirely past T(k)
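The bad character rule can be sketched as a right-to-left scan that shifts by max(1, i - R(y)). In the code below, positions in P are 1-based to match R(x) and i as used in the rule, while the returned match positions are 0-based; the names `rightmost_positions` and `bad_character_search` are hypothetical, and the good suffix rule is deliberately left out.

```python
def rightmost_positions(pattern):
    """R(x): 1-based right-most position of each character x in the pattern.
    Characters that do not occur are treated as R(x) = 0 by the caller."""
    # Later positions overwrite earlier ones, leaving the right-most.
    return {ch: pos for pos, ch in enumerate(pattern, start=1)}


def bad_character_search(text, pattern):
    """Right-to-left scan using only the bad character shift rule."""
    n = len(pattern)
    if n == 0:
        return []
    R = rightmost_positions(pattern)
    hits = []
    start = 0  # 0-based index in text of the pattern's left end
    while start <= len(text) - n:
        i = n  # 1-based position in the pattern, scanned right to left
        while i >= 1 and pattern[i - 1] == text[start + i - 1]:
            i -= 1
        if i == 0:
            hits.append(start)
            start += 1
        else:
            y = text[start + i - 1]  # the mismatching text character T(k)
            # R(y) < i: shift i - R(y); R(y) > i: shift 1; R(y) = 0: shift i.
            start += max(1, i - R.get(y, 0))
    return hits


print(bad_character_search("abababac", "abac"))  # [4]
```

Taking the maximum with 1 covers the case R(y) > i, where i - R(y) would be negative and the rule falls back to a single-position shift.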

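  8. The strong good suffix rule • Figure: a suffix t of P matches T; the next character to the left is x in T and y in P (a mismatch) • P is shifted so that t', the right-most other copy of t in P whose preceding character z differs from y, lines up with t in T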

  9. The strong good suffix rule (continued) • Figure: x t in T aligned against repeated occurrences of y t in P
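A minimal sketch of the strong good suffix rule is given below. To keep it easy to check, the shift table is computed directly from the definition (the smallest shift that keeps the matched suffix consistent and puts a different character at the mismatch position) rather than with the usual linear-time preprocessing, so the preprocessing here is slow and meant for illustration only. The names `good_suffix_shifts` and `good_suffix_search` are hypothetical, and indices are 0-based.

```python
def good_suffix_shifts(pattern):
    """shifts[j + 1] = shift after a mismatch at pattern index j
    (suffix pattern[j+1:] matched); shifts[0] = shift after a full match."""
    n = len(pattern)
    shifts = [0] * (n + 1)
    for j in range(-1, n):
        for s in range(1, n + 1):
            # The matched suffix must agree with the pattern shifted by s,
            # wherever the shifted pattern still overlaps it.
            suffix_ok = all(pattern[k - s] == pattern[k]
                            for k in range(j + 1, n) if k - s >= 0)
            # "Strong" condition: the character now facing the mismatch
            # position must differ from the one that just failed.
            differs = j - s < 0 or pattern[j - s] != pattern[j]
            if suffix_ok and differs:
                shifts[j + 1] = s
                break  # s = n always qualifies, so a shift is always found
    return shifts


def good_suffix_search(text, pattern):
    """Right-to-left scan using only the strong good suffix rule."""
    n = len(pattern)
    if n == 0:
        return []
    shifts = good_suffix_shifts(pattern)
    hits = []
    start = 0
    while start <= len(text) - n:
        j = n - 1
        while j >= 0 and pattern[j] == text[start + j]:
            j -= 1
        if j < 0:
            hits.append(start)
        start += shifts[j + 1]  # j == -1 selects the full-match shift
    return hits


print(good_suffix_search("abababac", "abac"))  # [4]
```

A full Boyer-Moore search would take the larger of this shift and the bad character shift at each mismatch.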

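  10. Shift-Or approach • An example of the shift-or algorithm for p=aab and s=abcaaab

      Table T (one column per character, one row per position of p; 0 marks a match):

             a  b  c
         a   0  1  1
         a   0  1  1
         b   1  0  1

      State E after each character of s, starting from E = 1 1 1 and updating E = S(E) | T[char]
      (bits listed for pattern positions 1, 2, 3):

         char   S(E)     T[char]   E
         a      0 1 1    0 0 1     0 1 1
         b      0 0 1    1 1 0     1 1 1
         c      0 1 1    1 1 1     1 1 1
         a      0 1 1    0 0 1     0 1 1
         a      0 0 1    0 0 1     0 0 1
         a      0 0 0    0 0 1     0 0 1
         b      0 0 0    1 1 0     1 1 0

      The last bit of E is 0 after the final b, which signals an occurrence of p ending there.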
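The shift-or update E = S(E) | T[char] maps naturally onto integer bit operations. The sketch below stores the state in a Python integer, with bit i standing for pattern position i+1 and a 0 bit meaning "still matching"; the function name `shift_or_search` and the 0-based result positions are assumptions of this example.

```python
def shift_or_search(text, pattern):
    """Shift-or search; returns 0-based start positions of matches."""
    m = len(pattern)
    if m == 0:
        return []
    all_ones = (1 << m) - 1
    # T[c]: bit i is 0 exactly when pattern[i] == c.
    masks = {}
    for i, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, all_ones) & ~(1 << i)
    state = all_ones  # E = 1 1 1 ... 1: nothing matched yet
    hits = []
    for pos, ch in enumerate(text):
        # S(E): shift in a 0 at the first position, then OR with T[ch].
        state = ((state << 1) | masks.get(ch, all_ones)) & all_ones
        # A 0 in the last position means the whole pattern ends at pos.
        if (state & (1 << (m - 1))) == 0:
            hits.append(pos - m + 1)
    return hits


print(shift_or_search("abcaaab", "aab"))  # [4]
```

Characters that never occur in the pattern get an all-ones mask, which resets every partial match in a single OR.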
