
Rabin-Karp algorithm


Presentation Transcript


  1. Rabin-Karp algorithm Robin Visser

  5. What is Rabin-Karp? • String searching algorithm • Uses hashing (e.g. hash(“answer”) = 42)

  8. Algorithm
     function RabinKarp(string s[1..n], string sub[1..m])
         hsub := hash(sub[1..m]); hs := hash(s[1..m])
         for i from 1 to n-m+1
             if hs = hsub
                 if s[i..i+m-1] = sub
                     return i
             hs := hash(s[i+1..i+m])
         return not found
     Naïve implementation: recomputing the hash from scratch at every position costs O(m), so the search runs in O(nm)
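The pseudocode above can be sketched in Python roughly as follows. This is only an illustration: the name `rabin_karp_naive` is hypothetical, Python's built-in `hash` stands in for the unspecified hash function, and indices are 0-based rather than the 1-based indices used on the slide.

```python
def rabin_karp_naive(s: str, sub: str) -> int:
    """Return the 0-based index of the first match of sub in s, or -1."""
    n, m = len(s), len(sub)
    if m > n:
        return -1
    hsub = hash(sub)  # stand-in for the slide's unspecified hash function
    for i in range(n - m + 1):
        # Hashing the window from scratch costs O(m) at every position,
        # which is what makes this version O(nm) overall.
        hs = hash(s[i:i + m])
        # Compare the strings only when the hashes agree.
        if hs == hsub and s[i:i + m] == sub:
            return i
    return -1
```

For example, `rabin_karp_naive("hello world", "world")` returns 6, and a pattern that does not occur returns -1.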

  11. Hash function • Use a rolling hash to compute the next hash value in constant time. Example: if the hash is the sum of the values of the characters in the substring, we get: hash(s[i+1..i+m]) = hash(s[i..i+m-1]) - hash(s[i]) + hash(s[i+m])
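As a small sketch of the additive example above, assuming each character's value is its code point (`ord`) and using a hypothetical helper name `additive_hashes`, every window hash after the first is obtained with one subtraction and one addition:

```python
def additive_hashes(s: str, m: int) -> list:
    """Hashes of every length-m window of s, where hash = sum of code points."""
    if m <= 0 or m > len(s):
        return []
    hs = sum(ord(c) for c in s[:m])  # first window: O(m)
    hashes = [hs]
    for i in range(len(s) - m):
        # Slide the window: drop s[i], take in s[i + m]. O(1) per step.
        hs = hs - ord(s[i]) + ord(s[i + m])
        hashes.append(hs)
    return hashes
```

`additive_hashes("abcd", 2)` gives `[195, 197, 199]`. Note that a plain sum cannot tell anagrams apart: `additive_hashes("abba", 2)` gives `[195, 196, 195]`, so "ab" and "ba" collide.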

  14. Hash function • A popular and effective hash function treats every substring as a number in some base, usually a large prime (e.g. if the substring is “IOI” and the base is 101, then with the ASCII codes I = 73 and O = 79 the hash value would be $73 \times 101^{2} + 79 \times 101^{1} + 73 \times 101^{0} = 752725$). Due to the limited size of the integer data type, modular arithmetic must be used to scale down the hash result.
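A minimal sketch of this polynomial hash with a rolling update might look like the following. The base 101 comes from the slide's example; the modulus `MOD` and the names `poly_hash` and `rabin_karp` are assumptions made for illustration, and indices are 0-based.

```python
BASE = 101            # base taken from the slide's example
MOD = 1_000_000_007   # assumed modulus; the slides do not specify one


def poly_hash(text: str) -> int:
    """Interpret text as a base-BASE number, reduced modulo MOD."""
    h = 0
    for ch in text:
        h = (h * BASE + ord(ch)) % MOD
    return h


def rabin_karp(s: str, sub: str) -> int:
    """Return the 0-based index of the first match of sub in s, or -1."""
    n, m = len(s), len(sub)
    if m == 0:
        return 0
    if m > n:
        return -1
    high = pow(BASE, m - 1, MOD)  # weight of the window's leading character
    hsub, hs = poly_hash(sub), poly_hash(s[:m])
    for i in range(n - m + 1):
        # Verify character by character only when the hashes agree.
        if hs == hsub and s[i:i + m] == sub:
            return i
        if i + m < n:
            # Drop s[i], shift by one digit, append s[i + m]: O(1) per step.
            hs = ((hs - ord(s[i]) * high) * BASE + ord(s[i + m])) % MOD
    return -1
```

Here `poly_hash("IOI")` reproduces the slide's value 752725 (it is smaller than the assumed modulus), and `rabin_karp("the IOI is fun", "IOI")` returns 4.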

  17. How is this useful • For single-pattern searching, Rabin-Karp is inferior to the KMP and Boyer-Moore algorithms; however, it can be used effectively for multiple-pattern searching. • We can create a variant that uses a Bloom filter or a set data structure to check whether the hash of a given substring belongs to the set of hash values of the patterns we are looking for.

  19. How is this useful Consider the following variant:
      function RabinKarpSet(string s[1..n], set of string subs, m):
          set hsubs := emptySet
          for each sub in subs
              insert hash(sub[1..m]) into hsubs
          hs := hash(s[1..m])
          for i from 1 to n-m+1
              if hs ∈ hsubs and s[i..i+m-1] ∈ subs
                  return i
              hs := hash(s[i+1..i+m])
          return not found
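The variant above could be sketched in Python as follows, under a few stated assumptions: every pattern in `subs` has length exactly `m`, the hash is the base-101 polynomial hash with an assumed modulus `MOD`, a Python `set` plays the role of the hash set, and indices are 0-based. The names `poly_hash` and `rabin_karp_set` are illustrative only.

```python
BASE = 101            # base from the earlier slide example
MOD = 1_000_000_007   # assumed modulus; not given in the slides


def poly_hash(text: str) -> int:
    """Interpret text as a base-BASE number, reduced modulo MOD."""
    h = 0
    for ch in text:
        h = (h * BASE + ord(ch)) % MOD
    return h


def rabin_karp_set(s: str, subs: set, m: int) -> int:
    """Return the first 0-based index where any length-m pattern occurs, or -1."""
    n = len(s)
    if m <= 0 or m > n or not subs:
        return -1
    hsubs = {poly_hash(sub) for sub in subs}  # one hash per pattern
    high = pow(BASE, m - 1, MOD)
    hs = poly_hash(s[:m])
    for i in range(n - m + 1):
        # Set membership is O(1) on average; confirm with a real comparison.
        if hs in hsubs and s[i:i + m] in subs:
            return i
        if i + m < n:
            hs = ((hs - ord(s[i]) * high) * BASE + ord(s[i + m])) % MOD
    return -1
```

For instance, `rabin_karp_set("abracadabra", {"cad", "dab"}, 3)` returns 4, the position of "cad", which occurs before "dab".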

  21. How is this useful • With k patterns, runs in O(n + k) expected time (treating the pattern length m as a constant), compared to O(nk) time when searching for each pattern individually. • Note that a hash table checks whether a substring hash equals any of the pattern hashes in O(1) time on average.

  22. Questions?
