
Randomized Algorithms CS648



Presentation Transcript


  1. Randomized Algorithms CS648, Lecture 12: Hashing - II

  2. Recap of Last Lecture

  3. Problem Definition • U = {0, 1, …, m−1}, called the universe • S ⊆ U and s = |S| Aim: Given a set S ⊆ U, build a data structure storing S such that we can answer in O(1) time: “Is i ∈ S?” for any given i ∈ U.

  4. Hashing • Hash table: T, an array of size n. • Hash function: h : U → {0, 1, …, n−1}. Answering a query “Is i ∈ S?”: • k ← h(i); • Search the list stored at T[k]. Properties of h: • h(i) is computable in O(1) time. • Space required by h: O(1). How many bits are needed to encode h?
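The query procedure on this slide can be sketched as follows. This is a minimal illustration, not the lecture's own code: the class name `ChainedHashTable`, the sample set `S`, and the choice h(i) = i mod 5 are assumptions made for the example.

```python
class ChainedHashTable:
    """Hash table with chaining: T is an array of n lists."""

    def __init__(self, keys, n, h):
        self.n = n
        self.h = h                          # hash function mapping a key to 0..n-1
        self.T = [[] for _ in range(n)]     # one (initially empty) list per cell
        for key in keys:
            self.T[h(key)].append(key)

    def contains(self, i):
        k = self.h(i)                       # k <- h(i)
        return i in self.T[k]               # search the list stored at T[k]


S = [3, 14, 15, 92, 65]                     # hypothetical example set
table = ChainedHashTable(S, n=5, h=lambda i: i % 5)
print(table.contains(92), table.contains(7))   # True False
```

The search cost for i is the length of the list at T[h(i)], which is why the number of collisions matters in what follows.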

  5. Collision Definition: Two elements i, j ∈ U are said to collide under hash function h if h(i) = h(j). Worst-case time complexity of searching for an item i: the number of elements of S colliding with i.

  6. Universal Hash Family Definition: A collection H of hash functions is said to be universal if there exists a constant c such that for any two distinct i, j ∈ U, P[h(i) = h(j)] ≤ c/n, where h is picked uniformly at random from H. This definition appears strange at first! But we shall soon see that there is a very natural way to arrive at it.

  7. Perfect hashing using O(s²) space Let H be a universal hash family. Let X: the number of collisions for S when h is picked uniformly at random from H. Question: What is E[X]?

  8. Perfect hashing using O(s²) space Let H be a universal hash family. Let X: the number of collisions for S when h is picked uniformly at random from H. Lemma 1: E[X] ≤ (c/n) · s(s−1)/2. Lemma 2: For n = c·s², there will be no collision with probability at least 1/2. Algorithm 1: Perfect hashing for S. Fix n ← c·s²; Repeat • Pick h uniformly at random from H; • t ← the number of collisions for S under h. Until t = 0. Build the hash table. Theorem: A perfect hash function can be computed for S in expected O(s²) time.
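Algorithm 1 can be sketched as below. This is a hedged illustration under stated assumptions: it borrows the family h_x(i) = ((x·i) mod p) mod n that the lecture constructs later, takes c = 2 for that family, and assumes every key is a non-negative integer smaller than the prime `P`. The names `random_hash`, `count_collisions`, and `perfect_hash` are hypothetical.

```python
import random

P = 2_147_483_647  # the prime 2^31 - 1; assumed larger than every key


def random_hash(n, p=P):
    """Pick h_x(i) = ((x * i) mod p) mod n with x uniform in 1..p-1."""
    x = random.randrange(1, p)
    return lambda i: ((x * i) % p) % n


def count_collisions(S, h, n):
    """Number of unordered pairs of S that land in the same cell."""
    sizes = [0] * n
    for key in S:
        sizes[h(key)] += 1
    return sum(k * (k - 1) // 2 for k in sizes)


def perfect_hash(S, c=2):
    """Repeat until a collision-free h is found for a table of size c * s^2."""
    n = max(1, c * len(S) ** 2)
    while True:
        h = random_hash(n)
        if count_collisions(S, h, n) == 0:   # Until t = 0
            return h, n


S = [12, 907, 4411, 65536, 1_000_003]        # hypothetical example set
h, n = perfect_hash(S)
T = [None] * n                               # build the hash table
for key in S:
    T[h(key)] = key
```

Each round succeeds with probability at least 1/2 by Lemma 2, so the expected number of rounds is at most 2.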

  9. Hashing with Optimal space And Worst case O(1) search time

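  10. Optimal space hashing with worst case O(1) search time Let H be a universal hash family. X: the number of collisions for S when h is picked uniformly at random from H. Lemma 1: E[X] ≤ (c/n) · s(s−1)/2. Question: What is E[X] when n = c·s? Answer: E[X] ≤ (s−1)/2.

  11. Optimal space hashing with worst case O(1) search time Let H be a universal hash family. X: the number of collisions for S when h is picked uniformly at random from H. Lemma 1: E[X] < s/2 when n = c·s. Algorithm: Fix n ← c·s; Repeat • Pick h uniformly at random from H; • t ← the number of collisions for S under h; Until t ≤ s; Build the hash table T; // primary hash table For each 0 ≤ k < n: If the size of list T[k] > 1: 1. Build a perfect hash table for list T[k]; 2. Make T[k] point to this hash table.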


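  15. Let H be a universal hash family. X: the number of collisions for S when h is picked uniformly at random from H. Lemma 1: E[X] < s/2 when n = c·s. Let s_k: the number of elements in T[k]. Is there any relation between t and the s_k's? Yes: t = Σ_k s_k(s_k − 1)/2. Extra space required: Σ_k c·s_k² = c·Σ_k s_k(s_k − 1) + c·Σ_k s_k = 2c·t + c·s ≤ 3c·s = O(s), because the primary hash function is accepted only when t ≤ s.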

  16. Theorem: A given set S of size s can be preprocessed in expected O(s) time to build a data structure (2-level hash table) of O(s) size such that any search query can be answered in worst-case O(1) time.
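The two-level construction behind this theorem can be sketched as follows. It is an illustration under assumptions, not the lecture's code: it uses the family h_x(i) = ((x·i) mod p) mod n with c = 2, assumes all keys are non-negative integers below the prime `P`, and for simplicity builds a secondary table for every nonempty cell (including cells of size 1). All function names are hypothetical.

```python
import random

P = 2_147_483_647  # prime assumed larger than every key
C = 2              # universality constant c of the family used here


def random_hash(n):
    x = random.randrange(1, P)
    return lambda i: ((x * i) % P) % n


def buckets(S, h, n):
    T = [[] for _ in range(n)]
    for key in S:
        T[h(key)].append(key)
    return T


def collisions(T):
    return sum(len(b) * (len(b) - 1) // 2 for b in T)


def build_two_level(S):
    s = len(S)
    n = max(1, C * s)
    # Primary table: retry until the number of collisions t is at most s.
    while True:
        h = random_hash(n)
        T = buckets(S, h, n)
        if collisions(T) <= s:
            break
    # Secondary tables: a collision-free table of size C * s_k^2 per cell.
    secondary = []
    for b in T:
        if not b:
            secondary.append(None)
            continue
        n_k = C * len(b) ** 2
        while True:
            g = random_hash(n_k)
            cells = buckets(b, g, n_k)
            if collisions(cells) == 0:
                break
        secondary.append((g, [cell[0] if cell else None for cell in cells]))
    return h, secondary


def contains(structure, i):
    """Worst-case O(1) search: two hash evaluations and one comparison."""
    h, secondary = structure
    entry = secondary[h(i)]
    if entry is None:
        return False
    g, cells = entry
    return cells[g(i)] == i


S = random.sample(range(10**6), 200)         # hypothetical example set
structure = build_two_level(S)
print(all(contains(structure, k) for k in S))   # True
```

Because the primary function is accepted only when t ≤ s, the secondary tables together occupy at most C·(2t + s) ≤ 3·C·s cells.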

  17. Why such a definition for Universal Hash family ?

  18. Why does hashing work so well in practice? A simple hash function: h(i) = i mod n. • h works so well in practice because the set S is usually a uniformly random subset of U. As a result, the elements of S are spread evenly over the table. • It is easy to fool this hash function such that it achieves O(s) search time. This makes us think: “Can we achieve expected O(1) search time for any given set S?” A similar question arose for Quick Sort, and the answer there was Randomized Quick Sort.

  19. Universal Hash Family A simple hash function: h(i) = i mod n. Definition: A collection H of hash functions is said to be universal if there exists a constant c such that for any two distinct i, j ∈ U, P[h(i) = h(j)] ≤ c/n, where h is picked uniformly at random from H.
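A small illustration of how the simple hash function can be fooled. The table size n = 10 and the set of multiples of n are assumptions chosen for the example.

```python
n = 10
S = [k * n for k in range(1, 8)]     # 10, 20, ..., 70: all multiples of n
T = [[] for _ in range(n)]
for key in S:
    T[key % n].append(key)           # h(i) = i mod n

longest = max(len(cell) for cell in T)
print(longest)                       # 7: every element of S lands in cell 0
```

Every key falls into the same list, so a search degenerates to scanning all s elements.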

  20. A simple and Compact Universal Hash family

  21. The starting point The simple hash function: h(i) = i mod n. Problem: Two elements i, j ∈ U are bound to collide if n divides |i − j|. Is there some operation which, when applied to any i ≠ j, distributes |i − j| uniformly at random over a range of integers?

  22. mod operation u, v: non-negative integers. p: a positive integer. u mod p ∈ {0, 1, …, p−1}. Question: How is |u mod p − v mod p| related to |u − v| mod p? Consider some examples: • |55 mod 31 − 43 mod 31| = 12 and |55 − 43| mod 31 = 12 • |91 mod 31 − 102 mod 31| = 20 and |91 − 102| mod 31 = 11 Answer: Let a = |u − v| mod p. Then |u mod p − v mod p| ∈ {a, p − a}.
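The two examples on this slide, and the general relation, can be checked with a short script. The helper names `mod_gap` and `check` are hypothetical, and the brute-force range of 200 is an arbitrary choice.

```python
def mod_gap(u, v, p):
    """|u mod p - v mod p|"""
    return abs(u % p - v % p)


def check(u, v, p):
    """True if |u mod p - v mod p| is a or p - a, where a = |u - v| mod p."""
    a = abs(u - v) % p
    return mod_gap(u, v, p) in (a, p - a)


print(mod_gap(55, 43, 31), abs(55 - 43) % 31)     # 12 12
print(mod_gap(91, 102, 31), abs(91 - 102) % 31)   # 20 11
print(all(check(u, v, 31) for u in range(200) for v in range(200)))   # True
```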

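  23. mod operation p: a prime number. [p−1] = {1, 2, …, p−1}. Consider any α ∈ [p−1]. Question: What can we say about the set M_α = {x·α mod p : x ∈ [p−1]}? Example: p = 7, α = 3. The values x·α mod p for x = 1, …, 6 are 3, 6, 2, 5, 1, 4.

  24. mod operation p: a prime number. [p−1] = {1, 2, …, p−1}. Consider any α ∈ [p−1]. Question: What can we say about the set M_α = {x·α mod p : x ∈ [p−1]}? Example: p = 7. For α = 3 the values are 3, 6, 2, 5, 1, 4; for α = 4 they are 4, 1, 5, 2, 6, 3. Fact: M_α = [p−1] for all α ∈ [p−1]. Proof: Suppose x·α mod p = y·α mod p for some x ≠ y in [p−1]. Then p divides (x − y)·α, so p divides (x − y) or p divides α, since p is prime. Neither is possible, because 0 < |x − y| < p and 0 < α < p. The same argument shows that no value is 0, so the p − 1 values are distinct elements of [p−1].

  25. mod operation p: a prime number. [p−1] = {1, 2, …, p−1}. Consider any α ∈ [p−1]. Define the set M_α = {x·α mod p : x ∈ [p−1]}. Fact: M_α = [p−1] for all α ∈ [p−1]. Question: If x is picked uniformly at random from [p−1], then what can we say about x·α mod p? Answer: It is distributed uniformly at random over [p−1]. Can you now see that the above answer plays the key role in formulating the hash function?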
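The fact on this slide can be checked directly for small primes. The function name `multiples_mod` is hypothetical, and the list of primes tested is an arbitrary choice.

```python
def multiples_mod(alpha, p):
    """The values x * alpha mod p for x = 1, ..., p-1, in that order."""
    return [(x * alpha) % p for x in range(1, p)]


print(multiples_mod(3, 7))   # [3, 6, 2, 5, 1, 4]
print(multiples_mod(4, 7))   # [4, 1, 5, 2, 6, 3]

# For a prime p, every alpha in 1..p-1 yields a permutation of 1..p-1.
for p in (5, 7, 11, 13):
    for alpha in range(1, p):
        assert sorted(multiples_mod(alpha, p)) == list(range(1, p))
```

For a composite modulus the property fails: with p = 6 and α = 2 the value 0 appears and values repeat.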

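  26. Good fact: An element i is mapped to a uniformly random element of {1, …, p−1}. Slightly bad fact: Once element i is mapped to a location, the mapping of j is no longer random. So it is not clear whether |i − j| is also mapped uniformly at random. So let us look at the hash function a bit more closely…

  27. Probability of collision between i and j Let h_x(i) = (i·x mod p) mod n. i and j will collide under h_x if |i·x mod p − j·x mod p| is divisible by n. Question: What is the relation between |i·x mod p − j·x mod p| and |i − j|·x mod p? Answer: |i·x mod p − j·x mod p| is either |i − j|·x mod p or p − (|i − j|·x mod p).

  28. Probability of collision between i and j Let h_x(i) = (i·x mod p) mod n, and let i ≠ j with |i − j| < p. Lemma: If i and j collide under h_x, then either |i − j|·x mod p is divisible by n or p − (|i − j|·x mod p) is divisible by n. Recall that {|i − j|·x mod p : x ∈ [p−1]} = {1, …, p−1}. Let x be picked uniformly at random from [p−1]. Probability of collision between i and j ≤ P(|i − j|·x mod p is divisible by n, or p − (|i − j|·x mod p) is divisible by n) ≤ 2·P(|i − j|·x mod p is divisible by n) ≤ 2/n. Students must realize that the condition in the lemma is a necessary condition and not a sufficient condition for collision. To get an idea, study the example given on the last slide of this lecture.

  29. Theorem: Let h_x(i) = (i·x mod p) mod n. Then H = {h_x | x ∈ [p−1]} is universal.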
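The 2/n bound can be checked exhaustively for a small case. This is a sanity check rather than a proof; the values p = 101 and n = 10 and the function names are assumptions made for the example, with the universe taken to be {0, …, p−1}.

```python
def h(x, i, p, n):
    """h_x(i) = ((x * i) mod p) mod n"""
    return ((x * i) % p) % n


def collision_fraction(i, j, p, n):
    """Fraction of x in 1..p-1 for which i and j collide under h_x."""
    hits = sum(1 for x in range(1, p) if h(x, i, p, n) == h(x, j, p, n))
    return hits / (p - 1)


p, n = 101, 10
worst = max(collision_fraction(i, j, p, n)
            for i in range(p) for j in range(i + 1, p))
print(worst <= 2 / n)   # True
```

The worst pair stays within 2/n, matching the lemma's conclusion that the condition is necessary but counts some values of x that do not actually cause a collision.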

  30. Example , . Observe that =1 Question: How many collisions between nd? Answer: two (for =3,4). Here for =4. And for =3 Answer:No collisions! (although for here.) 1 2 3 4 5 6 1 2 3 4 5 6 Table storing

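  31. Homework: Let h_{x,y}(i) = ((i·x + y) mod p) mod n. Then prove that H = {h_{x,y} | x ∈ [p−1], y ∈ {0, 1, …, p−1}} is universal. In particular, show that for any distinct i, j, P[h_{x,y}(i) = h_{x,y}(j)] ≤ 1/n. Hence it is slightly better than the hash family discussed just now.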