
CS 221 Guest lecture: Cuckoo Hashing

CS 221 Guest lecture: Cuckoo Hashing. Shannon Larson, March 11, 2011. Learning Goals: describe the cuckoo hashing principle; analyze the space and time complexity of cuckoo hashing; apply the insert and lookup algorithms in a cuckoo hash table; construct the graph for a cuckoo table.


Presentation Transcript


  1. CS 221 Guest lecture: Cuckoo Hashing • Shannon Larson • March 11, 2011

  2. Learning Goals • Describe the cuckoo hashing principle • Analyze the space and time complexity of cuckoo hashing • Apply the insert and lookup algorithms in a cuckoo hash table • Construct the graph for a cuckoo table

  3. Remember Graphs? • A set of nodes • A set of edges • Here:

  4. Graph Cycles • A graph cycle is a path of edges such that the first and last vertices are the same

  5. Recall Hashing • A hash function • Takes the target x • Hashes x to a bucket • Perfect hashing is ideal: • O(1) lookup • O(1) insert • Perfect hashing is not realistic!

  6. Cuckoo Hashing: the idea • Remember the cuckoo bird? • Shares a nest with other species… • …then kicks the other species out! • Same idea with cuckoo hashing • When we insert a new item, we “kick out” whatever occupies its nest • The evicted item then finds a new, alternate home

  7. Why is this cool? • Perfect hashing guarantees • O(1) lookup, O(1) insert • Cuckoo hashing guarantees • O(1) lookup • O(1) insert** • Other hashing strategies can’t guarantee this! • Also, it’s an option for your final project ** There’s a caveat here, but we’ll see it later

  8. Cuckoo Hashing: Two Nests • Suppose we have TWO hash tables • They each have a hash function • We prefer the first table, but if we have to move we’ll go to the second • If we’re in the second and have to move, we’ll go back to the first • This is our collision strategy for cuckoo hashing • Different from linear probing/open addressing • Different from trees

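  9. Cuckoo Hashing: Example • We want to insert x • There are no conflicts anywhere [Figure: tables holding x]

  10. Cuckoo Hashing: Example • Now we want to insert y • There are no conflicts anywhere [Figure: tables holding x and y]

  11. Cuckoo Hashing: Example • To insert z, the current occupant must first move to its other table [Figure: z collides with an occupied slot, labelled “oh no!”]

  12. Cuckoo Hashing: Example • Now we insert z into the freed slot [Figure: tables holding y, x, and z, labelled “NOW we’re fine!”]

  13. Cuckoo Hashing: Example • The final table after inserting x, y, z in order [Figure: tables holding y, x, and z]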
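
The two-table walk-through above can be sketched in Python. This is only an illustration under stated assumptions: the slides do not give the hash functions, so `h1`, `h2`, the table size `SIZE`, the `max_kicks` limit, and the integer keys are all hypothetical choices, and keys are assumed never to be `None`.

```python
SIZE = 11  # hypothetical table size; the slides do not give one


def h1(key):
    # Assumed hash function for table 1 (integer keys only).
    return key % SIZE


def h2(key):
    # Assumed hash function for table 2 (integer keys only).
    return (key // SIZE) % SIZE


def insert_two_tables(t1, t2, key, max_kicks=20):
    """Insert key, preferring table 1; evicted items hop to the other table.

    Returns None on success, or the item left without a home after
    max_kicks rounds (the caller would then rebuild the tables).
    """
    if t1[h1(key)] == key or t2[h2(key)] == key:
        return None  # already stored
    for _ in range(max_kicks):
        slot = h1(key)
        key, t1[slot] = t1[slot], key  # take the nest in table 1
        if key is None:
            return None  # the nest was empty, nothing was evicted
        slot = h2(key)
        key, t2[slot] = t2[slot], key  # evicted item moves to table 2
        if key is None:
            return None
    return key


t1 = [None] * SIZE
t2 = [None] * SIZE
for item in (20, 50, 53):
    insert_two_tables(t1, t2, item)
```

With these assumed hash functions, 20 and 53 both prefer the same slot of table 1, so inserting 53 kicks 20 out to its alternate slot in table 2.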

  14. Why two tables? • Two tables, one for each hash function • Simple to visualize, simple to implement • But, why two? • One table works just as well! • Just as simple to implement (all one table)

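  15. One Table Example • Let’s insert x again, with a single table • Again, the first hash function is preferred [Figure: table holding x]

  16. One Table Example • Now insert y • No conflicts, no problem [Figure: table holding x and y]

  17. One Table Example • Now insert z • But, another conflict, this time with y: oh no! [Figure: table holding x and y, with z waiting]

  18. One Table Example • First, move y to its alternate location [Figure: table holding x and y, with z waiting]

  19. One Table Example • Now we move z to the freed slot [Figure: table holding x, z, y]

  20. One Table Example • Final table after inserting x, y, z in order [Figure: table holding x, z, y]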

  21. Graph Representation • How can we represent our table? • Why not a graph? • Nodes are every possible table entry • Edges are inserted entries • This is a directed graph • Direction from current location TO alternate location
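
As a rough sketch of this representation, the function below lists one directed edge per stored item, from the slot it currently occupies to its alternate slot. The hash functions `h1` and `h2`, the size `SIZE`, and the integer keys are assumptions made for illustration; slots are numbered from 0 here rather than from 1 as in the slides.

```python
SIZE = 4  # hypothetical table size


def h1(key):
    # Assumed first hash function (integer keys only).
    return key % SIZE


def h2(key):
    # Assumed second hash function (integer keys only).
    return (key // SIZE) % SIZE


def cuckoo_graph(table, h1, h2):
    """Return directed edges (current_slot, alternate_slot, key).

    Nodes are the table slots; every stored key contributes one edge
    pointing from where it sits now to where it would go if displaced.
    """
    edges = []
    for slot, key in enumerate(table):
        if key is None:
            continue  # empty slots have no outgoing edge
        alternate = h2(key) if slot == h1(key) else h1(key)
        edges.append((slot, alternate, key))
    return edges


table = [None] * SIZE
table[0] = 8  # h1(8) == 0, h2(8) == 2
table[2] = 6  # h1(6) == 2, h2(6) == 1
edges = cuckoo_graph(table, h1, h2)
```

Following the edges from a slot traces the chain of displacements that an insert into that slot would start.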

  22. Graph Example • Remember our one-table example? [Figure: table with slots 1 to 4 holding x, z, y, next to its graph on nodes 1 to 4]

  23. Infinite Insert • Suppose we insert something, and we end up in an infinite loop • Or, “too many” displacements • Some pre-defined maximum based on table size

  24. Example: Loops • Remember our one-table example? [Figure: table with slots 1 to 4 holding x, z, y, next to its graph]

  25. Example: Loops • Let’s insert w: no conflicts still [Figure: w placed in slot 4]

  26. Example: Loops • Now let’s insert a: displace x [Figure: a arriving at the slot held by x]

  27. Example: Loops • Now x is placed, and z is displaced (put in 4) [Figure: table holding a, x, y, w, with z heading for slot 4]

  28. Example: Loops • Now z is placed, and w is displaced (put in 3) [Figure: table holding a, x, y, z, with w heading for slot 3]

  29. Example: Loops • Notice what happens to the graph • We keep going and going and going… [Figure: graph on nodes 1 to 4]

  30. Analysis: Loops • Remember infinite loops in a new insert? • In the graph, this is a closed loop • We might forever re-do the same displacements • The probability of getting a loop increases dramatically once we’ve inserted N/2 elements • N is the number of buckets (size of table) • This is from the research on cuckoo hashing

  31. Analysis: Loops • What can we do once we get a loop? • Rebuild, same size (ok solution) • Double table size (better solution) • We’ll need new hash functions for both
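
One way to sketch the "double the table and pick new hash functions" option is shown below. Nothing here comes from the slides beyond the idea itself: the hash family `((a*k + b) mod P) mod size`, the prime `P`, the helper names, and the cap of `size` displacements per insert are all assumptions, and keys are assumed to be distinct integers in the range 0 to P-1.

```python
import random

P = 2_147_483_647  # the prime 2**31 - 1; keys are assumed to be smaller


def make_hash(size, rng):
    """Draw a hash function from the family ((a*k + b) mod P) mod size."""
    a = rng.randrange(1, P)
    b = rng.randrange(P)
    return lambda key: ((a * key + b) % P) % size


def try_build(keys, size, rng):
    """Place all keys in a fresh table; return (table, h1, h2) or None."""
    h1, h2 = make_hash(size, rng), make_hash(size, rng)
    table = [None] * size
    for key in keys:
        pos = h1(key)
        for _ in range(size):  # assumed cap on displacements
            if table[pos] is None:
                table[pos] = key
                break
            key, table[pos] = table[pos], key  # evict the occupant
            pos = h2(key) if pos == h1(key) else h1(key)
        else:
            return None  # too many displacements: treat it as a loop
    return table, h1, h2


def rebuild(keys, size, rng):
    """Double the size and draw new hash functions until every key fits."""
    while True:
        size *= 2
        built = try_build(keys, size, rng)
        if built is not None:
            return built


table, h1, h2 = rebuild([3, 17, 20, 50, 53, 75, 100], 4, random.Random(0))
```

Rebuilding at the same size would use `try_build` in a loop without the doubling step; doubling also lowers the load, which makes another loop less likely.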

  32. Analysis • Lookup has O(1) time • At MOST two places to look, ever • One location per hash function • Insert has amortized O(1) time • Think of this as “in the long run” • In practice we see O(1) time insert • You’ll see amortized analysis in CPSC 320 • Remember the “grass and trees” analysis?

  33. Lookup: The Code • Return the position of x (either h1(x) or h2(x)) • Otherwise, return false
      lookup(x)
        return T[h1(x)] = x or T[h2(x)] = x
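
A minimal Python version of the lookup, assuming a one-table layout with integer keys and the hypothetical hash functions `h1` and `h2` below (the slides do not define them). It returns the slot index, or `None` instead of false, because slot 0 would otherwise look like a failed lookup.

```python
SIZE = 11  # hypothetical table size


def h1(key):
    # Assumed first hash function (integer keys only).
    return key % SIZE


def h2(key):
    # Assumed second hash function (integer keys only).
    return (key // SIZE) % SIZE


def lookup(table, key):
    """Return the slot holding key, or None if key is not in the table."""
    for slot in (h1(key), h2(key)):  # at most two places to look
        if table[slot] == key:
            return slot
    return None


table = [None] * SIZE
table[h2(20)] = 20  # pretend 20 was displaced to its alternate slot
found_at = lookup(table, 20)
```

The loop body runs at most twice regardless of how many items are stored, which is where the O(1) lookup bound comes from.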

  34. Insert: The Code • Given a table (array) T and item to insert:
      insert(x)
        if lookup(x) return;          // if it’s already here, done
        pos <- h1(x);                 // store h1(x)
        for i <- 1 to M               // loop at most M times
          if T[pos] empty
            T[pos] <- x
            return;                   // if T[pos] empty, done
          swap x and T[pos];          // put x in T[pos]
          if pos = h1(x)              // now we’re displacing
            pos <- h2(x)
          else
            pos <- h1(x)
        rehash();                     // if we couldn’t stop, rehash
        insert(x);                    // then insert currently displaced
      end
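
The pseudocode translates fairly directly into Python. The sketch below is one possible reading, not the lecture's own implementation: the class name `CuckooTable`, the hash functions, the choice of M equal to the table size, and the growth rule in `rehash` are assumptions, and keys are assumed to be non-negative integers.

```python
class CuckooTable:
    """One-table cuckoo hash set for non-negative integer keys (a sketch)."""

    def __init__(self, size=11):
        self.size = size
        self.slots = [None] * size

    def h1(self, key):
        # Assumed hash function; it changes whenever the size changes.
        return key % self.size

    def h2(self, key):
        # Assumed second hash function.
        return (key // self.size) % self.size

    def lookup(self, key):
        return self.slots[self.h1(key)] == key or self.slots[self.h2(key)] == key

    def insert(self, key):
        if self.lookup(key):
            return  # already here, done
        pos = self.h1(key)
        for _ in range(self.size):  # assumed limit M = table size
            if self.slots[pos] is None:
                self.slots[pos] = key
                return
            # Put key in the slot and pick up whatever was there.
            key, self.slots[pos] = self.slots[pos], key
            # Send the displaced item to its other location.
            pos = self.h2(key) if pos == self.h1(key) else self.h1(key)
        self.rehash()      # too many displacements
        self.insert(key)   # then insert the item still without a home

    def rehash(self):
        # Assumed growth rule: roughly double, then re-insert everything.
        old = [k for k in self.slots if k is not None]
        self.size = 2 * self.size + 1
        self.slots = [None] * self.size
        for k in old:
            self.insert(k)


t = CuckooTable(11)
for item in (20, 53, 50, 75, 100, 67, 105, 3, 36, 39):
    t.insert(item)
```

Because the hash functions here depend on the table size, growing the table also changes them, which stands in for choosing new hash functions on a rehash.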

  35. Analysis: Load Factor • What is load? • How full the table is on average (% full) • What about cuckoo hash tables? • For two hash functions, load factor of about 50% • Remember loops? • For three hash functions, we get about 91% • That’s pretty great, actually!
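
Load can be computed directly from the table. The helper names below are hypothetical, and the default limit of 0.5 is an assumption that reflects the usual guidance of keeping a two-hash-function cuckoo table under about half full.

```python
def load_factor(table):
    """Fraction of slots that currently hold an item."""
    return sum(item is not None for item in table) / len(table)


def over_limit(table, limit=0.5):
    """True when the table is fuller than the chosen limit (assumed 0.5)."""
    return load_factor(table) > limit


example = [7, None, 3, None]
current_load = load_factor(example)
```

A table could call `over_limit` before each insert and grow early rather than waiting for a loop.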

  36. More hash functions • What would this look like? • We would have three tables (simple case) • One hash function per table • Or, we would have two alternates (one table)

  37. More hash functions • What would this look like? • Each entry has TWO alternates, not one [Figure: table holding x, z, y, each with two alternate slots]

  38. More hash functions • When something comes in new (insert) • Put it in h1 • If it’s displaced, check h2 • If that’s full, go to h3 • To lookup, we just look in h1, h2, or h3 • Still constant time!
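
A sketch of the same idea for any number of hash functions follows. The hash functions in `hashes`, the table size, the `max_kicks` limit, and especially the eviction rule (rotate through the candidate slots) are assumptions; the slides do not say which occupant to evict when every candidate slot is full.

```python
SIZE = 7  # hypothetical table size

# Three assumed hash functions for integer keys: base-SIZE digits of the key.
hashes = [lambda key, i=i: (key // SIZE ** i) % SIZE for i in range(3)]


def lookup_d(table, key, hashes):
    """Check every candidate slot; the count is fixed, so this is O(1)."""
    return any(table[h(key)] == key for h in hashes)


def insert_d(table, key, hashes, max_kicks=32):
    """Insert key; return None on success or the item left without a home."""
    if lookup_d(table, key, hashes):
        return None
    for kick in range(max_kicks):
        positions = [h(key) for h in hashes]
        for pos in positions:  # take the first empty candidate slot
            if table[pos] is None:
                table[pos] = key
                return None
        # All candidates are full: evict one occupant (assumed rotation rule).
        victim = positions[kick % len(positions)]
        key, table[victim] = table[victim], key
    return key


table = [None] * SIZE
for item in (1, 8, 15):
    insert_d(table, item, hashes)
```

All three example keys share the same first slot, yet none is evicted, because each finds an empty slot among its alternates.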

  39. Even better load? • Currently we’ve only put one item per bucket • What if we had two cells per bucket? [Figure: buckets holding “x,w”, “z”, and “y,a”]

  40. Even better load? • Currently we’ve only put one item per bucket • What if we had two cells per bucket? • What about collision strategies? • Round-robin (cells take turns swapping out) • FIFO (oldest resident gets kicked out)
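
The FIFO strategy can be sketched with one `deque` per bucket, so the oldest resident is always at the left end. The hash functions, the number of buckets, and the `max_kicks` limit are assumptions made for this illustration, with integer keys throughout.

```python
from collections import deque

SIZE = 4  # hypothetical number of buckets


def h1(key):
    # Assumed first hash function (integer keys only).
    return key % SIZE


def h2(key):
    # Assumed second hash function (integer keys only).
    return (key // SIZE) % SIZE


def insert_bucketed(buckets, key, cells=2, max_kicks=32):
    """Insert key into buckets of `cells` cells, evicting the oldest (FIFO).

    Returns None on success or the item left without a home.
    """
    if key in buckets[h1(key)] or key in buckets[h2(key)]:
        return None  # already stored
    idx = h1(key)
    for _ in range(max_kicks):
        if len(buckets[idx]) < cells:
            buckets[idx].append(key)
            return None
        evicted = buckets[idx].popleft()  # oldest resident gets kicked out
        buckets[idx].append(key)
        key = evicted
        idx = h2(key) if idx == h1(key) else h1(key)
    return key


buckets = [deque() for _ in range(SIZE)]
for item in (1, 5, 9):
    insert_bucketed(buckets, item)
```

The first two items share a bucket without any eviction; only the third forces the oldest one out to its alternate bucket.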

  41. Even better load?

  42. Links & Resources • http://en.wikipedia.org/wiki/Cuckoo_hashing • http://www.ru.is/faculty/ulfar/CuckooHash.pdf • http://www.it-c.dk/people/pagh/papers/cuckoo-undergrad.pdf • No neat animations on the internet…yet! • Possible personal project? • Brownie points? • Pre-coop project?
