CS 221 Guest lecture: Cuckoo Hashing. Shannon Larson March 11, 2011. Learning Goals. Describe the cuckoo hashing principle Analyze the space and time complexity of cuckoo hashing Apply the insert and lookup algorithms in a cuckoo hash table Construct the graph for a cuckoo table.
CS 221 Guest lecture: Cuckoo Hashing • Shannon Larson • March 11, 2011
Learning Goals • Describe the cuckoo hashing principle • Analyze the space and time complexity of cuckoo hashing • Apply the insert and lookup algorithms in a cuckoo hash table • Construct the graph for a cuckoo table
Remember Graphs? • A set of nodes • A set of edges • Here:
Graph Cycles • A graph cycle is a path of edges such that the first and last vertices are the same
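A minimal Python sketch of checking a directed graph for a cycle with depth-first search. The adjacency-dictionary representation and the name `has_cycle` are assumptions made for illustration; they do not come from the lecture.

```python
def has_cycle(edges):
    """Return True if the directed graph contains a cycle.

    `edges` maps each node to a list of the nodes it points to.
    """
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current path, finished
    colour = {}

    def visit(node):
        colour[node] = GREY
        for nxt in edges.get(node, []):
            state = colour.get(nxt, WHITE)
            if state == GREY:              # we walked back onto the current path
                return True
            if state == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour.get(n, WHITE) == WHITE and visit(n) for n in list(edges))
```

For example, `has_cycle({1: [2], 2: [3], 3: [1]})` reports a cycle, because the path 1, 2, 3 returns to vertex 1.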
Recall Hashing • A hash function h • Takes the target x • Hashes x to a bucket h(x) • Perfect hashing is ideal: • O(1) lookup • O(1) insert • Perfect hashing is not realistic!
Cuckoo Hashing: the idea • Remember the cuckoo bird? • Shares a nest with other species… • …then kicks the other species out! • Same idea with cuckoo hashing • When we insert x, we “kick out” whatever occupies the nest, y • Then y finds a new, alternate home
Why is this cool? • Perfect hashing guarantees • O(1) lookup, O(1) insert • Cuckoo hashing guarantees • O(1) lookup • O(1) insert** • Other hashing strategies can’t guarantee this! • Also, it’s an option for your final project ** There’s a caveat here, but we’ll see it later
Cuckoo Hashing: Two Nests • Suppose we have TWO hash tables • They each have a hash function: h1 and h2 • We prefer h1, but if we have to move we’ll go to h2 • If we’re in h2 and have to move, we’ll go back to h1 • This is our collision strategy for cuckoo hashing • Different from linear probing/open addressing • Different from trees
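The two-nest rule can be sketched in Python as follows. This is only an illustration under stated assumptions: the table size `SIZE`, the two arithmetic hash functions, the `max_kicks` limit, and the choice to hand back the item left without a nest are all hypothetical, and the sketch does not check for duplicates.

```python
SIZE = 5  # hypothetical size of each of the two tables

def h1(x):
    return x % SIZE             # assumed hash function for table 1

def h2(x):
    return (x // SIZE) % SIZE   # assumed hash function for table 2

def insert(t1, t2, x, max_kicks=10):
    """Place x, bouncing evicted items between the two tables.

    Returns None on success, or the item left without a nest
    after max_kicks displacements.
    """
    tables, hashes = (t1, t2), (h1, h2)
    side = 0                                  # new items prefer table 1
    for _ in range(max_kicks):
        pos = hashes[side](x)
        x, tables[side][pos] = tables[side][pos], x   # x moves in, occupant moves out
        if x is None:                         # the nest was empty
            return None
        side = 1 - side                       # evicted item tries its other table
    return x
```

With two empty tables, inserting 1, 6 and 11 (all of which hash to slot 1 of the first table here) leaves 11 in the first table and pushes 1 and 6 into the second.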
Cuckoo Hashing: Example • We want to insert x • There are no conflicts anywhere [Figure: the tables now hold x]
Cuckoo Hashing: Example • Now we want to insert y • There are no conflicts anywhere [Figure: the tables now hold x and y]
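Cuckoo Hashing: Example • To insert z, we first have to move the current occupant of its nest to that occupant’s alternate location (oh no, a conflict!) [Figure: z arriving at an occupied nest]
Cuckoo Hashing: Example • Now we insert z into the freed nest (NOW we’re fine!) [Figure: the tables hold x, y and z]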
Cuckoo Hashing: Example • The final table after inserting x, y, z in order [Figure: the final tables holding x, y and z]
Why two tables? • Two tables, one for each hash function • Simple to visualize, simple to implement • But, why two? • One table works just as well! • Just as simple to implement (all one table)
One Table Example • Let’s insert x, y, z again, this time into a single table • Again, h1 is preferred [Figure: the table holds x]
One Table Example • Now insert y • No conflicts, no problem [Figure: the table holds x and y]
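One Table Example • Now insert z • But there is another conflict with an item already in the table: oh no! [Figure: the table holds x and y, with z arriving]
One Table Example • First, move the displaced item to its alternate location [Figure: x, y and z mid-move]
One Table Example • Now we move the incoming item into the freed slot [Figure: the table reads x, z, y]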
One Table Example • Final table after inserting x, y, z in order [Figure: the table reads x, z, y]
Graph Representation • How can we represent our table? • Why not a graph? • Nodes are every possible table entry • Edges are inserted entries • This is a directed graph • Direction from current location TO alternate location
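As a rough Python sketch of this representation, the function below lists one directed edge per stored item, from the slot it occupies to its alternate slot. The function name, the use of `None` for an empty slot, and the hash functions passed in as arguments are assumptions for illustration.

```python
def cuckoo_graph(table, h1, h2):
    """Return a list of (current_slot, alternate_slot, item) edges."""
    edges = []
    for slot, item in enumerate(table):
        if item is None:                  # empty slots are nodes with no outgoing edge
            continue
        # The alternate is whichever hash location the item is NOT using now.
        alternate = h2(item) if slot == h1(item) else h1(item)
        edges.append((slot, alternate, item))
    return edges
```

Every index of `table` is a node, so an empty table gives a graph with nodes and no edges.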
Graph Example • Remember our one-table example? [Figure: the table with slots 1 to 4 holding x, z and y, next to its graph on nodes 1 to 4]
Infinite Insert • Suppose we insert something, and we end up in an infinite loop • Or, “too many” displacements • Some pre-defined maximum based on table size
Example: Loops • Remember our one-table example? [Table: 1 = x, 2 = z, 3 = y, 4 = empty, next to its graph on nodes 1 to 4]
Example: Loops • Let’s insert w: no conflicts still [Table: 1 = x, 2 = z, 3 = y, 4 = w]
Example: Loops • Now let’s insert a: displace x [Table: a arrives at slot 1, which holds x]
Example: Loops • Now x is placed, and z is displaced (put in 4) [Table: 1 = a, 2 = x, 3 = y, with z arriving at slot 4, which holds w]
Example: Loops • Now z is placed, and w is displaced (put in 3) [Table: 1 = a, 2 = x, 4 = z, with w arriving at slot 3, which holds y]
Example: Loops • Notice what happens to the graph • We keep going and going and going… [Figure: the graph on nodes 1 to 4]
Analysis: Loops • Remember infinite loops in a new insert? • In the graph, this is a closed loop • We might forever re-do the same displacements • The probability of getting a loop increases dramatically once we’ve inserted N/2 elements • N is the number of buckets (size of table) • This is from the research on cuckoo hashing
Analysis: Loops • What can we do once we get a loop? • Rebuild, same size (ok solution) • Double table size (better solution) • We’ll need new hash functions for both
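A hedged Python sketch of the "double the table size" option. Everything named here is hypothetical: `make_hash` builds salted hash functions from Python's built-in `hash`, `max_kicks` stands in for the pre-defined displacement limit, and the items are assumed to be distinct and hashable.

```python
import random

def make_hash(size, salt):
    """Return a hypothetical salted hash function onto range(size)."""
    return lambda x: hash((salt, x)) % size

def try_build(items, size, rng, max_kicks=32):
    """Try to place every item in a fresh table; return None if we hit a loop."""
    h1 = make_hash(size, rng.random())    # new hash functions for every attempt
    h2 = make_hash(size, rng.random())
    table = [None] * size
    for x in items:
        pos = h1(x)
        for _ in range(max_kicks):
            x, table[pos] = table[pos], x           # place x, pick up the old occupant
            if x is None:
                break
            pos = h2(x) if pos == h1(x) else h1(x)  # evicted item's other slot
        else:
            return None                             # too many displacements
    return table, h1, h2

def rebuild_doubled(items, old_size, seed=0):
    """Double the table size (repeatedly if needed) until everything fits."""
    rng = random.Random(seed)
    size = old_size * 2
    while True:
        built = try_build(items, size, rng)
        if built is not None:
            return built
        size *= 2
```

Rebuilding at the same size would use the same `try_build` call with `old_size` in place of the doubled size.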
Analysis • Lookup has O(1) time • At MOST two places to look, ever • One location per hash function • Insert has amortized O(1) time • Think of this as “in the long run” • In practice we see O(1) time insert • You’ll see amortized analysis in CPSC 320 • Remember the “grass and trees” analysis?
Lookup: The Code • Report whether x is in one of its two positions (either h1(x) or h2(x)) • Otherwise, return false
lookup(x)
    return T[h1(x)] = x or T[h2(x)] = x
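The same lookup as a small Python sketch. Passing the table and hash functions as arguments, and returning the slot index (or `None` when the item is absent) rather than a boolean, are choices made here for illustration.

```python
def lookup(table, h1, h2, x):
    """Return the slot index holding x, or None if x is not in the table."""
    for pos in (h1(x), h2(x)):      # at most two places to look
        if table[pos] == x:
            return pos
    return None
```

Callers should compare the result with `None` rather than test its truthiness, since slot 0 is a valid position.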
Insert: The Code • Given a table (array) T and item x to insert:
insert(x)
    if lookup(x) return;          // if it’s already here, done
    pos <- h1(x);                 // store h1(x)
    for i <- 1 to M               // loop at most M times
        if T[pos] empty
            T[pos] <- x
            return;               // if T[pos] empty, done
        swap x and T[pos];        // put x in T[pos]; x is now the displaced item
        if pos = h1(x)            // send the displaced item to its other location
            pos <- h2(x)
        else
            pos <- h1(x)
    rehash();                     // if we couldn’t stop, rehash
    insert(x);                    // then insert currently displaced
end
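A Python sketch of the same loop for a single table. It departs from the slide in one labelled way: instead of calling `rehash()` itself, it returns the item that is still without a slot so the caller can rehash and re-insert it. The argument names and the use of `None` for an empty slot are assumptions.

```python
def insert(table, h1, h2, x, max_loop):
    """Try to insert x; return None on success or the item left without a slot."""
    if table[h1(x)] == x or table[h2(x)] == x:
        return None                       # already present, nothing to do
    pos = h1(x)
    for _ in range(max_loop):             # at most max_loop displacements
        if table[pos] is None:
            table[pos] = x
            return None
        x, table[pos] = table[pos], x     # x takes the slot, the old item is displaced
        pos = h2(x) if pos == h1(x) else h1(x)   # displaced item's other location
    return x                              # caller should rehash, then insert this item
```

Returning the homeless item keeps the sketch short; a complete table would rebuild with new hash functions at this point, as the pseudocode does.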
Analysis: Load Factor • What is load? • How full the table is on average (% full) • What about cuckoo hash tables? • For two hash functions, load factor is about 50% • Remember loops? • For three hash functions, we get about 91% • That’s pretty great, actually!
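For concreteness, the load factor of a one-item-per-slot table can be computed as below. This is a small illustrative sketch; representing an empty slot with `None` is an assumption.

```python
def load_factor(table):
    """Fraction of slots in the table that currently hold an item."""
    return sum(item is not None for item in table) / len(table)
```

A four-slot table holding two items has a load factor of 0.5.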
More hash functions • What would this look like? • We would have three tables (simple case) • One hash function per table • Or, we would have two alternates (one table)
More hash functions • What would this look like? • Each entry has TWO alternates, not one [Figure: a table holding x, z and y]
More hash functions • When something comes in new (insert) • Put it in h1(x) • If it’s displaced, check h2(x) • If that’s full, go to h3(x) • To lookup, we just look in h1(x), h2(x) or h3(x) • Still constant time!
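A Python sketch of the single-table variant with several hash functions. The list-of-functions interface is hypothetical, and so is the eviction rule: when every candidate slot is full, this sketch evicts a randomly chosen candidate, which is one common choice rather than necessarily the rule used in the lecture.

```python
import random

def lookup(table, hashes, x):
    """True if x sits at any of its candidate slots (one per hash function)."""
    return any(table[h(x)] == x for h in hashes)

def insert(table, hashes, x, rng, max_kicks=32):
    """Return None on success, or the item left over after max_kicks evictions."""
    for _ in range(max_kicks):
        slots = [h(x) for h in hashes]
        for pos in slots:                 # try h1, then h2, then h3
            if table[pos] is None:
                table[pos] = x
                return None
        pos = rng.choice(slots)           # all full: evict one at random (an assumption)
        x, table[pos] = table[pos], x
    return x
```

Lookup still inspects a fixed number of slots (three here), so it stays constant time.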
Even better load? • Currently we’ve only put one item per bucket • What if we had two cells per bucket? [Figure: buckets holding (x, w), (z) and (y, a)]
Even better load? • Currently we’ve only put one item per bucket • What if we had two cells per bucket? • What about collision strategies? • Round-robin (cells take turns swapping out) • FIFO (oldest resident gets kicked out)
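A sketch of two cells per bucket with the FIFO strategy, written in Python. Each bucket is a `collections.deque`, so the oldest resident is always at the left end; the names `CELLS`, `max_kicks` and the hash-function arguments are assumptions for illustration.

```python
from collections import deque

CELLS = 2  # cells per bucket, as on the slide

def insert(buckets, h1, h2, x, max_kicks=16):
    """Return None on success, or the item left over after max_kicks evictions."""
    pos = h1(x)
    for _ in range(max_kicks):
        bucket = buckets[pos]
        if len(bucket) < CELLS:        # a free cell: no one has to move
            bucket.append(x)
            return None
        evicted = bucket.popleft()     # FIFO: the oldest resident gets kicked out
        bucket.append(x)
        x = evicted
        pos = h2(x) if pos == h1(x) else h1(x)   # evicted item's other bucket
    return x
```

A round-robin strategy would instead keep a per-bucket counter and evict the cell it points to, advancing the counter on each eviction.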
Links & Resources • http://en.wikipedia.org/wiki/Cuckoo_hashing • http://www.ru.is/faculty/ulfar/CuckooHash.pdf • http://www.it-c.dk/people/pagh/papers/cuckoo-undergrad.pdf • No neat animations on the internet…yet! • Possible personal project? • Brownie points? • Pre-coop project?