

Multiple Choice Hash Tables with Moves on Deletes and Inserts

Adam Kirsch

Michael Mitzenmacher


Hashing: Modern Perspective

  • For many situations (e.g., hardware for routers) multiple choice hash tables are state-of-the-art.

    • Each item gets d possible hash locations, placed in one.

  • Moving items among choices (e.g., cuckoo hashing) greatly improves space utilization.

    • The only cost: an insert may take many moves.
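
The basic d-choice placement described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the salted SHA-256 hash family and the names `choices`, `insert`, and `lookup` are assumptions made for the example, and no items are moved.

```python
import hashlib

def choices(key, d, size):
    """Return d candidate slots for key (hypothetical salted-hash family)."""
    slots = []
    for i in range(d):
        digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
        slots.append(int.from_bytes(digest[:8], "big") % size)
    return slots

def insert(table, key, d):
    """Place key in its first empty candidate slot; return the slot, or None on overflow."""
    for slot in choices(key, d, len(table)):
        if table[slot] is None:
            table[slot] = key
            return slot
    return None  # all d choices are occupied and this sketch never moves items

def lookup(table, key, d):
    """A lookup probes only the d candidate slots."""
    return any(table[s] == key for s in choices(key, d, len(table)))
```

A cuckoo-style scheme would replace the `return None` branch with moves of existing items, which is where the per-insert cost mentioned above comes from.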


Previously

  • Schemes that move at most 1 item per insertion.

    • Limit cost of cuckoo hashing.

  • Schemes that batch move operations in a queue.

    • Amortize cost of cuckoo hashing.

  • Using content addressable memories (CAMs) to reduce chance of overflow.

    • Small CAMs yield big gains.


Contributions

  • Consider potential of moving items on deletions.

    • Focus on one move per deletion/insertion.

  • Examine alternative approach using weaker hashing from [KTC, Peacock Hashing].

    • Analyze limits of performance.


Multilevel Hash Table [BK90]

  • Use a multilevel hash table (MHT)

    • Can store n elements with d = log log n + O(1) levels in O(n) space with high probability

    • Example with d = 4 hash functions

[Figure: MHT with levels 1 to 4 and an item x. Skew: more elements are placed by early hash functions (double exponential decay).]
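
A standard MHT insertion can be sketched as below. The per-level hash `h`, the helper `make_mht`, and the choice of one item per cell are assumptions for illustration; the halving of level sizes follows the simulation setup in these slides rather than the general [BK90] construction.

```python
import hashlib

def h(level, key, size):
    """Hypothetical per-level hash: a salted SHA-256 reduced modulo the level size."""
    digest = hashlib.sha256(f"{level}:{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % size

def make_mht(top_size, d):
    """Build d levels, each half the size of the one above it (assumed layout)."""
    return [[None] * max(1, top_size >> i) for i in range(d)]

def mht_insert(levels, key):
    """Place key at the first level whose hashed cell is empty; None means overflow."""
    for i, level in enumerate(levels):
        slot = h(i, key, len(level))
        if level[slot] is None:
            level[slot] = key
            return i
    return None
```

Because every item tries level 1 first, the early levels fill up and the later levels see only the items that collided, which is the skew noted in the figure.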


Second Chance (SC) Scheme

  • Standard MHT fills from top down

    • elements cascade from table to table.

    • We try to slow cascade at every step.

[Figure: standard MHT insertion of an item x.]
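
One plausible reading of "slow the cascade at every step" is sketched here; the slides do not spell out the exact rule, so the condition used below is an assumption. When the new item would have to pass a level and its cell one level down is also full, the sketch tries to push the current occupant down one level instead, using at most one move per insertion. The hash `h` and the name `sc_insert` are hypothetical.

```python
import hashlib

def h(level, key, size):
    """Hypothetical per-level hash: a salted SHA-256 reduced modulo the level size."""
    digest = hashlib.sha256(f"{level}:{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % size

def sc_insert(levels, key):
    """Return (level of key, moved item or None); (None, None) signals overflow."""
    d = len(levels)
    for i in range(d):
        slot = h(i, key, len(levels[i]))
        if levels[i][slot] is None:
            levels[i][slot] = key
            return i, None
        if i + 1 < d:
            nxt = levels[i + 1]
            if nxt[h(i + 1, key, len(nxt))] is None:
                continue  # key fits one level down without any move
            occupant = levels[i][slot]
            o_slot = h(i + 1, occupant, len(nxt))
            if nxt[o_slot] is None:
                # Second chance: push the occupant down one level, keep key here.
                nxt[o_slot] = occupant
                levels[i][slot] = key
                return i, occupant
    return None, None
```

Every item still sits in its own hashed cell at some level, so lookups are unchanged from the standard MHT.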




CAMs

  • Last few collisions hard to stop.

    • Can waste lots of space on few items.

  • Solution: content addressable memory (CAM).

    • CAMs are fully associative.

    • They hold only a small number of items.


Moves on Deletions

  • Harder to manage.

  • What item to move up?

[Figure: MHT with levels 1 to 4 and an item x.]


Hint-Based Approach

  • Each cell stores a hint for where an item to move on delete is held.

  • Hints can be kept fairly small.

    • About log n bits.

  • Various hint approaches possible.

    • We found “replace hint on any collision” works well.

    • May depend on item lifetime distribution, etc.

    • One move, recursive move variations.
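
A rough sketch of the "replace hint on any collision" idea with a single move per deletion follows. The hint encoding as a (level, slot) pair, the validation step on delete, and the names `hint_insert` and `hint_delete` are all assumptions for illustration; the slides do not give these details.

```python
import hashlib

def h(level, key, size):
    """Hypothetical per-level hash: a salted SHA-256 reduced modulo the level size."""
    digest = hashlib.sha256(f"{level}:{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % size

def hint_insert(levels, hints, key):
    """Standard MHT insert; every cell the key collided with is hinted to its final cell."""
    collided = []
    for i, level in enumerate(levels):
        slot = h(i, key, len(level))
        if level[slot] is None:
            level[slot] = key
            for ci, cs in collided:
                hints[ci][cs] = (i, slot)  # replace hint on any collision
            return i
        collided.append((i, slot))
    return None  # overflow; hints are left unchanged

def hint_delete(levels, hints, key):
    """Delete key, then follow the freed cell's hint to move one item up (one move only)."""
    for i, level in enumerate(levels):
        slot = h(i, key, len(level))
        if level[slot] == key:
            level[slot] = None
            hint, hints[i][slot] = hints[i][slot], None
            if hint is not None:
                j, s = hint
                cand = levels[j][s]
                # Hints can go stale, so check that the candidate really hashes here.
                if cand is not None and h(i, cand, len(level)) == slot:
                    level[slot] = cand
                    levels[j][s] = None
            return True
    return False
```

A recursive variation would repeat the same step on the cell that the moved item vacated.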


Simulation Data

  • No current method of analysis for hints.

    • Use simulations. 10,000 trials per data point.

  • MHT levels decreasing in size by factor of 2. Plus small CAM.

  • With n items, top level has size n.

    • Space usage just above 50%.

  • Load table to n elements, alternate inserts/deletes for 2^18 steps.

    • Exponentially distributed lifetimes.

  • Goal: how many hash functions are needed?
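
The "just above 50%" figure follows from the geometric level sizes. As a short worked calculation, assuming one item per cell, a top level of exactly n cells, d levels, and ignoring the small CAM:

```latex
\text{total cells} \;=\; n \sum_{i=0}^{d-1} 2^{-i} \;=\; n\left(2 - 2^{\,1-d}\right) \;<\; 2n,
\qquad
\text{space usage} \;=\; \frac{n}{n\left(2 - 2^{\,1-d}\right)} \;=\; \frac{1}{2 - 2^{\,1-d}} \;>\; \frac{1}{2}.
```

For example, d = 4 gives 1.875n cells and a space usage of 1/1.875, or about 53%.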



Lessons from Simulations

  • Using no moves is very weak.

  • Second Chance (move on insert) more powerful than hint-based move on delete.

  • But the two combine well.

    • Four hash functions: better than 50% load, small CAM.


Alternative: Weak Hashes

  • To avoid hints, overflow at each bucket splits to two buckets at next level.

    • Each bucket receives from four buckets.

  • Less spreading of items, but know where to look on deletes.

  • Conjecture: loss of randomness implies weak performance.
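
The fan-out and fan-in above are consistent with each level having half as many buckets as the one before it. The wiring below is a hypothetical example chosen only to show that structure; it is not the mapping used in [KTC, Peacock Hashing], and `children` and `parents` are illustrative names.

```python
def children(j, m):
    """Two next-level buckets receiving overflow from bucket j (this level has m buckets)."""
    half = m // 2  # assumed: the next level has half as many buckets
    return (j // 2) % half, (j // 2 + 1) % half

def parents(c, m):
    """Buckets of the m-bucket level above whose overflow can land in next-level bucket c."""
    return [j for j in range(m) if c in children(j, m)]
```

With a fixed wiring like this, a deletion at bucket j only needs to inspect `children(j, m)` for an item to move up, so no hints are stored; the price is that overflow from a bucket is spread over just two places.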



Two Idealized Schemes

  • Each bucket holds random item, splits rest.

  • Each bucket counts items passed to bucket A and bucket B at next level, greedily holds item from bucket with larger count.

  • Assume invariants kept over insertions/deletions at all times.

  • Can be analyzed recursively level by level.

    • Get distribution of bucket loads at each level.

    • Obtain average-case performance.


Results


Conclusions

  • Weak hashes, based on buckets, much less effective than hints.

    • Even under optimistic assumptions.

  • One move approaches effective.

    • Move on insert/delete complement each other.

  • Need methods for analysis.

    • Challenging dependencies; hard to get exact numbers.

