Hash Tables
Hash tables • Definition: A data structure that uses a hash function to map each key to the index of an array element. • Figure: keys k1, k2, k3, k4, k5
Some properties of a hash table • Size of the hash table (an example will be shown.) • Hash function: maps keys to the index of an array element. (To be continued…) • Multiplication hash • Division hash • Input to build a hash table: an array of keys to store in the hash table • int[] input = {1,2,3,4,5,6,7,8}; • Figure: chains 1 2 /, 3 4 /, 5 6 /, 7 8 / (a "/" marks the end of a linked list)
Example • Hash table size is 10 • Figure (a "/" marks the end of a linked list): 20 110 / 103 13 53 / 10 69 /
Division Hash • h1(k) = k mod m • Returns the array index for key k • k is the key • m is the size of the hash table • Good values of m: the prime number smaller than and closest to the size of the input. See Table 1. • In Java, the mod operator is written %. Table 1.
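The division hash above can be sketched in Java roughly as follows. The class and method names are hypothetical, and the use of Math.floorMod instead of % is an assumption beyond the slides: Java's % returns a negative result for a negative left operand, which would not be a valid array index.

```java
// Minimal sketch of the division hash h1(k) = k mod m.
// DivisionHash and hash(...) are illustrative names, not from the slides.
class DivisionHash {
    // Returns an index in the range [0, m).
    // Assumption: Math.floorMod is used rather than % so that negative
    // keys still produce a non-negative index.
    static int hash(int k, int m) {
        return Math.floorMod(k, m);
    }

    public static void main(String[] args) {
        int m = 10; // table size used in the example slide
        int[] keys = {20, 110, 103, 13, 53};
        for (int k : keys) {
            System.out.println(k + " -> index " + hash(k, m));
        }
    }
}
```

With m = 10, keys 20 and 110 both land at index 0, and 103, 13, and 53 all land at index 3, which matches the chains in the example.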
Multiplication hash • h2(k) = floor(m (kA mod 1)) • m is the size of the hash table • Good values of m: the prime number smaller than and closest to the size of the input. See Table 1. • k is the key • A = 0.61803, which comes from (sqrt(5) - 1)/2 • Hint: use the decimal value 0.61803 directly in your program instead of computing the expression; this may reduce bugs.
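A minimal Java sketch of the multiplication hash, assuming non-negative integer keys (the class and method names are hypothetical). The step "kA mod 1" is taken here as the fractional part of k times A, computed with the floating-point remainder operator.

```java
// Minimal sketch of the multiplication hash h2(k) = floor(m * (k*A mod 1)).
// MultiplicationHash and hash(...) are illustrative names, not from the slides.
class MultiplicationHash {
    // Decimal constant given on the slide, approximately (sqrt(5) - 1) / 2.
    static final double A = 0.61803;

    // Assumption: k is non-negative, so (k * A) % 1.0 is the fractional
    // part of k * A and lies in [0, 1).
    static int hash(int k, int m) {
        double fractionalPart = (k * A) % 1.0;
        return (int) Math.floor(m * fractionalPart);
    }

    public static void main(String[] args) {
        int m = 10;
        for (int k = 1; k <= 5; k++) {
            System.out.println(k + " -> index " + hash(k, m));
        }
    }
}
```

For example, with k = 3 and m = 10, k times A is 1.85409, the fractional part is 0.85409, and the index is floor(8.5409) = 8.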
Collisions • When hashing a key, if a collision happens, the new key is stored in the linked list at that location • Number of collisions at a location = number of elements at that location - 1 • Example: the chain 20 110 / has 2 - 1 = 1 collision • Example: the chain 103 13 53 / has 3 - 1 = 2 collisions
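The chaining rule above might look like this in Java. This is a sketch under assumptions: the class name, the use of java.util.LinkedList for each location, and the choice of the division hash are illustrative. Reporting 0 collisions for an empty location is also an assumption, since the slide formula only covers occupied locations.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Minimal sketch of a hash table that resolves collisions by chaining.
// ChainedTable is an illustrative name, not from the slides.
class ChainedTable {
    private final List<LinkedList<Integer>> slots = new ArrayList<>();

    ChainedTable(int m) {
        for (int i = 0; i < m; i++) {
            slots.add(new LinkedList<>());
        }
    }

    // Appends the key to the linked list at its hashed location.
    void insert(int key) {
        int index = Math.floorMod(key, slots.size()); // division hash
        slots.get(index).add(key);
    }

    // Number of collisions = number of elements at the location - 1.
    // Assumption: an empty location is reported as 0 collisions.
    int collisionsAt(int index) {
        return Math.max(0, slots.get(index).size() - 1);
    }

    public static void main(String[] args) {
        ChainedTable table = new ChainedTable(10);
        for (int key : new int[] {20, 110, 103, 13, 53}) {
            table.insert(key);
        }
        System.out.println("collisions at 0: " + table.collisionsAt(0));
        System.out.println("collisions at 3: " + table.collisionsAt(3));
    }
}
```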
Collision Metrics • maxCollisions: the largest number of collisions at any single location in the hash table • minCollisions: the smallest number of collisions at any single location in the hash table • totalCollisions: the sum of the collisions over all locations in the hash table • Examples on the next slide
maxCollisions = 2 • minCollisions = 1 • (Note: minCollisions is at least 1 if any location has a collision, even if other locations have 0 collisions. If there are no collisions at all, return 0.) • totalCollisions = 4 • Figure: the chain 20 110 / has 1 collision; the chain 103 13 53 / has 2 collisions; the chain 105 15 / has 1 collision
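One way to compute the three metrics from the chain lengths is sketched below. The class name, the method name, and the convention of returning the results as a three-element array are all assumptions made for illustration. Locations with 0 collisions are skipped when computing minCollisions, following the note above.

```java
// Minimal sketch of the three collision metrics.
// CollisionMetrics and compute(...) are illustrative names, not from the slides.
class CollisionMetrics {
    // chainLengths[i] is the number of elements stored at location i.
    // Returns {maxCollisions, minCollisions, totalCollisions}.
    static int[] compute(int[] chainLengths) {
        int max = 0;
        int min = 0; // stays 0 only if no location has a collision
        int total = 0;
        for (int length : chainLengths) {
            int collisions = length - 1;
            if (collisions < 1) {
                continue; // empty or single-element locations have no collisions
            }
            total += collisions;
            max = Math.max(max, collisions);
            min = (min == 0) ? collisions : Math.min(min, collisions);
        }
        return new int[] {max, min, total};
    }

    public static void main(String[] args) {
        // Chains of length 2, 3, and 2 as in the example, plus two other locations.
        int[] result = compute(new int[] {2, 0, 3, 1, 2});
        System.out.println("maxCollisions = " + result[0]);
        System.out.println("minCollisions = " + result[1]);
        System.out.println("totalCollisions = " + result[2]);
    }
}
```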
Discussion • Why metrics? • The collision metrics tell us which hash function is better • Why 3 metrics? Why not just measure totalCollisions? • Let’s see an example.
Which hash table is better? • Hash table 1: totalCollisions = 4, with chains 20 110 /, 103 13 /, 103 13 /, 103 13 / • Hash table 2: totalCollisions = 4, with chains 20 110 103 13 103 /, 13 /, 103 /, 13 /
We want not only fewer collisions but also collisions that are distributed evenly across the hash table. That is why hash table 1 is better than hash table 2. • The goal of this lab is to implement two hash functions, division and multiplication, and to use the collision metrics to demonstrate which hash is better.
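As a rough starting point for the lab, the comparison could be organized as below. This is only a sketch: the class and method names are hypothetical, keys are assumed to be non-negative, and the table size of 10 is taken from the earlier example rather than from Table 1.

```java
import java.util.function.IntBinaryOperator;

// Minimal sketch comparing the two hash functions by their collision metrics.
// HashComparison and its method names are illustrative, not from the slides.
class HashComparison {
    static final double A = 0.61803; // decimal constant from the slides

    static int division(int k, int m) {
        return Math.floorMod(k, m);
    }

    // Assumption: k is non-negative.
    static int multiplication(int k, int m) {
        return (int) Math.floor(m * ((k * A) % 1.0));
    }

    // Hashes every key with the given function and returns
    // {maxCollisions, minCollisions, totalCollisions}.
    static int[] metrics(int[] keys, int m, IntBinaryOperator hash) {
        int[] chainLengths = new int[m];
        for (int k : keys) {
            chainLengths[hash.applyAsInt(k, m)]++;
        }
        int max = 0, min = 0, total = 0;
        for (int length : chainLengths) {
            int collisions = length - 1;
            if (collisions < 1) {
                continue; // only locations with collisions count toward the metrics
            }
            total += collisions;
            max = Math.max(max, collisions);
            min = (min == 0) ? collisions : Math.min(min, collisions);
        }
        return new int[] {max, min, total};
    }

    public static void main(String[] args) {
        int[] keys = {20, 110, 103, 13, 53, 105, 15};
        int m = 10;
        int[] d = metrics(keys, m, HashComparison::division);
        int[] h = metrics(keys, m, HashComparison::multiplication);
        System.out.println("division:       max=" + d[0] + " min=" + d[1] + " total=" + d[2]);
        System.out.println("multiplication: max=" + h[0] + " min=" + h[1] + " total=" + h[2]);
    }
}
```

Passing the hash function as a parameter keeps the metric computation identical for both hashes, so any difference in the printed numbers comes from the hash functions alone.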