
Lecture 18 Nov 3 Goals: hashing

lmcnair

Presentation Transcript


  1. Lecture 18 Nov 3 • Goals: • hashing • dictionary operations • general idea of hashing • hash functions • chaining • closed hashing

  2. Dictionary operations • search • insert • delete • Applications: • compilers • web searching • game playing, state space search • spell checking etc.

  3. Hashing • An important and widely useful technique for implementing dictionaries • Constant time per operation (on the average) • Worst case time proportional to the size of the set for each operation (just like array and linked list implementation)

  4. General idea U = set of all possible keys (e.g. 9-digit SS #). If n = |U| is not very large, a simple way to support dictionary operations is to map each key e in U to a unique integer h(e) in the range 0 .. n – 1 and use a Boolean array H[0 .. n – 1] to record which keys are present.
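
The direct-address idea on this slide can be sketched in a few lines of Matlab. This is only an illustration: the universe size `n = 10` and the keys 3 and 7 are made up, and the keys are assumed to be integers already, so that h(e) = e.

```matlab
% Direct-address table: one Boolean slot per possible key.
% Assumption: keys are integers 0 .. n-1, so h(e) = e.
n = 10;                 % hypothetical, tiny universe size
H = false(1, n);        % H(k+1) is true when key k is present

% insert keys 3 and 7 (Matlab arrays are 1-based, hence the +1)
H(3 + 1) = true;
H(7 + 1) = true;

% search
found3 = H(3 + 1);      % key 3 is present
found5 = H(5 + 1);      % key 5 was never inserted

% delete key 3
H(3 + 1) = false;
found3_after = H(3 + 1);
```

Each operation touches a single array cell, which is why this scheme takes constant time but needs one slot for every possible key.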

  5. General idea

  6. Ideal case not realistic • U, the set of all possible keys, is usually very large, so we can’t create an array of size n = |U|. • Create an array H of size m much smaller than n. • The number of keys actually present at any time is usually much smaller than n. • A mapping h: U -> {0, 1, …, m – 1} is called a hash function. Example: D = students currently enrolled in courses, U = set of all SS #’s, hash table of size 1000, hash function h(x) = last three digits of x.

  7. Example (continued) Insert student “Dan”, SS# = 1238769871. h(1238769871) = 871. [Figure: hash table with buckets 0 .. 999; bucket 871 holds Dan, followed by NULL.]

  8. Example (continued) Insert student “Tim”, SS# = 1872769871. h(1872769871) = 871, the same as for Dan: a collision. [Figure: hash table with buckets 0 .. 999; bucket 871 already holds Dan, followed by NULL.]
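
The collision in this example can be checked directly. The sketch below assumes the hash function from the earlier slide (last three digits), written here as `mod(ssn, 1000)`; the variable names are illustrative only.

```matlab
% Hash function from the example: keep the last three digits.
h = @(ssn) mod(ssn, 1000);

dan = 1238769871;
tim = 1872769871;

slotDan = h(dan);                  % 871
slotTim = h(tim);                  % 871
collide = (slotDan == slotTim);    % true: both keys map to bucket 871
```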

  9. Hash Functions If h(k1) = h(k2) for two different keys k1 and k2, then k1 and k2 collide at that slot. There are two approaches to resolving collisions.

  10. Collision Resolution Policies Two ways to resolve: (1) Open hashing, also known as separate chaining (2) Closed hashing Chaining: keys that collide are stored in a list (which we can implement as a vector in Matlab). Closed hashing: find an alternate location to store the key.
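
The slide does not say how closed hashing picks the alternate location. One common choice is linear probing: try the next slot, wrapping around, until an empty one is found. The sketch below assumes that policy, a hypothetical table of size 7, and `mod(k, m)` as the hash function; none of these come from the slides.

```matlab
% Closed hashing with linear probing (an assumed probing policy).
m = 7;                          % hypothetical small table size
slots = nan(1, m);              % NaN marks an empty slot
h = @(k) mod(k, m);             % hypothetical hash function

keys = [10 17 24 3];            % all four keys hash to slot 3
for k = keys
    s = h(k);
    % probe s, s+1, ... (wrapping around) until an empty slot appears;
    % this terminates because the table is never full here
    while ~isnan(slots(s + 1))
        s = mod(s + 1, m);
    end
    slots(s + 1) = k;           % +1 because Matlab arrays are 1-based
end
% slots 3, 4, 5, 6 now hold 10, 17, 24, 3 in that order
```

All keys stay inside the table itself, in contrast to chaining, where colliding keys go into a per-bucket list.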

  11. Previous Example Insert student “Tim”, SS# = 1872769871. h(1872769871) = 871, the same as for Dan: a collision. [Figure: hash table with buckets 0 .. 999; the list at bucket 871 now holds both Tim and Dan.]

  12. Open Hashing Each entry of the hash table is a pointer to the head of a linked list. All elements that hash to a particular bucket are placed on that bucket’s linked list. Records within a bucket can be ordered in several ways: by order of insertion, by key value, by frequency of access, by most recent access, etc.

  13. Open Hashing Data Organization [Figure: array of buckets 0, 1, 2, 3, 4, ..., D-1, each with its own list of elements.]
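
One way to picture this organization in Matlab is a cell array with one vector per bucket, following the earlier remark that a chain can be implemented as a vector. The sketch below reuses the SS# example; the variable names and the insert-at-front order are assumptions.

```matlab
% Open hashing (chaining): one vector of keys per bucket.
m = 1000;
buckets = cell(1, m);                  % every bucket starts as [] (empty chain)
h = @(ssn) mod(ssn, m);                % last three digits, as in the example

for ssn = [1238769871 1872769871]      % Dan, then Tim
    b = h(ssn) + 1;                    % +1: Matlab cell arrays are 1-based
    buckets{b} = [ssn, buckets{b}];    % insert at the front of the chain
end

chain = buckets{871 + 1};              % Tim's number first, then Dan's
```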

  14. Choice of hash function • A good hash function should: • be easy to compute • distribute the keys uniformly to the buckets • use all the fields of the key object.

  15. Example: key is a string over {a, …, z, 0, …, 9, _}. Suppose the hash table size is n = 10007. (Choose the table size to be a prime number.) Good hash function: interpret the string as a number in base 37 and reduce it mod 10007. h(“word”) = ? “w” = 23, “o” = 15, “r” = 18 and “d” = 4. h(“word”) = (23 * 37^3 + 15 * 37^2 + 18 * 37 + 4) % 10007
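
The arithmetic for “word” can be worked through in Matlab. The sketch uses the built-in `polyval` to evaluate the base-37 number from its digits; the variable names are illustrative.

```matlab
% Digits of "word" in base 37, using a = 1, ..., z = 26.
digitsWord = [23 15 18 4];             % w, o, r, d

% 23*37^3 + 15*37^2 + 18*37 + 4 = 1186224
value = polyval(digitsWord, 37);

% reduce modulo the table size
hWord = mod(value, 10007);             % 5398
```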

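  16. Computing hash function for a string Horner’s rule: ( … ((a0 x + a1) x + a2) x + … + an-2) x + an-1

      function indx = hash1(key, size)
      % Horner's rule in base 37 over the character codes of key.
      % Reducing mod size at every step keeps the running value small,
      % so long keys do not lose precision; the result lies in 0 .. size-1.
      indx = 0;
      for i = 1:length(key)
          indx = mod(37 * indx + double(key(i)), size);
      end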
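
As a check on Horner’s rule, the loop below recomputes the hash of “word” from the previous slide. It assumes lowercase letters only, mapped to a = 1, ..., z = 26, and it reduces modulo the table size at every step, which yields the same result as a single reduction at the end while keeping intermediate values small.

```matlab
key = 'word';
m = 10007;                                   % table size from the example

% assumption: lowercase letters only, with a = 1, ..., z = 26
codes = double(key) - double('a') + 1;       % [23 15 18 4]

indx = 0;
for c = codes
    indx = mod(37 * indx + c, m);            % one Horner step, then reduce
end
% running values: 23, 866, 2039, 5398
```

The final value, 5398, matches the direct evaluation of (23 * 37^3 + 15 * 37^2 + 18 * 37 + 4) % 10007.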

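  17. Implementation of open hashing – search

      function found = hash_search(table, key)
      % returns true if key is stored in the bucket it hashes to
      indx = myhash(key, table.size);
      lst = table.keys{indx};          % cell array of strings in this bucket
      found = any(strcmp(lst, key));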

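  18. Insert into an open hash table Key x is inserted at the front of the list for the bucket that h(x) selects.

      function [out, succ] = hash_insert(hash, key)
      % inserts key at the front of its bucket unless it is already present
      % (buckets are cell arrays of strings, as in hash_search)
      indx = myhash(key, hash.size);
      succ = false;
      sindx = find(strcmp(hash.keys{indx}, key));
      if isempty(sindx)
          hash.keys{indx} = [{key}, hash.keys{indx}];
          succ = true;
      end
      out = hash;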

  19. Implementation of open hashing – delete

      function [out, succ] = hash_delete(table, x)
      % removes x if it is present in the hash table
      % (this version assumes numeric keys stored in a vector; for string
      % keys in a cell array, use strcmp(arr, x) in place of arr == x)
      indx = myhash(x, table.size);
      arr = table.keys{indx};
      indx1 = find(arr == x);
      if isempty(indx1)
          succ = false;
      else
          arr(indx1) = [];
          table.keys{indx} = arr;
          succ = true;
      end
      out = table;
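
To see search, insert and delete working together, here is a small self-contained script for string keys. It is a sketch under several assumptions: `myhash` is not shown in the slides, so a stand-in based on `polyval` (base 37 over character codes, shifted to 1-based indices) is used; the table size 11 and the keys are made up; and the operations are written inline rather than as separate functions.

```matlab
% Chained hash table for short string keys (illustrative sketch).
m = 11;                                    % hypothetical small prime table size
tbl.size = m;
tbl.keys = repmat({{}}, 1, m);             % one empty cell-array chain per bucket

% stand-in for the slides' myhash (an assumption); returns 1 .. sz
myhash = @(key, sz) mod(polyval(double(key), 37), sz) + 1;

% insert, skipping duplicates
for w = {'dan', 'tim', 'dan'}
    key = w{1};
    b = myhash(key, tbl.size);
    if ~any(strcmp(tbl.keys{b}, key))
        tbl.keys{b} = [{key}, tbl.keys{b}];    % new key goes to the front
    end
end

% search
b = myhash('tim', tbl.size);
foundTim = any(strcmp(tbl.keys{b}, 'tim'));
b = myhash('ann', tbl.size);
foundAnn = any(strcmp(tbl.keys{b}, 'ann'));

% delete 'dan'
b = myhash('dan', tbl.size);
chain = tbl.keys{b};
chain(strcmp(chain, 'dan')) = [];
tbl.keys{b} = chain;
foundDan = any(strcmp(tbl.keys{b}, 'dan'));

total = sum(cellfun(@numel, tbl.keys));    % number of keys left in the table
```

After the script runs, only 'tim' remains in the table: the second 'dan' was rejected as a duplicate and the first was deleted.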
