
A lecture on hashes

A lecture on hashes. By Charles Morris. What is a hash table? A hash table is an array-like data structure that associates each input (the key) with an output (the record, or value). It uses a ‘hashing function’ to create the association; more on this later.


Presentation Transcript


  1. A lecture on hashes By Charles Morris

  2. What is a hash table? • A hash table is an array-like data structure that associates each input (the key) with an output (the record, or value). • It uses a ‘hashing function’ to create the association; more on this later. • Hashes were first known as “associative arrays” (and you may think of them as such), but using seven syllables is tiring.
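As a rough illustration of the key-to-value association described on this slide, the sketch below uses the C++ standard library's `std::unordered_map`, which is a hash table. The function name `lookup_extension` and the sample data are made up for this example.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical example: look up a phone extension (the value) by a
// person's name (the key). Returns -1 when the name is not in the table.
int lookup_extension(const std::string& name) {
    // std::unordered_map is the standard library's hash table.
    static const std::unordered_map<std::string, int> extensions = {
        {"alice", 4021},
        {"bob", 4177},
    };
    // find() hashes the key and then searches only the matching bucket.
    const auto it = extensions.find(name);
    return it != extensions.end() ? it->second : -1;
}
```

The caller never needs to know where in memory "alice" is stored; the hash of the key decides that.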

  3. What is a hash table?

  4. Why a hash? • Hashes have many distinct benefits. • Hashes are designed so that you may find a value without knowing its location. • Computational complexity for lookup varies: it is almost always O(1), but when `c` keys collide at the same index (where c <= n, the number of stored keys), it is O(c), like searching an array.

  5. What is a ‘hash function’? • A hash function is simply a function that generates a fingerprint based on its input. • Hashing functions are used in many fields. In cryptography, one-way hash functions are used to create a small, pseudo-random-looking checksum out of (normally) much larger data. This checksum is used for authentication and secure network transmissions, amongst other applications.

  6. Hash functions and Collisions • In a one-way cryptographic hash function, the number of collisions is not as important as the randomness of the collisions. These functions try to make it computationally infeasible to find a key ‘j’ such that f(j) == f(k), where k is the original key and j differs from k.

  7. Hash functions and Collisions • In a hash function used to store data in a hash table, collisions should be minimized as much as possible: every time a collision happens for a key ‘j’, where f(j) == f(k), it increases the cost of looking up the keys stored at index f(k). • This assumes that your hash table watches for collisions; if it doesn’t, the old value will be trampled by the new value.
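To make the overwrite risk concrete, here is a minimal sketch of a table that does not watch for collisions. The `NaiveTable` type, its eight-slot size, and the deliberately weak `weak_hash` (first character only) are all assumptions invented for this illustration.

```cpp
#include <array>
#include <cstddef>
#include <string>

// Deliberately weak hash (an assumption for this demo): only the first
// character is used, so any two keys with the same first letter collide.
std::size_t weak_hash(const std::string& key) {
    return key.empty() ? 0 : static_cast<unsigned char>(key[0]);
}

// A table with no collision handling: one value per slot, keys not stored.
struct NaiveTable {
    std::array<int, 8> slots{};  // all slots start at 0

    void put(const std::string& key, int value) {
        slots[weak_hash(key) % slots.size()] = value;  // silently overwrites
    }
    int get(const std::string& key) const {
        return slots[weak_hash(key) % slots.size()];
    }
};
```

Storing 1 under "apple" and then 2 under "avocado" leaves `get("apple")` returning 2, because both keys land in the same slot and the table has no way to tell them apart.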

  8. Collision mitigation • As was stated, if two distinct keys ‘k’ and ‘j’ both hash to the same index ( f(j) == f(k) ), this causes a collision. • Collisions are reduced by using a good hashing algorithm; however, they always happen to some degree when a hash function is given enough data to run on.

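  9. Collision resolution • Chaining • Instead of one value being at the location f(k), there is a chain (maybe a linked-list) of values. • Linear or Quadratic Probing is an alternative, where a colliding record is placed in another free slot of the table, found by stepping away from f(k) at linear or quadratic intervals. • Double hashing • Double hashing requires a second hash computation per key (a constant extra cost, so lookup is still O(1) on average). The second hash sets the step between probes, so keys that collide at f(k) rarely share the rest of their probe sequence.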
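The chaining idea can be sketched as follows. This is a minimal illustration rather than a production design: the class name `ChainedTable`, the fixed bucket count, and the choice of `std::list` for the chain are assumptions, and the index is computed with the times-33 hash from slide 10.

```cpp
#include <cstddef>
#include <list>
#include <string>
#include <utility>
#include <vector>

// Minimal chained hash table sketch: string keys, int values.
class ChainedTable {
public:
    // A bucket count of 0 is bumped to 1 so the modulo below stays valid.
    explicit ChainedTable(std::size_t bucket_count)
        : buckets_(bucket_count == 0 ? 1 : bucket_count) {}

    // Insert a new key, or update the value of an existing one.
    void put(const std::string& key, int value) {
        auto& chain = buckets_[index_of(key)];
        for (auto& entry : chain) {
            if (entry.first == key) { entry.second = value; return; }
        }
        chain.emplace_back(key, value);  // collision or empty bucket: append
    }

    // Returns a pointer to the stored value, or nullptr if the key is absent.
    const int* find(const std::string& key) const {
        const auto& chain = buckets_[index_of(key)];
        for (const auto& entry : chain) {  // O(c) walk over colliding keys
            if (entry.first == key) return &entry.second;
        }
        return nullptr;
    }

private:
    // Times-33 hash, reduced to a bucket index.
    std::size_t index_of(const std::string& key) const {
        unsigned long fingerprint = 0;
        for (unsigned char c : key) fingerprint = fingerprint * 33 + c;
        return fingerprint % buckets_.size();
    }

    std::vector<std::list<std::pair<std::string, int>>> buckets_;
};
```

Because each chain stores the key next to its value, two keys that share an index no longer trample each other; the price is a short linear search within the chain.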

  10. Hash function example • The hashing function used in Perl 5.005 (close relative of the popular ‘djb2’ algorithm):

      // (Defined by the PERL_HASH macro in hv.h)
      // ported by Charles Morris (me) for C++ programmers
      #include <string>
      using std::string;

      unsigned long hashingfunction( string key )
      {
          unsigned long fingerprint = 0;
          for( string::size_type j = 0; j < key.length(); j++ ) //for each letter in the string
          {
              fingerprint = fingerprint * 33 + (int)key.at(j); //multiply the running total by 33, then add the letter
          }
          return fingerprint;
      }
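The fingerprint returned by this function is usually far larger than the table, so a table has to reduce it to a valid index. One common way, sketched here under the assumption of a fixed `table_size`, is to take the remainder. The names `times33` and `bucket_index` are hypothetical and are not part of the Perl source.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// The times-33 hash from the slide above, with an unsigned loop index.
unsigned long times33(const std::string& key) {
    unsigned long fingerprint = 0;
    for (std::size_t j = 0; j < key.length(); j++) {
        fingerprint = fingerprint * 33 + static_cast<unsigned char>(key[j]);
    }
    return fingerprint;
}

// Hypothetical helper: map a key to a slot in a table of table_size slots.
std::size_t bucket_index(const std::string& key, std::size_t table_size) {
    assert(table_size > 0);  // remainder by zero is undefined
    return times33(key) % table_size;
}
```

For instance, the key "abc" has fingerprint 108966, so in a 16-slot table it lands at index 108966 % 16 = 6.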

  11. Hash function example • hf(‘abc’) using the previous function:

      fingerprint = 0;

      //fingerprint = fingerprint * 33 + (int)'a';
      fingerprint = 0 * 33 + 97;         //fingerprint = 97

      //fingerprint = fingerprint * 33 + (int)'b';
      fingerprint = (97 * 33) + 98;      //fingerprint = 3299

      //fingerprint = fingerprint * 33 + (int)'c';
      fingerprint = (3299 * 33) + 99;    //fingerprint = 108966
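The hand computation can be checked mechanically. The sketch below records the fingerprint after each character, using the same multiply-by-33-and-add step; the function name `fingerprint_trace` is an assumption made for this example.

```cpp
#include <string>
#include <vector>

// Returns the running fingerprint after each character of key.
std::vector<unsigned long> fingerprint_trace(const std::string& key) {
    std::vector<unsigned long> steps;
    unsigned long fingerprint = 0;
    for (unsigned char c : key) {
        fingerprint = fingerprint * 33 + c;  // same step as the lecture's loop body
        steps.push_back(fingerprint);
    }
    return steps;
}
```

For the key "abc" this yields 97, 3299, and 108966, matching the three steps worked out by hand.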
