Hashing

## PowerPoint Slideshow about 'Hashing' - darena

Hashing
• Main ch. 11.2-11.5
• Background:
• We want to store a collection of data. We want to add to, delete from, and search in the collection
• What is the average case complexity of add, delete, and search if:
• The collection is stored as an unsorted array
• The collection is stored as a sorted array
• The collection is in a binary search tree
We Want to Do Better
• Hashing has good average case behavior
• Suppose
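• You want to keep track of students via their student IDs
• If student IDs range from 0 to 99, this is easy: use the ID itself as the array index
• What if Social Security numbers (SSNs) are used instead?
• Data like this, where only a few of the many possible key values are in use, is known as sparse data
• The SSN becomes a key for obtaining the data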
Suppose
• We want to keep track of only a small number of students in an array of size 10, and suppose we use SSNs as keys
• We need a function that maps key values (SSNs) to array indices (integers between 0 and 9)
• Such a function is called a hash function.
• An example hash function could be: hash(ssn) = ssn % 10
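As a minimal sketch of the example hash function, assuming SSNs are stored as non-negative `long` values and the table has 10 slots (the class name, method name, and sample numbers are made up for illustration):

```java
// Illustrative sketch only; SsnHashDemo and the sample SSNs are hypothetical.
class SsnHashDemo {
    static final int TABLE_SIZE = 10;

    // Maps a non-negative SSN to an array index between 0 and 9.
    static int hash(long ssn) {
        return (int) (ssn % TABLE_SIZE);
    }

    public static void main(String[] args) {
        long[] ssns = {123456789L, 987654321L, 555001239L};
        for (long ssn : ssns) {
            System.out.println(ssn + " -> index " + hash(ssn));
        }
    }
}
```

With these sample values, the first and third SSNs both end in 9, so they map to the same index.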
Choosing a Hashing Function
• For the previous example, we could have used:
• hash(ssn) = the first digit of the SSN
• We want a hash function that uniformly distributes the keys throughout the array. This is called uniform hashing.
• If you use a division hash function (the remainder after dividing the key by the table size), it is best to have a table size that is a prime number of the form 4k+3 (for example, 811 = 4·202 + 3).
• See Main, p. 552 for other kinds of hash functions
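One way to see why uniform distribution matters is to count how many keys land in each slot under two candidate hash functions. The sketch below is illustrative only: the class name, helper names, and sample keys are made up, and the keys are assumed to be non-negative nine-digit numbers that share a common prefix.

```java
import java.util.function.LongToIntFunction;

// Illustrative sketch only; all names and sample keys are hypothetical.
class HashSpreadDemo {
    // Counts how many keys land in each of tableSize slots under the given hash function.
    static int[] bucketCounts(long[] keys, int tableSize, LongToIntFunction hash) {
        int[] counts = new int[tableSize];
        for (long key : keys) {
            counts[hash.applyAsInt(key)]++;
        }
        return counts;
    }

    // Leading decimal digit of a non-negative key.
    static int firstDigit(long key) {
        while (key >= 10) {
            key /= 10;
        }
        return (int) key;
    }

    // Trailing decimal digit of a non-negative key (the same as key % 10).
    static int lastDigit(long key) {
        return (int) (key % 10);
    }

    public static void main(String[] args) {
        long[] keys = {512340001L, 512340012L, 512340023L, 512340034L};
        System.out.println(java.util.Arrays.toString(bucketCounts(keys, 10, HashSpreadDemo::firstDigit)));
        System.out.println(java.util.Arrays.toString(bucketCounts(keys, 10, HashSpreadDemo::lastDigit)));
    }
}
```

For these sample keys, the first-digit hash sends every key to slot 5, while the last-digit hash spreads them over four different slots.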
What could go wrong?
• If possible, store an object with key value key in array[hash(key)].
• This is not always possible: you may want to add an object whose key value hashes to an index that’s already in use.
• This is called a collision.
• What is the big-oh time complexity of a hash table lookup (search) if there are no collisions?
Handling Collisions
• Linear probing
• Place the object in the next open spot
• How would you find an object in a hash table that uses linear probing?
• How would you delete an object from a hash table that uses linear probing?
• See the example in Main p. 550-551
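One possible answer to the questions above can be sketched as follows. This is not the textbook's code: it assumes non-negative `int` keys, a fixed-capacity table, and a special "deleted" marker so that removing a key does not cut off the search for keys placed after it. All names are illustrative.

```java
import java.util.Arrays;

// Illustrative sketch only; class, field, and method names are hypothetical.
class LinearProbingTable {
    private static final int EMPTY = -1;    // slot has never held a key
    private static final int DELETED = -2;  // slot held a key that was later removed
    private final int[] slots;

    LinearProbingTable(int capacity) {
        slots = new int[capacity];
        Arrays.fill(slots, EMPTY);
    }

    private int hash(int key) {
        return key % slots.length;
    }

    // Returns the index holding key, or -1 if the key is absent.
    int find(int key) {
        if (key < 0) return -1;
        int start = hash(key);
        for (int i = 0; i < slots.length; i++) {
            int idx = (start + i) % slots.length;
            if (slots[idx] == EMPTY) return -1;  // a never-used slot ends the search
            if (slots[idx] == key) return idx;   // DELETED slots are skipped over
        }
        return -1;
    }

    // Places key in the first open slot at or after its hash index.
    // Returns false if the key is already present or the table is full.
    boolean add(int key) {
        if (key < 0) throw new IllegalArgumentException("keys must be non-negative");
        if (find(key) != -1) return false;
        int start = hash(key);
        for (int i = 0; i < slots.length; i++) {
            int idx = (start + i) % slots.length;
            if (slots[idx] == EMPTY || slots[idx] == DELETED) {
                slots[idx] = key;
                return true;
            }
        }
        return false;
    }

    // Marks the key's slot as DELETED rather than EMPTY so later keys stay reachable.
    boolean remove(int key) {
        int idx = find(key);
        if (idx == -1) return false;
        slots[idx] = DELETED;
        return true;
    }
}
```

In a table of capacity 10, adding 19, 29, and 39 places them at indices 9, 0, and 1. If removing 29 simply marked index 0 as empty, a later search for 39 would stop there and miss it; the `DELETED` marker avoids that.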
Linear probing
• Performance degrades noticeably as the table fills up
• Easy to implement
• As the hash table gets fuller, larger and larger consecutive stretches of the array will be filled. This is called clustering.
Double Hashing
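• If there is a collision, hash the key again using a second hash function. The second hash value gives the step size used to move through the array (wrapping around at the end) until an open spot is found.
• Some texts also call double hashing rehashing; elsewhere, rehashing means rebuilding the table at a larger size.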
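A minimal sketch of the probe sequence under double hashing, assuming non-negative `int` keys and a prime table size greater than 2. The second hash function used here, 1 + (key % (size - 2)), is one common choice and is an assumption of this sketch, as are all the names.

```java
// Illustrative sketch only; names and the choice of hash2 are assumptions.
class DoubleHashingDemo {
    // First hash: the starting index.
    static int hash1(int key, int size) {
        return key % size;
    }

    // Second hash: the step size, always between 1 and size - 2, so never 0.
    static int hash2(int key, int size) {
        return 1 + (key % (size - 2));
    }

    // Returns the order in which slots would be examined for the given key.
    static int[] probeSequence(int key, int size) {
        int[] sequence = new int[size];
        int index = hash1(key, size);
        int step = hash2(key, size);
        for (int i = 0; i < size; i++) {
            sequence[i] = index;
            index = (index + step) % size;  // wrap around the end of the array
        }
        return sequence;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(probeSequence(25, 11)));
    }
}
```

Because the table size is prime and the step is smaller than it, the sequence visits every slot exactly once before repeating. Keys that collide at the same starting index usually get different step sizes, which spreads them out instead of forming one long cluster.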
Chained Hashing
• Each element in the array can hold a list of elements.
• Hash the key and put the object in the list in array[hash(key)]
• See the demo at: http://www2.ics.hawaii.edu/~richardy/project/hash/applet.html
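Chained hashing can be sketched with an array-like list of linked lists. The version below is a minimal illustration that stores `int` keys only; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Illustrative sketch only; class and method names are hypothetical.
class ChainedTable {
    private final List<LinkedList<Integer>> buckets;

    ChainedTable(int capacity) {
        buckets = new ArrayList<>(capacity);
        for (int i = 0; i < capacity; i++) {
            buckets.add(new LinkedList<>());
        }
    }

    // floorMod keeps the index non-negative even for negative keys.
    private int hash(int key) {
        return Math.floorMod(key, buckets.size());
    }

    // Appends the key to the chain at its hash index unless it is already there.
    void add(int key) {
        LinkedList<Integer> chain = buckets.get(hash(key));
        if (!chain.contains(key)) {
            chain.add(key);
        }
    }

    // Only the one chain at hash(key) has to be searched.
    boolean contains(int key) {
        return buckets.get(hash(key)).contains(key);
    }

    // Removes the key from its chain; other keys in the chain are unaffected.
    boolean remove(int key) {
        return buckets.get(hash(key)).remove(Integer.valueOf(key));
    }

    int chainLength(int index) {
        return buckets.get(index).size();
    }
}
```

Colliding keys simply share a list, so deletion needs no special marker and the table can hold more elements than it has slots.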
A Hash Function for Names

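private int hashFunction(String name) {
    int hashValue = 0;
    // Sum the character codes of the name.
    char[] cName = name.toCharArray();
    for (int j = 0; j < cName.length; j++) {
        hashValue += cName[j];
    }
    // Reduce the sum to a valid table index between 0 and size - 1.
    return hashValue % size;
}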

Note that size is previously defined as the size of the hash table.
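To try the function out, it can be wrapped in a small class that supplies `size`. The wrapper below is a sketch: the class name, the constructor, and the table size of 31 are assumptions made for illustration.

```java
// Illustrative sketch only; NameHashDemo and the table size are hypothetical.
class NameHashDemo {
    private final int size;  // size of the hash table

    NameHashDemo(int size) {
        this.size = size;
    }

    // Same approach as the slide: sum the character codes, then take the remainder.
    int hashFunction(String name) {
        int hashValue = 0;
        for (char c : name.toCharArray()) {
            hashValue += c;
        }
        return hashValue % size;
    }

    public static void main(String[] args) {
        NameHashDemo demo = new NameHashDemo(31);
        System.out.println(demo.hashFunction("listen"));
        System.out.println(demo.hashFunction("silent"));
    }
}
```

Because addition ignores order, names made of the same letters, such as "listen" and "silent", always hash to the same index with this function.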

Time Analysis
• The load factor of a hash table is defined as follows:
$\alpha = \dfrac{\text{number of elements in the table}}{\text{size of the table's array}}$
Searching with Linear Probing
• In a non-full hash table with no removals, and using uniform hashing, the average number of table elements examined in a successful search is approximately:
$\frac{1}{2}\left(1 + \frac{1}{1-\alpha}\right)$, where $\alpha$ is the load factor
Searching with Double Hashing
• In a non-full hash table with no removals, and using uniform hashing, the average number of table elements examined in a successful search is approximately:
$\dfrac{-\ln(1-\alpha)}{\alpha}$, where $\alpha$ is the load factor
Searching with Chained Hashing
• In a non-full hash table, using uniform hashing, the average number of table elements examined in a successful search is approximately:
$1 + \dfrac{\alpha}{2}$, where $\alpha$ is the load factor
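To compare the three strategies numerically, the sketch below evaluates the standard textbook approximations for a successful search: (1/2)(1 + 1/(1 - α)) for linear probing, -ln(1 - α)/α for double hashing, and 1 + α/2 for chaining, where α is the load factor (number of elements divided by table size). The class and method names are illustrative, and the open-addressing formulas assume 0 < α < 1.

```java
// Illustrative sketch only; class and method names are hypothetical.
class SearchCostDemo {
    // Load factor: number of elements divided by the size of the table's array.
    static double loadFactor(int elements, int tableSize) {
        return (double) elements / tableSize;
    }

    // Approximate average probes for a successful search with linear probing.
    static double linearProbing(double alpha) {
        return 0.5 * (1 + 1 / (1 - alpha));
    }

    // Approximate average probes for a successful search with double hashing.
    static double doubleHashing(double alpha) {
        return -Math.log(1 - alpha) / alpha;
    }

    // Approximate average elements examined for a successful search with chaining.
    static double chained(double alpha) {
        return 1 + alpha / 2;
    }

    public static void main(String[] args) {
        for (int elements : new int[] {50, 75, 90}) {
            double alpha = loadFactor(elements, 100);
            System.out.printf("alpha=%.2f linear=%.2f double=%.2f chained=%.2f%n",
                    alpha, linearProbing(alpha), doubleHashing(alpha), chained(alpha));
        }
    }
}
```

At a load factor of 0.5 the three values are close (1.5, about 1.39, and 1.25). As the load factor approaches 1, the linear probing estimate grows fastest, which reflects the clustering effect.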