Lecture 6




Hashing

Motivating Example

Want to store a list whose elements are integers between 1 and 5

Will define an array of size 5, and if the list has element j, then j is stored in A[j-1], otherwise A[j-1] contains 0.

Complexity of find operation is O(1)
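A minimal sketch of this array, assuming the function names insert and find (they are illustrative and not from the slides):

```python
# Direct-address table for keys 1..5; A[j-1] == 0 means "j is absent".
A = [0] * 5

def insert(j):
    # Element j is stored at index j-1.
    A[j - 1] = j

def find(j):
    # A single array access, hence O(1).
    return A[j - 1] == j

for j in (2, 5):
    insert(j)
```

After inserting 2 and 5, find(2) is true and find(3) is false, each decided by one array lookup.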

Hash table

The objective is to find an element in constant time "on average."

Suppose we know that the elements belong to 1, 2, …, U, and we are allowed an overall space of U; then this can be done as described before.

But U can be very large.

The space used for storage is called the "hash table," H.

Assume that the hash table has size M.

There is a hash function which maps an element to a value p in 0, …, M-1, and the element is placed in position p in the hash table.

The function is called h (the hash value for j is h(j)).

If h(j) = k, then the element is added to H[k].

Suppose we want to store a list of integers; then an example hash function is h(j) = j modulo M.

Note down example from board
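The board example is not reproduced here. As a stand-in, a small sketch of h(j) = j modulo M, assuming M = 7 and keys chosen so that no two of them collide:

```python
M = 7  # table size, chosen arbitrarily for this sketch

def h(j):
    # Hash function from the slide: j modulo M.
    return j % M

H = [None] * M  # None marks an unused position
for j in (10, 22, 33):
    H[h(j)] = j  # no collision handling; these keys do not collide
```

Here 10 lands in H[3], 22 in H[1], and 33 in H[5].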

We may want to store elements which are not numbers, e.g., names.

Then we use a function to convert each element to an integer and hash the integer.

Suppose we want to store the string abc.

Represent each symbol by its ASCII code and choose a number r; the integer value for abc is then ASCII(a)·r^2 + ASCII(b)·r + ASCII(c).
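A sketch of this conversion, assuming r = 31 (the slides do not fix r) and a hypothetical function name string_to_int. The polynomial is evaluated left to right with Horner's rule, which gives the same value as the formula above:

```python
def string_to_int(s, r=31):
    # Computes ASCII(s[0])*r^(n-1) + ... + ASCII(s[n-1]) by Horner's rule.
    value = 0
    for ch in s:
        value = value * r + ord(ch)
    return value

abc_value = string_to_int("abc")  # 97*31^2 + 98*31 + 99
```

The resulting integer can then be hashed like any other, for example with abc_value modulo M.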

Implementation

Hashtables are arrays.

Size of a hash table is normally a prime number

Two different elements may hash to the same value (collision)

Hashing needs collision resolution

Hash functions are chosen so that the hash values are spread evenly over 0, …, M-1 and there are only a few collisions.

Separate Chaining

Store all the elements mapped to the same position in a linked list.

Note down the illustration from the board.

H[k] is the list of all elements mapped to k.

To find an element j, compute h(j). Let h(j) = k. Then search in the linked list H[k].

To insert an element j, compute h(j). Let h(j) = k. Then insert into the linked list H[k].

To delete an element, find it in its linked list H[k] and remove it from that list.
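The three operations can be sketched as follows. The class and method names are illustrative, the hash function is assumed to be j modulo M, and insert does not check for duplicates:

```python
class Node:
    def __init__(self, key, next=None):
        self.key = key
        self.next = next

class ChainedHashTable:
    def __init__(self, M=7):
        self.M = M
        self.H = [None] * M  # H[k] is the head of the list for position k

    def h(self, j):
        return j % self.M

    def insert(self, j):
        k = self.h(j)
        self.H[k] = Node(j, self.H[k])  # prepend to the list: O(1)

    def find(self, j):
        node = self.H[self.h(j)]
        while node is not None:
            if node.key == j:
                return True
            node = node.next
        return False

    def delete(self, j):
        k = self.h(j)
        prev, node = None, self.H[k]
        while node is not None:
            if node.key == j:
                if prev is None:
                    self.H[k] = node.next   # removing the head
                else:
                    prev.next = node.next   # unlink an interior node
                return True
            prev, node = node, node.next
        return False  # j was not in the table
```

With M = 7, the keys 3, 10, and 17 all map to position 3 and end up in the same list, so find and delete walk that list.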

Note down example from the board.

Insertion is O(1) (insert at the head of the list).

  • Worst-case searching complexity depends on the maximum length of a list H[p]
    • O(q) if q is the maximum length.

We are interested in average searching complexity.

The load factor λ is the average size of a list.

λ = number of elements in the hash table / number of positions in the hash table (M)

Average find complexity is 1 + λ

Want λ to be approximately 1

To reduce worst-case complexity, we choose hash functions which distribute the elements evenly across the lists.
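A small worked example of the load factor, assuming a hypothetical table with M = 5 and h(j) = j modulo 5, with each chain written as a Python list:

```python
def load_factor(chains):
    # Number of stored elements divided by number of positions M.
    return sum(len(chain) for chain in chains) / len(chains)

# Hypothetical contents: position k holds the keys with j % 5 == k.
chains = [[5, 10], [], [2, 7, 12], [3], []]
lam = load_factor(chains)                      # 6 elements / 5 positions
longest = max(len(chain) for chain in chains)  # governs the worst-case find
```

Here the load factor is 1.2, while the longest list has 3 elements, so the worst-case find is slower than the average suggests.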

Open Addressing

Separate chaining requires manipulation of pointers and dynamic memory allocation, which are expensive.

Open addressing is an alternative scheme.

Want to insert key (element) j

Compute h(j) = k

If H[k] is empty, store j in H[k]; otherwise try H[k+1], H[k+2], etc. (indices are taken modulo the table size M).

This scheme is called linear probing.
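A sketch of insertion with linear probing, assuming h(j) = j modulo M, None as the empty marker, and a hypothetical function name:

```python
EMPTY = None

def linear_probe_insert(H, j):
    M = len(H)
    k = j % M  # h(j)
    for i in range(M):
        pos = (k + i) % M  # H[k], H[k+1], ... wrapping around modulo M
        if H[pos] is EMPTY:
            H[pos] = j
            return pos
    raise RuntimeError("hash table is full")

H = [EMPTY] * 7
# 10, 17, 24 and 3 all hash to position 3 when M = 7.
positions = [linear_probe_insert(H, j) for j in (10, 17, 24, 3)]
```

The four keys occupy positions 3, 4, 5, and 6 in turn; a fifth key with hash value 3 would wrap around to position 0.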

Each position in the hash table holds at most one element.

Note down example from board.

Can always insert a key as long as the table is not full.

Finding may be slow if the table is close to full, since many slots may have to be probed.

The idea is to declare a hash table large enough so that it is never full.

Initially, all slots are empty.

Elements are inserted as described.

When an element is deleted, the slot is marked deleted (empty and deleted are different states).

During the find operation, one looks for element k starting from where it should be (H[h(k)]) and probes successive slots, passing over deleted ones, until the element is found or an empty slot is found.

In the latter case, we conclude that the element is not in the table.

Any problem if empty and deleted are not distinguished?

When we insert an element k, we start from H[h(k)] and move on until an empty or deleted slot is found.

An element can be inserted as long as the hash table is not full.

If hash values are clustered, then even if the hash table is relatively empty, finding may be slow.
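Insert, find, and delete with separate empty and deleted markers can be sketched as below. The class name, the sentinel objects, and h(k) = k modulo M are assumptions for illustration; insert takes the first empty or deleted slot and does not check for duplicates:

```python
EMPTY = object()    # slot has never held an element
DELETED = object()  # slot held an element that was later removed

class LinearProbingTable:
    def __init__(self, M=7):
        self.M = M
        self.H = [EMPTY] * M

    def _probe(self, k):
        # Positions h(k), h(k)+1, ... modulo M, each visited once.
        start = k % self.M
        for i in range(self.M):
            yield (start + i) % self.M

    def insert(self, k):
        for pos in self._probe(k):
            if self.H[pos] is EMPTY or self.H[pos] is DELETED:
                self.H[pos] = k
                return pos
        raise RuntimeError("hash table is full")

    def find(self, k):
        for pos in self._probe(k):
            if self.H[pos] is EMPTY:
                return False  # an empty slot ends the search
            if self.H[pos] is not DELETED and self.H[pos] == k:
                return True   # deleted slots are passed over
        return False

    def delete(self, k):
        for pos in self._probe(k):
            if self.H[pos] is EMPTY:
                return False
            if self.H[pos] is not DELETED and self.H[pos] == k:
                self.H[pos] = DELETED  # mark, do not reset to EMPTY
                return True
        return False
```

If delete reset the slot to EMPTY instead, a later find for an element stored further along the same probe sequence would stop too early and wrongly report it missing.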

Quadratic Probing

Alternative to linear probing.

To insert key k, try slot h(k). If the slot is full, try slot h(k) + 1, then h(k) + 4, then h(k) + 9, and so on (all modulo M).

Advantage?

Are we guaranteed to be able to insert as long as the hash table is not full?
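To explore the last question, one can list which slots the probe sequence actually visits. A sketch, assuming M = 7, h(k) = k modulo M, and a hypothetical function name:

```python
def quadratic_probe_slots(k, M, tries):
    # Slots h(k), h(k)+1, h(k)+4, h(k)+9, ... taken modulo M.
    start = k % M
    return [(start + i * i) % M for i in range(tries)]

slots = quadratic_probe_slots(0, 7, 7)
reachable = sorted(set(slots))
```

For this key the first seven probes are 0, 1, 4, 2, 2, 4, 1, and later probes repeat the same residues, so only the slots 0, 1, 2, 4 are ever tried. If those four are occupied, the insertion fails even though three slots are free.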

If the size M of the hash table is a prime number greater than 3, then we can always insert a new element if the table is at most half full.

We want to insert element k. Let h(k) = j and n = ⌊M/2⌋.

If the locations j, j + 1, j + 4, …, j + n^2 are all distinct modulo M, then we can insert an element in the hash table. Why?

Proof by contradiction.

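Suppose there are p, q with 0 ≤ p < q ≤ n and

j + p^2 ≡ j + q^2 (mod M)

p^2 ≡ q^2 (mod M)

(p - q)(p + q) ≡ 0 (mod M)

Then either p ≡ q (mod M) or p + q ≡ 0 (mod M).

Is that right? It is, because M is prime: a prime that divides a product divides one of the factors.

Since p and q are distinct and at most ⌊M/2⌋ < M/2, we have 0 < q - p < M and 0 < p + q < M, so neither p ≡ q (mod M) nor p + q ≡ 0 (mod M) can hold. This is the contradiction.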
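The distinctness claim can also be checked numerically for small table sizes. A sketch with a hypothetical helper name; it tests the first ⌊M/2⌋ + 1 probe locations and proves nothing in general:

```python
def first_probes_distinct(M, j=0):
    # Are j, j+1, j+4, ..., j+n^2 distinct modulo M, where n = M // 2?
    n = M // 2
    probes = [(j + i * i) % M for i in range(n + 1)]
    return len(set(probes)) == n + 1

prime_results = [first_probes_distinct(M) for M in (5, 7, 11, 13, 17)]
composite_result = first_probes_distinct(8)
```

Every prime size tested gives distinct locations, while M = 8 does not (its probes from 0 are 0, 1, 4, 1, 0).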

Rehashing

If the hash table is close to full, then a hash table of bigger size is used.

The elements of the old hash table are inserted into the new one, using the hash function for the new size.

The old hash table is subsequently deleted.

Rehashing is expensive, so it should be done infrequently.
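A sketch of rehashing for a linear-probing table of integers. The growth rule (the first prime at least twice the old size) and the function names are assumptions, not something the slides specify; None marks an empty slot:

```python
def next_prime(n):
    # Smallest prime >= n, found by trial division.
    def is_prime(x):
        if x < 2:
            return False
        i = 2
        while i * i <= x:
            if x % i == 0:
                return False
            i += 1
        return True
    while not is_prime(n):
        n += 1
    return n

def rehash(old_H):
    new_M = next_prime(2 * len(old_H))
    new_H = [None] * new_M
    for key in old_H:
        if key is None:
            continue
        pos = key % new_M  # hash value under the new table size
        while new_H[pos] is not None:
            pos = (pos + 1) % new_M  # linear probing in the new table
        new_H[pos] = key
    return new_H

old_H = [7, None, None, 10, 17, 24, 3]  # a table of size 7 holding 5 keys
new_H = rehash(old_H)
```

The new table has size 17, and the keys move: 17 now sits in position 0, and 24 collides with 7 at position 7 and is placed in position 8. Every element is reinserted, which is why the operation costs time proportional to the table size.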

Chapter 5 of Weiss
