CS5539 Data Structures and Algorithms - PowerPoint PPT Presentation


Presentation Transcript



CS5539 Data Structures and Algorithms

Lecture 19

Hashing



Reading

Watt and Brown: Chapter 12


Time Complexities

Operation   Key-indexed array   Parallel sorted array   Single linked list   BST (well balanced)   BST (ill balanced)

get         O(1)                O(log n)                O(n)                 O(log n)              O(n)

remove      O(1)                O(n)                    O(n)                 O(log n)              O(n)

put         O(1)                O(n)                    O(n)                 O(log n)              O(n)



Perceived problems with key-indexed array

  • Potential size: the array needs one slot for every possible key, so it may use a great deal of memory

  • Keys may be strings: cannot be used as index

    • Conversion of strings to integers using ASCII character codes

    • Large numbers result

    • Map large numbers to small numbers
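The conversion described above can be sketched in a few lines. This is only an illustration: the function name `string_to_int`, the base-256 encoding, and the table size of 10 are assumptions chosen for the example, not something the lecture prescribes.

```python
# Hypothetical helper: treat the ASCII codes of the characters as the
# digits of a base-256 number. Even short strings give large integers.
def string_to_int(key: str) -> int:
    value = 0
    for ch in key:
        value = value * 256 + ord(ch)  # ord() gives the character code
    return value


TABLE_SIZE = 10  # assumed table size, for illustration only

big = string_to_int("cat")   # 99*256**2 + 97*256 + 116
small = big % TABLE_SIZE     # map the large number to a small one
print(big, small)            # 6513012 2
```

The large number is far too big to use directly as an array index, which is why the final step maps it into a small range.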



Implementation

Aim:

obtain time complexity O(1) without restriction on key type

Hashing

  • Gives performance superior to the sorted-array, linked-list and BST implementations

  • O(1) performance, provided collisions are few, for the following operations:

    • get()

    • remove()

    • put()



Hashing

  • The key field is changed into a small integer by the application of a function to the key

  • Hash function: the function used to transform the key into a small integer

  • Hash value: the derived small integer used as index



Hash Table

  • Hash table: one-dimensional structure consisting of indexed buckets, where values are stored according to the index determined by the hash value.

  • Every element of hash table should be initialised to “empty”


Calculating a Hash Table Index from a Key

1. Create integer hash code

  • Derived from the value of the key

  • Ideally unique hash code for each key

2. Map hash code on to the index range 0..Size-1 of the table

  • Typically uses modulo arithmetic

index = hashcode(key) % size
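A minimal sketch of the two steps, assuming Python's built-in `hash()` stands in for hashcode(key). The name `bucket_index` is hypothetical.

```python
def bucket_index(key, size: int) -> int:
    # Step 1: create an integer hash code from the key.
    code = hash(key)
    # Step 2: map the hash code on to the index range 0..size-1.
    # Python's % returns a non-negative result when size is positive,
    # even if the hash code itself is negative.
    return code % size


# Small integers hash to themselves in Python, so this is predictable:
print(bucket_index(21, 10))  # 1

# String hash codes vary between runs (Python randomizes them by default),
# but the index always falls inside the table.
print(0 <= bucket_index("cat", 26) < 26)  # True
```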


Graphical Representation of Hash Table

[Figure: the hash table drawn as a column of buckets, Hashtable[0], Hashtable[1], Hashtable[2], ..., Hashtable[n-2], Hashtable[n-1]. The symbol "v" represents an empty bucket.]


Hashing to a Bucket

[Figure: four keys hashed into the buckets Hashtable[0] .. Hashtable[n-1].]

  • Hashfunction(keyvalue1) = 2, so keyvalue1 is stored in Hashtable[2]

  • Hashfunction(keyvalue2) = n-1, so keyvalue2 is stored in Hashtable[n-1]

  • Hashfunction(keyvalue3) = 0, so keyvalue3 is stored in Hashtable[0]

  • Hashfunction(keyvalue4) = n-2, so keyvalue4 is stored in Hashtable[n-2]


Simple Example 1

Use alphabet position of first letter of word (starting at 0).

Hash table has 26 buckets.

Words: Cat, Dog, Elephant, Frog, Grasshopper, Hippopotamus, Horse, Cougar, Coyote, Zebra

[Figure: Bucket 2 holds cat, cougar and coyote; Bucket 7 holds horse and hippopotamus.]
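The first-letter scheme can be sketched as follows. The function name `first_letter_hash` is an assumption, and the code assumes every word starts with an ASCII letter.

```python
def first_letter_hash(word: str) -> int:
    # Alphabet position of the first letter, starting at 0 (a=0 ... z=25).
    return ord(word[0].lower()) - ord("a")


words = ["Cat", "Dog", "Elephant", "Frog", "Grasshopper",
         "Hippopotamus", "Horse", "Cougar", "Coyote", "Zebra"]

# 26 buckets, each holding the words that hash to it.
table = [[] for _ in range(26)]
for w in words:
    table[first_letter_hash(w)].append(w)

print(table[2])  # ['Cat', 'Cougar', 'Coyote']
print(table[7])  # ['Hippopotamus', 'Horse']
```

Words sharing a first letter pile up in the same bucket, while most of the 26 buckets stay empty.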


Simple Example 2

A hash function adding up the values of the characters in the key: letters are given a value using their position in the alphabet (A = 1), and digits are given their integer value.

Taking a table of size 10, the code S101 is converted as follows:

S = 19

1 = 1

0 = 0

1 = 1

TOTAL = 21

21 modulus 10 = 1, so the element should be placed at bucket 1
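The calculation above can be checked with a short sketch. The names `char_value` and `sum_hash` are hypothetical, and the code assumes keys contain only ASCII letters and digits.

```python
def char_value(ch: str) -> int:
    # Digits keep their integer value; letters use alphabet position (A=1).
    if ch.isdigit():
        return int(ch)
    return ord(ch.upper()) - ord("A") + 1


def sum_hash(key: str, size: int = 10) -> int:
    # Add up the character values, then reduce modulo the table size.
    return sum(char_value(ch) for ch in key) % size


print(char_value("S"))   # 19
print(sum_hash("S101"))  # 1
```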


Find the Bucket Location for each of the Following

(S = 19)

Key    Item       Bucket

S101   bananas    bucket 1

S123   potatoes   bucket 5 (collision)

S592   tomatoes   bucket 5 (collision)

S199   plums      bucket 8

S102   apples     bucket 2

S213   pears      bucket 5 (collision)

S541   peaches    bucket 9

Problem: several keys hash to the same location.
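The bucket locations can be reproduced by grouping the keys by hash value. This sketch repeats the character-sum hash from the previous example under the hypothetical name `sum_hash`, and assumes a table of size 10.

```python
from collections import defaultdict


def sum_hash(key: str, size: int = 10) -> int:
    # Letters count as alphabet position (A=1), digits as their value.
    total = 0
    for ch in key:
        total += int(ch) if ch.isdigit() else ord(ch.upper()) - ord("A") + 1
    return total % size


keys = ["S101", "S123", "S592", "S199", "S102", "S213", "S541"]

buckets = defaultdict(list)
for k in keys:
    buckets[sum_hash(k)].append(k)

# Any bucket holding more than one key is a collision.
collisions = {b: ks for b, ks in buckets.items() if len(ks) > 1}
print(collisions)  # {5: ['S123', 'S592', 'S213']}
```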


A Hash Function

Hash(key) = (2 * int(key)) modulus 10

Here int(key) is the sum of the alphabet positions of the letters in the key (A = 1).

Cat: (2*(3+1+20)) % 10 = 48 % 10 = 8

Dog: (2*(4+15+7)) % 10 = 52 % 10 = 2

Elephant: (2*(5+12+5+16+8+1+14+20)) % 10 = 162 % 10 = 2

Frog: (2*(6+18+15+7)) % 10 = 92 % 10 = 2

Any problems with this function?
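One way to explore the question is to run the function and look at which buckets it can ever produce. In this sketch the names `letter_sum` and `hash2` are hypothetical, and keys are assumed to contain only ASCII letters.

```python
def letter_sum(key: str) -> int:
    # int(key): sum of alphabet positions of the letters (A=1).
    return sum(ord(ch.upper()) - ord("A") + 1 for ch in key)


def hash2(key: str) -> int:
    return (2 * letter_sum(key)) % 10


for word in ["Cat", "Dog", "Elephant", "Frog"]:
    print(word, hash2(word))  # Cat 8, Dog 2, Elephant 2, Frog 2

# Doubling always gives an even number, and an even number modulo 10
# is still even, so only half of the ten buckets can ever be used.
reachable = sorted({(2 * n) % 10 for n in range(1, 1000)})
print(reachable)  # [0, 2, 4, 6, 8]
```

Because the multiplier 2 shares a factor with the table size 10, buckets 1, 3, 5, 7 and 9 are never used, which makes collisions more likely.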



Hash Function

  • Perfect hash function: each distinct key produces a different value. Very rare.

  • Collision: occurs when two keys hash to the same location

  • Collisions are unavoidable when:

    • Number of keys > size of hash table

  • Collision avoidance: choose a hashing function which will place keys uniformly over the rows of the hash table


Collision Avoidance: Multiple Congruency Method

1. Key changed to integer value: int(key)

2. Multiply this by a large prime number: primeNumber * int(key)

3. Divide the result obtained at step 2 by the size of the hash table and take the remainder: (primeNumber * int(key)) % TableSize
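The three steps can be sketched as below. The prime 7919, the table size 10, and the names `letter_sum` and `congruency_hash` are assumptions picked for illustration; int(key) is taken to be the sum of alphabet positions, as in the earlier slides.

```python
PRIME_NUMBER = 7919  # assumed "large prime"; any suitable prime would do
TABLE_SIZE = 10      # assumed table size


def letter_sum(key: str) -> int:
    # Step 1: key changed to an integer (sum of alphabet positions, A=1).
    return sum(ord(ch.upper()) - ord("A") + 1 for ch in key)


def congruency_hash(key: str) -> int:
    # Step 2: multiply by the prime. Step 3: remainder modulo table size.
    return (PRIME_NUMBER * letter_sum(key)) % TABLE_SIZE


for word in ["Cat", "Dog", "Elephant", "Frog"]:
    print(word, congruency_hash(word))  # Cat 6, Dog 4, Elephant 9, Frog 4

# 7919 shares no factor with 10, so every bucket can be reached.
reachable = sorted({(PRIME_NUMBER * n) % TABLE_SIZE for n in range(1, 1000)})
print(reachable)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

Dog and Frog still collide here, so the method spreads keys over the whole table but does not eliminate collisions.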



Hashing

  • A hash function must

    • Be simple to compute

    • Distribute keys as evenly as possible

  • Too many collisions lead to degradation in performance

  • The result may be 0, so hash tables are indexed from 0



Open Bucket Hash Table

Open-bucket hash table: where a bucket is a storage location for a single data element.

The result of transforming a key will give the home bucket.



Closed Bucket Hash Table: Chaining

Closed-bucket hash table: where a bucket is a storage location for a collection of data elements
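A minimal sketch of a closed-bucket table, assuming each bucket is a Python list of (key, value) pairs and that Python's built-in `hash()` supplies the hash code. The class name `ChainedHashTable` and its method signatures are illustrative only, not the textbook's implementation.

```python
class ChainedHashTable:
    def __init__(self, size: int = 10):
        # Every bucket starts out as an empty collection.
        self._buckets = [[] for _ in range(size)]

    def _bucket(self, key):
        # index = hashcode(key) % size
        return self._buckets[hash(key) % len(self._buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:              # key already present: replace value
                bucket[i] = (key, value)
                return
        bucket.append((key, value))   # colliding keys share the bucket

    def get(self, key):
        for k, v in self._bucket(key):
            if k == key:
                return v
        return None                   # key not found

    def remove(self, key) -> bool:
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                del bucket[i]
                return True
        return False


# With only two buckets, three keys are guaranteed to collide somewhere.
table = ChainedHashTable(size=2)
table.put("S123", "potatoes")
table.put("S592", "tomatoes")
table.put("S213", "pears")
print(table.get("S592"))     # tomatoes
print(table.remove("S123"))  # True
print(table.get("S123"))     # None
```

Each operation searches only one bucket, so performance stays close to O(1) while the chains remain short.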



Summary

  • Hashing is an efficient technique

  • Care must be taken in choosing a hash function so that elements are spread as evenly as possible throughout the hash table

  • Collisions are inevitable

  • A strategy must be developed to deal with collisions

