chapter 8 hashing
Download
Skip this Video
Download Presentation
Chapter 8 Hashing

Loading in 2 Seconds...

play fullscreen
1 / 16

Chapter 8 Hashing - PowerPoint PPT Presentation


  • 152 Views
  • Uploaded on

Chapter 8 Hashing. Part II. Dynamic Hashing. Also called extendible hashing Motivation Limitations of static hashing When the table is to be full, overflows increase. As overflows increase, the overall performance decreases.

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about ' Chapter 8 Hashing ' - oro


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
dynamic hashing
Dynamic Hashing
  • Also called extendible hashing
  • Motivation
    • Limitations of static hashing
      • When the table is to be full, overflows increase. As overflows increase, the overall performance decreases.
      • We cannot just copy entries from smaller into a corresponding buckets of a bigger table.
      • The use of memory space is not flexible.

Hash table

Keys

0

k1

k2

k3

1

h (Hash function)

2

n

properties of dynamic hashing
Properties of Dynamic Hashing
  • Allow the size of dictionary to grow and shrink.
    • The size of hash table can be changed dynamically.
    • The term “dynamically” implies the following two things can be modified:
      • Hash function
      • The size of hash table

Hash table

Keys

Keys

Hash table

0

k1

k2

k3

k1

k2

k3

0

h

h’

m

m

m’

8 3 2 dynamic hashing using directories
8.3.2 Dynamic Hashing Using Directories
  • Use an auxilinary table to record the pointer of each bucket.

Disk

(Directory)

Bucket 1

Auxilinary table

Keys

k1

k2

k3

Bucket 2

Bucket 3

d

dynamic hashing using directories
Dynamic Hashing Using Directories
  • Define the hash function h(k) transforms k into 6-bit binary integer.
  • For example:
dynamic hashing using directories1
Dynamic Hashing Using Directories
  • The size of d is 2r, where r is the number of bits used to identify all h(x).
    • Initially, Let r = 2. Thus, the size of d = 22 = 4.
  • Suppose h(k, p) is defined as the p least significant bits in h(k), where p is also called dictionary depth.
  • E. g.
    • h(C5) = 110 101
    • h(C5, 2) = 01
    • h(C5, 3) = 101
process to expand the directory
Process to Expand the Directory
  • Consider the following keys have been already stored. The least r is 2 to differentiate all the input keys.

Directory of pointers to buckets

00

A0

B0

01

A1

B1

10

C2

11

C3

d

when c5 110101 is to enter
When C5 (110101) is to enter
  • Since r=2 and h(C5, 2) = 01, follow the pointer of d[01].
  • A1 and B1 have been at d[01]. Bucket overflows.
    • Find the least u such that h(C5, u) is not the same with some keys in h(C5, 2) (01) bucket.
      • In this case, u = 3.
      • Step 2-1

Since u > r, expand the size of d to 2u and duplicate the pointers to the new half (why?).

when c5 110101 is to enter1
When C5 (110101) is to enter
  • Table的size變大後照理說每個entry都必須重新根據新的hash function重算其所在的bucket。但由於重算耗費的時間太多,所以暫時保留在原來的bucket,在新的bucket中保留舊的pointer,直到下次出現overflow時再重算。

000

A0

B0

001

A1

B1

010

C2

011

C3

100

101

110

111

when c5 110101 is to enter2
When C5 (110101) is to enter
  • Step 2-2
    • Rehash identifiers 01 (A1 and B1) and C5 using new hash function h(k, u).
  • Step 2-3
    • Let r = u = 3.

000

A0

B0

001

A1

B1

010

C2

011

C3

100

101

C5

110

111

when c1 110001 is to enter
When C1 (110001) is to enter
  • Since r=3 and h(C1, 3) = 001, follow the pointer of d[001].
  • A1 and B1 have been at d[001]. Bucket overflows.
    • Find the least u such that h(C1, u) is not the same with some keys in h(C1, 3) (001) bucket.
      • In this case, u = 4.
      • Step 2-1

Since u > r, expand the size of d to 2u and duplicate the pointers to the new half.

slide12

0000

A0

B0

0001

A1

B1

0010

C2

0011

C3

0100

C5

0101

0110

0111

1000

1001

1010

1011

1100

1101

1110

1111

slide13

0000

A0

B0

0001

A1

C1

0010

  • Step 2-2
    • Rehash identifiers 001 (A1 and B1) and C1 using new hash function h(k, u).
  • Step 2-3
    • Let r = u = 4.

C2

0011

C3

0100

C5

0101

0110

0111

1000

1001

B1

1010

1011

1100

1101

1110

1111

when c4 110100 is to enter
When C4 (110100) is to enter
  • Since r=4 and h(C4, 4) = 0100, follow the pointer of d[0100].
  • A0 (100000) and B0 ((101000)) have been at d[0100]. Bucket overflows.
    • Find the least u such that h(C1, u) is not the same with some keys in h(C1, 4) (0100) bucket.
      • In this case, u = 3.
      • Step 2-1

Since u = 3 < r = 4, d is not required to expand its size.

slide15

A0

B0

0000

A1

C1

0001

C2

0010

C3

0011

C4

0100

C5

0101

0110

0111

1000

B0

1001

B1

1010

1011

1100

1101

1110

1111

advantages
Advantages
  • Only doubling directory rather than the whole hash table used in static hashing.
  • Only rehash the entries in the buckets that overflows.
ad