Chapter 8 Hashing - PowerPoint PPT Presentation

Chapter 8 hashing
Download
1 / 16

  • 136 Views
  • Uploaded on
  • Presentation posted in: General

Chapter 8 Hashing. Part II. Dynamic Hashing. Also called extendible hashing Motivation Limitations of static hashing When the table is to be full, overflows increase. As overflows increase, the overall performance decreases.

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.

Download Presentation

Chapter 8 Hashing

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript


Chapter 8 hashing

Chapter 8 Hashing

Part II


Dynamic hashing

Dynamic Hashing

  • Also called extendible hashing

  • Motivation

    • Limitations of static hashing

      • When the table is to be full, overflows increase. As overflows increase, the overall performance decreases.

      • We cannot just copy entries from smaller into a corresponding buckets of a bigger table.

      • The use of memory space is not flexible.

Hash table

Keys

0

k1

k2

k3

1

h (Hash function)

2

n


Properties of dynamic hashing

Properties of Dynamic Hashing

  • Allow the size of dictionary to grow and shrink.

    • The size of hash table can be changed dynamically.

    • The term “dynamically” implies the following two things can be modified:

      • Hash function

      • The size of hash table

Hash table

Keys

Keys

Hash table

0

k1

k2

k3

k1

k2

k3

0

h

h’

m

m

m’


8 3 2 dynamic hashing using directories

8.3.2 Dynamic Hashing Using Directories

  • Use an auxilinary table to record the pointer of each bucket.

Disk

(Directory)

Bucket 1

Auxilinary table

Keys

k1

k2

k3

Bucket 2

Bucket 3

d


Dynamic hashing using directories

Dynamic Hashing Using Directories

  • Define the hash function h(k) transforms k into 6-bit binary integer.

  • For example:


Dynamic hashing using directories1

Dynamic Hashing Using Directories

  • The size of d is 2r, where r is the number of bits used to identify all h(x).

    • Initially, Let r = 2. Thus, the size of d = 22 = 4.

  • Suppose h(k, p) is defined as the p least significant bits in h(k), where p is also called dictionary depth.

  • E. g.

    • h(C5) = 110 101

    • h(C5, 2) = 01

    • h(C5, 3) = 101


Process to expand the directory

Process to Expand the Directory

  • Consider the following keys have been already stored. The least r is 2 to differentiate all the input keys.

Directory of pointers to buckets

00

A0

B0

01

A1

B1

10

C2

11

C3

d


When c5 110101 is to enter

When C5 (110101) is to enter

  • Since r=2 and h(C5, 2) = 01, follow the pointer of d[01].

  • A1 and B1 have been at d[01]. Bucket overflows.

    • Find the least u such that h(C5, u) is not the same with some keys in h(C5, 2) (01) bucket.

      • In this case, u = 3.

      • Step 2-1

        Since u > r, expand the size of d to 2u and duplicate the pointers to the new half (why?).


When c5 110101 is to enter1

When C5 (110101) is to enter

  • Table的size變大後照理說每個entry都必須重新根據新的hash function重算其所在的bucket。但由於重算耗費的時間太多,所以暫時保留在原來的bucket,在新的bucket中保留舊的pointer,直到下次出現overflow時再重算。

000

A0

B0

001

A1

B1

010

C2

011

C3

100

101

110

111


When c5 110101 is to enter2

When C5 (110101) is to enter

  • Step 2-2

    • Rehash identifiers 01 (A1 and B1) and C5 using new hash function h(k, u).

  • Step 2-3

    • Let r = u = 3.

000

A0

B0

001

A1

B1

010

C2

011

C3

100

101

C5

110

111


When c1 110001 is to enter

When C1 (110001) is to enter

  • Since r=3 and h(C1, 3) = 001, follow the pointer of d[001].

  • A1 and B1 have been at d[001]. Bucket overflows.

    • Find the least u such that h(C1, u) is not the same with some keys in h(C1, 3) (001) bucket.

      • In this case, u = 4.

      • Step 2-1

        Since u > r, expand the size of d to 2u and duplicate the pointers to the new half.


Chapter 8 hashing

0000

A0

B0

0001

A1

B1

0010

C2

0011

C3

0100

C5

0101

0110

0111

1000

1001

1010

1011

1100

1101

1110

1111


Chapter 8 hashing

0000

A0

B0

0001

A1

C1

0010

  • Step 2-2

    • Rehash identifiers 001 (A1 and B1) and C1 using new hash function h(k, u).

  • Step 2-3

    • Let r = u = 4.

C2

0011

C3

0100

C5

0101

0110

0111

1000

1001

B1

1010

1011

1100

1101

1110

1111


When c4 110100 is to enter

When C4 (110100) is to enter

  • Since r=4 and h(C4, 4) = 0100, follow the pointer of d[0100].

  • A0 (100000) and B0 ((101000)) have been at d[0100]. Bucket overflows.

    • Find the least u such that h(C1, u) is not the same with some keys in h(C1, 4) (0100) bucket.

      • In this case, u = 3.

      • Step 2-1

        Since u = 3 < r = 4, d is not required to expand its size.


Chapter 8 hashing

A0

B0

0000

A1

C1

0001

C2

0010

C3

0011

C4

0100

C5

0101

0110

0111

1000

B0

1001

B1

1010

1011

1100

1101

1110

1111


Advantages

Advantages

  • Only doubling directory rather than the whole hash table used in static hashing.

  • Only rehash the entries in the buckets that overflows.


  • Login