
Distributed Database Systems

Dr. Mohamed Osman Hegazi

Definitions:

  • Distributed database (DDB): a collection of multiple, logically interrelated databases distributed over a computer network.
  • Distributed database management system (DDBMS): the software that permits the management of the DDB and makes the distribution transparent to the users.
  • Distributed database system (DDBS) = DDB + DDBMS
  • The two important terms in these definitions are:
    • Logically interrelated (the application)
    • Distributed over a network


Motivation for Distributed Database

  • The development of computer networks promotes decentralization.
  • In a company, the database organization might reflect the organizational structure, which is distributed into units, each of which maintains its own database.
  • Data can be shared by developing a distributed database system that:
    • Makes data accessible by all units
    • Stores data close to where it is most frequently used


DDBMS Advantages:

  • Data are located near the site of greatest demand
  • Faster data access
  • Faster data processing
  • Growth facilitation
  • Improved communications
  • Reduced operating costs
  • User-friendly interface
  • Less danger of a single-point failure
  • Processor independence


DDBMS Disadvantages

  • Complexity of management and control
  • Security
  • Lack of standards
  • Increased storage requirements
  • Greater difficulty in managing the data environment
  • Increased training cost


The concept of DDB:

A DDBS is not merely a collection of files stored individually at the nodes of a computer network. To form a DDBS, the files must not only be logically related; there must also be structure among them, and access must be through a common interface.

An Example

EMP(ENO, ENAME, TITLE)

ASG(ENO, PNO, DUR, RESP)

PROJ(PNO, PNAME, BUDGET)

PAY(TITLE,SAL)

Distributed Query
  • If these tables are stored in one place, we can use, for example, the following query to get the name and salary of each employee who has worked more than 12 months on an assignment.

SELECT ENAME, SAL
FROM EMP, ASG, PAY
WHERE ASG.DUR > 12
AND EMP.ENO = ASG.ENO
AND PAY.TITLE = EMP.TITLE

  • But if these tables are distributed over different sites, executing this query requires a lot of additional processing. The DDBMS does this processing and lets the end user feel like the database's only user (transparency).
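As a rough illustration of the extra work that distribution implies, the sketch below splits the example relations over two in-memory SQLite databases that stand in for sites. The placement (ASG at one site, EMP and PAY at the other), the sample rows and the names `make_site`, `site_a`, `site_b` and `distributed_query` are all assumptions made for this example; a real DDBMS would choose its own execution strategy.

```python
import sqlite3

def make_site(script):
    """Create an in-memory database standing in for one site (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(script)
    return conn

# Assumed placement: ASG lives at site A; EMP and PAY live at site B.
# All rows are invented sample data.
site_a = make_site("""
    CREATE TABLE ASG (ENO TEXT, PNO TEXT, DUR INTEGER, RESP TEXT);
    INSERT INTO ASG VALUES ('E1', 'P1', 24, 'Manager'), ('E2', 'P2', 6, 'Analyst');
""")
site_b = make_site("""
    CREATE TABLE EMP (ENO TEXT, ENAME TEXT, TITLE TEXT);
    CREATE TABLE PAY (TITLE TEXT, SAL INTEGER);
    INSERT INTO EMP VALUES ('E1', 'Ann', 'Engineer'), ('E2', 'Bob', 'Analyst');
    INSERT INTO PAY VALUES ('Engineer', 40000), ('Analyst', 34000);
""")

def distributed_query():
    # Step 1 (site A): apply the selection locally and ship only the matching ENOs.
    enos = [row[0] for row in
            site_a.execute("SELECT DISTINCT ENO FROM ASG WHERE DUR > 12")]
    if not enos:
        return []
    # Step 2 (site B): join the shipped ENOs with EMP and PAY.
    marks = ",".join("?" * len(enos))
    sql = f"""SELECT ENAME, SAL FROM EMP, PAY
              WHERE PAY.TITLE = EMP.TITLE AND EMP.ENO IN ({marks})"""
    return site_b.execute(sql, enos).fetchall()
```

Note that this two-step plan returns one row per qualifying employee, whereas the single-site join returns one row per qualifying assignment. Hiding such steps from the user is what the slide calls transparency.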


Distributed Database Transparency

  • The concept of a DDB is to fragment the data and store each fragment at its own site.
  • Data may also be replicated at different sites (replication).
  • The DDBMS hides these details and makes the distribution transparent to the users.

Distributed Database Transparency Features

  • Distribution transparency
  • Transaction transparency
  • Failure transparency
  • Performance transparency
  • Heterogeneity transparency

Distributed DB Design

Top-down approach:

  • Start with a single database
  • Decide how to split it and allocate the parts to individual sites

Two issues in top-down design

  • Fragmentation
  • Allocation

Multi-databases (or bottom-up):

  • combine existing databases
  • how to deal with heterogeneity & autonomy

Fragmentation
  • Horizontal
    • Primary: depends on local attributes
    • Derived: depends on a foreign relation
  • Vertical


Example

Motivation: two sites, Sa and Sb; query Qa is issued at Sa and query Qb at Sb.

Employee relation E(#, name, loc, sal, …)

40% of queries:
Qa: select * from E where loc = Sa and …

40% of queries:
Qb: select * from E where loc = Sb and …


E
#   Name   Loc  Sal
5   Joe    Sa   10
7   Sally  Sb   25
8   Tom    Sa   15
..  ..

F = {F1, F2}

F1 = σ_{loc=Sa}(E), stored at Sa
#   Name   Loc  Sal
5   Joe    Sa   10
8   Tom    Sa   15
..

F2 = σ_{loc=Sb}(E), stored at Sb
#   Name   Loc  Sal
7   Sally  Sb   25
..

⇒ primary horizontal fragmentation
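The fragmentation above can be sketched in a few lines of Python, using the three rows of E shown on the slide. The helper name `select_loc` is an assumption made for this example; the point is only that each fragment is a selection on loc, that the fragments do not overlap, and that their union gives back E.

```python
# Rows of E(#, name, loc, sal) as shown on the slide.
E = [(5, "Joe", "Sa", 10), (7, "Sally", "Sb", 25), (8, "Tom", "Sa", 15)]

def select_loc(relation, site):
    """Selection on loc = site (hypothetical helper for this sketch)."""
    return [row for row in relation if row[2] == site]

F1 = select_loc(E, "Sa")   # fragment stored at Sa
F2 = select_loc(E, "Sb")   # fragment stored at Sb

# Reconstruction: the union of the fragments contains exactly the rows of E.
reconstructed = sorted(F1 + F2)
```

Because every row has exactly one loc value, each row lands in exactly one fragment, which is why a query such as Qa can be answered from F1 alone at site Sa.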


Qa: select … loc = SA …

Qb: select … loc = SB …

[Figure: E divided into four regions by the predicates Loc=SA / Loc=SB and sal < 10 / sal ≥ 10, with candidate fragmentations labelled F1, F2 and F3]

Prefer F2 to F1 and F3


Horizontal Fragmentation:

Peer-to-peer relationship between the fragments (siblings)

Vertical fragmentation

Example: [Figure: relation E split by columns into fragments E1 and E2]

R[T] ⇒ R1[T1], R2[T2], …, Rn[Tn], where Ti ⊆ T

⇒ Just like normalization of relations


Vertical Fragmentation example

PROJ
PNO  PNAME              BUDGET  LOC
P1   Instrumentation    150000  Montreal
P2   Database Develop.  135000  New York
P3   CAD/CAM            250000  New York
P4   Maintenance        310000  Paris
P5   CAD/CAM            500000  Boston

PROJ1: information about project budgets
PNO  BUDGET
P1   150000
P2   135000
P3   250000
P4   310000
P5   500000

PROJ2: information about project names and locations
PNO  PNAME              LOC
P1   Instrumentation    Montreal
P2   Database Develop.  New York
P3   CAD/CAM            New York
P4   Maintenance        Paris
P5   CAD/CAM            Boston
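A minimal sketch of this vertical fragmentation, using the PROJ rows from the slide: each fragment is a projection that keeps the key PNO, and the original relation is rebuilt by joining the fragments on PNO. The variable names `budget_of` and `rebuilt` are assumptions made for this example.

```python
# PROJ(PNO, PNAME, BUDGET, LOC) as shown on the slide.
PROJ = [
    ("P1", "Instrumentation", 150000, "Montreal"),
    ("P2", "Database Develop.", 135000, "New York"),
    ("P3", "CAD/CAM", 250000, "New York"),
    ("P4", "Maintenance", 310000, "Paris"),
    ("P5", "CAD/CAM", 500000, "Boston"),
]

# Each vertical fragment is a projection that keeps the key PNO.
PROJ1 = [(pno, budget) for pno, _, budget, _ in PROJ]      # budgets
PROJ2 = [(pno, pname, loc) for pno, pname, _, loc in PROJ]  # names and locations

# Reconstruction: join the two fragments on the shared key PNO.
budget_of = dict(PROJ1)
rebuilt = [(pno, pname, budget_of[pno], loc) for pno, pname, loc in PROJ2]
```

Repeating the key in every fragment is what makes the join lossless; without PNO in both fragments the rows could not be matched up again.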

Grouping Attributes

Example:

E(#, NM, LOC, SAL) ⇒ E1(#, NM), E2(#, LOC), E3(#, SAL)

or

E(#, NM, LOC, SAL) ⇒ E1(#, NM, LOC), E2(#, SAL)

Which is the right vertical fragmentation?

…..


Vertical Fragmentation:

Branch relationship between the relation and its fragments (parent and children)

Hybrid Fragmentation

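  • R is horizontally fragmented (HF) into R1 and R2
    • R1 is vertically fragmented (VF) into R11 and R12
    • R2 is vertically fragmented (VF) into R21, R22 and R23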

Allocation

Example: E ⇒ F1 = σ_{loc=Sa}(E); F2 = σ_{loc=Sb}(E)

[Figure: relation E is fragmented into F1 and F2; copies of F1 are placed at two sites and F2 at one, among sites a, b and c]

Do we replicate fragments?

Where do we place each copy of each fragment?


Allocation Alternatives
  • Non-replicated
    • partitioned: each fragment resides at only one site
  • Replicated
    • fully replicated: each fragment at each site
    • partially replicated: each fragment at some of the sites
  • Rule:

If (read-only queries) / (update queries) ≥ 1, replication is advantageous;

otherwise replication may cause problems
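The rule can be written as a small check. This is only a sketch: it reads the rule as comparing the ratio of read-only queries to update queries with 1, and the function name, the `threshold` parameter and the handling of a workload with no updates are assumptions made for this example.

```python
def replication_advantageous(read_only_queries, update_queries, threshold=1.0):
    """Return True when the read-only/update ratio reaches the threshold.

    The threshold of 1 is an assumed reading of the rule of thumb.
    """
    if update_queries == 0:
        # Assumption: with no updates the copies cannot diverge,
        # so any read traffic makes replication worthwhile.
        return read_only_queries > 0
    return read_only_queries / update_queries >= threshold
```

For example, a workload of 90 read-only queries and 10 updates gives a ratio of 9, so replication looks advantageous; the reverse workload gives a ratio of about 0.11, and keeping the copies consistent is likely to cost more than the reads gain.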


Optimization problem (very hard)
  • What is the best placement of fragments and/or best number of copies to:
    • minimize query response time
    • maximize throughput
    • minimize “some cost”
    • ...
  • Subject to constraints
    • Available storage
    • Available bandwidth, processing power,…
    • Keep 90% of response time below X
    • ...
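To make the problem statement concrete, here is a toy exhaustive search for a non-replicated allocation. Everything in it is an assumption made for illustration: the cost model (each access from a site that does not hold the fragment costs `remote_cost`, local accesses are free), the storage constraint, and the names `best_allocation`, `access`, `size` and `capacity`.

```python
from itertools import product

def best_allocation(fragments, sites, access, size, capacity, remote_cost=1.0):
    """Try every placement of each fragment at exactly one site.

    access[f][s] is the number of accesses to fragment f issued at site s.
    Returns (placement dict, cost), or (None, inf) if nothing fits.
    """
    best, best_cost = None, float("inf")
    for placement in product(sites, repeat=len(fragments)):
        # Constraint: fragments placed at a site must fit its storage.
        used = {s: 0 for s in sites}
        for f, s in zip(fragments, placement):
            used[s] += size[f]
        if any(used[s] > capacity[s] for s in sites):
            continue
        # Objective: total cost of accesses that have to go to another site.
        cost = sum(access[f][s] * remote_cost
                   for f, home in zip(fragments, placement)
                   for s in sites if s != home)
        if cost < best_cost:
            best, best_cost = dict(zip(fragments, placement)), cost
    return best, best_cost
```

The loop visits len(sites) ** len(fragments) placements, and allowing replication enlarges the search space further, which gives a feel for why the slide calls the problem very hard.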

Replication

Replication stores copies of the same data at more than one location (site); these copies must then be updated consistently, despite their distance from each other.

The updating of these copies is controlled by one of two techniques:

  • Lazy replication: the other copies are updated after work on one of the copies (the master copy) has completed. This means that the update is propagated outside the boundaries of the transaction.
  • Eager replication: the replicated data are updated within the transaction boundaries, while work on one of the copies is in progress.

In either case, the update is applied in one of two ways:

  • Central update (primary copy): update the primary copy first and then update the secondary copies. Because all updates go through the primary copy, they need not be synchronized across sites, which simplifies consistency control, but the primary copy may become a bottleneck.
  • Update everywhere: any copy may be updated, so all copies have an equal opportunity to receive updates.
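The difference between lazy and eager propagation can be sketched with an in-memory toy. The class `Replicas` and its methods are invented for this example, copy 0 plays the role of the master (primary) copy, and a "transaction" is reduced to a single method call; real systems add locking, logging and failure handling.

```python
class Replicas:
    """Toy model of one data item set replicated at n sites (copy 0 is the master)."""

    def __init__(self, n):
        self.copies = [{} for _ in range(n)]
        self.pending = []  # updates made at the master but not yet propagated

    def eager_write(self, key, value):
        # Eager: every copy is updated inside the "transaction".
        for copy in self.copies:
            copy[key] = value

    def lazy_write(self, key, value):
        # Lazy: only the master is updated inside the "transaction".
        self.copies[0][key] = value
        self.pending.append((key, value))

    def propagate(self):
        # Later, outside the transaction, the other copies catch up.
        for key, value in self.pending:
            for copy in self.copies[1:]:
                copy[key] = value
        self.pending.clear()
```

After `lazy_write` and before `propagate`, a read at a secondary copy still sees the old state; `eager_write` avoids that window at the price of touching every copy before the write completes.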
