Distributed DBMSs – Concepts and Design

Chapter 24 in Textbook


Overview

  • Concepts.

    • What is a distributed DBMS?

    • Distributed Processing.

    • Homogeneous vs. Heterogeneous.

  • Functions of a DDBMS.

  • Components of a DDBMS.

  • Advantages and Disadvantages.

  • DDBMS Design.

    • Fragmentation.

    • Replication.

    • Allocation.

  • DDBMS Transparencies.

  • Date’s 12 Rules for a DDBMS.


Concepts

  • Centralized DBMS: a system with a single logical database located at one site under the control of a single DBMS.

  • Distributed DB: a logically interrelated collection of shared data physically distributed over a computer network.

  • Applications can be classified into:

    • Local applications: need no data from other sites.

    • Global applications: need data from other sites.


Distributed DBMS

  • Distributed DBMS: the software system that:

    • manages the distributed DBs.

    • makes distribution transparent to users.

    • allows users to access data on their own site as well as remote sites.

  • Transparent distribution is the fundamental principle of DDBMS.


Characteristics of DDBMS

  • A collection of logically related shared data.

  • The data is split into a number of fragments.

  • Fragments may be replicated.

  • Fragments/replicas are allocated to sites.

  • The sites are linked by a communications network.

  • The data at each site is under the control of a DBMS.

  • The DBMS at each site can handle local applications.

  • Each DBMS participates in at least one global application.


Distributed DBMS Topology

[Figure: Sites 1, 2, 3 and 4 connected by a computer network.]

Data itself is distributed and access to it can be local or remote.


Distributed Processing

[Figure: Sites 1, 2, 3 and 4 connected by a computer network.]

Data itself is centralized but access to it can be local or remote.


Homogeneous vs. Heterogeneous DDBMS

  • Homogeneous system: all sites use the same DBMS product.

  • Heterogeneous system: sites may run different DBMS products and use different data models.

    • Possible differences between data in different DBs:

      • Data type difference.

      • Value difference.

      • Semantic difference.


Functions of a DDBMS

  • Provide access to remote sites and allow transfer of queries and data among the network’s sites.

  • Store data distribution details.

  • Distributed data processing.

  • Security control.

  • Concurrency control.

  • Recovery services.


Components of a DDBMS

[Figure: two sites (Site 1 and Site 3) connected by a computer network. Labelled components: GSC (global system catalog), DDBMS, DC (data communication component), LDBMS (local DBMS) and DB.]


Advantages of DDBMS

  • Reflects organizational structure.

  • Improved shareability and local autonomy.

  • Improved availability.

  • Improved reliability.

  • Improved performance.


Disadvantages of DDBMS

  • Complexity.

  • Cost.

  • Security.

  • Integrity control more difficult.

  • Lack of standards.

  • Lack of experience.

  • DB design more complex.


Distributed Relational DB Design

  • We have a group of tables and we want to distribute them between a group of sites.

  • Consists of 3 major steps:

    • Fragmentation: divide a relation into a number of sub-relations (fragments), horizontally or vertically.

    • Replication: make a copy of a fragment.

    • Allocation: decide at which site each fragment and replica is stored.


Distributed Relational DB Design

  • When we fragment, replicate and allocate, we try to achieve:

    • Locality of reference.

    • Improved reliability and availability.

    • Good performance.

    • Balanced storage capacities and costs.

    • Minimal communication costs.


Rules of Fragmentation

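  • Completeness: nothing (rows or columns) gets lost when we fragment.

  • Reconstruction: we can get back the original table after fragmenting it.

  • Disjointness: no row or column appears in more than one fragment. The one exception is vertical fragmentation, where the primary key is repeated in every fragment so that the table can be reconstructed.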
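
The three rules can be checked mechanically for a set of horizontal fragments. The following is a minimal sketch, not a definitive implementation: the function name `check_horizontal` and the three sample rows are assumptions made up for illustration, and rows are compared in memory rather than across sites.

```python
# Hypothetical sample rows; the real PropertyForRent data is not shown in the slides.
rows = [
    {"propertyNo": "PA14", "type": "House"},
    {"propertyNo": "PL94", "type": "Flat"},
    {"propertyNo": "PG4", "type": "Flat"},
]


def check_horizontal(table, fragments, key):
    """Return (complete, reconstructible, disjoint) for horizontal fragments."""
    table_keys = [r[key] for r in table]
    frag_keys = [r[key] for f in fragments for r in f]
    # Completeness: every row of the table appears in some fragment.
    complete = set(table_keys) <= set(frag_keys)
    # Reconstruction: the union of the fragments equals the original table.
    union = {tuple(sorted(r.items())) for f in fragments for r in f}
    original = {tuple(sorted(r.items())) for r in table}
    reconstructible = union == original
    # Disjointness: no row key appears in more than one fragment.
    disjoint = len(frag_keys) == len(set(frag_keys))
    return complete, reconstructible, disjoint


houses = [r for r in rows if r["type"] == "House"]
flats = [r for r in rows if r["type"] == "Flat"]
print(check_horizontal(rows, [houses, flats], "propertyNo"))
```

Dropping the `flats` fragment breaks completeness, and listing it twice breaks disjointness, so each rule fails independently of the others.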


Types of Fragmentation

  • Horizontal fragmentation.

  • Vertical fragmentation.

  • Mixed fragmentation.


Original PropertyForRent Table


Horizontal Fragmentation

Based on type of property.

P1: σ_{Type='House'}(PropertyForRent)

P2: σ_{Type='Flat'}(PropertyForRent)

[Tables: Fragment P1 and Fragment P2.]
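
The two selections can be tried out with SQLite from Python. This is a sketch under stated assumptions: the three sample rows and the reduced column list are made up for illustration, and the fragments are stored as local tables rather than at separate sites.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE PropertyForRent "
    "(propertyNo TEXT PRIMARY KEY, type TEXT, branchNo TEXT)"
)
con.executemany(
    "INSERT INTO PropertyForRent VALUES (?, ?, ?)",
    # Made-up rows for illustration only.
    [("PA14", "House", "B007"), ("PL94", "Flat", "B005"), ("PG4", "Flat", "B003")],
)

# Each horizontal fragment is a selection on the fragmentation attribute.
con.execute("CREATE TABLE P1 AS SELECT * FROM PropertyForRent WHERE type = 'House'")
con.execute("CREATE TABLE P2 AS SELECT * FROM PropertyForRent WHERE type = 'Flat'")

# Reconstruction is the union of the fragments.
rebuilt = con.execute(
    "SELECT * FROM P1 UNION SELECT * FROM P2 ORDER BY propertyNo"
).fetchall()
original = con.execute("SELECT * FROM PropertyForRent ORDER BY propertyNo").fetchall()
print(rebuilt == original)
```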


Original Staff Table


Vertical Fragmentation

S1: Π_{staffNo, position, sex, DOB, salary}(Staff)

S2: Π_{staffNo, fName, lName, branchNo}(Staff)

[Tables: Fragment S1 and Fragment S2.]
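
Vertical fragments are projections that each keep the primary key, so the table can be rebuilt with a join on staffNo. The sketch below uses SQLite from Python; the two sample rows are illustrative assumptions, not data taken from the slides.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE Staff (staffNo TEXT PRIMARY KEY, fName TEXT, lName TEXT, "
    "position TEXT, sex TEXT, DOB TEXT, salary INTEGER, branchNo TEXT)"
)
con.executemany(
    "INSERT INTO Staff VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [  # Illustrative rows only.
        ("SL21", "John", "White", "Manager", "M", "1945-10-01", 30000, "B005"),
        ("SG37", "Ann", "Beech", "Assistant", "F", "1960-11-10", 12000, "B003"),
    ],
)

# Each vertical fragment is a projection that includes the key staffNo.
con.execute("CREATE TABLE S1 AS SELECT staffNo, position, sex, DOB, salary FROM Staff")
con.execute("CREATE TABLE S2 AS SELECT staffNo, fName, lName, branchNo FROM Staff")

# Reconstruction joins the fragments on the shared key.
rebuilt = con.execute(
    "SELECT S2.staffNo, fName, lName, position, sex, DOB, salary, branchNo "
    "FROM S1 JOIN S2 ON S1.staffNo = S2.staffNo ORDER BY S2.staffNo"
).fetchall()
original = con.execute("SELECT * FROM Staff ORDER BY staffNo").fetchall()
print(rebuilt == original)
```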


Mixed Fragmentation – Vertical then Horizontal

S1: Π_{staffNo, position, sex, DOB, salary}(Staff)

S2: Π_{staffNo, fName, lName, branchNo}(Staff)

S2.1: σ_{BranchNo='B005'}(S2)

S2.2: σ_{BranchNo='B003'}(S2)

S2.3: σ_{BranchNo='B007'}(S2)

[Tables: Fragment S1, and Fragments S2.1, S2.2 and S2.3 with columns StaffNo, FName, LName, BranchNo.]
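
Mixed fragmentation composes the two operations: project first, then select on one of the projections. A minimal in-memory sketch follows. The helper names `project` and `select`, the sample rows, and the shortened column list (sex, DOB and salary are left out) are assumptions made for brevity.

```python
staff = [  # Illustrative rows, not from the slides.
    {"staffNo": "A1", "fName": "Ann", "lName": "Lee", "position": "Manager", "branchNo": "B003"},
    {"staffNo": "A2", "fName": "Bob", "lName": "Ray", "position": "Assistant", "branchNo": "B005"},
    {"staffNo": "A3", "fName": "Cy", "lName": "Orr", "position": "Manager", "branchNo": "B007"},
]


def project(rows, cols):
    """Vertical fragmentation: keep only the listed columns."""
    return [{c: r[c] for c in cols} for r in rows]


def select(rows, col, value):
    """Horizontal fragmentation: keep only rows where col equals value."""
    return [r for r in rows if r[col] == value]


s1 = project(staff, ["staffNo", "position"])
s2 = project(staff, ["staffNo", "fName", "lName", "branchNo"])
# S2 is then split horizontally by branch number.
s2_frags = {b: select(s2, "branchNo", b) for b in ("B003", "B005", "B007")}

# Reconstruct: union the horizontal fragments, then join with S1 on staffNo.
s2_rebuilt = [r for frag in s2_frags.values() for r in frag]
s1_by_key = {r["staffNo"]: r for r in s1}
rebuilt = [{**r, **s1_by_key[r["staffNo"]]} for r in s2_rebuilt]
print(sorted(rebuilt, key=lambda r: r["staffNo"]) == staff)
```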


Derived Horizontal Fragmentation

  • Derived horizontal fragmentation is the horizontal fragmentation of a child table, T1, that follows from the horizontal fragmentation of a related parent table, T2.

  • It is not explicitly specified in the design but is implied by the fragmentation of T2.

  • T1 (child) has a foreign key that references T2 (parent).

  • The relationship from T1 to T2 is either 1-to-1 or many-to-1.

  • Use the semi-join operation: T1_i = T1 ⋉ T2_i, joining on the foreign key.


Derived Horizontal Fragmentation

  • Suppose the design requires the Staff table to be horizontally fragmented:

    • S1: σ_{BranchNo='B003'}(Staff)

    • S2: σ_{BranchNo='B005'}(Staff)

    • S3: σ_{BranchNo='B007'}(Staff)


Derived Horizontal Fragmentation

[Tables: Fragments S1, S2 and S3 of Staff.]


Derived Horizontal Fragmentation

  • After we fragmented Staff, we found out that there is a table related to it, PropertyForRent.

  • Because Staff is now fragmented, it makes sense to fragment PropertyForRent too.

[Diagram: Staff (1) handles (N) PropertyForRent.]

S1: σ_{BranchNo='B003'}(Staff)

S2: σ_{BranchNo='B005'}(Staff)

S3: σ_{BranchNo='B007'}(Staff)

Pi: PropertyForRent ⋉_{staffNo} Si, for i = 1, 2, 3
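
The semi-join that defines each Pi keeps the PropertyForRent rows whose staffNo appears in the matching Staff fragment, without adding any Staff columns. The sketch below works on tuples in memory. The function name `semijoin` and all sample values are assumptions for illustration.

```python
# Illustrative data: (staffNo, branchNo) and (propertyNo, staffNo).
staff = [("E1", "B003"), ("E2", "B005"), ("E3", "B007"), ("E4", "B003")]
properties = [("PR1", "E1"), ("PR2", "E2"), ("PR3", "E4"), ("PR4", "E3")]


def semijoin(child, parent_fragment):
    """Keep child rows whose foreign key matches a key in the parent fragment."""
    parent_keys = {staff_no for staff_no, _branch in parent_fragment}
    return [row for row in child if row[1] in parent_keys]


# Primary horizontal fragmentation of the parent, by branch.
staff_frags = {b: [s for s in staff if s[1] == b] for b in ("B003", "B005", "B007")}
# Derived horizontal fragmentation of the child follows from it.
prop_frags = {b: semijoin(properties, frag) for b, frag in staff_frags.items()}
print(prop_frags)
```

Because each property is handled by exactly one member of staff, every property lands in exactly one derived fragment.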


Original PropertyForRent Table


Derived Horizontal Fragmentation

[Tables: Fragments P1, P2 and P3 of PropertyForRent.]


Transparencies in a DDBMS

  • 4 main transparencies:

    • Distribution Transparency.

      • Fragmentation.

      • Location.

      • Replication.

      • Local Mapping.

      • Naming.

    • Transaction Transparency.

    • Performance Transparency.

    • DBMS Transparency.


1. Distribution Transparency

  • Allows the user to perceive the DB as a single, logical entity.

  • Types:

    • Fragmentation: the user does not need to know the data is fragmented.

    • Location: the user does not need to know the location of fragments.

    • Replication: the user does not need to know the fragments are replicated.

    • Local Mapping: the user specifies the fragment and its location.

    • Naming: the DDBMS makes sure every item name is unique.

  • Consider the distribution of the STAFF relation:

    • S1: staffno,Position,sex,DOB, Salary(STAFF)

    • S2: staffno,fname,lname,BranchNo(STAFF)

    • S21:  BranchNo=‘B003’ (S2)

    • S22:  BranchNo=‘B005’ (S2)

    • S22:  BranchNo=‘B007’ (S2)


a. Fragmentation Transparency

  • Highest level of distribution transparency.

  • The user does not need to know that the data is fragmented.

  • User treats DDB like a centralized DB.

  • Database accesses are based on the global schema.

  • Fragmentation of the data can be changed without impacting the user.

  • Example:

    SELECT fName, lName
    FROM Staff
    WHERE position = 'Manager';
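
One way to picture fragmentation transparency is a view named Staff that the system resolves into the fragments S1, S21, S22 and S23. The sketch below imitates this in a single SQLite database. A real DDBMS would resolve the fragments across sites through its global catalog, so the view, the sample rows and the reduced column list are all assumptions for illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE S1  (staffNo TEXT, position TEXT);
CREATE TABLE S21 (staffNo TEXT, fName TEXT, lName TEXT, branchNo TEXT);
CREATE TABLE S22 (staffNo TEXT, fName TEXT, lName TEXT, branchNo TEXT);
CREATE TABLE S23 (staffNo TEXT, fName TEXT, lName TEXT, branchNo TEXT);
-- Illustrative rows only.
INSERT INTO S1  VALUES ('E1', 'Manager'), ('E2', 'Assistant'), ('E3', 'Manager');
INSERT INTO S21 VALUES ('E1', 'Ann', 'Lee', 'B003');
INSERT INTO S22 VALUES ('E2', 'Bob', 'Ray', 'B005');
INSERT INTO S23 VALUES ('E3', 'Cy', 'Orr', 'B007');
-- The global relation: union the horizontal fragments, then join with S1.
CREATE VIEW Staff AS
  SELECT s2.staffNo AS staffNo, s2.fName AS fName, s2.lName AS lName,
         S1.position AS position, s2.branchNo AS branchNo
  FROM (SELECT * FROM S21 UNION ALL SELECT * FROM S22 UNION ALL SELECT * FROM S23) AS s2
  JOIN S1 ON S1.staffNo = s2.staffNo;
""")

# The user writes the query against the global schema only.
managers = con.execute(
    "SELECT fName, lName FROM Staff WHERE position = 'Manager' ORDER BY lName"
).fetchall()
print(managers)
```

If the fragmentation changes, only the view definition has to change; the user's query stays the same.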


b. Location Transparency

  • The middle level of distribution transparency.

  • The user must know that the data is fragmented but still does not need to know the location of the data.

  • Data location can be changed without impact on the user.

  • Example:

    SELECT fName, lName FROM S21
    WHERE staffNo IN (SELECT staffNo FROM S1 WHERE position = 'Manager')
    UNION
    SELECT fName, lName FROM S22
    WHERE staffNo IN (SELECT staffNo FROM S1 WHERE position = 'Manager')
    UNION
    SELECT fName, lName FROM S23
    WHERE staffNo IN (SELECT staffNo FROM S1 WHERE position = 'Manager');


c. Replication Transparency

  • The user is unaware of replication and location but knows that the data is fragmented.

  • On the same level with location transparency.


d. Local Mapping Transparency

  • The lowest level of distribution transparency.

  • The user knows that the data is fragmented and the location of the data.

  • Example:

    -- AT SITE is illustrative notation, not standard SQL.
    SELECT fName, lName FROM S21 AT SITE 3
    WHERE staffNo IN
        (SELECT staffNo FROM S1 AT SITE 5 WHERE position = 'Manager')
    UNION
    SELECT fName, lName FROM S22 AT SITE 5
    WHERE staffNo IN
        (SELECT staffNo FROM S1 AT SITE 5 WHERE position = 'Manager')
    UNION
    SELECT fName, lName FROM S23 AT SITE 7
    WHERE staffNo IN
        (SELECT staffNo FROM S1 AT SITE 5 WHERE position = 'Manager');


e. Naming Transparency

  • Each item in distributed database must have a unique name.

  • DDBMS must ensure that no two sites violate that.

  • Solutions:

    • Create a central name server. Drawbacks:

      • It becomes a bottleneck.

      • It works against local autonomy.

    • Prefix each object with the identifier of the site that created it. Drawback:

      • Loss of distribution transparency.
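
The site-prefix solution can be sketched in a few lines. The class name `SiteCatalog`, the site identifiers `S1` and `S2`, and the dotted name format are hypothetical choices for illustration, not a format prescribed by the slides.

```python
class SiteCatalog:
    """Names objects created at one site by prefixing the site identifier."""

    def __init__(self, site_id):
        self.site_id = site_id
        self.local_names = set()

    def create(self, local_name):
        # Uniqueness only has to be checked locally, so no central server is needed.
        if local_name in self.local_names:
            raise ValueError(f"{local_name} already exists at {self.site_id}")
        self.local_names.add(local_name)
        return f"{self.site_id}.{local_name}"


london = SiteCatalog("S1")
glasgow = SiteCatalog("S2")
names = [london.create("Staff"), glasgow.create("Staff")]
print(names)
```

Both sites can create an object called Staff without coordinating, yet the global names differ. The cost is visible in the names themselves: each one reveals where the object was created.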


2. Transaction Transparency

  • All transactions must ensure the consistency and integrity of the DDB.

  • Each transaction that needs to access data in multiple sites is divided into multiple sub-transactions.

  • Even if a transaction is split, atomicity must be maintained: either all of its sub-transactions commit or none do.
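
All-or-nothing behaviour across sites can be sketched with a coordinator that first asks every site whether its sub-transaction can be applied and commits only if all agree. This is a deliberately simplified illustration in the style of a two-phase commit, which the slides do not name. The `Site` class, the balance example and `run_global` are assumptions, and failures, logging and timeouts are ignored.

```python
class Site:
    """A site holding one numeric value that must never go negative."""

    def __init__(self, name, balance):
        self.name = name
        self.balance = balance
        self._pending = None

    def prepare(self, delta):
        """Vote yes only if the local sub-transaction can be applied."""
        if self.balance + delta < 0:
            return False
        self._pending = delta
        return True

    def commit(self):
        self.balance += self._pending
        self._pending = None

    def abort(self):
        self._pending = None


def run_global(subtransactions):
    """Apply every (site, delta) sub-transaction, or none of them."""
    prepared = []
    for site, delta in subtransactions:
        if not site.prepare(delta):
            for s in prepared:  # one refusal aborts the whole transaction
                s.abort()
            return False
        prepared.append(site)
    for s in prepared:
        s.commit()
    return True


london, glasgow = Site("London", 100), Site("Glasgow", 50)
ok = run_global([(london, -70), (glasgow, +70)])
rejected = run_global([(london, +10), (glasgow, -500)])
print(ok, rejected, london.balance, glasgow.balance)
```

In the second call London could apply its part, but Glasgow refuses, so neither balance changes.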


3. Performance Transparency

  • The DDBMS performs as if it were a centralized DBMS.

  • Performance should not suffer because the system is distributed (for example, from network communication cost).

  • When a site issues a query, the system must work out the fastest way of executing it.

  • The Distributed Query Processor (DQP) must work out:

    • Which fragment to access.

    • Which copy of a fragment to access (if replication is used).

    • Where the fragments are located.


3. Performance Transparency

  • Consider the following distributed DB:

    • Property(propertyNo, city): 10,000 records in London.

    • Client(clientNo, maxPrice): 100,000 records in Glasgow.

    • Viewing(propertyNo, clientNo): 1,000,000 records in London.

  • London site wants to list properties in Aberdeen that have been viewed by clients who have a maximum price limit greater than 200,000.

    SELECT p.propertyNo
    FROM Property p INNER JOIN
        (Client c INNER JOIN Viewing v ON c.clientNo = v.clientNo)
        ON p.propertyNo = v.propertyNo
    WHERE p.city = 'Aberdeen' AND
        c.maxPrice > 200000;


3. Performance Transparency

  • After the query is issued, DDBMS must determine the most cost-effective strategy to execute the query.

  • Strategies:

    • Move the Client table to London and process the query there.

    • Move the Property and Viewing relations to Glasgow, process the query there, then return the result.

    • Join Property and Viewing at London, project only the property number and client number, move the result to Glasgow to join with clients with maxPrice > 200,000, then return the result.

    • Select clients at Glasgow with maxPrice > 200,000, move them to London and join them with Viewing and the Aberdeen properties.
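
The strategies can be compared with a rough communication-cost estimate. The table sizes come from the example; everything else is an assumption chosen only for illustration: 100 bits per tuple, a 10,000 bits-per-second link, a 1-second delay per transfer, 10 clients above the price limit, and 100,000 viewings of Aberdeen properties. The third strategy is assumed to select the Aberdeen properties before shipping, and the cost of returning the final result is ignored.

```python
TUPLE_BITS = 100   # assumed size of every tuple, in bits
RATE_BPS = 10_000  # assumed network data rate, bits per second
DELAY_S = 1.0      # assumed fixed delay per transfer, in seconds


def transfer_time(tuples):
    """Seconds needed to ship the given number of tuples in one transfer."""
    return DELAY_S + tuples * TUPLE_BITS / RATE_BPS


PROPERTY, CLIENT, VIEWING = 10_000, 100_000, 1_000_000  # sizes from the example
RICH_CLIENTS = 10            # assumed clients with maxPrice > 200,000
ABERDEEN_VIEWINGS = 100_000  # assumed viewings of Aberdeen properties

strategies = {
    "1: ship Client to London": transfer_time(CLIENT),
    "2: ship Property and Viewing to Glasgow":
        transfer_time(PROPERTY) + transfer_time(VIEWING),
    "3: ship joined (propertyNo, clientNo) pairs to Glasgow":
        transfer_time(ABERDEEN_VIEWINGS),
    "4: ship selected clients to London": transfer_time(RICH_CLIENTS),
}

for name, seconds in sorted(strategies.items(), key=lambda kv: kv[1]):
    print(f"{seconds:10.1f} s  {name}")
best = min(strategies, key=strategies.get)
```

Under these assumptions, shipping only the few selected clients is cheapest by several orders of magnitude, which is why reducing data before it crosses the network matters so much to the query processor.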


4. DBMS Transparency

  • Hides the fact that different sites may run different local DBMSs.

  • Applies only to heterogeneous DDBMSs.


Date’s 12 Rules for a DDBMS

  • Local autonomy.

  • No reliance on a central site.

  • Continuous operation.

  • Location independence.

  • Fragmentation independence.

  • Replication independence.

  • Distributed query processing.

  • Distributed transaction processing.

  • Hardware independence.

  • Operating system independence.

  • Network independence.

  • Database independence.
