Architectures of Distributed Database Systems

Presentation Transcript

1. Architectures of Distributed Database Systems
  • What is an architecture?
  • What is a reference model?
  • Architecture dimensions: autonomy, distribution and heterogeneity
  • Differences between client/server, peer-to-peer and multi-DBS architectures
  • Directory management

2. Architecture
  • An architecture defines the structure of a system:
      • it identifies the components in the system
      • it defines the functions of each component
      • it defines the interrelationships and interactions among the components
  • Reference model
      • an idealized architectural model of the system, used for reference
      • a conceptual framework (functional, not physical)
      • its purpose is to divide standardization work into manageable (smaller) pieces and to show at a general level how these pieces are related to one another

3. An architecture showing the main components in a database system

4. Reference Model
  • Based on components (internal)
      • What are the components (functional units) in the system?
      • What are the relationships among the components?
  • Based on functions (users)
      • What are the different classes of users?
      • What are the functions for each class of users?
  • Based on data
      • What are the different types of data in the system?
      • How do the functional units use and access the data, and what are the output data of each functional unit?

5. ANSI/SPARC Architecture

6. Architecture based on Data Model
  • Internal view
      • deals with the physical definition and organization of data
  • External view
      • concerns how users view the database
      • each user may have their own view
      • a view may be shared by several users
  • Conceptual view
      • an abstract definition of the database used by the system (applications)
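
The three views above can be illustrated with a minimal Python sketch. The employee data, the attribute names and the helper functions are all hypothetical and chosen only for illustration; a real DBS separates these levels with far more machinery.

```python
# Internal view: how records are physically stored (here, plain tuples in a list).
INTERNAL_STORAGE = [
    (1, "Alice", "Sales", 5200),
    (2, "Bob", "Research", 6100),
]

# Conceptual view: the abstract definition of the database (attribute names).
CONCEPTUAL_SCHEMA = ("emp_id", "name", "dept", "salary")


def conceptual_records():
    """Map stored tuples to records described by the conceptual schema."""
    return [dict(zip(CONCEPTUAL_SCHEMA, row)) for row in INTERNAL_STORAGE]


def external_view(attributes):
    """Build a user-specific view that exposes only the given attributes."""
    return [{a: rec[a] for a in attributes} for rec in conceptual_records()]


# Two users (or groups of users) with different external views of the same data.
staff_view = external_view(["name", "dept"])
payroll_view = external_view(["emp_id", "salary"])
```

Changing the storage layout would only require changing `conceptual_records`; the external views are defined against the conceptual schema and would stay the same.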

7. DDBS Implementation Alternatives

8. Dimensions of implementation
  • Autonomy
      • refers to the distribution of control (NOT data) among the sites in the system, i.e., managing access to data and transaction commit
      • indicates the degree to which an individual site can operate independently
      • i.e., the processes at a site are less affected by (and less dependent on) the processes at other sites
      • i.e., an individual site may make the processing decision for a transaction without asking permission from other sites
  • Tight integration (low autonomy)
      • a single image of the entire database is available to all users
      • to the users, it is similar to a single database system
      • the sites work together closely to complete a transaction

9. Dimensions of implementation
  • Semi-autonomous (some degree of autonomy)
      • consists of DBSs that can operate independently but have decided to participate in a federation to make their local data sharable with other sites
  • Total isolation (high autonomy) (multi-DBSs)
      • the individual systems are stand-alone DBSs, but they are connected
      • mechanisms are provided for users to access other DBSs using their own language or a different language
      • mechanisms (i.e., for conversion) are provided to access other DBSs
  • How is remote data accessed in the different architectures? Which one do we prefer?

10. Dimensions of implementation
  • Distribution
      • refers to the physical distribution (or migration) of data and software components (for processing transactions) over multiple sites (from one server to another)
  • Client/server (distributed when required)
      • data are primarily stored at the server
      • the server provides services and the clients generate requests
      • clients may get the data when they need them
  • Peer-to-peer (fully distributed)
      • each site has a similar structure
      • no distinction between client and server (all are at the same level and have the same structure)
      • all sites are servers (as well as clients), and each DBS at a site maintains a fragment (portion) of the database

11. System concept
  • Local data item vs. remote copy
  • What is a system?
      • a well-defined function (i.e., transaction processing)
      • a well-defined system boundary
      • input and output
  • How to define a system boundary?
      • normally, the components within a system are closely related to each other (tightly coupled)
      • normally, the components within a system are not (or only loosely) related to the components outside the system boundary
  • System and sub-system
      • autonomy is the relationship among the sub-systems (which manage the data items and transaction processing) in a system
      • distribution is the division and distribution (even or uneven) of the software components (and data) of a system to sub-systems (at different sites)

12. System concept

13. Dimensions of implementation
  • Heterogeneity
      • differences between the sub-systems at different sites
      • occurs at various levels (hardware, communications, operating system)
      • database-related differences: data model, data format, query language, transaction management algorithms
      • when accessing other (remote) DBSs, conversions are required
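
As a rough illustration of the conversion step, the sketch below maps a record from a remote DBS into the local format. The remote layout (upper-case attribute names, salary stored as a string of cents, dates written as DD/MM/YYYY) is entirely invented for this example.

```python
from datetime import date


def remote_to_local(remote):
    """Convert one record from the assumed remote format to the local format."""
    day, month, year = (int(part) for part in remote["HIRED"].split("/"))
    return {
        "name": remote["EMP_NAME"].title(),    # remote stores names in upper case
        "salary": int(remote["SAL_CENTS"]) / 100,  # cents (string) -> currency units
        "hired": date(year, month, day),       # text date -> date object
    }


remote_record = {"EMP_NAME": "ALICE CHAN", "SAL_CENTS": "520000", "HIRED": "03/09/2001"}
local_record = remote_to_local(remote_record)
```

In a heterogeneous system a similar mapping would be needed for every pair of formats, or to and from one common format, and differences in query language or transaction management need their own translation layers.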

14. Example architectures
  • Notation: A: autonomy; D: distribution; H: heterogeneity
  • A0, D0, H0
      • no distribution and no data migration (D0); same hardware and data model (H0)
      • a set of multiple, logically related DBSs of the same type (H0)
      • users consider the whole system as a single DBS (A0)
  • A0, D1, H1
      • no autonomy: the sites work together to process transactions, and users consider the whole system as a single DBS (A0)
      • some components are distributed at multiple sites, and data may migrate to other sites (D1)
      • the data formats and the processing methods at different sites may differ, so the system needs to provide data and access-method conversion to access a remote DB (H1)

15. Example architectures
  • A0, D1, H0
      • no autonomy: users consider the whole system as a single DBS, and the sites work together closely to process transactions (A0)
      • a certain degree of data distribution, i.e., the data may be distributed between client and server (D1)
  • A1, D0, H0
      • each DBS has its own DB (D0)
      • each site partly contributes to the processing of (global) transactions in the whole system (A1)
      • each DBS may have its own transactions (local processing)
  • A2, D0, H1
      • full autonomy
      • different database systems at different sites
      • multi-database systems
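
The (A, D, H) notation can be captured in a small Python sketch. The class and function names are hypothetical; the three named combinations are the ones that appear as slide titles in this deck (client/server, peer-to-peer and multi-DBS).

```python
from typing import NamedTuple


class Architecture(NamedTuple):
    autonomy: int       # 0 = tight integration, 1 = semi-autonomous, 2 = total isolation
    distribution: int   # 0 = none, 1 = client/server, 2 = peer-to-peer
    heterogeneity: int  # 0 = homogeneous, 1 = heterogeneous

    def label(self) -> str:
        return f"A{self.autonomy}, D{self.distribution}, H{self.heterogeneity}"


# Combinations that are given a name in the slide titles of this deck.
NAMED = {
    Architecture(0, 1, 0): "client/server",
    Architecture(0, 2, 0): "peer-to-peer",
    Architecture(2, 0, 1): "multi-DBS",
}


def describe(arch: Architecture) -> str:
    """Return the (A, D, H) label together with its name, if the deck gives one."""
    return f"({arch.label()}): {NAMED.get(arch, 'unnamed in these slides')}"
```

For example, `describe(Architecture(0, 1, 0))` yields the client/server label, while a combination such as (A1, D0, H0) is reported as unnamed.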

16. Client/Server DB (A0, D1, H0)
  • A two-level architecture: clients and servers
      • Server: performs most of the data and transaction management work
      • Client: interface, application, some degree of local processing, and management of cached data
  • Advantages of client/server architecture
      • more efficient division of labor
      • better price/performance on client machines
      • ability to use familiar tools on client machines
      • full DBS functionality provided to client workstations
  • Disadvantages
      • consistency between the server copy and the client copy (cached data) has to be ensured (cached data management). What about updates at the server?
      • How should the jobs be divided between client and server?
      • processing power, network bandwidth and control problems
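
The cached-data problem can be made concrete with a small sketch. It assumes one possible policy, validating a per-item version number on every access, which the slides do not prescribe; the `Server` and `Client` classes are hypothetical.

```python
class Server:
    """Holds the authoritative copy of each item together with a version number."""

    def __init__(self):
        self._data = {}  # key -> (version, value)

    def write(self, key, value):
        version = self._data.get(key, (0, None))[0] + 1
        self._data[key] = (version, value)

    def read(self, key):
        return self._data[key]

    def version(self, key):
        return self._data[key][0]


class Client:
    """Caches items locally and re-fetches them when the server copy has changed."""

    def __init__(self, server):
        self.server = server
        self.cache = {}     # key -> (version, value)
        self.refetches = 0  # how many times the full value was fetched

    def read(self, key):
        cached = self.cache.get(key)
        # Assumed policy: ask the server for the current version on every access.
        if cached is None or cached[0] != self.server.version(key):
            self.cache[key] = self.server.read(key)
            self.refetches += 1
        return self.cache[key][1]
```

This keeps the client copy consistent at the cost of one round trip per read, which is exactly the kind of trade-off between network bandwidth and consistency that the slide raises. Other policies, such as server-initiated invalidation, shift that cost to the server.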

17. Client/Server Database Architecture

18. Multiple Clients/Single Server

19. Multiple Clients/Single Server
  • Single-server problems
      • the server forms the bottleneck
      • the server forms the single point of failure (reliability)
      • database scaling is difficult
  • Solution
      • multiple servers to increase scalability and reliability
      • the servers form a distributed database system
      • distribution of workload
      • each server maintains a partition of the database

20. Multiple Servers
  • The division of jobs between server and client
  • Heavy client
      • the client is more powerful and can perform more functions
      • each client knows the locations of the servers and communicates with other servers as required
      • clients have to manage a directory of servers (heavy) and manage their cached data
  • Light client
      • each client manages its own connection to a server, which communicates with other servers on behalf of the client
      • only a very limited amount of work is performed at the client
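
The difference between the two client styles can be sketched as follows. Everything here is an assumption made for illustration: the class names, the hash-based placement rule (CRC32 of the key modulo the number of servers) and the in-memory stores.

```python
import zlib


class Server:
    """A server holding one partition of the database (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.store = {}  # this server's partition: key -> value
        self.peers = []  # all servers in a fixed order, filled in by connect()

    def owner(self, key):
        # Assumed placement rule: hash the key to pick the owning server.
        return self.peers[zlib.crc32(key.encode()) % len(self.peers)]

    def get_local(self, key):
        return self.store.get(key)

    def get(self, key):
        # Used by light clients: this server contacts the owner on their behalf.
        return self.owner(key).get_local(key)


def connect(servers):
    """Let every server know about all servers."""
    for server in servers:
        server.peers = servers


def put(servers, key, value):
    """Store a value at the server that owns the key."""
    servers[0].owner(key).store[key] = value


class HeavyClient:
    """Keeps its own directory of servers and contacts the owner directly."""

    def __init__(self, servers):
        self.directory = list(servers)

    def get(self, key):
        owner = self.directory[zlib.crc32(key.encode()) % len(self.directory)]
        return owner.get_local(key)


class LightClient:
    """Knows a single server and lets it do the routing."""

    def __init__(self, home_server):
        self.home_server = home_server

    def get(self, key):
        return self.home_server.get(key)


servers = [Server("s0"), Server("s1"), Server("s2")]
connect(servers)
for k, v in [("emp:1", "Alice"), ("emp:2", "Bob"), ("dept:7", "Sales")]:
    put(servers, k, v)
heavy = HeavyClient(servers)
light = LightClient(servers[0])
```

Both clients return the same data. The heavy client saves a server-to-server hop but must keep its directory up to date, while the light client stays simple and pushes the routing work onto its home server.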

21. Multiple Clients/Multiple Servers

22. Server-to-Server

23. Peer-to-Peer (A0, D2, H0)
  • The sites work together to process transactions
  • Data model
      • users consider the whole system as a single DBS
      • an individual Local Internal Schema (LIS) at each DB site
      • a Global Conceptual Schema (GCS) for the whole system
      • each Local Internal Schema connects to the GCS through a Local Conceptual Schema (LCS)
      • location transparency is supported by the GCS and LCS
      • each user has an External Schema (ES), which is connected to the GCS for the user's own purposes
  • Global queries are translated into local queries based on the GCS
  • Local queries are executed concurrently at different sites
  • Components at each site are closely related for transaction processing
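
The translation of a global query into concurrently executed local queries can be sketched as follows. The fragment contents, the `GLOBAL_SCHEMA` mapping (a stand-in for the GCS-to-LCS mapping) and the use of a thread pool are all assumptions for illustration; real sites would be separate processes on separate machines.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical fragments of one EMP relation, one fragment per site.
SITES = {
    "site1": [{"name": "Alice", "salary": 5200}, {"name": "Bob", "salary": 6100}],
    "site2": [{"name": "Carol", "salary": 7000}],
    "site3": [{"name": "Dave", "salary": 4800}],
}

# Stand-in for the global schema: which sites hold fragments of which relation.
GLOBAL_SCHEMA = {"EMP": ["site1", "site2", "site3"]}


def local_query(site, predicate):
    """Run the selection against the fragment stored at one site."""
    return [row for row in SITES[site] if predicate(row)]


def global_query(relation, predicate):
    """Split a global selection into local queries and merge the results."""
    sites = GLOBAL_SCHEMA[relation]
    with ThreadPoolExecutor(max_workers=len(sites)) as pool:
        # One local query per site, submitted concurrently.
        parts = pool.map(lambda site: local_query(site, predicate), sites)
        return [row for part in parts for row in part]


well_paid = global_query("EMP", lambda row: row["salary"] > 5000)
```

The caller names only the global relation `EMP` and never a site, which is the location transparency the slide attributes to the GCS and LCS.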

24. Peer-to-Peer Architecture

25. Peer-to-Peer Architecture

26. Peer-to-Peer Architecture
  • User processor: handles the interactions with users
  • Data processor: data management and query (transaction) processing
  • User interface handler: accepts user commands
  • Semantic data controller: checks commands for execution
  • Global query optimizer: optimization (plan for execution)

27. Peer-to-Peer Architecture
  • Execution monitor: coordinates the distributed execution of a query
  • Local query optimizer
  • Local recovery manager: ensures the database remains consistent (correct) even after a failure
  • Run-time support processor: buffer and data management
  • Note: there are many different ways to classify the functional units in a DDBS

28. Multi-DBS Architecture (A2, D0, H1)
  • Each site is a DBS, and the sites connect with each other to form a global database system (loosely coupled)
  • The components at different sites are loosely connected by an upper layer
  • Two types of transactions in the system: local and global transactions
  • The Global Conceptual Schema connects to some of the Local Conceptual Schemas
  • Local database is a subset of the global database
  • Users may define their local external view on their local database
  • Uni-lingual multi-DBS
      • users access the global database using the same data model and language, which may be different from their local ones
  • Multi-lingual multi-DBS
      • users access the global database using their local data model and language
      • multiple users may use different data models and languages

29. Multi-DBS Architecture

30. Components of a Multi-DBS

31. Directory Issues
  • A directory is required in a distributed database system to access the Global Conceptual Schema (and to access global data)
  • A directory is metadata (data about data) about the database
  • It includes information about the locations of the data
  • A directory may be:
      • global, local or hierarchical
          • a global directory, and each site has a local directory
          • the global directories are organized in a tree structure
      • distributed (the global directory is partitioned) or centralized
      • replicated or single copy
  • The choice depends on performance, reliability, size of directory and workload distribution
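
A minimal sketch of one of these options, a centralized global directory combined with a local directory at each site, is shown below. The data item names, site names and the `Site` class are hypothetical.

```python
# Global directory: metadata recording which site stores each data item.
GLOBAL_DIRECTORY = {"EMP_1": "site1", "EMP_2": "site2", "DEPT": "site3"}


class Site:
    """A site that consults its local directory before the global one."""

    def __init__(self, name, global_directory):
        self.name = name
        self.global_directory = global_directory
        # Local directory: only the items stored at this site.
        self.local_directory = {
            item for item, site in global_directory.items() if site == name
        }
        self.global_lookups = 0  # counts accesses to the global directory

    def locate(self, item):
        """Return the name of the site that stores the item."""
        if item in self.local_directory:
            return self.name
        self.global_lookups += 1
        return self.global_directory[item]


site1 = Site("site1", GLOBAL_DIRECTORY)
```

Lookups for local items never touch the global directory, so a workload with mostly local accesses puts little load on it. With many remote accesses, a centralized single-copy directory could become a bottleneck, which is where partitioning or replication comes in.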

32. Directory Issues

33. Reference: Ozsu, Ch. 4